[BOLT-1223] Type information not available in plans. Created: 2019/04/02  Updated: 2019/05/13  Resolved: 2019/05/06

Status: Resolved
Project: Puppet Task Runner
Component/s: None
Affects Version/s: BOLT 1.15.0
Fix Version/s: BOLT 1.20.0

Type: Bug Priority: Normal
Reporter: Chris Barker Assignee: Cas Donoghue
Resolution: Fixed Votes: 0
Labels: docs, docs_reviewed
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Template:
Sprint: Bolt Kanban
Method Found: Needs Assessment
Release Notes: New Feature
Release Notes Summary: A new `to_data` method is available for plan result objects that provide a hash representation of the object.
QA Risk Assessment: Needs Assessment

 Description   

The use of type to refer to the type of action that created a result object is unable to be exposed as an attribute on the result object since type is used elsewhere in Puppet.

Solution:
1. Rename this key to action and expose it with a method, the values should remain the same.
2. Add a to_data method to all bolt datatypes that return a hash or array equivalent of their json output. This will make the usecase which triggered the initial ticket, the ability to pass the result from one action directly to a task easier. This may or may not include renaming type.

Note the "target" in a Result should be the name not the full target hash. Adding to_data to Target is out of scope



 Comments   
Comment by Chris Barker [ 2019/04/02 ]

Realize this is a duplicate of BOLT-833

Comment by Chris Barker [ 2019/04/02 ]

Also since I'm unsure on how it should behave, I don't know if I should create a ticket - but getting the ruby object definition when trying to load the result Type seems to be a bug from what I can tell.

Comment by Henrik Lindberg [ 2019/04/02 ]

Not a bug - you end up calling the type() function which correctly is telling you what the type of the value is (by giving you the description of the type; arguably it could just display the name of the data type instead of the full definition, and you may consider it to be a bug that it outputs the full definition).

Comment by Chris Barker [ 2019/04/02 ]

Henrik Lindberg However given the above output from the notice($result) - how would I access the equivalent of - clarify this isn't psuedo code, this is actual output indicating that the Result is from the run_command function in a Puppet Plan:

"type": "command"

And see the output of "command" in an easy fashion?

This is going back to the point that it is ridiculously complicated to just be able to get the type of the function/action that created that result.

Comment by Henrik Lindberg [ 2019/04/02 ]

I understand the frustration that the intended "human readable output" and the technical implementation are different - I was just explaining what I think happens when trying to get the "type" field. If there was an actual type() method for instance of that data type it would have been called instead of the general function with the same name. I supposed it is called something else internally, but is presented as "type".

Looking at the output for $result where you see its attributes - you see there is a "value" attribute which you can get to with $result.value() and that is a hash. I don't know what is in it... but maybe that helps you for now until one of the Bolt folks respond.

Comment by Alex Dreyer [ 2019/04/03 ]

Agreed type is a problem, we will change the name and make it available with a function on the Result object.

Adding .to_json and/or .to_data to the Result and ResultSet objects should make them easier to pass to tasks and interact with in the pure data format. This seems like it may be what you really want.

Injecting _status, _target into the value of the result and later moving the result key to the top level is probably the solution for BOLT-833. Deciding how to roll that out in the least disruptive fashion is a bit of a problem and it will need to be mimicked in PE's orchestrator responses at some point.

Comment by Chris Barker [ 2019/04/03 ]

Great - .to_json or .to_json would be useful, but I guess I'm coming from the background of structured facts where you access nested data via $facts['os']['family'] etc. I can see why we can't just make [] render all of Result as a hash without introducing a breaking change since thats how you access value - but there's really the general inconsistency that one can create a thing named $foo that could be treated as a hash when printed, but can't be accessed like a hash would be when called vs all the other examples of interacting with a hash.

While in the interim having $result.to_json['object'] be a working alternative until $result['object'] could be introduced as a breaking change may be acceptable - it's not intuitive that when I print $result via notice($result) I see a json hash, so to access it I need to call it as $result.to_json['object'] in my code.

Comment by Alex Dreyer [ 2019/04/03 ]

We will never be able to allow result['object']. The task spec allows tasks to return arbitrary keys so object could be a value from the task itself. It would have to be result['_object'] to avoid collisions.

Comment by Chris Barker [ 2019/04/03 ]

Wouldn't those be returned under $result['result']['object'] if we put all task results there (or 'value' to make it consistent with the api / internals) - again making the hash you access match the hash you'd see above in the notice($result) output?

Comment by Alex Dreyer [ 2019/04/03 ]

having to always index into the actual value of a task result is too verbose. If we're going to standardize it will be by eliminating a layer from the JSON output rather than adding a layer of boilerplate to the more common interaction of indexing into the Result object in a plan.

Comment by Chris Barker [ 2019/04/04 ]

What I'm trying to reconcile as an enduser and as someone whose looking at teaching these concepts to a novice user with no programming experience is:

The $result.value() is important and referenced often, so we want to make less keystrokes to get to it, so $result[] should just include all the contents of it.

However a task can return arbitrary values and any keyname it wants, so for internal things we want to prevent being conflicted with, we will prepend them with _*.

But do we prevent a task from returning a _*? Or are we dependent on the task author to not do that.

Since we're exposing keynames as _* to the user, they may assume that is a safe behavior, we make no indication as such in the code, instead relying on the user to read and learn more about the philosophy of our tool and internalize the mental model how objects and methods behave, before they can continue working on it.

Going back to the original problem:

I am trying to get values from a thing called $result, when I do notice($result) I get:

{
    "node": "puppet.c.splunk-217321.internal",
    "object": "mkdir ~/boltexamples",
    "result": {
        "exit_code": 0,
        "stderr": "",
        "stdout": ""
    },
    "status": "success",
    "type": "command"
}

However $result['node'] doesn't return anything because that is really $result.target.name

$result.object doesn't return anything because that method isn't implemented

$result['status'] doesn't return anything, because that is really an interpretation of $result.ok

All I can really access is $result['exit_code'], $result['stderr'], $result['stdout'] because in our documentation it's pointed out that $result[] is really $result.value(), but that's not even what it is referred to in output hash. There is literally 0 mapping from what is printed in the notice to the $result object and what the correlating functions are to retrieve that information.

How is this better for someone who is trying to write their first plan? Why are we optimizing for the person who is already writing multiple plans?

Per your comments about making the result value as equal to the result metadata, we could make it be:

{
    "_node": "puppet.c.splunk-217321.internal",
    "_object": "mkdir ~/boltexamples",
    "exit_code": 0,
    "stderr": "",
    "stdout": "",
    "_status": "success",
    "_type": "command"
}

But that isn't consistent with the API calls, and unless $result['_node'] returns the value then that isn't exactly useful either.

If we made the printing of the object consistent with the referencing of the object, but still asserted that _* was the internal metadata indicator, we'd have something like:

{
    "_target": {
    	"name": "puppet.c.splunk-217321.internal",
    	"user": "cbarker"
    },	
    "_object": "mkdir ~/boltexamples",
    "exit_code": 0,
    "stderr": "",
    "stdout": "",
    "_ok": "true",
    "_type": "command"
}

A side effect of this is that we're now illustrating that a node is more than a hostname, but other attributes such as user to connect with, and is a target.

Of course it is still not clear what are the results from the action being taken VS the success of performing the action - you'd have to refer to documentation still to realize that _ok indicates that the command was triggered successfully and that exit_code = 0 means the command was completed successfully. Having the two values on the same plane doesn't make that easy. Because with error handling you can have an action successfully fired, but the results not be what you want, or successfully return that it failed, we need to disambiguate between those two states.

Lets say for the sake of completeness we instead returned this data as a full hash, assuming that a novice user would benefit from being able to see how a result is created and can easily see the different between a return value from the task and the execution of the task. This is returning to using status instead of ok for readability:

{
    "target": {
    	"name": "puppet.c.splunk-217321.internal",
    	"user": "cbarker"
    },	
    "function_arguments": "mkdir ~/boltexamples",
    "function_name": "run_command",
    "function_results": {
        "exit_code": 0,
        "stderr": "",
        "stdout": ""
    },
    "status": "success"
}

Now that we could say that this is really the Result from a Function, the name "function" is implied:

{
    "target": {
    	"name": "puppet.c.splunk-217321.internal",
    	"user": "cbarker"
    },	
    "arguments": "mkdir ~/boltexamples",
    "name": "run_command",
    "results": {
        "exit_code": 0,
        "stderr": "",
        "stdout": ""
    },
    "status": "success"
}

Of course we can probably drop the use of the name $result in the output, if we want:

$results = run_script('firstproject/firstscript.sh', $nodes, arguments => ['boltexamples/file.txt'])
 
$results.each |$r| {
  notice($r)
  notice($r['results']}
}

Matches the behavior of the other most common data object a user would be working with in a plan, the facts hash:

run_plan(facts, nodes => $nodes)
$targets = get_targets($nodes)
$targets.each | $target | {
  notice($target)
  notice(facts($target))
  notice(facts($target)['os']['family'])
}

And there is definitely a case to be made that a novice plan writer would start with handling basic errors and conditionals around a functions result than they would handle fact values. Someone new to scripting or scripting at scale would most likely start with a series of actions they would have to run across multiple nearly identical machines and be able to handle the results of those actions in an easy fashion, before going onto the more complicated logic of "if it's redhat 6 use this command, if it's redhat 7 use this other command". Doing fact evaluation will happen after handling result handling - they will see the wrong command was run and want to now start investigating how to detect what command should be run.

If we don't want to go that route, then I'd say we just throw out the whole idea of trying to represent a $result method automagically as a json hash that you can't actually interact with it. doing a notice($result) gives you:

Result[attributes => {'value' => Hash[String[1], Data], 'target' => Target}, functions => {'error' => Callable[[0, 0], Optional[Error]], 'message' => Callable[[0, 0], Optional[String]], 'ok' => Callable[[0, 0], Boolean], '[]' => Callable[[String[1]], Data]}}]

Or something to represent that it is a method, and that if you want the hash, you have to call it via notice($result.to_json)

The magical transmogrification of $result is entirely misleading, so either we decide to make it behave like a hash, or explicitly make it something you have to turn into a hash and interact with it. Not display a hash that you can't continue to interact with because we're doing non-intuitive things behind the scenes.

Comment by JD Welch [ 2019/04/04 ]

FWIW, I'm trying to sort out the current behaviour and am getting fairly lost myself. As a reasonably adept but novice end-user of Bolt, I would expect both

$t = $result.target.name
$t = $result['target']['name']

to return a sensible value.

Comment by Alex Dreyer [ 2019/04/04 ]

JD Welch thats valid syntax in javascript but not puppet(or most other programming languages).

It makes sense in javascript because the language is based the concept of an object which has properties that can be accessed with [] or . notation. It's not something that can tacked on to a mature language without causing a lot of other problems.

Comment by Chris Barker [ 2019/04/04 ]

Alex Dreyer which one is the right one?

1) notice($result.target.name) gives me:
puppet.c.splunk-217321.internal

or

2) notice($result['target']['name']) gives me:

{
  "kind": "bolt/pal-error",
  "msg": "Evaluation Error: Operator '[]' is not applicable to an Undef Value. (file: /Users/cbarker/src/pdx/bolt/firstproject/site-modules/firstproject/plans/firstplan.pp, line: 15, column: 12)",
  "details": {
  }
}

Option 2 is how it works in Puppet and Bolt when accessing a hash, but Option 1 looks like dot notation that works in JS and is how it was implemented for this method apparently?

Comment by JD Welch [ 2019/04/04 ]

OK, here's my real example I'm using to try and figure out the current behaviour:

# works:
$t = $result.target.name
$v = $result.value['stdout']
 
# following on that, I expect these to work, but do not:
$t = $result.target['name']
$v = $result.value.stdout

As an end user, I simply don't understand why one works and the other doesn't. And I shouldn't have to care.

Comment by Alex Dreyer [ 2019/04/04 ]

But do we prevent a task from returning a _*? Or are we dependent on the task author to not do that.

We are dependent on the task author following documentation. The use of some special keys is documented and encouraged for example _error and _output. The only other option for ill behaved tasks would be to error or warn after running any task that uses an unexpected special key which would introduce unnecessary versioning pain.

But that isn't consistent with the API calls, and unless $result['_node'] returns the value then that isn't exactly useful either.

It is useful, you can pass it to get_targets or --nodes. Returning references instead of full objects in a json representation is a pretty common pattern. It also avoids issues about returning secrets with this data or it bloating with all of the facts and vars that are also conceptually part of a Target object.

How is this better for someone who is trying to write their first plan? Why are we optimizing for the person who is already writing multiple plans?

We work balance the needs of the new user with those of the returning user. Ideally language syntax is easily understandable, efficient, and powerful. Most of the time especially when building on top of existing syntax there are tradeoffs. As a rough heuristic the language features that a beginning user is likely to use are optimized for understandability while those targeted at more advanced users are optimized towards efficiency and power.

Comment by Alex Dreyer [ 2019/04/04 ]

notice($result.target.name) 
puppet.c.splunk-217321.internal

Is correct. The Result object has a .target method that returns a Target object. The Target object has a .name method that returns a name.

Comment by JD Welch [ 2019/04/04 ]

So why doesn't $result.value.stdout behave similarly? Because there's not a method stdout on value, of course. But how am I supposed to learn name is a method and stdout is a property? "Read the docs" isn't a reasonable answer.

Comment by Alex Dreyer [ 2019/04/04 ]

JD Welch unfortunately puppet is a language in which the user has to care about that distinction. The only language that isn't that I can think of off of the top of my head is javascript which is fundamentally built around the concept that all objects are the same and indexable with brace and dot notation. It's not a feature that can be tacked on to a language and even in javascript users still cannot use dot indexing in every case.

The nature of the puppet language is that is has DataTypes which can have methods they can also implement indexing with []. For specific cases we can enable . indexing into hashlike objects but that means those objects cannot be generic data structures like Hash. In addition to methods that live on the DataType users can write their own functions that live separately in modules and still use . indexing. When we enable magical . indexing we make that interface behave poorly.

It would be feasible for us to have a special DataType for the results of run_command and run_script that exposed .stdout as a method without allowing magical .indexing into the value. Due to the fact that the puppet language does not have inheritance we would need a separate DataType for those results from the other types which would have to be reunited with a type alias. There are some rough edges with using an alias instead of a proper type and there is no way we can do this without introducing a breaking change to the plan language.

Read the docs, copy/paste from stack overflow, rely on an editor plugin that exposes options and docs as you type, and try something and see what works and what fails are the main ways to learn a programming language. Learning what the core objects are and how they behave is just part of learning to use a new language or framework. We should make that as easy as possible without limiting the power of the language(for example by only using hashes for everything) or making expensive leaky abstractions(like magical . indexing).

Comment by JD Welch [ 2019/04/04 ]

For specific cases we can enable . indexing into hashlike objects but that means those objects cannot be generic data structures like Hash.

Ah, OK. Thanks for the detailed context; that helps a lot. So because target will always have a name, we can enable the more magical indexing. Gotcha.

try something and see what works and what fails are the main ways to learn a programming language

That's exactly how I'm finding inconsistent behaviour.

Will neither syntax (dots or braces) work for all cases? I don't even care all that much about which it is, just that all the properties of $result can be accessed in a predictable way.

Comment by Alex Dreyer [ 2019/04/04 ]

All attributes of Result object itself are all available with dot notation. Those attributes return different types of objects which each have their own behavior.

Some like value return a Hash object which exposes bracket indexing but not many attributes.
Some return a simple objects on which attributes don't make sense for example success.
Some return a more specific which has other attributes and methods Target for example has .name and .port methods.

The keys inside the results value are not really "properties" of the result object in the same way .target is. $r.value['key'] turned out in our initial testing to be one of the most common interactions with a Result object. The value is the core attribute of the object therefore we decided to add the syntactic sugar of letting bracket indexing on the Result object itself index into the value.

We could, if desired, add bracket indexing to the Target object to allow $target['port'] to behave the same as $target.port this makes some sense semantically because port could be part of the options hash of the Target.

In the puppet language there is no way we can do all of the following: support powerful methods like $result.errorset, have meaningful errors when things are used incorrectly and support a single lookup method either dot syntax or bracket syntax. While the puppet language has some rough edges this is not one of them. It's true of pretty much every language except javascript where bracket syntax can be used everywhere. It could be argued that javascript doesn't provide meaningful errors when things are used incorrectly because of this design choice.

Comment by Alex Dreyer [ 2019/04/04 ]

I appreciate the feedback advocating for new users. Please keep in mind the following constraints.

1. The plan language is constrained by the syntax of the puppet language on top of which it is built. The core syntax has been mature for years. We can add new features but we can't really change the core behavior of the language.
2. We need to balance the ease of learning the language with adding features that increase it's expressiveness for advanced users. Standardizing in such a way that advances users have to type out lots of boilerplate or run into leaky abstractions is not an option.
3. Bolt is 1.0. Changes that break existing plans are expensive for us to develop and release and a burden for the community to adopt. Most requests for breaking changes will be closed.

I'm constraining this ticket to handle the type issue with the result. The rest of this is covered by BOLT-833.

Generated at Sat Aug 24 21:27:40 PDT 2019 using JIRA 7.7.1#77002-sha1:e75ca93d5574d9409c0630b81c894d9065296414.