[PUP-3600] Create support for binary content on the File type Created: 2014/10/31  Updated: 2018/09/25  Resolved: 2018/09/25

Status: Closed
Project: Puppet
Component/s: Language, Type System, Types and Providers
Affects Version/s: None
Fix Version/s: PUP 6.0.1

Type: New Feature Priority: Normal
Reporter: Andrew Parker Assignee: Henrik Lindberg
Resolution: Duplicate Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Blocks
is blocked by PUP-5831 Add a Binary type and runtime object Closed
is blocked by PUP-8598 make 4.x loaders available to puppet ... Closed
Duplicate
duplicates PUP-9110 Accept and produce ASCII_8BIT as Bina... Closed
Relates
relates to PUP-3852 Switch from PSON to JSON as default s... Closed
relates to SERVER-1082 puppet file resource can't handle bin... Closed
Template:
Epic Link: 6.y Rich Data
Sub-team: Language
Team: Server
QA Contact: Eric Thompson

 Description   

Once the catalog is transferred in JSON, non-UTF-8 sequences will no longer be allowed in the content. This means that Files that have a content of some arbitrary data may or may not be able to be transferred.

Henrik Lindberg suggested a solution to this that will allow us to do our switch to JSON but not require an entirely new catalog format. The idea is to create a function called binary(str) that takes a String and returns a Struct[data => String] where data is a UTF-8 compatible encoding of the bytes (not the characters) of the input String. The content parameter of File is then changed to accept either a String or a Struct[data => String]. When it receives a Struct[data => String] then the data element is decoded with the algorithm decided for the implementation of the binary(str) function.

This solution should allow the follow manifests to continue to work:

file { "/tmp/binary": content => binary(file("/local/binary")) }

define my_file($content) {
  file { "/tmp/binary": content => $content }
}
 
my_file { "yep": content => binary(file("/local/binary")) }

At least one variation on those will not work (because binary data will end up in the catalog):

define my_file($content) {
  file { "/tmp/binary": content => binary($content) }
}
 
my_file { "yep": content => file("/local/binary") }



 Comments   
Comment by Henrik Lindberg [ 2014/10/31 ]

What works depends on how we treat the case of untyped parameters - are they Data or Any ?
We could support Binary everywhere simply by letting the user type the parameter and include Binary.

No change to format, but serialization/deserialization must naturally be aware of the possible types.
In case the type allows Binary, a hash must also be encoded a different way to make it possible to differentiate
(if it supports both Binary and Hash that is).

Comment by Chris Price [ 2015/01/27 ]

Kylo Ginsberg Henrik Lindberg these are the tickets that I pulled out of that "JSON all the things" epic. I think I got all of the ones that were PSON/networking.

Comment by Henrik Lindberg [ 2015/01/28 ]

Not sure this is a "Server" ticket. We need the following:

  • A Binary runtime type (a Ruby equivalent of ByteBuffer, that holds the bytes and encoding)
  • Serialization support that allows a binary format to use a more optimal on the wire result than a text based
  • A Binary Type in the Puppet Type System
  • The binary function (or possibly binary_file) that returns a Binary
  • Support in the File Type to handle an instance of the Binary Runtime Type as the value of content

Not sure if any of the above are for the Server Team...

Comment by Chris Price [ 2015/01/28 ]

Yeah, that makes sense; I guess I was just thinking that this is a blocker for going to JSON, right? I can move it back to the other epic and flag it as a blocker for the server epic, maybe?

Comment by Henrik Lindberg [ 2015/01/28 ]

Not really a blocker, but we absolutely want to do this - we can serialize to JSON now but the binary support is not there in a real way.

Comment by Chris Price [ 2015/01/28 ]

What happens if we try to serialize to JSON without this, in the case where a file has binary content? Would it just throw an exception during the serialization attempt? If so, that seems like it would be a blocker for removing PSON?

Comment by Henrik Lindberg [ 2015/02/04 ]

PSON is not needed, but we need to look at the result of JSON serialization and treat a hash differently from a string for the content attribute.

Comment by Henrik Lindberg [ 2016/02/07 ]

PUP-5831 was created to deal with the Binary Type.

We will use this ticket to implement support for binary type in one place in the File type content by treating a Hash as an object representation of a Binary (here shown in puppet Hash notation):

{ type => 'binary'
  encoding => 'application/octet-stream'
  content => <base64 encoded string>
}

Comment by Henrik Lindberg [ 2017/01/09 ]

The proposed solution is a bit of a hack to be able to do this without too many other changes. With the introduction of "rich data in catalog" we can instead leave it up to a user to either include a String (which is always UTF-8), or to include a Binary (PUP-5831) and simply send either a String or a Binary in the catalog. That is currently possible by turning on the experimental feature --rich_data on both master and agent. Rich data is however not fully support throughout the eko system (no support in PDB) and the opt in is not easy to use since it requires both agent and server side opt in and cannot be controlled per env.

I would much rather finish the support for Rich Data than trying to shoehorn only support secret file content.
Removing this from the 5.0.0 release.

Comment by Henrik Lindberg [ 2017/05/15 ]

The right solution is for this is to just use Binary rich data in the catalog. The File type may still need logic to deal with Binary when sent in a catalog.

Comment by Josh Cooper [ 2017/05/21 ]

This isn't working with --rich_data enabled:

file { '/tmp/dest.bin':
  ensure => file,
  content => binary_file('/tmp/file.bin')
}

When running the agent:

$ bx puppet agent -t --rich_data
Error: Could not retrieve catalog from remote server: Could not intern from application/json: Internal Error: Puppet Context ':loaders' missing
/Users/josh/work/puppet/lib/puppet/pops/loaders.rb:154:in `loaders'
/Users/josh/work/puppet/lib/puppet/pops/loaders.rb:80:in `find_loader'
/Users/josh/work/puppet/lib/puppet/pops/types/types.rb:304:in `create'
/Users/josh/work/puppet/lib/puppet/resource.rb:76:in `convert_rich_data_hash'
/Users/josh/work/puppet/lib/puppet/resource.rb:44:in `block in from_data_hash'
/Users/josh/work/puppet/lib/puppet/resource.rb:43:in `each'
/Users/josh/work/puppet/lib/puppet/resource.rb:43:in `from_data_hash'
/Users/josh/work/puppet/lib/puppet/resource/catalog.rb:437:in `block in from_data_hash'
/Users/josh/work/puppet/lib/puppet/resource/catalog.rb:436:in `collect'
/Users/josh/work/puppet/lib/puppet/resource/catalog.rb:436:in `from_data_hash'
/Users/josh/work/puppet/lib/puppet/network/formats.rb:123:in `data_to_instance'
/Users/josh/work/puppet/lib/puppet/network/formats.rb:104:in `intern'
/Users/josh/work/puppet/lib/puppet/network/format_support.rb:12:in `convert_from'
/Users/josh/work/puppet/lib/puppet/indirector/rest.rb:308:in `deserialize_find'
/Users/josh/work/puppet/lib/puppet/indirector/rest.rb:169:in `find'
/Users/josh/work/puppet/lib/puppet/indirector/indirection.rb:195:in `find'

I get the same when using preferred_serialization_format=pson

Comment by Matthew Gyurgyik [ 2018/03/06 ]

What is the current state of this issue?

With puppet 5.4.0 the following code puts the base64 version of the file in place

file { '/var/ace/sdconf.rec':
  content => binary_file('rsa/sdconf.rec'),
}

Comment by Henrik Lindberg [ 2018/03/08 ]

Matthew Gyurgyik Do you mean you get the Base64 text in the file?

Comment by Henrik Lindberg [ 2018/03/08 ]

This once worked but that there are no acceptance tests that detected the regression. We have since made a number of changes to safe serialization and data and special handling of a Binary instance got lost as those changes took place.

The right path forward is for us to make sure that --rich_data works as it should in the agent-master scenario. Now it should work using a puppet apply.

Comment by Matthew Gyurgyik [ 2018/03/08 ]

Henrik Lindberg yes, exactly what I meant. The file get created, but its content contains the base64 text.

Comment by Josh Cooper [ 2018/05/15 ]

Matthew Gyurgyik We have more work to do to serialize type information in the catalog in a backwards compatible way, and update the agent to base64 decode the content at catalog application time.

Comment by Henrik Lindberg [ 2018/09/10 ]

This should work on Puppet 6.0.0 (master branch at this moment) as we made other changes that resolves the reported issue.
However, PUP-9110 will change this again such that the file type never sees a PBinary::Binary data type, it and all other type/providers will instead get an ASCII-8BIT encoded String.

Suggest closing this when PUP-9110 is fixed.

Comment by Josh Cooper [ 2018/09/25 ]

The fix for PUP-9110 has been merged and will be released in 6.0.1. Closing this.

Generated at Fri Oct 18 21:48:16 PDT 2019 using JIRA 7.7.1#77002-sha1:e75ca93d5574d9409c0630b81c894d9065296414.