Details
-
Bug
-
Status: Closed
-
Normal
-
Resolution: Fixed
-
PUP 4.8.1
-
None
-
None
-
Puppet Developer Experience
-
PDE 2017-01-11
-
Bug Fix
-
A string containing two adjacent Unicode escape sequences did not result in two Unicode characters being placed in the string. Only the first escape was recognized as a Unicode character; the second was taken as verbatim text.
-
No Action
-
covered by unit tests
Description
In working on support for unicode characters in the value of the "comment" property, I noticed this warning in the output:
...
Warning: Unicode escape '\u' was not followed by 4 hex digits or 1-6 hex digits in {} or was > 10ffff at /Users/moses/foo/unicode_comment.pp:3:38
...
The manifest in question is:
user { 'foo' :
  ensure => present,
  comment => "A\u06FF\u16A0\u{2070E}",
}
I.e., the value of the comment field is made up of these characters:
"A" : http://www.fileformat.info/info/unicode/char/0041/index.htm
"\u06FF" : http://www.fileformat.info/info/unicode/char/06FF/index.htm
"\u16A0": http://www.fileformat.info/info/unicode/char/16a0/index.htm
"\u{2070E}" : http://www.fileformat.info/info/unicode/char/2070e/index.htm
I believe this string is legal unicode - is it misrepresented in the manifest in some way?
Given an existing user named "foo" with comment "bar", applying this manifest produces (in addition to the warning above) this change notice:
Notice: /Stage[main]/Main/User[foo]/comment: comment changed 'bar' to 'Aۿ\u16A0𠜎'
It appears that "\u16A0" has been escaped and subsequently treated literally?
The error is being raised here: https://github.com/puppetlabs/puppet/blob/master/lib/puppet/pops/parser/slurp_support.rb#L94-L95
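To make the expected result concrete, here is a minimal Ruby sketch of expanding each escape independently. It is not Puppet's lexer code: the `expand_unicode_escapes` helper and the `UNICODE_ESCAPE` pattern are hypothetical names written for this illustration, and the sketch only assumes the two forms named in the warning (4 hex digits, or 1-6 hex digits in {}, up to 10ffff).

```ruby
# Hypothetical illustration only; this is NOT the code in slurp_support.rb.
# Matches \u{X} .. \u{XXXXXX} (capture 1) or \uXXXX (capture 2).
UNICODE_ESCAPE = /\\u(?:\{(\h{1,6})\}|(\h{4}))/

def expand_unicode_escapes(text)
  # gsub visits every match, so an escape that directly follows
  # another escape is expanded as well.
  text.gsub(UNICODE_ESCAPE) do
    codepoint = (Regexp.last_match(1) || Regexp.last_match(2)).hex
    if codepoint > 0x10FFFF
      Regexp.last_match(0)      # out of range: leave the text untouched
    else
      [codepoint].pack('U')     # encode the code point as UTF-8
    end
  end
end

# Single quotes keep the backslashes literal, as the lexer would see them.
raw = 'A\u06FF\u16A0\u{2070E}'
expanded = expand_unicode_escapes(raw)
puts expanded.codepoints.map { |c| format('U+%04X', c) }.join(' ')
```

With the manifest's comment string as input, the result should hold four code points (U+0041, U+06FF, U+16A0, U+2070E) rather than leaving "\u16A0" as literal text.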