[PUP-4385] Can't write WOMANS HAT emoji with \uXXXX unicode escapes Created: 2015/04/07  Updated: 2015/05/20  Resolved: 2015/05/20

Status: Closed
Project: Puppet
Component/s: Language
Affects Version/s: PUP 3.7.5
Fix Version/s: PUP 3.8.1, PUP 4.1.0

Type: Bug Priority: Minor
Reporter: Nicholas Fagerlund Assignee: Eric Thompson
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates
Template:
Epic Link: 4.x Language
Story Points: 1
Sprint: Language 2015-04-29, Language 2015-05-13
Release Notes: Bug Fix
QA Contact: Kurt Wall

 Description   

Unicode now includes many characters whose code points need 5 or more hex digits: most notably emoji, but I think there are some real human-language characters in there too.

Puppet's \u escape sequences don't accommodate these. The 5-digit versions get truncated, and the alternate 2x4-digit forms result in an error.

notice("5 digit unicode \u1f452 hat") # prints: 5 digit unicode ὅ2 hat
# notice("double 4 digit unicode \uD83D\uDC52 hat") # Results in Error: Could not parse for environment production: invalid byte sequence in UTF-8 on node magpie.lan
notice("literal 👒 hat") # works fine
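
Both failures can be illustrated outside Puppet. The Ruby sketch below is not Puppet's lexer code. It assumes a lexer that reads exactly four hex digits after \u, and the helper names expand_four_digit_escapes and combine_surrogates are made up for illustration. It shows why the fifth digit is left behind as a literal "2", and how the surrogate halves D83D and DC52 relate to code point 1F452. Surrogate code points are not valid on their own in UTF-8, which is presumably why expanding each half separately ends in an "invalid byte sequence" error.

```ruby
# Hypothetical helper: expand \uXXXX escapes by reading exactly four hex
# digits. This is an assumption about the pre-fix behavior, not Puppet code.
def expand_four_digit_escapes(text)
  text.gsub(/\\u(\h{4})/) { [Regexp.last_match(1).hex].pack('U') }
end

# Hypothetical helper: combine a UTF-16 surrogate pair into one code point.
def combine_surrogates(high, low)
  0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
end

# Only "1f45" is consumed as the escape; the trailing "2" stays plain text.
puts expand_four_digit_escapes('5 digit unicode \u1f452 hat')

# The two 4-digit halves from the report encode the same hat as one code point.
puts format('U+%X', combine_surrogates(0xD83D, 0xDC52))
```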

Update

The implementation allows using the escape

\u{nnnn}

where nnnn stands for 1 to 6 hex digits, up to a maximum value of 10ffff.

notice("5 digit unicode \u{1f452} hat")

QA Risk Analysis

Probability Medium (anyone using Unicode characters with 5+ hex digit code points)
Impact Medium (broken Unicode support breaks Puppet)
Risk Level Medium
Test Level Spec


 Comments   
Comment by Henrik Lindberg [ 2015/04/07 ]

We should add support for \u{...}, where a variable number of hex digits can be specified between the braces, including 5-digit characters. The specification currently allows exactly 4 hex digits only.

Comment by Henrik Lindberg [ 2015/04/07 ]

PR 3797 adds support for 1-6 hex digits enclosed in braces. The maximum value is 10ffff (the largest supported value). A value shorter than 6 digits is treated as if it were left-padded with 0.
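
The rules in this comment can be sketched in a few lines of Ruby. This is an illustration only, not the code from the PR: the helper name expand_braced_escapes and the choice of ArgumentError for out-of-range values are assumptions, and the sketch does no validation beyond the 10ffff upper limit.

```ruby
# Hypothetical helper, not Puppet's lexer: expand \u{X} .. \u{XXXXXX} escapes.
def expand_braced_escapes(text)
  text.gsub(/\\u\{(\h{1,6})\}/) do
    value = Regexp.last_match(1).hex
    # 10ffff is the largest supported value; the error class is an assumption.
    raise ArgumentError, format('code point %x is above 10ffff', value) if value > 0x10FFFF

    [value].pack('U')
  end
end

puts expand_braced_escapes('5 digit unicode \u{1f452} hat')

# Shorter values behave as if left-padded with 0.
puts expand_braced_escapes('\u{41}') == expand_braced_escapes('\u{000041}')
```

Because the closing brace delimits the escape, a digit that follows it, as in "\u{1f45}2", can never be mistaken for part of the code point.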

Comment by Henrik Lindberg [ 2015/04/07 ]

An update to the specification is needed; it currently specifies exactly 4 hex digits.

Comment by Henrik Lindberg [ 2015/04/08 ]

Specification updated to allow "\u{XXXXXX}", where X is a hex digit occurring 1-6 times.

Comment by Henrik Lindberg [ 2015/04/22 ]

Not important enough to also do for PUP 3.8.1, but can be cherry-picked if someone thinks this should be done.
Doh, this was already targeted at 3.x.

Comment by Thomas Hallgren [ 2015/04/23 ]

Merged to 3.x at 1940bbf

Comment by Thomas Hallgren [ 2015/04/23 ]

Merged to master at b5dd74d, with a subsequent maintenance commit at c938142 containing a fix for a failing spec test.

Comment by Henrik Lindberg [ 2015/04/23 ]

Tip: When testing manually, make sure your console window can display emoji.
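
If the console cannot display emoji, the result can still be checked by looking at code points instead of glyphs. A minimal Ruby sketch, assuming the string under test is available as a Ruby value (for example inside a spec); the helper name code_point_labels is made up for illustration and is not part of Puppet:

```ruby
# Hypothetical helper: render each character of a string as a U+XXXX label.
def code_point_labels(text)
  text.codepoints.map { |cp| format('U+%04X', cp) }
end

# The hat shows up as a single U+1F452 entry, whatever the console font.
puts code_point_labels("\u{1F452} hat").join(' ')
```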

Comment by Kurt Wall [ 2015/04/29 ]

Verified in master at SHA=8964241dc3890af9d48335f93a015d7566193e08.

# bundle exec puppet apply -e 'notice("\u1f452 hat")'
Notice: Scope(Class[main]): ὅ2 hat
Notice: Compiled catalog for ge06qva4qg9btuu.delivery.puppetlabs.net in environment production in 0.42 seconds
Notice: Applied catalog in 0.01 seconds
# git log -1
commit 8964241dc3890af9d48335f93a015d7566193e08
Merge: 867e24c 3e2007d
Author: Henrik Lindberg <henrik.lindberg@cloudsmith.com>
Date:   Tue Apr 28 17:31:43 2015 +0200
 
    Merge branch 'stable'
[root@ge06qva4qg9btuu puppet]# git branch
  3.x
* master

Verified in stable at SHA=f0b962b0ff87d778554b64e7ca1ea022b72f0444.

# bundle exec puppet apply -e 'notice("\u1f452 hat")'
Notice: Scope(Class[main]): ὅ2 hat
Notice: Compiled catalog for ge06qva4qg9btuu.delivery.puppetlabs.net in environment production in 0.40 seconds
Notice: Applied catalog in 0.02 seconds
# git log -1
commit b9f2f9214d3ff78eca3b8b5b3f6273b81fb05a7f
Merge: 0cae782 9ecb3fc
Author: Peter Huene <peterhuene@gmail.com>
Date:   Wed Apr 29 10:49:15 2015 -0700
 
    Merge pull request #3877 from joshcooper/maint/stable/confine-ubuntu-agents
 
    (maint) skip upstart test if there are no ubuntu agents
[root@ge06qva4qg9btuu puppet]# git branch
  3.x
  master
* stable

Comment by Kurt Wall [ 2015/04/29 ]

Covered in spec: spec/unit/pops/parser/lexer2_spec.rb

Comment by Kurt Wall [ 2015/04/29 ]

Resolved per previous comment.
