[PUP-7033] Consider adding a StringScan Data Type that can Reuse Patterns
Created: 2016/12/19 Updated: 2018/11/30
|Component/s:||Language, Type System|
|Affects Version/s:||PUP 4.8.1|
|Remaining Estimate:||Not Specified|
|Time Spent:||Not Specified|
|Original Estimate:||Not Specified|
|Epic Link:||5.y Type System|
|QA Risk Assessment:||Needs Assessment|
I'm honestly not sure how you would go about this, but currently you must repeat your Data Type aliases constantly to cover all cases.
Base IPv6 Address: https://github.com/simp/pupmod-simp-simplib/blob/master/types/ip/v6/base.pp
Bracketed IPv6 Address: https://github.com/simp/pupmod-simp-simplib/blob/master/types/ip/v6/bracketed.pp
It would be really nice if the Bracketed case could be something like the following instead:
|Comment by Henrik Lindberg [ 2016/12/20 ]|
Trevor Vaughan - so, a Pattern is an OR over a set of patterns. For this case, it looks like a general And[T,...] type might work. Simply concatenating regexps is difficult though, since it requires rewriting a pattern that is already anchored (as in your case).
|Comment by Trevor Vaughan [ 2016/12/20 ]|
Henrik Lindberg An AND would probably work.
String + String is easy. Regex combinations could be done as long as they are not anchored or if the first and last elements are the only ones that are anchored.
String + Regex + String should also be easy as long as it's a bookend.
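As a rough illustration of the bookend case, here is a minimal Ruby sketch. The helper name `bookend` and the simplified `IPV6_CORE` pattern are assumptions made for illustration only (the real simplib pattern covers far more forms). It shows that the reusable piece has to stay unanchored, with the anchors applied once after concatenation.

```ruby
# Simplified, unanchored core: eight colon-separated hex groups.
# This is an illustrative stand-in, not the simplib IPv6 pattern.
IPV6_CORE = /\h{1,4}(?::\h{1,4}){7}/

# Anchored "base" form built from the reusable core.
BASE = /\A#{IPV6_CORE}\z/

# Hypothetical helper: wrap an unanchored core between two literal
# strings and anchor the combined pattern exactly once.
def bookend(prefix, core, suffix)
  /\A#{Regexp.escape(prefix)}#{core}#{Regexp.escape(suffix)}\z/
end

BRACKETED = bookend('[', IPV6_CORE, ']')

# Reusing the already anchored pattern does not work: the inner \A
# can never match after the leading "[" has been consumed.
NAIVE = /\A\[#{BASE}\]\z/

puts BASE.match?('2001:db8:0:0:0:0:0:1')        # true
puts BRACKETED.match?('[2001:db8:0:0:0:0:0:1]') # true
puts NAIVE.match?('[2001:db8:0:0:0:0:0:1]')     # false
```

Interpolating a Regexp into another regexp literal embeds it as a `(?-mix:...)` group, so the core keeps its own flags and stays grouped when combined.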
|Comment by Thomas Hallgren [ 2016/12/21 ]|
I fail to see how an And would solve this problem. The first regexp is anchored. Saying that it should match together with another regexp will not create a match for something that is bracketed. Building regexps like this will require:
1. A string that matches the regexp without anchors.
I don't think we will ever try to parse a regexp and make intelligent decisions on how to dissect and reassemble it so in order to address pattern reuse as presented here, we need to discuss how to make string concatenation possible in type expressions. One alternative could be:
There's no instance method on a type at present, but there could be. It would distinguish the type from the instance that the type describes in cases where this is applicable (which it might be for String and Number, Struct, and Tuple types).
The distinction between type and instance is important. We already use binary operators on types and we may want to use '+' as a way to concatenate the types themselves.
|Comment by Henrik Lindberg [ 2016/12/21 ]|
Hm, what you seem to be after is more like interpolation into a "composite" regexp. Problem is then that you do not have variables to interpolate, only types. Maybe we can come up with something along those lines.
|Comment by Henrik Lindberg [ 2017/05/16 ]|
This is actually a string scanner type (like in a lexer). I can imagine that the type describes a sequence of scans (as done by the Ruby StringScanner). The types added as parameters must be string matching types. If a type describes variants (Variant, Pattern) it is taken as an OR. A type that matches an anchored regexp can be used, but ^ will match from the current position, and $ and \Z will match at the end of input/line. As an example:
would match strings like "(a)", "(b)", and "(c)".
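The scan-sequence idea can be sketched directly with Ruby's StringScanner. This is only a sketch under assumptions: the helper name `scan_sequence?` and the step representation (a String literal, an unanchored Regexp, or an Array of alternatives) are hypothetical and are not a proposed Puppet type syntax. Anchor handling is left out by assuming that all patterns are unanchored.

```ruby
require 'strscan'

# Hypothetical helper: returns true when `input` is consumed exactly by
# scanning each step in order. A step is a String (literal), a Regexp
# (assumed unanchored), or an Array of those, treated as an OR whose
# alternatives are tried in order.
def scan_sequence?(input, steps)
  scanner = StringScanner.new(input)
  steps.each do |step|
    alternatives = step.is_a?(Array) ? step : [step]
    matched = alternatives.any? do |alt|
      # Literal strings are escaped so they are matched verbatim.
      pattern = alt.is_a?(Regexp) ? alt : Regexp.new(Regexp.escape(alt))
      scanner.scan(pattern) # advances the scan pointer on success
    end
    return false unless matched
  end
  scanner.eos? # the whole input must have been consumed
end

STEPS = ['(', [/a/, /b/, /c/], ')'].freeze

puts scan_sequence?('(a)', STEPS) # true
puts scan_sequence?('(d)', STEPS) # false
```

Note that this sketch commits to the first alternative that matches and does not backtrack into earlier steps. Whether a real type would need backtracking is left open here.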
|Comment by Moses Mendoza [ 2017/05/18 ]|
Henrik Lindberg do you have an epic this issue might go into?