Propagate captured RegExp groups to the output value #1204
Comments
Definitely would love to add this! Will have to think a bit about the syntax to differentiate from a pattern-only regex, but would be a very valuable API to have, especially if we can make it type safe. As with #695, would welcome external contributions here and would be happy to answer questions about the type system and help get it integrated smoothly!
A low hanging fruit implementation would be to simply make all returned At my level of understanding regarding ArkType's typing system (read: it might as well be straight up witchcraft), I'm not sure how to get beyond that point.
@Abion47 I assume you're referring to adding Unfortunately, although encoding extra information on a primitive via branding works at a type-level, it is impossible to attach additional properties to a string.
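To illustrate the point about primitives, here is a minimal sketch in plain TypeScript (no ArkType involved; the variable names are made up for the example). It tries to attach a groups property to a primitive string and then reads it back. Depending on whether the code runs in strict mode, the assignment either throws or is silently discarded, so the sketch tolerates both.

```typescript
// Illustrative only: variable names are assumptions, not ArkType API.
const primitive: string = "abc";

let assignmentThrew = false;
try {
  // Strict mode (ES modules, classes) throws a TypeError here;
  // sloppy mode silently drops the write.
  (primitive as any).groups = ["a", "b", "c"];
} catch {
  assignmentThrew = true;
}

// Either way, nothing was stored on the primitive.
const groupsAfter: unknown = (primitive as any).groups;
console.log(assignmentThrew, groupsAfter, typeof primitive);
```

The type-level brand `string & { groups?: string[] }` still compiles, which is why the mismatch only shows up at runtime.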
That's an interesting factoid I did not know. Mark that as one more entry under the "Javascript has rules except when it chooses to break them" category. Still, it seems to work if you target
type StringWithProp = string & { groups?: string[] };
function foo(): StringWithProp {
const s: any = new String('abc');
s.groups = ['a', 'b', 'c'];
return s as StringWithProp;
}
function test(s: string) {
console.log(s, `${s}`);
}
const bar = foo();
console.log(bar); // String: "abc"
console.log(bar.groups); // ["a", "b", "c"]
console.log(typeof bar === 'string'); // false
test(bar); // String: "abc", "abc"
That's not quite a solution as In that case, there doesn't seem to be a backward compatible way to implement this simply - so much for the low hanging fruit option. I don't see another real solution beyond returning a compound type in the event that a
// Returns a string if there are no capture groups
const SimpleRegExp = type("/^abcd$/");
const value = SimpleRegExp.assert("abcd"); // 'abcd'
// Returns a tuple containing the captured groups
const GroupedRegExp = type("/^(\\d{1,3}[A-Z])? ([A-Za-z ']+), ([A-Za-z \\-]+), ([A-Z]{2}\\d{1,2} \\d{1,2}[A-Z]{2})$/");
const [address, houseNumber, road, city, postCode] = GroupedRegExp.assert("221B Baker Street, London, NW1 6XE");
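The behavior proposed above can be approximated today with plain RegExp.prototype.exec. The following is only a sketch under stated assumptions: assertMatch and MatchResult are hypothetical names, not ArkType API, and the helper works on RegExp objects rather than ArkType's string syntax. It returns the bare string when the pattern has no capture groups and the full match array otherwise.

```typescript
// Hypothetical helper, not part of ArkType. An optional group that did not
// participate in the match is reported as undefined, hence the element type.
type MatchResult = string | (string | undefined)[];

function assertMatch(pattern: RegExp, input: string): MatchResult {
  // Note: with the g or y flag, exec starts at pattern.lastIndex.
  const match = pattern.exec(input);
  if (match === null) {
    throw new Error(`"${input}" does not match ${pattern}`);
  }
  // match.length is always 1 + the number of capture groups in the pattern.
  return match.length === 1 ? match[0] : Array.from(match);
}

const simple = assertMatch(/^abcd$/, "abcd");

const londonAddress =
  /^(\d{1,3}[A-Z])? ([A-Za-z ']+), ([A-Za-z \-]+), ([A-Z]{2}\d{1,2} \d{1,2}[A-Z]{2})$/;
const grouped = assertMatch(londonAddress, "221B Baker Street, London, NW1 6XE");

console.log(simple, grouped);
```

The union return type means every caller has to narrow between string and array before using the result, which is one practical cost of folding both behaviors into a single entry point.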
@Abion47 Yeah this is a whole mess, trust me you don't want to start turning It would need its own dedicated syntax, but TBH I'd probably go for that regardless because the use cases for transforming a string into groups and checking if it matches a pattern are pretty distinct and the API will likely be a lot clearer if it reflects that rather than trying to cram multiple results together.
How about this. What if in addition to
const LondonAddress = pattern({
// The presence of the regex property indicates that this is a RegExp powered pattern
regex: /^(\d{1,3}[A-Z])? ([A-Za-z ']+), ([A-Za-z \-]+), ([A-Z]{2}\d{1,2} \d{1,2}[A-Z]{2})$/,
// Optionally, the groups property assigns a type to each matched group
// If the type system's regex parsing can manage it, this could possibly be inferred
groups: [
// Should group 0 be required to be specified or should it be assumed to just be "string"?
"/^\\d{1,3}[A-Z]$/",
"/^[A-Za-z ']+$/",
"/^[A-Za-z \\-]+$/",
"/^[A-Z]{2}\\d{1,2} \\d{1,2}[A-Z]{2}$/",
],
});
const [address, houseNumber, road, city, postCode] = LondonAddress.assert("221B Baker Street, London, NW1 6XE");
@Abion47 Hmm, in the end it would just be a special syntax for a type with a morph attached (i.e. something like In general the philosophy has been to offer syntax + chainable methods for APIs like this rather than a new API entry point. Honestly, I'm very confident we can come up with a reasonable API for it, but I just don't want to think about it until someone is interested in implementing it since I need to focus on docs for now 😛 If you or someone else wants to take a stab at it though, let me know and I will commit to something.
Related to #695 and possibly related to #491 (specifically similar to the second half of this comment), it would be reasonable to assume that RegExp fields should be able to use capture groups and be able to propagate those groups to the result.
In practice, when a RegExp field is parsed, if the pattern contains capturing groups, those groups should be accessible from the resulting value as a tuple property groups. In order to maintain parity with vanilla JS RegExp and standard regular expression behavior, the first value of the tuple should be the entire matched string with the matched groups beginning at index 1.
Examples:
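For reference, the vanilla layout being mirrored can be seen with plain RegExp.prototype.exec. This is a sketch of standard JavaScript behavior only; the date pattern and variable names are illustrative assumptions and are not taken from the issue.

```typescript
// Standard JS behavior: index 0 holds the entire match,
// capture groups follow from index 1 in the order they open.
const isoDate = /^(\d{4})-(\d{2})-(\d{2})$/;
const execResult = isoDate.exec("2024-05-17");

// Copy into a plain array so only the positional entries remain.
const positional: string[] = execResult === null ? [] : Array.from(execResult);

const [whole, year, month, day] = positional;
console.log(whole, year, month, day);
```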