Skip to content

Conversation

@rxwei
Copy link
Contributor

@rxwei rxwei commented Dec 20, 2021

Produce capture structures for regex literal type inference.

Introduce CaptureStructure, a tree representing the structure (i.e. not the value) of captures. This will be used by the compiler to infer the capture types of a regular expression literal.

libswiftParseRegexLiteral(_:_:_:_:_:) encodes a capture structure to a byte sequence to hand it off to C++. The Swift type checker deserializes it to form actual Swift types. The serialization format is as follows:

encode(〚`T`〛) ==> <version>, 〚`T`〛, .end 〚`T` (atom)〛 ==> .atom 〚`name: T` (atom)〛 ==> .atom, `name`, '\0' 〚`[T]`〛 ==> 〚`T`〛, .formArray 〚`T?`〛 ==> 〚`T`〛, .formOptional 〚`(T0, T1, ...)`〛 ==> .beginTuple, 〚`T0`〛, 〚`T1`〛, ..., .endTuple 
@rxwei rxwei force-pushed the capture-structure branch from 635a939 to cc063b4 Compare December 20, 2021 09:59
@rxwei
Copy link
Contributor Author

rxwei commented Dec 20, 2021

@swift-ci please test linux

@rxwei rxwei force-pushed the capture-structure branch 2 times, most recently from 7f67a6f to dc70859 Compare December 22, 2021 09:40
@rxwei rxwei changed the title Produce capture structure trees during parsing for type inference. Produce capture structures for regex literal type inference. Dec 22, 2021
@rxwei rxwei force-pushed the capture-structure branch from dc70859 to c6c7cf7 Compare December 22, 2021 10:01
Introduce `CaptureStructure`, a tree representing the structure (i.e. not the value) of captures. This will be used by the compiler to infer the capture types of a regular expression literal. `libswiftParseRegexLiteral(_:_:_:_:_:)` encodes a capture structure to a byte sequence to hand it off to C++. The Swift type checker deserializes it to form actual Swift types. The serialization format is as follows: ``` encode(〚`T`〛) ==> <version>, 〚`T`〛, .end 〚`T` (atom)〛 ==> .atom 〚`name: T` (atom)〛 ==> .atom, `name`, '\0' 〚`[T]`〛 ==> 〚`T`〛, .formArray 〚`T?`〛 ==> 〚`T`〛, .formOptional 〚`(T0, T1, ...)`〛 ==> .beginTuple, 〚`T0`〛, 〚`T1`〛, ..., .endTuple ```
@rxwei rxwei force-pushed the capture-structure branch from c6c7cf7 to 9aaacb4 Compare December 22, 2021 10:11
@rxwei
Copy link
Contributor Author

rxwei commented Dec 22, 2021

@swift-ci please test Linux

rxwei added a commit to rxwei/swift that referenced this pull request Dec 22, 2021
When parsing a regular expression literal, accept a serialized capture structure from the regex parser. During type checking, decode it and form Swift types. Examples: ```swift '/(.)(.)/' // ==> `Regex<(Substring, Substring)>` '/(?<label>.)(.)/' // ==> `Regex<(label: Substring, Substring)` '/((.))*((.)?)/' //==> `Regex<([Substring], [Substring], Substring, Substring?)>` ``` Also: - Fix a bug where a regex literal parsing error is not returning an error parser result. Note: - This needs to land after swiftlang/swift-experimental-string-processing#92 and after `dev/4` tag has been created. - See swiftlang/swift-experimental-string-processing#92 for regex parser changes and the capture structure encoding. - The `RegexLiteralParsingFn` `CaptureStructureOut` pointer type change from `char *` to `void *` will not break builds due to implicit pointer conversion (SE-0324) and unchanged ABI. Resolves rdar://83253511.
@rxwei rxwei merged commit ed05306 into swiftlang:main Dec 22, 2021
@rxwei rxwei deleted the capture-structure branch December 22, 2021 23:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

3 participants