Unicode tokens in a character class #543

@NDietrich

I'm trying to match a string of printable Unicode characters, similar to what the [\w] character class does for ASCII. However, it seems that Unicode property escapes (such as \p{L}, the category covering all Unicode letters) are not supported inside a character class (this isn't specific to Nearley).

Can anyone suggest a way to match all printable Unicode characters, or specific Unicode property categories?

Conceptually, if I wanted a string of Unicode letters, I would want:
str -> [\p{L}]:+
(I know this doesn't work, but it highlights the string I want to match).
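To illustrate the matching behavior I'm after outside of Nearley, here is a minimal sketch in plain JavaScript. It assumes a runtime with ES2018 Unicode property escapes (for example Node.js 10 or later), where \p{L} is only recognized when the regex carries the u flag. The name isLetterString is just a hypothetical helper for this example, and nothing here says how Nearley itself compiles character classes.

```javascript
// Sketch only: plain JavaScript, not a Nearley grammar.
// Assumes ES2018 Unicode property escapes are available in the runtime.

// With the u flag, \p{L} inside a character class means "any Unicode letter".
const unicodeLetters = /^[\p{L}]+$/u;

// Without the u flag, \p is read as a literal "p", so this class
// only contains the characters p, {, L and }.
const withoutFlag = /^[\p{L}]+$/;

// Hypothetical helper: true if s consists solely of Unicode letters.
function isLetterString(s) {
  return unicodeLetters.test(s);
}

console.log(isLetterString("Gr\u00fc\u00dfe")); // Latin letters with diacritics
console.log(isLetterString("\u65e5\u672c\u8a9e")); // CJK letters
console.log(isLetterString("abc123")); // digits are not in \p{L}
console.log(withoutFlag.test("Gr\u00fc\u00dfe")); // no u flag, so no property match
```

If a missing u flag is the cause, that would explain why the escape appears unsupported inside a class, but I have not confirmed this is what happens in my setup.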

Thanks
