Pattern matching spans of chars against constant strings #1351

@brianrourkeboll

Proposal

I propose that we allow pattern-matching values of type ReadOnlySpan<char> and Span<char> against constant strings.

match "abc123".AsSpan 1 with
| "bc123"  -> printfn "Matches."
| "abc123" -> printfn "Doesn't match."
| "a"      -> printfn "Doesn't match."
| _        -> printfn "Doesn't match."

let [<Literal>] SomeConstant = "abc123"

let f (span : ReadOnlySpan<char>) =
    match span with
    | SomeConstant -> printfn "Matches."
    | _            -> printfn "Doesn't match."

let span = "abc123xyz".AsSpan ()
let lastDigit = span.LastIndexOfAnyInRange ('0', '9')

match span.Slice (lastDigit + 1) with
| ""     -> printfn "Ends with a digit."
| suffix -> printfn $"Ends with '{suffix.ToString ()}'."

Compare the equivalent C# proposal (implemented in C# 11).

We would do this by first automatically generating calls to System.MemoryExtensions.AsSpan on a constant string being matched against, and then by passing the resulting span and the input span to the appropriate System.MemoryExtensions.SequenceEqual overload.
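
As a rough illustration of that lowering, the first example above might behave like the hand-written code below. This is only a sketch: `describe` is a hypothetical name, and the code an actual implementation generates could differ, for instance in how it orders or shares the comparisons.

```fsharp
open System

// Hypothetical hand-written equivalent of the proposed
//     match span with
//     | "bc123"  -> "Matches."
//     | "abc123" -> "Doesn't match."
//     | _        -> "Doesn't match."
// Each constant string is converted with AsSpan and then compared with the
// input span through the MemoryExtensions.SequenceEqual extension method.
let describe (span: ReadOnlySpan<char>) =
    if span.SequenceEqual ("bc123".AsSpan ()) then "Matches."
    elif span.SequenceEqual ("abc123".AsSpan ()) then "Doesn't match."
    else "Doesn't match."
```

With this definition, `describe ("abc123".AsSpan 1)` evaluates to "Matches.", since the slice starting at index 1 is "bc123".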

The existing ways of approaching this problem in F# are:

  • Use string slicing instead of span slicing. This allocates a new string.

    match "abc123"[1..] with // Or "abc123".Substring 1, etc.
    | "bc123"  -> printfn "Matches."
    | "abc123" -> printfn "Doesn't match."
    | "a"      -> printfn "Doesn't match."
    | _        -> printfn "Doesn't match."
  • Use a series of if-then-else expressions instead of pattern-matching.

    let slice = "abc123".AsSpan 1
    
    if slice.SequenceEqual "bc123" then
        printfn "Matches."
    elif slice.SequenceEqual "abc123" then
        printfn "Doesn't match."
    elif slice.SequenceEqual "a" then
        printfn "Doesn't match."
    else
        printfn "Doesn't match."
  • Define and use an active pattern like:

    [<return: Struct>]
    let (|SequenceEqual|_|) (string: string) (span: ReadOnlySpan<char>) =
        if span.SequenceEqual string then ValueSome SequenceEqual
        else ValueNone
    
    match "abc123".AsSpan 1 with
    | SequenceEqual "bc123" -> "Matches."
    | SequenceEqual "abc123" -> "Doesn't match."
    | SequenceEqual "a" -> "Doesn't match."
    | _ -> "Doesn't match."

    I almost didn't include this option, because I'd rather not have to think of it, and (in my experience) most people would not. Even if an equivalent active pattern were included in FSharp.Core, it would not be very discoverable.

Notes

  • We would not change the type inference rules for string patterns — the inferred type for the pattern input of a string literal pattern would still be string.

  • As in the C# implementation, I think we would want a compile-time error for matching against null:

    let [<Literal>] NullLiteral : string = null
    
    match "abc123".AsSpan (2, 3) with
    | (null : string) (* This should be an error at build-time. *) -> …
    | NullLiteral (* This should be an error at build-time. *) -> …
    | _ -> …
  • Any named pattern bindings should still have the input (span) type:

    match "asdf".AsSpan () with
    | "asdf" &  span -> printfn "`span` is a `ReadOnlySpan<char>` here."
    | "asdf" as span -> printfn "`span` is a `ReadOnlySpan<char>` here."
    | span           -> printfn "`span` is a `ReadOnlySpan<char>` here."
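
To make the first note concrete: a function whose only type information comes from a string literal pattern is inferred to take a string today, and under the proposal that would not change. Matching a span would rely on the input already being known to be a span. In the sketch below, `isAsdf` is a hypothetical name, and the commented-out function shows the proposed syntax, which does not compile today.

```fsharp
// Nothing but the literal pattern constrains `s`, so it is inferred as string.
// This compiles today, and the proposal would leave the inference unchanged.
let isAsdf s =
    match s with
    | "asdf" -> true
    | _ -> false

// Proposed syntax (does not compile today): the span type has to come from
// somewhere else, such as an explicit annotation on the parameter.
//
// let isAsdfSpan (s: System.ReadOnlySpan<char>) =
//     match s with
//     | "asdf" -> true
//     | _ -> false
```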

Pros and cons

The advantages of making this adjustment to F# are

  • It brings the ergonomics of working with spans closer to the ergonomics already enjoyed by strings. This would help improve the legibility of performance-sensitive code that uses spans to avoid unnecessary string copying and allocations.

The disadvantages of making this adjustment to F# are

  • None, except that we're moving the complexity from the programmer to the language.

Alternatives

  • Type-directed […] syntax (#1086, "Type-directed resolution of [ .. ] syntax") could theoretically subsume this, although when the underlying thing being sliced is a string, I think allowing matching against string literals still has a readability advantage. C# supports both, for what it's worth:

    _ = "abc123".AsSpan(1, 3) switch
    {
        "bc1" => "Matches.",
        ['b', 'c', '1'] => "Also matches.",
        _ => "…"
    };

Extra information

Estimated cost (XS, S, M, L, XL, XXL):

  • S — to the best of my understanding, since such code does not currently compile, and since I don't know of anything that making this addition to the compiler would later preclude.

Related suggestions: (put links to related suggestions here)

Affidavit (please submit!)

Please tick these items by placing a cross in the box:

  • This is not a question (e.g. like one you might ask on StackOverflow) and I have searched StackOverflow for discussions of this issue
  • This is a language change and not purely a tooling change (e.g. compiler bug, editor support, warning/error messages, new warning, non-breaking optimisation) belonging to the compiler and tooling repository
  • This is not something which has obviously "already been decided" in previous versions of F#. If you're questioning a fundamental design decision that has obviously already been taken (e.g. "Make F# untyped") then please don't submit it
  • I have searched both open and closed suggestions on this site and believe this is not a duplicate

Please tick all that apply:

  • This is not a breaking change to the F# language design
  • I or my company would be willing to help implement and/or test this

For readers

If you would like to see this issue implemented, please click the 👍 emoji on this issue. These counts are used to generally order the suggestions by engagement.
