Proposal: Allow carriage return as valid newline sequence #837

@hukkin

Related to some of the discussion in #835

I'd like to propose adding a CR (carriage return) character that is not followed by an LF character to the list of allowed newline sequences. This is how CommonMark, for example, defines a line ending.

The reasoning for the change is:

  1. Avoid roundtrip instability when an LF is prefixed by more than one CR character. EDIT: Not an issue as per .abnf

  2. In Python (perhaps other languages too?) the standard way of opening files supports something called Universal Newline Support, meaning that LF, CR, and CRLF are all normalised to LF. This normalisation should NOT be used with the current TOML spec because:

    • CR characters in strings are converted to LFs, even though a parser is not allowed to do so. EDIT: Not an issue as per .abnf
    • Files where CRs are used as the newline sequence are parsed successfully even though they are invalid TOML (spec v1.0.0).

    As far as I know, all popular Python parsers and their documented APIs make the mistake of applying this newline normalisation.

Now for point 2, a valid counter-argument is to "just fix all Python parsers". 😄 However, perhaps other languages do similar normalisation by default, or we may otherwise agree that a spec change makes more sense here.
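
To illustrate point 2, here is a minimal sketch of the normalisation using only the Python standard library. The file contents and variable names are made up for the example, and no TOML parser is involved; it only shows what text a parser would receive depending on how the file is opened.

```python
import os
import tempfile

# A document that uses bare CR as its line separator
# (invalid TOML per spec v1.0.0). Contents are illustrative only.
raw = b"a = 1\rb = 2\r"

# Write the bytes untouched so no newline translation happens on output.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(raw)
    path = f.name

try:
    # Default text mode (newline=None): universal newlines are enabled,
    # so CR and CRLF are both translated to LF while reading.
    with open(path, encoding="utf-8") as f:
        normalised = f.read()

    # newline="" turns the translation off, so the text keeps its CRs.
    with open(path, encoding="utf-8", newline="") as f:
        untranslated = f.read()
finally:
    os.remove(path)

print(repr(normalised))    # 'a = 1\nb = 2\n'
print(repr(untranslated))  # 'a = 1\rb = 2\r'
```

A parser that is handed `normalised` can no longer tell that the file used bare CRs, so it accepts the document. Reading with `newline=""` (or in binary mode and decoding manually) would be one way for a parser to see the original line endings and reject them.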
