Skip to content

Account for \n and <br> appearing in same string#27

Merged
jonw-cogapp merged 2 commits intomainfrom
linebreaks-sometimes-removed-186906726
Feb 9, 2024
Merged

Account for \n and <br> appearing in same string#27
jonw-cogapp merged 2 commits intomainfrom
linebreaks-sometimes-removed-186906726

Conversation

@jonw-cogapp
Copy link
Collaborator

@jonw-cogapp jonw-cogapp commented Feb 8, 2024

What does this Pull Request do?

The regular expression in IIIFSaysThisIsHTML returned a false negative if \n appeared inside a HTML string owing to . not capturing whitespace characters.

For reference:
. = any character
\s = any whitespace character
\S = any non-whitespace character

As a result this issue only effects manifests which mix HTML and newline characters (which is technically incorrect. The newlines should be ignored in a HTML context, but should be included in the regex in order to match correctly).

We noticed this in manifests that came out of StoriiiesEditor, but that has been patched separately to replace \n with <br> (these were statically wrapped in a <p> tag, qualifying them as HTML).

I also removed the m flag a this shouldn't be needed.

N.B.
I tried to see if this would be easy enough to add a new test fixture and assertion in Cypress, but it seems to have trouble comparing the fixture value when mixing HTML and plain text newline characters

Test coverage

Yes/No: Are changes in this pull-request covered by:

  • Cypress tests?

Interested parties

@tristanr-cogapp

@jonw-cogapp jonw-cogapp force-pushed the linebreaks-sometimes-removed-186906726 branch from b8581b9 to d7eba8e Compare February 8, 2024 13:13
It created a false _negative_ if `\n` appeared in a HTML string
The `m` flag shouldn't be necessary in json context. I don't _think_ this was contributing to the issue, but it feels correct to remove it.
@jonw-cogapp jonw-cogapp force-pushed the linebreaks-sometimes-removed-186906726 branch from d7eba8e to dbb783b Compare February 8, 2024 13:28
@jonw-cogapp jonw-cogapp changed the base branch from develop to main February 9, 2024 10:06
@neilh-cogapp
Copy link

@jonw-cogapp The change looks good, would it be possible to add a sample test fixture or some example string values that should / shouldn't be caught by the update?

@jonw-cogapp
Copy link
Collaborator Author

@neilh-cogapp Thanks. Looking at this again today I saw a way to test the behaviour using Cypress. We should expect that anything that is HTML according to that regex shouldn't have any \n's in its value converted to <br>

@jonw-cogapp jonw-cogapp merged commit 20432d6 into main Feb 9, 2024
@jonw-cogapp jonw-cogapp deleted the linebreaks-sometimes-removed-186906726 branch February 9, 2024 13:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants