parse and serialize with dot in term name #718
base: main
| ## Serializer flags | ||
| The Turtle/N3/JSON‑LD serializers accept an optional flags string to tweak output formatting and abbreviation behavior. |
| The Turtle/N3/JSON‑LD serializers accept an optional flags string to tweak output formatting and abbreviation behavior. | |
| The Turtle/N3/JSON‑LD serializers accept an optional `flags` string to tweak output formatting and abbreviation behavior. |
| - `s` `i` – used by default for Turtle to suppress `=`, `=>` notations | ||
| - `d e i n p r s t u x` – used for N-Triples/N-Quads to simplify output | ||
| - `dr` – used with JSON‑LD conversion (no default, no relative prefix) |
Just looking for a confirmation that dr is a two-character flag
| - `o` – new: do not abbreviate to a prefixed name when the local part contains a dot. This keeps IRIs like | ||
| `http://foo.test/ns/subject.example` in `<...>` form instead of `ns:subject.example`. |
Changed to match line 6 of scratch-serialize.js.
| `http://foo.test/ns/subject.example` in `<...>` form instead of `ns:subject.example`. | |
| `http://example.org/ns/subject.example` in `<...>` form instead of `ns:subject.example`. |
| const base = 'http://example.com/'; | ||
| const doc = $rdf.sym(base + 'doc'); | ||
| // A URI in a different namespace so it can abbreviate to a prefix | ||
| const other = 'http://foo.test/ns/subject.example'; |
The different namespace should also be in the reserved TLDs.
| const other = 'http://foo.test/ns/subject.example'; | |
| const other = 'http://example.org/ns/subject.example'; |
| documentString = sz.statementsToN3(newSts) | ||
| return executeCallback(null, documentString) | ||
| case NTriplesContentType: | ||
| sz.setFlags('deinprstux') // Suppress nice parts of N3 to make ntriples |
Should this be deinprstux or d e i n p r s t u x?
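On the spelling question, a minimal sketch may help. It assumes each flag is tested by single-character membership in the flags string; that is an assumption about the serializer, not something this thread confirms. `hasFlag` is a hypothetical helper, not part of the library. Under that assumption `'deinprstux'` and `'d e i n p r s t u x'` would behave the same, and `dr` would read as `d` plus `r`.

```javascript
// Hypothetical helper, for illustration only.
// Assumption: a flag counts as set when its character occurs anywhere in the string.
function hasFlag(flags, ch) {
  return (flags || '').indexOf(ch) >= 0;
}

// Under that assumption, spaces between flag letters are simply ignored.
function sameFlagSet(a, b, alphabet) {
  return alphabet.split('').every((ch) => hasFlag(a, ch) === hasFlag(b, ch));
}
```

If the real implementation parses flags differently (for example by splitting on whitespace), the two spellings could diverge, so the README and the code should agree on one form.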
| // Allows dots inside the local name but not as trailing character | ||
| // Also allows empty local names (for URIs ending in / or #) | ||
| isValidPNLocal(local) { | ||
| // Empty local name is valid (e.g., ex: for http://example.com/) |
Note that resolution of this "empty" local name is determined server-side. It often defaults to index.html, but not always. This is configurable in Apache and some (most? all?) other HTTP servers, so it cannot be relied upon without out-of-band communications. Configured local name can be things like index.php.
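The rule stated in the diff comments above (dots allowed inside a local name, never as the trailing character, empty local names accepted) can be sketched as follows. This is a simplified, hypothetical check, not the `isValidPNLocal` from the PR: it accepts only ASCII letters, digits, `_`, `-` and inner dots, which is narrower than the full Turtle 1.1 `PN_LOCAL` production.

```javascript
// Hypothetical, simplified local-name check; NOT the code from src/serializer.js.
// Assumption: ASCII only, no ':' and no escapes, unlike the full PN_LOCAL grammar.
function isValidLocalSketch(local) {
  if (local === '') return true; // empty local name, e.g. "ex:"
  // The first character may not be '.' or '-'; the last may not be '.'.
  return /^[A-Za-z0-9_]([A-Za-z0-9_.-]*[A-Za-z0-9_-])?$/.test(local);
}
```

For example, `isValidLocalSketch('subject.example')` is accepted while `isValidLocalSketch('subject.')` is rejected because of the trailing dot.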
Fix Turtle parser/serializer handling of dots in terms
Summary
Fixes #601 - Parser and serializer now correctly handle dots within local names per Turtle 1.1 specification.
Problem
The Turtle parser incorrectly treated dots (`.`) inside local names as path operators or statement terminators, breaking valid Turtle that uses dotted local names such as `ex:subject.example`.

The serializer also failed to produce spec-compliant abbreviated forms, outputting `<http://example.com/subject.example>` instead of the valid `ex:subject.example`. It also occasionally created a spurious `loc:` prefix for base‑relative IRIs like `</results.ttl>`.

Solution
Parser (`src/n3parser.js`)
- `dotTerminatesName(str, i)` helper to centralize logic across 3 call sites
- `wsOrHash` regex at module level for performance
- `node()` leaves `.` for unified `checkDot()` handling

Serializer (`src/serializer.js`)
- `isValidPNLocal(local)` validator per Turtle 1.1 spec
- `ex:subject.example` ✅
- `ex:subject.` → `<http://example.com/subject.>` ✅
- `ex:` for URIs ending in `/` or `#` ✅
- `loc:` with `</>` ✅
- `o`: do not abbreviate to a prefixed name when the local part contains a dot (opt‑out of dotted locals)
- `serialize(doc, kb, doc.value, 'text/turtle', undefined, { flags: 'o' })`
- `'p'` still disables all QName abbreviation

Tests (`tests/unit/dot-in-term-test.ts`)
- `ex:subject.example`
- `[] :loves [] .`
- `ex:subject.example` (not `<URI>`)
- `/`
- `<IRI>` instead of `prefix:local`

Results
Behavior Change
- Before: Conservative approach; URIs with dots serialized as `<IRI>`.
- After (default): Spec-compliant; valid qnames with dots abbreviated (e.g., `ex:subject.example`).
- Opt-out: Pass `'o'` in `flags` to keep dotted locals as `<IRI>`.

This produces more compact, spec‑correct Turtle by default while offering a simple opt‑out for legacy expectations.
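The before/after/opt-out behavior can be illustrated with a small, self-contained sketch. `abbreviateSketch` and `localIsSafe` are hypothetical names, the local-name check is a simplified ASCII-only stand-in for the Turtle 1.1 rule, and the flag handling assumes single-character membership; none of this is the actual serializer code.

```javascript
// Hypothetical sketch of the abbreviation decision; not the code in src/serializer.js.
// Assumption: ASCII-only local names, flags tested by character membership.
function localIsSafe(local) {
  return local === '' ||
    /^[A-Za-z0-9_]([A-Za-z0-9_.-]*[A-Za-z0-9_-])?$/.test(local);
}

function abbreviateSketch(iri, prefixes, flags = '') {
  // 'p' disables all prefixed-name abbreviation.
  if (flags.includes('p')) return `<${iri}>`;
  for (const [prefix, ns] of Object.entries(prefixes)) {
    if (!iri.startsWith(ns)) continue;
    const local = iri.slice(ns.length);
    // 'o' opts out of abbreviating local names that contain a dot.
    const dottedOptOut = flags.includes('o') && local.includes('.');
    if (localIsSafe(local) && !dottedOptOut) return `${prefix}:${local}`;
  }
  return `<${iri}>`; // fall back to the full IRI form
}
```

With `{ ex: 'http://example.com/' }` as the prefix map, `http://example.com/subject.example` comes out as `ex:subject.example` by default and as `<http://example.com/subject.example>` when `flags` contains `o`.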
Files Changed
- `src/n3parser.js`: Parser fix, helper, shared regex
- `src/serializer.js`: PN_LOCAL validator; improved splitting logic; base‑dir guard; `'o'` flag support
- `src/serialize.ts`: Merge user flags with defaults so `'o'` is honored in Turtle/JSON‑LD
- `tests/unit/dot-in-term-test.ts`: Added tests including `'o'` flag case
- `tests/serialize/data.js`: Pass `'o'` for Turtle fixture generation; preserve RDF/XML behavior
- `README.md`: Document serializer flags and `'o'` usage
- `changes.txt`: Changelog entry
- `lib/*`: Transpiled updates kept in sync (for repo consumers)

References
- `.` in suffix #601
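
For readers who want a feel for the parser-side rule, here is a minimal sketch of the kind of check a helper like `dotTerminatesName(str, i)` might perform. The body is an assumption, not the PR's code: it treats a dot as a terminator only when it is followed by end of input, whitespace, or a `#` comment, and ignores other cases such as a dot before `]` or `}`.

```javascript
// Hypothetical sketch; the real helper in src/n3parser.js may differ.
// Compiled once at module level, so it is not rebuilt on every call.
const wsOrHash = /[\s#]/;

// Returns true when the '.' at index i ends the name instead of continuing it.
function dotTerminatesNameSketch(str, i) {
  const next = str.charAt(i + 1); // '' when i is the last index
  return next === '' || wsOrHash.test(next);
}
```

In `ex:subject.example` the dot is followed by `e`, so it stays part of the local name; in `ex:s ex:p ex:o.` the final dot is followed by end of input, so it terminates the statement.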