Commit 4c18472

Document the semantics of annotation ordering
It's important to specify the way that annotations relate to the characters of the underlying string and each other. Along the way, it's also worth explaining the behaviour of the internal functions _clear_annotations_in_region! and _insert_annotations!.
1 parent 5c6245e commit 4c18472

1 file changed

Lines changed: 43 additions & 7 deletions

base/strings/annotated.jl

@@ -25,6 +25,17 @@ and a value (`Any`), paired together as a `Pair{Symbol, <:Any}`.
2525
Labels do not need to be unique, the same region can hold multiple annotations
2626
with the same label.
2727
28+
Code written for `AnnotatedString`s in general should conserve the following
29+
properties:
30+
- Which characters an annotation is applied to
31+
- The order in which annotations are applied to each character
32+
33+
Additional semantics may be introduced by specific uses of `AnnotatedString`s.
34+
35+
A corollary of these rules is that adjacent, consecutively placed, annotations
36+
with identical labels and values are equivalent to a single annotation spanning
37+
the combined range.
38+
2839
See also [`AnnotatedChar`](@ref), [`annotatedstring`](@ref),
2940
[`annotations`](@ref), and [`annotate!`](@ref).
3041
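Note on the hunk above: the corollary about adjacent annotations can be illustrated with a small standalone sketch. It does not use `AnnotatedString` itself; the `Ann` alias and the `merge_adjacent` helper are hypothetical names introduced only for illustration, and the annotation list is modelled as a plain vector of region/label-value tuples, as in the signatures shown in this diff.

```julia
# Hypothetical stand-in for the annotation tuples used in this file.
const Ann = Tuple{UnitRange{Int}, Pair{Symbol, Any}}

# Merge annotations that are consecutive in the list, have the same
# label and value, and cover directly adjacent regions.
function merge_adjacent(anns::Vector{Ann})
    merged = Ann[]
    for (region, labelval) in anns
        if !isempty(merged)
            (prev_region, prev_labelval) = last(merged)
            if prev_labelval == labelval && last(prev_region) + 1 == first(region)
                # Extend the previous annotation instead of adding a new one.
                merged[end] = (first(prev_region):last(region), labelval)
                continue
            end
        end
        push!(merged, (region, labelval))
    end
    merged
end

anns = Ann[(1:3, :face => :bold), (4:6, :face => :bold), (2:5, :face => :italic)]
merged = merge_adjacent(anns)
# merged == [(1:6, :face => :bold), (2:5, :face => :italic)]
```

Both lists apply the same annotations to the same characters in the same order, which is why they can be treated as equivalent under the documented rules.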
@@ -317,6 +328,9 @@ end
317328
318329
Annotate a `range` of `str` (or the entire string) with a labeled value (`label` => `value`).
319330
To remove existing `label` annotations, use a value of `nothing`.
331+
332+
The order in which annotations are applied to `str` is semantically meaningful,
333+
as described in [`AnnotatedString`](@ref).
320334
"""
321335
annotate!(s::AnnotatedString, range::UnitRange{Int}, @nospecialize(labelval::Pair{Symbol, <:Any})) =
322336
(_annotate!(s.annotations, range, labelval); s)
@@ -349,6 +363,9 @@ annotations that overlap with `position` will be returned.
349363
Annotations are provided together with the regions they apply to, in the form of
350364
a vector of region–annotation tuples.
351365
366+
In accordance with the semantics documented in [`AnnotatedString`](@ref), the
367+
order of annotations returned matches the order in which they were applied.
368+
352369
See also: `annotate!`.
353370
"""
354371
annotations(s::AnnotatedString) = s.annotations
@@ -483,6 +500,15 @@ function write(dest::AnnotatedIOBuffer, src::AnnotatedIOBuffer)
483500
nb
484501
end
485502

503+
"""
504+
_clear_annotations_in_region!(annotations::Vector{Tuple{UnitRange{Int}, Pair{Symbol, Any}}}, span::UnitRange{Int})
505+
506+
Erase the presence of `annotations` within a certain `span`.
507+
508+
This operates by removing all elements of `annotations` that are entirely
509+
contained in `span`, truncating ranges that partially overlap, and splitting
510+
annotations that subsume `span` so that they exist only on either side of `span`.
511+
"""
486512
function _clear_annotations_in_region!(annotations::Vector{Tuple{UnitRange{Int}, Pair{Symbol, Any}}}, span::UnitRange{Int})
487513
# Clear out any overlapping pre-existing annotations.
488514
filter!(((region, _),) -> first(region) < first(span) || last(region) > last(span), annotations)
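Note on the hunk above: the three cases named in the new docstring (remove, truncate, split) can be sketched as follows. This is a simplified, non-mutating illustration and not the body of `_clear_annotations_in_region!`; `Ann` and `clear_region_sketch` are hypothetical names, and the position at which split pieces land in the result is an assumption of this sketch.

```julia
# Hypothetical stand-in for the annotation tuples used in this file.
const Ann = Tuple{UnitRange{Int}, Pair{Symbol, Any}}

function clear_region_sketch(anns::Vector{Ann}, span::UnitRange{Int})
    result = Ann[]
    for (region, labelval) in anns
        if last(region) < first(span) || first(region) > last(span)
            # No overlap with `span`: keep the annotation unchanged.
            push!(result, (region, labelval))
        else
            # Keep whatever lies before `span` (truncation or left half of a split).
            if first(region) < first(span)
                push!(result, (first(region):first(span)-1, labelval))
            end
            # Keep whatever lies after `span` (truncation or right half of a split).
            if last(region) > last(span)
                push!(result, (last(span)+1:last(region), labelval))
            end
            # Annotations entirely contained in `span` contribute nothing.
        end
    end
    result
end

anns = Ann[(1:10, :a => 1), (3:5, :b => 2), (2:4, :c => 3), (5:8, :d => 4)]
cleared = clear_region_sketch(anns, 3:6)
# cleared == [(1:2, :a => 1), (7:10, :a => 1), (2:2, :c => 3), (7:8, :d => 4)]
```

Here `:a` subsumes the span and is split, `:b` is contained and removed, and `:c` and `:d` are truncated on the right and left respectively.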
@@ -508,14 +534,24 @@ function _clear_annotations_in_region!(annotations::Vector{Tuple{UnitRange{Int},
508534
annotations
509535
end
510536

537+
"""
538+
_insert_annotations!(io::AnnotatedIOBuffer, annotations::Vector{Tuple{UnitRange{Int}, Pair{Symbol, Any}}}, offset::Int = position(io))
539+
540+
Register new `annotations` in `io`, applying an `offset` to their regions.
541+
542+
This largely consists of simply shifting the regions of `annotations` by `offset`
543+
and pushing them onto `io`'s annotations. However, when it is possible to merge
544+
the new annotations with recent annotations in accordance with the semantics
545+
outlined in [`AnnotatedString`](@ref), we do so. More specifically, when there
546+
is a run of the most recent annotations that are also present as the first
547+
`annotations`, with the same value and adjacent regions, the new annotations are
548+
merged into the existing recent annotations by simply extending their range.
549+
550+
This is implemented so that one can, say, write an `AnnotatedString` to an
551+
`AnnotatedIOBuffer` one character at a time without needlessly producing a
552+
new annotation for each character.
553+
"""
511554
function _insert_annotations!(io::AnnotatedIOBuffer, annotations::Vector{Tuple{UnitRange{Int}, Pair{Symbol, Any}}}, offset::Int = position(io))
512-
# The most basic (but correct) approach would be just to push
513-
# each of `annotations` to `io.annotations`, adjusting the region by
514-
# `offset`. However, there is a specific common case probably worth
515-
# optimising, which is when existing styles are just extended.
516-
# To handle this efficiently and conservatively, we look to see if
517-
# there's a run at the end of `io.annotations` that matches annotations
518-
# at the start of `annotations`. If so, this run of annotations is merged.
519555
run = 0
520556
if !isempty(io.annotations) && last(first(last(io.annotations))) == offset
521557
for i in reverse(axes(annotations, 1))
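Note on the hunk above: the merge-on-insert behaviour described in the new `_insert_annotations!` docstring can be sketched on plain vectors. This is a simplified illustration under stated assumptions, not the function from this commit: `Ann` and `insert_sketch!` are hypothetical names, the annotation list is passed directly rather than through an `AnnotatedIOBuffer`, and `offset` plays the role of the buffer position.

```julia
# Hypothetical stand-in for the annotation tuples used in this file.
const Ann = Tuple{UnitRange{Int}, Pair{Symbol, Any}}

function insert_sketch!(existing::Vector{Ann}, new::Vector{Ann}, offset::Int)
    # Find the longest run where the last `k` existing annotations match the
    # first `k` new ones: same label and value, old region ending at `offset`,
    # new region starting at 1 (so the two regions are adjacent after shifting).
    run = 0
    for k in min(length(existing), length(new)):-1:1
        matches = all(1:k) do j
            (old_region, old_lv) = existing[end - k + j]
            (new_region, new_lv) = new[j]
            last(old_region) == offset && first(new_region) == 1 && old_lv == new_lv
        end
        if matches
            run = k
            break
        end
    end
    # Extend the matched existing annotations rather than adding new ones.
    for j in 1:run
        i = length(existing) - run + j
        (old_region, lv) = existing[i]
        existing[i] = (first(old_region):(last(new[j][1]) + offset), lv)
    end
    # Push the remaining new annotations, shifted by `offset`.
    for j in run+1:length(new)
        (region, lv) = new[j]
        push!(existing, ((first(region) + offset):(last(region) + offset), lv))
    end
    existing
end

# Simulate writing three annotated characters one at a time.
buf = Ann[]
insert_sketch!(buf, Ann[(1:1, :face => :bold)], 0)
insert_sketch!(buf, Ann[(1:1, :face => :bold)], 1)
insert_sketch!(buf, Ann[(1:1, :face => :italic)], 2)
# buf == [(1:2, :face => :bold), (3:3, :face => :italic)]
```

The two bold characters end up sharing one annotation, while the italic character gets its own, matching the motivation given in the docstring for writing character by character.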
