Commit 426ae67

Fix issues in adoc to HTML formatting found by Copilot review
Signed-off-by: James Ball <jameball@qti.qualcomm.com>
1 parent 36dcc26 commit 426ae67

6 files changed

Lines changed: 62 additions & 30 deletions

tests/norm-rule/expected/test-norm-rules.adoc

Lines changed: 3 additions & 2 deletions
@@ -12,13 +12,14 @@
 | [Zicsr, ABC] | Rule Instances
 | inside inline a| link:test.html#norm:inline[norm:inline]
 
-.2+| no_tag
-| Normative rule *without* tag/tags | Rule's 'summary' property
+.3+| no_tag
+| Normative rule *without* tag/tags and *nested **bold** cases*. | Rule's 'summary' property
 | This normative rule has no references to the standard. This should only be used in extraordinary circumstances.
 It does include a link to <<table1>> (another normative rule).
 Has basic adoc formatting such as *bold*, ita__lics__, `monospace`, 2^superscript^, ~subscript~, [.underline]#underline#,
 and &le; (Unicode text for less-than-equals-to) and &#8800; (Unicode decimal value for not-equal-to).
 | Rule's 'note' property
+| Let's try a nested *_bold italics_* case or all 3 *_`bold italic monospace`_* too. | Rule's 'description' property
 
 .1+| inline-with-hash
 | includes a hash # symbol. a| link:test.html#norm:inline-with-hash[norm:inline-with-hash]

tests/norm-rule/expected/test-norm-rules.html

Lines changed: 6 additions & 2 deletions
@@ -130,14 +130,18 @@ <h3>my-chapter_name</h3>
 <td><a href="test.html#norm:inline">norm:inline</a></td>
 </tr>
 <tr>
-<td rowspan=2 id="no_tag">no_tag</td>
-<td>Normative rule <b>without</b> tag/tags</td>
+<td rowspan=3 id="no_tag">no_tag</td>
+<td>Normative rule <b>without</b> tag/tags and <b>nested <b>bold</b> cases</b>.</td>
 <td>Rule's "summary" property</td>
 </tr>
 <tr>
 <td>This normative rule has no references to the standard. This should only be used in extraordinary circumstances.<br>It does include a link to <a href="#table1">table1</a> (another normative rule).<br>Has basic adoc formatting such as <b>bold</b>, ita<i>lics</i>, <code>monospace</code>, 2<sup>superscript</sup>, <sub>subscript</sub>, <span class="underline">underline</span>,<br>and &#8804; (Unicode text for less-than-equals-to) and &#8800; (Unicode decimal value for not-equal-to).<br></td>
 <td>Rule's "note" property</td>
 </tr>
+<tr>
+<td>Let's try a nested <b>_bold italics_</b> case or all 3 <b>_`bold italic monospace`_</b> too.</td>
+<td>Rule's "description" property</td>
+</tr>
 <tr>
 <td rowspan=1 id="inline-with-hash">inline-with-hash</td>
 <td>includes a hash # symbol.</td>

tests/norm-rule/expected/test-norm-rules.json

Lines changed: 2 additions & 1 deletion
@@ -23,8 +23,9 @@
 "name": "no_tag",
 "def_filename": "tests/norm-rule/test.yaml",
 "chapter_name": "my-chapter_name",
-"summary": "Normative rule *without* tag/tags",
+"summary": "Normative rule *without* tag/tags and *nested **bold** cases*.",
 "note": "This normative rule has no references to the standard. This should only be used in extraordinary circumstances.\nIt does include a link to <<table1>> (another normative rule).\nHas basic adoc formatting such as *bold*, ita__lics__, `monospace`, 2^superscript^, ~subscript~, [.underline]#underline#,\nand &le; (Unicode text for less-than-equals-to) and &#8800; (Unicode decimal value for not-equal-to).\n",
+"description": "Let's try a nested *_bold italics_* case or all 3 *_`bold italic monospace`_* too.",
 "tags": []
 },
 {
57 Bytes
Binary file not shown.

tests/norm-rule/test.yaml

Lines changed: 2 additions & 1 deletion
@@ -14,7 +14,8 @@ normative_rule_definitions:
     instances: [Zicsr, ABC]
     tag: "norm:inline"
   - name: no_tag
-    summary: Normative rule *without* tag/tags
+    summary: Normative rule *without* tag/tags and *nested **bold** cases*.
+    description: Let's try a nested *_bold italics_* case or all 3 *_`bold italic monospace`_* too.
     note: |
       This normative rule has no references to the standard. This should only be used in extraordinary circumstances.
       It does include a link to <<table1>> (another normative rule).

tools/create_normative_rules.rb

Lines changed: 49 additions & 24 deletions
@@ -568,16 +568,16 @@ module Adoc2HTML
 
   # Apply constrained formatting pair transformation
   # Single delimiter, bounded by whitespace/punctuation
-  # Matches: *text*, _text_, ^text^, ~text~
   # Example: "That is *strong* stuff!" or "This is *strong*!"
   #
   # @param text [String] The text to transform
-  # @param delimiter [String] The formatting delimiter (e.g., '*', '_', '^', '~')
+  # @param delimiter [String] The formatting delimiter (e.g., '*', '_', '`')
+  # @param recursive [Boolean] Whether to recursively process nested formatting
   # @yield [content] Block that transforms the captured content
   # @yieldparam content [String] The text between the delimiters
   # @yieldreturn [String] The transformed content
   # @return [String] The text with formatting applied
-  def constrained_format_pattern(text, delimiter, &block)
+  def constrained_format_pattern(text, delimiter, recursive: false, &block)
     escaped_delimiter = Regexp.escape(delimiter)
     # (?:^|\s) - start of line or space before
     # \K - keep assertion (excludes preceding pattern from match)
@@ -586,32 +586,41 @@ def constrained_format_pattern(text, delimiter, &block)
     # #{escaped_delimiter} - single closing mark
     # (?=[,;".?!\s]|$) - followed by punctuation, space, or end of line
     pattern = /(?:^|\s)\K#{escaped_delimiter}(\S(?:(?!\s).*?(?<!\s))?)#{escaped_delimiter}(?=[,;".?!\s]|$)/
-    text.gsub(pattern) { block.call($1) }
+    text.gsub(pattern) do
+      content = $1
+      # Recursively process nested formatting if enabled
+      content = convert_nested(content) if recursive
+      block.call(content)
+    end
   end
 
   # Apply unconstrained formatting pair transformation
   # Double delimiter, can be used anywhere
-  # Matches: **text**, __text__, ^^text^^, ~~text~~
   # Example: "Sara**h**" or "**man**ual"
   #
   # @param text [String] The text to transform
-  # @param delimiter [String] The formatting delimiter (e.g., '*', '_', '^', '~')
+  # @param delimiter [String] The formatting delimiter (e.g., '*', '_', '`')
+  # @param recursive [Boolean] Whether to recursively process nested formatting
   # @yield [content] Block that transforms the captured content
   # @yieldparam content [String] The text between the delimiters
   # @yieldreturn [String] The transformed content
   # @return [String] The text with formatting applied
-  def unconstrained_format_pattern(text, delimiter, &block)
+  def unconstrained_format_pattern(text, delimiter, recursive: false, &block)
     escaped_delimiter = Regexp.escape(delimiter)
     # #{escaped_delimiter}{2} - double opening mark
     # (.+?) - any text (non-greedy)
     # #{escaped_delimiter}{2} - double closing mark
     pattern = /#{escaped_delimiter}{2}(.+?)#{escaped_delimiter}{2}/
-    text.gsub(pattern) { block.call($1) }
+    text.gsub(pattern) do
+      content = $1
+      # Recursively process nested formatting if enabled
+      content = convert_nested(content) if recursive
+      block.call(content)
+    end
   end
 
   # Apply superscript/subscript formatting transformation
   # Single delimiter, can be used anywhere, but text must be continuous (no spaces)
-  # Matches: ^text^, ~text~ where text contains no spaces
   # Example: "2^32^" or "X~i~"
   #
   # @param text [String] The text to transform
@@ -625,26 +634,43 @@ def continuous_format_pattern(text, delimiter, &block)
     # #{escaped_delimiter} - single opening mark
     # (\S+?) - continuous non-space text (no spaces allowed)
     # #{escaped_delimiter} - single closing mark
+    # Note: Superscript/subscript don't support nesting in AsciiDoc
     pattern = /#{escaped_delimiter}(\S+?)#{escaped_delimiter}/
     text.gsub(pattern) { block.call($1) }
   end
 
-  # Convert bold notation: *foo* -> <b>foo</b>
-  def convert_bold(text)
-    text = constrained_format_pattern(text, "*") { |content| "<b>#{content}</b>" }
-    text = unconstrained_format_pattern(text, "*") { |content| "<b>#{content}</b>" }
+  # Convert nested formatting within already-captured content
+  # This processes innermost formatting first to support nesting
+  # For example: *_foo_* becomes <b><i>foo</i></b>
+  def convert_nested(text)
+    result = text.dup
+    # Process unconstrained first (double delimiters)
+    result = unconstrained_format_pattern(result, "*", recursive: true) { |content| "<b>#{content}</b>" }
+    result = unconstrained_format_pattern(result, "_", recursive: true) { |content| "<i>#{content}</i>" }
+    result = unconstrained_format_pattern(result, "`", recursive: true) { |content| "<code>#{content}</code>" }
+    # Then process constrained (single delimiters)
+    result = constrained_format_pattern(result, "*", recursive: true) { |content| "<b>#{content}</b>" }
+    result = constrained_format_pattern(result, "_", recursive: true) { |content| "<i>#{content}</i>" }
+    result = constrained_format_pattern(result, "`", recursive: true) { |content| "<code>#{content}</code>" }
+    result
   end
 
-  # Convert italics notation: _bar_ -> <i>bar</i>
-  def convert_italics(text)
-    text = constrained_format_pattern(text, "_") { |content| "<i>#{content}</i>" }
-    text = unconstrained_format_pattern(text, "_") { |content| "<i>#{content}</i>" }
+  # Convert unconstrained bold, italics, monospaces notation.
+  # For example, **foo**bar -> <b>foo</b>bar
+  # Supports nesting when recursive: true
+  def convert_unconstrained(text)
+    text = unconstrained_format_pattern(text, "*", recursive: true) { |content| "<b>#{content}</b>" }
+    text = unconstrained_format_pattern(text, "_", recursive: true) { |content| "<i>#{content}</i>" }
+    text = unconstrained_format_pattern(text, "`", recursive: true) { |content| "<code>#{content}</code>" }
   end
 
-  # Convert monospace notation: `zort` -> <code>zort</code>
-  def convert_monospace(text)
-    text = constrained_format_pattern(text, "`") { |content| "<code>#{content}</code>" }
-    text = unconstrained_format_pattern(text, "`") { |content| "<code>#{content}</code>" }
+  # Convert constrained bold, italics, monospaces notation.
+  # For example, *foo* -> <b>foo</b>
+  # Supports nesting when recursive: true
+  def convert_constrained(text)
+    text = constrained_format_pattern(text, "*", recursive: true) { |content| "<b>#{content}</b>" }
+    text = constrained_format_pattern(text, "_", recursive: true) { |content| "<i>#{content}</i>" }
+    text = constrained_format_pattern(text, "`", recursive: true) { |content| "<code>#{content}</code>" }
   end
 
   # Convert superscript notation: 2^32^ -> 2<sup>32</sup>
@@ -729,9 +755,8 @@ def convert_unicode_names(text)
   # Apply all format conversions (keeping numeric entities).
   def convert(text)
     result = text.dup
-    result = convert_bold(result)
-    result = convert_italics(result)
-    result = convert_monospace(result)
+    result = convert_unconstrained(result)
+    result = convert_constrained(result)
     result = convert_superscript(result)
     result = convert_subscript(result)
     result = convert_underline(result)
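
For readers who want to try the nesting idea outside the tool, here is a minimal standalone sketch. It is not the code from `tools/create_normative_rules.rb`: the module name `NestedFormatSketch`, the `TAGS` table, and the method names are hypothetical, and the sketch always recurses into captured content instead of taking a `recursive:` flag. It borrows only the two regex shapes shown in the diff and the ordering used there (unconstrained pairs first, then constrained pairs).

```ruby
# Hypothetical, simplified illustration of recursive pair formatting.
# Names here are assumptions for the example, not the tool's API.
module NestedFormatSketch
  # Delimiter -> HTML tag, in the order the diff processes them.
  TAGS = { "*" => "b", "_" => "i", "`" => "code" }.freeze

  # Constrained pair: single delimiter, preceded by start of line or a space,
  # followed by punctuation, a space, or end of line (regex shape from the diff).
  def self.constrained(text, delimiter, tag)
    d = Regexp.escape(delimiter)
    pattern = /(?:^|\s)\K#{d}(\S(?:(?!\s).*?(?<!\s))?)#{d}(?=[,;".?!\s]|$)/
    text.gsub(pattern) do
      content = Regexp.last_match(1) # capture before recursing
      "<#{tag}>#{convert(content)}</#{tag}>"
    end
  end

  # Unconstrained pair: doubled delimiter, allowed anywhere in the text.
  def self.unconstrained(text, delimiter, tag)
    d = Regexp.escape(delimiter)
    text.gsub(/#{d}{2}(.+?)#{d}{2}/) do
      content = Regexp.last_match(1) # capture before recursing
      "<#{tag}>#{convert(content)}</#{tag}>"
    end
  end

  # Unconstrained pairs first, then constrained pairs; each captured span is
  # converted again so inner delimiters become nested tags.
  def self.convert(text)
    result = text.dup
    TAGS.each { |delim, tag| result = unconstrained(result, delim, tag) }
    TAGS.each { |delim, tag| result = constrained(result, delim, tag) }
    result
  end
end

puts NestedFormatSketch.convert("*_foo_*")    # <b><i>foo</i></b>
puts NestedFormatSketch.convert("Sara**h**")  # Sara<b>h</b>
```

The captured text is read into a local variable before the recursive call, mirroring the `content = $1` step in the diff, so the inner `gsub` calls cannot disturb the outer match data.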
