Commit 5793ad5

fix: Support URL extraction from markdown link syntax
Fixes #55849

Update URL_REGEX to match URLs embedded in markdown link syntax [label](url). Previously, the regex required whitespace before/after URLs, which didn't match markdown links where URLs are preceded by ]( and followed by ).

Also support square brackets in URL query parameters (e.g., ?foo[bar]=baz).

Changes:
- Add \]\( as valid URL prefix (markdown link start)
- Add \) as valid URL suffix (markdown link end)
- Add support for [ ] characters in URL paths
- Update both backend (IURLGenerator.php) and frontend (comments.js) regex
- Fix ReferenceManager to extract clean URLs using capture groups
- Maintain backward compatibility with plain URL extraction

Signed-off-by: Alexander Askin <[email protected]>
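
To make the motivation concrete, here is a minimal sketch that runs the old and new frontend patterns (copied from the comments.js hunk of this commit) against a markdown link and a plain URL. The helper `countUrls` and the sample strings are illustrative assumptions, not part of the commit.

```javascript
// Both patterns are copied from the comments.js hunk of this commit.
const oldUrlRegex = /(\s|^)(https?:\/\/)([-A-Z0-9+_.]+(?::[0-9]+)?(?:\/[-A-Z0-9+&@#%?=~_|!:,.;()]*)*)(\s|$)/ig
const newUrlRegex = /(\s|^|\]\()(https?:\/\/)([-A-Z0-9+_.]+(?::[0-9]+)?(?:\/[-A-Z0-9+&@#%?=~_|!:,.;\[\()]*)*)(\s|$|\))/ig

// Hypothetical helper: count how many URLs a pattern finds in a string.
function countUrls(regex, text) {
	return [...text.matchAll(regex)].length
}

// Sample inputs (assumptions for illustration only).
const markdown = 'See [the docs](https://example.com) for details.'
const plain = 'See https://example.com for details.'

console.log(countUrls(oldUrlRegex, markdown)) // 0: "(" is not an accepted prefix
console.log(countUrls(newUrlRegex, markdown)) // 1: "](" prefix and ")" suffix now match
console.log(countUrls(oldUrlRegex, plain)) // 1
console.log(countUrls(newUrlRegex, plain)) // 1: plain URLs still match
```

The old pattern only accepts whitespace or the start of the string before `https?://`, so the `(` of a markdown link blocks the match; the added `\]\(` and `\)` alternatives are what let the second call succeed.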
1 parent 095e470 commit 5793ad5

3 files changed

Lines changed: 17 additions & 6 deletions

core/src/OCP/comments.js

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ import $ from 'jquery'
  *
  * This is a copy of the backend regex in IURLGenerator, make sure to adjust both when changing
  */
-const urlRegex = /(\s|^)(https?:\/\/)([-A-Z0-9+_.]+(?::[0-9]+)?(?:\/[-A-Z0-9+&@#%?=~_|!:,.;()]*)*)(\s|$)/ig
+const urlRegex = /(\s|^|\]\()(https?:\/\/)([-A-Z0-9+_.]+(?::[0-9]+)?(?:\/[-A-Z0-9+&@#%?=~_|!:,.;\[\()]*)*)(\s|$|\))/ig
 
 /**
  * @param {any} content -

lib/private/Collaboration/Reference/ReferenceManager.php

Lines changed: 14 additions & 4 deletions
@@ -51,10 +51,20 @@ public function __construct(
 	 */
 	public function extractReferences(string $text): array {
 		preg_match_all(IURLGenerator::URL_REGEX, $text, $matches);
-		$references = $matches[0] ?? [];
-		return array_map(function ($reference) {
-			return trim($reference);
-		}, $references);
+		// Use capture groups 2 (protocol) and 3 (domain/path) to extract clean URLs
+		// This excludes the prefix group 1 (\s|\n|^|\]\() and suffix group 4 (\s|\n|$|\))
+		$references = [];
+		if (!empty($matches[1]) && !empty($matches[2]) && !empty($matches[3])) {
+			for ($i = 0; $i < count($matches[2]); $i++) {
+				$url = $matches[2][$i] . $matches[3][$i];
+				// If the URL was in markdown syntax [](url), remove the trailing )
+				if ($matches[1][$i] === '](') {
+					$url = rtrim($url, ')');
+				}
+				$references[] = $url;
+			}
+		}
+		return $references;
 	}
 
 	/**
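
The capture-group extraction in this hunk can be sketched in JavaScript using the frontend copy of the pattern. This is a hypothetical port for illustration, not code from the commit; it assumes the frontend regex behaves like the backend `URL_REGEX` for these inputs, and the destructured names `prefix`, `protocol`, and `hostAndPath` are made up here.

```javascript
// Pattern copied from the comments.js hunk of this commit (frontend copy of the backend regex).
const URL_REGEX = /(\s|^|\]\()(https?:\/\/)([-A-Z0-9+_.]+(?::[0-9]+)?(?:\/[-A-Z0-9+&@#%?=~_|!:,.;\[\()]*)*)(\s|$|\))/ig

// Hypothetical JavaScript port of the patched extractReferences(); not part of the commit.
function extractReferences(text) {
	const references = []
	for (const match of text.matchAll(URL_REGEX)) {
		// Group 1 is the prefix, group 2 the protocol, group 3 the host and path.
		const [, prefix, protocol, hostAndPath] = match
		let url = protocol + hostAndPath
		// The path class accepts ")", so a greedy match can swallow the closing
		// parenthesis of a markdown link; strip it, mirroring rtrim($url, ')').
		if (prefix === '](') {
			url = url.replace(/\)+$/, '')
		}
		references.push(url)
	}
	return references
}

console.log(extractReferences('Plain https://example.com/a and [md](https://example.org/b) links'))
// [ 'https://example.com/a', 'https://example.org/b' ]
```

Building the URL from groups 2 and 3 drops the surrounding whitespace or `](` without a separate `trim()`. One consequence of stripping every trailing `)`, as `rtrim` does, is that a markdown link whose target itself ends in a parenthesis, such as `[x](https://en.wikipedia.org/wiki/Foo_(bar))`, comes back as `https://en.wikipedia.org/wiki/Foo_(bar`.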

lib/public/IURLGenerator.php

Lines changed: 2 additions & 1 deletion
@@ -30,8 +30,9 @@ interface IURLGenerator {
 	 *
 	 * @since 25.0.0
 	 * @since 29.0.0 changed to match localhost and hostnames with ports
+	 * @since 33.0.0 changed to match URLs in markdown link syntax and square brackets in query parameters
 	 */
-	public const URL_REGEX_NO_MODIFIERS = '(\s|\n|^)(https?:\/\/)([-A-Z0-9+_.]+(?::[0-9]+)?(?:\/[-A-Z0-9+&@#%?=~_|!:,.;()]*)*)(\s|\n|$)';
+	public const URL_REGEX_NO_MODIFIERS = '(\s|\n|^|\]\()(https?:\/\/)([-A-Z0-9+_.]+(?::[0-9]+)?(?:\/[-A-Z0-9+&@#%?=~_|!:,.;\[\()]*)*)(\s|\n|$|\))';
 
 	/**
 	 * Returns the URL for a route
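
As the path character class appears in this diff, it adds `\[` and `\(` but no `\]`. The following is a small check of what that means for bracketed query parameters, assuming the pattern is reproduced faithfully here and using the frontend copy; the helper `firstUrl` and the sample URLs are illustrative only.

```javascript
// Pattern copied from the comments.js hunk of this commit.
const urlRegex = /(\s|^|\]\()(https?:\/\/)([-A-Z0-9+_.]+(?::[0-9]+)?(?:\/[-A-Z0-9+&@#%?=~_|!:,.;\[\()]*)*)(\s|$|\))/ig

// Hypothetical helper: return the URL (groups 2 + 3) of the first match, or null.
function firstUrl(text) {
	urlRegex.lastIndex = 0 // reset the position state kept by the global flag
	const match = urlRegex.exec(text)
	return match ? match[2] + match[3] : null
}

// An opening bracket is accepted inside the path.
console.log(firstUrl('https://example.com/list?foo[=1'))
// https://example.com/list?foo[=1

// A closing bracket is not in the class, and "]" is not a valid suffix either,
// so the whole URL fails to match with the pattern as shown.
console.log(firstUrl('https://example.com/list?foo[bar]=baz'))
// null
```

If the intent is to match `?foo[bar]=baz`, the class would presumably also need `\]`; that is an observation about the pattern as printed here, not a statement about the repository's final code.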
