Hi there,
The implementation of the Element.data() function listed below is executed recursively in an unsafe manner.
    public String data() {
        StringBuilder sb = StringUtil.borrowBuilder();
        for (Node childNode : childNodes) {
            if (childNode instanceof DataNode) {
                DataNode data = (DataNode) childNode;
                sb.append(data.getWholeData());
            } else if (childNode instanceof Comment) {
                Comment comment = (Comment) childNode;
                sb.append(comment.getData());
            } else if (childNode instanceof Element) {
                Element element = (Element) childNode;
                String elementData = element.data();
                sb.append(elementData);
            } else if (childNode instanceof CDataNode) {
                // this shouldn't really happen because the html parser won't see the cdata as anything special when parsing script.
                // but incase another type gets through.
                CDataNode cDataNode = (CDataNode) childNode;
                sb.append(cDataNode.getWholeText());
            }
        }
        return StringUtil.releaseBuilder(sb);
    }
The recursion depth is not bounded, so a document with enough nested child elements can trigger a StackOverflowError.
I suggest handling this case the same way as Element.text(), where a NodeTraversor is used to compute the result iteratively.
Another approach could be something along these lines:
    public String data() {
        StringBuilder sb = StringUtil.borrowBuilder();
        // Nodes still to be visited, kept in document order.
        var descendants = new ArrayDeque<>(childNodes);
        while (!descendants.isEmpty()) {
            var descendantNode = descendants.pollFirst();
            if (descendantNode instanceof DataNode) {
                DataNode data = (DataNode) descendantNode;
                sb.append(data.getWholeData());
            } else if (descendantNode instanceof Comment) {
                Comment comment = (Comment) descendantNode;
                sb.append(comment.getData());
            } else if (descendantNode instanceof Element) {
                Element element = (Element) descendantNode;
                // We must visit the first child of this element next.
                // If that child has children in turn, they are visited next, and so on.
                // Therefore, we push the child nodes onto the front of the queue in reverse order.
                var children = element.childNodes();
                // Include index 0, otherwise the first child is silently skipped.
                for (int i = children.size() - 1; i >= 0; i--) {
                    descendants.addFirst(children.get(i));
                }
            } else if (descendantNode instanceof CDataNode) {
                // this shouldn't really happen because the html parser won't see the cdata as anything special when parsing script.
                // but in case another type gets through.
                CDataNode cDataNode = (CDataNode) descendantNode;
                sb.append(cDataNode.getWholeText());
            }
        }
        return StringUtil.releaseBuilder(sb);
    }
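To check the traversal order and the behaviour on deep nesting independently of jsoup, here is a minimal, self-contained sketch of the same deque-based idea. Note that `IterativeDataSketch`, `SketchNode`, and `deepChain` are hypothetical stand-ins I made up for illustration; they are not jsoup classes, and the real fix would of course operate on `Node` and `Element`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Stand-in types for illustration only; these are NOT jsoup classes.
final class IterativeDataSketch {

    /** Hypothetical minimal node: a leaf carrying data, or a container with children. */
    static final class SketchNode {
        final String data; // null for container nodes
        final List<SketchNode> children = new ArrayList<>();

        SketchNode(String data) {
            this.data = data;
        }

        SketchNode add(SketchNode child) {
            children.add(child);
            return this;
        }
    }

    /** Concatenates the data of all leaf descendants in document order, without recursion. */
    static String data(SketchNode root) {
        StringBuilder sb = new StringBuilder();
        Deque<SketchNode> pending = new ArrayDeque<>(root.children);
        while (!pending.isEmpty()) {
            SketchNode node = pending.pollFirst();
            if (node.data != null) {
                sb.append(node.data);
            } else {
                // Push children back to front (index 0 included) so the first child is polled next.
                for (int i = node.children.size() - 1; i >= 0; i--) {
                    pending.addFirst(node.children.get(i));
                }
            }
        }
        return sb.toString();
    }

    /** Builds a chain of `depth` nested containers with a single data leaf at the bottom. */
    static SketchNode deepChain(int depth, String leafData) {
        SketchNode root = new SketchNode(null);
        SketchNode current = root;
        for (int i = 0; i < depth; i++) {
            SketchNode next = new SketchNode(null);
            current.add(next);
            current = next;
        }
        current.add(new SketchNode(leafData));
        return root;
    }

    public static void main(String[] args) {
        SketchNode tree = new SketchNode(null)
            .add(new SketchNode("a"))
            .add(new SketchNode(null).add(new SketchNode("b")).add(new SketchNode("c")))
            .add(new SketchNode("d"));
        System.out.println(data(tree));                        // abcd
        System.out.println(data(deepChain(200_000, "deep"))); // deep
    }
}
```

The pending nodes live in a heap-allocated deque rather than on the call stack, so the nesting depth is limited by available memory instead of the thread's stack size.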