An example to illustrate the issue (note the quotation marks):
Input:
<p>How about "<a href="/foo">that</a>" said the old man.</p>
Output:
How about ”that“ said the old man.
Expected:
How about “that” said the old man.
Why does this happen?
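This happens because each text node is processed individually. In the example, the three text nodes are `How about "`, `that` and `" said the old man.`. Seen in isolation, the quote at the end of the first node looks like a closing quote, and the quote at the start of the third node looks like an opening one.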
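The effect can be reproduced with a small sketch. The substitution rule actually in use is not shown here, so `curl_quotes` below is a hypothetical heuristic chosen only because it yields the same output as above; `TextNodeCollector` is likewise an illustrative helper built on Python's standard `html.parser`.

```python
from html.parser import HTMLParser


class TextNodeCollector(HTMLParser):
    """Collects each text node of an HTML fragment as a separate string."""

    def __init__(self):
        super().__init__()
        self.nodes = []

    def handle_data(self, data):
        self.nodes.append(data)


def curl_quotes(text):
    """Hypothetical rule (an assumption, not the real implementation):
    a straight quote becomes a closing quote if it ends the string or
    follows a non-space character, and an opening quote otherwise."""
    out = []
    for i, ch in enumerate(text):
        if ch != '"':
            out.append(ch)
            continue
        at_end = i == len(text) - 1
        after_non_space = i > 0 and not text[i - 1].isspace()
        out.append("\u201d" if at_end or after_non_space else "\u201c")
    return "".join(out)


collector = TextNodeCollector()
collector.feed('<p>How about "<a href="/foo">that</a>" said the old man.</p>')
collector.close()
text_nodes = collector.nodes

# Substituting node by node: each quote is judged without its neighbours.
per_node = "".join(curl_quotes(node) for node in text_nodes)

# Substituting on the full sentence: both quotes have their real context.
whole = curl_quotes("".join(text_nodes))
```

Under this assumed rule, `per_node` has the two quotes the wrong way round, while `whole` matches the expected result.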
Solutions
One option: compute the text content of each block element (p tags, blockquotes) and run the substitution on that, instead of on each text node individually.
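A minimal sketch of this idea, assuming the substitution replaces each character with exactly one character, so the corrected text can be sliced back into the original nodes by length. `curl_quotes` is a hypothetical stand-in for the real substitution, and `curl_quotes_in_block` is an illustrative name, not an existing function.

```python
def curl_quotes(text):
    """Hypothetical rule (an assumption, not the real implementation):
    a straight quote becomes a closing quote if it ends the string or
    follows a non-space character, and an opening quote otherwise."""
    out = []
    for i, ch in enumerate(text):
        if ch != '"':
            out.append(ch)
            continue
        at_end = i == len(text) - 1
        after_non_space = i > 0 and not text[i - 1].isspace()
        out.append("\u201d" if at_end or after_non_space else "\u201c")
    return "".join(out)


def curl_quotes_in_block(text_nodes):
    """Run the substitution on the block's combined text, then hand each
    node back its own slice of the result."""
    fixed = curl_quotes("".join(text_nodes))
    result, pos = [], 0
    for node in text_nodes:
        # Valid only because the substitution preserves string length.
        result.append(fixed[pos:pos + len(node)])
        pos += len(node)
    return result


nodes = ['How about "', 'that', '" said the old man.']
fixed_nodes = curl_quotes_in_block(nodes)
```

Substitutions that change the length of the text (for example, collapsing two hyphens into a single dash) would need an offset map instead of plain slicing.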