Bug #6453 (open)
Reusable XML parser is not immutable
Description
The Configuration holds a reusable XML parser, or a pool of such parsers, and many interfaces use a parser from this pool: notably Sender.send() and things built on top of it such as Configuration.buildDocument() and DocumentBuilder.build().
If parsing options such as DTD validation are set (for example via DocumentBuilder.setDTDValidation()), then it appears that the configuration of the shared (pooled) parser is modified, and subsequent uses of the same parser may run with the modified configuration settings.
This is showing up in tests that add a new DTD validation option to the parse-xml() function.
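The suspected failure mode can be reproduced outside Saxon with a plain JAXP parser. The following is only a minimal sketch: NaiveParserPool and settingLeaks() are hypothetical names invented for illustration and are not Saxon's pool implementation. It assumes the JDK's default SAX parser, which recognizes the standard http://xml.org/sax/features/validation feature.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical stand-in for a parser pool; NOT Saxon's actual code.
class NaiveParserPool {
    // Standard SAX2 feature name for DTD validation.
    static final String VALIDATION = "http://xml.org/sax/features/validation";

    private final Deque<XMLReader> idle = new ArrayDeque<>();

    // Hands out an idle parser if there is one, otherwise creates a new one.
    XMLReader acquire() {
        XMLReader pooled = idle.poll();
        if (pooled != null) {
            return pooled;
        }
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newSAXParser().getXMLReader();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the parser to the pool without touching its settings.
    void release(XMLReader reader) {
        idle.push(reader);
    }

    // True if the second caller inherits the first caller's validation setting.
    static boolean settingLeaks() {
        try {
            NaiveParserPool pool = new NaiveParserPool();
            XMLReader first = pool.acquire();
            first.setFeature(VALIDATION, true);   // first caller wants DTD validation
            pool.release(first);
            XMLReader second = pool.acquire();    // second caller never asked for it
            return second == first && second.getFeature(VALIDATION);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The second caller gets the same XMLReader instance back, and nothing between release() and acquire() restores the feature, so whatever the first caller set is still in force.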
Updated by Michael Kay 6 months ago
It seems that ActiveStreamSource.deliver() takes care to reset the parser properties, but ActiveSAXSource fails to do so.
I added a test case to TestDocumentBuilder() which fails to expose the problem because it uses an ActiveStreamSource.
Updated by Norm Tovey-Walsh 6 months ago
Possibly related: https://saxonica.plan.io/issues/5949, where I spent a long time trying to untangle how the various properties are set, reset, managed, copied, bent, folded, spindled and mutilated.
Updated by Michael Kay 6 months ago
Yes, it looks as if #5949 fixed the problem for ActiveStreamSource but not for ActiveSAXSource.
Updated by Michael Kay 3 months ago
I don't think the fix for #5949 is adequate. It ensures that ActiveSourceParser.deliver(), when it gets a parser from the configuration pool, always sets DTD validation explicitly to on or off. But the parser that's returned to the pool might be in either state; and there are lots of other paths that get a parser from the pool without explicitly setting the state. It would seem safer to ensure that when a parser is returned to the pool, DTD validation is always unset.
More generally, all properties should ideally be in their default setting, except for the InputSource, which has to be explicitly set each time before you can do a parse.
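A rough sketch of the "reset on return" idea, again using plain JAXP rather than Saxon's classes. ResettingParserPool and secondCallerSeesDefault() are hypothetical names, and the sketch assumes that the default state for DTD validation is off and that this is the only property callers change. A fuller version would restore every feature and property that a caller might have touched.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical pool that restores defaults on release; NOT Saxon's actual code.
class ResettingParserPool {
    // Standard SAX2 feature name for DTD validation.
    static final String VALIDATION = "http://xml.org/sax/features/validation";

    private final Deque<XMLReader> idle = new ArrayDeque<>();

    // Hands out an idle parser if there is one, otherwise creates a new one.
    XMLReader acquire() {
        XMLReader pooled = idle.poll();
        if (pooled != null) {
            return pooled;
        }
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newSAXParser().getXMLReader();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException(e);
        }
    }

    // Restores the assumed default (validation off) before pooling the parser.
    void release(XMLReader reader) {
        try {
            reader.setFeature(VALIDATION, false);
            idle.push(reader);
        } catch (SAXException e) {
            // Could not reset: drop the parser rather than pool it in an unknown state.
        }
    }

    // True if a reused parser comes back with validation switched off.
    static boolean secondCallerSeesDefault() {
        try {
            ResettingParserPool pool = new ResettingParserPool();
            XMLReader first = pool.acquire();
            first.setFeature(VALIDATION, true);   // first caller wants DTD validation
            pool.release(first);                  // the reset happens here
            XMLReader second = pool.acquire();
            return second == first && !second.getFeature(VALIDATION);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Putting the reset in release() means that every path which takes a parser from the pool starts from a known state, whether or not that path sets validation explicitly.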