Bug #6453

open

Reusable XML parser is not immutable

Added by Michael Kay 5 months ago. Updated 2 months ago.

Status: New
Priority: Normal
Assignee:
Category: Configuration
Sprint/Milestone: -
Start date: 2024-06-18
Due date:
% Done: 0%
Estimated time:
Legacy ID:
Applies to branch:
Fix Committed on Branch:
Fixed in Maintenance Release:
Platforms:

Description

The Configuration holds a reusable XML parser, or a pool of such parsers, and many interfaces use a parser from this pool: notably Sender.send() and things built on top of it such as Configuration.buildDocument() and DocumentBuilder.build().

If parsing options such as DTD validation are set (for example via DocumentBuilder.setDTDValidation()), it appears that the configuration of the shared (pooled) parser is modified, and subsequent uses of the same parser may run with the modified settings.

This is showing up in tests that add a new DTD validation option to the parse-xml() function.
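
The leak described above can be reproduced outside Saxon with a few lines of plain JAXP. This is only a sketch: LeakyParserPool is a hypothetical stand-in for the configuration's parser pool, not Saxon code, and it assumes the JDK's built-in SAX parser, which accepts the standard validation feature.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

// Hypothetical pool, for illustration only: it hands parsers back out
// exactly as they were returned, so feature changes survive reuse.
public class LeakyParserPool {
    static final String VALIDATION = "http://xml.org/sax/features/validation";

    private final Deque<XMLReader> idle = new ArrayDeque<>();

    XMLReader acquire() throws Exception {
        XMLReader reader = idle.poll();
        return reader != null
                ? reader
                : SAXParserFactory.newInstance().newSAXParser().getXMLReader();
    }

    void release(XMLReader reader) {
        idle.push(reader); // no reset: whatever the caller changed stays changed
    }

    // Returns the validation setting seen by the second user of the parser.
    static boolean demonstrateLeak() throws Exception {
        LeakyParserPool pool = new LeakyParserPool();
        XMLReader first = pool.acquire();
        first.setFeature(VALIDATION, true);   // first caller wants DTD validation
        pool.release(first);
        XMLReader second = pool.acquire();    // second caller asked for nothing
        return second.getFeature(VALIDATION); // yet validation is still on
    }
}
```

The second caller never requested validation but inherits it from the first, which is the behaviour the new parse-xml() tests are tripping over.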

Actions #1

Updated by Michael Kay 5 months ago

It seems that ActiveStreamSource.deliver() takes care to reset the parser properties, but ActiveSAXSource fails to do so.

I added a test case to TestDocumentBuilder(), but it fails to show up the problem because it uses an ActiveStreamSource.
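
For reference, "resetting the parser properties" around a single parse could look something like the sketch below. It is not the actual ActiveStreamSource code: ScopedParserFeature and parseWithValidation are hypothetical names, and only the standard SAX validation feature is handled.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: apply a feature for one parse, then put it back.
public class ScopedParserFeature {
    static final String VALIDATION = "http://xml.org/sax/features/validation";

    static void parseWithValidation(XMLReader reader, InputSource input, boolean validate)
            throws IOException, SAXException {
        boolean previous = reader.getFeature(VALIDATION);
        reader.setFeature(VALIDATION, validate);
        try {
            reader.parse(input);
        } finally {
            // Restore even if the parse fails, so the next user is unaffected.
            reader.setFeature(VALIDATION, previous);
        }
    }

    // Returns the validation setting left on the reader after the scoped parse.
    static boolean demo() throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(new DefaultHandler());
        reader.setFeature(VALIDATION, true); // state left behind by an earlier caller
        parseWithValidation(reader, new InputSource(new StringReader("<doc/>")), false);
        return reader.getFeature(VALIDATION); // back to the earlier state
    }
}
```

Note that this restores the previous state rather than the default state, so it only helps if every path that borrows the parser follows the same discipline.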

Actions #2

Updated by Norm Tovey-Walsh 5 months ago

Possibly related, https://saxonica.plan.io/issues/5949 where I spent a long time trying to untangle how the various properties are set, reset, managed, copied, bent, folded, spindled and mutilated.

Actions #3

Updated by Michael Kay 5 months ago

Yes, it looks as if #5949 fixed the problem for ActiveStreamSource but not for ActiveSAXSource.

Actions #4

Updated by Michael Kay 2 months ago

I don't think the fix for #5949 is adequate. It ensures that ActiveSourceParser.deliver(), when it gets a parser from the configuration pool, always sets DTD validation explicitly to on or off. But the parser that's returned to the pool might be in either state; and there are lots of other paths that get a parser from the pool without explicitly setting the state. It would seem safer to ensure that when a parser is returned to the pool, DTD validation is always unset.

More generally, all properties should ideally be back at their default settings, except for the InputSource, which has to be explicitly set each time before you can do a parse.
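
A minimal sketch of the reset-on-return idea, assuming plain JAXP rather than Saxon's own pool: ResettingParserPool and its TRACKED list are hypothetical, and a real fix would need to cover every feature and property that callers can change. The InputSource needs no handling here because XMLReader.parse() takes it as an argument on each call.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical pool: snapshots feature defaults when a parser is created
// and restores them whenever the parser is returned.
public class ResettingParserPool {
    static final String VALIDATION = "http://xml.org/sax/features/validation";
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";
    // Assumption: only these two features are tracked in this sketch.
    static final String[] TRACKED = { VALIDATION, NAMESPACES };

    private final Deque<XMLReader> idle = new ArrayDeque<>();
    private final Map<XMLReader, Map<String, Boolean>> defaults = new IdentityHashMap<>();

    synchronized XMLReader acquire() throws Exception {
        XMLReader reader = idle.poll();
        if (reader == null) {
            reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            Map<String, Boolean> snapshot = new HashMap<>();
            for (String feature : TRACKED) {
                snapshot.put(feature, reader.getFeature(feature));
            }
            defaults.put(reader, snapshot);
        }
        return reader;
    }

    synchronized void release(XMLReader reader) throws SAXException {
        Map<String, Boolean> snapshot = defaults.get(reader);
        if (snapshot == null) {
            return; // not created by this pool: drop it rather than reuse it
        }
        for (Map.Entry<String, Boolean> entry : snapshot.entrySet()) {
            reader.setFeature(entry.getKey(), entry.getValue());
        }
        idle.push(reader);
    }

    // Returns true if a reused parser comes back with its original setting.
    static boolean demo() throws Exception {
        ResettingParserPool pool = new ResettingParserPool();
        XMLReader first = pool.acquire();
        boolean initial = first.getFeature(VALIDATION);
        first.setFeature(VALIDATION, !initial); // caller flips the setting
        pool.release(first);                    // pool restores the snapshot
        XMLReader second = pool.acquire();
        return second == first && second.getFeature(VALIDATION) == initial;
    }
}
```

Resetting on return means that paths which take a parser from the pool without setting any state still get a parser in a known condition.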
