Feature #5830

closed

Stylesheets with entity references do not compile

Added by Evan Lenz almost 2 years ago. Updated 6 months ago.

Status: Rejected
Priority: Normal
Assignee: -
Category: -
Sprint/Milestone: -
Start date: 2023-01-18
Due date:
% Done: 0%
Estimated time:
Applies to JS Branch:
Fix Committed on JS Branch:
Fixed in JS Release:
SEF Generated with:
Platforms:
Company: -
Contact person: -
Additional contact persons: -

Description

See the attached test.xsl file. When I try to compile it using the latest release:

xslt3 -xsl:test.xsl -export:test.sef.json

I get the following output:

Error FODC0002:
Failed parsing XML in file://path/to/test.xsl: Reference to unknown entity &REUSABLE_PATTERN; at line 8 column 41
Failed to compile stylesheet

I did notice feature #4597 ("Recognize entity declarations in DTDs"). However, I believe this should be considered a bug report rather than a feature request, since a conforming XML processor should always recognize entity declarations in the internal DTD subset. This is actually the first time I've ever seen them not recognized in an XML processing context.

I sometimes use them for repeated long XSLT patterns. My temporary workaround is to expand them all (repeating them throughout the stylesheet).


Files

test.xsl (318 Bytes): XSLT to reproduce the problem (Evan Lenz, 2023-01-18 05:20)

Actions #1

Updated by Michael Kay almost 2 years ago

Hi, Evan, if you know of an XML parser in the node.js world that supports DTDs and entity references, please let us know. It's a while since we searched, but last time we did, we couldn't find one. It's probably because everyone is scared stiff of XXE attacks and the like.

Actions #2

Updated by Evan Lenz almost 2 years ago

Thanks, Mike. A disappointing reality but understandable, I suppose. I will keep my eye out. (And rewrite those patterns to use user-defined functions in the meantime.)

Actions #3

Updated by Martynas Jusevicius almost 2 years ago

This is how I pre-process the stylesheets by canonicalizing them in order to inline entities:

find ./target/ROOT/static/com/atomgraph  -type f -name "*.xsl" -exec sh -c 'xmlstarlet c14n "$1" > "$1".c14n && mv "$1".c14n "$1"' x {} \;

Can also be done using net.sf.saxon.Query instead of xmlstarlet.

Actions #4

Updated by Evan Lenz almost 2 years ago

Thanks, Martynas. However, will that use of net.sf.saxon.Query work on Node.js? Doesn’t it still depend on an XML parser being available that will recognize the internal DTD subset in order to properly canonicalize the XML? Or are you just suggesting the stylesheets be pre-processed outside of Node.js?

Actions #5

Updated by Norm Tovey-Walsh almost 2 years ago

I think Martynas is suggesting that you pre-process them with a tool that processes internal subsets before passing them to Node.js. Kind of a kludge, but it should work.

Actions #6

Updated by Norm Tovey-Walsh 6 months ago

  • Tracker changed from Bug to Feature
  • Status changed from New to Rejected

Updating the parser to support entities isn't something on our near-term list. A better parser is an interesting longer-term goal, though my first stab at it was a dismal failure.

I think pre-processing is the only workaround at the moment.
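
For anyone who wants to script the pre-processing step without xmlstarlet, here is a minimal sketch in Python. It assumes Python is available at build time and relies on the standard library's expat-based parser, which reads the internal DTD subset and expands internal entities while parsing. The names expand_entities and preprocess_file and the sample stylesheet are hypothetical (the sample is not the attached test.xsl), and this is not a Saxonica-provided tool. The DOCTYPE and anything else outside the root element (leading comments, for example) are dropped from the output, and it should only be run on stylesheets you trust.

```python
from pathlib import Path
from xml.dom import minidom

# Hypothetical stylesheet for illustration; NOT the attached test.xsl.
SAMPLE = b"""<!DOCTYPE xsl:stylesheet [
  <!ENTITY REUSABLE_PATTERN "chapter/section[@role='main']">
]>
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="&REUSABLE_PATTERN;"/>
</xsl:stylesheet>"""


def expand_entities(data: bytes) -> str:
    """Return the root element as text, with internal entities expanded.

    expat substitutes entities declared in the internal DTD subset while
    parsing, so the resulting DOM already holds the expanded text.
    Serializing only the document element leaves the DOCTYPE behind.
    """
    doc = minidom.parseString(data)
    body = doc.documentElement.toxml()
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body + "\n"


def preprocess_file(src: str, dst: str) -> None:
    """Read a stylesheet from src and write the entity-free copy to dst."""
    expanded = expand_entities(Path(src).read_bytes())
    Path(dst).write_text(expanded, encoding="utf-8")
```

Calling preprocess_file("test.xsl", "test.expanded.xsl") (the output file name is made up) and then running xslt3 -xsl:test.expanded.xsl -export:test.sef.json should avoid the unknown-entity error, since the compiler never sees an entity reference. Writing to a separate file rather than overwriting keeps the entity-based source as the version you edit.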
