Bug #4419
Closed
fn:normalize-unicode fails when DTD-based validation of source data is turned on
100%
Description
Hello!
I am running Saxon-EE 9.9.1.4 on Java from the command line.
When I turn on DTD-based validation of source data (-dtd:on), then my stylesheet using fn:normalize-unicode fails.
Here is XSL to demonstrate the problem:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
  <xsl:output encoding="ASCII" omit-xml-declaration="yes"/>
  <xsl:template match="/">
    <xsl:value-of select="normalize-unicode('Müller', 'NFKD')"/>
  </xsl:template>
</xsl:stylesheet>
The source does not matter as long as it's XML that's valid against a DTD. For example:
<!DOCTYPE article SYSTEM "http://jats.nlm.nih.gov/archiving/1.2/JATS-archivearticle1.dtd">
<article><front><article-meta/></front></article>
When I run this with -dtd:off, I get the expected output:
Müller
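For reference, the decomposition that NFKD performs on this string can be checked independently of Saxon. The following is a minimal sketch using the JDK's `java.text.Normalizer`; the class name `NfkdCheck` and the helper `nfkd` are illustrative and not part of the original report. It shows that the precomposed "ü" (U+00FC) becomes "u" followed by the combining diaeresis U+0308, which an ASCII serialization then has to write as a character reference.

```java
import java.text.Normalizer;

class NfkdCheck {
    // Illustrative helper: apply Unicode NFKD normalization using the JDK.
    static String nfkd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKD);
    }

    public static void main(String[] args) {
        // "M\u00FCller" is "Müller" written with the precomposed u-diaeresis.
        String decomposed = nfkd("M\u00FCller");
        // Print each code unit so the combining mark U+0308 is visible.
        for (char c : decomposed.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```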
When I switch to -dtd:on, the transformation fails:
Recoverable error on line 2 column 13 of normalizationData.xml:
SXXP0003: Error reported by XML parser: Document is invalid: no grammar found.
Recoverable error on line 2 column 13 of normalizationData.xml:
SXXP0003: Error reported by XML parser: Document root element "UnicodeData", must match
DOCTYPE root "null".
Error at char 18 in xsl:value-of/@select on line 7 column 68 of x.xsl:
The XML parser reported two validation errors
net.sf.saxon.trans.XPathException: The XML parser reported two validation errors
It seems to me I should be able to use fn:normalize-unicode with -dtd:on.
Updated by Michael Kay almost 5 years ago
Thanks for reporting it. It looks as if it's incorrectly applying DTD validation to one of the internal documents that holds the normalization tables.
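The two parser messages in the report are what a validating parser typically produces for a document that has no DOCTYPE at all. The sketch below tries to reproduce that outside Saxon with the JDK's built-in JAXP parser. The class name `DtdValidationDemo`, the helper `parse`, and the one-element stand-in for normalizationData.xml are assumptions made for illustration; this is not Saxon's actual loading code.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class DtdValidationDemo {
    // Parse the XML text and return the recoverable errors the parser reported.
    static List<String> parse(String xml, boolean dtdValidation) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(dtdValidation); // DTD validation on or off
        SAXParser parser = factory.newSAXParser();
        List<String> errors = new ArrayList<>();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                // Validity errors are recoverable, so collect them instead of throwing.
                errors.add(e.getMessage());
            }
        });
        return errors;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for an internal data file that has no DOCTYPE declaration.
        String internalDoc = "<UnicodeData/>";
        System.out.println("validation on:  " + parse(internalDoc, true));
        System.out.println("validation off: " + parse(internalDoc, false));
    }
}
```

With validation on, the parser has no grammar to validate against and reports validity errors. With validation off, the same document parses cleanly.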
Updated by Michael Kay almost 5 years ago
- Category set to Internals
- Status changed from New to Resolved
- Assignee set to Michael Kay
- Priority changed from Low to Normal
- Applies to branch 9.9, trunk added
- Fix Committed on Branch 9.9, trunk added
I have fixed this on 9.9 and trunk.
I have reviewed all the places where we parse internal or system files for which DTD validation is likely to be inappropriate: not just the issued files containing normalization and regex data, but also, for example, SEF files, configuration files, catalog files, and so on. In each case I have explicitly set DTD validation (and, where necessary, schema validation) off, overriding any request for DTD validation at the global configuration level.
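The general shape of such an override can be sketched with plain JAXP. Everything named here (`InternalResourceParser`, `globalDtdValidation`, `parseUserSource`, `parseInternal`) is hypothetical and only illustrates the pattern; it is not the code committed to Saxon.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class InternalResourceParser {
    // Hypothetical stand-in for a global setting such as -dtd:on.
    private final boolean globalDtdValidation;

    InternalResourceParser(boolean globalDtdValidation) {
        this.globalDtdValidation = globalDtdValidation;
    }

    // User-supplied source documents honour the global setting.
    List<String> parseUserSource(String xml) throws Exception {
        return parse(xml, globalDtdValidation);
    }

    // Internal data files always switch DTD validation off explicitly.
    List<String> parseInternal(String xml) throws Exception {
        return parse(xml, false);
    }

    // Parse the text and return any recoverable errors that were reported.
    private static List<String> parse(String xml, boolean validate) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(validate);
        List<String> errors = new ArrayList<>();
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)),
            new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage());
                }
            });
        return errors;
    }
}
```

The point of the design is that the decision is made per call site rather than inherited from the configuration, so a global validation request cannot leak into the parsing of files that were never meant to carry a DTD.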
Updated by O'Neil Delpratt over 4 years ago
- Status changed from Resolved to Closed
- % Done changed from 0 to 100
- Fixed in Maintenance Release 9.9.1.7 added
Patch applied in the 9.9.1.7 maintenance release.