performance issue with document() and DOCTYPE

Added by Anonymous over 18 years ago

Legacy ID: #3328431 Legacy Poster: Jean-Claude Moissinac (moissinac)

Hello,

ABSTRACT: performance issue with document() and Saxon

I have a lot of documents that start with:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1 Tiny//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11-tiny.dtd">
    <svg ...

If I use such documents in an XSLT stylesheet with something like

    ...
    <xsl:for-each select="document($path)">
    ...

where $path contains a path to such a document, Saxon 8 seems to access the URL specified in the DOCTYPE line for each document() call, which can be very slow or impossible. I suppose it uses the DTD to validate the document.

Is it possible, without programming, to prevent Saxon from making this access?

Jean-Claude Moissinac


Replies (3)

RE: performance issue with document() and DOC - Added by Anonymous over 18 years ago

Legacy ID: #3328860 Legacy Poster: Michael Kay (mhkay)

If a document has a reference to an external DTD, then the XML parser will always attempt to fetch the DTD: even if it isn't needed for validation, it might contain entity definitions.

The only ways to prevent this are (a) to remove the DOCTYPE reference before parsing, or (b) to redirect the parser to a different DTD (or a local copy of the DTD), typically by a mechanism such as OASIS catalogs. See for example http://www.dpawson.co.uk/docbook/catalogs.html#d1538e138

Michael Kay
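For readers who can accept a little programming, option (b) can also be sketched directly against the parser with a SAX EntityResolver instead of a catalog. This is only an illustration, not something from the thread: the class name StubDtdParser is hypothetical, and the resolver here answers every request with an empty stub, where a real setup would more likely return a local copy of svg11-tiny.dtd. With an empty stub, any entities or default attribute values declared in the DTD are lost.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: parses XML without ever opening the DOCTYPE's system ID.
final class StubDtdParser {
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // The parser asks the resolver before fetching the external DTD.
            // Returning a non-null InputSource makes it read that instead;
            // here it is an empty stub (assumption: no DTD content is needed).
            builder.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader("")));
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }
}
```

To have document() benefit from this, the same resolver would have to be set on the parser that the XSLT processor uses for secondary documents, for instance through a JAXP URIResolver; that wiring is not shown here.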

RE: performance issue with document() and DOC - Added by Anonymous over 18 years ago

Legacy ID: #3334403 Legacy Poster: Jean-Claude Moissinac (moissinac)

Thank you. I will try this solution.

I'm writing a tool for several users, who will each have to install it on their own computer. I am trying to promote an open solution, but I am faced with two cases:

- Windows and msxsl: with the -xw and -xe options, executing my scripts and XSLT takes less than one minute;
- other platforms:
  * with a catalog: implies a long (and complex?) installation process,
  * without a catalog: implies an execution time of more than ten minutes.

I'm afraid my users will choose the Windows/msxsl solution.

RE: performance issue with document() and DOC - Added by Anonymous over 18 years ago

Legacy ID: #3334430 Legacy Poster: Michael Kay (mhkay)

There may be other ways you can configure the XML parser to cause it not to read the DTD; it might depend on which parser you are using. I'm afraid this is essentially an XML parser problem rather than a Saxon problem, so I can't really give you much more help.

Michael Kay
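As one example of such a parser-specific setting, and assuming the parser in use is Xerces or the Xerces-derived parser bundled with the JDK, there is a feature that tells a non-validating parser not to load the external DTD at all. The sketch below is an assumption about the reader's environment rather than anything confirmed in this thread; the class name SkipExternalDtd is hypothetical, and other parsers may reject the feature URI.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: relies on a Xerces-specific feature.
final class SkipExternalDtd {
    // Feature URI defined by Xerces; not part of the SAX or JAXP standards.
    static final String LOAD_EXTERNAL_DTD =
        "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false);                 // the feature only applies when not validating
            factory.setFeature(LOAD_EXTERNAL_DTD, false); // do not fetch the external DTD
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }
}
```

As with removing the DOCTYPE, a document that depends on entities declared in the external DTD will no longer parse correctly with this setting.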
