Validation and the doc function
Added by Anonymous about 17 years ago
Legacy ID: #4601877 Legacy Poster: Per Sennels (persennels)
In the "XPath 2.0 Programmer's Reference", I find the following phrase in the section describing the doc function: "The XML document is parsed, and optionally validated using a DTD validator [....]" It seems to me that validation is the default when using the doc function. But if validation is optional, it must be possible to turn it off. How do I do that? Why do I want to avoid validation? Because what I want to do is extract information from a lot of (many hundred) SMIL files referenced from a master XML file. I have good reasons to believe that these files are valid (no need to check), and processing time would probably be reduced if validation could be avoided. And, more importantly, the server hosting the SMIL DTD (www.w3.org) does not like me requesting the same resource over and over again, so I automatically get blocked from doing it. I appreciate any help I can get on this one.
Replies (4)
RE: Validation and the doc function - Added by Anonymous about 17 years ago
Legacy ID: #4601963 Legacy Poster: Michael Kay (mhkay)
Some XML parsers will perform DTD validation automatically if there is a reference to a DTD. Even if they don't perform validation, they will usually fetch the DTD, because it may contain definitions of external entities that need to be expanded. So it sounds as if you don't just need to switch validation off, you need to prevent the DTD being fetched. That's best achieved using OASIS catalogs: you can redirect the doctype reference to a local "dummy" DTD. If your chosen XML parser allows validation to be switched off, then you can achieve this by configuring the parser yourself in a URIResolver. (The URIResolver can return a SAXSource which contains a preconfigured XMLReader.) There are also options on the Saxon command line (but it depends which interfaces you are using), and settings in the Saxon Configuration class (setValidation() method), which can also be set via the TransformerFactory. But as I say, switching off DTD validation doesn't stop the DTD being retrieved from the server. Michael Kay http://www.saxonica.com/
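A minimal sketch of the URIResolver approach described above, assuming a JAXP-compliant SAX parser is available. The class name NoDtdFetchResolver and the empty-DTD EntityResolver are illustrative assumptions, not part of Saxon. The empty input stands in for the local "dummy" DTD, so this only works if the documents do not depend on entities or default attribute values declared in the real DTD.

```java
import java.io.StringReader;
import java.net.URI;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.sax.SAXSource;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical helper: resolves each document URI to a SAXSource whose
// XMLReader neither validates nor fetches the external DTD.
final class NoDtdFetchResolver implements URIResolver {

    // Answers every external entity request (including the DTD) with empty
    // content, so nothing is retrieved from the remote server.
    static final EntityResolver EMPTY_ENTITY =
            (publicId, systemId) -> new InputSource(new StringReader(""));

    @Override
    public Source resolve(String href, String base) throws TransformerException {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(false); // no DTD validation

            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setEntityResolver(EMPTY_ENTITY); // no DTD retrieval

            // Resolve a relative href against the base URI, if one is given.
            String systemId = (base == null || base.isEmpty())
                    ? href
                    : new URI(base).resolve(href).toString();

            return new SAXSource(reader, new InputSource(systemId));
        } catch (Exception e) {
            throw new TransformerException(e);
        }
    }
}
```

Such a resolver can be registered through the standard JAXP call `TransformerFactory.setURIResolver(...)` or `Transformer.setURIResolver(...)` before running the transformation.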
RE: Validation and the doc function - Added by Anonymous about 17 years ago
Legacy ID: #4608857 Legacy Poster: Per Sennels (persennels)
Thank you! I must admit that XML Catalogs are completely new creatures to me. But based on the documentation, I should create a catalogue which maps the public doctype to a local DTD. In order to make a parser aware of the catalogue, I must use a processing instruction to point to the catalogue. But if this PI must appear in all of the SMIL files I'm processing, then I will have to edit a set of several thousand files in order to put it in. Could just as well edit the DOCTYPE declaration.... Is it possible instead to put the PI in my master XML, and then have that information available when the SMIL files are being processed? If so, how is that information handed over to the SMIL files? Or, could the PI simply be placed in the stylesheet? Or, point to it from the command line? If it's relevant, I'm doing this on a Windows XP machine, using a command line similar to: java -jar saxon8.jar -o result.html master.xml GetInformation.xsl Per Sennels
RE: Validation and the doc function - Added by Anonymous about 17 years ago
Legacy ID: #4608909 Legacy Poster: Michael Kay (mhkay)
I'm no expert on OASIS catalogs, and can't really help you with the detail, but no: you shouldn't need to edit the instances. Just create a catalog that maps the system identifiers / public identifiers for DTDs or other entities to local files, then nominate the catalog-resolving-parser when you run Saxon. Something like this, except that's for Saxon 6.5: http://www.dpawson.co.uk/docbook/catalogs.html#d1654e141 Michael Kay
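A sketch of a catalog-resolving parser along the lines suggested above. It uses the `javax.xml.catalog` API that ships with Java 9 and later, which is a swapped-in technique: the linked page describes a separate resolver library for Saxon 6.5. The class name CatalogReaders and the method name newCatalogReader are hypothetical. The catalog file is assumed to contain a `system` or `public` entry that maps the SMIL DTD identifier to a local file.

```java
import java.net.URI;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;
import javax.xml.catalog.CatalogResolver;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

// Hypothetical helper: builds an XMLReader that looks up DTD references in
// an OASIS catalog instead of fetching them from their original location.
final class CatalogReaders {

    static XMLReader newCatalogReader(URI catalogFile) throws Exception {
        // CatalogResolver implements org.xml.sax.EntityResolver, so it can be
        // plugged straight into a SAX parser.
        CatalogResolver resolver =
                CatalogManager.catalogResolver(CatalogFeatures.defaults(), catalogFile);

        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);

        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setEntityResolver(resolver);
        return reader;
    }
}
```

Because the mapping lives in the catalog file, neither the SMIL instances nor their DOCTYPE declarations need to be edited. The returned reader can be wrapped in a SAXSource in the same way as any other preconfigured XMLReader.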
RE: Validation and the doc function - Added by Anonymous about 17 years ago
Legacy ID: #4608940 Legacy Poster: Per Sennels (persennels)
Thanks, I'll try it out. Per Sennels