How to configure a TransformerFactory to not attempt loading the URL indicated in an XML namespace?

Added by Kean Erickson almost 5 years ago

Hi there,

I'm parsing an XML document with a root element that has a namespace pointing to a URL with a domain that doesn't exist, like so:


<messages xmlns="http://n.validator.nu/messages/">

What's curious is that nothing in my XSL template finds a match as long as this namespace is present. If I manually strip the namespace declaration out of the XML, everything matches just fine. But I'm not sure what Saxon is doing to reach out to this URL, or why things are getting confused. I tried setting up an EntityResolver/URIResolver like so, and passing it as a URIResolver to the TransformerFactory:


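	// Needs: java.io.IOException, java.io.StringReader, javax.xml.transform.Source,
	// javax.xml.transform.TransformerException, javax.xml.transform.URIResolver,
	// org.xml.sax.EntityResolver, org.xml.sax.InputSource, org.xml.sax.SAXException
	private class W3CEntityResolver implements EntityResolver, URIResolver {

		// Called by the XML parser for external entities (e.g. a DTD);
		// returns empty content instead of fetching anything.
		@Override
		public InputSource resolveEntity(String publicId, String systemId) throws SAXException, IOException {
			return new InputSource(new StringReader(""));
		}

		// Called by the XSLT processor for xsl:import, xsl:include and document();
		// returning null tells the processor to fall back to its default resolution.
		@Override
		public Source resolve(String href, String base) throws TransformerException {
			return null;
		}
	}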

But these methods never get hit, so the resolver isn't being used. What am I missing? I'd prefer that Saxon not dial out to anything at all.

I'm using Saxon EE 9.7.0 on a Windows 10 machine.

Thanks!


Replies (3)


RE: How to configure a TransformerFactory to not attempt loading the URL indicated in an XML namespace? - Added by Michael Kay almost 5 years ago

Saxon will never attempt a fetch from a URL merely because the URL is used as a namespace in a namespace declaration. Neither will the XML parser. The fact that your EntityResolver isn't called seems to confirm this.
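This can be checked with a small sketch using the JDK's built-in parser rather than Saxon itself. The class and method names here are made up for illustration; the only thing taken from the question is the namespace URI. The resolver simply counts how often the parser consults it:

```java
import java.io.StringReader;
import java.util.concurrent.atomic.AtomicInteger;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

public class NamespaceIsNotFetched {

    // Root element from the question; the domain in the namespace does not resolve.
    static final String XML = "<messages xmlns=\"http://n.validator.nu/messages/\"/>";

    // Parses the XML with a counting EntityResolver installed and returns
    // how many times the parser asked the resolver for an external resource.
    static int countResolverCalls(String xml) throws Exception {
        AtomicInteger calls = new AtomicInteger();
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setEntityResolver((publicId, systemId) -> {
            calls.incrementAndGet();
            return new InputSource(new StringReader(""));
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return calls.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("resolver calls: " + countResolverCalls(XML));
    }
}
```

The document parses without any network access and the resolver is never consulted, because a namespace declaration is not an external entity reference.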

So something else is the matter here, and it looks to me as if you're simply failing to take the namespace into consideration in a match pattern or path expression.

Remember that a bare name in a path expression or match pattern (like match="title") will only match an element that is in no namespace. If the element is in a namespace, you either need to qualify the name in the stylesheet with a namespace prefix bound to the correct URI, or you need to add an xpath-default-namespace declaration to the stylesheet (available from XSLT 2.0 onwards).
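As a rough illustration of the difference, the following sketch counts elements with a bare name and with a prefixed name. It is written as XSLT 1.0 so that it runs with the JDK's default TransformerFactory as well as with Saxon. The `error` child element, the `m` prefix and the class name are assumptions made up for the example; only the namespace URI comes from the question.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class NamespaceMatchDemo {

    // Hypothetical source document: <error> inherits the default namespace of <messages>.
    static final String XML =
        "<messages xmlns=\"http://n.validator.nu/messages/\"><error>bad tag</error></messages>";

    // Stylesheet that counts elements by bare name (no namespace).
    static final String BARE = stylesheet("", "error");

    // Stylesheet that binds the prefix m to the namespace URI and counts m:error.
    static final String PREFIXED =
        stylesheet(" xmlns:m=\"http://n.validator.nu/messages/\"", "m:error");

    // Builds a text-output stylesheet that prints count(//name).
    static String stylesheet(String nsDecl, String name) {
        return "<xsl:stylesheet version=\"1.0\""
             + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\"" + nsDecl + ">"
             + "<xsl:output method=\"text\"/>"
             + "<xsl:template match=\"/\">"
             + "<xsl:value-of select=\"count(//" + name + ")\"/>"
             + "</xsl:template>"
             + "</xsl:stylesheet>";
    }

    // Runs the stylesheet against the XML and returns the output as a string.
    static String transform(String xslt, String xml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("bare name:     " + transform(BARE, XML));     // 0
        System.out.println("prefixed name: " + transform(PREFIXED, XML)); // 1
    }
}
```

The prefix used in the stylesheet does not have to appear in the source document; only the namespace URI has to be the same. With an XSLT 2.0 processor such as Saxon, putting xpath-default-namespace="http://n.validator.nu/messages/" on the xsl:stylesheet element should let the bare name match as well, without any prefixes.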

RE: How to configure a TransformerFactory to not attempt loading the URL indicated in an XML namespace? - Added by Kean Erickson almost 5 years ago

So I would need to add a qualifying prefix to every select/match in order to use the namespace? I think I'm just going to keep stripping it out of the XML to keep my template simpler. The namespace URL 404s anyway, so I don't see a lot of value in having it there. Thanks for your reply.

RE: How to configure a TransformerFactory to not attempt loading the URL indicated in an XML namespace? - Added by Michael Kay almost 5 years ago

The namespace isn't a URL; there is no resource at the end of it! It's just a unique identifier.

Yes, if your source document uses namespaces then the stylesheet needs to use them too, either by qualifying all the names or by using xpath-default-namespace.

You're quite right, though: if you don't need to use namespaces then don't use them. It makes life much simpler.
