Fragments of source XML in output

Added by Anonymous over 12 years ago

Legacy ID: #10826146 Legacy Poster: Tomaz Erjavec (tomaz-erjavec)

I'm writing a fairly simple XSLT to convert a TEI dictionary into HTML, but a strange thing happens: here and there, fragments of the source XML appear escaped in the HTML output. It seems to happen only when the file gets large. I'm using Saxon-HE, and the same behaviour appears with both 9.2 and 9.3. I put the source, XSLT and output on http://nl.ijs.si/imp/bug/ The first error is at http://nl.ijs.si/imp/bug/Lexicon_bug.html#lex.261b810f56f0d28229c6e2b019848ba9 but more can be found by searching for '<'. Thanks for any help! Tomaž


Replies (5)


RE: Fragments of source XML in output - Added by Anonymous over 12 years ago

Legacy ID: #10826311 Legacy Poster: Michael Kay (mhkay)

Are you using the default XML parser included in the JDK? If so, could you see if the problem still occurs when you use the Xerces parser from Apache instead? This looks similar to effects I've seen caused by bugs in the JDK parser.

RE: Fragments of source XML in output - Added by Anonymous over 12 years ago

Legacy ID: #10826419 Legacy Poster: Tomaz Erjavec (tomaz-erjavec)

This could well be it. I'm a bit helpless when it comes to Java, but will figure out how to install Xerces and try that. I will get back if the problem persists. Thanks! Tomaž

RE: Fragments of source XML in output - Added by Anonymous over 12 years ago

Legacy ID: #11029108 Legacy Poster: Tomaz Erjavec (tomaz-erjavec)

I've tried using Xerces now, but the problem still persists. As before, a sample of the bad output (search for '<') is at http://nl.ijs.si/imp/bug and the way Saxon is now called is as below.

    java -Djavax.xml.parsers.DocumentBuilderFactory=org.apache.xerces.jaxp.DocumentBuilderFactoryImpl -Djavax.xml.parsers.SAXParserFactory=org.apache.xerces.jaxp.SAXParserFactoryImpl net.sf.saxon.Transform

In case it would be any help, I can put the source and XSLT on the web as well, but I am afraid that it is indeed some horrible XML parser or Java problem that will be impossible to solve. Which is pretty awful, actually. Anyway, thanks for any help! Tomaž

RE: Fragments of source XML in output - Added by Anonymous over 12 years ago

Legacy ID: #11030110 Legacy Poster: Michael Kay (mhkay)

If you can supply the information needed to reproduce this I will be happy to investigate. Without that information, I can't really help.

RE: Fragments of source XML in output - Added by Anonymous over 12 years ago

Legacy ID: #11033314 Legacy Poster: Tomaz Erjavec (tomaz-erjavec)

Thank you, you've already helped! Nothing like preparing files to show to others to find that the bug was actually on my side: I process the dictionary in several steps, and one of them still used the default XML parser. So the conversion to HTML was completely OK; it just got already garbled input. Sorry for crying wolf!
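
Since the root cause was one pipeline step silently falling back to the default parser, a small check can be run with the same JVM options as each step. The following is a minimal sketch, not part of the original thread: the class name WhichParser and the helper isJdkBuiltIn are hypothetical, and the check assumes the JDK's bundled parser lives under the com.sun.org.apache.xerces.internal package, as it does in Oracle/OpenJDK builds.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical helper: reports which JAXP factory implementations this JVM resolves.
class WhichParser {

    // Class name of the SAX parser factory that JAXP selects for this JVM.
    static String saxFactoryName() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    // Class name of the DOM document builder factory that JAXP selects.
    static String domFactoryName() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    // Assumption: the parser bundled with the JDK is repackaged under this prefix,
    // while Apache Xerces proper uses org.apache.xerces.
    static boolean isJdkBuiltIn(String className) {
        return className.startsWith("com.sun.org.apache.xerces.internal.");
    }

    public static void main(String[] args) {
        String sax = saxFactoryName();
        String dom = domFactoryName();
        System.out.println("SAX factory: " + sax + (isJdkBuiltIn(sax) ? " (JDK built-in)" : ""));
        System.out.println("DOM factory: " + dom + (isJdkBuiltIn(dom) ? " (JDK built-in)" : ""));
    }
}
```

Run with the same -Djavax.xml.parsers... options as the Saxon call above, and with the Xerces jar on the classpath, this should report the org.apache.xerces.jaxp classes. If a step reports a JDK built-in factory instead, that step is still using the default parser. If the options name Xerces but the jar is missing from the classpath, newInstance() fails with a FactoryConfigurationError rather than falling back quietly.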
