EventToStaxBridge loses SystemId

Added by Anonymous about 15 years ago

Legacy ID: #6944025 Legacy Poster: David Lee (daldei)

I am using EventToStaxBridge to turn a NodeInfo into a StAX XMLStreamReader. It works great ... until I try to get the base URI or the SystemId. (StAX does not seem to know about a "base URI", so my next best bet is the SystemId stored in the Location.) When I call getLocation(), the SystemId it reports is always null.

Debugging into EventToStaxBridge, I find that the code is not quite complete, but there is evidence of an attempt. getLocation() returns a dynamically created Location object which always sets the SystemId to null. There's an inner class, SourceStreamLocation, which looks like an attempt to provide a Location object with real values, but as far as I can tell it's not actually used.

I hacked together a "fixed" EventToStaxBridge to prove the concept and changed getLocation() to populate the SystemId in the returned Location. I did this by changing line 460 to:

    public String getSystemId() {
        if (currentItem != null && currentItem instanceof NodeInfo)
            return ((NodeInfo) currentItem).getSystemId();
        else
            return null;
    }

This sort of works in my simple test case, although it appears to turn a simple relative URL ("books.xml") into a full path ... not sure why. I also suspect the code is naive and not quite right. Seems close though. Suggestions welcome!

-David
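For readers who only have the StAX side to work with, the same idea (a Location that reports a real system ID instead of null) can be sketched with plain JDK classes. This is a minimal sketch under stated assumptions, not Saxon code: SystemIdReader, its wrap helper, and the example URI are hypothetical names, and the wrapper uses the standard StreamReaderDelegate rather than patching EventToStaxBridge.

```java
import java.io.StringReader;
import javax.xml.stream.Location;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.util.StreamReaderDelegate;

// Hypothetical wrapper (not part of Saxon): reports a caller-supplied
// system ID from getLocation() and delegates everything else.
class SystemIdReader extends StreamReaderDelegate {
    private final String systemId;

    SystemIdReader(XMLStreamReader reader, String systemId) {
        super(reader);
        this.systemId = systemId;
    }

    @Override
    public Location getLocation() {
        final Location base = super.getLocation();
        return new Location() {
            // Line, column, offset and public ID come from the wrapped reader
            // when it provides a Location; -1 / null mean "unknown" in StAX.
            public int getLineNumber() { return base == null ? -1 : base.getLineNumber(); }
            public int getColumnNumber() { return base == null ? -1 : base.getColumnNumber(); }
            public int getCharacterOffset() { return base == null ? -1 : base.getCharacterOffset(); }
            public String getPublicId() { return base == null ? null : base.getPublicId(); }
            // The system ID is the value supplied by the caller.
            public String getSystemId() { return systemId; }
        };
    }

    // Convenience for the usage note: parse an XML string and attach a system ID.
    static XMLStreamReader wrap(String xml, String systemId) {
        try {
            XMLStreamReader raw = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            return new SystemIdReader(raw, systemId);
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this in place, SystemIdReader.wrap("<books/>", "http://example.org/data/books.xml").getLocation().getSystemId() returns the URI that was passed in. In the Saxon case the supplied value would presumably be the node's getSystemId(), as in the patch above.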


Replies (4)

RE: EventToStaxBridge loses SystemId - Added by Anonymous about 15 years ago

Legacy ID: #6950422 Legacy Poster: Michael Kay (mhkay)

You're certainly right that the code here looks unfinished. It was derived from PullToStax.java, which differs in that a PullProvider offers location information, whereas an EventIterator does not. I suspect I left some code around in anticipation that EventIterator would change some day to offer such information. Have you considered using PullToStax instead?

Your code fixes the problem for the simple case where the current PullEvent is a node, but in general I would have thought that's not a very important case. In particular, it will only ever be an attribute, text, comment, or PI node, since elements will have been decomposed into start and end events. (The documentation doesn't clearly say so, but the class is designed to handle decomposed event streams only.) A StartElementEvent contains a locationId, which can be resolved to a real location by using the LocationProvider in the PipelineConfiguration, and I would think this is the right way to provide the information. I'll look into it further.

I think that getSystemId() on a node should always return an absolute URI: it's designed to reflect the base URI property in the XDM, modulo the impact of the xml:base attribute.
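The expansion of "books.xml" into a full path reported in the original post is consistent with this: a relative reference becomes absolute once it is resolved against a base URI. A minimal sketch using only java.net.URI follows. It is not Saxon's own resolution code, and BaseUriDemo and the example URIs are hypothetical.

```java
import java.net.URI;

// Hypothetical helper (not part of Saxon) illustrating base-URI resolution.
class BaseUriDemo {
    // Resolve a possibly relative reference against an absolute base URI.
    // An already absolute reference is returned unchanged.
    static String resolve(String base, String reference) {
        return URI.create(base).resolve(reference).toString();
    }
}
```

For example, BaseUriDemo.resolve("http://example.org/data/catalog.xml", "books.xml") yields "http://example.org/data/books.xml". A document loaded from a relative file name would go through a comparable step against the current directory, which would explain the full path.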

RE: EventToStaxBridge loses SystemId - Added by Anonymous about 15 years ago

Legacy ID: #6953750 Legacy Poster: Michael Kay (mhkay)

I've committed a patch for this particular case (bug 2711894). In general however, the EventPull pipeline doesn't handle location information particularly well, and I'm finding it difficult to fix this.

RE: EventToStaxBridge loses SystemId - Added by Anonymous about 15 years ago

Legacy ID: #6955312 Legacy Poster: David Lee (daldei)

Thanks! I also tried PullToStax and it worked out of the box. As to why I didn't try it before? Well ... it's hunt & peck :) (you did ask ...)

Even after working with Saxon for more than 2 years I still have a difficult time figuring out which classes to use. I know I'm a fairly obscure use case, so I'm not complaining. But frequently I find myself with "Class A" when I really need "Class B", so I start hunting for the fragile threads that bind them, hoping such a thing exists. If I'm lucky it's 1 or 2 steps removed, but often it's 4 or 6 steps removed, and most of the time I'm just guessing whether it's the "right" path to get there.

For example, in this last case you suggested PullToStax, which was very useful ... but it still took me quite a while to come up with a working set of transformations. I have an "XdmValue" and need an "XMLStreamReader". For the PullToStax case I have found I need to convert

    XdmValue -> ValueRepresentation -> { Value | NodeInfo } -> SequenceIterator -> PullFromIterator -> PullToStax -> XMLStreamReader

And of course I didn't know that chain of transformations until I worked backwards and forwards to find something that matched (with many false starts and dead ends). It's definitely a deductive reasoning challenge :) But it worked!!!! Although I still don't have a clue whether this is the best path between the two types (or even a reasonably good one).

This sort of problem makes me think there would be a great use for a tool where you could plug in the endpoint types and it would query the codebase/library/type dictionaries and come up with the paths to translate from one type to another. Although I suspect nothing short of AI could really make sense of anything but the most trivial of translations. It's similar to a proof-generation AI ...

Please don't take this as any criticism of the code. Saxon is an awesome (if complex) codeset and I feel privileged to work with it.
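The path-finding tool imagined above can be sketched, for the trivial case, as a breadth-first search over a graph of "type A can be converted to type B" edges. This is only an illustration: ConversionPathFinder is a hypothetical class, and the edges are typed in by hand from the chain in this post, not discovered from the Saxon codebase.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical tool sketch: finds a shortest chain of conversions between two type names.
class ConversionPathFinder {
    private final Map<String, List<String>> edges = new LinkedHashMap<>();

    // Record that a value of type `from` can be turned into a value of type `to`.
    void addConversion(String from, String to) {
        edges.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    // Breadth-first search; returns a shortest path, or an empty list if none exists.
    List<String> findPath(String source, String target) {
        Map<String, String> cameFrom = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        cameFrom.put(source, null);
        queue.add(source);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(target)) {
                // Walk the predecessor links back to the source, then reverse.
                List<String> path = new ArrayList<>();
                for (String t = target; t != null; t = cameFrom.get(t)) {
                    path.add(t);
                }
                Collections.reverse(path);
                return path;
            }
            for (String next : edges.getOrDefault(current, Collections.emptyList())) {
                if (!cameFrom.containsKey(next)) {
                    cameFrom.put(next, current);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }

    // Edges copied by hand from the conversion chain described in this post.
    static ConversionPathFinder fromThread() {
        ConversionPathFinder f = new ConversionPathFinder();
        f.addConversion("XdmValue", "ValueRepresentation");
        f.addConversion("ValueRepresentation", "Value");
        f.addConversion("ValueRepresentation", "NodeInfo");
        f.addConversion("Value", "SequenceIterator");
        f.addConversion("NodeInfo", "SequenceIterator");
        f.addConversion("SequenceIterator", "PullFromIterator");
        f.addConversion("PullFromIterator", "PullToStax");
        f.addConversion("PullToStax", "XMLStreamReader");
        return f;
    }
}
```

Calling ConversionPathFinder.fromThread().findPath("XdmValue", "XMLStreamReader") returns a seven-step chain from XdmValue to XMLStreamReader. The hard part, as noted, is populating the graph from a real library: knowing which constructors and methods count as meaningful conversions is exactly what a simple search cannot decide.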

RE: EventToStaxBridge loses SystemId - Added by Anonymous about 15 years ago

Legacy ID: #6955646 Legacy Poster: Michael Kay (mhkay)

I can only say that sometimes I forget a class exists too! The PullEvent mechanism was intended as a replacement for PullProvider, but it never quite got far enough to replace all the functionality.

Another way of converting from NodeInfo to XMLStreamReader, using rather more "public" interfaces, would be to use XQJ:

    SaxonXQDataSource ds = new SaxonXQDataSource(node.getConfiguration());
    XMLStreamReader reader = new SaxonXQItem(node, ds.getConnection()).getItemAsStream();

That does essentially the same as you are doing behind the covers (using PullEvent iterators, and therefore losing the location information).

Michael Kay
