Bug #3920

Problems using xsl:result-document with no href attribute

Added by Gunther Rademacher almost 2 years ago. Updated almost 2 years ago.

Status: Closed
Priority: Normal
Assignee: Michael Kay
Category: XSLT conformance
Sprint/Milestone: -
Start date: 2018-09-27
Due date:
% Done: 100%
Estimated time:
Legacy ID:
Applies to branch: 9.9
Fix Committed on Branch: 9.9
Fixed in Maintenance Release: 9.9.0.2

Description

Thanks for Saxon 9.9!

The attached program is a GLR parser that was generated by REx for a simple ambiguous grammar. When run with this command line

java net.sf.saxon.Transform -it:main -xsl:x.xslt input={x} !indent=yes

it should report the ambiguity. But rather than writing results to System.out, it fails with a FileNotFoundException.

Tested with Saxon-HE 9.9.0.1. Works OK with 9.8.0.11.

x.xslt (42.8 KB) Gunther Rademacher, 2018-09-27 22:32

History

#1 Updated by Michael Kay almost 2 years ago

Congratulations on the first bug report. It never ceases to amaze me how we spend 3 months running a million tests, and within hours users find something very basic that doesn't work.

#2 Updated by Michael Kay almost 2 years ago

There's been a change in the handling of an xsl:result-document instruction with no @href attribute, though we didn't quite predict this consequence. Saxon 9.9 is doing more closely what the spec says it should do, which is to default the @href attribute to the value of the base output URI. When you run from the command line, with no -o option, the base output URI is the current working directory, and writing to that is obviously going to fail.
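The failure mode can be reproduced outside Saxon. The following is a minimal sketch, not Saxon's code: it assumes the output file is eventually opened with a plain FileOutputStream, and the class name DirectoryWriteDemo is hypothetical. In Java, opening a directory for writing throws FileNotFoundException, which matches the exception reported above.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of Saxon.
final class DirectoryWriteDemo {

    // Returns true if opening the target for writing fails with
    // FileNotFoundException, which is what Java reports for a directory.
    // Only call this with a directory: a regular file would be truncated.
    static boolean failsWithFileNotFound(File target) {
        try {
            new FileOutputStream(target).close();
            return false;
        } catch (FileNotFoundException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a base output URI that defaults to the working directory.
        File cwd = new File(System.getProperty("user.dir"));
        System.out.println("base output URI: " + cwd.toURI());
        System.out.println("write fails: " + failsWithFileNotFound(cwd));
    }
}
```

Supplying -o gives the base output URI a real file name, which is why it works around the problem.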

In previous releases xsl:result-document with no @href attempted to reuse the output stream established for the primary output of the transformation, regardless of whether or not it had a known URI. This strategy doesn't extend well to the wider range of output delivery mechanisms now available for a transformation, in particular the possibility of "raw output".

I'll have another think about how this can be made to work. Meanwhile using -o on the command line should work around the problem.

#3 Updated by Michael Kay almost 2 years ago

  • Category set to XSLT conformance
  • Assignee set to Michael Kay
  • Priority changed from Low to Normal

This needs careful thinking through. We did encounter problems in this area during testing, but perhaps bent too far in the direction of "doing what the spec says" rather than "doing what we've done in the past so applications don't break".

When xsl:result-document finds that the href is empty (or the absolutized URI is the same as the base output URI) then it consults the PrincipalOutputGatekeeper to ensure that we're not writing both primary output and secondary output to the same URI. I think the answer might lie in extending the role of the PrincipalOutputGatekeeper so it also determines where to send the output.
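The check described here can be sketched roughly as follows. This is a simplified stand-in written for illustration: the class name OutputGatekeeperSketch and its methods are hypothetical and do not reproduce Saxon's actual class. It only models the rule that one URI must not receive both primary and secondary output.

```java
// Hypothetical, simplified model of the check; not Saxon's actual class.
final class OutputGatekeeperSketch {
    private boolean usedAsPrimary = false;
    private boolean usedAsSecondary = false;

    // Called when the transformation first writes to the principal output.
    void useAsPrimary() {
        if (usedAsSecondary) {
            throw new IllegalStateException(
                "base output URI already used by xsl:result-document");
        }
        usedAsPrimary = true;
    }

    // Called when xsl:result-document resolves to the base output URI.
    void useAsSecondary() {
        if (usedAsPrimary) {
            throw new IllegalStateException(
                "base output URI already used for principal output");
        }
        usedAsSecondary = true;
    }

    public static void main(String[] args) {
        OutputGatekeeperSketch gatekeeper = new OutputGatekeeperSketch();
        gatekeeper.useAsSecondary();   // xsl:result-document with empty href
        try {
            gatekeeper.useAsPrimary(); // then ordinary principal output
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```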

The difficulty here is that the PrincipalOutputGatekeeper only knows about the output destination to the extent that it is a Receiver. But this Receiver encapsulates the whole serialization pipeline, which we don't want, because the secondary output can be serialized in a completely different way. What we really want is the StreamResult (or perhaps even the OutputStream) at the end of this serialization pipeline. Of course, there might not be one; the primary output might be being sent to a DOM, or it might be a RawDestination that delivers the items in the result sequence directly.

So perhaps the answer is: if a StreamResult has been registered with the PrincipalOutputGatekeeper as the primary destination, then use it; otherwise call the result document resolver to get a destination as 9.9 is currently doing.

The only snag is, at the point we create the PrincipalOutputGatekeeper, we're in the XsltController, which only knows about the destination Receiver. The construction of this Receiver is done in the s9api layer, and it's only the s9api layer therefore that has access to its internals.

We could solve this in several ways.

(a) The XsltController could peek inside the pipeline represented by its supplied Receiver to see if it ends in a StreamResult, and if so, pass that StreamResult to the PrincipalOutputGatekeeper.

(b) The s9api layer (specifically, AbstractXsltTransformer.getDestinationReceiver()) could pass the information to the XsltController, in the same way that it already passes the base output URI. This is probably cleaner - except that at this level, we still don't have direct access to the StreamResult; all we have is the Destination object. And we can't reuse the complete Destination for exactly the same reasons - it encapsulates serialization parameters which might be completely different for the secondary output.

(c) Analogously to the old 9.8 design, we could initially allocate an UncommittedDestination for the principal output. This would have the capability of generating both the primary and secondary serialization pipelines on request. The caller at the s9api level could elect to supply an UncommittedDestination (and the command line would generally do so); if the destination is an UncommittedDestination then it would be registered with the PrincipalOutputGatekeeper and thus be available for use by the no-href xsl:result-document instruction. If the principal destination is not an UncommittedDestination then the xsl:result-document call would be handled as it is in 9.9 now, i.e. the base output URI would be passed to the output URI resolver.

#4 Updated by Michael Kay almost 2 years ago

I have a solution which works, though it needs further thought and further testing.

(1) The s9api layer (AbstractXsltTransformer.getDestinationReceiver()) registers the principal Destination with the XsltController.

(2) ResultDocument, in the case where href="", calls a new method PrincipalOutputGatekeeper.tryToMakeReceiver(). If this returns non-null, we use this receiver rather than calling the result document resolver to get one.

(3) The implementation of PrincipalOutputGatekeeper.tryToMakeReceiver() is: if the principalDestination registered with the XsltController (at (1) above) is a Serializer, then call its repurpose() method to get a new Serializer. Set the new serialization properties on this new Serializer, and call Serializer.getReceiver() to return the Receiver. Otherwise return null.

(4) Serializer.repurpose() is a new method that constructs a new Serializer with the same destination (StreamResult) as the original, with null serialization properties.
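Steps (1) to (4) can be modelled in a few lines. This is a sketch under stated assumptions, not the committed code: every type nested in RepurposeSketch is a hypothetical stand-in with a simplified signature, a StringBuilder plays the role of the StreamResult, and the "method" property is used only to make the effect of the new serialization properties visible.

```java
import java.util.Properties;

// Hypothetical model of steps (1) to (4); the nested types are simplified
// stand-ins and do not reproduce Saxon's real classes or signatures.
final class RepurposeSketch {

    interface Receiver { void append(String text); }

    interface Destination { }

    // Stand-in for a Serializer: a shared sink plus serialization properties.
    static final class Serializer implements Destination {
        private final StringBuilder sink; // plays the role of the StreamResult
        private Properties properties = new Properties();

        Serializer(StringBuilder sink) { this.sink = sink; }

        // Step (4): same sink as the original, no serialization properties.
        Serializer repurpose() { return new Serializer(sink); }

        void setProperties(Properties p) { this.properties = p; }

        Receiver getReceiver() {
            String method = properties.getProperty("method", "xml");
            return text -> sink.append('[').append(method).append("] ").append(text);
        }
    }

    // Stand-in for any destination that is not a Serializer.
    static final class DomLikeDestination implements Destination { }

    // Step (3): only a Serializer can be repurposed; anything else yields
    // null so the caller falls back to the result document resolver.
    static Receiver tryToMakeReceiver(Destination principal, Properties secondaryProps) {
        if (principal instanceof Serializer) {
            Serializer s = ((Serializer) principal).repurpose();
            s.setProperties(secondaryProps);
            return s.getReceiver();
        }
        return null;
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        Properties props = new Properties();
        props.setProperty("method", "text");
        Receiver r = tryToMakeReceiver(new Serializer(out), props);
        r.append("secondary output");
        System.out.println(out);
    }
}
```

The instanceof test in this model is the point where Serializers are treated differently from every other kind of destination.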

The effect of this in user terms is:

(a) at the command line, where output is always to a Serializer, secondary output produced with xsl:result-document href="" goes to the destination that principal output would have gone to, i.e. the destination specified by -o, or in its absence, System.out.

(b) at the s9api level, if the Destination supplied to an XsltTransformer or Xslt30Transformer is a Serializer, then secondary output produced with xsl:result-document href="" goes to the destination that principal output would have gone to.

At the moment the second Serializer is not taking any output properties from the first. We could easily change this but it's probably clearer if we don't.

#5 Updated by Michael Kay almost 2 years ago

  • Status changed from New to In Progress

The change causes two unit tests to break (in the same way): testApplyTemplatesWithResultValidation, and testCallTemplateWithResultValidation.

These tests are expecting the serialization properties on the primary serializer to be carried across. A simple change to the new Serializer.repurpose() method achieves this.

#6 Updated by Michael Kay almost 2 years ago

I'm not comfortable that this solution treats Serializers differently from other kinds of Destination. This violates substitutability, and prevents someone, for example, writing a new Destination class that delegates to a Serializer.

Also, what is the correct behaviour of <xsl:result-document href=""/> when the primary Destination is, say, a DOMDestination or a SAXDestination? I think the effect should be uniform: reuse the primary destination, but with different parameters for serialization and validation.

This suggests that the repurpose() method (is that the best name we can find for it?) should apply to all kinds of Destination, and that the default should be to return the Destination unchanged, except for new serialization and validation parameters being supplied. (This raises obvious question marks about whether the software should ever modify a Destination object supplied by the user: but we already do, so the precedent has been set.)

#7 Updated by Michael Kay almost 2 years ago

I think the design can be simplified. Instead of the repurpose() method, we can simply call the Destination.getReceiver() method to get a second Receiver for the same Destination, and we can do this regardless of the kind of Destination.
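A rough model of this simplification, assuming a Destination can hand out more than one Receiver over the same underlying target. The types nested in SecondReceiverSketch are hypothetical, and the single-argument getReceiver(Properties) is a simplified signature chosen for the sketch, not the real s9api one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical model; names and signatures are illustrative only.
final class SecondReceiverSketch {

    interface Receiver { void append(String text); }

    // Every call to getReceiver() builds a fresh pipeline over the same
    // underlying target, using whatever parameters the caller supplies.
    interface Destination { Receiver getReceiver(Properties params); }

    // A non-serializing destination, loosely analogous to a DOM or raw target.
    static final class ListDestination implements Destination {
        final List<String> items = new ArrayList<>();

        @Override
        public Receiver getReceiver(Properties params) {
            String method = params.getProperty("method", "xml");
            return text -> items.add(method + ":" + text);
        }
    }

    public static void main(String[] args) {
        ListDestination destination = new ListDestination();

        Properties secondaryParams = new Properties();
        secondaryParams.setProperty("method", "text");

        // The same Destination serves both outputs; no type test is needed.
        Receiver primary = destination.getReceiver(new Properties());
        Receiver secondary = destination.getReceiver(secondaryParams);

        primary.append("principal result");
        secondary.append("result-document with empty href");
        System.out.println(destination.items);
    }
}
```

Because the caller only uses the Destination interface, a user-written Destination that delegates to a Serializer would behave the same way in this model.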

#8 Updated by Michael Kay almost 2 years ago

Note also, the code in XsltController contains remnants of a previous attempt to solve this problem using a "ReceiverFactory" class; this design was abandoned and its remnants should be deleted.

#9 Updated by Michael Kay almost 2 years ago

Current state - largely working with the exception of a couple of unit tests, for example in the area of saxon:next-in-chain. Some of the failures are apparently a consequence of fixing #3922.

#10 Updated by Michael Kay almost 2 years ago

  • Status changed from In Progress to Resolved
  • Applies to branch 9.9 added
  • Fix Committed on Branch 9.9 added

The remaining unit tests are now passing. They were assuming that the base output URI is initialized to the current working directory. This is now true only for the Transform command line, not for applications invoked using the JAXP or s9api APIs.

#11 Updated by Michael Kay almost 2 years ago

  • Subject changed from FileNotFoundException in net.sf.saxon.serialize.XMLEmitter to Problems using xsl:result-document with no href attribute

#12 Updated by O'Neil Delpratt almost 2 years ago

  • Status changed from Resolved to Closed
  • % Done changed from 0 to 100
  • Fixed in Maintenance Release 9.9.0.2 added

Bug fix applied to the Saxon 9.9.0.2 maintenance release.
