Duplicate values from generate-id() function

Added by Anonymous almost 18 years ago

Legacy ID: #3861751 Legacy Poster: Sheila Morrissey (smorrissey)

Duplicate values from generate-id() function in an XSL 1.0 stylesheet using Saxon B 8.7.3, Java 1.5.0_06

Hello, Mr. Kay.

The generate-id() function in Saxon B 8.7.3 is creating identical values for 2 distinct elements when we execute one of our XSL 1.0 stylesheets. The function is being invoked on elements contained in a (Saxon B 8.7.3) node which has been returned from an extension function.

Each node returned from the extension function contains a single <body> element containing 2 or more <img> elements. We process the information in the <img> elements, creating 2 elements in our output for each <img> element in the node returned from the extension function. One of these output elements contains an IDREF attribute, and the other contains an ID attribute. The values for the IDREF and ID attributes in the paired output elements are computed by applying generate-id() to the same input <img> element.

The first time the template in question is invoked (from a template matching element name 'tihtml'), it is passed 2 <img> elements in succession from a template which is processing the <body> element in the node returned from the extension function in the 'tihtml' template. It correctly produces the 2 pairs of output elements, and there is no problem with duplicate ID values being created by generate-id().

The problem template is subsequently invoked from a template matching element name 'abshtml', which in turn invokes a template to process the <body> element returned from the extension function. This second <body> element contains 10 <img> elements, all with attribute values distinct from the <img> elements returned previously by the extension function. When generate-id() is invoked on the first of these new <img> elements, it generates the same value as it did for the last of the 2 <img> elements in the first invocation of the template. Thereafter, however, as the template processes the remaining 9 <img> elements, it correctly produces unique values when generate-id() is invoked.

Here is the template which is creating the duplicate id values:

    <xsl:template match="img" mode="html">
      <xsl:param name="pAsBreak" select="false()"/>
      <xsl:variable name="commentValue">
        <xsl:for-each select="preceding-sibling::comment()[1] | preceding-sibling::img[1]">
          <xsl:sort select="position()" data-type="number" order="descending"/>
          <xsl:if test="position() = 1">
            <xsl:if test="not(name() = 'img')">
              <xsl:choose>
                <xsl:when test="contains(., 'MATH: ')">
                  <xsl:value-of select="substring-after(., 'MATH: ')"/>
                </xsl:when>
                <xsl:otherwise>
                  <xsl:value-of select="."/>
                </xsl:otherwise>
              </xsl:choose>
            </xsl:if>
          </xsl:if>
        </xsl:for-each>
      </xsl:variable>
      <inline-formula>
        <xsl:if test="string-length(@src) > 0">
          <inline-graphic mimetype="image">
            <xsl:attribute name="alternate-form-of"><xsl:value-of select="generate-id()"/></xsl:attribute>
            <xsl:attribute name="xlink:href"><xsl:value-of select="concat('localfile:', @src)"/></xsl:attribute>
          </inline-graphic>
        </xsl:if>
        <tex-math>
          <xsl:attribute name="id"><xsl:value-of select="generate-id()"/></xsl:attribute>
          <xsl:choose>
            <xsl:when test="normalize-space($commentValue)">
              <xsl:value-of select="$commentValue"/>
            </xsl:when>
            <xsl:otherwise>
              <xsl:value-of select="@alt"/>
            </xsl:otherwise>
          </xsl:choose>
        </tex-math>
      </inline-formula>
    </xsl:template>

Here is the chain of invocation for the first element:

    <!-- ============================================================= -->
    <!-- "tihtml" -->
    <!-- ============================================================= -->
    <!-- -->
    <xsl:template match="tihtml">
      <xsl:text>&#xA;</xsl:text>
      <article-title>
        <xsl:variable name="path">
          <xsl:apply-templates select="." mode="get-full-xpath" />
        </xsl:variable>
        <xsl:variable name="result" select="jtidy:convert($tidyInstance, ., false(), $path)"/>
        <xsl:choose>
          <xsl:when test="$result">
            <xsl:apply-templates select="$result" mode="html">
              <xsl:with-param name="pAsBreak" select="true()"/>
            </xsl:apply-templates>
          </xsl:when>
          <xsl:otherwise>
            <xsl:call-template name="ConPrepMessage">
              <xsl:with-param name="msgNum" select="'AMS704'"/>
              <xsl:with-param name="scriptName" select="$SCRIPT_NAME"/>
              <xsl:with-param name="message"><xsl:value-of select="name()"/></xsl:with-param>
            </xsl:call-template>
          </xsl:otherwise>
        </xsl:choose>
      </article-title>
    </xsl:template>

The $result variable is a Saxon 8 node containing the <body> element which contains the <img> elements. The apply-templates therefore matches first on the <body> element, mode="html". That template is as follows:

    <xsl:template match="body" mode="html">
      <xsl:param name="pAsBreak" select="false()"/>
      <xsl:param name="expectP" select="false()"/>
      <xsl:choose>
        <xsl:when test="$expectP">
          <xsl:choose>
            <xsl:when test="p">
              <xsl:apply-templates mode="html">
                <xsl:with-param name="pAsBreak" select="$pAsBreak"/>
              </xsl:apply-templates>
            </xsl:when>
            <xsl:otherwise>
              <p>
                <xsl:apply-templates mode="html">
                  <xsl:with-param name="pAsBreak" select="$pAsBreak"/>
                </xsl:apply-templates>
              </p>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:when>
        <xsl:otherwise>
          <xsl:choose>
            <xsl:when test="p">
              <xsl:apply-templates select="p/*" mode="html">
                <xsl:with-param name="pAsBreak" select="$pAsBreak"/>
              </xsl:apply-templates>
            </xsl:when>
            <xsl:otherwise>
              <xsl:apply-templates mode="html">
                <xsl:with-param name="pAsBreak" select="$pAsBreak"/>
              </xsl:apply-templates>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:template>

The <img> elements contained in the body are then processed, via apply-templates with mode="html", by the "problem" template listed above.

For the second template (the one which ultimately results in production of duplicate IDs), the chain of invocation is as follows:

    <!-- ============================================================= -->
    <!-- "abshtml" -->
    <!-- ============================================================= -->
    <!-- -->
    <xsl:template match="abshtml">
      <xsl:choose>
        <xsl:when test="string-length(normalize-space(.)) > 0">
          <xsl:variable name="path">
            <xsl:apply-templates select="." mode="get-full-xpath" />
          </xsl:variable>
          <xsl:variable name="result" select="jtidy:convert($tidyInstance, normalize-space(.), true(), $path)"/>
          <xsl:if test="$result">
            <xsl:text>&#xA;</xsl:text>
            <abstract>
              <xsl:text>&#xA;</xsl:text>
              <title><x x-type="archive"><xsl:value-of select="$ABSTRACT_TITLE"/></x></title>
              <xsl:apply-templates select="$result" mode="abshtml"/>
            </abstract>
          </xsl:if>
        </xsl:when>
        <xsl:otherwise>
          <xsl:call-template name="ConPrepMessage">
            <xsl:with-param name="msgNum" select="'AMS702'"/>
            <xsl:with-param name="scriptName" select="$SCRIPT_NAME"/>
            <xsl:with-param name="message"><xsl:value-of select="name()"/></xsl:with-param>
          </xsl:call-template>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:template>

The $result variable is a Saxon 8 node containing the <body> element which contains the 10 <img> elements. The apply-templates therefore matches first on the <body> element, mode="abshtml". That template is as follows:

    <xsl:template match="body" mode="abshtml">
      <xsl:param name="pAsBreak" select="false()"/>
      <xsl:for-each select="node()">
        <xsl:choose>
          <xsl:when test="self::text() and not(normalize-space(.))"/>
          <xsl:when test="name(.) = 'p' or name(.) = 'div' or name(.) = 'br' or name(.) = 'address'">
            <xsl:apply-templates select="." mode="html">
              <xsl:with-param name="pAsBreak" select="$pAsBreak"/>
            </xsl:apply-templates>
          </xsl:when>
          <xsl:otherwise>
            <xsl:if test="position() = 1 or name(preceding-sibling::node()[1]) = 'p' or name(preceding-sibling::node()[1]) = 'div' or name(preceding-sibling::node()[1]) = 'br' or name(preceding-sibling::node()[1]) = 'address'">
              <xsl:text disable-output-escaping="yes">&lt;p&gt;</xsl:text>
            </xsl:if>
            <xsl:apply-templates select="." mode="html">
              <xsl:with-param name="pAsBreak" select="$pAsBreak"/>
            </xsl:apply-templates>
            <xsl:if test="position() = last() or name(following-sibling::node()[1]) = 'p' or name(following-sibling::node()[1]) = 'div' or name(following-sibling::node()[1]) = 'br' or name(following-sibling::node()[1]) = 'address'">
              <xsl:text disable-output-escaping="yes">&lt;/p&gt;</xsl:text>
            </xsl:if>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each>
    </xsl:template>

This template then again invokes the problem template via apply-templates with mode="html". This is when the duplicate ID value is generated.

Any help you could give us on this matter would be greatly appreciated.

Regards,
Sheila

Sheila M. Morrissey
Portico
228 Alexander Street
Princeton NJ 08540 USA
609 258 9173
http://portico.org/
http://www.ithaka.org/


Replies (3)


RE: Duplicate values from generate-id() funct - Added by Anonymous almost 18 years ago

Legacy ID: #3861768 Legacy Poster: Michael Kay (mhkay)

This looks like a pretty complex issue, so I think it would be best if you can package up all the files needed for me to reproduce it. You can send these directly to me at . I won't get a chance to look at it until I'm back from vacation at the weekend, I'm afraid.

RE: Duplicate values from generate-id() funct - Added by Anonymous almost 18 years ago

Legacy ID: #3867156 Legacy Poster: Michael Kay (mhkay)

I can see what's going wrong here, but it's not immediately obvious what I should advise you to do about it, or how I should fix the problem.

When you use the JAXP DocumentBuilder interface with Saxon, you don't supply a Saxon Configuration object (because the interface supplies no way of doing it), so the DocumentBuilder creates its own. Unique document numbers are allocated within a Configuration, so each document is getting the same document number (0). When you then feed this document as input to a Saxon transformation (this would be equally true if you supplied it as a parameter), Saxon isn't detecting that it was built under a different Configuration from the one in which the transformation is running, and that the document number might therefore not be unique.

I think I should change Saxon to add this check and reject the document if it was built under the wrong Configuration. I should probably also add a non-JAXP method to the DocumentBuilderImpl that allows a user-defined Configuration to be supplied. In your case you could get this by declaring an extra (first) parameter on your extension function:

    net.sf.saxon.expr.XPathContext context

and picking up context.getConfiguration().

It's hard to see any other way to fix the problem. I can't modify the document number held in the document because it might be in use in other transformations running in other threads. Potentially I could hold the allocated document number externally to the document itself, so the same document can have different document numbers in different transformations, but that's a lot of added complexity.

Your extension function is not actually using DOM interfaces: all it's doing, as far as I can see, is to parse lexical XML into a tree and then pass Saxon the tree. It would fit Saxon's design much better if you simply returned the lexical XML, bundled up as a StreamSource. Saxon would then build the tree in its own format. That is, instead of doing

    doc = db.parse(baisSE2);

you would do

    return new StreamSource(baisSE2);

Of course this has the drawback that it makes your code more Saxon-specific (I don't know if other processors allow an extension function to return a StreamSource). Also, the result of the extension function would be the document node, rather than the <body> element: you would have to navigate down to the body element with a path expression at the XSLT level.

Let me know what you think.
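The StreamSource suggestion can be sketched roughly as follows, using only JDK classes. This is an illustration under assumptions, not the poster's actual extension function: the class name TidySourceSketch, the single-argument convert signature and the serialize helper are all hypothetical, and the byte array stands in for whatever the tidying step produces (the baisSE2 stream above).

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TidySourceSketch {

    // Hypothetical stand-in for the extension function. Instead of parsing the
    // tidied bytes with a JAXP DocumentBuilder and returning the tree, it hands
    // the lexical XML back as a StreamSource, so the XSLT processor builds the
    // tree itself, under its own configuration.
    public static Source convert(byte[] tidiedXml) {
        return new StreamSource(new ByteArrayInputStream(tidiedXml));
    }

    // Demonstration helper only (not part of the suggested fix): runs the
    // Source through the JDK identity transformer to show it is consumable.
    public static String serialize(Source source) {
        try {
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.METHOD, "xml");
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(source, new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("could not serialize source", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample input: a tidied fragment with two <img> elements.
        byte[] xml = "<html><body><img src=\"a.gif\"/><img src=\"b.gif\"/></body></html>"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(serialize(convert(xml)));
    }
}
```

Because the value returned this way is a document node rather than the <body> element, the stylesheet would then select something like $result//body before applying templates; the exact path is an assumption and depends on the shape of the tidied document.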

RE: Duplicate values from generate-id() funct - Added by Anonymous almost 18 years ago

Legacy ID: #3867367 Legacy Poster: Sheila Morrissey (smorrissey)

Mr. Kay,

Thank you for your speedy and helpful response. The second alternative (returning a StreamSource from the extension function) seemed the most straightforward to implement -- we made the changes to both the Java extension function and the XSL -- works like a charm.

Thanks again,
Regards
Sheila

Sheila M. Morrissey
Portico
228 Alexander Street
Princeton NJ 08540 USA
609 258 9173
http://www.portico.org/
http://www.ithaka.org/
