Processing multiple documents w/o duplicates

Added by Anonymous over 17 years ago

Legacy ID: #4548693 Legacy Poster: d1_xslt (d1_xslt)

I want to process lists from multiple XML documents and eliminate duplicate entries. I tried the following stylesheet with the latest version of Saxon (saxonb8-9-0-4). The xsl:sequence instruction merges the two lists, but it does not eliminate duplicate entries with the same name. What am I doing wrong?

XSL Stylesheet:
====================================================
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match='/'>
    <xsl:variable name='mergedList' as="item()*">
      <xsl:sequence select='document(Documents/Document/@name)/Log/Clients/Client[ not( @name = preceding::*/@name ) ]'/>
    </xsl:variable>
    <xsl:for-each select='$mergedList'>
      <xsl:message>Client "<xsl:value-of select='@name'/>"</xsl:message>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

Document 1:
====================================================
<?xml version="1.0" encoding="utf-8"?>
<Documents>
  <Document name='t11.xml'/>
  <Document name='t12.xml'/>
</Documents>

Document t11.xml:
====================================================
<Log name="Log t1">
  <Clients>
    <Client name="A" />
    <Client name="B" />
  </Clients>
</Log>

Document t12.xml:
====================================================
<Log name="Log t1">
  <Clients>
    <Client name="A" />
    <Client name="B" />
  </Clients>
</Log>

Result:
====================================================
Client "A"
Client "B"
Client "A"
Client "B"

Replies (3)

RE: Processing multiple documents w/o duplica - Added by Anonymous over 17 years ago

Legacy ID: #4548794 Legacy Poster: Michael Kay (mhkay)

There is nothing in this question that's specific to Saxon; general XSLT coding questions should be asked on the xsl-list at mulberrytech.com.

The preceding axis (like every other axis) doesn't cross document boundaries. It's also very inefficient.

To eliminate duplicates in XSLT 2.0, use xsl:for-each-group:

<xsl:variable name='mergedList' as="item()*">
  <xsl:for-each-group select="document(Documents/Document/@name)/Log/Clients/Client" group-by="@name">
    <xsl:copy-of select="current-group()[1]"/>
  </xsl:for-each-group>
</xsl:variable>
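For readers who only need the distinct names reported, here is a minimal sketch of how the grouping could be dropped straight into the template from the question, without the intermediate variable. It assumes the same input documents as the original post. Reading the name from current-grouping-key() is an illustrative choice, not something the reply prescribes.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <!-- Group every Client from all referenced documents by its name attribute -->
    <xsl:for-each-group select="document(Documents/Document/@name)/Log/Clients/Client"
                        group-by="@name">
      <!-- The body runs once per distinct name; the grouping key is that name -->
      <xsl:message>Client "<xsl:value-of select="current-grouping-key()"/>"</xsl:message>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
```

With the sample files t11.xml and t12.xml this should emit one message per distinct client name. If the full Client elements are needed later, keeping the variable with current-group()[1] as in the reply is the better fit.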

RE: Processing multiple documents w/o duplica - Added by Anonymous over 17 years ago

Legacy ID: #4548869 Legacy Poster: d1_xslt (d1_xslt)

Thank you. That was exactly what I was looking for. I have one more question: the first approach did not work even when I (a) created a copy of the merged list and (b) defined a second list that filters out duplicate elements. As I understand it, the copied list should eliminate the cross-document boundary problem.

===========================================
<xsl:variable name='mergedList' as="item()*">
  <xsl:copy-of select='document(Documents/Document/@name)/Log/Clients/Client'/>
</xsl:variable>
<xsl:variable name='mergedList2' as="item()*">
  <xsl:sequence select='$mergedList[ not( @name = preceding::*/@name ) ]'/>
</xsl:variable>

RE: Processing multiple documents w/o duplica - Added by Anonymous over 17 years ago

Legacy ID: #4549139 Legacy Poster: Michael Kay (mhkay)

No, your code doesn't create a new document. It just creates a sequence of elements. The elements are parentless copies of the originals; they aren't part of a new document. To add the elements to a document, you need to use xsl:document, or leave out the as="item()*" from xsl:variable.
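A minimal sketch of the second option (leaving out the as attribute), assuming the same input documents as in the original question. Without as, the variable holds a temporary tree: a document node whose children are the copied Client elements. The copies are then siblings, so an axis step can see the earlier ones. The variable name mergedTree and the use of preceding-sibling instead of preceding are illustrative choices, not part of the reply above.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <!-- No "as" attribute: the variable is a document node containing the copies -->
    <xsl:variable name="mergedTree">
      <xsl:copy-of select="document(Documents/Document/@name)/Log/Clients/Client"/>
    </xsl:variable>
    <!-- The copies now share one parent, so earlier copies are reachable as siblings -->
    <xsl:for-each select="$mergedTree/Client[not(@name = preceding-sibling::Client/@name)]">
      <xsl:message>Client "<xsl:value-of select="@name"/>"</xsl:message>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

This keeps the filtering style of the first attempt, but it still compares each element against all earlier ones, so xsl:for-each-group is likely to remain the better choice for long lists.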
