Merge xml files

Added by Mohd Shadab almost 5 years ago

I would like to merge 2 XML files,

<?xml version="1.0" encoding="UTF-8"?>
<root>
	<list>
		<cast subject="C#" title="cast 1"/>
		<cast subject="XQuery" title="cast 3"/>
		<cast subject="XSLT" title="cast 4"/>
	</list>
</root>

<?xml version="1.0" encoding="UTF-8"?>
<books>
	<book>
		<name>book 1</name>
		<subject>C#</subject>
	</book>
	<book>
		<name>book 3</name>
		<subject>XSLT</subject>
	</book>
</books>

and get the following output:

<?xml version="1.0" encoding="UTF-8"?>
<titles>
	<book>
		<name>book 1</name>
		<subject>C#</subject>
		<title>cast 1</title>
	</book>
	<book>
		<name>book 3</name>
		<subject>XSLT</subject>
		<title>cast 4</title>
	</book>
</titles>

Please suggest an XSLT stylesheet for doing this merge.


Replies (10)

RE: Merge xml files - Added by Martin Honnen almost 5 years ago

Declare a key <xsl:key name="ref" match="cast" use="@subject"/>, then process the second document as the primary input and use:

<xsl:template match="book">
  <xsl:copy>
    <xsl:copy-of select="@*, node()"/>
    <title>{key('ref', subject, doc('file1.xml'))/@title}</title>
  </xsl:copy>
</xsl:template>

Additionally set up the identity transformation with <xsl:mode on-no-match="shallow-copy"/> and use expand-text="yes" on the xsl:stylesheet/xsl:transform root element.
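For readers who want the pieces above in one place, here is a minimal sketch of a complete XSLT 3.0 stylesheet. It assumes the first document is saved as file1.xml next to the stylesheet (the name used in the snippet above) and that the second document is supplied as the primary input. The $casts variable, the books template that renames the root element to titles, and the xsl:output setting are additions for illustration, not part of the original reply.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    expand-text="yes">

  <!-- Identity transformation for anything not matched explicitly -->
  <xsl:mode on-no-match="shallow-copy"/>
  <xsl:output indent="yes"/>

  <!-- Index the cast elements of the first document by subject -->
  <xsl:key name="ref" match="cast" use="@subject"/>

  <!-- Assumption: file1.xml sits next to the stylesheet -->
  <xsl:variable name="casts" select="doc('file1.xml')"/>

  <!-- Assumption: rename the root to match the wanted output -->
  <xsl:template match="books">
    <titles>
      <xsl:apply-templates select="book"/>
    </titles>
  </xsl:template>

  <!-- Copy each book and append the title looked up by subject -->
  <xsl:template match="book">
    <xsl:copy>
      <xsl:copy-of select="@*, node()"/>
      <title>{key('ref', subject, $casts)/@title}</title>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Because the lookup starts from each book, a cast without a matching book (such as the XQuery one) never reaches the output. A book without a matching cast would still be copied, with an empty title element.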

RE: Merge xml files - Added by Mohd Shadab almost 5 years ago

Thanks Martin. Is it possible to use xsl:merge-source and xsl:merge-action?

RE: Merge xml files - Added by Martin Honnen almost 5 years ago

It should be possible, but in my view it has the disadvantage that it requires the merge keys to be sorted in the input XML files, or requires you to indicate that the items to be merged should first be sorted by the merge keys. Also, with xsl:merge I think you would need some extra effort to ensure that the <cast subject="XQuery" title="cast 3"/> from the first file doesn't show up in the merge result.

RE: Merge xml files - Added by Mohd Shadab almost 5 years ago

Could you please share the XSL using xsl:merge? I agree that the XML files need to be sorted, but I am trying to understand what the limitations of xsl:merge are. In this example, the aim is to create output with non-key fields from both input files: each input file has 2 fields and the output has 3 fields.

For now we can ignore the <cast subject="XQuery" title="cast 3"/> element appearing in the output.

RE: Merge xml files - Added by Michael Kay almost 5 years ago

I agree with Martin, this looks to me more of a join query than a merge query. However, if both the files are sorted on subject (or if you're prepared to pre-sort them), then it can usefully be done using xsl:merge (the main advantage is if the files are large and you want to use streaming). I won't do you a worked example, but you can define two xsl:merge-source elements, one for each file:

<xsl:merge-source name="A" select="doc('1.xml')/root/list/cast">
  <xsl:merge-key select="@subject"/>
</xsl:merge-source>

<xsl:merge-source name="B" select="doc('2.xml')/books/book">
  <xsl:merge-key select="subject"/>
</xsl:merge-source>

and then in the xsl:merge-action you can access the two records as current-merge-group('A') and current-merge-group('B'); test if both exist, and if they do, extract the relevant information.

RE: Merge xml files - Added by Michael Kay almost 5 years ago

P.S. If you want help with coding, then it's best to show us your best attempt and explain how it fails. That way we can see where you've gone wrong and explain the relevant points of the spec. Otherwise, we're just writing the code for you, and that doesn't really help you understand what you're doing.

RE: Merge xml files - Added by Mohd Shadab almost 5 years ago

Thanks, Mike. Yes, this is for streaming mode only. I am attempting this merge-action, but the issue is that the book and cast templates are defined separately. How do I merge the elements from the 2 XML files into a single output record for the same key?

 <xsl:merge-source name="A" select="doc('1.xml')/root/list/cast">
   <xsl:merge-key select="@subject"/>
 </xsl:merge-source>
 
 <xsl:merge-source name="B" select="doc('2.xml')/books/book">
   <xsl:merge-key select="subject"/>
 </xsl:merge-source>

	<xsl:merge-action>
		<subject name="{current-merge-key()}">
			<xsl:apply-templates select="current-merge-group('A'), current-merge-group('B')"/>
		</subject>
	</xsl:merge-action>

	<xsl:template match="book">
		<media1 type="{local-name()}">
			<xsl:apply-templates select="* except subject"/>
		</media1>
	</xsl:template>
	
	<xsl:template match="cast">
		<media type="{local-name()}">
			<title>{@title}</title>
		</media>
	</xsl:template>

RE: Merge xml files - Added by Martin Honnen almost 5 years ago

The subject name="{current-merge-key()}" should give you a single record for each merge group, so that part should work. You haven't spelled out exactly which content you want, and the templates you have shown seem to create a structure with media type="{local-name()}" that does not appear in your wanted result.

As for using streaming, that is a different issue. As far as I remember off the top of my head, you need to use for-each-source, e.g. <xsl:merge-source name="A" for-each-source="'doc1.xml'" select="root/list/cast" streamable="yes">.
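To make the for-each-source form concrete, here is an untested sketch of how both merge sources might be declared for streaming. It assumes the files are named 1.xml and 2.xml (as in the earlier snippet), that both are already sorted on subject, and that the processor supports streaming. The subjects wrapper element is hypothetical, and the merge action only copies each group, in the spirit of the subject name="{current-merge-key()}" attempt above. As I understand the specification, items from a streamable merge source are delivered as snapshot copies, so the merge key and merge action can look inside each cast or book, but please check this against your processor's documentation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output indent="yes"/>

  <!-- No primary input: start at the default initial template -->
  <xsl:template name="xsl:initial-template">
    <!-- Hypothetical wrapper element for the grouped result -->
    <subjects>
      <xsl:merge>
        <!-- for-each-source takes URIs; select is relative to each document -->
        <xsl:merge-source name="A" for-each-source="'1.xml'"
                          select="root/list/cast" streamable="yes">
          <xsl:merge-key select="@subject"/>
        </xsl:merge-source>
        <xsl:merge-source name="B" for-each-source="'2.xml'"
                          select="books/book" streamable="yes">
          <xsl:merge-key select="subject"/>
        </xsl:merge-source>
        <!-- One group per distinct subject, holding items from both sources -->
        <xsl:merge-action>
          <subject name="{current-merge-key()}">
            <xsl:copy-of select="current-merge-group()"/>
          </subject>
        </xsl:merge-action>
      </xsl:merge>
    </subjects>
  </xsl:template>

</xsl:stylesheet>
```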

RE: Merge xml files - Added by Mohd Shadab almost 5 years ago

I apologize, this is the template I am trying to build. But how will the template for book (which is in doc2) get access to title (which is in doc1)? I am following this link for the merge: https://www.saxonica.com/html/documentation/xsl-elements/merge.html. Please advise on the non-streamable case first; then we can work on the streamable one.

	<xsl:template match="book">
		<titles>
			<book><xsl:value-of select="local-name()" /></book>
			<subject><xsl:value-of select="local-name()" /></subject>
			<!-- how to get title? -->
		</titles>
	</xsl:template>

RE: Merge xml files - Added by Michael Kay almost 5 years ago

I don't think I would use template rules here. I would do something like

<xsl:merge-action>
  <xsl:variable name="cast" select="current-merge-group('A')"/>
  <xsl:variable name="book" select="current-merge-group('B')"/>
  <xsl:if test="exists($cast) and exists($book)">
    <book>
        <name>{$book/name}</name>
        <subject>{$book/subject}</subject>
        <title>{$cast/@title}</title>
    </book>
  </xsl:if>
</xsl:merge-action>
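Putting the merge sources from earlier in the thread together with this merge action, a complete non-streaming stylesheet might look like the sketch below. It assumes the two inputs are 1.xml and 2.xml next to the stylesheet, that each is already sorted on subject (the sample data is), and that each subject occurs at most once per file. The named initial template, the titles wrapper, and the xsl:output setting are additions for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    expand-text="yes">

  <xsl:output indent="yes"/>

  <!-- No primary input: both files are read by the merge sources -->
  <xsl:template name="xsl:initial-template">
    <titles>
      <xsl:merge>
        <!-- If an input were not sorted on its key, sort-before-merge="yes"
             could be added to the corresponding xsl:merge-source -->
        <xsl:merge-source name="A" select="doc('1.xml')/root/list/cast">
          <xsl:merge-key select="@subject"/>
        </xsl:merge-source>
        <xsl:merge-source name="B" select="doc('2.xml')/books/book">
          <xsl:merge-key select="subject"/>
        </xsl:merge-source>
        <xsl:merge-action>
          <xsl:variable name="cast" select="current-merge-group('A')"/>
          <xsl:variable name="book" select="current-merge-group('B')"/>
          <!-- Emit a record only when the subject occurs in both files -->
          <xsl:if test="exists($cast) and exists($book)">
            <book>
              <name>{$book/name}</name>
              <subject>{$book/subject}</subject>
              <title>{$cast/@title}</title>
            </book>
          </xsl:if>
        </xsl:merge-action>
      </xsl:merge>
    </titles>
  </xsl:template>

</xsl:stylesheet>
```

The exists() test is what keeps the XQuery cast out of the result, since it has no matching book. If a subject could occur several times in one file, $book or $cast would hold several items and the merge action would need an xsl:for-each over them.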