Optimize an XSLT script

Added by Anonymous about 16 years ago

Legacy ID: #5547060 Legacy Poster: XmlStudent (xmlstudent80)

Hi everyone! I'm trying to learn XSLT and I have a question about a particular transformation. This is the first time I have written in this forum, so please let me know if the problem or the questions are not clear.

Let's say that I have a source instance with two sets of soccer players:

---------------------------------------------------source------------
<Players>
  <Player role="forward">
    <name>Messi</name>
  </Player>
  <Player role="forward">
    <name>C.Ronaldo</name>
  </Player>
  <Player role="striker">
    <name>Messi</name>
  </Player>
  <Player role="striker">
    <name>Raul</name>
  </Player>
</Players>
---------------------------------------------------

The result that I want to get is a list of forward/striker players (a copy) without duplicates, that is:

---------------------------------------------------target------------
<Strikers>
  <Player>
    <name>Messi</name>
  </Player>
  <Player>
    <name>C.Ronaldo</name>
  </Player>
  <Player>
    <name>Raul</name>
  </Player>
</Strikers>
---------------------------------------------------

I've been able to write the following XSL script, but I suppose the complexity (and thus the running time) can be reduced. The idea in the script is: copy all the players marked as forward, then copy all the players marked as strikers that were not marked as forward.

---------------------------------------------------XSL------------
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:pp="Paolo's functions"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <xsl:function name="pp:is-node-in" as="xs:boolean">
    <xsl:param name="node" as="node()?"/>
    <xsl:param name="seq" as="node()"/>
    <xsl:sequence select="some $nodeInSeq in $seq satisfies ($nodeInSeq/text()=$node/text())"/>
  </xsl:function>

  [...]

    <xsl:for-each select=".[@role='forward']">
      <xsl:apply-templates select="."/>
    </xsl:for-each>
    <xsl:for-each select=".[@sourceView='striker']">
      <xsl:variable name="innerCandidate" select="."/>
      <xsl:if test="not(pp:is-node-in($innerCandidate/homeName,/result/*[@sourceView='Home2']/homeName))">
        <xsl:apply-templates select="."/>
      </xsl:if>
    </xsl:for-each>

  [...]

  </xsl:template>

  <xsl:template name="normal" match="*">
    <xsl:copy>
      <xsl:copy-of select="./@*[not((name()='sourceView'))]"/>
      <xsl:value-of select="./text()"/>
      <xsl:apply-templates select="*"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
------------------------------------------------------------------------------------------------

Notice the 'is-node-in' function: is there anything better available in Saxon?

Notice also the containment test I do for each striker element: this is like a very expensive self join on the source Players set. Any idea how to improve it, while leaving me the chance to compare players on more than one value (say name and age)?

Thanks in advance for your help!

-- paolo


Replies (2)

RE: Optimize an XSLT script - Added by Anonymous about 16 years ago

Legacy ID: #5547330 Legacy Poster: Michael Kay (mhkay)

I would suggest that you ask XSLT coding questions on the xsl-list at mulberrytech.com in future. This list is designed for questions that are specifically Saxon support issues. Performance is product-specific, of course, but in this case I think your questions are largely product-independent.

I suspect that your is-node-in() function should be testing for node identity rather than equality (but I'm not sure: I haven't really tried to follow your logic in detail, you haven't included it all, and what you have included refers to things that aren't in your source, like an @sourceView attribute). If I'm right, then

    <xsl:sequence select="some $nodeInSeq in $seq satisfies ($nodeInSeq/text()=$node/text())"/>

should be

    <xsl:sequence select="some $nodeInSeq in $seq satisfies ($nodeInSeq is $node)"/>

which can be more succinctly expressed as

    <xsl:sequence select="exists($node intersect $seq)"/>

On the other hand, if it really is an equality test you want, then it can be written

    <xsl:sequence select="$node = $seq"/>

However, all this is probably irrelevant. What you are doing looks like a simple grouping problem, and if you want to do it efficiently you should use <xsl:for-each-group>.
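As a rough illustration of the grouping suggestion, here is a minimal XSLT 2.0 sketch. It assumes the Players source shown in the original question (with the root element properly closed) and that players count as duplicates when their name values are equal as strings. The template structure is an assumption made for illustration, not the poster's actual stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- Assumes the document element is Players, as in the question. -->
  <xsl:template match="/Players">
    <Strikers>
      <!-- One group per distinct name; groups are processed in order
           of first appearance in the selected sequence. -->
      <xsl:for-each-group
          select="Player[@role = ('forward', 'striker')]"
          group-by="name">
        <Player>
          <!-- Copy the name from the first member of each group. -->
          <xsl:copy-of select="current-group()[1]/name"/>
        </Player>
      </xsl:for-each-group>
    </Strikers>
  </xsl:template>

</xsl:stylesheet>
```

To compare players on more than one value in XSLT 2.0, one option is a concatenated grouping key such as group-by="concat(name, '|', age)". Here age is a hypothetical child element that does not appear in the posted source, and the separator must be a character that cannot occur in the values.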

RE: Optimize an XSLT script - Added by Anonymous about 16 years ago

Legacy ID: #5547541 Legacy Poster: XmlStudent (xmlstudent80)

Michael, thanks for your reply (and for your work on Saxon in general). You are right, there were a couple of typos in my code: 'sourceView' stands for 'role', and the correct test line is:

    <xsl:if test="not(pp:is-node-in($innerCandidate/name,/result/*[@role='forward']/name))">

--------------------------------------------------------------------------------

I've been thinking about for-each-group, but I'm not sure it will help. If I understand correctly, with for-each-group you can specify transformations as follows (for my example):

    <xsl:for-each-group select="file" group-by="@role">
      <xsl:for-each select="current-group()">
        <xsl:value-of select="@name"/>
      </xsl:for-each>
    </xsl:for-each-group>

but this solution doesn't help in my case:

1. I need to take different actions depending on the value of role.
2. The performance issue is due to the test line above, where I check for each striker element whether I had already copied it in a previous step.

I'll try to post on the other web site as well, thanks for the pointer.

-- Paolo
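For the two concerns listed above, one possible alternative to the repeated containment test is an xsl:key lookup. The sketch below is not taken from the thread. It assumes the Players source from the original question, and the key name forward-by-name is made up for illustration. Forwards and strikers are selected separately, so each role could be given its own processing, and the "already copied as a forward" check becomes an indexed lookup in most processors.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- Hypothetical key: indexes forward players by the string value of name. -->
  <xsl:key name="forward-by-name" match="Player[@role = 'forward']" use="name"/>

  <xsl:template match="/Players">
    <Strikers>
      <!-- Step 1: every player marked as forward. -->
      <xsl:apply-templates select="Player[@role = 'forward']"/>
      <!-- Step 2: strikers whose name has no match among the forwards. -->
      <xsl:apply-templates
          select="Player[@role = 'striker'][not(key('forward-by-name', name))]"/>
    </Strikers>
  </xsl:template>

  <!-- Copy a player without its role attribute. -->
  <xsl:template match="Player">
    <Player>
      <xsl:copy-of select="name"/>
    </Player>
  </xsl:template>

</xsl:stylesheet>
```

Like the original script, this sketch does not remove duplicates within the forwards themselves. It only skips strikers that also appear as forwards.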
