Project

Profile

Help

Bug #3159

closed

Possible concurrency bug when using strip-space

Added by Cory O over 7 years ago. Updated over 7 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
Multithreading
Sprint/Milestone:
-
Start date:
2017-03-11
Due date:
% Done:

100%

Estimated time:
Legacy ID:
Applies to branch:
9.7
Fix Committed on Branch:
9.7, trunk
Fixed in Maintenance Release:
Platforms:

Description

I am randomly receiving the following exception when processing the same input using a Templates object in multiple threads.

Unknown element: <html/>
Error in xsl:element/@name on line 7 column 35 
  XTMM9000: Processing terminated by xsl:message at line 7 in 
  in built-in template rule
; SystemID: ; Line#: 7; Column#: 35
net.sf.saxon.expr.instruct.TerminationException: Processing terminated by xsl:message at line 7 in 
	at net.sf.saxon.expr.instruct.Message.processLeavingTail(Message.java:223)
	at net.sf.saxon.expr.instruct.TemplateRule.applyLeavingTail(TemplateRule.java:353)
	at net.sf.saxon.trans.Mode.applyTemplates(Mode.java:456)
	at net.sf.saxon.trans.TextOnlyCopyRuleSet.process(TextOnlyCopyRuleSet.java:65)
	at net.sf.saxon.trans.Mode.applyTemplates(Mode.java:433)
	at net.sf.saxon.Controller.transformDocument(Controller.java:2321)
	at net.sf.saxon.Controller.transform(Controller.java:1892)
	at net.sf.saxon.s9api.XsltTransformer.transform(XsltTransformer.java:579)
	at net.sf.saxon.jaxp.TransformerImpl.transform(TransformerImpl.java:185)
	at z.test.UnsafeXslt.run(UnsafeXslt.java:89)
	at java.lang.Thread.run(Thread.java:745)

As I understand it, Templates.newTransformer() should be thread safe.

I even wrapped it within a synchronized block to be sure.

The problem only seems to occur when I include the following in my XSLT:

<xsl:strip-space elements="*"/>

You may need to execute the following multiple times before seeing the exception.

unsafe.main.xslt

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:html="http://www.w3.org/1999/xhtml">
	<xsl:strip-space elements="*"/>
	<xsl:template match="*">
		<xsl:message terminate="yes">
			<xsl:text>Unknown element: </xsl:text>
			<xsl:element name="{name()}" />
		</xsl:message>
	</xsl:template>
	<xsl:template match="html:html">
		<root>
			<xsl:apply-templates />
		</root>
	</xsl:template>
	<xsl:template match="html:body">
		<record>
			<xsl:apply-templates />
		</record>
	</xsl:template>
</xsl:stylesheet>

unsafe.input.xml

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"><body>content</body></html>

UnsafeXslt.java

package z.test;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import java.net.URISyntaxException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

import javax.xml.transform.Source;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class UnsafeXslt implements Runnable {
	private static final Templates templates;
	static {
		final TransformerFactory factory = TransformerFactory.newInstance();
		factory.setURIResolver(new URIResolver() {
			@Override
			public Source resolve(String href, String base) throws TransformerException {
				final InputStream input = UnsafeXslt.class.getResourceAsStream(href);
				if (input == null) {
					throw new TransformerException("Unable to find '" + href + "' in '" + base + "'.");
				}
				return new StreamSource(input, "ccm:"+href);
			}
		});
		try {
			templates = factory.newTemplates(new StreamSource(UnsafeXslt.class.getResourceAsStream("/unsafe.main.xslt")));
		} catch (TransformerConfigurationException e) {
			throw new RuntimeException(e);
		}
	}
	private static final String input;
	static {
		try {
			input = String.join("\n", Files.readAllLines(Paths.get(UnsafeXslt.class.getResource("/unsafe.input.xml").toURI())));
		} catch (IOException | URISyntaxException e) {
			throw new RuntimeException(e);
		}
	}
	
	public static void main(String[] args) throws Throwable {
		final int threadSize = 30;
		final CountDownLatch latch = new CountDownLatch(threadSize);
		final ThreadFactory threadFactory = Executors.defaultThreadFactory();
		final Thread[] threads = new Thread[threadSize];
		for (int i = 0; i < threadSize; ++i) {
			threads[i] = threadFactory.newThread(new UnsafeXslt(latch));
		}
		for (int i = 0; i < threadSize; ++i) {
			threads[i].start();
		}
		for (int i = 0; i < threadSize; ++i) {
			threads[i].join();
		}
	}
	
	private CountDownLatch latch;
	
	public UnsafeXslt(final CountDownLatch latch) {
		this.latch = latch;
	}
	
	@Override
	public void run() {
		try {
			latch.countDown();
			latch.await();
			final int size = 1000;
			final ByteArrayOutputStream output = new ByteArrayOutputStream();
			final StreamResult result = new StreamResult(output);
			for (int i = 0; i < size; ++i) {
				output.reset();
				final Transformer transformer;
				synchronized (templates) {
					transformer = templates.newTransformer();
				}
				transformer.transform(new StreamSource(new StringReader(input), "ccm:input"), result);
			}
		} catch (Throwable t) {
			t.printStackTrace(System.out);
		}
	}
}
Actions #1

Updated by Cory O over 7 years ago

Using the latest Saxon HE 9.7.0-15.

Java 1.8 through Eclipse Mars.1 Release (4.5.1).

Actions #2

Updated by Michael Kay over 7 years ago

Thanks for reporting it.

My initial attempts to reproduce it have been unsuccessful, but I will keep trying, in particular by trying to reproduce your operating conditions more precisely.

Actions #3

Updated by Michael Kay over 7 years ago

Could you check whether you still get the error if you use the Apache Xerces parser rather than the built-in JDK parser?

Actions #4

Updated by Cory O over 7 years ago

Yes, I still get the error when using Xerces (2.4.0).

public static Source xercesSource() throws SAXException {
	final InputSource source = new InputSource(new StringReader(input));
	final XMLReader reader = XMLReaderFactory.createXMLReader("org.apache.xerces.parsers.SAXParser");
	return new SAXSource(reader, source);
}
Actions #5

Updated by Michael Kay over 7 years ago

Can you reproduce it outside Eclipse?

(I'm now running your exact Java code, accessing the source document and stylesheet the way you do from static initialization blocks (which I would never normally do and am therefore suspicious of) - and still not seeing any error.

We don't routinely use Eclipse and when people report problems, I'm always suspicious it might have something to do with the unusual class loading mechanisms, so it would be nice to eliminate Eclipse from our enquiries, as the police say.

Actions #6

Updated by Michael Kay over 7 years ago

The first stage in diagnosing a threading error is to reproduce it, and unless we can reproduce it, it's idle to speculate about the causes. So this is idle speculation.

The message output suggests that the html element of the input document doesn't match the template rule for html.

The most likely reason is that the allocation of NamePool fingerprints has gone wrong for some reason. Specifically, the fingerprint in the compiled stylesheet doesn't match the fingerprint in the source document. The NamePool is shared by all threads (it's owned by the Configuration which is owned by the TransformerFactory) so it's definitely a focus for investigation. Given that the source document is built by the transformer, it seems highly implausible that there should be more than one NamePool around. If you feel like doing some debugging at the Saxon level, you could breakpoint at the construction of the NamePool to check this.

I recall now that one aspect of the (re-)design of the NamePool that caused me a little unease is that it contains two concurrent hashmaps, one to map QNames to fingerprints, and one to do the reverse; and there is a potential risk that the two hashmaps could contain inconsistent mappings; the whole aim of the redesign was to avoid single-threading all updates to the NamePool.

Now, I can't really see what could go wrong here, because in this particular case, all the QNames that appear in the source document also appear in the stylesheet, so the NamePool must be fully populated before any source documents are built. But if I were able to reproduce the problem, the first thing I would do to try and identify a cause would be (a) to check the fingerprints in the source document against those in the stylesheet, and (b) in the event of inconsistency, to monitor the allocation of fingerprints in the NamePool. It helps that there are only two names in this case.

The other line of attack is to explore why space stripping might be significant. Again, it's not easy to see what could go wrong here. Because the input document is supplied as a stream, space stripping is a filter operation applied to events emanating from the XML parser en route to the document builder. (The filter class is called "Stripper", and its javadoc gets a large number of Google hits...) The thing is, the Stripper class always passes element nodes through unchanged; the only thing it affects are text nodes. So it's hard to see how its presence could contribute to any kind of confusion over element name fingerprints. But again, this is an avenue to investigate.

Actions #7

Updated by Cory O over 7 years ago

Should I not be creating a single Templates object statically and reusing it?

This is what I am doing to reuse it between Servlet calls in my web application.

I updated the source to pull a new InputStream with each call. I only did the previous to recreate the error.

public static Source xercesSource() throws SAXException {
	final InputStream input = UnsafeXslt.class.getResourceAsStream("/unsafe.input.xml");
	final InputSource source = new InputSource(input);
	final XMLReader reader = XMLReaderFactory.createXMLReader("org.apache.xerces.parsers.SAXParser");
	return new SAXSource(reader, source);
}

I still receive the error when run from the command line.

java -jar target/z-test.jar
java -version
java version "1.8.0_91"
Java(TM) SE Runtime Environment (build 1.8.0_91-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.91-b14, mixed mode)
ver
Microsoft Windows [Version 10.0.14393]

My POM:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>z.test</groupId>
  <artifactId>z-test</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>
  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <maven.compiler.source>1.8</maven.compiler.source>
    <maven.compiler.target>1.8</maven.compiler.target>
  </properties>
  <dependencies>
    <dependency>
      <groupId>net.sf.saxon</groupId>
      <artifactId>Saxon-HE</artifactId>
      <version>9.7.0-15</version>
    </dependency>
    <dependency>
      <groupId>xerces</groupId>
      <artifactId>xerces</artifactId>
      <version>2.4.0</version>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-assembly-plugin</artifactId>
        <version>2.4.1</version>
        <configuration>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
          <finalName>z-test</finalName>
          <appendAssemblyId>false</appendAssemblyId>
          <archive>
            <manifest>
              <mainClass>z.test.UnsafeXslt</mainClass>
            </manifest>
          </archive>
        </configuration>
        <executions>
          <execution>
            <id>make-assembly</id>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
Actions #8

Updated by Michael Kay over 7 years ago

I'm personally very averse to putting things in static, my instinct is always to go for singleton instances. One reason is that things can easily go wrong when you have multiple classloaders around. But I don't think it's actually wrong to hold the Templates object in static.

I shall try and reproduce the problem on a Windows machine when back in the office.

Actions #9

Updated by Cory O over 7 years ago

I created the following enum singleton to instantiate my Templates object.

Note that not only am I running the newTransformer() method, but I also run an initial transform().

These occur before the enum fully instantiates and before any other thread can request a new transformer.

I am no longer seeing the error with this setup. Without that initial transform(), the error occurs.

I have to assume that somewhere in your code you are waiting until the transform() occurs before populating something within the parent Templates class.

package z.test;

import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.util.Properties;

import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public enum StylesheetEnum implements Templates {
	INSTANCE;
	
	private final Templates templates;
	
	private StylesheetEnum() {
		try {
			final TransformerFactory factory = TransformerFactory.newInstance();
			final Source source =  new StreamSource(new StringReader("<html xmlns=\"http://www.w3.org/1999/xhtml\"/>"));
			final Result result = new StreamResult(new ByteArrayOutputStream());
			final Templates templates = factory.newTemplates(new StreamSource(StylesheetEnum.class.getResourceAsStream("/unsafe.main.xslt")));
			templates.newTransformer().transform(source, result);
			this.templates = templates;
		} catch (TransformerException e) {
			throw new RuntimeException(e);
		}
	}
	
	@Override
	public Properties getOutputProperties() {
		return templates.getOutputProperties();
	}
	
	@Override
	public Transformer newTransformer() throws TransformerConfigurationException {
		return templates.newTransformer();
	}
}
Actions #10

Updated by Michael Kay over 7 years ago

In studying this further, I discovered an inefficiency, but I don't think it's directly related. It seems that during initialization of the stylesheet, we register a system-defined key definition to implement the idref() function, and this key definition calls tokenize() to split a string on whitespace boundaries, and the existence of this expression initializes the regex engine, causing the file categories.xml to be read and parsed, even though no regular expression is ever evaluated.

I noticed this because parsing the categories.xml file initializes entries in a separate NamePool.

Actions #11

Updated by Michael Kay over 7 years ago

Since I still can't reproduce this, could I ask you please for some diagnostics?

(a) please change the stylesheet so it looks like this:

    <xsl:template match="*">
        <xsl:message terminate="yes">
            <xsl:copy/>
        </xsl:message>
    </xsl:template>
    <xsl:template match="html:html">
        <xsl:message>
            <xsl:copy/>
        </xsl:message>
        <root>
            <xsl:apply-templates />
        </root>
    </xsl:template>
    <xsl:template match="html:body">
        <record>
            <xsl:apply-templates />
        </record>
    </xsl:template>

(b) Please add a custom message listener like this:

                final Transformer transformer;
                synchronized(templates) {
                    transformer = templates.newTransformer();
                }
                ((TransformerImpl)transformer).getUnderlyingController().setMessageEmitter(new XMLEmitter() {
                    @Override
                    public void open() throws XPathException {
                        setOutputStream(System.err);
                        super.open();
                    }

                    @Override
                    public void startElement(NodeName elemName, SchemaType typeCode, Location location, int properties) throws XPathException {
                        System.err.println("Message elem: " + elemName.getURI() + "#" + elemName.getLocalPart() + " fp=" + elemName.getFingerprint());
                        super.startElement(elemName, typeCode, location, properties);
                    }
                });

The idea here is to output the namepool fingerprints of the <html> node in both the success case and the failure case, and thus to confirm or repudiate my conjecture that the reason for the template rule not matching is that the namepool fingerprint is wrong. It would also reveal if the element uri/local-name is not one that we expect to find in the source document.

Actions #12

Updated by Cory O over 7 years ago

All output was identical. Just a lot of the following two messages.

Message elem: http://www.w3.org/1999/xhtml#html fp=1026
<html xmlns="http://www.w3.org/1999/xhtml"/>
Actions #13

Updated by Michael Kay over 7 years ago

So it's not hitting the terminate="yes" branch in this case?

Actions #14

Updated by Cory O over 7 years ago

It was, yes. However, that xsl:copy/ produced the same output and same "message elem".

Actions #15

Updated by Michael Kay over 7 years ago

When it fails, does it always fail on the first transformation with a given Templates object, or do you sometimes get N successful transformations followed by an unsuccessful one?

Could you try adding this additional method to the MessageEmitter:

                 public void startDocument(int properties) throws XPathException {
                        boolean terminate = (properties & ReceiverOptions.TERMINATE) != 0;
                        if (terminate) {
                            System.err.println("** Terminating. Template rules:");
                            ((TransformerImpl) transformer).getUnderlyingController()
                                    .getRuleManager().getUnnamedMode().processRules(new Mode.RuleAction() {
                                @Override
                                public void processRule(Rule r) throws XPathException {
                                    System.err.println(
                                            "Pattern:" + r.getPattern()
                                                    + " fp:" + r.getPattern().getFingerprint()
                                                    + " prio: " + r.getPriority());
                                }
                            });
                        }
                        super.startDocument(properties);
                    }

The aim here is, when the termination occurs, to print some details of the rules in the stylesheet. I'm expecting output like this:

** Terminating. Template rules:
Pattern:element() fp:-1 prio: -0.5
Pattern:element(Q{http://www.w3.org/1999/xhtml}html) fp:1026 prio: 0.0
Pattern:element(Q{http://www.w3.org/1999/xhtml}body) fp:1027 prio: 0.0

showing the three template rules for match="", match="html", and match="body". Your evidence so far suggests that the fingerprints in the source document are correct; so I now want to explore whether the fingerprints in the stylesheet are correct. If both are correct, then it's very odd that the match="" rule should be firing...

Actions #16

Updated by Cory O over 7 years ago

Yes, it only appears to occur on the first transform, if it appears at all.

Is there any way to simply print a message every time a fingerprint is generated? When a thread first notices the fingerprint is missing, does it generate a new one while waiting for its turn to update the main pool? Does it by chance forward that new version instead of first checking if the fingerprint was already added while it was waiting?

It's odd that you aren't able to reproduce it. Is anyone else monitoring this bug that can try? It may take several runs before it occurs. I have reproduced it on Windows 7, Windows 10, and deployed to a Tomcat server on RHEL.

Note I added an asterisk after the xsl:copy/ for the terminate message.

** Terminating. Template rules:
** Terminating. Template rules:
Pattern:element() fp:-1 prio: -0.5
Pattern:element(Q{http://www.w3.org/1999/xhtml}html) fp:1026 prio: 0.0
Pattern:element(Q{http://www.w3.org/1999/xhtml}body) fp:1027 prio: 0.0
Pattern:element() fp:-1 prio: -0.5
Pattern:element(Q{http://www.w3.org/1999/xhtml}html) fp:1026 prio: 0.0
Pattern:element(Q{http://www.w3.org/1999/xhtml}body) fp:1027 prio: 0.0
Message elem: http://www.w3.org/1999/xhtml#html fp=1026
Message elem: http://www.w3.org/1999/xhtml#html fp=1026
Message elem: http://www.w3.org/1999/xhtml#html fp=1026
Message elem: http://www.w3.org/1999/xhtml#html fp=1026
Message elem: http://www.w3.org/1999/xhtml#html fp=1026
Message elem: http://www.w3.org/1999/xhtml#html fp=1026
Message elem: http://www.w3.org/1999/xhtml#html fp=1026
Message elem: http://www.w3.org/1999/xhtml#html fp=1026
Message elem: http://www.w3.org/1999/xhtml#html fp=1026
Message elem: http://www.w3.org/1999/xhtml#html fp=1026
<html xmlns="http://www.w3.org/1999/xhtml"/>*
<html xmlns="http://www.w3.org/1999/xhtml"/>
<html xmlns="http://www.w3.org/1999/xhtml"/>
<html xmlns="http://www.w3.org/1999/xhtml"/>
<html xmlns="http://www.w3.org/1999/xhtml"/>*
<html xmlns="http://www.w3.org/1999/xhtml"/>
<html xmlns="http://www.w3.org/1999/xhtml"/>
<html xmlns="http://www.w3.org/1999/xhtml"/>
<html xmlns="http://www.w3.org/1999/xhtml"/>
<html xmlns="http://www.w3.org/1999/xhtml"/>
Count: 0
Count: 0
; SystemID: ccm:xslt; Line#: 5; Column#: 32
net.sf.saxon.expr.instruct.TerminationException: Processing terminated by xsl:message at line 5 in ccm:xslt
	at net.sf.saxon.expr.instruct.Message.processLeavingTail(Message.java:223)
	at net.sf.saxon.expr.instruct.TemplateRule.applyLeavingTail(TemplateRule.java:353)
	at net.sf.saxon.trans.Mode.applyTemplates(Mode.java:456)
	at net.sf.saxon.trans.TextOnlyCopyRuleSet.process(TextOnlyCopyRuleSet.java:65)
	at net.sf.saxon.trans.Mode.applyTemplates(Mode.java:433)
	at net.sf.saxon.Controller.transformDocument(Controller.java:2321)
	at net.sf.saxon.Controller.transform(Controller.java:1892)
	at net.sf.saxon.s9api.XsltTransformer.transform(XsltTransformer.java:579)
	at net.sf.saxon.jaxp.TransformerImpl.transform(TransformerImpl.java:185)
	at z.test.UnsafeXslt.run(UnsafeXslt.java:110)
	at java.lang.Thread.run(Thread.java:745)
; SystemID: ccm:xslt; Line#: 5; Column#: 32
net.sf.saxon.expr.instruct.TerminationException: Processing terminated by xsl:message at line 5 in ccm:xslt
	at net.sf.saxon.expr.instruct.Message.processLeavingTail(Message.java:223)
	at net.sf.saxon.expr.instruct.TemplateRule.applyLeavingTail(TemplateRule.java:353)
	at net.sf.saxon.trans.Mode.applyTemplates(Mode.java:456)
	at net.sf.saxon.trans.TextOnlyCopyRuleSet.process(TextOnlyCopyRuleSet.java:65)
	at net.sf.saxon.trans.Mode.applyTemplates(Mode.java:433)
	at net.sf.saxon.Controller.transformDocument(Controller.java:2321)
	at net.sf.saxon.Controller.transform(Controller.java:1892)
	at net.sf.saxon.s9api.XsltTransformer.transform(XsltTransformer.java:579)
	at net.sf.saxon.jaxp.TransformerImpl.transform(TransformerImpl.java:185)
	at z.test.UnsafeXslt.run(UnsafeXslt.java:110)
	at java.lang.Thread.run(Thread.java:745)
Actions #17

Updated by Michael Kay over 7 years ago

OK, so it looks from that as if the fingerprints are correct in both the stylesheet and the source document, and yet it's matching the wrong rule. So that appears to demolish my first hypothesis. Have to get the thinking cap on to come up with another.

Yes, fingerprints are allocated lazily, but of course the logic is carefully designed to be thread-safe. Which doesn't necessarily mean that it is: concurrency bugs can lurk for years. And of course we can't rule out a bug in the JVM implementation of ConcurrentHashMap (at some stage we may have to ensure that we are using exactly the same JVM version) - but that's something to keep in reserve: and the evidence so far is that the NamePool is behaving correctly.

Actions #18

Updated by Michael Kay over 7 years ago

I've just remembered something that might be relevant. There is logic in the pattern matching that's conditional on whether the node is a FingerprintedNode or not. For example SimpleMode#459

                    if (node instanceof FingerprintedNode) {
                        namedNodeChain = namedElementRuleChains.get(((FingerprintedNode) node).getFingerprint());
                    } else {
                        namedNodeChain = getNamedRuleChain(context, Type.ELEMENT, node.getURI(), node.getLocalPart());
                    }

and then I remembered that in 9.7, under JAXP, if space stripping is in force, then the node in question is actually a SpaceStrippedNode, which does not implement the FingerprintedNode interface (even though its fingerprint is actually known).

So we're going to search for the rule chain for named elements using the getNamedRuleChain() path, matching on URI and local name rather than on fingerprint. And here's the crunch: the index to access rule chains by URI/local-name is built lazily, the first time it is needed. Which means there is an operation that needs synchronization.

Is the code correct? Probably not. It's doing the classic

if (qNamedElementRuleChains == null) {
            synchronized (this) {
                qNamedElementRuleChains = new HashMap<StructuredQName, Rule>(namedElementRuleChains.size());
                ... populate the hashmap

which means that transform A can set the index to an empty hashmap, and while it's populating it, transform B comes along and finds the value non-null but empty, decides that there is no rule chain for elements with this name, and therefore uses the generic rule chain, leading to a match on the match="*" template.

This is a sufficiently close fit to the observations that I think we can be confident we've found the problem. Shame I haven't been able to reproduce it (I haven't tried on Windows yet - it's going to be an accident of timing, and the thread scheduling policy on Windows is different, so it's not surprising when multithreading problems manifest themselves on one platform and not another.)

Now to work on a solution. It's worth pointing out that there are various line of attack:

(a) we should be stripping by filtering parsing events rather than by building wrapped nodes. I was under the impression we had already fixed that one, but perhaps we only fixed it on the 9.8 branch, or only for s9api applications.

(b) the test for "x instanceof FingerprintedNode" leads to inefficiency. We've fixed that on the 9.8 branch, but we could consider back-porting the fix to 9.7

(c) although either of those changes will workaround the multithreading bug, we need to fix the multithreading bug anyway, because it will hit applications using external tree models such as DOM and JDOM.

Actions #19

Updated by Cory O over 7 years ago

Just needs an extra if-statement inside the block to see if the variable is still null, or did another thread populate it while waiting.

if (qNamedElementRuleChains == null) {
	synchronized (this) {
		if (qNamedElementRuleChains == null) {
			qNamedElementRuleChains = new HashMap<StructuredQName, Rule>(namedElementRuleChains.size());
			... populate the hashmap
Actions #20

Updated by Cory O over 7 years ago

Oh, and as you said. Populate a local variable until you're done. Then set qNamedElementRuleChains equal to it.

if (qNamedElementRuleChains == null) {
    synchronized (this) {
        if (qNamedElementRuleChains == null) {
            final HashMap<StructuredQName, Rule> map = new HashMap<StructuredQName, Rule>(namedElementRuleChains.size());
            ... populate the hashmap
            qNamedElementRuleChains = map;
        }
    }
}
Actions #21

Updated by Michael Kay over 7 years ago

  • Status changed from New to Resolved
  • Assignee set to Michael Kay
  • Priority changed from Low to Normal
  • Applies to branch 9.7, 9.8 added
  • Fix Committed on Branch 9.7, 9.8 added

I have fixed the synchronization problem simply by putting the "if (x=null)" test inside the synchronized block. The general advice nowadays seems to be that the overhead of non-contending synchronization is very low, so I think this is the simplest solution.

The other problems identified in the course of the investigation will be managed as separate issues.

Actions #22

Updated by O'Neil Delpratt over 7 years ago

  • Status changed from Resolved to Closed
  • % Done changed from 0 to 100
  • Fixed in Maintenance Release 9.7.0.18 added

Bug fix applied in the Saxon 9.7.0.18 maintenance release.

Actions #23

Updated by O'Neil Delpratt over 7 years ago

  • Fix Committed on Branch trunk added
  • Fix Committed on Branch deleted (9.8)
Actions #24

Updated by O'Neil Delpratt over 7 years ago

  • Applies to branch deleted (9.8)

Please register to edit this issue

Also available in: Atom PDF