Bug #5449

closed

XSLT: garbage appended to output

Added by Arno Rosenboom about 2 years ago. Updated almost 2 years ago.

Status: Closed
Priority: Low
Category: PHP API
Start date: 2022-04-05
Due date:
% Done: 100%
Estimated time:
Found in version: 11.3
Fixed in version: 11.4
Platforms:

Description

hi! i'm using SaxonHE 11.3 (libsaxon-HEC-setup64-v11.3.zip) under Ubuntu 20.04.4 LTS and PHP 7.4.3. installation went smoothly, no problems at all. but when i do a relatively simple transformation in php, some strange garbage is appended to the html output. the $result variable contains the correct output string followed by some random characters, which vary in length and content on every transformation/call:

</html>7:36:01+0200
or
</html>�U
or
</html>esizeDuratio�

what can cause this behaviour? am i making some common mistake? thanks in advance, arni

these are the source files (the PHP script, then awxnotify.xml, then awxnotify.xsl):

<?php
$xmlfile = "awxnotify.xml";
$xslfile = "awxnotify.xsl";
$saxon = new Saxon\SaxonProcessor();
$proc = $saxon->newXslt30Processor();
$exe = $proc->compileFromFile($xslfile);
$result = $exe->transformFileToString($xmlfile);
echo $result;
$exe->clearParameters();
$exe->clearProperties();
?>
<?xml version="1.0"?>
<events>
  <event>
    <id>{99996EEE-4E85-003F-B644-7E7D3757A91A}</id>
    <time>2022-04-03T00:24:12+0100</time>
    <subject>K&#xFC;chenfenster ge&#xF6;ffnet</subject>
    <src/>
    <desc/>
    <url>http://vm01.worxnet.localnet/cameras/kueche/030422-002411.jpg</url>
  </event>
  <event>
    <id>{EAF9112D-7E8B-909B-0579-E297C744490B}</id>
    <time>2022-04-03T00:27:25+0100</time>
    <subject>K&#xFC;chenfenster ge&#xF6;ffnet</subject>
    <src/>
    <desc/>
    <url>http://vm01.worxnet.localnet/cameras/kueche/030422-002724.jpg</url>
  </event>
  <event>
    <id>{49C54A20-50B3-C5F0-206E-3C96150132C8}</id>
    <time>2022-04-03T00:29:33+0100</time>
    <subject>Briefkasten ge&#xF6;ffnet</subject>
    <src/>
    <desc/>
    <url>http://vm01.worxnet.localnet/cameras/haustuer/030422-002932.jpg</url>
  </event>
</events>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
<xsl:output method="xhtml" encoding="utf-8" doctype-public="-//W3C//DTD HTML 4.01 Transitional//EN" indent="yes" />
<xsl:template match="/">
<html>
<head>
<title>awxNotify</title>
</head>
<body>

<table id="main">
<thead>
	<tr class="head">
		<th>Time</th>
		<th>Image</th>
		<th>Subject</th>
		<th>Description</th>
	</tr>
</thead>
<xsl:element name="tbody">
	<xsl:for-each select="events/event">
		<xsl:sort select="time" order="descending" />
		<xsl:element name="tr">
			<xsl:attribute name="class">
				<xsl:text>entry</xsl:text>
			</xsl:attribute>
			<xsl:apply-templates select="time" />
			<xsl:apply-templates select="url" />
			<xsl:apply-templates select="subject" />
			<xsl:apply-templates select="desc" />
		</xsl:element>
	</xsl:for-each>
</xsl:element>
</table>

</body>
</html>
</xsl:template>

<xsl:template match="time">
	<xsl:element name="td">
		<xsl:attribute name="class">
			<xsl:text>time</xsl:text>
		</xsl:attribute>
		<xsl:value-of select="." />
	</xsl:element>
</xsl:template>

<xsl:template match="subject">
	<xsl:element name="td">
		<xsl:attribute name="class">
			<xsl:text>subject</xsl:text>
		</xsl:attribute>
		<xsl:value-of select="." />
	</xsl:element>
</xsl:template>

<xsl:template match="desc">
	<xsl:element name="td">
		<xsl:attribute name="class">
			<xsl:text>desc</xsl:text>
		</xsl:attribute>
		<xsl:value-of select="." />
	</xsl:element>
</xsl:template>	

<xsl:template match="url">
	<xsl:element name="td">
		<xsl:attribute name="class">
			<xsl:text>url</xsl:text>
		</xsl:attribute>
		<xsl:if test=". != ''">
			<xsl:element name="a">
				<xsl:attribute name="href">
					<xsl:value-of select="." />
				</xsl:attribute>
				<xsl:text>Image</xsl:text>
			</xsl:element>
		</xsl:if>
	</xsl:element>
</xsl:template>	
</xsl:stylesheet>
Actions #1

Updated by O'Neil Delpratt about 2 years ago

Thanks for reporting this issue. It looks like some encoding problem. I will investigate it further and report back.

Actions #2

Updated by Martin Honnen about 2 years ago

Consider using <xsl:output method="html"> with the HTML 4.01 DTD, given that your result elements are in no namespace. If you do so, the XSLT processor will, on serialization, add e.g.

<!DOCTYPE html
  PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

at the start of the result document and insert e.g. <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> in the head element. That should make it easier for the browser to decode and render the content properly.

Actions #3

Updated by Martin Honnen about 2 years ago

If you really want to produce XHTML and serve it as text/html then use e.g. <xsl:output method="xhtml" encoding="utf-8" doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN" doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" indent="yes" /> and make sure the XSLT declares the XHTML namespace as the default namespace <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="http://www.w3.org/1999/xhtml">.

Actions #4

Updated by Arno Rosenboom about 2 years ago

ok, thanks. tried that, but doesn't seem to make any difference.

<!DOCTYPE html
  PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <title>awxNotify</title>
   </head>
   <body>
      <table id="main">
         <thead>
            <tr class="head">
               <th>Time</th>
               <th>Image</th>
               <th>Subject</th>
               <th>Description</th>
            </tr>
         </thead>
         <tbody>
            <tr class="entry">
               <td class="time">2022-04-03T00:29:33+0100</td>
               <td class="url"><a href="http://vm01.worxnet.localnet/cameras/haustuer/030422-002932.jpg">Image</a></td>
               <td class="subject">Briefkasten geöffnet</td>
               <td class="desc"></td>
            </tr>
            <tr class="entry">
               <td class="time">2022-04-03T00:27:25+0100</td>
               <td class="url"><a href="http://vm01.worxnet.localnet/cameras/kueche/030422-002724.jpg">Image</a></td>
               <td class="subject">Küchenfenster geöffnet</td>
               <td class="desc"></td>
            </tr>
            <tr class="entry">
               <td class="time">2022-04-03T00:24:12+0100</td>
               <td class="url"><a href="http://vm01.worxnet.localnet/cameras/kueche/030422-002411.jpg">Image</a></td>
               <td class="subject">Küchenfenster geöffnet</td>
               <td class="desc"></td>
            </tr>
         </tbody>
      </table>
   </body>
</html>�ʞU
Actions #5

Updated by Arno Rosenboom about 2 years ago

Martin Honnen wrote in #note-3:

If you really want to produce XHTML and serve it as text/html then use e.g. <xsl:output method="xhtml" encoding="utf-8" doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN" doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" indent="yes" /> and make sure the XSLT declares the XHTML namespace as the default namespace <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="http://www.w3.org/1999/xhtml">.

thanks, martin, but it must be something else. still garbage at the end of the resulting output

<?xml version="1.0" encoding="utf-8"?><!DOCTYPE html
  PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
      <title>awxNotify</title>
   </head>
   <body>
      <table id="main">
         <thead>
            <tr class="head">
               <th>Time</th>
               <th>Image</th>
               <th>Subject</th>
               <th>Description</th>
            </tr>
         </thead>
         <tbody>
            <tr class="entry">
               <td class="time">2022-04-03T00:29:33+0100</td>
               <td class="url"><a href="http://vm01.worxnet.localnet/cameras/haustuer/030422-002932.jpg">Image</a></td>
               <td class="subject">Briefkasten geöffnet</td>
               <td class="desc"></td>
            </tr>
            <tr class="entry">
               <td class="time">2022-04-03T00:27:25+0100</td>
               <td class="url"><a href="http://vm01.worxnet.localnet/cameras/kueche/030422-002724.jpg">Image</a></td>
               <td class="subject">Küchenfenster geöffnet</td>
               <td class="desc"></td>
            </tr>
            <tr class="entry">
               <td class="time">2022-04-03T00:24:12+0100</td>
               <td class="url"><a href="http://vm01.worxnet.localnet/cameras/kueche/030422-002411.jpg">Image</a></td>
               <td class="subject">Küchenfenster geöffnet</td>
               <td class="desc"></td>
            </tr>
         </tbody>
      </table>
   </body>
</html>�PK
Actions #6

Updated by O'Neil Delpratt about 2 years ago

  • Category set to PHP API
  • Status changed from New to In Progress
  • Assignee set to O'Neil Delpratt
  • Found in version set to 11.3

I am still investigating this issue. As a quick test I tried your repro with SaxonC-EE 11.3, but was not able to reproduce the error.

See output below:


<?xml version="1.0" encoding="utf-8"?><html>
   <head>
      <title>awxNotify</title>
   </head>
   <body>
      <table id="main">
         <thead>
            <tr class="head">
               <th>Time</th>
               <th>Image</th>
               <th>Subject</th>
               <th>Description</th>
            </tr>
         </thead>
         <tbody>
            <tr class="entry">
               <td class="time">2022-04-03T00:29:33+0100</td>
               <td class="url"><a href="http://vm01.worxnet.localnet/cameras/haustuer/030422-002932.jpg">Image</a></td>
               <td class="subject">Briefkasten geöffnet</td>
               <td class="desc"></td>
            </tr>
            <tr class="entry">
               <td class="time">2022-04-03T00:27:25+0100</td>
               <td class="url"><a href="http://vm01.worxnet.localnet/cameras/kueche/030422-002724.jpg">Image</a></td>
               <td class="subject">Küchenfenster geöffnet</td>
               <td class="desc"></td>
            </tr>
            <tr class="entry">
               <td class="time">2022-04-03T00:24:12+0100</td>
               <td class="url"><a href="http://vm01.worxnet.localnet/cameras/kueche/030422-002411.jpg">Image</a></td>
               <td class="subject">Küchenfenster geöffnet</td>
               <td class="desc"></td>
            </tr>
         </tbody>
      </table>
   </body>
</html>

I am also using PHP 7.4.3. I will try SaxonC-HE, but I wonder if there is some locale issue going on here.

Actions #7

Updated by Arno Rosenboom about 2 years ago

did you reload 3-4 times? sometimes the garbage is not there

Actions #8

Updated by O'Neil Delpratt about 2 years ago

Yes, I have now reproduced the strange garbage text after the closing html tag. Thanks.

I will investigate it further.

Actions #9

Updated by O'Neil Delpratt about 2 years ago

The problem seems to be in the C++ code of Xslt30Processor::transformFileToString. We are failing to add the null terminating character to the end of the string that is returned, so the result is read past its real end. That is why garbage from adjacent memory shows up after the output.

A workaround is to use the applyTemplatesReturningString method which has the correct code.
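
To illustrate the kind of missing terminator described above, here is a minimal C++ sketch. It is not the SaxonC source: the function names, the fixed buffer size, and the '?' filler bytes are all assumptions made for the demonstration. The buffer is pre-filled with stale bytes and given a final '\0' so that the example stays well-defined, which the real bug presumably did not guarantee.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical illustration only; this is NOT the SaxonC source.
constexpr std::size_t kBufSize = 32;

// Fills buf with stale bytes, standing in for whatever was in memory before.
// The last byte is '\0' so that reading the buffer as a C string is safe here.
static void fill_with_stale_bytes(char (&buf)[kBufSize]) {
    std::memset(buf, '?', kBufSize - 1);
    buf[kBufSize - 1] = '\0';
}

// Buggy variant: copies len bytes but never writes a terminator after them.
// Assumes len < kBufSize.
std::string copy_without_terminator(const char* src, std::size_t len) {
    char buf[kBufSize];
    fill_with_stale_bytes(buf);
    std::memcpy(buf, src, len);  // bytes from buf[len] onwards are still stale
    return std::string(buf);     // reads on until the first '\0' it finds
}

// Fixed variant: writes the terminator directly after the copied bytes.
// Assumes len < kBufSize.
std::string copy_with_terminator(const char* src, std::size_t len) {
    char buf[kBufSize];
    fill_with_stale_bytes(buf);
    std::memcpy(buf, src, len);
    buf[len] = '\0';             // the step that was missing
    return std::string(buf);
}
```

Called with "</html>" and a length of 7, the first function returns the tag followed by the stale filler bytes, while the second returns only the tag. With real leftover memory instead of a fixed filler, the trailing bytes would differ from call to call, which matches the varying garbage reported here.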

Actions #10

Updated by O'Neil Delpratt about 2 years ago

Please see the workaround example below, which uses applyTemplatesReturningString:

<?php
$xmlfile = "awxnotify.xml";
$xslfile = "awxnotify.xsl";
$saxon = new Saxon\SaxonProcessor();
$proc = $saxon->newXslt30Processor();
$exe = $proc->compileFromFile($xslfile);
$exe->setInitialMatchSelectionAsFile($xmlfile);
$exe->setGlobalContextFromFile($xmlfile);
$result = $exe->applyTemplatesReturningString();
//$result = $exe->transformFileToString($xmlfile);
echo $result;
$exe->clearParameters();
$exe->clearProperties();
?>
Actions #11

Updated by O'Neil Delpratt about 2 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100

Bug fixed in the transformFileToString method of the Xslt30Executable class. Available in the next maintenance release.

Actions #12

Updated by Arno Rosenboom about 2 years ago

thanks everyone! fantastic cooperation. like it should be

Actions #13

Updated by O'Neil Delpratt almost 2 years ago

  • Status changed from Resolved to Closed
  • Fixed in version set to 11.4

Bug fix applied in the SaxonC 11.4 maintenance release.
