Support #6055
PHP SaxonC - How to work with a schematron file?
Status: Closed
Description
Hello,
I have a problem implementing the SaxonC PHP extension. I managed to install the extension and it is loaded. As the validatorExamples.php is working I assume the class works as expected.
I am new to XSLT and Schematron, and after doing some research I believe I understand that SaxonC cannot work with Schematron files directly and that the file needs to be converted to an XSD or XSL file. Could you please advise me how to proceed with my Schematron file? To what format should I convert it, and how do I do that to make it work? Or maybe there is a different approach?
Here is the definition of my file:
I would be very happy about a guiding hand.
Thank you and best regards Steve
Environment:
PHP: 7.3
Debian 4.19.208-1 (2021-09-29) x86_64
Saxon 11.5
Results from validatorExamples.php:
PHP Schema Validation in SaxonC examples
Saxon Processor version: SaxonC-PE 11.5 from Saxonica
exampleSimple1: Doc is valid
exampleSimple2: Doc family.xml is valid!
exampleSimple3: Error: Doc is not valid!
Updated by Matt Patterson over 1 year ago
The open-source Schematron implementation SchXSLT (https://github.com/schxslt/schxslt) can be used to generate XSLT files from your Schematron files for doing validation using Saxon.
Using the XSLT-only distribution of SchXSLT, you can use the 2.0/pipeline.xsl XSLT with Saxon to compile your .sch file to an XSLT that you can then use Saxon to run against your XML files. If you haven’t already seen them, there are a couple of very simple sample PHP scripts at https://www.saxonica.com/saxon-c/documentation12/index.html#!samples/samples_php that show how to load an XSLT file and transform an XML file with it.
You should only need to compile the .sch file to an XSLT once, so you could do that using the command line (see https://www.saxonica.com/saxon-c/documentation12/index.html#!starting/running) to create the final XSLT file you’ll use to validate your documents.
This is a very quick overview, but that should hopefully be enough to get you started.
All the best,
Matt
Updated by Matt Patterson over 1 year ago
You can find the latest (1.9.5) SchXSLT release downloads here:
https://github.com/schxslt/schxslt/releases/tag/v1.9.5
All the best,
Matt
Updated by Steve Goffinet over 1 year ago
Hello,
thank you for your reply!
I wanted to try the "compile once" method. I got a compiler error:
"C:\Program Files\Saxonica\SaxonPEC 11.5\command>buildpec-command.bat"
C:\Program Files\Saxonica\SaxonPEC 11.5\command>cl /EHsc "-I..\Saxon.C.API\jni" "-I..\Saxon.C.API\jni\win32" /DPEC Transform.c
Microsoft (R) C/C++ Optimizing Compiler Version 18.00.21005.1 for x86
Copyright (C) Microsoft Corporation. All rights reserved.
Transform.c
Microsoft (R) Incremental Linker Version 12.00.21005.1
Copyright (C) Microsoft Corporation. All rights reserved.
/out:Transform.exe
Transform.obj
Transform.obj : error LNK2019: unresolved external symbol _snprintf referenced in function _setDllname
Transform.exe : fatal error LNK1120: 1 unresolved externals
Similar to this thread: https://saxonica.plan.io/boards/4/topics/8451
I tried some includes, but I don't really know how to modify the batch file correctly.
Thank you
Updated by Martin Honnen over 1 year ago
You are definitely using the wrong compiler (the x86 one); try the x64 one. For instance, among the various command-line tools that VS (Visual Studio) installs, you need the "x64 Native Tools command prompt".
Updated by Martin Honnen over 1 year ago
Additionally, "Microsoft (R) C/C++ Optimizing Compiler Version 18.00.21005.1" sounds rather old (VS 2013?); if you can, try to install the VS 2019 or 2022 C/C++ compiler tools.
Updated by Steve Goffinet over 1 year ago
- File jet_err_22775.txt added
Yes, I realized that after your message. I used the VS 2022 64-bit tools and it worked.
We are almost there - I got my XSLT file and uploaded it to the server.
First test with a correct XML: Doc is valid. Cool.
2nd test with some incorrect values: Doc is valid.
3rd test with randomly destroyed XML structure: Server instance crashes.
After the crash I have a jet_err file with a memory dump, and the Apache error log contains some errors. The Apache error is correct: that XML element is not closed. Shouldn't this be returned to the program as feedback instead of crashing? I assume so?
- Attached is the jet_err file.
- Apache error log:
root@server:/home/me# cat /var/log/apache2/error.log
Error: Failed to create the internal object for the SchemaValidator
Error on line 211 column 9 of file:///var/www/***/www/:
SXXP0003 Error reported by XML parser: The element type "cac:ClassifiedTaxCategory" must
be terminated by the matching end-tag "</cac:ClassifiedTaxCategory>".: The element type
"cac:ClassifiedTaxCategory" must be terminated by the matching end-tag
"</cac:ClassifiedTaxCategory>".
JET RUNTIME HAS DETECTED UNRECOVERABLE ERROR: system exception at 0x00007f95ab34aaab
Please, contact the vendor of the application.
Core dump will be written to "/var/www/***/www/core"
Extra information about error is saved in the "jet_err_22775.txt" file.
[Thu Jun 01 16:31:58.318518 2023] [core:notice] [pid 21532] AH00052: child pid 22775 exit signal Aborted (6)
- my PHP Code
$xsltFile = './includes/EN16931-UBL.xslt';
try {
    $proc = new Saxon\SaxonProcessor(true);
    $validator = $proc->newSchemaValidator();
    $version = $proc->version();
    echo '<b>PHP Schema Validation in SaxonC examples</b><br/>';
    echo 'Saxon Processor version: '.$version;
    echo '<br/>';
    $validator->registerSchemaFromFile($xsltFile);
    $xml = $proc->parseXmlFromString($Message);
    $validator->setSourceNode($xml);
    $validator->setProperty('report-node', 'true');
    $validator->validate();
    $node = $validator->getValidationReport();
    if($validator->exceptionOccurred()) {
        echo "Doc is not valid!";
        $errCount = $validator->getExceptionCount();
        if($errCount > 0 ){
            for($i = 0; $i < $errCount; $i++) {
                $errCode = $validator->getErrorCode(intval($i));
                $errMessage = $validator->getErrorMessage(intval($i));
                echo 'Expected error: Code='.$errCode.' Message='.$errMessage;
            }
            $validator->exceptionClear();
        }
    } else {
        echo "Doc is valid";
    }
    echo 'Validation Report:'.$node->getStringValue().'<br/>';
    $validator->clearParameters();
    $validator->clearProperties();
}
catch (Exception $e) {
    debug($e);
}
unset($validator);
unset($proc);
What am I doing wrong?
Thank you
Updated by Martin Honnen over 1 year ago
Well, first of all, if you have an XSLT stylesheet (even if it is the representation of a Schematron schema), then to use it with Saxon in a meaningful way you don't use SchemaValidator or Saxon's Validation API at all; that is only useful for XSD 1.x based validation.
So the way to perform Schematron-based validation with an XSLT processor, where you have already precompiled the Schematron schema to XSLT, is simply to run an XSLT transformation with the XSLT version of the Schematron schema as the stylesheet and your XML sample as the input.
In the context of SaxonC 11 and PHP, I had better not try to spell that out in a code sample here without the help of an IDE, but use e.g. https://www.saxonica.com/saxon-c/documentation11/index.html#!api/saxon_c_php_api/saxon_c_php_xslt30processor and then compileFromFile to create e.g. an https://www.saxonica.com/saxon-c/documentation11/index.html#!api/saxon_c_php_api/saxon_c_php_xsltexecutable, and then use one of the transform... methods to perform the transformation.
As far as I am aware off the top of my head, that doesn't give you a valid/not valid result but rather a validation report that you would need to look into (and perhaps later process with XPath) to check for any failed Schematron asserts.
Before writing PHP code it might be easier (now that you have the command line working) to simply run Transform.exe -s:instanceXml.xml -xsl:EN16931-UBL.xslt to check the results and whether they represent what you expect.
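As an illustration of the report check described above, here is a minimal sketch in Python (standard library only) that reads an SVRL report and lists its failed asserts. It assumes the compiled stylesheet emits standard SVRL in the http://purl.oclc.org/dsdl/svrl namespace; the function name failed_asserts and the sample report are hypothetical and are not part of SaxonC or SchXSLT. The same idea carries over to PHP with any XPath-capable XML API.

```python
import xml.etree.ElementTree as ET

# Namespace used by SVRL (Schematron Validation Report Language) reports.
SVRL_URI = "http://purl.oclc.org/dsdl/svrl"
SVRL_NS = {"svrl": SVRL_URI}


def failed_asserts(svrl_xml):
    """Return one dict per svrl:failed-assert found in the report.

    Hypothetical helper for illustration: an empty list means the
    report contains no failed asserts. svrl:successful-report
    elements are deliberately not inspected here.
    """
    root = ET.fromstring(svrl_xml)
    failures = []
    for node in root.iter("{%s}failed-assert" % SVRL_URI):
        text = node.find("svrl:text", SVRL_NS)
        message = "".join(text.itertext()).strip() if text is not None else ""
        failures.append({
            "id": node.get("id"),
            "location": node.get("location"),
            "test": node.get("test"),
            "message": message,
        })
    return failures


if __name__ == "__main__":
    # Made-up report fragment, only to show the shape of the result.
    sample = (
        '<svrl:schematron-output xmlns:svrl="%s">'
        '<svrl:failed-assert id="R1" test="x" location="/a">'
        '<svrl:text> bad value </svrl:text>'
        '</svrl:failed-assert>'
        '</svrl:schematron-output>' % SVRL_URI
    )
    for failure in failed_asserts(sample):
        print(failure["id"], failure["location"], failure["message"])
```

A document would then be treated as valid only when the returned list is empty, which replaces the valid/not valid answer that the XSD validation API would have given.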
Updated by Martin Honnen over 1 year ago
Depending on your aims, the right file to compile the Schematron .sch to an XSLT (that can then be used to generate an SVRL validation report) might be pipeline-for-svrl.xsl and not pipeline.xsl.
Updated by O'Neil Delpratt over 1 year ago
Martin is correct with his comments.
On a side note, the crash reported in comment #6 is a bug. The XML parser error message should have been handled internally and accessed using the callback method (i.e. getErrorMessage()). In SaxonC 12 we redesigned exception handling by re-throwing errors as a PHP exception. Are you restricted to SaxonC 11?
Updated by Steve Goffinet over 1 year ago
Yes, because it's Debian and there is the GLIBC_2.32 restriction. I don't want to mess around with libc components.
In the worst case I could set up an Ubuntu server. But I am more familiar with Debian, and it would require rearranging the whole infrastructure to outsource the validation check. That would be overkill.
I am trying Martin Honnen's tips, starting with pipeline-for-svrl.xsl. It looks easier to me at first sight.
Updated by Steve Goffinet over 1 year ago
Here is what happens:
C:\Program Files\Saxonica\SaxonPEC 11.5\command>transform -s:PEPPOL-EN16931-UBL.sch -xsl:pipeline-for-svrl.xsl -o:EN16931-UBL.xslt
Saxon license expires in 21 days
C:\Program Files\Saxonica\SaxonPEC 11.5\command>Transform.exe -s:text.xml -xsl:EN16931-UBL.xslt
Saxon license expires in 21 days
Error near {... and (} at char 93 in xsl:if/@test on line 2378 column 500 of EN16931-UBL.xslt:
XPST0017 Cannot find a 1-argument function named Q{utils}TinVerification()
Error near {...D) = 'VAT']/cbc:CompanyID,3...} at char 211 in xsl:if/@test on line 2523 column 355 of EN16931-UBL.xslt:
XPST0017 Cannot find a 1-argument function named Q{utils}TinVerification()
Error near {...nVerification(substring(.,3...} at char 32 in xsl:if/@test on line 2564 column 98 of EN16931-UBL.xslt:
XPST0017 Cannot find a 1-argument function named Q{utils}TinVerification()
Error near {...933' and u:TinVerification(...} at char 27 in xsl:if/@test on line 2787 column 80 of EN16931-UBL.xslt:
XPST0017 Cannot find a 1-argument function named Q{utils}TinVerification()
Error near {...D) = 'VAT']/cbc:CompanyID,3...} at char 259 in xsl:if/@test on line 2828 column 403 of EN16931-UBL.xslt:
XPST0017 Cannot find a 1-argument function named Q{utils}TinVerification()
Error near {...933' and u:TinVerification(...} at char 27 in xsl:if/@test on line 2869 column 80 of EN16931-UBL.xslt:
XPST0017 Cannot find a 1-argument function named Q{utils}TinVerification()
Errors were reported during stylesheet compilation
Updated by Martin Honnen over 1 year ago
Can you share the PEPPOL-EN16931-UBL.sch and the text.xml (you can attach files here but they will be visible to the public)? Also, did you use Schxslt 1.9.5? Basically, any details that allow others to easily reproduce the problem help.
Updated by Steve Goffinet over 1 year ago
- File text.xml added
- File PEPPOL-EN16931-UBL.sch added
- File EN16931-UBL.xslt added
Attached:
- PEPPOL-EN16931-UBL.sch
- text.xml
- EN16931-UBL.xslt
Thank you
Updated by Martin Honnen over 1 year ago
It indeed seems as if functions like TinVerification don't end up in the produced XSLT file. I am currently not sure what the reason for that is; I would first need to check whether the Schematron spec gives any restrictions as to where/how functions can be included and whether Schxslt perhaps has a quirk or bug swallowing certain functions.
It is probably not a SaxonC or Saxon problem at all, but that is hard to tell given a rather complex Schematron schema and all the tools like Schxslt involved.
I will try to find out where things go wrong but it might take a while; not sure if any of the Saxonica guys will show up given that today is Markup UK in London, I think.
Updated by Martin Honnen over 1 year ago
For a test outside of SaxonC, I have used oXygen XML Editor 25 to validate the provided text.xml against the provided PEPPOL-EN16931-UBL.sch Schematron; it says the text.xml is valid. I think oXygen continues to use the ISO Schematron XSLT 2 reference implementation of Schematron instead of Schxslt.
So this might indeed be a problem related to Schxslt.
I will continue to check what goes wrong, but getting a handle on whether there is a bug or quirk in Schxslt might take some time and need some interaction with the Schxslt implementor.
Updated by Steve Goffinet over 1 year ago
Again, thank you for your help. I would be completely lost without it.
And I overlooked your question: yes, it was Schxslt version 1.9.5.
Updated by Martin Honnen over 1 year ago
If you want to proceed without waiting for some resolution of the potential Schxslt issue, you could try the (no longer maintained/updated) Skeleton ISO Schematron reference implementation instead. The archived last release is at https://github.com/Schematron/schematron/releases/tag/2020-10-01; in the zip iso-schematron-xslt2.zip you would use iso_svrl_for_xslt2.xsl to compile your Schematron .sch to XSLT, then run that XSLT to get an SVRL report of the validation.
I have tested that this compiles your provided Schematron schema and that the created XSLT runs fine with Saxon HEC 11.5, giving an SVRL report that shows no validation failure for your input sample.
Updated by Martin Honnen over 1 year ago
I have raised a potential issue on Schxslt: https://github.com/schxslt/schxslt/issues/315.
Updated by Martin Honnen over 1 year ago
I have looked through the Schematron spec for the XSLT 2 query language binding and it contains a decisive requirement:
The XSLT2 function element may be used, in the XSLT2 namespace, before the pattern elements.
So what I think happens here is that Schxslt implements that by only copying XSLT functions defined in the Schematron .sch schema before any Schematron pattern element; for some reason, however, your schema has, after some pattern, a section starting with
<!-- GREECE -->
<!-- General functions and variable for Greek Rules -->
<function xmlns="http://www.w3.org/1999/XSL/Transform" name="u:TinVerification" as="xs:boolean">
  <param name="val" as="xs:string"/>
  <variable name="digits" select="
    for $ch in string-to-codepoints($val)
    return codepoints-to-string($ch)"/>
  <variable name="checksum" select="
    (number($digits[8])*2) +
    (number($digits[7])*4) +
    (number($digits[6])*8) +
    (number($digits[5])*16) +
    (number($digits[4])*32) +
    (number($digits[3])*64) +
    (number($digits[2])*128) +
    (number($digits[1])*256) "/>
  <value-of select="($checksum mod 11) mod 10 = number($digits[9])"/>
</function>
and so, as far as I understand it, Schxslt (rightly) ignores that function and doesn't include it in the generated XSLT; as a result, you later get an error when the generated XSLT calls it.
So basically the Schematron schema would need to have all its function declarations that are to be used in XSLT placed before any pattern elements to work properly with Schxslt.
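If that reading is right, one possible workaround is to reorder the schema before compiling it. The following Python sketch (standard library only) moves every top-level XSLT function that appears after the first pattern to just before that pattern. It is an assumption-laden illustration, not something Schxslt provides: the helper name hoist_functions and the tiny sample schema are hypothetical, and it has not been run against the PEPPOL schema. minidom is used so that namespace declarations such as xmlns:u, which the function names depend on, are written back unchanged.

```python
from xml.dom import minidom

SCH_NS = "http://purl.oclc.org/dsdl/schematron"
XSL_NS = "http://www.w3.org/1999/XSL/Transform"


def hoist_functions(schema_xml):
    """Move top-level xsl:function elements ahead of the first sch:pattern.

    Hypothetical helper: only direct children of the schema root are
    considered, and the relative order of the moved functions is kept.
    """
    doc = minidom.parseString(schema_xml)
    root = doc.documentElement
    elements = [n for n in root.childNodes if n.nodeType == n.ELEMENT_NODE]

    first_pattern = None
    for element in elements:
        if element.namespaceURI == SCH_NS and element.localName == "pattern":
            first_pattern = element
            break
    if first_pattern is None:
        return doc.toxml()  # no patterns, nothing to reorder

    seen_pattern = False
    for element in elements:
        if element is first_pattern:
            seen_pattern = True
        elif (seen_pattern and element.namespaceURI == XSL_NS
              and element.localName == "function"):
            # insertBefore detaches the node from its old position first.
            root.insertBefore(element, first_pattern)
    return doc.toxml()


if __name__ == "__main__":
    # Made-up miniature schema with a function declared after a pattern.
    sample = (
        '<schema xmlns="%s" xmlns:u="utils">'
        '<pattern id="a"/>'
        '<function xmlns="%s" name="u:f"/>'
        '<pattern id="b"/>'
        '</schema>' % (SCH_NS, XSL_NS)
    )
    print(hoist_functions(sample))
```

The reordered schema would then be compiled with pipeline-for-svrl.xsl as before. Comments that introduce a function stay where they were, so the result is best treated as a build artifact rather than as a replacement for the original .sch file.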
Updated by O'Neil Delpratt over 1 year ago
Steve Goffinet wrote in #note-10:
Yes, because it's Debian and there is the GLIBC_2.32 restriction. I don't want to mess around with libc components.
In the worst case I could set up an Ubuntu server. But I am more familiar with Debian, and it would require rearranging the whole infrastructure to outsource the validation check. That would be overkill.
The GLIBC version limitation has been fixed so that SaxonC will work on older versions of GLIBC. This fix will be available in the next release. See the bug issue: #6078
Updated by O'Neil Delpratt over 1 year ago
- Status changed from New to AwaitingInfo
Hi,
I am closing this bug issue since I don't think there is a bug against SaxonC.
Updated by O'Neil Delpratt about 1 year ago
- Status changed from AwaitingInfo to Closed
Closing this bug as there has been no pushback since the fix for the GLIBC issue.