Saxon/C and Windows: several issues with C++ classes

Added by Thomas Elsässer 4 months ago

Hello, I am using the Saxon/C API 1.2.1 on Windows. First of all, a big compliment to all Saxon developers: it is a really nice tool and it runs stably!

But I noticed three minor issues (which I could handle myself) and one possibly more general issue where I don't know the expected behavior:

  1. For the source file "SaxonCGlue.c", I had to #undef UNICODE and _UNICODE, because otherwise the Unicode version of the Win32 function "LoadLibrary" is selected and gets called with the wrong argument type (char* instead of wchar_t*).

  2. The function "SaxonProcessor::version()" produced crash dumps because of "delete tempVersionStr". Since the string is obtained through JNI, "SaxonProcessor::sxn_environ->env->ReleaseStringUTFChars(jstr, tempVersionStr);" should be called instead.

  3. The source file "SaxonCProcessor.c" produces a syntax error when compiled with the preprocessor option "DEBUG" (which is very helpful!), because the constant "__true" is not defined. Replacing it with "true" works fine.

  4. This issue is more disturbing than 1) - 3), because the only workaround I found costs performance: when I use SchemaValidator to validate an XML file against an XSD schema (passed in with "registerSchemaFromString"), it works fine the first time; when I pass a different schema on a second call, the validator still uses the first XSD schema. Shouldn't it be possible to run another XSD validation with the same Saxon processor instance? The only way I found is to always create a new instance of "SaxonProcessor" (even a new "SchemaValidator" instance did not work). This is not nice, since it takes significantly longer than calling an existing instance.
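As an illustration of item 2, here is a minimal sketch of the release pattern it describes: a buffer obtained with GetStringUTFChars is handed back with ReleaseStringUTFChars rather than freed with delete. This is not the Saxon/C source. The helper name copyUtfChars and the FakeEnv stand-in are hypothetical, and the environment type is a template parameter only so that the sketch compiles without a JDK; with <jni.h> available, Env would be JNIEnv and JStr would be jstring.

```cpp
#include <memory>
#include <string>

// Hypothetical helper: copies a Java string into a std::string and returns
// the JNI buffer to the environment that handed it out.
template <class Env, class JStr>
std::string copyUtfChars(Env* env, JStr jstr) {
    const char* chars = env->GetStringUTFChars(jstr, nullptr);
    if (chars == nullptr) {
        return std::string();  // the JVM could not provide the buffer
    }
    // The deleter calls ReleaseStringUTFChars instead of delete, so the
    // buffer is released even if the std::string copy throws.
    auto release = [env, jstr](const char* p) {
        env->ReleaseStringUTFChars(jstr, p);
    };
    std::unique_ptr<const char, decltype(release)> guard(chars, release);
    return std::string(chars);
}

// Stand-in for JNIEnv (an assumption, for trying the sketch without a JVM).
// It only counts buffers that have been handed out and not yet released.
struct FakeEnv {
    int outstanding = 0;
    const char* GetStringUTFChars(const std::string* s, bool* /*isCopy*/) {
        ++outstanding;
        return s->c_str();
    }
    void ReleaseStringUTFChars(const std::string* /*s*/, const char* /*chars*/) {
        --outstanding;
    }
};
```

Returning a std::string copy means the caller never sees the JNI-owned pointer, so there is nothing left for it to delete by mistake.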

Please let me know what you think about these issues, and if you can fix them. If you need more information, please let me know.

Best regards, Thomas


Replies (4)

RE: Saxon/C and Windows: several issues with C++ classes - Added by O'Neil Delpratt 4 months ago

Dear Thomas, thank you very much for your feedback. We will investigate and address all the points you have raised.

Regarding 4: are the schemas you are loading conflicting? Schema loading in the configuration is cumulative, so if there are any conflicts the new schema is ignored. You would have to create a new SchemaValidator each time or put the schemas in different namespaces.
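The cumulative behavior described above can be pictured with a toy model. This is only a sketch, not Saxon code: the class SchemaComponentSet, its methods, and the use of plain strings for definitions are all assumptions made for illustration. Components are keyed by qualified name, the first definition wins, and a later conflicting definition is ignored.

```cpp
#include <map>
#include <string>

// Toy model (not Saxon internals): one shared set of schema components
// keyed by qualified name.
class SchemaComponentSet {
public:
    // Returns true if the definition was added or matches the existing one,
    // false if it conflicts with an earlier definition and was ignored.
    bool add(const std::string& qualifiedName, const std::string& definition) {
        auto result = components_.emplace(qualifiedName, definition);
        return result.second || result.first->second == definition;
    }

    // Returns the stored definition, or nullptr if the name is unknown.
    const std::string* find(const std::string& qualifiedName) const {
        auto it = components_.find(qualifiedName);
        return it == components_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::string, std::string> components_;
};
```

In this model, adding a changed definition under the same name leaves the first one in place, which matches the symptom reported in item 4, while the same local name in a different namespace is a different key and is accepted.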

RE: Saxon/C and Windows: several issues with C++ classes - Added by Michael Kay 4 months ago

To expand on O'Neil's reply, a SaxonProcessor holds one "schema" in the sense of a coherent set of schema definitions (element definitions, type declarations etc). It's possible to load multiple schema documents, in which case you're validating against the union of these; but this is only possible if the various schema documents are non-conflicting (you can't have two different definitions of the same type). Sometimes it's best to create an artificial "grand schema" that imports all the individual schema documents you want to use for different kinds of document. But if you need to work with multiple schemas that conflict with each other, e.g. by having different definitions of the same type, then the only way is to use multiple SaxonProcessor instances.
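If conflicting schemas do require separate SaxonProcessor instances, one way to limit the start-up cost is to keep one instance per distinct schema and reuse it. The following is a sketch under stated assumptions: ProcessorCache, forSchema, and FakeProcessor are hypothetical names, the cache is keyed by the full schema text, and the factory that would create a real SaxonProcessor and register the schema is left to the caller, so no Saxon/C calls appear here.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical cache: one processor per distinct schema text.
template <class Processor>
class ProcessorCache {
public:
    using Factory =
        std::function<std::unique_ptr<Processor>(const std::string& schemaText)>;

    explicit ProcessorCache(Factory factory) : factory_(std::move(factory)) {}

    // Returns the processor for this schema, creating it on first use only.
    Processor& forSchema(const std::string& schemaText) {
        auto it = cache_.find(schemaText);
        if (it == cache_.end()) {
            it = cache_.emplace(schemaText, factory_(schemaText)).first;
        }
        return *it->second;
    }

    std::size_t size() const { return cache_.size(); }

private:
    Factory factory_;
    std::map<std::string, std::unique_ptr<Processor>> cache_;
};

// Stand-in processor type (an assumption) used only to try out the cache.
struct FakeProcessor {
    std::string schema;
};
```

This only helps when the same schema versions recur; a schema that changes on every call still pays for a new instance each time.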

RE: Saxon/C and Windows: several issues with C++ classes - Added by Thomas Elsässer 4 months ago

Hello, you're right, I changed an existing schema. Am I right that this is (currently) not possible within the same Saxon processor instance? I would like to have an "overwrite" flag or something similar :-) But for the moment, I can certainly live with it.

Best regards, Thomas

RE: Saxon/C and Windows: several issues with C++ classes - Added by Michael Kay 4 months ago

Although there are various APIs and language bindings, the internal architecture of Saxon has pretty deeply embedded the concept that a Configuration, which owns all the shared resources for an application, contains a single set of schema components that is internally consistent and contains no duplicates.

I've had an aspiration to address this for years, but it's challenging. It's hard to do it in a way that doesn't complicate a lot of internal and external APIs, that doesn't damage backwards compatibility, or conformance with W3C specifications, or external Java API specifications (JAXP / XQJ), or have unacceptable performance implications. There are some rather fragile rules about things like importing of multiple schema modules with the same target namespace, or checking for schema component identity, or the "global" semantics of xs:redefine, which could easily break (these tend to be things that are very weakly specified in the W3C specs, where we have had to find our own solutions).

Some of the things we would like to achieve by making this change, for example allowing a schema-aware XSLT transformation to use one version of a schema for its input and a different version of the schema for its output, aren't even compatible with the W3C specifications as currently written.

So I'm afraid you can't expect this limitation to be lifted any time soon. There are possibly minor things we could do, like allowing the schema contained in a configuration to be cleared out and replaced with a different one; but even that is difficult, because of the way type annotations in schema-validated nodes are maintained.
