Support #4792

Python program using SaxonC 1.2.1 HE module with xquery_processor.run_query_to_string() crashes

Added by Anton Shchetikhin 14 days ago. Updated 13 days ago.

Status: New
Priority: Normal
Category: Python
Start date: 2020-10-08
Due date:
% Done: 0%
Estimated time:
Found in version: 1.2.1

Description

I managed to compile the SaxonC 1.2.1 HE module for Python 3.7 on 64-bit Ubuntu 16.04, but I am having trouble getting some XQuery expressions to run (see attachments).

  1. without_func_call.xquery works fine
Creating new xquery processor -> OK
Prepare xml -> OK
Prepare xquery -> OK
Setup context -> OK
Setup `$document` variable -> OK
Running XQuery -> 
<БлокПроверок xmlns="http://пф.рф/ВС/СЗВ-М/2017-01-01"
              ID="ВСЗЛ.Б-АНКЕТА.1"
              name="Блок проверок по БД анкетных данных">
   <Проверка ID="1">
      <Описание>Указывается СНИЛС, содержащийся в страховом свидетельстве</Описание>
      <РезультатЗапроса>0</РезультатЗапроса>
      <КодРезультата>30</КодРезультата>
   </Проверка>
   <Проверка ID="2">
      <Описание>Указывается ФИО, содержащееся в страховом свидетельстве</Описание>
      <РезультатЗапроса>0</РезультатЗапроса>
      <КодРезультата>30</КодРезультата>
   </Проверка>
   <Проверка ID="3">
      <Описание>Статус ИЛС в реестре 'Застрахованные лица' на дату проверяемого документа не должен быть равен значению 'УПРЗ'</Описание>
      <РезультатЗапроса>0</РезультатЗапроса>
      <КодРезультата>30</КодРезультата>
   </Проверка>
</БлокПроверок>
  2. func_call_with_empty_param.xquery works fine too
Creating new xquery processor -> OK
Prepare xml -> OK
Prepare xquery -> OK
Setup context -> OK
Setup `$document` variable -> OK
Running XQuery -> 
<БлокПроверок xmlns="http://пф.рф/ВС/СЗВ-М/2017-01-01"
              ID="АФ.КСФ.1"
              name="Проверка структуры файла">
   <Проверка ID="1">
      <Описание>Проверяемый файл должен быть корректно заполненным XML-документом</Описание>
      <РезультатЗапроса>
         <Результат xmlns="http://пф.рф/ВС/СЗВ-М/2017-01-01">0</Результат>
      </РезультатЗапроса>
      <КодРезультата>50</КодРезультата>
   </Проверка>
</БлокПроверок>
  3. func_call_with_non_empty_param.xquery doesn't work as expected
Creating new xquery processor -> OK
Prepare xml -> OK
Prepare xquery -> OK
Setup context -> OK
Setup `$document` variable -> OK
Running XQuery -> 
Syntax error on line 124 at column 7 near {...у[x443]л[x43b]ь[x44c]т[x442]а[x430]т[x442]а[x430]> </П[x41f]р[x440]о[x43e]в[x432]е[x435]р[x440]к[x43a]а[x430]> </Б[x411]л[x43b]о[x43e]к[x43a]...} 
  XPST0003: End of input encountered while parsing direct constructor
None

JET RUNTIME HAS DETECTED UNRECOVERABLE ERROR: system exception at 0x00007f3456e437c6
Please, contact the vendor of the application.
Core dump will be piped to "/usr/share/apport/apport %p %s %c %d %P %E"
Extra information about error is saved in the "jet_err_25848.txt" file.

Aborted (core dumped)

A text file with the error and the Python script (test_saxon.py) are attached.

What am I doing wrong? Maybe there's a problem with Cyrillic letters in the function call? I would also note that BaseX works as expected in all cases.

func_call_with_empty_param.xquery (2.25 KB), Anton Shchetikhin, 2020-10-08 19:44
func_call_with_non_empty_param.xquery (4.73 KB), Anton Shchetikhin, 2020-10-08 19:44
test_saxon.py (1.19 KB), Anton Shchetikhin, 2020-10-08 19:44
jet_err_25848.txt (74.4 KB), Anton Shchetikhin, 2020-10-08 19:44
without_func_call.xquery (1.55 KB), Anton Shchetikhin, 2020-10-08 19:44
test.xml (2.4 KB), Anton Shchetikhin, 2020-10-09 14:29
gdb_out.txt (3.78 KB), Anton Shchetikhin, 2020-10-09 14:37

History

#1 Updated by Martin Honnen 14 days ago

Can you show the input XML as well?

Also, if you have the XML and XQuery as files, why go through etree and the file APIs to feed strings to Saxon? Can't you just let it handle the files, e.g. with parse_xml(xml_file_name='/home/mrslow/dev/scripts_py3/test.xml') and set_query_file('/home/mrslow/dev/scripts_py3/func_call_with_non_empty_param.xquery')? Perhaps Saxon can cope better with the file content that way.
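
The file-based approach suggested here could look roughly like the sketch below. It is only an illustration under assumptions: parse_xml(xml_file_name=...), set_query_file() and run_query_to_string() are the calls named in this thread, while new_xquery_processor(), set_context(xdm_item=...) and set_parameter() are assumed from the SaxonC Python API, and the helper name run_query_from_files is hypothetical.

```python
def run_query_from_files(proc, xml_path, query_path, variable_name="document"):
    """Run an XQuery file against an XML file, letting Saxon read both files.

    `proc` is expected to behave like a saxonc.PySaxonProcessor (assumption).
    """
    xqp = proc.new_xquery_processor()  # assumed method name
    # Saxon parses the XML file itself, so no Python-side decoding is involved.
    node = proc.parse_xml(xml_file_name=xml_path)
    xqp.set_context(xdm_item=node)          # assumed keyword argument
    xqp.set_parameter(variable_name, node)  # binds $document, as in the log above
    # Saxon reads the query file directly instead of receiving a Python string.
    xqp.set_query_file(query_path)
    return xqp.run_query_to_string()


# Usage (requires the saxonc module to be installed):
#   import saxonc
#   with saxonc.PySaxonProcessor(license=False) as proc:
#       print(run_query_from_files(proc, "test.xml", "query.xquery"))
```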

#2 Updated by O'Neil Delpratt 13 days ago

  • Priority changed from Low to Normal
  • Found in version set to 1.2.1

Thanks for reporting your issue.

My initial thought, as mentioned in comment #1: why not pass the files directly to Saxon? That said, it is good to see how Saxon/C can be used with ElementTree.

This is indeed a bug, as we should not be getting the Jet runtime crash. It is probably intercepting a null pointer exception somewhere. I will investigate with the repro you have sent us. I usually run the gdb debugger, which prevents this interception by Jet and so exposes the underlying problem.

#3 Updated by Anton Shchetikhin 13 days ago

Thanks for the responses.

  1. My mistake, I forgot to attach the input XML.
  2. I usually get XML and XQuery in binary form, so I read them in the script first.

#4 Updated by O'Neil Delpratt 13 days ago

Thanks for sharing the gdb output. Very useful indeed. I can see the error:

Syntax error on line 124 at column 7 near {...у[x443]л[x43b]ь[x44c]т[x442]а[x430]т[x442]а[x430]> </П[x41f]р[x440]о[x43e]в[x432]е[x435]р[x440]к[x43a]а[x430]> </Б[x411]л[x43b]о[x43e]к[x43a]...} 
  XPST0003: End of input encountered while parsing direct constructor

It looks like there is a problem with the XQuery string. I am not sure whether this is a legitimate error or an encoding/decoding issue with the string produced from the binary. The encoding you used to convert the binary to a string (i.e. encoding='utf-8') should work well in Saxon.
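
One way to rule out a decoding problem on the Python side is to inspect the query bytes before handing the string to Saxon. The sketch below is only a diagnostic aid, assuming the query arrives as raw bytes; the helper name describe_query_bytes is hypothetical and it says nothing about what happens inside Saxon. It decodes strictly and reports how the character count and byte count differ, which they always do for Cyrillic text in UTF-8.

```python
def describe_query_bytes(raw: bytes, encoding: str = "utf-8") -> dict:
    """Decode the query strictly and summarise its size and ending."""
    # Strict decoding raises UnicodeDecodeError on malformed input
    # instead of silently replacing bytes.
    text = raw.decode(encoding)
    return {
        "characters": len(text),  # number of code points
        "bytes": len(raw),        # number of encoded bytes
        "non_ascii": sum(1 for ch in text if ord(ch) > 127),
        "tail": text[-40:],       # should show the real end of the query
    }


if __name__ == "__main__":
    sample = "<Блок/>".encode("utf-8")
    print(describe_query_bytes(sample))
```

If the tail shows the complete closing tag of the query, the text was decoded intact in Python and the truncation reported in the XPST0003 message would have to come from a later step.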

Next, you rightly try to print the error message from the XQuery processor. This is where the crash happens, somewhere in get_error_message, because the argument index+1 should be index: xqp.get_error_message(index+1) should be xqp.get_error_message(index). The documentation is not clear about whether the argument is 0-based or 1-based. We will raise a bug against the documentation too: getErrorMessage should be documented as using zero-based indexing.
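
A minimal sketch of reading the errors with zero-based indexing follows. get_error_message(index) is the call discussed above; exception_occurred() and exception_count() are assumed to be available on the XQuery processor object, and the helper name collect_error_messages is hypothetical.

```python
def collect_error_messages(xqp):
    """Return all pending error messages from an XQuery processor-like object."""
    messages = []
    if not xqp.exception_occurred():  # assumed method
        return messages
    # Indices run from 0 to count - 1; passing index + 1 goes past the end.
    for index in range(xqp.exception_count()):  # assumed method
        messages.append(xqp.get_error_message(index))
    return messages
```

After run_query_to_string() returns None, calling collect_error_messages(xqp) and printing the result would replace the index+1 loop in the original script.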

We will investigate further if the query should be accepted or not.

#5 Updated by Anton Shchetikhin 13 days ago

I have tried to pass the XQuery file path directly to the processor (set_query_file('/home/mrslow/dev/scripts_py3/func_call_with_non_empty_param.xquery')) as Martin Honnen suggested and it works as expected.

Thanks for your advice.

That said, working with files doesn't solve the original problem. For instance, if I want to reuse the same XQuery expression, it would definitely be better to use an in-memory representation rather than read it from a file each time. But maybe I'm missing something, and there is a counterpart to a 'prepare' statement for expressions.
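
As far as avoiding repeated file reads goes, the query text can at least be cached in memory on the Python side. The sketch below assumes UTF-8 query files and uses a hypothetical helper name, load_query_text. It only caches the decoded text: it does not compile the expression, and whether the processor accepts the resulting string still depends on the problem discussed above.

```python
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=None)
def load_query_text(path: str, encoding: str = "utf-8") -> str:
    """Read and decode a query file once; later calls return the cached text."""
    return Path(path).read_bytes().decode(encoding)
```

Calling load_query_text.cache_clear() would force a re-read if the file changes on disk.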
