Bug #4240: ssllooww matching of HUGE regexp - Saxon - Saxonica Developer Community

Bug #4240

closed

ssllooww matching of HUGE regexp

Added by Syd Bauman over 5 years ago. Updated about 5 years ago.

Status: Won't fix
Priority: Low
Assignee: Michael Kay
Category: Performance
Start date: 2019-06-24

Description

The attached XSLT 3.0 program (also valid XSLT 2.0) is designed to be run with itself as the input file. Its purpose is to state whether or not the value of any @selector attribute found is a valid CSS3 selector. (It differs mildly from the CSS3 spec, but that's not important here.) It does this by comparing each @selector against an enormous regular expression using the matches() function with the 'i' switch.
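For readers who do not want to open the 558 KB attachment, the general shape of the check can be sketched as follows. This is only a minimal illustration, not the attached stylesheet: the variable name `selector-regex` and its tiny pattern are hypothetical stand-ins that accept a single type, class, or ID selector, whereas the real expression is enormous. The sketch also reads @selector attributes from whatever XML input it is given, rather than from itself.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Hypothetical stand-in pattern (NOT the regexp from the attachment):
       one type, class, or ID selector such as "div", ".note" or "#main". -->
  <xsl:variable name="selector-regex" as="xs:string"
      select="'^[.#]?[a-z][a-z0-9_-]*$'"/>

  <xsl:template match="/">
    <!-- Visit every selector attribute anywhere in the input document. -->
    <xsl:for-each select="//@selector">
      <!-- The 'i' flag makes matches() case-insensitive. -->
      <xsl:value-of select="concat(., ': ',
          if (matches(., $selector-regex, 'i')) then 'valid' else 'INVALID',
          '&#10;')"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With a pattern this small the check should be quick; the slowdown reported here concerns the much larger expression in the attachment.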

When I ran some tests (using Saxon-HE 9.8.0.11J, I think), Saxon took a very long time to validate even a small number of @selector attributes: 40 @selectors in ~03:48 according to time, on an Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz. I did not use "-opt:e".

By contrast, jing processed a RELAX NG grammar that tested the same regular expression against the same 40 @selectors almost instantaneously (although it does not have an 'i' switch to set).

Sadly, I did not record a lot of the details (like which 40 of the ~5900 test @selector attributes were processed). I am running some more tests now, but may not have the results available for a while, so I thought I should post this without waiting, as I told O'Neil I would submit this almost 2 weeks ago. (BTW, I have no idea if this should be assigned to him or not; I am just making sure to let him know it has been submitted. :-)

For further information see the paper at https://markupuk.org/webhelp/Syd_selector_regex.html?hl=bauman

Files

CSS3_selector_checker.xslt (558 KB)

XSLT pgm to run against itself while you go out for lunch

Syd Bauman, 2019-06-25 04:12

Updated by Michael Kay over 5 years ago

Updated by Syd Bauman over 5 years ago

Updated by Michael Kay over 5 years ago

Updated by Syd Bauman about 5 years ago