Bug #5884

SaxonCS 12 doesn't find elements in HTML DOM based on [@class = 'foo'] predicate

Added by Martin Honnen about 1 year ago. Updated about 1 year ago.

Status: Closed
Priority: Normal
Assignee:
Category: Features new in 4.0
Sprint/Milestone: -
Start date: 2023-02-16
Due date:
% Done: 100%
Estimated time:
Legacy ID:
Applies to branch: 12, trunk
Fix Committed on Branch: 12, trunk
Fixed in Maintenance Release:
Platforms: .NET

Description

SaxonCS fails to select elements with predicates on the class attribute in HTML DOMs (i.e. those returned by saxon:parse-html or fn:parse-html). Running a query with a command like 'C:\Program Files\Saxonica\SaxonCS-12.0\SaxonCS.exe' query -q:.\saxon-parse-html-test1.xq !indent=yes returns nothing but the XML declaration <?xml version="1.0" encoding="UTF-8"?>.

XQuery sample:

saxon:parse-html(unparsed-text('test2.html'))//*:h2[contains(@class, 'foo')]

HTML document:

<!doctype html>
<html>
  <head>
    <title>Test</title>
  </head>
  <body>
    <h2>h2 1</h2>
    <h2 class=foo>h2 2</h2>
  </body>
</html>

Saxon EE 12.0 (Java) finds the element, e.g.:

<?xml version="1.0" encoding="UTF-8"?>
<h2 xmlns="http://www.w3.org/1999/xhtml" class="foo">h2 2</h2>
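As an independent sanity check outside Saxon, the expected match can be reproduced with a small sketch. It assumes Python's standard-library html.parser, which is not the HTML parser that SaxonCS or Saxon EE use, and the names H2ClassFinder and find_h2_with_class are hypothetical helpers introduced only for this illustration. It shows that the unquoted attribute in class=foo is parsed as an ordinary class attribute with the value foo, so a contains-style test on it should select exactly one h2.

```python
from html.parser import HTMLParser

# The HTML document from the report, unchanged.
HTML_DOC = """<!doctype html>
<html>
  <head>
    <title>Test</title>
  </head>
  <body>
    <h2>h2 1</h2>
    <h2 class=foo>h2 2</h2>
  </body>
</html>
"""


class H2ClassFinder(HTMLParser):
    """Collects the text of every h2 whose class attribute contains `needle`."""

    def __init__(self, needle):
        super().__init__()
        self.needle = needle
        self.matches = []
        self._capturing = False

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            # attrs is a list of (name, value) pairs; a missing class gives "".
            cls = dict(attrs).get("class") or ""
            if self.needle in cls:
                self._capturing = True
                self.matches.append("")

    def handle_data(self, data):
        if self._capturing:
            self.matches[-1] += data

    def handle_endtag(self, tag):
        if tag == "h2":
            self._capturing = False


def find_h2_with_class(html, needle):
    """Rough analogue of //*:h2[contains(@class, needle)], returning text content."""
    finder = H2ClassFinder(needle)
    finder.feed(html)
    finder.close()
    return finder.matches


if __name__ == "__main__":
    print(find_h2_with_class(HTML_DOC, "foo"))
```

This only confirms what the input document contains; it says nothing about where the SaxonCS lookup goes wrong.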
