XQJ: html page screen scrap
Added by Anonymous almost 14 years ago
Legacy ID: #9388554 Legacy Poster: Chris (chr1sbau)
I have been trying to use the XQJ API to screen-scrape an HTML page that has been
loaded into a DOM via HTML Tidy. The query I am trying to run is:
<run>{ for $x in //div[@id='ctl09_RacePanel'] return <race>{$x}</race> }</run>
When I run the query it returns just an empty <run/>. However, if I change
for $x in .//div[@id='ctl09_RacePanel'] to for $x in //*[@id='ctl09_RacePanel'],
the query returns the HTML between the div tags.
Replies (1)
RE: XQJ: html page screen scrap - Added by Anonymous almost 14 years ago
Legacy ID: #9389790 Legacy Poster: Michael Kay (mhkay)
If you look more closely at your source XML you will almost certainly find that the elements are in a namespace, probably http://www.w3.org/1999/xhtml. By default, an unprefixed name such as div in a path expression selects only elements in no namespace, which is why //div finds nothing while the wildcard //* still matches. So if you want to select elements from this namespace, you will need to start your query with [code]declare default element namespace "http://www.w3.org/1999/xhtml";[/code]
Unfortunately this will have the side-effect of putting your output elements (run and race) in this namespace as well, which is probably not what you want. The workaround is to bind a specific prefix [code]declare namespace h = "http://www.w3.org/1999/xhtml";[/code] then write [code]for $x in //h:div[...] return ...[/code] This is a weakness in the design of the XQuery language.
Please note that this forum isn't really intended for general XQuery coding help that's independent of the Saxon product. You should try the talk @ x-query.com mailing list, or stackoverflow.com.
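The namespace effect described above can be reproduced without XQJ or Saxon. The following is a minimal sketch that uses the JDK's built-in XPath 1.0 API rather than XQuery, purely to illustrate the matching behaviour. The class name XhtmlNamespaceDemo, the helper countMatches, and the SAMPLE document are hypothetical; only the div id and the XHTML namespace URI come from the thread.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XhtmlNamespaceDemo {
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    // Hypothetical stand-in for the Tidy output: every element is in the XHTML namespace.
    static final String SAMPLE =
        "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body>"
        + "<div id=\"ctl09_RacePanel\">Race 1</div>"
        + "</body></html>";

    // Parses xml with namespace awareness and counts the nodes selected by expression.
    // The prefix h is bound to the XHTML namespace; all other prefixes are unbound.
    static int countMatches(String xml, String expression) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String prefix) {
                    return "h".equals(prefix) ? XHTML_NS : XMLConstants.NULL_NS_URI;
                }

                @Override
                public String getPrefix(String namespaceURI) {
                    return XHTML_NS.equals(namespaceURI) ? "h" : null;
                }

                @Override
                public Iterator<String> getPrefixes(String namespaceURI) {
                    String prefix = getPrefix(namespaceURI);
                    return prefix == null
                            ? Collections.<String>emptyIterator()
                            : Collections.singletonList(prefix).iterator();
                }
            });

            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Unprefixed name: looks for div in no namespace, so nothing matches.
        System.out.println(countMatches(SAMPLE, "//div[@id='ctl09_RacePanel']"));   // prints 0
        // Wildcard: matches any element name regardless of namespace.
        System.out.println(countMatches(SAMPLE, "//*[@id='ctl09_RacePanel']"));     // prints 1
        // Prefixed name bound to the XHTML namespace: matches the div.
        System.out.println(countMatches(SAMPLE, "//h:div[@id='ctl09_RacePanel']")); // prints 1
    }
}
```

The id attribute needs no prefix because unprefixed attributes are in no namespace. In the XQuery version, the declare namespace h line plays the role that the NamespaceContext plays here.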