eXist XPath Extensions


One of the really cool things about eXist is its set of XPath extensions for fulltext searching. They do in XPath the kind of thing that XStreamDB does via XQuery.

I can do stuff like:

document(*)//text() &= "*image*"

and eXist will return me any XML document (from its entire set of collections) that contains the string "image" somewhere in it. The match could be "Images Of Bangalore" in /lom/general/title/langstring, or "image/jpeg" in /lom/technical/format, or whatever. It doesn't care. And it's very fast.

What's more, I can do stuff like:

document(*)/*[ //format &= "*image*" and //text() &= "*earth*"]

which says: find me XML documents that have "image" somewhere in a "format" element (could be, say, /lom/technical/format) and contain the string "earth" somewhere (like, say, "Earth At Night" or "Earthquakes" in /lom/general/title/langstring).

I can also do something like:

document(*)//text() &= "*image* *kyoto*"

Which will give me different results than

document(*)//text() &= "*image* *kyoto* *relig*"

because the second query also requires a term matching "relig" (religion, religious, whatever). In this case a Buddhist temple in Kyoto is returned, as opposed to the Kyoto Accord presentations at the University of Calgary, which are returned by the query before it.

Queries based on the fulltext extensions are amazingly fast. They use the &= operator to indicate "boolean and" (every term must match); you can also use the |= operator to indicate "boolean or" (any term may match). I'm getting results from rather complicated test queries on the entire 3600+ CAREO record set in a fraction of a second. Nice.
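To make the "boolean and" versus "boolean or" behaviour concrete, here is a rough Python model of how a space-separated list of wildcard terms could be matched against a piece of text. It is only a sketch of the semantics as described above, not eXist's implementation: the function names (tokenize, match_all, match_any), the simple word tokenizer, and the sample titles are all invented for illustration.

```python
import re
from fnmatch import fnmatchcase


def tokenize(text):
    """Split text into lowercase word tokens (a simplification of real indexing)."""
    return re.findall(r"\w+", text.lower())


def matches_term(tokens, pattern):
    """True if any token matches the wildcard pattern, e.g. '*relig*'."""
    return any(fnmatchcase(tok, pattern.lower()) for tok in tokens)


def match_all(text, terms):
    """Model of 'boolean and': every space-separated term must match some token."""
    tokens = tokenize(text)
    return all(matches_term(tokens, p) for p in terms.split())


def match_any(text, terms):
    """Model of 'boolean or': at least one space-separated term must match."""
    tokens = tokenize(text)
    return any(matches_term(tokens, p) for p in terms.split())


# Hypothetical sample titles, loosely modeled on the records mentioned above.
titles = [
    "Images of a religious temple in Kyoto",
    "Kyoto Accord presentation images",
]

# Two terms: both titles qualify.
broad = [t for t in titles if match_all(t, "*image* *kyoto*")]

# Adding a third term narrows the result to the temple record.
narrow = [t for t in titles if match_all(t, "*image* *kyoto* *relig*")]
```

Each extra term in the "and" form can only shrink the result set, which is why the three-term query returns fewer records than the two-term one. The "or" form behaves the other way round: match_any("Earth At Night", "*image* *earth*") is true even though only one of the two terms matches.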

