trimxml allows you to rapidly extract details from large XML files on the
command line. Run "trimxml --help" for details of the command-line
parameters, but here are some pointers to get you started.

Say you have a simple database dump format of the following form:

<db>
  <record id="1">
    <name>Alex</name>
    <address>123 Maple St.</address>
  </record>
  <record id="2">
    <name>Bob</name>
    <address>456 Birch Rd.</address>
  </record>
  <record id="3">
    <name>Chris</name>
    <address>789 Pine St.</address>
  </record>
</db>

You can:

Get the full contents of all name elements

$ trimxml file.xml name
<name>Alex</name>
<name>Bob</name>
<name>Chris</name>

Get the full contents of the record with ID 2

$ trimxml file.xml record "@id='2'"
<record id="2">
  <name>Bob</name>
  <address>456 Birch Rd.</address>
</record>

Get the full contents of the first two name elements

$ trimxml -c 2 file.xml name
<name>Alex</name>
<name>Bob</name>

Get the name of the record with ID 2

$ trimxml -d "name" file.xml record "@id='2'"
<name>Bob</name>

trimxml uses namespaces declared on the document element, so you can
conveniently make queries without needing to separately declare prefixes.
To get the URLs of all "a" links in an XHTML document you could do:

$ trimxml -d "@href" file.xhtml "ht:a"

as long as there is a namespace declaration such as
xmlns:ht="http://www.w3.org/1999/xhtml" on the document element. If there
is not (many XHTML documents use the default namespace, which, courtesy of
XPath 1.0 restrictions, prevents trimxml from doing any guesswork for
you), you have to declare the prefix yourself:

$ trimxml --ns=ht="http://www.w3.org/1999/xhtml" -d "@href" http://www.w3.org/2000/07/8378/xhtml/media-types/test4.xhtml "ht:a"

Notice how this example loads the source XML (XHTML) from a Web URL rather
than a local file.
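
The filter argument is an XPath expression evaluated against each matched
element, as the "@id='2'" example above suggests, so you are not limited to
attribute tests. As a sketch (assuming the same file.xml as above, and that
any XPath 1.0 predicate is accepted), you could select a record by the
content of a child element:

$ trimxml file.xml record "name='Bob'"
<record id="2">
  <name>Bob</name>
  <address>456 Birch Rd.</address>
</record>

Here the expression name='Bob' is true only for the record whose name
child has the string value "Bob", so only that record is printed.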