XPath « XML « Java Articles

Some of the exciting new features of the Java 2 Platform, Standard Edition (J2SE) 5.0 release, code-named Tiger, are the added XML validation package at javax.xml.validation and the XPath libraries at javax.xml.xpath. Before the Tiger release, the Java API for XML Processing (JAXP) SAXParser or DocumentBuilder classes were the primary instruments of Java technology XML validation. The new Validation API, however, decouples the validation of an XML document from the parsing of the document. Among other things, this allows Java technology to support multiple schema languages. Let's take a closer look at XML validation first.
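To make the decoupling concrete, here is a minimal sketch that validates a document with javax.xml.validation without building a DOM tree or driving a SAX parser by hand. The schema, the element names, and the class name ValidationSketch are assumptions made up for illustration; they do not come from the original article.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class ValidationSketch {
    // Hypothetical schema: a <journal> holding one or more <article> strings.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='journal'><xs:complexType><xs:sequence>"
      + "<xs:element name='article' type='xs:string' maxOccurs='unbounded'/>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    static boolean isValid(String xml) throws IOException, SAXException {
        // The schema language is chosen here, independently of any parser.
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        try {
            // With no ErrorHandler set, validation errors surface as SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<journal><article>XPath</article></journal>"));
        System.out.println(isValid("<journal><note>oops</note></journal>"));
    }
}
```

Passing a different constant to SchemaFactory.newInstance is how another schema language would be selected, provided an implementation for it is available on the classpath.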

Parsing an XML document with an XPath expression is more efficient than the getter methods, because with XPath expressions, an Element node may be selected without iterating over a node list. Node lists retrieved with the getter methods have to be iterated over to retrieve the value of element nodes. For example, the second article node in the journal node in the example XML document in this tutorial (listed in the Overview section below) may be retrieved with the XPath expression:
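The tutorial's example document is not reproduced in this excerpt, so the sketch below uses a hypothetical stand-in (a catalog element wrapping a journal with three article elements) to contrast the two approaches. The document layout, the path /catalog/journal/article[2], and the class name SecondArticleSketch are all assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SecondArticleSketch {
    // Hypothetical stand-in for the tutorial's example document.
    static final String XML =
        "<catalog><journal>"
      + "<article><title>First</title></article>"
      + "<article><title>Second</title></article>"
      + "<article><title>Third</title></article>"
      + "</journal></catalog>";

    static Document parse() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
    }

    // Getter approach: walk the journal's children and count articles by hand.
    static String secondTitleViaGetters(Document doc) {
        Node journal = doc.getElementsByTagName("journal").item(0);
        NodeList children = journal.getChildNodes();
        int seen = 0;
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE
                    && "article".equals(child.getNodeName())
                    && ++seen == 2) {
                return child.getTextContent();
            }
        }
        return null;
    }

    // XPath approach: the positional predicate [2] selects the node directly.
    static String secondTitleViaXPath(Document doc) throws Exception {
        Node article = (Node) XPathFactory.newInstance().newXPath()
                .evaluate("/catalog/journal/article[2]", doc, XPathConstants.NODE);
        return article.getTextContent();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse();
        System.out.println(secondTitleViaGetters(doc));
        System.out.println(secondTitleViaXPath(doc));
    }
}
```

The saving shown here is in the amount of code written; whether the XPath version is also faster at run time depends on the engine and the document.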

Until recently, the exact application program interface (API) by which Java programs made XPath queries varied with the XPath engine. Xalan had one API, Saxon had another, and other engines had their own. This meant your code tended to lock you into one product. Ideally, you'd like to be able to experiment with different engines that have different performance characteristics without undue hassle or rewriting of code.

This tutorial introduces the W3C standard, XPath. It is aimed at people who do not know XPath or who want a refresher. If you plan to use XSLT, you should take this tutorial first. You will learn what XPath is, the syntax and semantics of the XPath language, how to use XPath location paths, how to use XPath expressions, how to use XPath functions, and how XPath relates to XSLT. This tutorial covers XPath Version 1.0.

DOM forms a very effective base on which easy-to-use systems can be built by following a few simple principles. Future versions of DOM are being designed with the combined wisdom and experience of a large group of users, and will likely present solutions to some of the problems discussed here. Projects such as JDOM are adapting the API for a more natural Java feel, and techniques such as those described in this article can help make XML manipulation easier, less verbose, and less prone to bugs. Leveraging these projects and following these usage patterns allows DOM to be an excellent platform for XML-based projects.

As new technologies emerge and become well established, so do threats against those technologies. Blind SQL injection attacks are a well-known and widely recognized form of code injection attack, but there are many other forms, some not so well documented or understood. An emerging code injection attack is the XPath injection attack, which takes advantage of the loose typing and forgiving nature of XPath parsers to allow malcontents to piggyback malicious XPath queries on URLs, forms, or other inputs to gain access to privileged information and change it.
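As a rough illustration of the attack, the sketch below builds a login check by string concatenation and then repeats it with XPath variables bound through the standard XPathVariableResolver interface. The users document, the credentials, and the class and method names are hypothetical; binding variables is one common mitigation, not necessarily the one the original article recommends.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XPathInjectionSketch {
    // Hypothetical credential store, for illustration only.
    static final String XML =
        "<users>"
      + "<user><name>alice</name><password>s3cret</password></user>"
      + "<user><name>bob</name><password>hunter2</password></user>"
      + "</users>";

    static Document parse() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
    }

    // Vulnerable: user input becomes part of the query text itself.
    static boolean loginByConcatenation(Document doc, String name, String password)
            throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        String query = "/users/user[name='" + name
                     + "' and password='" + password + "']";
        return (Boolean) xpath.evaluate(query, doc, XPathConstants.BOOLEAN);
    }

    // Safer: the query text is fixed and input is supplied as variable values.
    static boolean loginWithVariables(Document doc, String name, String password)
            throws XPathExpressionException {
        Map<String, Object> vars = new HashMap<>();
        vars.put("name", name);
        vars.put("password", password);
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setXPathVariableResolver(qname -> vars.get(qname.getLocalPart()));
        return (Boolean) xpath.evaluate(
                "/users/user[name=$name and password=$password]",
                doc, XPathConstants.BOOLEAN);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse();
        String injected = "' or '1'='1";
        System.out.println(loginByConcatenation(doc, "alice", injected));
        System.out.println(loginWithVariables(doc, "alice", injected));
    }
}
```

With the injected password, the concatenated predicate becomes name='alice' and password='' or '1'='1', which is true for every user, so the first check succeeds without a valid password. In the second version the same input is compared as a plain string value and matches nothing.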

So far in this column, I've focused on the fairly traditional definition and use of data binding: An XML document is converted into a Java representation and used in normal Java methods (for example, getName() or setAddress()). Then, the Java object is converted back into an XML representation and usually serialized (saved) to disk. I've also looked at going the other way -- taking a Java object, converting it to XML, and then using that XML (perhaps sending it across a network connection or using it as input to another XML-consuming component in your application). These are all perfectly valid and useful data binding use-cases, but I still haven't touched on all the possibilities. In this article and the next, you'll learn about an alternative approach to data binding that uses XPath technology.

XSLT 1.0 and XPath 1.0 were originally intended to provide simple style language support for XML documents, primarily to convert those documents into HTML for rendering to a browser. Since XSLT and XPath became available, however, they have been pressed into all sorts of tasks for which they weren't originally designed -- from the sophisticated manipulation of data in XML documents (aggregation, distinct selection, relationship pivoting) to XSLT's transformation of one XML form into another. In version 2.0 of these specifications, the W3C attempts to make XSLT and XPath much more flexible and robust in order to handle the new way these technologies are being used.

The reason that XPath shows up in the Practical data binding series is that these selections all use logical names (see It's eminently logical). Instead of selecting, say, the first attribute of the second child element of the root element, you use an XPath expression like /cds/cd[@title='August and Everything After']. This is data binding, in a certain sense, because you're using XML's markup -- rather than its structure -- to access data.
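A small sketch of that kind of selection follows, using the expression quoted above. The shape of the cds document, including the artist child element and the second cd entry, is an assumption added for illustration, as is the class name LogicalNameSketch.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class LogicalNameSketch {
    // Hypothetical document layout; only the /cds/cd/@title part is from the text.
    static final String XML =
        "<cds>"
      + "<cd title='Another Album'><artist>Another Artist</artist></cd>"
      + "<cd title='August and Everything After'><artist>Counting Crows</artist></cd>"
      + "</cds>";

    // Selects by the logical name of the data, not by position in the tree.
    static String artistOf(String xml) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(
                "/cds/cd[@title='August and Everything After']/artist",
                new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(artistOf(XML));
    }
}
```

Reordering the cd elements or adding attributes does not affect the result, which is the practical benefit of addressing data by markup rather than by structure.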

Next up, you need an XPath object. This object is capable of evaluating XPaths, and is the cornerstone of your XPath-aware Java programs. Just as you get a DocumentBuilder from a DocumentBuilderFactory, you get an XPath from an XPathFactory. Listing 10 shows this minimal code.
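The listing referred to above is not included in this excerpt. As a stand-in, here is a minimal sketch of the same factory pattern with the standard javax.xml.xpath API; the sample document, the count expression, and the class name XPathFactorySketch are assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XPathFactorySketch {
    static double countItems(String xml) throws Exception {
        // DocumentBuilderFactory -> DocumentBuilder -> Document
        DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = builderFactory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // XPathFactory -> XPath, following the same factory pattern
        XPathFactory xpathFactory = XPathFactory.newInstance();
        XPath xpath = xpathFactory.newXPath();

        // Compiling is optional, but lets the expression be reused.
        XPathExpression expr = xpath.compile("count(/list/item)");
        return (Double) expr.evaluate(doc, XPathConstants.NUMBER);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countItems("<list><item/><item/><item/></list>"));
    }
}
```

Because the code names only javax.xml.xpath types, the engine behind XPathFactory.newInstance() can be swapped without changing it.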

Where possible, we use the standard Java API for XML Processing (JAXP 2.0). However, DOM XPath capability is lacking in these interfaces, so we'll use the XPath capabilities in Apache Xalan to fill in the gaps.

XML is simply markup for data. That's it. XML is not a magic wand; it does not specify how data is transmitted over the wire, it does not specify how data is stored. XML simply determines the format of the data: what you do with the data is up to you. That said, the real power behind XML is not solely its ability to represent data: XML's real power lies in ancillary technologies that, when combined with XML, provide robust solutions, and XPath is one of those ancillary technologies.

To conclude this section on XPaths, let's look at predicates. Predicates allow you to specify conditions that must apply to an element. The predicate appears between square brackets, [ and ], immediately after the element on which the condition applies.
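The sketch below runs a few predicate forms against a small hypothetical library document: a condition on a child element, a condition on an attribute, the last() function, and two predicates chained together. The document and the class name PredicateSketch are invented for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PredicateSketch {
    // Hypothetical sample data.
    static final String XML =
        "<library>"
      + "<book year='1999'><title>A</title><price>10</price></book>"
      + "<book year='2005'><title>B</title><price>25</price></book>"
      + "<book year='2008'><title>C</title><price>40</price></book>"
      + "</library>";

    // Evaluates the expression and returns the text of each selected node.
    static List<String> select(String expression) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(XML)),
                XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Condition on a child element's value
        System.out.println(select("/library/book[price > 20]/title"));
        // Condition on an attribute
        System.out.println(select("/library/book[@year < 2005]/title"));
        // Positional condition using last()
        System.out.println(select("/library/book[last()]/title"));
        // Chained predicates: first of the books priced above 20
        System.out.println(select("/library/book[price > 20][1]/title"));
    }
}
```

Each predicate sits directly after the step it filters, so book[price > 20][1] means the first of the expensive books, not the first book if it happens to be expensive.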

There are some types built into XPath 2.0 that have already been derived by restriction from the XML schema xs:duration type. These types are in the namespace http://www.w3.org/2003/05/xpath-datatypes, which is represented by the prefix xdt:

And, as already mentioned, there are many new functions coming up in XPath 2.0. One of the specific tasks that W3C undertook in XPath 2.0 was to augment its string-processing capabilities. Accordingly, you'll find more string functions in XPath 2.0, including upper-case, lower-case, string-pad, matches, replace, and tokenize.

There is relatively little software out there that supports XPath 2.0 -- in fact, besides XQuery processors like Galax, the only real XPath 2.0-enabled processor in popular use today is the Saxon XSLT application, which you can get for free at http://saxon.sourceforge.net. Because Saxon supports some XPath 2.0 -- as well as some XSLT 2.0 -- we'll use it in examples in our XPath 2.0 discussions.

JXPath is an extremely useful tool for traversing, navigating, and querying complex object trees. Because it uses the XPath expression language for its queries, a large body of reference material is available to help you build efficient yet complex object-retrieval queries. Even more flexibility is added by using Pointers and relative contexts.

Copyright 2009 - 12 Demo Source and Support. All rights reserved.
All other trademarks are property of their respective owners.