Retrieve XML data values with XQuery in SQL Server 2005
When Microsoft released SQL Server 2005, it introduced a new data type: XML. As with other SQL
Server data types, you can use the XML data type to define columns, variables and parameters for
stored procedures and user-defined functions, and you can access XML data in its entirety -- that
is, as a single value -- just like you would an XML document. However, as is the case with an XML
document, there might be times when you want to access only specific values within the XML data.
And that's where XQuery comes in.
XQuery is a powerful scripting language designed specifically to access XML data. SQL Server
2005 supports a subset of the XQuery language that lets you access values within an XML column,
variable or parameter. You can use XQuery within your Transact-SQL statements
Premium Access
Register now for unlimited access to our premium content across our network of over 70 information Technology web sites.
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States.
Privacy
This was first published in April 2008
by calling methods
supported by the XML data type. Two of these methods -- value() and query() -- are particularly
useful for retrieving specific elements from your XML data.
In this article, I introduce you to both methods and provide examples of each. Note that the
article assumes you are already familiar with T-SQL and XML
in SQL Server.
The XML value() method
When you call an XML method within your Transact-SQL statement, you specify the XML column,
variable or parameter name, followed by a period, the method name and an XQuery expression within
parentheses. For example, to call the value() method on an XML column named XmlInfo, you would use
the following syntax:
XmlInfo.value(<XQuery expression>)
The XML value() method returns a scalar (single) value of the type specified when you call the
method. The value() method takes two arguments. The first identifies the element and value to
retrieve, and the second specifies the data type of the returned value. Let's look at an example to
demonstrate how this works. The following statement retrieves data from the JobCandidateID and
Resume columns in the HumanResources.JobCandidate table (part of the AdventureWorks sample
database):
SELECT JobCandidateID, Resume.value('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
(/ns:Resume/ns:Name/ns:Name.Last)[1]',
'nvarchar(30)') AS LastName
FROM HumanResources.JobCandidate
The second column in the SELECT list (Resume) is defined with the XML data type. As a result,
you can call any of the XML methods. In this case, I call the value() method and pass in two
arguments. Each argument is enclosed in single quotes and they are separated with a comma. Now
let's take a closer look at the first argument.
The first argument is divided into two parts and separated by a semicolon. The first part
declares a namespace and assigns it to the "ns" alias. You must specify the namespace for any
typed XML column. A typed column is one that is associated with a specific schema. In SQL
Server Management Studio, you can determine the schema associated with an XML document by
retrieving the entire value from the XML column and then clicking on the returned value. The XML
document opens in a separate window, as shown in Figure 1. Notice that the schema is defined as an
attribute of the root element.
Figure 1: Determining the schema associated with an XML document.
(Click on image for enlarged view.)
The schema listed in the XML document is the one you use as your namespace when calling the
value() method. If the XML column is untyped (not associated with a schema), you do not need
to declare a namespace, and you can omit the first part of the argument.
The second part of the argument identifies the XML element containing the value you want to
retrieve.
 |
More on using XML data type in SQL Server: |
|
|
|
 |
 |
The element is essentially a path name separated by
forward slashes. When working with a typed XML column, you must precede each node in the path name
with the namespace alias, followed by a colon. In this case, I precede each node with the "ns:"
preface. Refer again to Figure 1 and where you can see this element near the top of the
document.
Notice that the path name is enclosed in parentheses and is followed by [1]. Because the value()
method can return scalar values only, you must specify [1] after the closing parenthesis to ensure
only one instance of the element can be returned, even if there is only one instance in the XML
document. After you specify the path, you must specify the data type. The SQL statement will now
retrieve the last name of each job candidate and return it as data type NVARCHAR.
Like any language, XQuery supports a wide variety of functions. For example, the following
statement uses the concat function (in the second part of the first argument) to concatenate the
first name with the last name:
SELECT JobCandidateID, Resume.value('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
concat((/ns:Resume/ns:Name/ns:Name.First)[1], " ",
(/ns:Resume/ns:Name/ns:Name.Last)[1])',
'nvarchar(60)') AS FullName
FROM HumanResources.JobCandidate
The concat function takes multiple arguments that are separated by commas. Each argument is the
value you want to concatenate. Notice that the first and third arguments use the same path
construction as the previous example. Whenever you refer to an element, you must use the complete
path. For information about the concat function and all of the XQuery functions, see Microsoft SQL
Server 2005 Books Online.
The XML query() method
Although the value() method is handy for retrieving a single value from an XML column, you'll
often want to retrieve multiple values. In these cases, you should use the XML query() method. The
query() method takes only one argument and returns the specified elements as XML. For example, the
following SQL statement returns the Education elements and their child nodes for each job
candidate:
SELECT JobCandidateID, Resume.value('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
concat((/ns:Resume/ns:Name/ns:Name.First)[1], " ",
(/ns:Resume/ns:Name/ns:Name.Last)[1])',
'nvarchar(60)') AS FullName,
Resume.query('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
/ns:Resume/ns:Education') AS Education
FROM HumanResources.JobCandidate
The query() method's argument is enclosed in single quotes and separated into two parts. Again,
you must identify a namespace when retrieving data from a typed column. You declare the namespace
in the same way you declared it for the value() method. After you declare the namespace, specify
which elements to retrieve. In this case, the Education elements and all their child nodes are
returned, as shown in Figure 2.
Figure 2: Once the namespace is declared, specify which XML elements to retrieve. (Click
on image for enlarged view.)
You can achieve the same results by casting the second part of the argument as a FLWOR
(pronounced flower) expression, which, by definition, is made up of "for, let, where, order
by, and return" clauses. Note, however, that SQL Server currently does not support the let
clause.
The XML query() method in the following example uses the "for" and "return" clauses to retrieve
the Education elements and their child nodes:
SELECT JobCandidateID, Resume.value('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
concat((/ns:Resume/ns:Name/ns:Name.First)[1], " ",
(/ns:Resume/ns:Name/ns:Name.Last)[1])',
'nvarchar(60)') AS FullName,
Resume.query('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
for $ed in /ns:Resume/ns:Education
return $ed')AS Education
FROM HumanResources.JobCandidate
As you can see, the FOR clause includes the $ed variable. You do not have to explicitly declare
this variable. Using it in the FOR clause is enough. The variable is used to iterate through the
Education elements. (You can use any name for the variable, as long as you follow SQL Server naming
conventions.) The RETURN clause then references the $ed variable. As a result, it returns each
Education element and its child nodes.
Of course, there's no reason to use a FLWER expression simply to retrieve an element and its
child nodes. However, you can include an expression in the path name of your FOR clause (in
brackets) that limits the results returned. For example, the following statement limits the results
to those Education elements that contain an Edu.Level value of "Bachelor":
SELECT JobCandidateID, Resume.value('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
concat((/ns:Resume/ns:Name/ns:Name.First)[1], " ",
(/ns:Resume/ns:Name/ns:Name.Last)[1])',
'nvarchar(60)') AS FullName,
Resume.query('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
for $ed in /ns:Resume/ns:Education[ns:Edu.Level="Bachelor"]
return $ed') AS Education
FROM HumanResources.JobCandidate
Notice I use an equal comparison operator (=) to compare the Edu.Level element value to the
string value. This forms a Boolean expression that must evaluate to true in order for the element
to be included in the results. XQuery supports a variety of operators for creating Boolean
expressions. You can view the supported operators in Microsoft SQL Server 2005 Books online.
Rather than specify a Boolean expression in the FOR clause, you can define the same logic in a
WHERE clause, as in the following example:
SELECT JobCandidateID, Resume.value('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
concat((/ns:Resume/ns:Name/ns:Name.First)[1], " ",
(/ns:Resume/ns:Name/ns:Name.Last)[1])',
'nvarchar(60)') AS FullName,
Resume.query('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
for $ed in /ns:Resume/ns:Education
where $ed/ns:Edu.Level="Bachelor"
return $ed') AS Education
FROM HumanResources.JobCandidate
In this case, the FOR clause includes only a simple path, and the WHERE clause includes the
Boolean logic. Notice that the path name uses the $ed variable to point to the correct Education
element that the FOR clause is iterating through.
Putting this logic in the WHERE clause makes it easier to read and write your code, particularly
if you use complex Boolean expressions. For example, the next statement uses the "and" logical
operator in the WHERE clause to limit the results to business majors with bachelor's degrees:
SELECT JobCandidateID, Resume.value('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
concat((/ns:Resume/ns:Name/ns:Name.First)[1], " ",
(/ns:Resume/ns:Name/ns:Name.Last)[1])',
'nvarchar(60)') AS FullName,
Resume.query(('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
for $ed in /ns:Resume/ns:Education
where $ed/ns:Edu.Level="Bachelor" and $ed/ns:Edu.Major="Business"
return $ed') AS Education
FROM HumanResources.JobCandidate
The two path names are joined together with the "and" operator. As a result, both conditions
must evaluate to true in order for the Education element to be returned.
Now let's look at an example that includes an ORDER BY clause. The following statement retrieves
employment information related to technology and production:
SELECT JobCandidateID, Resume.value('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
concat((/ns:Resume/ns:Name/ns:Name.First)[1], " ",
(/ns:Resume/ns:Name/ns:Name.Last)[1])',
'nvarchar(60)') AS FullName,
Resume.query('declare namespace ns=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
for $emp in /ns:Resume/ns:Employment
where $emp/ns:Emp.FunctionCategory = "Production" or
$emp/ns:Emp.FunctionCategory = "Technology"
order by $emp/ns:Emp.EndDate descending
return $emp') AS Employment
FROM HumanResources.JobCandidate
As in the preceding examples, the statement uses a FOR clause to identify the desired elements
(in this case, Employment) and the WHERE clause to limit the results. Notice, however, that the
statement also includes an ORDER BY clause. You can use this clause to sort your results based on a
specified node. In this case, I'm sorting the results based on the Emp.EndDate element, and I'm
sorting the information in descending order. Accordingly, the most recent employment information
will appear at the top of the list.
As you can see, the value() and query() methods can be quite useful when retrieving XML data.
However, what I've shown you here only scratches the surface. XQuery is a powerful language that
allows you to write complex queries that return XML data in exactly the way you need that data to
return. Because of the rich variety of functions, operators and expressions supported by XQuery,
you can retrieve just about any element and attribute stored in your XML columns, variables and
parameters – in just about any format necessary. Again, be sure to refer to SQL Server 2005 Books
Online for additional information about XQuery and for more examples that demonstrate how it
works.
ABOUT THE AUTHOR
Robert Sheldon is a technical consultant and the author of numerous books,
articles, and training material related to Microsoft Windows, various relational database
management systems, and business intelligence design and implementation. You can find more
information at http://www.rhsheldon.com.
Disclaimer:
Our Tips Exchange is a forum for you to share technical advice and expertise with your peers and to learn from other enterprise IT professionals. TechTarget provides the infrastructure to facilitate this sharing of information. However, we cannot guarantee the accuracy or validity of the material submitted. You agree that your use of the Ask The Expert services and your reliance on any questions, answers, information or other materials received through this Web site is at your own risk.
Join the conversationComment
Share
Comments
Results
Contribute to the conversation