Using T-SQL string functions to convert and report on data
In part one, you learned some basic T-SQL string functions and how to use them.
This article takes things one step further with examples of how to get the most out
Premium Access
Register now for unlimited access to our premium content across our network of over 70 information Technology web sites.
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States.
Privacy
This was first published in April 2010
of string
functions by converting numerical and character data, gathering additional information and
performing a SOUNDEX analysis.
Converting string values
T-SQL provides functions used to convert string values, as shown in the following SELECT
statement:
SELECT
Name + CHAR(9)
+ STR(MakeFlag) AS
Product,
ASCII(MakeFlag)
AsciiMakeFlag,
RTRIM(Style) AS Style,
UNICODE(RTRIM(Style)) AS UnicodeStyle
FROM
Production.Product
WHERE
ProductID = 771
The statement returns the following results:
Product
AsciiMakeFlag Style UnicodeStyle
--------------------------- ------------- ----- ------------
Mountain-100 Silver, 38 1
49 U
85
(1 row(s) affected)
The first function in the statement, CHAR, converts an integer ASCII code (0 through 255) to its
character symbol, and only takes one argument -- the ASCII integer. The function is particularly
handy for inserting control characters into your strings, such as the tab (9), line feed (10), or
carriage return (13).
In the example above, CHAR(9) is used to insert a tab into the Product value. Note that T-SQL
also supports the NCHAR function, which converts a Unicode integer to Unicode characters. The next
function, STR, converts numerical data to character data. The argument for the STR function is the
numeric expression to be converted. In the example above, the MakeFlag bit column is converted to a
string so it can be concatenated with the Name column. Remember, if the two columns are
concatenated without converting the bit column, an error message will display.
T-SQL also supports the ASCII function, which converts a single character to an ASCII integer.
The example uses the ASCII function to convert the MakeFlag value to an integer. Since the column
returns a 1, the value is converted to 49.
The UNICODE function is also shown in the example, and is used to convert a character to a
Unicode integer. The function takes only one argument, a Unicode character expression. In the above
example, the Style value is converted to a Unicode integer. It is important, however, to first use
RTRIM to remove the trailing space because the UNICODE function, like the ASCII function, can
convert only one character at a time.
Gathering information from string values
There are also functions that provide information about a string value, such as its length or
the placement of a substring within the value. The following SELECT statement includes several
functions that provide information about a character value:
SELECT
LEN(Description) AS LengthDescrip,
CHARINDEX('tech', Description) AS
CharIndexDescrip,
PATINDEX('%bike%', Description) AS
PatIndexDescrip
FROM
Production.ProductDescription
WHERE
ProductDescriptionID = 321
The functions in this statement are based on the Description column. For this product, the
Description column contains this value: Same technology as all of our Road series bikes. Perfect
all-around bike for road or racing. The SELECT statement returns the following results:
LengthDescrip |
CharIndexDescrip |
PatIndexDescrip |
------------- |
---------------- |
--------------- |
93 |
6 |
43 |
(1 row(s) affected)
The LEN function returns the number of characters in the specified character expression. The
expression is the function’s only argument. For the Description column, there are 93
characters.
The CHARINDEX function searches a character expression for the starting position of a set of
characters and takes three arguments: the characters to search for, the expression to be searched,
and the starting position for the search. This last argument is optional. In the example above, the
Description column is searched for the first occurrence of tech, which starts at the sixth
position of the string value.
The final function in the above statement is PATINDEX, which searches for the first position of
a character pattern in a character expression. The function takes two arguments: the pattern that
is being searched for and the character expression to be searched. Wildcards can be used to define
the pattern and provide flexibility in the types of patterns searched. If you use the % wildcard,
however, it must precede and follow the characters to be searched, unless you search on the first
or last character. For example, in the statement above, %bike% is specified as the character
pattern to search for in the Description value. The results show that the first instance of the
pattern appears at character 43.
Performing SOUNDEX analysis on string values
The final set of string functions discussed is related to SOUNDEX, a system for converting
alphanumeric values to a four-character code in order to identify similar sounding words. The first
character returned by a SOUNDEX function is the same as the target character expression, and the
next three characters are numeric codes that evaluate the consonants. (Vowels are ignored unless
they are the first letter.)
In the following T-SQL statements, SOUNDEX-related functions are used to analyze two last
names:
DECLARE @Name1 varchar(30)
SET @Name1 =
(SELECT LastName FROM Person.Person
WHERE BusinessEntityID = 341)
DECLARE @Name2 varchar(30)
SET @Name2 =
(SELECT LastName FROM Person.Person
WHERE BusinessEntityID = 441)
SELECT
@Name1 AS Name1,
@Name2 AS Name2,
SOUNDEX(@name1)AS SoundexName1,
SOUNDEX(@Name2) AS SoundexName2,
DIFFERENCE(@Name1, @Name2) AS SoundexDiff
The last names returned by the SET subquery statements are Allen and Alan, which are inserted
into variables, and used in the SELECT statement, returning the following results:
Name1 |
Name2 |
SoundexName1 |
SoundexName2 |
SoundexDiff |
----- |
----- |
------------ |
------------ |
----------- |
Allen |
Alan |
A450 |
A450 |
4 |
(1 row(s) affected)
The first time the SOUNDEX function is used, the last name Allen is analyzed; the second
time analyzes Alan. Notice that the function takes only one argument -- the string to be
analyzed. In this case, the SOUNDEX analyses of the two spellings return the same results, meaning
the names are very similar. This can be handy for identifying someone whose name is spelled
differently in two places. The DIFFERENCE function is also demonstrated and takes two arguments,
the character expressions to be compared, and returns an integer that rates the difference between
the two strings. The integer can range from 0 through 4. A value of 0 indicates little to no
similarity between two strings, while a 4 indicates a strong similarity. In the example above, the
function returns a 4 because the names are similar.
If you want to see the results for two names that are less alike, run the following
statements:
DECLARE @Name3 varchar(30)
SET @Name3 =
(SELECT LastName FROM Person.Person
WHERE BusinessEntityID = 1829)
DECLARE @Name4 varchar(30)
SET @Name4 =
(SELECT LastName FROM Person.Person
WHERE BusinessEntityID = 2580)
SELECT
@Name3 AS Name3,
@Name4 AS Name4,
SOUNDEX(@Name3)AS SoundexName3,
SOUNDEX(@Name4) AS SoundexName4,
DIFFERENCE(@Name3, @Name4) AS SoundexDiff
In this case, the SOUNDEX function returns a value of S000 for the last name Su and a value of
S620 for the last name Suarez. In addition, the DIFFERENCE function returns a value of 3, as shown
in the following results:
Name3 |
Name4 |
SoundexName3 |
SoundexName4 |
SoundexDiff |
----- |
------ |
------------ |
------------ |
----------- |
Su |
Suarez |
S000 |
S620 |
3 |
(1 row(s) affected)
The SOUNDEX function and string functions provide a great deal of flexibility when working with
character data. Although these examples are relatively basic, more complex statements using these
functions can be created. Be sure to check out SQL Server Books
Online for more information.
Back to Part one: T-SQL functions - The ins and outs for SQL Server
ABOUT THE AUTHOR
Robert Sheldon is a technical consultant and the author of numerous books, articles,
and training material related to Microsoft Windows, various relational database management systems,
and business intelligence design and implementation. You can find more information at www.rhsheldon.com.
Disclaimer:
Our Tips Exchange is a forum for you to share technical advice and expertise with your peers and to learn from other enterprise IT professionals. TechTarget provides the infrastructure to facilitate this sharing of information. However, we cannot guarantee the accuracy or validity of the material submitted. You agree that your use of the Ask The Expert services and your reliance on any questions, answers, information or other materials received through this Web site is at your own risk.
Join the conversationComment
Share
Comments
Results
Contribute to the conversation