1. XML glossary

In just a few years, XML's importance to Java developers has grown by leaps and bounds, especially with Web services on the scene. XML, however, is an evolving technology with numerous subtechnologies emerging to solve common problems. With that in mind, how can a Java developer hope to keep up? This glossary of important XML acronyms will help you dip your feet into the XML pond. I assume some XML knowledge: that you understand attributes and elements and know what an XML document looks like. To further aid your knowledge, for each technology introduced here, I've included a list of related Websites in Resources.

XML's flexibility comes with a price, however. Consider HTML, XML's predecessor. HTML sports a fixed tag set (although in the early days of the Web, Netscape and others defined new tags at such a frantic pace that at times the tag set seemed anything but fixed), which allows developers to completely specify an HTML tool's functionality -- up front. No such guarantee exists for many XML tools. Instead, these tools must be as flexible as the XML they work with. They must "do the right thing" when they encounter a novel tag in their XML input. They must expect the unexpected.

Learn how the Data Transfer Object design pattern is implemented in Java ME architectures and why you might want to use XML-based DTOs rather than Java beans for faster data interchange between tiers. Author Mario La Menza presents an example architecture that supports the use of XML-based DTOs and also introduces MockMe, a free tool for generating XML-based mock objects so you can test the presentation layer of your Java mobile applications without waiting for server-side data.

My last JavaWorld article, "Simplify XML Processing with VTD-XML," looked at three important benefits of VTD-XML: performance, memory usage, and ease of use. VTD-XML makes XML applications not only easier to write, but also leaner and faster.
XML applications written in VTD-XML are 10 times more responsive when compared to the same applications written with the Document Object Model (DOM) and are capable of serving 10 times the workload, while maintaining the same quality of services for proportionally bigger XML messages. Prior to this opportunity, I had always thought of XML as a "new and improved" HTML. I learned, however, that what began life as a "new and improved" HTML had application in domains far removed from Web publishing. In real data-processing situations, however, the structure of the input data often differs greatly from the eventual output structure. Since SAX passes SAX events to a programmer-defined handler in the order in which they appear in the input XML, as the programmer you are responsible for any data restructuring or reordering. Also, if the same data is to be used in more than one place in the output, you must either perform multiple passes over the XML or arrange for the handler to "remember" that data while producing output. One example of this was the recipe title in Part 2, which the handler maintained in an internal variable for use both in the browser title bar and in the Webpage. Recently, we've explored the interface between Java and XML. More specifically, we've been looking at ways to use Java's ability to dynamically load code to improve our ability to process XML -- in particular, to handle XML tags we didn't specifically design our application to handle. For processing XML documents, most XML tools work with the SAX or DOM API. In this article, we'll look at a way to implement the same APIs directly over a database, enabling XML tools to treat databases as if they were XML documents. That way, we can obviate the need of converting a database. My last two JavaWorld articles focused on two key benefits of VTD-XML: high-performance parsing and incremental update. Both are quite essential for a high-performance WSS implementation. Why? 
First, VTD-XML parses XML messages five to 10 times faster than DOM parsers, consumes just one-third to one-fifth of the memory, and, more importantly, exports a hierarchical view of XML Information Set (Infoset) that one can navigate back, forth, and sideways. Second, VTD-XML internally keeps an XML message intact and undecoded, meaning reserializing the parts of the SOAP message irrelevant to the security token computation is no longer necessary. When the security tokens are generated, just stick them anyplace you want in the message. In The J2EE Tutorial, Sun Microsystems briefly mentions that one client option is to embed applets in Webpages. However, the tutorial is unclear on how to integrate an applet into a J2EE system; instead, it concentrates on Web components (servlets and JSPs). This is due to the simplicity of those technologies; no plug-ins are required. Sun also mentions that you can use JSPs for outputting XML documents, which is generally useful for producing data in a standardized computer-readable format for Web service consumption. XML's emergence did not initially make our lives easier—at best, they did not change much. We quickly started writing our many data formats using angle brackets, which looked neat, but did not make much of a difference. But, over the years, many new XML-based specifications with disparate and often unique grammars have surfaced. How these grammars, and XML in general, fit into the world of object-oriented technology, at first, did not seem completely clear. As a Java developer you use XML every day in your build scripts, deployment descriptors, configuration files, object-relational mapping files and more. Creating all these XML files can be tedious, but it's not especially challenging. Manipulating or merging the data contained in such disparate files, however, can be difficult and time-consuming. 
You might prefer to use several files split into different modules, but find yourself limited to one large file because that is the only format the XML's intended consumer can understand. You might want to override particular elements in a large file, but find yourself replicating the file's entire contents instead. Maybe you just lack the time to create the XSL transformations (XSLT) that would make it easier to manipulate XML elements in your documents. Whatever the case, it seems nothing is ever as easy as it should be when it comes to merging the elements in your XML files. In this column, I briefly explore the ebXML Registry standard, its architecture, maturity, industry adoption, and how an ebXML registry can use a service-oriented architecture (SOA) repository. The Jakarta POI HSSF API provides classes to create an Excel workbook and add spreadsheets to the workbook. With the POI API, the HSSFWorkbook class represents a workbook, and you set the spreadsheet fonts, sheet order, and cell styles in the HSSFWorkbook class. You can represent the spreadsheet using the HSSFSheet class. Specifically, you set the sheet layout, including the column widths, margins, header, footer, and print setup using the HSSFSheet class. You can represent a spreadsheet row using the HSSFRow class, and you set the row height using the HSSFRow class. The HSSFCell class represents a cell in a spreadsheet row, and you set the cell style using the HSSFCell class. The indexing of spreadsheets in a workbook, of rows in a spreadsheet, and of cells in a row is zero based. In this article, we'll show how to convert an example XML document to an Excel spreadsheet and then convert the spreadsheet to an XML document. Listing 1 shows the example document, incomestatements.xml. This article is a follow-up to my introductory article, "XML for the absolute beginner", in the April 1999 issue of JavaWorld (see the Resources section below for the URL). 
That article described XML; I will now build on that description and show in detail how to create an application that uses the Simple API for XML (SAX), a lightweight and powerful standard Java API for processing XML.

This month, I wish to carry the thread further -- parsing and validating are fine as far as they go, but they don't go very far. The problem at hand typically involves doing something with the parsed information. But what if you don't understand the tags used to generate the information? Come walk with me a bit farther along the border between Java and XML, and I'll show you how to use Java to solve that problem, too.

For the purpose of this article I'm going to assume that you know what JavaServer Pages (JSP) and Extensible Markup Language (XML) are, but you may be a little unclear on how you can use them. JSP use is pretty easy to defend. It allows you to design a Website built from files that look and act a lot like HTML. The only difference is that JSPs also act dynamically -- for example, they can process forms or read databases -- using Java as a server-side scripting language. XML use is more difficult to justify. While it seems as if every new product supports it, each one seems to be using XML for a different purpose.

This article explores such distributed applications written in Java. I'll focus on the communication of XML between Java code running in different virtual machines.

The good news is that many of the limitations of HTML have been overcome in XML, the Extensible Markup Language. XML is easily comprehensible to anyone who understands HTML, but it is much more powerful. More than just a markup language, XML is a metalanguage -- a language used to define new markup languages. With XML, you can create a language crafted specifically for your application or domain.

The Apache Jakarta site is home to many well-known Java open source projects, including Tomcat, Ant, and log4j.
A lesser-known subproject of Jakarta is Jakarta Commons, a repository of reusable Java components. These components, such as Commons BeanUtils, Commons DBCP, and Commons Logging, alleviate the pain of some standard programming tasks. This article will focus on the Jakarta Commons Digester, a utility that maps XML files to Java objects. We believe XML databases provide a schema-agnostic and ubiquitous representation of information and can enable enterprise information integration (EII). By the end of this article, you will have enough information to deduce that XML databases combined with Web services enable the flow of information across loosely-coupled applications, resulting in a more responsive architecture and compelling return on investment. You'll also remember that an XML parser checks that the document is well formed (meaning that roughly all of the open and close tags match and don't overlap in nonsensical ways). But even well-formed documents can contain meaningless data or have a senseless structure. How can such conditions be detected and reported? Either way, performance suffers, as exemplified by Apache Axis. On its FAQ page, Axis claims to internally use SAX to create a higher-performing implementation, yet it still builds its own object model that is quite DOM-like, resulting in negligible performance improvements when compared with its predecessor (Apache SOAP). In addition, SAX doesn't work well with XPath, and in general can't drive XSLT (Extensible Stylesheet Language Transformation) processing. So SAX parsing skirts the real problems of XML processing. XML Basics for Java Developers, Part 4 In part four in a series of XML basics for Java developers book excerpts from Learning Java, 2nd Edition, learn about validating documents. Currently, we have standard APIs for representing an XML document as a tree of objects through the W3C's DOM specification, and as a series of events through the SAX API. 
JAXP 1.0 gave us a standard Java API for XML parsers, and JAXP 1.1 expands on this to include a standard API for XSLT engines. This standard API is the Transformation API for XML, or TRaX for short. I will cover TRaX basic usage and explain the top-level interfaces to show how powerful this API is. The specific TRaX implementation that I am working from is the Xalan-Java 2 XSLT processor from the Apache project. One alternative approach is to use XML for the email templates, and that's the approach I'm going to discuss in this article. XML provides great flexibility in how you structure your templates and does not have the same strict formatting rules as property files do, so it's easier to maintain large strings. The main downside is that XML files can be trickier to deal with than property files. With a property file, it's easy to load the files and it's easy to access the properties once they've been loaded. On the other hand, it takes more work to load the XML file and process it using one of the many XML processing libraries provided for use with Java. According to conventional wisdom, the only way for parties to exchange financial transactions is asynchronous messaging, primarily because each party uses its own, different, largely incompatible core application software. In this heterogeneous application environment, transmission and reception of messages remains the only way for disparate applications to communicate. This article has presented a method of parsing XML documents with concurrent programming. It has also explained the ideas behind the producer-consumer model, as well as thread coordination. XML Basics for Java Developers, Part 5 In this final in a series of XML basics for Java developers book excerpts from Learning Java, 2nd Edition, get an introduction to XSL/XSLT and Web services. Apache Xindice is a native XML database in which XML documents may be stored, queried, and modified. 
The advantage of a native database over a relational database is that mapping of XML to SQL is not required. Instead, XPath is used to query the Xindice database and XML:DB XUpdate is used to update the database. Xindice implements the Java XML:DB API to add, query, and update XML documents in the Xindice database. XML documents in the Xindice database are stored in collections; a collection may consist of one or more XML documents. Xindice also provides a command-line tool which has the same functionality as the XML:DB API.

This chapter begins our look at specific Java and XML topics. So far, we have covered the basics of using XML from Java, looking at the SAX and DOM APIs to manipulate XML and the fundamentals of using and creating XML itself. We've also looked at how JDOM can provide a more Java-centric means of using our XML data and documents within Java programs. Now that you have a grasp on using XML from your code, we will spend time on specific applications. The next six chapters represent the most significant applications of XML, and, in particular, how those applications are implemented in the Java space. While there are literally hundreds and soon to be thousands of important applications of XML, the topics in these chapters are those that continually seem to be in the spotlight, and that have a significant potential to change the way traditional development processes occur.
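As one concrete example of using XML from Java code, the sketch below drives an XSLT stylesheet through the JAXP transformation API (the TRaX interfaces introduced with JAXP 1.1, discussed earlier). The stylesheet, the sample document, and the names `TraxSketch` and `transform` are illustrative assumptions and are not taken from any of the articles; only the `javax.xml.transform` calls are standard.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class TraxSketch {

    // Hypothetical stylesheet: writes the text of every <title> element,
    // separated by ", ", as plain text.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        + "<xsl:for-each select='//title'>"
        + "<xsl:value-of select='.'/>"
        + "<xsl:if test='position() != last()'><xsl:text>, </xsl:text></xsl:if>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the given stylesheet to the given document and returns the result as a string.
    static String transform(String xml, String xslt) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so that callers of this sketch need no checked-exception handling.
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample data.
        String xml = "<recipes>"
                   + "<recipe><title>Lime Jello</title></recipe>"
                   + "<recipe><title>Pesto</title></recipe>"
                   + "</recipes>";
        System.out.println(transform(xml, STYLESHEET)); // prints "Lime Jello, Pesto"
    }
}
```

Because the caller only sees `TransformerFactory` and `Transformer`, the XSLT engine behind them can be swapped without touching this code, which is the point of having a standard transformation API.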
This paper is mostly about how code generation aids speedier application software development. However, this paper will also highlight the fact that source code generation processing is a particular application of the broad technology of XML based document transformation. The advent of container managed relations and local references has opened exciting new avenues in enterprise application development using EJBs. In this article I will take you through a powerful way of using EJB 2.0 in conjunction with bean introspection and JAXP to create dynamic XML based data structures that can be transferred between your enterprise tier and presentation tier. The use of XML for transferring data from the enterprise tier to the client tier helps you implement loose coupling between the various tiers in your applications; however, as and when you add new domain objects to your entity model, you may need to add classes responsible for creating new DOM structures for the added entities. In this article we will develop a framework that can dynamically traverse the container managed persistent and related fields of a given local EJB and create an XML document that can be transferred between the various tiers in the application. This method will provide the following advantages: This book excerpt is from Chapter 15 of "JavaServer Pages, 3rd Edition" by Hans Bergsten, ISBN 0596005636, copyright 2004, 2002, 2001. All rights reserved. This chapter, titled "Working with XML Data," is posted with permission from O'Reilly & Associates. XMLTask is an external task for the popular build tool Ant that permits complex manipulations of XML in a simple and consistent fashion, without having to deal with XML style sheets (XSL). XMLTask can be used for many common tasks that developers face, including manipulating J2EE and Spring descriptors, creating XHTML websites, and driving workflows via XML configuration files. 
Consumers find themselves with a wide array of communications options: snail mail, phone, email, instant messenger, video phone, web publishing, fax, and, increasingly, Really Simple Syndication (RSS) feeds, driven by everything from newspapers to weblogs. Everywhere, individuals and businesses are trying to make the most effective use of communications technology, while IT companies decide which technologies to bet on. In this context, XML pioneer Tim Bray joined Sun Microsystems in March 2004, where, as Director of Web Technologies in the Software CTO Office, he plans to incorporate blogging software and content syndication based on the RSS format into Sun's software line, and help set the company's direction with respect to web services and search technology. XML digital signatures will enable a sender to cryptographically sign data, and the signatures can then be used as authentication credentials or a way to check data integrity. Neither Java nor XML Technology need an introduction, nor the synergy between the two: "Portable Code and Portable Data". With the growing interest in web services and e-business platforms, XML is joining Java in the developer's toolbox. As of today, no less than six extensions to the Java Platform empower the Java developer when building XML-based applications: I. Executive Summary Web services using XML standards is a new paradigm in the way B2B collaborations are modeled. It provides a conceptual and architectural foundation which can be implemented using a variety of platforms and products. Today, developers can use the Java 2 Enterprise Edition (J2EE) to build XML-based web services. They can leverage existing J2EE technologies to build a complete and fully interoperable web service that complies with XML standards. Without radical reengineering, and without rebuilding a proven J2EE system, developers can construct complex and powerful web services applications. 
Before a run, org.suiterunner.Runner (Runner) instantiates each Reporter via the Reporter's no-arg constructor. Therefore, when you create a custom Reporter, you must give it a public no-arg constructor.

Bill Venners: That subclass hierarchy reminds me of JAXB, where classes map to the concepts represented in the data, not just to XML concepts. For example, classes that represent description or link elements in RSS, not just Element and Attribute.

You write in your book, "You can parse a plain text file with only partial knowledge of its format." How often do we lose the format specification, or is this more about not needing to "read the manual" -- the specification -- because the data is more user-friendly?

Digester requires an XML parser that conforms to JAXP version 1.1 or later, and it uses that parser's SAX interface for the actual parsing. It is easier to use than SAX alone, however, because Digester hides all the complex parsing maintenance. The other main API for XML parsing, DOM, uses too much memory to be a practical solution for large documents -- and don't you deal with large documents most of the time in the real world? Since Digester is just a layer over SAX, the difference in memory usage between DOM and Digester is the same as that between DOM and SAX.

This new feature definitely should appeal to existing Hibernate developers because it follows the same persistence methods as POJOs (plain old Java objects), requiring a minimal learning curve. The convenience of XML persistence should appeal to new users as well. This article describes the Hibernate 3 persistence method.

As concern has grown over the security and efficiency of Web-based applications, validation of user input has grown in importance in turn.
Relying on scripting for client-side validation is unmanageable, inefficient, and non-portable. Hard-coding data validation rules into server application code ties critical business logic to the presentation tier and makes maintenance extremely troublesome. Web developers need an efficient, secure, and flexible server-side data validation mechanism. A Java/XML-based data validation approach separates the implementation of common data validation reasoning code from the business rules and criteria data used to validate user input. The validation reasoning code is implemented in Java, while the business-specific rules and data are specified in XML. This approach provides a powerful and flexible way for application developers to specify data validation in a manner that is secure and easy to manage, while decoupling validation rules from the server-side business logic implementation.

Common Approaches to Data Validation

Before delving into the mechanics of Java/XML-based data validation, let's examine some common approaches to handling this problem, along with their relative strengths and weaknesses.

Client-side Scripting: Inflexible and Non-portable

Client-side data validation was the first generation of user input data validation technology for web applications. It is still widely used today. The data validation logic is typically implemented in HTML and JavaScript and embedded in the Web pages transmitted to the browser. While it provides faster response time by reducing the number of round trips to the server, this approach has the following shortcomings:

Whitespace Indentation and the JSON Option

The YAML file format is centered on the concept of whitespace indentation, which is used to indicate the hierarchical structure of data -- instead of nested XML tags or JSON braces ({}) and brackets ([]). It is, however, a superset of JSON. So when it is useful, you may break out of the whitespace flow and adopt a typical JSON-style syntax.
Its creators describe it as a "human-friendly data serialization standard for all programming languages." In my experience, its focus on "human-friendliness" is what sets it apart. However, even with the advent of XML, mapping from objects such as Java class instances to XML is not always trivial. In particular, using a "contract first" approach that defines an XML schema, namespaces, and XML data types can be arduous. Object-to-XML mapping (OXM) libraries such as JAXB, XMLBeans, and XStream have made OXM easier, and they've helped to define APIs that serve as the foundation of serialization for tools such as Spring Web Services (Spring-WS). These libraries work by generating classes or using Java annotations to map the objects to a defined XML schema automatically. So the application developer simply instantiates an object, populates the data, and tells the library to marshal the object to well-formed XML. The process then works in reverse when the application unmarshals the XML back into objects by feeding the XML into the library. When dealing with XML, you need a convenient representation of the XML data in memory. Such a representation will make finding nodes and their attributes by their "type" and "id" values and scanning or filtering the whole tree easy. This article offers Java programmers a solution to achieve this goal: an easy-to-use package for handling XML data in Java. First, it gives some practical advice on what is and isn't important in this data. But first, let's look at which problems you can address with pipestreaming libraries? It's all well and good to want this abstract way to manipulate XML, but what can you do with it? The IBM® DB2® Developer Workbench (DWB) provides out-of-the-box integrated development for DB2 9 pureXML. DWB is based on the Eclipse open source Integrated Development Environment (IDE). Learn how the DWB resources, perspectives, views, editors, and wizards assist you to work with the XML functionality in DWB. 
This article explains how you can take advantage of Java programming to explore online XML data. Perhaps the biggest step is to find sites that are publishing the content in which you are interested. Once you find a site that interests you, the process of extracting data is the same for all XML documents: first, you request the document, then you parse it, and finally, you filter out the element and attribute data that are interesting. By using the standard XML parsers, you get a more robust tool than writing one yourself. In addition, by using an XML document, your code is more capable of handling any data rearrangement that an HTML parser might miss. In answer to this question, this article demonstrates a few important SOA principles with straightforward XML and some Java code. It doesn't attempt to cover everything in the SOA universe; instead, the coverage is restricted to a few key areas. For example, you can conceivably use RSS to distribute XML service definitions. However, for this article's example, the transport mechanism uses Java facilities. In the first article of this series I looked at the performance of some of the leading XML document models in Java. Performance is only part of the story when it comes to choosing a technology of this type, though. Ease of use is at least as important, and that's been one of the main arguments presented in favor of using Java-specific models rather than the language-independent DOM. With that exception, though, what do you really gain by using XML as the format for your configuration data? I think there are a couple of other good reasons, but I'm not telling -- I'd like to hear from you. Check out the Resources section, click on XML and Java technology forum, and let me know. I'm curious to hear what advantages you think XML provides for configuration data. Imagine that you're launching a Web magazine of formal poetry called Stanza Web. 
You can use Atom for updating the site, to list new verse, articles, and other features, and to gather information from other sites of interest. The first step is to post an article welcoming readers. In order to do so, the Atom API specifies that you wrap the content in an XML document comprising an Atom entry element, and send this document as an HTTP POST message to the Web server at a special URI called the PostURI. Listing 1 is an example of this XML document. The test directory of JELDoclet includes a set of test Java files. This command parses all the Java files in the test directory and creates a file called out.xml, which contains all the information in the Javadoc tree. Listing 1 shows a portion of this output XML file. While the underlying RXP GPL library is almost certainly the fastest validating XML parser you can find, the actual parser code is quite under-documented, and comes with just one simple example of a command-line tool. This tool, rxp, is similar to the utility xmlcat.py (which I presented in my tip Command-line XML processing) as well as a variety of similar utilities -- it reads XML documents, validates them, and outputs a canonical form. You can look through the source code for the file rxp.c to see the way that RXP parsing generates a compact document tree as a data structure. This article is the first of three XML Matters installments that discuss RELAX NG. This installment will look at the general semantics of RELAX NG, and touch on datatyping. The next installment will look at tools and libraries for working with RELAX NG. The final installment will discuss the RELAX NG compact syntax in more detail. Many of the changes to the JAXP API have centered around parsing, which makes sense, given that the "P" in JAXP stands for "parsing." But the most significant changes in JAXP 1.1 center around XML transformations, which I will cover later in this article. In terms of the existing JAXP functionality, the changes are fairly minor. 
The biggest addition is support for SAX 2.0, which went final in May of 2000, and DOM Level 2, which is still being finalized. The previous version of JAXP only supported SAX 1.0 and DOM Level 1. This lack of updated standards has been one of the biggest criticisms of JAXP 1.0.

XML 1.0 (Second Edition) [W3C Recommendation] is, of course, the trunk of the sprawling XML technology tree. It builds on Unicode [Unicode Consortium technical report and ISO standard] to define strict rules for text format as well as the Document Type Definition (DTD) validation language. The current (second) edition of the specification contains accumulated corrections to the specification. It has been widely translated, although the English version is the only normative one, meaning the only one that is intended to carry the force of standardization.

Uche Ogbuji is a consultant and co-founder of Fourthought Inc., a software vendor and consultancy specializing in XML solutions for enterprise knowledge management. Fourthought develops 4Suite, an open source platform for XML, RDF, and knowledge management applications. Mr. Ogbuji is also a lead developer of the Versa RDF query language. He is a computer engineer and writer born in Nigeria, living and working in Boulder, Colorado, USA.

All of this often leads developers to either put JAXB down, or learn a lot more about XML, SAX, and DOM. At that point, many developers then move on to using SAX and DOM for persistence, and keep JAXB simply for its simplest function: converting between XML and Java objects.

Java developers working with XML documents in memory can choose to use either a standard DOM representation or any of several Java-specific models. This flexibility has helped establish Java as a great platform for XML work. However, as the number of different models has grown it has become difficult to determine how the models compare in terms of features, performance, and ease of use.
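For readers who have not used the standard DOM representation mentioned above, the following minimal sketch shows the usual JAXP route: obtain a `DocumentBuilder`, parse the text into an `org.w3c.dom.Document`, and walk the tree. The class name `DomSketch`, its helper methods, and the sample document are assumptions made for illustration; the Java-specific models are not shown.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class DomSketch {

    // Parses XML text into an in-memory DOM tree using whatever parser JAXP locates.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Wrapped so that callers of this sketch need no checked-exception handling.
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    // Returns the names of all elements under the given node, in document order.
    static List<String> elementNames(Node node) {
        List<String> names = new ArrayList<>();
        collect(node, names);
        return names;
    }

    private static void collect(Node node, List<String> names) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            names.add(node.getNodeName());
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collect(children.item(i), names);
        }
    }

    public static void main(String[] args) {
        // Made-up sample data.
        Document doc = parse("<recipe><title>Lime Jello</title><ingredient/></recipe>");
        System.out.println(elementNames(doc)); // prints [recipe, title, ingredient]
    }
}
```

The index-based `NodeList` loop and the node-type check are typical of the language-independent DOM interfaces, and they are the kind of detail the ease-of-use argument for Java-specific models tends to point at.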
The newer Java language APIs -- JAXP, JAXB, JAX-WS, and more -- parse XML so easily that XML parsing is now a fundamental aspect of Java programming; potential problems arise when abstractions in the higher-level APIs cause a loss of fine-grained control over the interactions between parser and data. In this article, I'll show you how the Simple API for XML (SAX) delivers an easy-to-use vehicle to deal with those errors, one you can use even when you're not using SAX directly.

Before you create an XML data model, you must first have a form. The form can be an existing XFDL form or, if you just want to learn the concept, it could be a new form you create using IBM Workplace Forms Designer. For your convenience, we created a simple leave application form to use in this article. You can download this form from the Download section of this article. Detailed instructions on how to create this form are not discussed here.

XML-RPC is a remote function invocation protocol with a great virtue: It is worse than all of its competitors. Compared to Java RMI or CORBA or COM, XML-RPC is impoverished in the type of data it can transmit and obese in its message size. XML-RPC abuses the HTTP protocol to circumvent firewalls that exist for good reasons, and as a consequence transmits messages lacking statefulness and incurs channel bottlenecks. Compared to SOAP, XML-RPC lacks both important security mechanisms and a robust object model. As a data representation, XML-RPC is slow, cumbersome, and incomplete compared to native programming language mechanisms like Java's serialize, Python's pickle, Perl's Data::Dumper, or similar modules for Ruby, Lisp, PHP, and many other languages.

XML is a great tool for putting together this issue tracker. Descriptions of issues and action items, and the related discussions, require flexible representation, but structure is important to maintaining the semantics of the data.
In our example, the application has already been developed, and basic techniques are used for tasks such as sending action item reminders to users, supporting search and browsing, and so on. However, the developers have decided to start using RDF in the application in order to take advantage of the many existing tools and techniques that are available for RDF processing. SQL/XML is an extension of the SQL language standard (ANSI/ISO) that includes XML publishing functions for converting relational data into XML. IBM DB2 Universal Database for Linux®, UNIX®, and Windows® (DB2 UDB) includes built-in SQL/XML publishing functions that make it easy to publish DB2 UDB data in an XML document. These functions let you create tagged XML documents in character large objects (of type CLOB, one of the DB2 UDB built-in data types). You can use a SELECT statement to assemble the required XML nodes, and capture the marked-up text by directing the output to a file. You can also use an INSERT statement to write the generated text to a table. The COLUMNS clause is used to transform XML data into relational data. Each of the entries in this clause defines a column with a column name and a SQL data type. In the example above, the returned rows have 3 columns named empID, firstname and lastname of data type Integer, Varchar(20) and Varchar(25), respectively. The values for each column are extracted from the employee elements, which are produced by the row-generating XQuery expression, and cast to the SQL data types. For example, the path name/first is applied to each employee element to obtain the value for the column firstname. The row-generating expression provides the context for the column-generating expressions. In other words, you can typically append the column-generating expressions to the row-generating expression to get an intuitive idea of what a given XMLTABLE function returns in its columns. 
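The relationship between the row-generating and column-generating expressions can be imitated outside the database with the standard javax.xml.xpath API. The sketch below is not DB2's XMLTABLE. It is a plain Java analogy under stated assumptions: the sample document, the row expression /dept/employee, and the @id path for empID are hypothetical, and only the employee element and the name/first path come from the text above. Each row node serves as the context for the relative column paths, and no casting to SQL types is attempted.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XmlTableSketch {

    // Hypothetical sample data; only "employee" and "name/first" appear in the text.
    static final String SAMPLE =
          "<dept>"
        + "<employee id=\"901\"><name><first>John</first><last>Doe</last></name></employee>"
        + "<employee id=\"902\"><name><first>Peter</first><last>Pan</last></name></employee>"
        + "</dept>";

    // Returns one String[] {empID, firstname, lastname} per employee element.
    static List<String[]> extractRows(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Row-generating expression (assumed): one node per output row.
            NodeList employees = (NodeList) xpath.evaluate(
                    "/dept/employee", doc, XPathConstants.NODESET);

            List<String[]> rows = new ArrayList<>();
            for (int i = 0; i < employees.getLength(); i++) {
                Node employee = employees.item(i);
                // Column-generating expressions, evaluated relative to the row node.
                rows.add(new String[] {
                    xpath.evaluate("@id", employee),         // empID (assumed path)
                    xpath.evaluate("name/first", employee),  // firstname
                    xpath.evaluate("name/last", employee)    // lastname (assumed path)
                });
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate the sample XML", e);
        }
    }

    public static void main(String[] args) {
        for (String[] row : extractRows(SAMPLE)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

Appending a column path to the row path, for example /dept/employee/name/first, gives the same intuition the text describes: it names the set of values that end up in the firstname column.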
XML is an extremely versatile data transport format, but despite high hopes for it, XML is mediocre to poor as a data storage and access format. It is not nearly time to throw away your (SQL) relational databases that are tuned to quickly and reliably query complex data. So just what is the relationship between XML and the relational data model? And more specifically, what's a good design approach for projects that utilize both XML and relational databases -- with data transitions between the two? This column discusses how abstract theories of data models, as conceptualized by computer scientists, help us develop specific multirepresentational data flows. Future columns will look at specific code and tools to aid the transitions; this column addresses the design considerations. DB2 doesn't require any special configuration to enable you to develop or run Java applications that work with XML data. Indeed, you can write, test, and debug your Java programs using the integrated development environment (IDE) of your choice or by working directly with a supported Java Developer Kit (JDK) from the command line. However, because DB2 Viper ships with a Developer Workbench, the examples in this article use its development environment. This section discusses how to configure the Developer Workbench, reviews some sample data, and explores database configuration parameters that may be of interest to you. Python: Special Interest Group for XML Processing in Python; "Python & XML" column on XML.com; "The State of the Python-XML Art, 2003"; "Uche Ogbuji's Akara site on XML processing in Python" Among Lisp/Scheme enthusiasts, the starting point for the SSAX library for Scheme is the observation that XML is semantically almost identical to the nested list-oriented data structures native to Lisp-like languages. Anything you can represent in XML can be straightforwardly represented as SXML -- Scheme lists nesting the same data as the original XML. 
Moreover, Scheme comes with a rich library of list and tree manipulation functions, and a history of contemplating manipulation of those very structures. A natural fit, perhaps. The XML-RPC Web site includes links for client and server implementations of the spec in multiple languages, including Java programming language, Ruby, Python, C/C++, and Perl. The XML features in DB2 Viper (now in beta) include new storage management, indexing, and query language support. In this article, learn how to query data in DB2 XML columns using SQL or SQL with XML extensions (SQL/XML). A future article will discuss DB2's new support for XQuery, an emerging industry standard, and explore when it can be most useful. Jump start the development of your Java applications using the new pureXML features of DB2 9, and integrate pureXML into your application development environment. There is a difference in the way a CGI script runs and the way this XML-RPC server runs. The XML-RPC server is its own process (and uses its own port). CGI scripts, on the other hand, are automatically generated by a general HTTP server. But both still travel over HTTP (or HTTPS) layers, so any issues with firewalls, statefulness, and the like remain identical. Moreover, some general-purpose HTTP servers support XML-RPC internally. But if, like me, you do not control the configuration of your Web host, it is easier to write a stand-alone XML-RPC server like the eight-line version in Listing 5. XHTML 1.0 [W3C Recommendation] is mostly HTML 4 recast as well-formed XML. HTML is an SGML application, and when XML was developed as a simplification and specialization of SGML for the Web, HTML (itself the lingua franca of the Web) became the chief candidate for adoption to XML. The result is a variation named XHTML. The goal of the XHTML work is an HTML language for which parsing is simpler (because of XML's stricter syntax). 
XHTML is easily processed using off-the-shelf XML tools, and strives to better separate content from presentation. XHTML is one of the oldest XML applications and has a huge number of contributing interests, resulting in many parts and versions. I'll do my best to summarize the lot. This article explains how to work with XML data in DB2 9.5 using JDBC. Learn how to perform both administrative tasks and application development tasks. In the early days of Java and XML programming, there were a lot of XML parsers (Xerces, XML4J in its prime, Sun's Crimson, Oracle's XML parser, and several others that few people today have ever heard of). When you wrote an application that worked and interacted with XML, you had to connect your SAX and DOM APIs to these parser implementations, usually by telling SAX or the DOM about the parser class name, sort of like this: Susan Malaika has been an IBM Academy of Technology member since 1995. She co-edited a book on the Web in 1996. She has worked in DB2 since 1998 and she specializes in XML and Web technologies, including Grid computing. Her personal interests include opera, film, plays, and, lately, science fiction. The new update functionality in DB2 9.5 reduces this process to a single step. The application simply sends an SQL UPDATE statement with an embedded XQuery transform expression to the DB2 server (Figure 2). This expression is part of the emerging XQuery standard for updating XML data. The "trick" xml_indexer uses is the same one that XPath uses. Rather than treat XML documents simply as "things" in the file system, I can pretend that the hierarchical nodes of an XML document look much like a hierarchical file system. For purposes of indexing, other than a need for a little syntax to distinguish an XPath from a file system path, I can simply treat an XML node as if it were itself a text file. Fortunately, I designed indexer with enough flexibility to use arbitrary identifiers in indexing texts. 
Let's look at some search results. Data binding allows you to directly map between the Java objects and XML, without having to deal with XML attributes and elements. Additionally, it allows Java developers to work with XML without having to spend hours boning up on XML specifications. Quick is one such data binding API -- a project that's geared toward business use in Java applications. In the XML Enhancements for Java (XJ) project (the initial release is available on IBM alphaWorks), we take a different approach. XJ integrates XML as a first-class construct in Java technology. Programmers can import declarations in W3C XML Schemas as if they were Java classes, they can write inline XPath expressions to navigate XML data, and can construct XML data by writing inline XML (if you are familiar with ECMAScript for XML, the construction aspect of XJ is similar to that technology; see Resources for more on E4X). Since knowledge of XML is embedded in the language, the compiler can check XML processing programs for correctness with respect to XML Schema declarations and perform optimizations so that applications run more efficiently. Compared to XML-based languages such as XSLT, the advantage here is that in XJ, XML processing applications have all of Java technology -- including the use of all existing libraries -- at their disposal. Many transformations are easier to write in XJ than in XSLT. As much as I hate to say it, XML tools simply have not reached the level of convenience of text utilities that are available at a Unix-like command-line. For line-oriented, whitespace- or comma-delimited-text files, it is quite amazing what you can accomplish with clever combinations of sed, grep, xargs, wc, cut, pipes, and short shell scripts. The pureXML™ technology in DB2® 9 is designed to provide the highest level of performance for XML data management. This article compares its performance with that of character large object (CLOB) and shredded XML storage. 
Many database systems allow you to store XML data as CLOBs or "shred" the data into relational tables. These two options are also supported in DB2® V8 through the XML Extender, which is still available unchanged in DB2 9 for backwards compatibility. However, they are superseded by the pureXML features. However, the authors do have some sense of proportion about their work. The candidate contains this passage: "The XML Signature ... does not normatively specify how keys are associated with persons or institutions, nor the meaning of the data being referenced and signed. Consequently, while this specification is an important component of secure XML applications, it is, by itself, not sufficient to address all application security/trust concerns, particularly with respect to using signed XML (or other data formats) as a basis of human-to-human communication and agreement. Such an application must specify additional key, algorithm, processing and rendering requirements." In short, the authors are cautioning against considering this work as a technical panacea; that it must be used within other security measures. This is wise, but begs the question of what's behind the XML curtain. To set up and develop an application with XStream is a simple matter of a few easy steps. Now that you know how to use XStream to serialize and deserialize Java objects and read configuration files, you can learn more about aliases, annotations, and converters at the XStream site (see Resources for tutorial links). Aliases and converters enable you to have complete control over the XML that is generated. Another early convention apart from SAX and DOM was developing tools that turn XML into generic data structures native to the language -- a process called unmarshalling -- and vice versa (marshalling). The idea is to make developers in a specific language feel at home and not have to really think about the XML behind the data. 
Unfortunately, many developers are hostile to XML and this is often the only way they can find it palatable. But even for those who are comfortable with XML, marshalling tools are useful for quick and dirty processing: JDOM is a DOM-like API that sticks strictly to Java-language idioms; Python users have ElementTree, which creates a specialized data structure from XML, focusing on elements; Perl users have the now rather dated XML::Grove, which interchanges parsed XML, HTML, or SGML with a tree of Perl hashes; Ruby users have XMLification for very simple translation of Ruby objects to XML; an option for PHP is class_path_parser.php, which allows you to register XPath-like expressions for an XML source and dispatches PHP handler functions accordingly; an option for Haskell is Haskell2Xml, which allows you to read and write ordinary Haskell data as XML documents. Again, how do you pass the parameters? The W3C has not proposed a mechanism, but it seems logical to define new processing instructions. XM recognizes xm-xsl-param, as well as xml-stylesheet. The syntax for xm-xsl-param is similar to the other processing instruction and uses two pseudo-attributes, name and value: You have just had a quick tour of some of Java's built-in XML processing capabilities. Although the programming style of working with XML does in some respects differ from that of the rest of the API, the upshot of learning this approach is that it is at least "mentally portable" to many other languages. (Even JavaScript is functionally similar, should you ever dabble in AJAX-style response processing.) Of course, those with a real need to do intensive XML munging should probably at least look in on the many third-party libraries, but you should not be afraid to take things on yourself if requirements so demand. Now you're ready to generate some XML. 
As before, you read the mapping file in, but this time you create a Marshaller, handing it a java.io.Writer (in this case a StringWriter that will store the XML for us to print). Then you have only to set the mapping on the marshaller and marshal the top-level object, in this case our book collection. XML stands for the eXtensible Markup Language. XML is used to identify key elements within a document, store information for later retrieval, exchange data between programs and much more. XML is an open standard that the World Wide Web Consortium began developing in 1996 so that everyone could use it. 106. XML meets Java Today, if a language does not support ODBC or JDBC (for database connectivity), it loses its effectiveness for enterprise applications. Similarly, in an XML world, it becomes imperative for programming languages to manipulate and interact with XML documents. This article provides an overview of how such interaction would happen in the case of the Java programming language. Briefly, you convert input.xml into a VTD indexed XML file in the makeIndexedFile() function up top. Later, in Line 29, you can just open it with a plain old filehandle and begin using it with just a single call to loadIndex() in Line 36. The interesting parts are where you set up an autopilot to traverse the XML input via an XPath created by selectXPath() in Line 34. Finally, you are able to call insertAttribute() each time evalXPath() gives you a hit and then, when you are done, you just call output2() to dump out the newly transformed XML file. 108. Minimal XML and Java * Most applications that use Java and XML fall into the e-commerce or eAI range, so they will benefit from the tight focus on a practical subset. * MinXML is easier to implement. As we will see, MinXML parsers are dramatically smaller and sometimes faster than regular XML parsers. For a long time, XML was not built into the Java API. 
Support for XML was primarily through third-party libraries (such as Apache Xerces or JDOM). Fortunately, that has changed, and now you can get the Java XML Pack, a toolset for dealing with everything XML in Java. The XML Pack brings together several of the key industry standards for XML, such as SAX, DOM, XSLT, SOAP, Universal Description, Discovery & Integration (UDDI), Electronic Business using Extensible Markup Language (ebXML), and Web Services Description Language (WSDL). The two common programmatic XML APIs (SAX and DOM) are now built into the core Java API (as of J2SE 1.4.0). I have not yet posted the code for HC at the ananas.org repository because it is not ready for prime time. I need to finalize the DFA compilation before I have anything new worth posting. However, I'm sure that the time invested in building automated tests (and learning about JUnit) was well spent; it will pay off over the life of this project. By next month, I expect to have a running (albeit not optimized) version of the compiler and I expect to be working on the proxy.