XML Family of Specifications

  • Authors:
  • Kenneth B. Sall

  • Venue:
  • XML Family of Specifications
  • Year:
  • 2002

Abstract

From the Book:

XML: It's a cheese spread. No, it's a floor wax. No, it's two, two, two products in one! Or maybe it's everything but the kitchen sink? Say, did you hear the one about the XML Kitchen Sink Language? (see http://blogspace.com/xkitchensink/)

XML: What It's All About

It has been said that XML, the Extensible Markup Language, will become the ASCII of the twenty-first century because it is rapidly becoming ubiquitous. XML is expected to have an impact on both the Web and application development comparable to that of Java and JavaScript because it has opened up a wide variety of new capabilities and has been embraced by so many sectors of human endeavor.

XML is a metalanguage: a syntax for describing other languages. These languages span diverse vertical industries including accounting, advertising, aerospace, agriculture, astronomy, automotive products, biology, chemistry, database management, e-commerce/EDI, education, financial institutions, health care, human resources, mathematics, publishing, real estate, software programs, supply chain management, and many more (for the many more, see http://www.xml.org/ml/industry_industrysectors.jsp).

In one sense, XML is really a very trivial thing: just a markup syntax for describing structured text using angle brackets. But in another sense, XML is a basic building block, an enabling technology that makes it possible to develop more complex, more interesting, and more powerful tools.

In the Web arena, XML is facilitating exciting improvements such as user-controllable views and filtering of information, creation of truly device-independent content that can be re-purposed for vastly different devices, highly focused searching based on element hierarchies, and more sophisticated and flexible linking mechanisms.
In the business and application arena, XML makes it easier to deliver filtered content from databases, to more readily share data between applications and between companies, and to exchange EDI messages that describe complex transactions. In the scientific arena, XML is a natural fit for describing complex datasets, models, control of instruments, images, chemical compounds, and much more.

Just as Java made data processing platform-independent, XML has done the same for data, making the exchange of information much easier than ever before. But, no, XML is not the kitchen sink; it is not the solution to all of the world's problems in one tidy package; nor is it the solution to all your computer needs either, at least not alone. Rather, XML is a tool, or more accurately, a set of tools from the same toolbox. That toolbox is the XML family of specifications. This book will help you see what XML can and cannot do by describing how to use each tool.

Although XML shares a number of concepts with its ancestor, SGML (Standard Generalized Markup Language), XML is said to yield 80 percent of the benefits of SGML, but with only 20 percent of the complexity. It is precisely this 80/20 rule that has excited countless companies and developers, encouraging them to support the efforts of the World Wide Web Consortium (W3C) in the development of XML. A few of the more than 500 companies and organizations that actively support XML development as members of the W3C include IBM, Sun, Microsoft, Oracle, Commerce One, and NASA.

Audience: Who Should Read This Book?

The book is intended for Web developers, which includes programmers, content writers, and designers. Depending on your background and interests, some chapters may be more relevant to you than others. It's intended for those who may be familiar with particular aspects of XML but who have not been formally exposed to all of the major W3C specifications, as well as those who have never dealt with XML before.
Later in this preface, I provide a roadmap to help orient you. I've assumed that most readers are familiar with HTML elements and syntax, although the XML and DTD syntax discussions in Chapters 3 and 4 pretty much cover the concepts of elements, attributes, types, entities, and content that carry over from HTML to XML. In other words, you can get by without knowing HTML, except for the XHTML chapter, which will make much more sense to you if you do. For those who would like to brush up on HTML, see "For Further Exploration: HTML and Java" at the end of this preface.

Some examples require programming knowledge, but for most examples, anyone with general Web development skills will find them beneficial. Generally, scope and breadth of treatment are favored over depth. On the other hand, some readers will find that the depth is more than they expected, but they should still be able to "tread water." My intent in writing this book was to cover a number of XML-related technologies in varying degrees of detail. I'd like to make it clear that although there are three chapters containing Java examples, this is not a book about Java and XML. You don't need a Java background for the vast majority of what's in this book.

Although I do assume the Windows operating system, this is not a statement of preference. My formative years were spent on UNIX (I still use UNIX utilities to maintain a ski club site) at the office and a Mac at home. Rather, since Windows tends to be somewhat ubiquitous, it seems appropriate to show Windows command lines and mention some Windows-only tools. UNIX and Mac users are encouraged to share their experiences with fellow readers via the book's Web site.
Personally, I have found cygwin, a UNIX environment for Windows developed by Red Hat, to be very handy (see http://cygwin.com/).

What's Special About This Book?

There are several features that contribute to making this book an invaluable resource for anyone beginning to plunge into the somewhat turbulent "seas" of XML.

XML Family of Specifications Big Picture: Since early 1998, I've periodically updated a diagram I call "The Big Picture of the XML Family of Specifications." This unique diagram (front inside cover) depicts virtually all of the key W3C efforts related to XML, with colors to indicate each specification's status (maturity); it includes related non-W3C efforts as well. Physical positioning denotes a relationship among neighboring specifications, as explained in Chapter 2. Best of all, the Big Picture diagram appears as an imagemap on the CD-ROM and on this book's Web site, possibly as a more up-to-date version. The Big Picture imagemap on the Web site expands acronyms as your mouse hovers over a term. Clicking on the acronym or name connects you instantly to the actual specification or, in some cases, a collection of documents relating to that specification.

History Timeline: A detailed "History of the Web and XML" in timeline form (the product of a considerable amount of research) is broken down into three time periods in Chapter 1, which should be interesting to many readers. Historical perspectives are also presented for particular specifications in their own chapters. A rather unique pullout at the back of the book shows, in bar chart format, the gestation periods of all of the XML specifications in this book, giving you a visual picture of what developments occurred in sequence and/or in tandem.

Coverage: I've selected what are generally considered to be the most significant XML-related specifications from the W3C: XML/DTDs, XML Namespaces, XML Schema, the DOM, CSS, XSLT, XPath, XSLFO, XLink, XPointer, XHTML, and RDF.
Several of the less frequently discussed specifications, such as XML Infoset, Canonical XML, XML Base, and XML Inclusions, are also covered. In addition, I've included four topics that are not under the purview of the W3C: RDDL, SAX, JDOM, and JAXP. The focus is on breadth rather than depth of coverage because if you have a general understanding of a lot of XML topics, you can better appreciate which are most relevant to your needs and you can "drill down" to the details by following the links I provide. The hope is that as you become more familiar with each of the topics I present, you'll know which areas you'll want to explore by buying more specialized Addison-Wesley or Prentice Hall books (e.g., about XSLT, XML with Java, or XHTML). I've tried hard to make the information current and have spent a good bit of time in the final months polishing and updating details here and there. All topics are as up-to-date as possible, except where noted otherwise.

For Further Exploration: Each chapter ends with a section called "For Further Exploration," which presents quite a few links that not only serve as my bibliography but also point to resources that contain more details than what can be provided here without killing way more than my fair share of trees. Links are provided to the specifications themselves, to articles that explain the specs in more everyday language than the precision required for formal specifications, and to articles describing subtleties or nuances of the specs. Links to tutorials, books, software, special references, and so on are also supplied. My intention is that readers will use the links, so they all appear in HTML form on the book's CD. Professors may wish to consider some of these links for students' research assignments.

Tables: I'm a big fan of the use of tables. When I read a technical book, I seldom read it word for word, cover to cover.
Often I want to locate some particular detail pretty quickly, so I look it up in the table of contents or index; I don't want to have to skim through paragraph after paragraph to find the little tidbit I need. Therefore, I feel that tables will help you do the same thing, maximizing the use of your time. The List of Tables is something with which you might want to familiarize yourself. Let a table be your friend.

CD-ROM: The CD that accompanies the book contains all the sample code presented in the text, as well as most of the software I used while writing this book, including the following:
- Code Examples: every example that appears as a code listing plus a number of variations
- XML Environment: batch files to simplify using XML with Java on Windows operating systems
- For Further Exploration: all links from the end of each chapter
- Big Picture of XML Family of Specifications Imagemap: links to more than 60 specifications, including many not covered in this book (see Chapter 2)
- W3C XML Specifications in PDF Form: every W3C specification discussed in this book is available (unedited) for offline reading (hours and hours of fun for the whole family)
- Glossary of terms
- Chapter 12, "Practical Formatting Using XSLFO" by G. Ken Holman, in HTML format with two useful appendices which aren't included in the printed book
- Freeware and evaluation copies of commercial software (XML/DTD/XML Schema editors, validators, parsers, XSLT processors, and more)

Web Site: The book's main Web site is hosted by Web Developer's Virtual Library, an Internet.com site. I maintain the extensive XML section of WDVL.com. The book's URL there is http://WDVL.Internet.com/Authoring/Languages/XML/XML-Family. There you'll find all the links from the "For Further Exploration" sections organized by chapter, as well as the online version of the Big Picture imagemap, and of course the inevitable corrections to the text. While this material appears on the CD-ROM, the Web site versions may be more up-to-date.
The Web site will be updated periodically; you can register to receive e-mail when the site is updated, if you wish.

Organization and Roadmap: How You Should Read This Book

This book is divided into five conceptual parts. With the exception of a few chapters in Part I, it is not absolutely necessary to read this book chapter by chapter (and I'll tell you right up front: "the butler did it"). Chapter 1, "History of the Web and XML," provides an interesting historical perspective of the development of XML, but some readers may prefer to skip it entirely, or at least defer reading it until they've completed other chapters or find themselves on a long, boring plane flight with neither good movies nor readable magazines. Readers without a Java background may wish to gloss over the three chapters that contain Java examples, instead focusing on the concepts that are discussed in these chapters. The following describes the book's organization and suggested reading emphasis.

Introduction: History of the Web and XML. As mentioned, Chapter 1 provides an historical perspective. It's divided into three eras: Ancient History (1945 to 1984), Medieval History (1986 to 1994), and Modern History: From HTML to XML (1994 to 2001).

Part I: Fundamental XML Concepts and Syntax. This part introduces XML Syntax, DTD Syntax, the XML Infoset abstraction, Canonical XML, Namespaces, RDDL (Resource Directory Description Language), and XML Schema, corresponding to Chapters 2 through 6, intended to be read in sequence. All readers should read these chapters, although if you won't be developing your own vocabularies, you might be able to skim the DTD and XML Schema chapters (4 and 6, respectively).
Although XML Schema is expected to replace the use of DTDs in many applications, your own project needs may dictate sticking with DTDs, in which case you could skip the XML Schema chapter, although I still recommend that you read the sections in Chapters 4 and 6 that highlight DTD limitations and XML Schema advantages. If you are tempted to skip the chapter on Infoset, Canonical XML, Namespaces, and RDDL (Chapter 5), be sure to at least read the Namespaces section because this concept is central to many XML specifications. All chapters following 5 assume you are familiar with XML Namespaces. Although RDDL is a recent grassroots effort as I write this, it's bound to have gathered a lot of momentum by the time you read this.

Part II: Parsing and Programming APIs. This part presents SAX (Simple API for XML), DOM (Document Object Model), JAXP (Java API for XML Processing), and JDOM, in Chapters 7 through 9. All of these are application programming interfaces (APIs) for parsing and manipulating XML documents. This is the part of the book with the most Java examples. While all readers are encouraged to read the initial sections of the SAX and DOM chapters, non-Java developers can completely skip Chapter 9, which covers JAXP and JDOM, as well as the code examples in the SAX and DOM chapters. However, be sure to read the explanation of parsing at the beginning of Chapter 7 and study the comparison, "SAX vs. DOM vs. JDOM vs. JAXP: Who Wins?" at the end of Chapter 9.

Part III: Displaying and Transforming XML. This part covers CSS (Cascading Style Sheets), XSLT (Extensible Stylesheet Language Transformations), XPath (XML Path Language), and XSLFO (Extensible Stylesheet Language Formatting Objects), presented in Chapters 10 to 12. Of these, the lengthy Chapter 11 on XSLT and XPath is essential reading for anyone who wishes to display or transform XML into other formats (including HTML, XHTML, text, or other kinds of XML, particularly in e-commerce applications).
Chapter 10 on CSS is more important if your XML display needs are more modest and your transformation needs are nil. The chapter can be skimmed for XML hooks if you are already familiar with CSS. Chapter 12 concerns XSL Formatting Objects, sort of the next generation CSS for desktop publishing quality layout, PDF, and targeting your output for different devices. The XSLFO chapter was contributed by noted XSL expert and instructor, G. Ken Holman, chair of the OASIS XSLT/XPath Conformance Technical Committee (see his home page at http://www.cranesoftwrights.com/).

Part IV: Related Core XML Specifications. This part focuses on XLink (XML Linking Language) and XPointer (XML Pointer Language), in Chapters 13 and 14. Most developers will benefit from reading about XLink and XPointer because they greatly extend the notion of linking and fragment access beyond what is possible in HTML 4.01, including one-to-many links, multidirectional links, links stored external to the documents, and linking to specific elements without hooks being provided by the original author.

Part V: Specialized XML Vocabularies. This part presents two unrelated XML-based languages: XHTML (Extensible HyperText Markup Language) in Chapter 15 and RDF (Resource Description Framework) in Chapter 16. Please consider Chapter 15 on XHTML as essential reading for all developers. As you'll see, XHTML is its own nuclear family of specifications that is currently replacing HTML, especially in the increasingly popular world of handheld devices, voice browsers, and other alternative Web interfaces. RDF should be of particular interest to developers and scientists with an interest in metadata (data about data), site descriptions, catalogs, intelligent software agents, and so on. RDF attempts to add semantics to the Web; related concepts are the recent XML Topic Maps (XTM) effort and the older Dublin Core work.
The RDF chapter was contributed by Ora Lassila, co-author of the Resource Description Framework Model and Syntax Specification for the W3C and contributor to the RDF Core Working Group and Web-Ontology (WebOnt) Working Group (see his home page at http://www.lassila.org/).

This book does not cover XQuery, an XML Query language, nor Scalable Vector Graphics (SVG), except in passing. XQuery was still very much in flux at the time of this writing. As for SVG, with a more than 500-page specification, I felt I could not do the topic justice in the time I had left after writing the rest of this book. Well, there's always the Second Edition, I guess.

What You Need to Get the Most Out of This Book

All code examples have been developed on a Dell Dimension XPS R450 PC (a paltry 450 MHz) running Windows 98. DOS .bat files are provided to help you configure your environment so that you can run the examples on your own. UNIX developers should be able to study the .bat files and set environment variables accordingly, such as CLASSPATH for Java and variables that point to the location of XML parsers and XSLT processors. I'm afraid I can't say much to Mac developers at this point (sadly, my own ancient PowerMac 7100/80 hasn't been used for the better part of three years), but if you contact me via the Web site and want me to share your experiences with others, I will gladly do so. I'll give you credit and a free copy of this book; it makes a great gift and keeps its flavor longer than fruitcake.

XML and DTD examples are plain text, so they are viewable in their raw form on all platforms using any text editor. To process XML in a browser, however, you'll need the most current generation of browsers, such as Netscape 6.x, Internet Explorer 5.5 or 6.x, Amaya 5.x, or Opera 5.x or higher. If you're not the type of reader who has to try out every example in his or her own browser, then perhaps the many screenshots in this book will be sufficient.
Evaluation copies of commercial XML, DTD, and XML Schema editors appear on the CD that accompanies this book; XML parsers and XSLT processors also appear there. The CD also contains a page of links to the current versions of all provided software, as well as links to software that couldn't be included on the CD for a variety of reasons.

The Java code examples should compile and run fine with either JDK 1.2.x or 1.3.x, also known by other confusing names and numbers such as Java 2 SDK, J2EE, and J2SE, or their equivalent as provided with your favorite Java IDE (Integrated Development Environment). This book does not attempt to teach Java; on the other hand, you really don't need to know Java to follow most of the discussions. Interested readers who desire a better Java background should refer to the key Java resources listed in "For Further Exploration: HTML and Java" that follows.

I truly hope you enjoy this book and find the XML family of specifications as fascinating as I do.

Conventions Used in This Book

The typographic conventions used in this book are as follows:
- Glossary terms look like this where they are defined: node-set
- Code excerpts, code listings, command lines, filenames, element names, and attribute names look like this: collection8.xml.
- Quotations (material excerpted from another source) are indented both left and right and are set in a smaller type size.
- Notes, important information or things to watch out for, are set off by an arrow in the margin and rules above and below their text.

For Further Exploration: HTML and Java

- Dave Raggett's Getting Started with HTML
  http://www.w3.org/MarkUp/Guide/
- Web Design Group's HTML 4.0 Reference
  http://www.htmlhelp.com/reference/html40/
- Google's HTML Tutorials category
  http://directory.google.com/Top/Computers/Data_Formats/Markup_Languages/HTML/Tutorials/
- Java Technology Products and APIs
  http://java.sun.com/products/
- The Java Tutorial
  http://java.sun.com/docs/books/tutorial/
- Google Web Directory: Java (includes a Books category)
  http://directory.google.com/Top/Computers/Programming/Languages/Java/
- Google Web Directory: Java IDEs
  http://directory.google.com/Top/Computers/Programming/Languages/Java/Development_Tools/Integrated_Development_Environments/
- Cafe au Lait Java FAQs, News, and Resources
  http://www.ibiblio.org/javafaq/