Extensible Markup Language
eXtensible Markup Language (XML) is a W3C markup language derived from SGML (ISO8879-1986) used in a wide variety of applications for the storage and representation of textual data.
Features and Syntax
XML consists of a hierarchical tree of elements that contain attributes to express the structure of data.
Usage Examples
Data Storage
Address Book Example
Suppose Thomas wants to write a simple address book program that stores his addresses and phone numbers in a simple structured manner. He decides his best option is to use XML to store his information because he can define what information he wishes to store. XML works best when the information has a hierarchical arrangement, so Thomas designs the schema of how his information will be held.
- Person or Company Name
- Addresses
- Mailing Address
- E-mail Address
- Phone Numbers
- Fax Number
- Cell/Mobile Number
- Note
- Addresses
This arrangement of data is then transformed into XML, one possible arrangement is below:
<?xml version="1.0"?> <address-book> <person name="Thomas Paine"> <addresses> <mailing>812 Juniper Road</mailing> <email>tpaine@foo.net</email> <telephone> <primary>987-654-4321</primary> <fax>555-555-5555</fax> <cell>123-456-7890</cell> <note>Me.</note> </person> </address-book>
Let's dissect this example and see what the different components of this structure do.
<?xml version="1.0"?>
The first line of the example is known as the prolog, it tells the parser that the file it's about to parse is indeed XML. The version attribute is the version of the XML specification the document represents. In order for an XML document to be well-formed it must contain this prolog. "Well-formed"ness and validity will be discussed later.
The root element in this structure is <address-book>
The root element is the top of the hierarchical tree that XML documents typically represent. In this case it represents the address book as a whole. There are ways of specifying whether or not an element can contain multiple sub-elements with the same tag however that will not be covered in this example just yet.
The first sub-element in the tree is <person name="Thomas Paine">
. Obviously this element represents a person within the address book. The name
attribute is the name of the person or entity that the <person>
element will represent. Each of the other sub-elements are embedded inside the first a little like Russian stacking dolls, each one defines a more specific entity within the larger entity.