XML and Tools - Informationssysteme

XML and Tools
Muhammad Khalid Sohail Khan
Mat #: 745783
University of Duisburg Essen Germany
Table of Contents
1 Main Topics
1.1 What is XML?
1.2 XML Syntax
1.3 Namespace
2 XML Schema
2.1 Schema and Schema languages
2.2 DTD
2.3 Problems with DTD
2.4 XML Schema
2.5 Simple Example
2.6 Built-in data types
2.7 Creating own data types
2.8 Better schema
3 What is XSLT?
3.1 XPath
3.2 Core Function Library
3.3 XSLT
3.3.1 Template
3.3.2 Example for our project
4 Different Parsers, APIs and Tools
4.1 Parsers
4.2 Different APIs
4.2.1 The Document Object Model (DOM) Parsing
4.2.2 SAX
4.2.3 JDOM
4.3 Editors
5 Acknowledgments
Table of Figures
Figure 1
Figure 2
Figure 3
Figure 4 Following sibling
Figure 5 Preceding sibling
Figure 6 Parent
Figure 7 Child
Figure 8 Following
Figure 9 Preceding
Figure 10 Ancestor
Figure 11 Descendant
Figure 12 XSLT
Figure 13 Output in browser
Figure 14 DOM
Figure 15 SAX
1 Main Topics
Here is the list of topics which will be covered.
What is XML
What is DTD and XML Schema
What is XSLT
Different Parsers, APIs and Tools
1.1 What is XML?
XML stands for eXtensible Markup Language
XML is a framework for defining markup languages. There is no fixed collection of
markup tags - we may define our own tags, tailored for our kind of information. Each
XML language is targeted at its own application domain, but the languages will share
many features. There is a common set of generic tools for processing documents.
XML is designed to separate syntax from semantics to provide a common framework for
structuring information.
XML can be used to create new languages. XML is the mother of WML (Wireless
Markup Language), CML (Chemical Markup Language), ThML (Theological Markup
Language) and so on.
XML can be used to exchange data. With XML, data can be exchanged between
incompatible systems (portable data). XML is a cross-platform, software- and
hardware-independent tool for transmitting information.
1.2 XML Syntax
XML documents use a simple, self-describing syntax, which the following example
illustrates. In this example the root element is project. Every other element must be
nested inside the root element. Every XML element must have a closing tag. Elements
can have attributes, and attribute values must be quoted with either single or double
quotes.
<?xml version="1.0" encoding="ISO-8859-1"?>
<project xmlns="http://www.uni-duisburg.de"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.uni-duisburg.de
project.xsd">
<description>
Yahoo for das Invisible Web Scatter/Gather-Clustering for semistrukturierte
Daten
</description>
<participant>
<professor>
<lastName>Fuhr</lastName>
<firstName>Norbert</firstName>
<fachgebiet>Informationssysteme</fachgebiet>
<phone>0203-379-2524</phone>
</professor>
<mitarbeiter>
<lastName>Fischer</lastName>
<firstName>Gudrun</firstName>
<fachgebiet>Informationssysteme</fachgebiet>
<phone>0203-379-2206</phone>
</mitarbeiter>
<students number="1">
<lastName>Khan</lastName>
<firstName>Khalid</firstName>
<matNumber>745784</matNumber>
<sex>Male</sex>
<age>28</age>
<studiengang>AOS</studiengang>
<thema>XML und Werkzeuge</thema>
</students>
<students number="2">
<lastName>Li</lastName>
<firstName>Ting</firstName>
<matNumber>741733</matNumber>
<sex>Female</sex>
<age>28</age>
<studiengang>Angewandte Informatik</studiengang>
<thema>Open Archives und Metadaten</thema>
</students>
<students number="3">
<lastName>Nurzenski</lastName>
<firstName>Andre</firstName>
<matNumber>740564</matNumber>
<sex>Male</sex>
<age>26</age>
<studiengang>Angewandte Informatik</studiengang>
<thema>Der Scatter/Gather-Algorithmus</thema>
</students>
<students number="4">
<lastName>Chojnacki</lastName>
<firstName>Michael</firstName>
<matNumber>740706</matNumber>
<sex>Male</sex>
<age>26</age>
<studiengang>Angewandte Informatik</studiengang>
<thema>Text-Clustering</thema>
</students>
<students number="5">
<lastName>Tang</lastName>
<firstName>Zhihong</firstName>
<matNumber>745505</matNumber>
<sex>Female</sex>
<age>28</age>
<studiengang>Angewandte Informatik</studiengang>
<thema>Eine vorprozessierte Variante von Scatter/Gather</thema>
</students>
</participant>
</project>
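The syntax rules above (a single root element, every element closed and properly nested, quoted attribute values) are what a parser checks when it decides whether a document is well-formed. The following is a minimal sketch of that check, assuming Python and its standard xml.etree.ElementTree module, neither of which appears in the original material. The document is a shortened copy of project.xml without namespace declarations, and the helper name is_well_formed is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of project.xml (one student, no namespace declarations).
WELL_FORMED = """<project>
  <description>Scatter/Gather-Clustering</description>
  <participant>
    <students number="1">
      <lastName>Khan</lastName>
      <firstName>Khalid</firstName>
    </students>
  </participant>
</project>"""

# Same idea, but the lastName element is never closed.
NOT_WELL_FORMED = "<project><lastName>Khan</project>"


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(WELL_FORMED)
student = root.find("participant/students")
print(root.tag)                # name of the root element
print(student.get("number"))   # value of the number attribute
print(is_well_formed(NOT_WELL_FORMED))
```

Well-formedness is only the first level of checking. Whether the document also follows the rules of a particular vocabulary is the job of a schema.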
1.3 Namespace
XML Namespaces provide a method to avoid element name conflicts. Since element
names in XML are not fixed, very often a name conflict will occur when two different
documents use the same names describing two different types of elements.
<project xmlns="http://www.uni-duisburg.de"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.uni-duisburg.de
project.xsd">
These attributes tell a schema validator three things. First, the default namespace
declaration says that all of the elements used in this instance document come from the
http://www.uni-duisburg.de namespace. Second, schemaLocation says that the
http://www.uni-duisburg.de namespace is defined by project.xsd (i.e., schemaLocation
contains a pair of values: a namespace and the location of its schema). Third, the
xmlns:xsi declaration says that the schemaLocation attribute we are using is the one
from the XMLSchema-instance namespace.
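To see what a default namespace declaration means for a program that reads the document, here is a small sketch, again assuming Python's standard xml.etree.ElementTree module. The prefix p and the variable names are arbitrary choices for this illustration.

```python
import xml.etree.ElementTree as ET

NS = "http://www.uni-duisburg.de"

DOC = """<project xmlns="http://www.uni-duisburg.de">
  <description>XML und Werkzeuge</description>
</project>"""

root = ET.fromstring(DOC)

# ElementTree stores names in the form {namespace-uri}local-name.
print(root.tag)

# An unqualified lookup finds nothing, because no element in this
# document is called plain "description".
unqualified = root.find("description")

# A lookup with a prefix mapping succeeds; the prefix "p" is arbitrary.
qualified = root.find("p:description", {"p": NS})
print(unqualified is None, qualified.text)
```

The element name alone is therefore not enough to identify an element. The pair of namespace and local name is what keeps two vocabularies with the same element names apart.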
2 XML Schema
The basic idea of the schema can be shown with the help of the following figure.
Figure 1
2.1 Schema and Schema languages
A schema is a definition of the syntax of an XML-based language. A schema language is
a formal language for expressing schemas. The document being validated is called an
instance document or application document.
The main schema languages are:
DTD
XML Schema
2.2 DTD
DTD stands for Document Type Definition. Here is the DTD for our example project.xml.
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT project (description, participant+)>
<!ELEMENT description ANY>
<!ELEMENT participant (professor, mitarbeiter, students+)>
<!ELEMENT professor (lastName, firstName, fachgebiet, phone)>
<!ELEMENT mitarbeiter (lastName, firstName, fachgebiet, phone)>
<!ELEMENT students (lastName, firstName, matNumber, sex, age, studiengang,
thema)>
<!ATTLIST students
number CDATA #IMPLIED>
<!ELEMENT lastName (#PCDATA)>
<!ELEMENT firstName (#PCDATA)>
<!ELEMENT matNumber (#PCDATA)>
<!ELEMENT sex (#PCDATA)>
<!ELEMENT age (#PCDATA)>
<!ELEMENT studiengang (#PCDATA)>
<!ELEMENT thema (#PCDATA)>
<!ELEMENT fachgebiet (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
Here is the general syntax with explanation.
<!DOCTYPE root-element [ doctype-declaration... ]>
Determines the name of the root element and contains the document type
declarations
<!ELEMENT element-name content-model>
associates a content model to all elements of the given name
content models:
EMPTY: no content is allowed
ANY: any content is allowed
(#PCDATA|element-name|...): "mixed content", arbitrary sequence of character
data and listed elements
deterministic regular expression over element names: sequence of elements
matching the expression
choice: (...|...|...)
sequence: (...,...,...)
optional: ...?
zero or more: ...*
one or more: ...+
<!ATTLIST element-name attr-name attr-type attr-default...>
declares which attributes are allowed or required in which elements
attribute types:
CDATA: any value is allowed (the default)
(value|...): enumeration of allowed values
attribute defaults:
#REQUIRED: the attribute must be explicitly provided
#IMPLIED: attribute is optional, no default provided
"value": if not explicitly provided, this value inserted by default
#FIXED "value": as above, but only this value is allowed
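The content models described above are regular expressions over element names. As an illustration only, the following sketch translates the content model (professor, mitarbeiter, students+) by hand into a Python regular expression and matches it against the sequence of child element names. This is not a DTD validator; Python, the re module, and the helper name matches_model are assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET

# Hand translation of the content model (professor, mitarbeiter, students+).
# Each child name is followed by a space so that names cannot run together.
PARTICIPANT_MODEL = re.compile(r"professor mitarbeiter (students )+")


def matches_model(element, model):
    """Check the sequence of child element names against a content model."""
    child_names = "".join(child.tag + " " for child in element)
    return model.fullmatch(child_names) is not None


valid = ET.fromstring(
    "<participant><professor/><mitarbeiter/><students/><students/></participant>"
)
invalid = ET.fromstring(
    "<participant><mitarbeiter/><professor/><students/></participant>"
)

print(matches_model(valid, PARTICIPANT_MODEL))
print(matches_model(invalid, PARTICIPANT_MODEL))
```

The second document contains the right elements but in the wrong order, so it does not match the sequence required by the content model.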
2.3 Problems with DTD
Some of the top reasons for avoiding DTD:
not itself using XML syntax (the SGML heritage can be very unintuitive + if using XML, DTDs could potentially themselves be syntax checked with a "meta DTD")
mixed into the XML 1.0 spec (would be much less confusing if specified separately + even non-validating processors must look at the DTD)
no constraints on character data (if character data is allowed, any character data is allowed)
too simple attribute value models (enumerations are clearly insufficient)
cannot mix character data and regexp content models (and the content models are generally hard to use for complex requirements)
no support for Namespaces (of course, XML 1.0 was defined before Namespaces)
very limited support for modularity and reuse (the entity mechanism is too low-level)
no support for schema evolution, extension, or inheritance of declarations (difficult to write, maintain, and read large DTDs, and to define families of related schemas)
limited white-space control (xml:space is rarely used)
no embedded, structured self-documentation (<!-- comments --> are not enough)
content and attribute declarations cannot depend on attributes or element context (many XML languages use that, but their DTDs have to "allow too much")
too simple ID attribute mechanism (no points-to requirements, uniqueness scope, etc.)
only defaults for attributes, not for elements (but that would often be convenient)
cannot specify "any element" or "any attribute" (useful for partial specifications and during schema development)
defaults cannot be specified separate from the declarations (would be convenient to have defaults in separate modules)
2.4 XML Schema
XML Schemas are a tremendous advancement over DTDs.
Enhanced data types:
o 44+ versus 10 in DTDs
o Can create your own data types
XML syntax (there is a Schema for Schemas)
uses and supports Namespaces
object-oriented-like type system for declarations (with inheritance, subsumption,
abstract types, and finals)
global (=top-level) and local (=inlined) type definitions
modularization (schema inclusion and redefinitions)
structured self-documentation
cardinality constraints for sub-elements
nil values (missing content)
attribute and element defaults
any-element, any-attribute
uniqueness constraints and ID/IDREF attribute scope
regular expressions for specifying valid chardata and attribute values
2.5 Simple Example
The following schema is used to validate project.xml.
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema targetNamespace="http://www.uni-duisburg.de"
xmlns="http://www.uni-duisburg.de"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified">
<xs:element name="project"/>
<xs:element name="description" type="xs:string"/>
<xs:element name="participant">
<xs:complexType>
<xs:sequence>
<xs:element ref="professor"/>
<xs:element ref="mitarbeiter"/>
<xs:element ref="students" maxOccurs="12"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="professor">
<xs:complexType>
<xs:sequence>
<xs:element name="lastName"/>
<xs:element name="firstName"/>
<xs:element name="fachgebiet"/>
<xs:element name="phone"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="mitarbeiter">
<xs:complexType>
<xs:sequence>
<xs:element name="lastName"/>
<xs:element name="firstName"/>
<xs:element name="fachgebiet"/>
<xs:element name="phone"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="students">
<xs:complexType>
<xs:sequence>
<xs:element name="lastName"/>
<xs:element name="firstName"/>
<xs:element name="matNumber"/>
<xs:element name="sex"/>
<xs:element name="age"/>
<xs:element name="studiengang"/>
<xs:element name="thema"/>
</xs:sequence>
<xs:attribute name="number"/>
</xs:complexType>
</xs:element>
<xs:element name="lastName" type="xs:string"/>
<xs:element name="firstName" type="xs:string"/>
<xs:element name="fachgebiet" type="xs:string"/>
<xs:element name="matNumber" type="xs:string"/>
<xs:element name="sex" type="xs:string"/>
<xs:element name="age" type="xs:string"/>
<xs:element name="studiengang" type="xs:string"/>
<xs:element name="thema" type="xs:string"/>
</xs:schema>
All XML schemas have "schema" as the root element.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" (1)
targetNamespace="http://www.uni-duisburg.de" (2)
xmlns="http://www.uni-duisburg.de" (3)
elementFormDefault="qualified"> (4)
1. The elements and data types that are used to construct schemas, i.e. schema, element,
complexType, sequence, string, ..., come from the
http://www.w3.org/2001/XMLSchema namespace.
2. Indicates that the elements defined by this schema, i.e. project, description,
participant, ..., are to go in the http://www.uni-duisburg.de namespace.
3. The default namespace is http://www.uni-duisburg.de, which is also the targetNamespace.
4. This is a directive to any instance document which conforms to this schema: any
element used by the instance document which was declared in this schema must be
namespace qualified.
A schema defines a new vocabulary. Instance documents use that new vocabulary.
Figure 2: project.xml (schemaLocation="http://www.uni-duisburg.de project.xsd") uses the
schema from the namespace http://www.uni-duisburg.de; project.xsd
(targetNamespace="http://www.uni-duisburg.de") defines the elements in the namespace
http://www.uni-duisburg.de.
2.6 Built-in data types
Primitive Datatypes
string
boolean
decimal
float
double
duration
dateTime
time
date
gYearMonth
gYear
gMonthDay
gDay
gMonth
hexBinary
base64Binary
anyURI
QName
NOTATION
Derived types
normalizedString
integer
nonPositiveInteger
negativeInteger
long
int
short
byte
nonNegativeInteger
unsignedLong
unsignedInt
unsignedShort
unsignedByte
positiveInteger
2.7 Creating own data types
We can create our own data types. A new data type can be defined from an existing data
type by specifying values for one or more of the optional facets.
Let's create two data types for our example.
PhoneType
<xs:simpleType name="phoneType">
<xs:restriction base="xs:string">
<xs:pattern value="\d{4}-\d{3}-\d{4}"/>
</xs:restriction>
</xs:simpleType>
MatriculationType
<xs:simpleType name="matType">
<xs:restriction base="xs:integer">
<xs:totalDigits value="6"/>
<!--<xs:pattern value="\d{6}"/> -->
</xs:restriction>
</xs:simpleType>
The first type, phoneType, makes sure that a phone number is in the standard format, for
example 0203-379-2524. If the phone number is not in this format, the document will
not validate.
The second type, matType, is for the matriculation number. The totalDigits facet allows
at most six digits; the pattern facet (commented out here) would require exactly six.
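The effect of the two facets can be tried out without a schema validator. The sketch below imitates them with Python's re module; it is a rough stand-in written for this illustration, not how a validator is implemented, and the function names is_valid_phone and is_valid_mat are made up.

```python
import re

# The pattern facet of phoneType. XML Schema patterns always apply to the
# whole value, which corresponds to re.fullmatch in Python.
PHONE_PATTERN = re.compile(r"\d{4}-\d{3}-\d{4}")


def is_valid_phone(value):
    """Imitate the pattern facet of phoneType."""
    return PHONE_PATTERN.fullmatch(value) is not None


def is_valid_mat(value, total_digits=6):
    """Rough stand-in for xs:integer restricted by totalDigits."""
    text = value.strip()
    if re.fullmatch(r"[+-]?\d+", text) is None:
        return False                      # not an integer at all
    digits = text.lstrip("+-").lstrip("0") or "0"
    return len(digits) <= total_digits    # totalDigits is an upper bound


print(is_valid_phone("0203-379-2524"))   # format from project.xml
print(is_valid_phone("203-379-2524"))    # area code too short
print(is_valid_mat("745784"))            # six digits
print(is_valid_mat("12345"))             # five digits are also accepted
print(is_valid_mat("7457841"))           # seven digits
```

The five-digit case shows why the better schema combines totalDigits with the pattern \d{6}: only the pattern enforces exactly six digits.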
2.8 Better schema
Here is a better schema for our project example. It uses the newly defined data types.
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema targetNamespace="http://www.uni-duisburg.de"
xmlns="http://www.uni-duisburg.de"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified">
<xs:element name="project">
<xs:annotation>
<xs:documentation>This is the root of my schema</xs:documentation>
</xs:annotation>
</xs:element>
<xs:element name="description" type="xs:string"/>
<xs:element name="participant">
<xs:complexType>
<xs:sequence>
<xs:element name="professor" type="empType"/>
<xs:element name="mitarbeiter" type="empType"/>
<xs:element name="students" type="stuType" maxOccurs="12"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:complexType name="empType">
<xs:sequence>
<xs:element name="lastName" type="xs:string"/>
<xs:element name="firstName" type="xs:string"/>
<xs:element name="fachgebiet" type="xs:string"/>
<xs:element name="phone" type="phoneType"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="stuType">
<xs:sequence>
<xs:element name="lastName" type="xs:string"/>
<xs:element name="firstName" type="xs:string"/>
<xs:element name="matNumber" type="matType"/>
<xs:element name="sex" type="xs:string"/>
<xs:element name="age" type="xs:integer"/>
<xs:element name="studiengang" type="xs:string"/>
<xs:element name="thema" type="xs:string"/>
</xs:sequence>
<xs:attribute name="number" type="xs:integer" use="required"/>
</xs:complexType>
<xs:simpleType name="phoneType">
<xs:restriction base="xs:string">
<xs:pattern value="\d{4}-\d{3}-\d{4}"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="matType">
<xs:restriction base="xs:integer">
<xs:totalDigits value="6"/>
<xs:pattern value="\d{6}"/>
</xs:restriction>
</xs:simpleType>
</xs:schema>
3 What is XSLT?
XSLT stands for eXtensible Stylesheet Language Transformations.
It is a part of XSL, which consists of three parts:
XSLT
XPath
XSL Formatting Objects
To understand XSLT we first need to understand XPath.
3.1 XPath
XPath is a set of syntax rules for addressing parts of an XML document. XPath uses path
expressions to select XML elements. XPath also defines a library of standard functions.
XPath is a major element of XSLT.
In XPath, axes play a very important role. The following figures illustrate them.
Figure 3
Figure 4 Following sibling
Figure 5 Preceding sibling
Figure 6 Parent
Figure 7 Child
Figure 8 Following
Figure 9 Preceding
Figure 10 Ancestor
Figure 11 Descendant
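The axes shown in the figures can also be explored programmatically. The following sketch assumes Python's standard xml.etree.ElementTree module, which supports only a subset of XPath, so the sibling axes are written by hand. The helper names and the small document are made up for this illustration, and the siblings are returned in document order.

```python
import xml.etree.ElementTree as ET

DOC = """<participant>
  <professor/>
  <mitarbeiter/>
  <students number="1"/>
  <students number="2"/>
</participant>"""

root = ET.fromstring(DOC)

# ElementTree elements do not know their parent, so build a lookup table.
parent_of = {child: parent for parent in root.iter() for child in parent}


def following_siblings(node):
    """Nodes with the same parent that come after the context node."""
    siblings = list(parent_of[node])
    return siblings[siblings.index(node) + 1:]


def preceding_siblings(node):
    """Nodes with the same parent that come before the context node."""
    siblings = list(parent_of[node])
    return siblings[:siblings.index(node)]


context = root.find("mitarbeiter")               # the context node
children = [c.tag for c in root]                 # child axis of the root
descendants = [d.tag for d in root.iter()][1:]   # iter() starts with the root itself

print(parent_of[context].tag)                            # parent axis
print([n.tag for n in preceding_siblings(context)])      # preceding-sibling axis
print([n.tag for n in following_siblings(context)])      # following-sibling axis
```

Every axis is defined relative to a context node, which is why each helper takes the node as its argument.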
3.2 Core Function Library
Here is a list of some of the functions.
Node-set functions:
o last() returns the position number of the last node
o position() returns the context position
o count(node-set) number of nodes in node-set
o name(node-set) name of the first node in node-set
o ...
String functions:
o string(value) type cast to string
o concat(string, string, ...) string concatenation
o ...
Boolean functions:
o boolean(value) type cast to boolean
o not(boolean) boolean negation
o ...
Number functions:
o number(value) type cast to number
o sum(node-set) sum of the number values of the nodes in node-set
o ...
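To give a feeling for what these functions compute, the sketch below reproduces a few of them on a shortened participant list. It assumes Python's standard xml.etree.ElementTree module, which does not implement the XPath function library itself, so each function is imitated with ordinary Python. The XPath expressions in the comments are written for this illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<participant>
  <students number="1"><firstName>Khalid</firstName><lastName>Khan</lastName><age>28</age></students>
  <students number="2"><firstName>Ting</firstName><lastName>Li</lastName><age>28</age></students>
  <students number="3"><firstName>Andre</firstName><lastName>Nurzenski</lastName><age>26</age></students>
</participant>"""

root = ET.fromstring(DOC)
students = root.findall("students")

# count(students)
student_count = len(students)

# sum(students/age); XPath numbers are floating point
age_sum = sum(float(s.findtext("age")) for s in students)

# name(students): the name of the first node in the node-set
first_name = students[0].tag

# concat(firstName, ' ', lastName), evaluated for the second student
full_name = students[1].findtext("firstName") + " " + students[1].findtext("lastName")

# students[last()]/@number
last_number = students[-1].get("number")

print(student_count, age_sum, first_name, full_name, last_number)
```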
3.3 XSLT
The basic idea of XSLT can be shown with the help of the following figure.
Figure 12 XSLT
An XSLT style sheet is declarative and uses pattern matching and templates to specify
the transformation. On the Web, XSLT transformation can be done either on the client
(e.g. Explorer or Mozilla), or on the server (e.g. Apache Xalan).
An XSLT style sheet is itself an XML document. Its general form is:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0" xmlns="...">
.
.
<xsl:template match="pattern">
template
</xsl:template>
.
.
<!-- other top-level elements -->
.
</xsl:stylesheet>
3.3.1 Template
There are different kinds of template constructs:
literal result fragments
recursive processing
computed result fragments
conditional processing
sorting
numbering
variables and parameters
keys
3.3.1.1 Literal result fragments
A literal result fragment is one of the following:
a text constant (character data)
an element not belonging to the XSL namespace
<xsl:text ...> ... </...> (as raw text, but with white-space and character escaping control)
<xsl:comment> ... </...> (inserts a comment <!--...-->)
Since literal fragments are part of the stylesheet XML document, only well-formed XML
will be generated.
Example:
<xsl:template match="...">
this text is written directly to output
when this template is instantiated
</xsl:template>
3.3.1.2 Recursive processing
Recursive processing instructions:
<xsl:apply-templates select="node-set expression" .../>
apply pattern matching and template instantiation on selected nodes (default: all
children)
<xsl:call-template name="..."/>
invoke template by name (where xsl:template has name="..." attribute)
<xsl:for-each select="node-set expression"> template
</...>
instantiate inlined template for each node in node-set (document order by default)
<xsl:copy> template </...>
copy current node to output and apply template (shallow copy)
<xsl:copy-of select="..."/>
copy selected nodes to output (deep copy, includes descendants)
The select attribute contains an XPath expression, which is evaluated in the current
context.
Example:
<xsl:template match="students">
<h1><xsl:apply-templates select="age"/></h1>
</xsl:template>
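The interplay of pattern matching and xsl:apply-templates can be imitated in a few lines of ordinary code. The sketch below is an analogy in Python, not an XSLT processor: templates are plain functions stored in a dictionary keyed by element name, and all names (apply_templates, default_template, students_template) are made up for this illustration. Unlike the real built-in rules, it ignores the white-space text between elements.

```python
import xml.etree.ElementTree as ET

DOC = """<participant>
  <students><firstName>Khalid</firstName><age>28</age></students>
  <students><firstName>Ting</firstName><age>28</age></students>
</participant>"""


def apply_templates(nodes, templates):
    """Instantiate the matching template for every node, in document order."""
    return "".join(
        templates.get(node.tag, default_template)(node, templates)
        for node in nodes
    )


def default_template(node, templates):
    # Stand-in for the built-in rule: recurse into child elements,
    # or copy the text of a leaf element.
    children = list(node)
    if children:
        return apply_templates(children, templates)
    return node.text or ""


def students_template(node, templates):
    # Counterpart of: <h1><xsl:apply-templates select="age"/></h1>
    return "<h1>" + apply_templates(node.findall("age"), templates) + "</h1>"


TEMPLATES = {"students": students_template}
result = apply_templates([ET.fromstring(DOC)], TEMPLATES)
print(result)
```

The participant element has no template of its own, so the default rule descends into its children, and each students element is then handled by the matching template.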
3.3.1.3 Computed result fragments
Result fragments can be computed using XPath expressions:
<xsl:element name="..." namespace="..."> ... </...>
construct an element with the given name, attributes, and contents
<xsl:attribute name="..." namespace="..."> ... </...>
construct an attribute (inside xsl:element)
<xsl:value-of select="..."/>
construct character data or attribute value (expression converted to string)
<xsl:processing-instruction name="..."> ... </...>
construct a processing instruction
3.3.1.4 Conditional processing
Processing can be conditional:
<xsl:if test="expression"> ... </...>
apply template if expression (coerced to boolean) evaluates to true
<xsl:choose>
<xsl:when test="expression"> ... </...>
...
<xsl:otherwise> ... </...>
</...>
test conditions in turn, apply template for the first that is true
3.3.1.5 Sorting
Sorting chooses an order for xsl:apply-templates and xsl:for-each
<xsl:sort select="expression" .../>
a sequence of xsl:sort elements placed in xsl:apply-templates or
xsl:for-each defines a lexicographic order
Some extra attributes:
order="ascending/descending"
lang="..."
data-type="text/number"
case-order="upper-first/lower-first"
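The difference between data-type="text" and data-type="number", and the meaning of several sort keys in sequence, can be shown with Python's built-in sorted function. This is only an analogy: XSLT text sorting depends on the language, whereas Python compares character codes. The sample values are made up for this illustration.

```python
ages = ["9", "28", "26", "100"]

# data-type="text": values are compared as strings, character by character.
as_text = sorted(ages)

# data-type="number": values are converted to numbers before comparing.
as_number = sorted(ages, key=float)

# order="descending" reverses the chosen order.
descending = sorted(ages, key=float, reverse=True)

# Two sort keys in sequence: the second one only breaks ties of the first.
students = [("Li", 28), ("Khan", 28), ("Nurzenski", 26)]
by_age_then_name = sorted(students, key=lambda s: (s[1], s[0]))

print(as_text)
print(as_number)
print(descending)
print(by_age_then_name)
```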
3.3.1.6 Numbering
For automatic numbering of sections, item lists, footnotes, etc.
<xsl:number value="expression" (converted to number)
format="..." (as ol in HTML, default: "1. ")
level="..." (any/single/multiple)
count="..." (select what to count)
from="..." (select where to start counting)
lang="..."
letter-value="..."
grouping-separator="..."
grouping-size="..."/>
If value is specified, that value is used. Otherwise, the action is determined by level:
level="any": number of preceding count nodes occurring after from
(example use: numbering footnotes)
level="single" (the default): as any but only considers ancestors and their
siblings
level="multiple": generates whole list of numbers
3.3.1.7 Variables and Parameters
For reusing results of computations and parameterizing templates and whole stylesheets.
static scope rules
can hold any XPath value (string, number, boolean, node-set) + result-tree fragment
purely declarative: variables cannot be updated
can be global or local to a template rule
Declaration:
<xsl:variable name="..." select="expression"/>
variable declaration, value given by XPath expression
<xsl:variable name="..."> template </...>
variable declaration, template is instantiated as result tree fragment to give value
- similarly for xsl:param parameter declarations (where the specified values act as
defaults).
3.3.1.8 Keys
Advanced node IDs for automatic construction of links. A key is a triple (node, name,
value) associating a name-value pair to a tree node.
<xsl:key match="pattern" name="..." use="node set expression"/>
Declares set of keys - one for each node matching the pattern and for each node in the
node set. Extra XPath key function:
key(name expression, value expression)
returns nodes with given key name and value
This is often used together with:
generate-id(singleton node-set expression)
returns unique string identifying the given node
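A key behaves much like an index that maps values to nodes. The sketch below imitates this with a Python dictionary; the key name by-course, the xsl:key declaration quoted in the comment, and the helper function key are hypothetical and only serve this illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

DOC = """<participant>
  <students number="1"><lastName>Khan</lastName><studiengang>AOS</studiengang></students>
  <students number="2"><lastName>Li</lastName><studiengang>Angewandte Informatik</studiengang></students>
  <students number="3"><lastName>Nurzenski</lastName><studiengang>Angewandte Informatik</studiengang></students>
</participant>"""

root = ET.fromstring(DOC)

# Counterpart of a hypothetical declaration
#   <xsl:key name="by-course" match="students" use="studiengang"/>
by_course = defaultdict(list)
for student in root.iter("students"):
    by_course[student.findtext("studiengang")].append(student)


def key(index, value):
    """Counterpart of key('by-course', value): nodes stored under value."""
    return index.get(value, [])


names = [s.findtext("lastName") for s in key(by_course, "Angewandte Informatik")]
print(names)
```

Building the index once and looking values up afterwards is what makes keys attractive for cross-references in larger documents.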
3.3.2 Example for our project
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body bgcolor="cornflower">
<h2>
<font color="blue">
<u> Student Project</u>
</font>
</h2>
<h4>Description:</h4>
<font color="blue">
<xsl:value-of select="project/description"/>
</font>
<hr/>
<h3>Dozent(en):</h3>
<xsl:for-each select="project/participant/*">
<font color="blue">
<h5>
<xsl:if test="descendant-or-self::professor">
Prof. Dr.
</xsl:if>
<xsl:if test="descendant-or-self::mitarbeiter">
Dip. Inform.
</xsl:if>
<xsl:if test="descendant-or-self::professor | descendant-or-self::mitarbeiter">
<xsl:value-of select="firstName"/>
<xsl:text> </xsl:text>
<xsl:value-of select="lastName"/>
</xsl:if>
</h5>
</font>
</xsl:for-each>
<h3>
Students:
</h3>
<xsl:for-each select="project/participant/*">
<xsl:if test="self::students">
(<xsl:value-of select="@number"/>)
<br/>
Name:
<font color="blue">
<xsl:value-of select="firstName"/>
<xsl:text> </xsl:text>
<xsl:value-of select="lastName"/>
</font>
<br/>
Matrikulation Nr:
<font color="blue">
<xsl:value-of select="matNumber"/>
</font>
<br/>
Studiengang:
<font color="blue">
<xsl:value-of select="studiengang"/>
</font>
<br/>
Thema:
<font color="blue">
<xsl:value-of select="thema"/>
</font>
<br/>
<br/>
</xsl:if>
</xsl:for-each>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
The output in the browser looks like this:
Figure 13 Output in browser
4 Different Parsers, APIs and Tools
4.1 Parsers
Here is a list of some parsers; the best known is Apache Xerces-J.
Apache Xerces-J
IBM XML4J
MS XML Parser
James Clark’s parser, XP
4.2 Different APIs
The Document Object Model (DOM) API: the official W3C proposal
The Simple API for XML (SAX): the first widely adopted API for XML in Java and a de facto standard
The JDOM API: an API that is tailored to Java
JAXP: the official API for XML processing from Sun
The Streaming API for XML (StAX): a promising new model
4.2.1 The Document Object Model (DOM) Parsing
DOM is a tree-based parsing technique that builds the entire tree in memory. It allows
complete, dynamic access to the whole XML document.
Figure 14 DOM
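As a minimal sketch of tree-based access, the following example uses xml.dom.minidom, the DOM implementation in Python's standard library. The choice of Python instead of the Java parsers listed above, and the shortened document, are assumptions made for this illustration.

```python
from xml.dom import minidom

DOC = """<participant>
  <students number="1"><lastName>Khan</lastName></students>
  <students number="2"><lastName>Li</lastName></students>
</participant>"""

# The whole document is parsed into an in-memory tree first.
dom = minidom.parseString(DOC)

# Afterwards any part of the tree can be visited in any order ...
students = dom.getElementsByTagName("students")
last_names = [
    s.getElementsByTagName("lastName")[0].firstChild.data for s in students
]

# ... and modified in place.
students[0].setAttribute("number", "10")
new_number = dom.getElementsByTagName("students")[0].getAttribute("number")

print(dom.documentElement.tagName, last_names, new_number)
```

The price for this flexibility is memory: the tree for the complete document exists before the first element can be inspected.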
4.2.2 SAX
SAX is an event-driven push model for processing XML. SAX started as a grassroots
movement, but has gained an official standing. An XML tree is not viewed as a data
structure, but as a stream of events generated by the parser.
The kinds of events are:
the start of the document is encountered
the end of the document is encountered
the start tag of an element is encountered
the end tag of an element is encountered
character data is encountered
a processing instruction is encountered
As the parser scans the XML file from start to end, each event invokes a corresponding
callback method that the programmer writes.
Figure 15 SAX
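The event stream can be made visible with a handler that simply records every callback. The sketch below uses the xml.sax module from Python's standard library, which follows the same callback model; the class name EventLogger and the shortened document are made up for this illustration.

```python
import xml.sax

DOC = b"""<participant>
  <students number="1"><lastName>Khan</lastName></students>
  <students number="2"><lastName>Li</lastName></students>
</participant>"""


class EventLogger(xml.sax.ContentHandler):
    """Records the events reported by the parser instead of building a tree."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append("start document")

    def endDocument(self):
        self.events.append("end document")

    def startElement(self, name, attrs):
        self.events.append("start " + name)

    def endElement(self, name):
        self.events.append("end " + name)

    def characters(self, content):
        if content.strip():                  # skip whitespace-only text
            self.events.append("text " + content.strip())


handler = EventLogger()
xml.sax.parseString(DOC, handler)
for event in handler.events:
    print(event)
```

Nothing is kept in memory except what the handler chooses to store, which is the main difference from the DOM approach.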
4.2.3 JDOM
JDOM is designed to be simple and Java-specific. JDOM is a small library, since it is
used on top of either DOM or SAX.
JDOM contains five Java packages:
org.jdom - defines the basic model of an XML tree
org.jdom.adapters - defines wrappers for various DOM implementations
org.jdom.input - defines means for reading XML documents
org.jdom.output - defines means for writing XML documents
org.jdom.transform - defines an interface to JAXP XSLT
4.3 Editors
XMLSpy ( I just love it )
www.altova.com/download.html
Eclipse with XML plug-ins
www.eclipse.org
X/HTML Kit
www.chami.com/html-kit
Many more
5 Acknowledgments and lecture
During the preparation of this presentation, I studied and used material available online
and in print. I thank all of these organizations and authors for their
wonderful work.
www.w3.org/xml
www.ibm.com/developerworks
www.w3school.com/default.asp
www.java.sun.com/products/xml
DB2 Magazine
Oracle Magazine
Roger L. Costello (XML Schema)
Anders Moller (The XML Revolution)
Michael I. Schwartzbach (The XML Revolution)
Brett McLaughlin (Java & XML)
http://xml.Apache.org
and many more