Write a program that prints out its own source code
A question that comes up frequently in coding interviews is how to write a program that prints its own source code. The question does not really test programming skill so much as general awareness of the tricks for doing it. A program whose output is its own source code is called a quine, a term that might come in handy at your next interview. The Quine Page by Gary Thompson provides many examples of such programs in various languages, and lists the following example for C:
char*f="char*f=%c%s%c;main(){printf(f,34,f,34,10);}%c";main(){printf(f,34,f,34,10);}
One language that’s missing, however, is PHP. In PHP, the file_get_contents
function and the __FILE__ constant allow for this quick hack:
<?php echo file_get_contents(__FILE__);
Top xml Interview Questions
71. How do I undeclare an XML namespace prefix?
In version 1.0 of the XML namespaces recommendation, you cannot
“undeclare” an XML namespace prefix. It remains in scope
until the end of the element on which it was declared unless it
is overridden. Furthermore, trying to undeclare a prefix by redeclaring
it with an empty (zero-length) name (URI) results in a namespace
error. For example:
<google:A xmlns:google="http://www.google.org/">
  <google:B>
    <google:C xmlns:google=""> <==== This is an error in v1.0, legal in v1.1.
      <google:D>abcd</google:D>
    </google:C>
  </google:B>
</google:A>
In version 1.1 of the XML namespaces recommendation [currently
a candidate recommendation — February, 2003], you can undeclare
an XML namespace prefix by redeclaring it with an empty name. For
example, in the above document, the XML namespace declaration xmlns:google=""
is legal and removes the mapping from the google prefix to the http://www.google.org
URI. Because of this, the use of the google prefix in the google:D
element results in a namespace error.
71. How do I undeclare the default XML namespace?
To “undeclare” the default XML namespace, you declare
a default XML namespace with an empty (zero-length) name (URI).
Within the scope of this declaration, unprefixed element type names
do not belong to any XML namespace. For example, in the following,
the default XML namespace is the http://www.google.org/ for the
A and B elements and there is no default XML namespace for the C
and D elements. That is, the names A and B are in the http://www.google.org/
namespace and the names C and D are not in any XML namespace.
<A xmlns="http://www.google.org/">
  <B>
    <C xmlns="">
      <D>abcd</D>
    </C>
  </B>
</A>
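As an illustration (not part of the original answer), the following Python sketch parses the document above with the standard library's xml.etree.ElementTree, which reports every name in {URI}local form. The variable names are made up for the example, and other namespace-aware parsers may present the same information differently.

```python
import xml.etree.ElementTree as ET

# The example document from above, written with straight quotes.
DOC = """<A xmlns="http://www.google.org/">
  <B>
    <C xmlns="">
      <D>abcd</D>
    </C>
  </B>
</A>"""

root = ET.fromstring(DOC)

# Collect the tag of every element in document order.
# Names in a namespace appear as "{URI}local"; names in no namespace are bare.
tags = [element.tag for element in root.iter()]
for tag in tags:
    print(tag)
```

A and B come back with the http://www.google.org/ namespace name attached, while C and D come back as bare local names.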
72. Why are special attributes used to declare XML namespaces?
I don’t know the answer to this question, but the likely reason
is the hope that they would simplify the process of moving
fragments from one document to another document. An early draft
of the XML namespaces recommendation proposed using processing instructions
to declare XML namespaces. While these were simple to read and process,
they weren’t easy to move to other documents. Attributes, on the
other hand, are intimately attached to the elements being moved.
Unfortunately, this hasn’t worked as well as was hoped. For example,
consider the following XML document:
<google:A xmlns:google="http://www.google.org/">
  <google:B>
    <google:C>bar</google:C>
  </google:B>
</google:A>
Simply using a text editor to cut the fragment headed by the <google:B>
element from one document and paste it into another document results
in the loss of namespace information because the namespace declaration
is not part of the fragment (it is on the parent element, <google:A>)
and isn’t moved.
Even when this is done programmatically, the situation isn’t necessarily
any better. For example, suppose an application uses DOM level 2
to “cut” the fragment from the above document and “paste”
it into a different document. Although the namespace information
is transferred (it is carried by each node), the namespace declaration
(xmlns attribute) is not, again because it is not part of the fragment.
Thus, the application must manually add the declaration before serializing
the document or the new document will not conform to the XML namespaces
recommendation.
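The DOM behavior described above can be sketched with Python's xml.dom.minidom, which is only one DOM implementation and is used here as an assumption. Other implementations may fix up namespace declarations when serializing. The target document and variable names are hypothetical, and importNode copies the fragment rather than cutting it.

```python
from xml.dom import minidom

SOURCE = ('<google:A xmlns:google="http://www.google.org/">'
          '<google:B><google:C>bar</google:C></google:B>'
          '</google:A>')
# Namespace name that DOM level 2 assigns to xmlns attributes.
XMLNS_NS = "http://www.w3.org/2000/xmlns/"

source = minidom.parseString(SOURCE)
target = minidom.parseString("<target/>")  # hypothetical destination document

# Copy the google:B fragment into the target document.
fragment = source.documentElement.firstChild
pasted = target.importNode(fragment, True)
target.documentElement.appendChild(pasted)

# The node still carries its namespace name ...
carried_uri = pasted.namespaceURI
# ... but the serialized form has no declaration for the google prefix.
before_fix = target.documentElement.toxml()

# Manually add the declaration before serializing.
pasted.setAttributeNS(XMLNS_NS, "xmlns:google", carried_uri)
after_fix = target.documentElement.toxml()

print(before_fix)
print(after_fix)
```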
73. How do different XML technologies treat XML namespace declarations?
This depends on the technology — some treat them as attributes
and some treat them as namespace declarations. For example, SAX1
treats them as attributes and SAX2 can treat them as attributes
or namespace declarations, depending on how the parser is configured.
DOM levels 1 and 2 treat them as attributes, but DOM level 2 also
interprets them as namespace declarations. XPath, XSLT, and XML
Schemas treat them as namespace declarations.
The reason that different technologies treat these differently is
that many of these technologies predate XML namespaces. Thus, newer
versions of them need to worry both about XML namespaces and backwards
compatibility issues.
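To make the SAX case concrete, here is a small sketch using Python's xml.sax module (an expat-based SAX2-style driver, chosen for illustration; it is not the Java SAX API the answer most likely has in mind). The Recorder class and the parse helper are hypothetical names. With namespace processing off, the xmlns attribute arrives as an ordinary attribute; with it on, the same declaration arrives as a prefix-mapping event.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

DOC = b'<google:A xmlns:google="http://www.google.org/"/>'


class Recorder(ContentHandler):
    """Records attribute names and prefix mappings reported by the parser."""

    def __init__(self):
        super().__init__()
        self.attributes = []
        self.prefix_mappings = []

    def startPrefixMapping(self, prefix, uri):
        self.prefix_mappings.append((prefix, uri))

    def startElement(self, name, attrs):
        # Called when namespace processing is off.
        self.attributes.extend(attrs.getNames())

    def startElementNS(self, name, qname, attrs):
        # Called when namespace processing is on.
        self.attributes.extend(attrs.getQNames())


def parse(namespace_aware):
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, namespace_aware)
    recorder = Recorder()
    parser.setContentHandler(recorder)
    parser.parse(io.BytesIO(DOC))
    return recorder


plain = parse(False)  # declaration shows up as an attribute
aware = parse(True)   # declaration shows up as a prefix mapping
print(plain.attributes, plain.prefix_mappings)
print(aware.attributes, aware.prefix_mappings)
```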
74. How do I use prefixes to refer to element type and
attribute names in an XML namespace?
Make sure you have declared the prefix and that it is still in
scope. All you need to do then is prefix the local name of an element
type or attribute with the prefix and a colon. The result is a qualified
name, which the application parses to determine what XML namespace
the local name belongs to.
For example, suppose you have associated the serv prefix with the
http://www.our.com/ito/servers namespace and that the declaration
is still in scope. In the following, serv:Address refers to the
Address name in the http://www.our.com/ito/servers namespace. (Note
that the prefix is used on both the start and end tags.)
<!-- serv refers to the http://www.our.com/ito/servers namespace. -->
<serv:Address>127.66.67.8</serv:Address>
Now suppose you have associated the xslt prefix with the http://www.w3.org/1999/XSL/Transform
namespace. In the following, xslt:version refers to the version
name in the http://www.w3.org/1999/XSL/Transform namespace:
<!-- xslt refers to the http://www.w3.org/1999/XSL/Transform namespace. -->
<html xslt:version="1.0">
75. How do I use the default XML namespace to refer to
element type names in an XML namespace?
Make sure you have declared the default XML namespace and that
that declaration is still in scope. All you need to do then is
use the local name of an element type. Even though it is not prefixed,
the result is still a qualified name, which the application parses
to determine what XML namespace it belongs to.
For example, suppose you declared the http://www.w3.org/to/addresses
namespace as the default XML namespace and that the declaration
is still in scope. In the following, Address refers to the Address
name in the http://www.w3.org/to/addresses namespace.
<!-- http://www.w3.org/to/addresses is the default XML namespace. -->
<Address>123.45.67.8</Address>
76. How do I use the default XML namespace to refer to
attribute names in an XML namespace?
You can’t.
The default XML namespace only applies to element type names, so
you can refer to attribute names that are in an XML namespace only
with a prefix. For example, suppose that you declared the http://www.w3.org/to/addresses
namespace as the default XML namespace. In the following, the type
attribute name does not refer to that namespace, although the Address
element type name does. That is, the Address element type name is
in the http://www.w3.org/to/addresses namespace,
but the type attribute name is not in any XML namespace.
<!-- http://www.w3.org/to/addresses is the default XML namespace. -->
<Address type="home">
To understand why this is true, remember that the purpose of XML
namespaces is to uniquely identify element and attribute names.
Unprefixed attribute names can be uniquely identified based on the
element type to which they belong, so there is no need to identify
them further by including them in an XML namespace. In fact, the
only reason for allowing attribute names to be prefixed is so that
attributes defined in one XML language can be used in another XML
language.
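A quick way to see this rule in action is to parse such an element with a namespace-aware parser. The sketch below uses Python's xml.etree.ElementTree; the addr prefix and the second element are assumptions added for contrast and do not come from the text above.

```python
import xml.etree.ElementTree as ET

NS = "http://www.w3.org/to/addresses"

# Unprefixed attribute: the default namespace does not apply to it.
unprefixed = ET.fromstring(
    '<Address xmlns="%s" type="home">123.45.67.8</Address>' % NS)

# Prefixed attribute (hypothetical addr prefix): the attribute name is
# in the namespace because it uses a declared prefix.
prefixed = ET.fromstring(
    '<Address xmlns="%s" xmlns:addr="%s" addr:type="home">123.45.67.8</Address>'
    % (NS, NS))

print(unprefixed.tag, unprefixed.attrib)
print(prefixed.tag, prefixed.attrib)
```

In both cases the element type name carries the namespace name, but only the prefixed attribute does.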
77. When should I use the default XML namespace instead
of prefixes?
This is purely a matter of choice, although your choice may affect
the readability of the document. When elements whose names all belong
to a single XML namespace are grouped together, using a default
XML namespace might make the document more readable. For example:
<!-- A, B, C, and G are in the http://www.google.org/ namespace. -->
<A xmlns="http://www.google.org/">
  <B>abcd</B>
  <C>efgh</C>
  <!-- D, E, and F are in the http://www.bar.org/ namespace. -->
  <D xmlns="http://www.bar.org/">
    <E>1234</E>
    <F>5678</F>
  </D>
  <!-- Remember! G is in the http://www.google.org/ namespace. -->
  <G>ijkl</G>
</A>
When elements whose names are in multiple XML namespaces are interspersed,
default XML namespaces definitely make a document more difficult
to read and prefixes should be used instead. For example:
<A xmlns="http://www.google.org/">
  <B xmlns="http://www.bar.org/">abcd</B>
  <C xmlns="http://www.google.org/">efgh</C>
  <D xmlns="http://www.bar.org/">
    <E xmlns="http://www.google.org/">1234</E>
    <F xmlns="http://www.bar.org/">5678</F>
  </D>
  <G xmlns="http://www.google.org/">ijkl</G>
</A>
In some cases, default namespaces can be processed faster than
namespace prefixes, but the difference is certain to be negligible
in comparison to total processing time.
78. What is the scope of an XML namespace declaration?
The scope of an XML namespace declaration is that part of an XML
document to which the declaration applies. An XML namespace declaration
remains in scope for the element on which it is declared and all
of its descendants, unless it is overridden or undeclared on one
of those descendants.
For example, in the following, the scope of the declaration of the
http://www.google.org/ namespace is the element A and its descendants
(B and C). The scope of the declaration of the http://www.bar.org/
namespace is only the element C.
<google:A xmlns:google="http://www.google.org/">
  <google:B>
    <bar:C xmlns:bar="http://www.bar.org/" />
  </google:B>
</google:A>
79. Does the scope of an XML namespace declaration include
the element it is declared on?
Yes.
For example, in the following, the names B and C are in the http://www.bar.org/
namespace, not the http://www.google.org/ namespace. This is because
the declaration that associates the google prefix with the http://www.bar.org/
namespace occurs on the B element, overriding the declaration on
the A element that associates it with the http://www.google.org/
namespace.
<google:A xmlns:google="http://www.google.org/">
  <google:B xmlns:google="http://www.bar.org/">
    <google:C>abcd</google:C>
  </google:B>
</google:A>
Similarly, in the following, the names B and C are in the http://www.bar.org/
namespace, not the http://www.google.org/ namespace because the
declaration declaring http://www.bar.org/ as the default XML namespace
occurs on the B element, overriding the declaration on the A element.
<A xmlns="http://www.google.org/">
  <B xmlns="http://www.bar.org/">
    <C>abcd</C>
  </B>
</A>
A final example is that, in the following, the attribute name D
is in the http://www.bar.org/ namespace.
<google:A xmlns:google="http://www.google.org/">
  <google:B google:D="In http://www.bar.org/ namespace"
            xmlns:google="http://www.bar.org/">
    <C>abcd</C>
  </google:B>
</google:A>
One consequence of XML namespace declarations applying to the elements
they occur on is that they actually apply before they appear. Because
of this, software that processes qualified names should be particularly
careful to scan the attributes of an element for XML namespace declarations
before deciding what XML namespace (if any) an element type or attribute
name belongs to.
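The "scan the attributes first" advice can be sketched as follows. This is a simplified illustration in Python, not how any particular parser is implemented; the resolve function, its parameters, and the dictionary representation of a scope are all assumptions made for the example.

```python
def resolve(qname, attributes, parent_scope, is_attribute=False):
    """Return (namespace name or None, local name) for a qualified name.

    attributes   -- all attributes of the element, including xmlns ones
    parent_scope -- prefix -> URI mapping inherited from the parent
                    (the default namespace is stored under the key "")
    """
    # Scan the element's own attributes for declarations first, because
    # a declaration applies to the element it appears on.
    scope = dict(parent_scope)
    for name, value in attributes.items():
        if name == "xmlns":
            scope[""] = value
        elif name.startswith("xmlns:"):
            scope[name[len("xmlns:"):]] = value

    prefix, colon, local = qname.partition(":")
    if not colon:
        # Unprefixed: the default namespace applies to element types only.
        uri = None if is_attribute else scope.get("")
        return (uri or None, qname)
    if not scope.get(prefix):
        raise ValueError("undeclared prefix: " + prefix)
    return (scope[prefix], local)


# The google:B element from the last example above.
inherited = {"google": "http://www.google.org/"}
b_attributes = {
    "google:D": "In http://www.bar.org/ namespace",
    "xmlns:google": "http://www.bar.org/",
}
element_name = resolve("google:B", b_attributes, inherited)
attribute_name = resolve("google:D", b_attributes, inherited, is_attribute=True)
print(element_name, attribute_name)
```

Resolving google:B against only the inherited scope would wrongly place it in the http://www.google.org/ namespace, which is why the declarations on the element itself are applied before any name is looked up.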
80. If an element or attribute is in the scope of an XML
namespace declaration, is its name in that namespace?
Not necessarily.
When an element or attribute is in the scope of an XML namespace
declaration, the element or attribute’s name is checked to see if
it has a prefix that matches the prefix in the declaration. Whether
the name is actually in the XML namespace depends on whether the
prefix matches. For example, in the following, the element type
names A, B, and D and the attribute names C and E are in the scope
of the declaration of the http://www.google.org/ namespace. While
the names A, B, and C are in that namespace, the names D and E are
not.
<google:A xmlns:google="http://www.google.org/">
  <google:B google:C="google" />
  <bar:D bar:E="bar" />
</google:A>
81. What happens when an XML namespace declaration goes
out of scope?
When an XML namespace declaration goes out of scope, it simply
no longer applies. For example, in the following, the declaration
of the http://www.google.org/ namespace does not apply to the C
element because this is outside its scope. That is, it is past the
end of the B element, on which the http://www.google.org/ namespace
was declared.
<!-- B is in the http://www.google.org/ namespace;
     C is not in any XML namespace. -->
<A>
  <B xmlns="http://www.google.org/">abcd</B>
  <C>efgh</C>
</A>
In addition to the declaration no longer applying, any declarations
that it overrode come back into scope. For example, in the following,
the declaration of the http://www.google.org/ namespace is brought
back into scope after the end of the B element. This is because
it was overridden on the B element by the declaration of the http://www.bar.org/
namespace.
<!-- A and C are in the http://www.google.org/ namespace.
     B is in the http://www.bar.org/ namespace. -->
<A xmlns="http://www.google.org/">
  <B xmlns="http://www.bar.org/">abcd</B>
  <C>efgh</C>
</A>
82. What happens if no XML namespace declaration is in
scope?
If no XML namespace declaration is in scope, then any prefixed
element type or attribute names result in namespace errors. For
example, in the following, the names google:A and google:B result
in namespace errors.
<?xml version="1.0" ?>
<google:A google:B="error" />
In the absence of an XML namespace declaration, unprefixed element
type and attribute names do not belong to any XML namespace. For
example, in the following, the names A and B are not in any XML
namespace.
<?xml version="1.0" ?>
<A B="abcd" />
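Both cases in this answer can be checked with a namespace-aware parser. The sketch below uses Python's xml.etree.ElementTree, whose expat-based parser typically reports an undeclared prefix as an "unbound prefix" error. The helper name and the second test document are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def namespace_error(document):
    """Return the parser's error message, or None if the document parses."""
    try:
        ET.fromstring(document)
    except ET.ParseError as error:
        return str(error)
    return None


# Prefixed names with no declaration in scope: the parser rejects them.
prefixed_result = namespace_error('<google:A google:B="error" />')

# Unprefixed names with no declaration in scope: accepted, and the
# names are simply not in any XML namespace.
unprefixed_result = namespace_error('<A B="abcd" />')

print(prefixed_result)
print(unprefixed_result)
```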
83. Can multiple XML namespace declarations be in scope
at the same time?
Yes, as long as they don’t use the same prefixes and at most one
of them is the default XML namespace. For example, in the following,
the http://www.google.org/ and http://www.bar.org/ namespaces are
both in scope for all elements:
<A xmlns:google="http://www.google.org/"
   xmlns:bar="http://www.bar.org/">
  <google:B>abcd</google:B>
  <bar:C>efgh</bar:C>
</A>
One consequence of this is that you can place all XML namespace
declarations on the root element and they will be in scope for all
elements. This is the simplest way to use XML namespaces.
84. How can I declare XML namespaces so that all elements
and attributes are in their scope?
XML namespace declarations that are made on the root element are
in scope for all elements and attributes in the document. This means
that an easy way to declare XML namespaces is to declare them only
on the root element.
85. Does the scope of an XML namespace declaration ever
include the DTD?
No.
XML namespaces can be declared only on elements and their scope
consists only of those elements and their descendants. Thus, the
scope can never include the DTD.
86. Can I use XML namespaces in DTDs?
Yes and no.
In particular, DTDs can contain qualified names but XML namespace
declarations do not apply to DTDs.
This has a number of consequences. Because XML namespace declarations
do not apply to DTDs:
1. There is no way to determine what XML namespace a prefix in a
DTD points to. Which means…
2. Qualified names in a DTD cannot be mapped to universal names.
Which means…
3. Element type and attribute declarations in a DTD are expressed
in terms of qualified names, not universal names. Which means…
4. Validation cannot be redefined in terms of universal names as
might be expected.
This situation has caused numerous complaints but, as XML namespaces
are already a recommendation, is unlikely to change. The long term
solution to this problem is an XML schema language: all of the proposed
XML schema languages provide a mechanism by which the local name
in an element type or attribute declaration can be associated with
an XML namespace. This makes it possible to redefine validity in
terms of universal names.
87. Do XML namespace declarations apply to DTDs?
No.
In particular, an xmlns attribute declared in the DTD with a default
is not an XML namespace declaration for the DTD. (Note that an
earlier version of MSXML (the parser used by Internet Explorer)
did use such declarations as XML namespace declarations, but that
this was removed in MSXML 4.)
88. Can I use qualified names in DTDs?
Yes.
For example, the following is legal:
<!ELEMENT google:A (google:B)>
<!ATTLIST google:A
google:C CDATA #IMPLIED>
<!ELEMENT google:B (#PCDATA)>
However, because XML namespace declarations do not apply to DTDs,
qualified names in the DTD cannot be converted to universal names.
As a result, qualified names in the DTD have no special meaning.
For example, google:A is just google:A — it is not A in the XML
namespace to which the prefix google is mapped.
The reason qualified names are allowed in the DTD is so that validation
will continue to work.
89. Can the content model in an element type declaration
contain element types whose names come from other XML namespaces?
Yes and no.
The answer to this question is yes in the sense that a qualified
name in a content model can have a different prefix than the qualified
name of the element type being declared. For example, the following
is legal:
<!ELEMENT google:A (bar:B, baz:C)>
The answer to this question is no in the sense that XML namespace
declarations do not apply to DTDs so the prefixes used in an element
type declaration are technically meaningless. In particular, they
do not specify that the name of a certain element type belongs to
a certain namespace. Nevertheless, the ability to mix prefixes in
this manner is crucial when: a) you have a document whose names
come from multiple XML namespaces, and b) you want to construct
that document in a way that is both valid and conforms to the XML
namespaces recommendation.
90. Can the attribute list of an element type contain attributes
whose names come from other XML namespaces?
Yes and no.
For example, the following is legal:
<!ATTLIST google:A
bar:B CDATA #IMPLIED>
91. How can I construct an XML document that is valid and
conforms to the XML namespaces recommendation?
In answering this question, it is important to remember that:
* Validity is a concept defined in XML 1.0,
* XML namespaces are layered on top of XML 1.0, and
* The XML namespaces recommendation does not redefine validity,
such as in terms of universal names.
Thus, validity is the same for a document that uses XML namespaces
and one that doesn’t. In particular, with respect to validity:
* xmlns attributes are treated as attributes, not XML namespace
declarations.
* Qualified names are treated like other names. For example, in
the name google:A, google is not treated as a namespace prefix,
the colon is not treated as separating a prefix from a local name,
and A is not treated as a local name. The name google:A is treated
simply as the name google:A.
Because of this, XML documents that you might expect to be valid
are not. For example, the following document is not valid because
the element type name A is not declared in the DTD, in spite of
the fact both google:A and A share the universal name {http://www.google.org/}A:
<?xml version="1.0" ?>
<!DOCTYPE google:A [
  <!ELEMENT google:A EMPTY>
  <!ATTLIST google:A
    xmlns:google CDATA #FIXED "http://www.google.org/"
    xmlns CDATA #FIXED "http://www.google.org/">
]>
<A/>
Similarly, the following is not valid because the xmlns attribute
is not declared in the DTD:
<?xml version="1.0" ?>
<!DOCTYPE A [
  <!ELEMENT A EMPTY>
]>
<A xmlns="http://www.google.org/" />
Furthermore, documents that you might expect to be invalid are
valid. For example, the following document is valid but contains
two definitions of the element type with the universal name {http://www.google.org/}A:
<?xml version="1.0" ?>
<!DOCTYPE google:A [
  <!ELEMENT google:A (bar:A)>
  <!ATTLIST google:A
    xmlns:google CDATA #FIXED "http://www.google.org/">
  <!ELEMENT bar:A (#PCDATA)>
  <!ATTLIST bar:A
    xmlns:bar CDATA #FIXED "http://www.google.org/">
]>
<google:A>
  <bar:A>abcd</bar:A>
</google:A>
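To see why google:A and bar:A count as the same universal name here, the following Python sketch resolves both with xml.etree.ElementTree, which happens to report names in the same {URI}local notation. As a simplifying assumption, the namespace declarations are written directly on the elements rather than supplied through DTD attribute defaults as in the document above.

```python
import xml.etree.ElementTree as ET

# Two different prefixes bound to the same namespace name.
DOC = ('<google:A xmlns:google="http://www.google.org/">'
       '<bar:A xmlns:bar="http://www.google.org/">abcd</bar:A>'
       '</google:A>')

outer = ET.fromstring(DOC)
inner = outer[0]

# Different qualified names, identical universal names.
print(outer.tag)
print(inner.tag)
```

A validator that works only with qualified names sees two unrelated element types, which is why the DTD above can declare both without a conflict.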
Finally, validity has nothing to do with correct usage of XML namespaces.
For example, the following document is valid but does not conform
to the XML namespaces recommendation because the google prefix is
never declared:
<?xml version="1.0" ?>
<!DOCTYPE google:A [
  <!ELEMENT google:A EMPTY>
]>
<google:A />
Therefore, when constructing an XML document that uses XML namespaces,
you need to do both of the following if you want the document to
be valid:
* Declare xmlns attributes in the DTD.
* Use the same qualified names in the DTD and the body of the document.
For example:
<?xml version="1.0" ?>
<!DOCTYPE google:A [
  <!ELEMENT google:A (google:B)>
  <!ATTLIST google:A
    xmlns:google CDATA #FIXED "http://www.google.org/">
  <!ELEMENT google:B EMPTY>
]>
<google:A>
  <google:B />
</google:A>
There is no requirement that the same prefix always be used for
the same XML namespace. For example, the following is also valid:
<?xml version="1.0" ?>
<!DOCTYPE google:A [
  <!ELEMENT google:A (bar:B)>
  <!ATTLIST google:A
    xmlns:google CDATA #FIXED "http://www.google.org/">
  <!ELEMENT bar:B EMPTY>
  <!ATTLIST bar:B
    xmlns:bar CDATA #FIXED "http://www.google.org/">
]>
<google:A>
  <bar:B />
</google:A>
However, documents that use multiple prefixes for the same XML
namespace or the same prefix for multiple XML namespaces are confusing
to read and thus prone to error. They also allow abuses such as
defining an element type or attribute with a given universal name
more than once, as was seen earlier. Therefore, a better set of
guidelines for writing documents that are both valid and conform
to the XML namespaces recommendation is:
* Declare all xmlns attributes in the DTD.
* Use the same qualified names in the DTD and the body of the document.
* Use one prefix per XML namespace.
* Do not use the same prefix for more than one XML namespace.
* Use at most one default XML namespace.
The latter three guidelines guarantee that prefixes are unique.
This means that prefixes fulfill the role normally played by namespace
names (URIs) — uniquely identifying an XML namespace — and that
qualified names are equivalent to universal names, so a given universal
name is always represented by the same qualified name. Unfortunately,
this is contrary to the spirit of prefixes, which were designed
for their flexibility. For a slightly better solution, see the next question.
92. How can I allow the prefixes in my document to be different
from the prefixes in my DTD?
One of the problems with the solution proposed in the previous question is that
it requires the prefixes in the document to match those in the DTD.
Fortunately, there is a workaround for this problem, although it
does require that a single prefix be used for a particular namespace
URI throughout the document. (This is a good practice anyway, so
it’s not too much of a restriction.) The solution assumes that you
are using a DTD that is external to the document, which is common
practice.
To use different prefixes in the external DTD and XML documents,
you declare the prefix with a pair of parameter entities in the
DTD. You can then override these entities with declarations in the
internal DTD in a given XML document. This works because the internal
DTD is read before the external DTD and the first definition of
a particular entity is the one that is used. The following paragraphs
describe how to use a single namespace in your DTD. You will need
to modify them somewhat to use multiple namespaces.
To start with, declare three parameter entities in your DTD:
<!ENTITY % p "" >
<!ENTITY % s "" >
<!ENTITY % nsdecl "xmlns%s;" >
The p entity (“p” is short for “prefix”) is
used in place of the actual prefix in element type and attribute
names. The s entity (“s” is short for “suffix”)
is used in place of the actual prefix in namespace declarations.
The nsdecl entity (“nsdecl” is short for “namespace
declaration”) is used in place of the name of the xmlns attribute
in declarations of that attribute.
Now use the p entity to define parameter entities for each of the
names in your namespace. For example, suppose element type names
A, B, and C and attribute name D are in your namespace.
<!ENTITY % A "%p;A">
<!ENTITY % B "%p;B">
<!ENTITY % C "%p;C">
<!ENTITY % D "%p;D">
Next, declare your element types and attributes using the “name”
entities, not the actual names. For example:
<!ELEMENT %A; ((%B;)*, %C;)>
<!ATTLIST %A;
  %nsdecl; CDATA "http://www.google.org/">
<!ELEMENT %B; EMPTY>
<!ATTLIST %B;
  %D; NMTOKEN #REQUIRED
  E CDATA #REQUIRED>
<!ELEMENT %C; (#PCDATA)>
There are several things to notice here.
* Attribute D is in a namespace, so it is declared with a “name”
entity. Attribute E is not in a namespace, so no entity is used.
* The nsdecl entity is used to declare the xmlns attribute. (xmlns
attributes must be declared on every element type on which they
can occur.) Note that a default value is given for the xmlns attribute.
* The reference to element type B in the content model of A is placed
inside parentheses. The reason for this is that a modifier — *
in this case — is applied to it. Using parentheses is necessary
because the replacement values of parameter entities are padded
with spaces; directly applying the modifier to the parameter entity
reference would result in illegal syntax in the content model.
For example, suppose the value of the A entity is “google:A”,
the value of the B entity is “google:B”, and the value
of the C entity is “google:C”. The declaration:
<!ELEMENT %A; (%B;*, %C;)>
would resolve to:
<!ELEMENT google:A ( google:B *, google:C )>
This is illegal because the * modifier must directly follow the
reference to the google:B element type. By placing the reference
to the B entity in parentheses, the declaration resolves to:
<!ELEMENT google:A (( google:B )*, google:C )>
This is legal because the * modifier directly follows the closing
parenthesis.
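The padding behavior can be imitated in a few lines of Python. This is only a toy model of how parameter entity references are expanded inside a declaration (one space added on each side of the replacement text), not a DTD parser. The function name and the hard-coded entity values are assumptions for the example, and the output contains a little more whitespace than the hand-written expansions above.

```python
import re

# Matches a parameter entity reference such as %B;
PE_REFERENCE = re.compile(r"%([A-Za-z_][\w.-]*);")


def expand_in_declaration(declaration, entities):
    """Replace each %name; with its value, padded by one space on each side."""
    return PE_REFERENCE.sub(
        lambda match: " " + entities[match.group(1)] + " ", declaration)


ENTITIES = {"A": "google:A", "B": "google:B", "C": "google:C"}

# Modifier applied directly to the reference: a space ends up before "*".
bad = expand_in_declaration("<!ELEMENT %A; (%B;*, %C;)>", ENTITIES)

# Reference wrapped in parentheses: "*" directly follows ")".
good = expand_in_declaration("<!ELEMENT %A; ((%B;)*, %C;)>", ENTITIES)

print(bad)
print(good)
```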
Now let’s see how this all works. Suppose our XML document won’t
use prefixes, but instead wants the default namespace to be the
http://www.google.org/ namespace. In this case, no entity declarations
are needed in the document. For example, our document might be:
<!DOCTYPE A SYSTEM "http://www.google.org/google.dtd">
<A>
  <B D="bar" E="baz buz" />
  <B D="boo" E="biz bez" />
  <C>bizbuz</C>
</A>
This document is valid because the declarations for p, s, and nsdecl
in the DTD set p and s to “” and nsdecl to “xmlns”.
That is, after replacing the p, s, and nsdecl parameter entities,
the DTD is as follows. Notice that both the DTD and document use
the element type names A, B, and C and the attribute names D and
E.
<!ELEMENT A (( B )*, C )>
<!ATTLIST A
  xmlns CDATA "http://www.google.org/">
<!ELEMENT B EMPTY>
<!ATTLIST B
  D NMTOKEN #REQUIRED
  E CDATA #REQUIRED>
<!ELEMENT C (#PCDATA)>
But what if the document wants to use a different prefix, such
as google? In this case, the document must override the declarations
of the p and s entities in its internal DTD. That is, it must declare
these entities so that they use google as a prefix (followed by
a colon) and a suffix (preceded by a colon). For example:
<!DOCTYPE google:A SYSTEM "http://www.google.org/google.dtd"
[
  <!ENTITY % p "google:">
  <!ENTITY % s ":google">
]>
<google:A>
  <google:B google:D="bar" E="baz buz" />
  <google:B google:D="boo" E="biz bez" />
  <google:C>bizbuz</google:C>
</google:A>
In this case, the internal DTD is read before the external DTD,
so the values of the p and s entities from the document are used.
Thus, after replacing the p, s, and nsdecl parameter entities, the
DTD is as follows. Notice that both the DTD and document use the
element type names google:A, google:B, and google:C and the attribute
names google:D and E.
<!ELEMENT google:A (( google:B )*, google:C )>
<!ATTLIST google:A
  xmlns:google CDATA "http://www.google.org/">
<!ELEMENT google:B EMPTY>
<!ATTLIST google:B
  google:D NMTOKEN #REQUIRED
  E CDATA #REQUIRED>
<!ELEMENT google:C (#PCDATA)>
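The override mechanism rests on one rule: the first declaration of an entity wins, and the internal DTD is read first. The Python sketch below is a toy model of that rule, not a DTD parser. The declare_entities function and the list-of-pairs representation are assumptions, and references inside entity values are expanded without padding, at the moment each entity is declared.

```python
import re

PE_REFERENCE = re.compile(r"%([A-Za-z_][\w.-]*);")


def declare_entities(declarations):
    """Process (name, literal) pairs in order; the first declaration wins."""
    entities = {}
    for name, literal in declarations:
        if name in entities:
            continue  # a later declaration of the same entity is ignored
        # References inside an entity value are replaced as it is declared.
        entities[name] = PE_REFERENCE.sub(
            lambda match: entities[match.group(1)], literal)
    return entities


EXTERNAL_DTD = [("p", ""), ("s", ""), ("nsdecl", "xmlns%s;"),
                ("A", "%p;A"), ("D", "%p;D")]
INTERNAL_DTD = [("p", "google:"), ("s", ":google")]

# Document without an internal DTD: the defaults from the external DTD apply.
defaults = declare_entities(EXTERNAL_DTD)

# Document with an internal DTD: it is read first, so its values are used.
overridden = declare_entities(INTERNAL_DTD + EXTERNAL_DTD)

print(defaults)
print(overridden)
```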
93. How can I validate an XML document that uses XML namespaces?
When people ask this question, they usually assume that validity
is different for documents that use XML namespaces and documents
that don’t. In fact, it isn’t — it’s the same for both. Thus, there
is no difference between validating a document that uses XML namespaces
and validating one that doesn’t. In either case, you simply use
a validating parser or other software that performs validation.
94. If I start using XML namespaces, do I need to change
my existing DTDs?
Probably. If you want your XML documents to be both valid and conform
to the XML namespaces recommendation, you need to declare any xmlns
attributes and use the same qualified names in the DTD as in the
body of the document.
If your DTD contains element type and attribute names from a single
XML namespace, the easiest thing to do is to use your XML namespace
as the default XML namespace. To do this, declare the attribute
xmlns (no prefix) for each possible root element type. If you can
guarantee that the DTD is always read, set the default value in
each xmlns attribute declaration to the URI used as your namespace
name. Otherwise, declare your XML namespace as the default XML namespace
on the root element of each instance document.
If your DTD contains element type and attribute names from multiple
XML namespaces, you need to choose a single prefix for each XML
namespace and use these consistently in qualified names in both
the DTD and the body of each document. You also need to declare
your xmlns attributes in the DTD and declare your XML namespaces.
As in the single XML namespace case, the easiest way to do this
is to add xmlns attributes to each possible root element type and use
default values if possible.
95. How do I create documents that use XML namespaces?
The same as you create documents that don’t use XML namespaces.
If you’re currently using Notepad on Windows or emacs on Linux,
you can continue using Notepad or emacs. If you’re using an XML
editor that is not namespace-aware, you can also continue to use
that, as qualified names are legal names in XML documents and xmlns
attributes are legal attributes. And if you’re using an XML editor
that is namespace-aware, it will probably provide features such
as automatically declaring XML namespaces and keeping track of prefixes
and the default XML namespace for you.
96. How can I check that a document conforms to the XML
namespaces recommendation?
Unfortunately, I know of no software that only checks for conformance
to the XML namespaces recommendation. It is possible that some namespace-aware
validating parsers (such as those from DataChannel (Microsoft),
IBM, Oracle, or Sun) check XML namespace conformance as part of
parsing and validating. Thus, you might be able to run your document
through such parsers as a way of testing conformance.
Note that writing an application to check conformance to the XML
namespaces recommendation is not as easy as it might seem. The problem
is that most parsers do not make DTD information available to the
application, so it might not be possible to check conformance in
the DTD. Also note that writing a SAX 1.0 application that checks
conformance in the body of the document (as opposed to the DTD)
should be an easy thing to do.
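As a rough idea of what such a checker for the document body might look like, here is a sketch in Python using xml.sax with namespace processing switched off, so that the handler sees raw qualified names and xmlns attributes much as a SAX 1.0 application would. It checks only one rule (every prefix must be declared and in scope) and is not a complete conformance test. The class and function names are hypothetical.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class PrefixChecker(ContentHandler):
    """Collects qualified names whose prefix has no declaration in scope."""

    def __init__(self):
        super().__init__()
        # Stack of prefix -> URI mappings; the xml prefix is always bound.
        self.scopes = [{"xml": "http://www.w3.org/XML/1998/namespace"}]
        self.errors = []

    def startElement(self, name, attrs):
        # Apply the element's own declarations before checking any name.
        scope = dict(self.scopes[-1])
        for attr_name in attrs.getNames():
            if attr_name.startswith("xmlns:"):
                scope[attr_name[len("xmlns:"):]] = attrs.getValue(attr_name)
        self.scopes.append(scope)

        names = [name] + [n for n in attrs.getNames()
                          if not n.startswith("xmlns")]
        for qname in names:
            prefix, colon, _ = qname.partition(":")
            # An empty URI means the prefix was undeclared (v1.1 style).
            if colon and not scope.get(prefix):
                self.errors.append(qname)

    def endElement(self, name):
        # Declarations on this element go out of scope.
        self.scopes.pop()


def find_undeclared_prefixes(document):
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, False)
    checker = PrefixChecker()
    parser.setContentHandler(checker)
    parser.parse(io.BytesIO(document.encode("utf-8")))
    return checker.errors


print(find_undeclared_prefixes('<google:A google:B="error" />'))
```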
97. Can I use the same document with both namespace-aware
and namespace-unaware applications?
Yes.
This situation is quite common, such as when a namespace-aware application
is built on top of a namespace-unaware parser. Another common situation
is when you create an XML document with a namespace-unaware XML
editor but process it with a namespace-aware application.
Using the same document with both namespace-aware and namespace-unaware
applications is possible because XML namespaces use XML syntax.
That is, an XML document that uses XML namespaces is still an XML
document and is recognized as such by namespace-unaware software.
The only thing you need to be careful about when using the same
document with both namespace-aware and namespace-unaware applications
is when the namespace-unaware application requires the document
to be valid. In this case, you must be careful to construct your
document in a way that is both valid and conforms to the XML namespaces
recommendation. (It is possible to construct documents that conform
to the XML namespaces recommendation but are not valid and vice
versa.)
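A small sketch of this point, assuming the JAXP DOM parser bundled with Java: the same document is parsed twice, once with namespace processing switched off and once with it switched on. The class name AwareVsUnaware and the sample document are invented for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class.
final class AwareVsUnaware {

    /** Parses the document and returns its root element. */
    static Element parseRoot(String xml, boolean namespaceAware) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(namespaceAware);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<foo:A xmlns:foo=\"http://www.foo.org/\">abcd</foo:A>";

        // Namespace-unaware: the name is just the string "foo:A".
        Element unaware = parseRoot(xml, false);
        System.out.println(unaware.getNodeName() + " / " + unaware.getNamespaceURI());

        // Namespace-aware: the name is split into a URI and a local name.
        Element aware = parseRoot(xml, true);
        System.out.println(aware.getLocalName() + " / " + aware.getNamespaceURI());
    }
}
```

Both parses succeed on the same input; only the way the element name is reported differs.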
98. What software is needed to process XML namespaces?
From a document author’s perspective, this is generally not a relevant
question. Most XML documents are written in a specific XML language
and processed by an application that understands that language.
If the language uses an XML namespace, then the application will
already use that namespace — there is no need for any special XML
namespace software.
99. How do I use XML namespaces with Internet Explorer
5.0 and/or the MSXML parser?
WARNING! The following applies only to earlier versions of MSXML.
It does not apply to MSXML 4, which is the currently shipping version
[July, 2002].
An early version of the MSXML parser, which was shipped as part
of Internet Explorer 5.0, required that every XML namespace prefix
used in an element type or attribute declaration had to be “declared”
in the attribute declaration for that element type. This had to
be done with a fixed xmlns attribute declaration. For example, the
following was accepted by MSXML and both xmlns:google attributes
were required:
<!ELEMENT google:A (#PCDATA)>
<!ATTLIST google:A
xmlns:google CDATA #FIXED "http://www.google.org/">
<!ELEMENT google:B (#PCDATA)>
<!ATTLIST google:B
xmlns:google CDATA #FIXED "http://www.google.org/">
MSXML returned an error for the following because the second google
prefix was not “declared”:
<!ELEMENT google:A (#PCDATA)>
<!ATTLIST google:A
xmlns:google CDATA #FIXED "http://www.google.org/">
<!ELEMENT google:B (#PCDATA)>
The reason for this restriction was so that MSXML could use universal
names to match element type and attribute declarations to elements
and attributes during validation. Although this would have simplified
many of the problems of writing documents that are both valid and
conform to the XML namespaces recommendation, some users complained
about it because it was not part of the XML namespaces recommendation.
In response to these complaints, Microsoft removed this restriction
in later versions, which are now shipping. Ironically, the idea
was later independently derived as a way to resolve the problems
of validity and namespaces. However, it has not been implemented
by anyone.
100. How do applications process documents that use XML
namespaces?
Applications process documents that use XML namespaces in almost
exactly the same way they process documents that don’t use XML namespaces.
For example, if a namespace-unaware application adds a new sales
order to a database when it encounters a SalesOrder element, the
equivalent namespace-aware application does the same. The only difference
is that the namespace-aware application:
* Might need to check for xmlns attributes and parse qualified names.
Whether it does this depends on whether such processing is already
done by lower-level software, such as a namespace-aware DOM implementation.
* Uses universal (two-part) names instead of local (one-part) names.
For example, the namespace-aware application might add a new sales
order in response to an {http://www.google.com/ito/sales}SalesOrder
element instead of a SalesOrder element.
101. How do I use XML namespaces with SAX 1.0?
The easiest way to use XML namespaces with SAX 1.0 is to use John
Cowan’s Namespace SAX Filter (see http://www.ccil.org/~cowan/XML).
This is a SAX filter that keeps track of XML namespace declarations,
parses qualified names, and returns element type and attribute names
as universal names in the form:
URI^local-name
For example:
http://www.google.com/ito/sales^SalesOrder
Your application can then base its processing on these longer names.
For example, the code:
public void startElement(String elementName, AttributeList attrs)
    throws SAXException
{
    // ...
    if (elementName.equals("SalesOrder"))
    {
        // Add new database record.
    }
    // ...
}
might become:
public void startElement(String elementName, AttributeList attrs)
    throws SAXException
{
    // ...
    if (elementName.equals("http://www.google.com/ito/sales^SalesOrder"))
    {
        // Add new database record.
    }
    // ...
}
or:
public void startElement(String elementName, AttributeList attrs)
    throws SAXException
{
    // ...
    // getURI() and getLocalName() are utility functions
    // to parse universal names.
    if (getURI(elementName).equals("http://www.google.com/ito/sales"))
    {
        if (getLocalName(elementName).equals("SalesOrder"))
        {
            // Add new database record.
        }
    }
    // ...
}
If you do not want to use the Namespace SAX Filter, then you will
need to do the following in addition to identifying element types
and attributes by their universal names:
* In startElement, scan the attributes for XML namespace declarations
before doing any other processing. You will need to maintain a table
of current prefix-to-URI mappings (including a null prefix for the
default XML namespace).
* In startElement and endElement, check whether the element type
name includes a prefix. If so, use your mappings to map this prefix
to a URI. Depending on how your software works, you might also check
if the local part of the qualified name includes any colons, which
are illegal.
* In startElement, check whether attribute names include a prefix.
If so, process as in the previous point.
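The bookkeeping described in these steps can be sketched with org.xml.sax.helpers.NamespaceSupport. Note that this helper class was introduced with SAX 2.0, so using it alongside a SAX 1.0 handler is a substitution made here for brevity, not part of SAX 1.0 itself. The class name PrefixTable, the method toUniversalName, and the orders URI are assumptions for this example.

```java
import org.xml.sax.helpers.NamespaceSupport;

// Hypothetical helper built on the SAX 2.0 NamespaceSupport class.
final class PrefixTable {

    /**
     * Maps a qualified name to the URI^local-name form, or returns null
     * if the name uses a prefix that is not currently declared.
     */
    static String toUniversalName(NamespaceSupport scopes, String qName, boolean isAttribute) {
        String[] parts = scopes.processName(qName, new String[3], isAttribute);
        if (parts == null) {
            return null; // undeclared prefix
        }
        String uri = parts[0];       // empty string when the name is in no namespace
        String localName = parts[1];
        return uri.isEmpty() ? localName : uri + "^" + localName;
    }

    public static void main(String[] args) {
        NamespaceSupport scopes = new NamespaceSupport();
        scopes.pushContext(); // call at the top of startElement

        // Declarations found while scanning the element's attributes.
        // They must all be made before the first name is processed.
        scopes.declarePrefix("sales", "http://www.google.com/ito/sales");
        scopes.declarePrefix("", "http://www.google.com/ito/orders"); // default namespace

        System.out.println(toUniversalName(scopes, "sales:SalesOrder", false));
        System.out.println(toUniversalName(scopes, "Item", false)); // picks up the default
        System.out.println(toUniversalName(scopes, "id", true));    // attributes ignore the default
        System.out.println(toUniversalName(scopes, "x:Item", false));

        scopes.popContext(); // call at the end of endElement
    }
}
```

Pushing a context per element and popping it in endElement is what keeps each declaration scoped to the element that carries it.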
102. How do I use XML namespaces with SAX 2.0?
SAX 2.0 primarily supports XML namespaces through the following
methods:
* startElement and endElement in the ContentHandler interface
return namespace names (URIs) and local names as well as qualified
names.
* getValue, getType, and getIndex in the Attributes interface
can retrieve attribute information by namespace name (URI) and local
name as well as by qualified name.
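A minimal sketch of both points, assuming the JAXP SAX parser bundled with Java. The class name Sax2Names, the id attribute, and the sample document used in main are invented for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class.
final class Sax2Names {

    /** Lists every element as {uri}local-name, plus its "id" attribute if present. */
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // JAXP parsers are namespace-unaware by default
            SAXParser parser = factory.newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    // Look the attribute up by (namespace name, local name); unprefixed
                    // attributes are in no namespace, hence the empty URI.
                    String id = atts.getValue("", "id");
                    names.add("{" + uri + "}" + localName + (id == null ? "" : " id=" + id));
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<s:SalesOrder xmlns:s=\"http://www.google.com/ito/sales\">"
                   + "<Item id=\"7\"/></s:SalesOrder>";
        elementNames(xml).forEach(System.out::println);
    }
}
```

Elements that are in no namespace are reported with an empty string as the URI, so the application can treat both kinds of names uniformly.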
103. How do I use XML namespaces with DOM level 2?
For example, the following namespace-unaware code:
// Check the local name.
// getNodeName() is a DOM level 1 method.
if (elementNode.getNodeName().equals("SalesOrder"))
{
    // Add new database record.
}
might become the following namespace-aware code:
// Check the XML namespace name (URI).
// getNamespaceURI() is a DOM level 2 method. It returns null for
// elements that are not in a namespace, so compare from the constant.
String SALES_NS = "http://www.foo.com/ito/sales";
if (SALES_NS.equals(elementNode.getNamespaceURI()))
{
    // Check the local name.
    // getLocalName() is a DOM level 2 method.
    if (elementNode.getLocalName().equals("SalesOrder"))
    {
        // Add new database record.
    }
}
Note that, unlike SAX 2.0, DOM level 2 treats xmlns attributes
as normal attributes.
104. Can an application process documents that use XML
namespaces and documents that don’t use XML namespaces?
Yes.
This is a common situation for generic applications, such as editors,
browsers, and parsers, that are not wired to understand a particular
XML language. Such applications simply treat all element type and
attribute names as qualified names. Those names that are not mapped
to an XML namespace — that is, unprefixed element type names in
the absence of a default XML namespace and unprefixed attribute
names — are simply processed as one-part names, such as by using
a null XML namespace name (URI).
Note that such applications must decide how to treat documents that
do not conform to the XML namespaces recommendation. For example,
what should the application do if an element type name contains
a colon (thus implying the existence of a prefix), but there are
no XML namespace declarations in the document? The application can
choose to treat this as an error, or it can treat the document as
one that does not use XML namespaces, ignore the “error”,
and continue processing.
105. Can an application be both namespace-aware and namespace-unaware?
Yes.
However, there is generally no reason to do this. The reason is
that most applications understand a particular XML language, such
as one used to transfer sales orders between companies. If the element
type and attribute names in the language belong to an XML namespace,
the application must be namespace-aware; if not, the application
must be namespace-unaware.
For a few applications, being both namespace-aware and namespace-unaware
makes sense. For example, a parser might choose to redefine validity
in terms of universal names and have both namespace-aware and namespace-unaware
validation modes. However, such applications are uncommon.
106. What does a namespace-aware application do when it
encounters an error?
The XML namespaces recommendation does not specify what a namespace-aware
application does when it encounters a document that does not conform
to the recommendation. Therefore, the behavior is application-dependent.
For example, the application could stop processing, post an error
to a log and continue processing, or ignore the error.
PART III: NAMES, PREFIXES, AND URIs
107. What is a qualified name?
A qualified name is a name of the following form. It consists of
an optional prefix and colon, followed by the local part, which
is sometimes known as a local name.
prefix:local-part
–OR–
local-part
For example, both of the following are qualified names. The first
name has a prefix of serv; the second name does not have a prefix.
For both names, the local part (local name) is Address.
serv:Address
Address
In most circumstances, qualified names are mapped to universal
names.
108. What characters are allowed in a qualified name?
The prefix can contain any character that is allowed in the Name
[5] production in XML 1.0 except a colon. The same is true of the
local name. Thus, there can be at most one colon in a qualified
name — the colon used to separate the prefix from the local name.
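The "at most one colon" rule can be sketched as a small splitting routine. This is only an illustration: the class name QualifiedNames and the method split are made up, and the code checks the colon rule only, not the full XML Name production.

```java
// Hypothetical helper; checks the colon rule only, not the full Name production.
final class QualifiedNames {

    /**
     * Splits a qualified name into {prefix, localPart}. The prefix is the
     * empty string when the name has no prefix.
     */
    static String[] split(String qName) {
        int colon = qName.indexOf(':');
        if (colon < 0) {
            return new String[] {"", qName};
        }
        String prefix = qName.substring(0, colon);
        String localPart = qName.substring(colon + 1);
        // Neither part may be empty, and the local part may not contain a colon.
        if (prefix.isEmpty() || localPart.isEmpty() || localPart.indexOf(':') >= 0) {
            throw new IllegalArgumentException("Not a qualified name: " + qName);
        }
        return new String[] {prefix, localPart};
    }

    public static void main(String[] args) {
        System.out.println(String.join(" | ", split("serv:Address")));
        System.out.println(String.join(" | ", split("Address")));
    }
}
```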
109. Where can qualified names appear?
Qualified names can appear anywhere an element type or attribute
name can appear: in start and end tags, as the document element
type, and in element type and attribute declarations in the DTD.
For example:
<!DOCTYPE foo:A [
<!ELEMENT foo:A (foo:B)>
<!ATTLIST foo:A
foo:C CDATA #IMPLIED>
<!ELEMENT foo:B (#PCDATA)>
]>
<foo:A xmlns:foo="http://www.foo.org/" foo:C="bar">
<foo:B>abcd</foo:B>
</foo:A>
Qualified names cannot appear as entity names, notation names,
or processing instruction targets.
110. Can qualified names be used in attribute values?
Yes, but they have no special significance. That is, they are not
necessarily recognized as such and mapped to universal names. For
example, the value of the C attribute in the following is the string
“foo:D”, not the universal name {http://www.foo.org/}D.
<foo:A xmlns:foo="http://www.foo.org/">
<foo:B C="foo:D"/>
</foo:A>
In spite of this, there is nothing to stop an application from
recognizing a qualified name in an attribute value and processing
it as such. This is being done in various technologies today. For
example, in the following XML Schemas definition, the attribute
value xsd:string identifies the type of the foo attribute as the
universal name {http://www.w3.org/1999/XMLSchema}string.
<xsd:attribute name="foo" type="xsd:string" />
There are two potential problems with this. First, the application
must be able to retrieve the prefix mappings currently in effect.
Fortunately, both SAX 2.0 and DOM level 2 support this capability.
Second, any general purpose transformation tool, such as one that
writes an XML document in canonical form and changes namespace prefixes
in the process, will not recognize qualified names in attribute
values and therefore not transform them correctly. Although this
may be solved in the future by the introduction of the QName (qualified
name) data type in XML Schemas, it is a problem today.
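As a sketch of how an application might do this itself, the code below resolves a prefixed attribute value against the prefixes in scope. It relies on Node.lookupNamespaceURI(), which is a DOM level 3 method (later than the DOM level 2 support mentioned above) and is available in the DOM that ships with Java. The class name AttributeQNames and its methods are assumptions for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class.
final class AttributeQNames {

    /**
     * Resolves a prefixed attribute value such as "foo:D" to {uri}local-name.
     * Returns null if the prefix is not declared in scope.
     */
    static String resolve(Element element, String attributeName) {
        String value = element.getAttribute(attributeName);
        int colon = value.indexOf(':');
        if (colon < 0) {
            return value; // no prefix; how to treat this is application-specific
        }
        // lookupNamespaceURI() is a DOM level 3 method.
        String uri = element.lookupNamespaceURI(value.substring(0, colon));
        return uri == null ? null : "{" + uri + "}" + value.substring(colon + 1);
    }

    /** Parses the document with namespace processing enabled. */
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element a = parseRoot(
                "<foo:A xmlns:foo=\"http://www.foo.org/\"><foo:B C=\"foo:D\"/></foo:A>");
        Element b = (Element) a.getFirstChild();
        System.out.println(resolve(b, "C"));
    }
}
```

The parser itself still reports the attribute value as the plain string foo:D; the mapping to a universal name is entirely the application's decision.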
111. How are qualified names mapped to names in XML namespaces?
If a qualified name in the body of a document (as opposed to the
DTD) includes a prefix, then that prefix is used to map the local
part of the qualified name to a universal name — that is, a name
in an XML namespace. For example, in the following, the prefix foo
is used to map the local names A, B, and C to names in the http://www.foo.org/
namespace:
<?xml version="1.0" ?>
<foo:A xmlns:foo="http://www.foo.org/" foo:C="bar">
<foo:B>abcd</foo:B>
</foo:A>
If a qualified name in the body of a document does not include
a prefix and a default XML namespace is in scope then one of two
things happens. If the name is used as an element tag, it is mapped
to a name in the default XML namespace. If it is used as an attribute
name, it is not in any XML namespace. For example, in the following,
A and B are in the http://www.foo.org/ namespace and C is not in
any XML namespace:
<?xml version="1.0" ?>
<A xmlns="http://www.foo.org/" C="bar">
<B>abcd</B>
</A>
If a qualified name in the body of a document does not include
a prefix and no default XML namespace is in scope, then that name
is not in any XML namespace. For example, in the following, A, B,
and C are not in any XML namespace:
<?xml version="1.0" ?>
<A C="bar">
<B>abcd</B>
</A>
Qualified names in the DTD are never mapped to names in an XML
namespace because they are never in the scope of an XML namespace
declaration.
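The default-namespace case above (elements pick up the default XML namespace, unprefixed attributes do not) can be observed with a namespace-aware DOM parse. This is a sketch assuming the JAXP parser bundled with Java; the class name DefaultNamespaceScope is made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class.
final class DefaultNamespaceScope {

    /** Parses the document with namespace processing enabled. */
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element a = parseRoot("<A xmlns=\"http://www.foo.org/\" C=\"bar\"><B>abcd</B></A>");
        Element b = (Element) a.getFirstChild();

        // Elements A and B inherit the default XML namespace.
        System.out.println("A: " + a.getNamespaceURI());
        System.out.println("B: " + b.getNamespaceURI());

        // The unprefixed attribute C is not in any XML namespace.
        System.out.println("C: " + a.getAttributeNode("C").getNamespaceURI());
    }
}
```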
112. How are universal names represented?
There is no standard way to represent a universal name. However,
three representations are common.
The first representation keeps the XML namespace name (URI) and
the local name separate. For example, many DOM level 1 implementations
have different methods for returning the XML namespace name (URI)
and the local name of an element or attribute node.
The second representation concatenates the namespace name (URI)
and the local name with a caret (^). The result is a universally unique
name, since carets are not allowed in URIs or local names. This
is the method used by John Cowan’s Namespace SAX Filter. For example,
the universal name that has the URI http://www.foo.com/ito/servers
and the local name Address would be represented as:
http://www.foo.com/ito/servers^Address
The third representation places the XML namespace name (URI) in
braces and concatenates this with the local name. This notation
is suggested only for documentation and I am aware of no code that
uses it. For example, the above name would be represented as:
{http://www.foo.com/ito/servers}Address
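For readers working in Java, the representations can be sketched with javax.xml.namespace.QName, a standard class that keeps the URI and local name separate and whose toString() happens to produce the brace form. The wrapper class UniversalNames and its caretForm method are invented here for illustration.

```java
import javax.xml.namespace.QName;

// Hypothetical helper around the standard QName class.
final class UniversalNames {

    /** The caret form: URI^local-name. */
    static String caretForm(QName name) {
        return name.getNamespaceURI() + "^" + name.getLocalPart();
    }

    /** The brace form: {URI}local-name, as produced by QName.toString(). */
    static String braceForm(QName name) {
        return name.toString();
    }

    public static void main(String[] args) {
        // First representation: URI and local name held separately.
        QName address = new QName("http://www.foo.com/ito/servers", "Address");
        System.out.println(address.getNamespaceURI() + " + " + address.getLocalPart());

        System.out.println(caretForm(address));
        System.out.println(braceForm(address));

        // The brace form can be parsed back into an equal QName.
        System.out.println(QName.valueOf(braceForm(address)).equals(address));
    }
}
```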
113. Are universal names universally unique?
No, but it is reasonable to assume they are.
Universal element type and attribute names are not guaranteed to
be universally unique — that is, unique within the space of all
XML documents — because it is possible for two different people,
each defining their own XML namespace, to use the same URI and the
same element type or attribute name. However, this occurs only if:
* One or both people use a URI that is not under their control,
such as somebody outside Netscape using the URI http://www.netscape.com/,
or
* Both people have control over a URI and both use it.
The first case means somebody is cheating when assigning URIs (a
process governed by trust) and the second case means that two people
within an organization are not paying attention to each other’s
work. For widely published element type and attribute names, neither
case is very likely. Thus, it is reasonable to assume that universal
names are universally unique. (Since both cases are possible, applications
that present security risks should be careful about assuming that
universal names are universally unique.)
For information about the ability of universal names to uniquely
identify element types and attributes (as opposed to the names themselves
being unique).
114. What is an XML namespace prefix?
An XML namespace prefix is a prefix used to specify that a local
element type or attribute name is in a particular XML namespace.
For example, in the following, the serv prefix specifies that the
Addresses element type name is in the http://www.foo.com/ito/addresses
namespace:
<serv:Addresses xmlns:serv="http://www.foo.com/ito/addresses">
115. What characters are allowed in an XML namespace prefix?
The prefix can contain any character that is allowed in the Name
[5] production in XML 1.0 except a colon.
116. Can I use the same prefix for more than one XML namespace?
Yes.
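One way this can happen is when a nested element redeclares a prefix that an ancestor already uses, which overrides the outer declaration for the scope of the inner element. The sketch below models two nested elements with org.xml.sax.helpers.NamespaceSupport; the class name PrefixScopes and both URIs are made up for the example.

```java
import org.xml.sax.helpers.NamespaceSupport;

// Hypothetical demo of one prefix bound to two namespaces in nested scopes.
final class PrefixScopes {

    /** Returns the URI bound to "foo" in the outer scope, the inner scope, and after leaving it. */
    static String[] demo() {
        NamespaceSupport scopes = new NamespaceSupport();

        scopes.pushContext(); // outer element: xmlns:foo="http://www.foo.org/"
        scopes.declarePrefix("foo", "http://www.foo.org/");
        String outer = scopes.getURI("foo");

        scopes.pushContext(); // inner element: xmlns:foo="http://www.bar.org/"
        scopes.declarePrefix("foo", "http://www.bar.org/");
        String inner = scopes.getURI("foo");

        scopes.popContext(); // end of inner element: the outer binding is back in effect
        String afterInner = scopes.getURI("foo");

        scopes.popContext();
        return new String[] {outer, inner, afterInner};
    }

    public static void main(String[] args) {
        for (String uri : demo()) {
            System.out.println(uri);
        }
    }
}
```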
117. What happens if there is no prefix on an element type
name?
If a default XML namespace declaration is in scope, then the element
type name is in the default XML namespace. Otherwise, the element
type name is not in any XML namespace.
118. What does the URI used as an XML namespace name point
to?
The URI used as an XML namespace name is simply an identifier.
It is not guaranteed to point to anything and, in general, it is
a bad idea to assume that it does. This point causes a lot of confusion,
so we’ll repeat it here:
URIs USED AS XML NAMESPACE NAMES ARE JUST IDENTIFIERS. THEY ARE
NOT GUARANTEED TO POINT TO ANYTHING.
While this might be confusing when URLs are used as namespace names,
it is obvious when other types of URIs are used as namespace names.
For example, the following namespace declaration uses an ISBN URN:
xmlns:xbe="urn:ISBN:0-7897-2504-5"
and the following namespace declaration uses a UUID URN:
xmlns:foo="urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6"
Clearly, neither namespace name points to anything on the Web.
NOTE: Namespace URIs that are URLs may point to RDDL documents,
although this does not appear to be widely implemented. For details,
see the next question.
NOTE: An early version of the W3C’s XML Schemas used namespace URIs
to point to an XML Schema document containing the definitions of
the element types and attributes named in the namespace. However,
this proved very controversial and the idea has been withdrawn.
119. What is an XML namespace name?
An XML namespace name is a URI that uniquely identifies the namespace.
URIs are used because they are widely understood and well documented.
Because people may only allocate URIs under their control, it is
easy to ensure that no two XML namespaces are identified by the
same URI.
120. Can I resolve the URI used as an XML namespace name?
Yes.
121. Can I use a relative URI as a namespace name?
Yes. However, such usage is deprecated, so you should never do
it.
122. What is XPointer?
XPointer is a set of recommendations developed by the W3C. The core
recommendation is the XPointer Framework, which provides an extensible
addressing behavior for fragment identifiers in XML media types.
XPointer gains its extensibility through the XPointer Framework,
which identifies the syntax and processing architecture for XPointer
expressions and through an extensible set of XPointer addressing
schemes. These schemes, e.g., element() or xpointer(), are actually
QNames. The xmlns() scheme makes it possible for an XPointer to
declare namespace bindings and thereby use third-party schemes as
readily as W3C defined XPointer schemes.
123. How do I install the XPointer processor?
Download the latest “cweb-xpointer” release from SourceForge.
This project uses Apache Maven and Java 1.4+, so you will need to
install those as well. Normally you will also want to download one
of the XPointer Framework integrations, such as the xpointer+dom4j
or the xpointer+jdom package. These “integration packages”
provide support for a specific XML Document model.
The project dependencies are explicitly declared in the Maven POM.
This means that Maven can automagically download the required releases
of dependent JARs.
There are several release artifacts. The “uberjar” release
provides an executable command line utility (see below) and bundles
all dependencies (except for Java itself). If you want to integrate
into an existing application, then you should use the cweb-xpointer
JAR and also download copies of its dependencies. If you are using
a Maven project, then this is all very, very easy.
124. What is server-side XPointer?
The XPointer Framework provides an authoritative and extensible
interpretation of the semantics of fragment identifiers for XML
media types. However, HTTP does NOT transmit the fragment identifier
as part of the HTTP request. Therefore XPointer is generally applied
by the client, not by the server.
For example, assuming that http://www.myorg.org/myTripleStore identifies
a resource that is willing to negotiate for RDF/XML, then the following
is typical of an HTTP request for an RDF/XML representation of that
resource and the server’s response.
Request:
GET /myTripleStore HTTP/1.1
Host: www.myorg.org
Accept: application/rdf+xml
Response:
HTTP/1.1 200 OK
Content-Type: application/rdf+xml
<rdf:RDF />
This request asks for the entire triple store, serialized as RDF/XML.
Server-side XPointer uses the HTTP “Range” header to transmit
the XPointer expression to the server. For example, let’s assume
that the URI of the triple store is the same, but we want to select
the subresources identified by the following RDQL query:
SELECT (?x foaf:mbox ?mbox)
WHERE (?x foaf:name "John Smith") (?x foaf:mbox ?mbox)
USING foaf FOR <http://xmlns.com/foaf/0.1/>
In that case the HTTP request, including a copy of the RDQL query
wrapped up as an XPointer expression, looks as follows. Note that
we have added a range-unit whose value is xpointer to indicate that
the value of the Range header should be interpreted by an XPointer
processor. Also note the use of the XPointer xmlns() scheme to
bind the namespace URI for the rdql() XPointer scheme. This is necessary
since this scheme has not been standardized by the W3C.
GET /myTripleStore HTTP/1.1
Host: www.myorg.org
Accept: application/rdf+xml
Range: xpointer = xmlns(x:http://www.mindswap.org)x:rdql(
SELECT (?x foaf:mbox ?mbox)
WHERE (?x foaf:name "John Smith") (?x foaf:mbox ?mbox)
USING foaf FOR <http://xmlns.com/foaf/0.1/>
)
The response looks as follows. The HTTP 206 (Partial Content) status
code is used to indicate that the server recognized and processed
the Range header and that the response entity includes only the
identified logical range of the addressed resource.
HTTP/1.1 206 Partial Content
Content-Type: application/rdf+xml
<!-- Only the selected sub-graph is transmitted to the client. -->
<rdf:RDF />
125. What about non-XML resources?
You can use the XPointer Framework with non-XML resources. This
is especially effective when your resource is backed by some kind
of a DBMS, or when you want to query a data model, such as RDF,
and not the XML syntax of a representation of that data model.
However, please note that the authoritative interpretation of the
fragment identifier is determined by the Internet Media Type. If
you want to opt in to XPointer, then you can always create and register
your own Internet Media Type with IANA and specify that it supports
the XPointer Framework for some kind of non-XML resource. In this
case, you are going to need to declare your own XPointer schemes
as well.
126. What XPointer schemes are supported in this release?
The XPointer integration distributions support shorthand pointers.
In addition, they bundle support for at least the following XPointer
schemes:
* xmlns()
* element()
* xpath() - This is not a W3C defined XPointer scheme since W3C
has not published an XPointer scheme for XPath. The namespace URI
for this scheme is http://www.cogweb.org/xml/namespace/xpointer.
It provides for addressing XML subresources using XPath 1.0
expressions.
127. How do I configure an XPointer processor?
There is no required configuration for the XPointer Framework.
The uberjar command line utility provides some configuration options.
Applications configure individual XPointer processors when they
obtain an instance from an appropriate XPointerProcessor factory
method.
128. How do I integrate XPointer into my application?
There are several ways to do this. The easiest is to use the uberjar
release, which can be directly executed on any Java enabled platform.
This makes it trivial to test and develop XPointer support in your
applications, including server-side XPointer. The uberjar release
contains a Java class org.CognitiveWeb.xpointer.XPointerDriver that
provides a simple but flexible command line utility that exposes
an XPointer processor. The XPointer is provided as a command line
argument and the XML resource is read from stdin. The results are
written on stdout by default as a set of null-terminated XML fragments.
See XPointerDriver in the XPointer JavaDoc for more information.
If you already have a Java application, then it is straightforward
to integrate XPointer support using: org.CognitiveWeb.xpointer.XPointerProcessor
You can see an example integration by looking at the XPointerDriver
in the source code release.
129. How do I implement an application-specific XPointer
scheme?
Short answer: Implement org.CognitiveWeb.xpointer.ISchemeProcessor
The XPointer Framework is extensible. One of the very coolest things
about this is that you can develop your own XPointer schemes that
expose your application using the data model that makes the most
sense for your application clients.
For example, let’s say that you have a CRM application. The important
logical addressing units probably deal with concepts such as customers,
channels, and products. You can directly expose these data using
a logical addressing scheme independent of the actual XML data model.
Not only does this let people directly address the relevant concepts
using a purpose-built addressing vocabulary, but this means that
your addressing scheme can remain valid even if you change or version
your XML data model. What a bonus!
The same approach is being used by the MindSwap laboratory at the
University of Maryland to prototype a variety of XPointer schemes
for addressing semantic web data.
130. How do I support very large resources?
You can only do this with server-side XPointer. Further, you need
to use (or implement) XPointer schemes that do not depend on a parsed
XML document model. Basically, you need to use an XPointer scheme
that interfaces with an indexed persistence store (RDBMS, ODBMS,
or XML DBMS) which exposes to your ISchemeProcessor the information
that it needs to answer subresource addressing requests.
You will also have to provide shorthand pointer support for your
DBMS-based resource. The default shorthand pointer processor assumes
that it has access to a parsed XML document, so it can’t be used
when you have a very large XML resource.
131. How do I contribute?
The XPointer implementation is hosted as a SourceForge project.
If you want to contribute, send an email to one of the project administrators
from the project home page.
The XPointer module uses numerous tests to validate correct behavior
of the XPointer processor. One valuable way to contribute is by
developing new tests that demonstrate broken behavior. Patches that
fix the problems identified by those tests are also valuable, but
it is by the tests themselves that we can ensure that each release
of the XPointer processor will continue to meet the requirements
of the various XPointer specifications.
132. What’s XLink?
This specification defines the XML Linking Language (XLink), which
allows elements to be inserted into XML documents in order to create
and describe links between resources. It uses XML syntax to create
structures that can describe links similar to the simple unidirectional
hyperlinks of today’s HTML, as well as more sophisticated links.
[Definition: An XLink link is an explicit relationship between resources
or portions of resources.] [Definition: It is made explicit by an
XLink linking element, which is an XLink-conforming XML element
that asserts the existence of a link.] There are six XLink elements;
only two of them are considered linking elements. The others provide
various pieces of information that describe the characteristics
of a link. (The term “link” as used in this specification
refers only to an XLink link, though nothing prevents non-XLink
constructs from serving as links.)
133. What are the valid values for xlink:actuate and xlink:show?
Don’t blame me for putting such a simple question here. I saw a famous
exam simulator give the wrong answer to this one, and typing the values
out also helps me remember them.
xlink:actuate: onRequest, onLoad, other, none
xlink:show: replace, new, embed, other, none
134. Mock question: What is the correct answer of the following
question? Which of the following is true about XLink and HTML hyperlinks?
1. XLink can be attached with any element. Hyperlinks in HTML can
be attached to only an ANCHOR <A> element.
2. XLink can refer to a specific location in XML document by name
or context with the help of XPointer. HTML ANCHOR<A> does
not have capability to point to specific location within an html
document.
3. XLink / XML links can be multidirectional. HTML links are unidirectional.
4. HTML links are activated when user clicks on them. XLink has
option of activating automatically when XML document is processed.
Only 2 is incorrect, since HTML ANCHOR does have capability to point
to specific location within an html document.
135. What three essential components of security does the
XML Signatures provide?
Authentication, message integrity, and non-repudiation. In addition
to signature information, an XML Signature can also contain information
describing the key used to sign the content.
136. XLink Processing and Conformance
Processing Dependencies: XLink processing depends on [XML], [XML
Names], [XML Base], and [IETF RFC 2396]
Markup Conformance:
An XML element conforms to XLink if:
it has a type attribute from the XLink namespace whose value is
one of “simple”, “extended”, “locator”,
“arc”, “resource”, “title”, or “none”,
and
it adheres to the conformance constraints imposed by the chosen
XLink element type, as prescribed in this specification.
This specification imposes no particular constraints on DTDs; conformance
applies only to elements and attributes.
Application Conformance:
An XLink application is any software module that interprets well-formed
XML documents containing XLink elements and attributes, or XML information
sets [XIS] containing information items and properties corresponding
to XLink elements and attributes. (This document refers to elements
and attributes, but all specifications herein apply to their information
set equivalents as well.) Such an application is conforming if:
it observes the mandatory conditions for applications (“must”)
set forth in this specification, and
for any optional conditions (“should” and “may”)
it chooses to observe, it observes them in the way prescribed, and
it performs markup conformance testing according to all the conformance
constraints appearing in this specification.
137. XLink Markup Design
Link markup needs to be recognized reliably by XLink applications
in order to be traversed and handled properly. XLink uses the mechanism
described in the Namespaces in XML Recommendation [XML Names] to
accomplish recognition of the constructs in the XLink
XML Interview Questions List
Give some examples of XML DTDs or schemas that you
have worked with.
Although XML does not require data to be validated against a DTD,
many of the benefits of using the technology are derived from being
able to validate XML documents against business or technical architecture
rules. Polling for the list of DTDs that developers have worked
with provides insight to their general exposure to the technology.
The ideal candidate will have knowledge of several of the commonly
used DTDs such as FpML, DocBook, HRML, and RDF, as well as experience
designing a custom DTD for a particular project where no standard
existed.
18. Using XSLT, how would you extract a specific attribute
from an element in an XML document?
Successful candidates should recognize this as one of the most
basic applications of XSLT. If they are not able to construct a
reply similar to the example below, they should at least be able
to identify the components necessary for this operation: xsl:template
to match the appropriate XML element, xsl:value-of to select the
attribute value, and the optional xsl:apply-templates to continue
processing the document.
Extract Attributes from XML Data
Example 1.
<xsl:template match="element-name">
  Attribute Value:
  <xsl:value-of select="@attribute"/>
  <xsl:apply-templates/>
</xsl:template>
19. When constructing an XML DTD, how do you create an
external entity reference in an attribute value?
Every interview session should have at least one trick question.
Although possible when using SGML, XML DTDs don’t support defining
external entity references in attribute values. It’s more important
for the candidate to respond to this question in a logical way than
to know the somewhat obscure answer.
20. How would you build a search engine for large volumes
of XML data?
The way candidates answer this question may provide insight into
their view of XML data. For those who view XML primarily as a way
to denote structure for text files, a common answer is to build
a full-text search and handle the data similarly to the way Internet
portals handle HTML pages. Others consider XML as a standard way
of transferring structured data between disparate systems. These
candidates often describe some scheme of importing XML into a relational
or object database and relying on the database’s engine for searching.
Lastly, candidates who have worked with vendors specializing in
this area often say that the best way to handle this situation
is to use a third-party software package optimized for XML data.
21. What is the difference between XML and C or C++ or
Java?
C and C++ (and other languages like FORTRAN, or Pascal, or Visual
Basic, or Java or hundreds more) are programming languages with
which you specify calculations, actions, and decisions to be carried
out in order:
mod curconfig[if left(date,6) = "01-Apr",
    t.put "April Fool!",
    f.put days('31102005','DDMMYYYY') -
          days(sdate,'DDMMYYYY')
          " more shopping days to Samhain"];
XML is a markup specification language with which you can design
ways of describing information (text or data), usually for storage,
transmission, or processing by a program. It says nothing about
what you should do with the data (although your choice of element
names may hint at what they are for):
<part num="DA42" models="LS AR DF HG KJ"
      update="2001-11-22">
  <name>Camshaft end bearing retention circlip</name>
  <image drawing="RR98-dh37" type="SVG" x="476"
         y="226"/> <maker id="RQ778">Ringtown
    Fasteners Ltd</maker>
  <notes>Angle-nosed insertion tool <tool
    id="GH25"/> is required for the removal
    and replacement of this part.</notes>
</part>
On its own, an SGML or XML file (including HTML) doesn’t do anything.
It’s a data format which just sits there until you run a program
which does something with it.
22. Does XML replace HTML?
No. XML itself does not replace HTML. Instead, it provides an alternative
which allows you to define your own set of markup elements. HTML
is expected to remain in common use for some time to come, and the
current version of HTML is in XML syntax. XML is designed to make
the writing of DTDs much simpler than with full SGML. (See the question
on DTDs for what one is and why you might want one.)
23. Do I have to know HTML or SGML before I learn XML?
No, although it’s useful because a lot of XML terminology and practice
derives from two decades’ experience of SGML.
Be aware that ‘knowing HTML’ is not the same as ‘understanding
SGML’. Although HTML was written as an SGML application, browsers
ignore most of it (which is why so many useful things don’t work),
so just because something is done a certain way in HTML browsers
does not mean it’s correct, least of all in XML.
24. What does an XML document actually look like (inside)?
The basic structure of XML is similar to other applications of
SGML, including HTML. The basic components can be seen in the following
examples. An XML document starts with a Prolog:
1. The XML Declaration which specifies that this is an XML document;
2. Optionally a Document Type Declaration which identifies the type
of document and says where the Document Type Definition (DTD) is
stored;
The Prolog is followed by the document instance:
1. A root element, which is the outermost (top level) element (start-tag
plus end-tag) which encloses everything else: in the examples below
the root elements are conversation and titlepage;
2. A structured mix of descriptive or prescriptive elements enclosing
the character data content (text), and optionally any attributes
(‘name=value’ pairs) inside some start-tags.
XML documents can be very simple, with straightforward nested markup
of your own design:
<?xml version="1.0" standalone="yes"?>
<conversation>
  <greeting>Hello, world!</greeting>
  <response>Stop the planet, I want to get
    off!</response>
</conversation>
Or they can be more complicated, with a Schema or
Document Type Definition (DTD) or internal subset (local DTD changes
in [square brackets]), and an arbitrarily complex nested structure:
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE titlepage
  SYSTEM "http://www.google.bar/dtds/typo.dtd"
  [<!ENTITY % active.links "INCLUDE">]>
<titlepage id="BG12273624">
  <white-space type="vertical" amount="36"/>
  <title font="Baskerville" alignment="centered"
         size="24/30">Hello, world!</title>
  <white-space type="vertical" amount="12"/>
  <!-- In some copies the following
       decoration is hand-colored, presumably
       by the author -->
  <image location="http://www.google.bar/fleuron.eps"
         type="URI" alignment="centered"/>
  <white-space type="vertical" amount="24"/>
  <author font="Baskerville" size="18/22"
          style="italic">Vitam capias</author>
  <white-space type="vertical" role="filler"/>
</titlepage>
Or they can be anywhere between: a lot will depend on how you want
to define your document type (or whose you use) and what it will
be used for. Database-generated or program-generated XML documents
used in e-commerce are usually unformatted (not for human reading)
and may use very long names or values, with multiple redundancy
and sometimes no character data content at all, just values in attributes:
<?xml version="1.0"?>
<ORDER-UPDATE AUTHMD5="4baf7d7cff5faa3ce67acf66ccda8248"
  ORDER-UPDATE-ISSUE="193E22C2-EAF3-11D9-9736-CAFC705A30B3"
  ORDER-UPDATE-DATE="2005-07-01T15:34:22.46"
  ORDER-UPDATE-DESTINATION="6B197E02-EAF3-11D9-85D5-997710D9978F"
  ORDER-UPDATE-ORDERNO="8316ADEA-EAF3-11D9-9955-D289ECBC99F3">
  <ORDER-UPDATE-DELTA-MODIFICATION-DETAIL ORDER-UPDATE-ID="BAC352437484">
    <ORDER-UPDATE-DELTA-MODIFICATION-VALUE ORDER-UPDATE-ITEM="56"
      ORDER-UPDATE-QUANTITY="2000"/>
  </ORDER-UPDATE-DELTA-MODIFICATION-DETAIL>
</ORDER-UPDATE>
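The parts described above (root element, nested elements, character data, attributes) can be picked out with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module, which is an assumption of this example rather than something the answer prescribes. It reads the simple conversation document and one attribute-only element from the order update.

```python
import xml.etree.ElementTree as ET

# The simple document from the answer above.
SIMPLE = """<?xml version="1.0" standalone="yes"?>
<conversation>
  <greeting>Hello, world!</greeting>
  <response>Stop the planet, I want to get off!</response>
</conversation>"""

root = ET.fromstring(SIMPLE)
root_name = root.tag                          # the outermost (root) element
child_names = [child.tag for child in root]   # elements nested directly inside it
greeting_text = root.find("greeting").text    # character data content

# An element with no character data at all, only values in attributes.
ATTRIBUTE_ONLY = ('<ORDER-UPDATE-DELTA-MODIFICATION-VALUE '
                  'ORDER-UPDATE-ITEM="56" ORDER-UPDATE-QUANTITY="2000"/>')
item = ET.fromstring(ATTRIBUTE_ONLY)
item_text = item.text                         # None: the element is empty
item_quantity = item.get("ORDER-UPDATE-QUANTITY")

print(root_name, child_names, greeting_text)
print(item_text, item_quantity)
```

Attribute values always arrive as strings, so converting "2000" to a number is left to the application.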
25. How does XML handle white-space in my documents?
All white-space, including linebreaks, TAB characters, and normal
spaces, even between ‘structural’ elements where no
text can ever appear, is passed by the parser unchanged to the application
(browser, formatter, viewer, converter, etc), identifying the context
in which the white-space was found (element content, data content,
or mixed content, if this information is available to the parser,
eg from a DTD or Schema). This means it is the application’s responsibility
to decide what to do with such space, not the parser’s:
* insignificant white-space between structural elements (space which
occurs where only element content is allowed, ie between other elements,
where text data never occurs) will get passed to the application
(in SGML this white-space gets suppressed, which is why you can
put all that extra space in HTML documents and not worry about it)
* significant white-space (space which occurs within elements which
can contain text and markup mixed together, usually mixed content
or PCDATA) will still get passed to the application exactly as under
SGML. It is the application’s responsibility to handle it correctly.
The parser must inform the application that white-space has occurred
in element content, if it can detect it. (Users of SGML will recognize
that this information is not in the ESIS, but it is in the Grove.)
<chapter>
  <title>
    My title for
    Chapter 1.
  </title>
  <para>
    text
  </para>
</chapter>
In the example above, the application will receive all the pretty-printing
linebreaks, TABs, and spaces between the elements as well as those
embedded in the chapter title. It is the function of the application,
not the parser, to decide which type of white-space to discard and
which to retain. Many XML applications have configurable options
to allow programmers or users to control how such white-space is
handled.
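To see this behaviour in practice, here is a small sketch that parses the chapter example with Python's standard xml.etree.ElementTree module (chosen here only for illustration; the two-space indentation is an assumption about how the example was laid out). The parser delivers the white-space untouched, and collapsing it is done by the application code.

```python
import xml.etree.ElementTree as ET

DOC = """<chapter>
  <title>
    My title for
    Chapter 1.
  </title>
  <para>
    text
  </para>
</chapter>"""

root = ET.fromstring(DOC)

# White-space between <chapter> and <title> is passed through as-is.
between_elements = root.text

# The title text still contains its line breaks and indentation.
raw_title = root.find("title").text

# Collapsing runs of white-space is the application's decision, not the parser's.
normalized_title = " ".join(raw_title.split())

print(repr(between_elements))
print(repr(raw_title))
print(normalized_title)
```

Whether to normalise the way the last step does depends on the document type: for a title it is usually harmless, for preformatted text it would destroy information.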
26. Which parts of an XML document are case-sensitive?
All of it, both markup and text. This is significantly different
from HTML and most other SGML applications. It was done to allow
markup in non-Latin-alphabet languages, and to obviate problems
with case-folding in writing systems which are caseless.
* Element type names are case-sensitive: you must follow whatever
combination of upper- or lower-case you use to define them (either
by first usage or in a DTD or Schema). So you can’t say <BODY>…</body>:
upper- and lower-case must match; thus <Img/>, <IMG/>,
and <img/> are three different element types;
* For well-formed XML documents with no DTD, the first occurrence
of an element type name defines the casing;
* Attribute names are also case-sensitive, for example the two width
attributes in <PIC width="7in"/> and <PIC WIDTH="6in"/>
(if they occurred in the same file) are separate attributes, because
of the different case of width and WIDTH;
* Attribute values are also case-sensitive. CDATA values (eg Url="MyFile.SGML")
always have been, but NAME types (ID and IDREF attributes, and token
list attributes) are now case-sensitive as well;
* All general and parameter entity names (eg &Aacute;), and your
data content (text), are case-sensitive as always.
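The rules above are easy to check with a parser. This sketch uses Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the wrapper element r are made up for the illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Start-tag and end-tag must match in case.
matching = is_well_formed("<body>text</body>")
mismatched = is_well_formed("<BODY>text</body>")

# width and WIDTH are different attribute names, so both may sit on one element.
pic = ET.fromstring('<PIC width="7in" WIDTH="6in"/>')
attribute_names = sorted(pic.attrib)

# <Img/>, <IMG/> and <img/> are three different element types.
root = ET.fromstring("<r><Img/><IMG/><img/></r>")
distinct_types = {child.tag for child in root}

print(matching, mismatched)
print(attribute_names)
print(len(distinct_types))
```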
27. How can I make my existing HTML files work in XML?
Either convert them to conform to some new document type (with
or without a DTD or Schema) and write a stylesheet to go with them;
or edit them to conform to XHTML. It is necessary to convert existing
HTML files because XML does not permit end-tag minimisation (missing
</p>, etc), unquoted attribute values, and a number of other SGML shortcuts
which have been normal in most HTML DTDs. However, many HTML authoring
tools already produce almost (but not quite) well-formed XML.
You may be able to convert HTML to XHTML using Dave Raggett’s
HTML Tidy program, which can clean up some of the formatting mess
left behind by inadequate HTML editors, and even separate out some
of the formatting to a stylesheet, but there is usually still some
hand-editing to do.
28. Is there an XML version of HTML?
Yes, the W3C recommends using XHTML which is ‘a reformulation
of HTML 4 in XML 1.0’. This specification defines HTML as
an XML application, and provides three DTDs corresponding to the
ones defined by HTML 4.* (Strict, Transitional, and Frameset). The
semantics of the elements and their attributes are as defined in
the W3C Recommendation for HTML 4. These semantics provide the foundation
for future extensibility of XHTML. Compatibility with existing HTML
browsers is possible by following a small set of guidelines (see
the W3C site).
29. If XML is just a subset of SGML, can I use XML files
directly with existing SGML tools?
Yes, provided you use up-to-date SGML software which knows about
the WebSGML Adaptations TC to ISO 8879 (the features needed to support
XML, such as the variant form for EMPTY elements; some aspects of
the SGML Declaration such as NAMECASE GENERAL NO; multiple attribute
token list declarations, etc).
An alternative is to use an SGML DTD to let you create a fully-normalised
SGML file, but one which does not use empty elements; and then remove
the DocType Declaration so it becomes a well-formed DTDless XML
file. Most SGML tools now handle XML files well, and provide an
option switch between the two standards.
30. Can XML use non-Latin characters?
Yes, the XML Specification explicitly says XML uses ISO 10646,
the international standard character repertoire which covers most
known languages. Unicode is an identical repertoire, and the two
standards track each other. The spec says (2.2): ‘All XML
processors must accept the UTF-8 and UTF-16 encodings of ISO 10646…’.
There is a Unicode FAQ at http://www.unicode.org/faq/FAQ.
UTF-8 is an encoding of Unicode into 8-bit code units: the first
128 code points are encoded exactly as in ASCII, and every other character
is encoded as a sequence of between 2 and 4 bytes (the original
definition allowed up to 6). UTF-8 in its single-octet form is therefore the same
as ISO 646 IRV (ASCII), so you can continue to use ASCII for English
or other languages using the Latin alphabet without diacritics.
Note that UTF-8 is incompatible with ISO 8859-1 (ISO Latin-1) after
code point 127 decimal (the end of ASCII).
UTF-16 is an encoding of Unicode into 16-bit code units: characters in
the Basic Multilingual Plane take one unit, and surrogate pairs let
it represent the 16 supplementary planes. UTF-16 is incompatible with ASCII because
it uses two 8-bit bytes per character (four bytes above U+FFFF).
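The byte counts are easy to confirm. The following sketch uses only Python's built-in string encoders; the four sample characters are arbitrary choices for illustration, and "utf-16-be" is used so that no byte-order mark is added to the counts.

```python
# One sample character from each size class (arbitrary picks).
samples = {
    "LATIN CAPITAL A (U+0041)": "A",
    "E WITH ACUTE (U+00E9)": "\u00e9",
    "EURO SIGN (U+20AC)": "\u20ac",
    "MUSICAL G CLEF (U+1D11E)": "\U0001d11e",
}

# Bytes needed per character in each encoding.
utf8_lengths = [len(ch.encode("utf-8")) for ch in samples.values()]
utf16_lengths = [len(ch.encode("utf-16-be")) for ch in samples.values()]

# ASCII text is byte-for-byte identical in UTF-8.
ascii_matches_utf8 = "A".encode("ascii") == "A".encode("utf-8")

# Above code point 127 the bytes differ from ISO 8859-1.
latin1_bytes = "\u00e9".encode("iso-8859-1")
utf8_bytes = "\u00e9".encode("utf-8")

print(utf8_lengths)
print(utf16_lengths)
print(ascii_matches_utf8, latin1_bytes, utf8_bytes)
```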
31. What’s a Document Type Definition (DTD) and where do
I get one?
A DTD is a description in XML Declaration Syntax of a particular
type or class of document. It sets out what names are to be used
for the different types of element, where they may occur, and how
they all fit together. (A Schema does the same thing
in XML Document Syntax, and allows more extensive data-checking.)
For example, if you want a document type to be able to describe
Lists which contain Items, the relevant part of your DTD might contain
something like this:
<!ELEMENT List (Item)+>
<!ELEMENT Item (#PCDATA)>
This defines a list as an element type containing one or more items
(that’s the plus sign); and it defines items as element types containing
just plain text (Parsed Character Data or PCDATA). Validators read
the DTD before they read your document so that they can identify
where every element type ought to come and how each relates to the
other, so that applications which need to know this in advance (most
editors, search engines, navigators, and databases) can set themselves
up correctly. The example above lets you create lists like:
<List>
  <Item>Chocolate</Item>
  <Item>Music</Item>
  <Item>Surfing</Item>
</List>
(The indentation in the example is just for legibility while editing:
it is not required by XML.)
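Python's standard library parsers do not validate against a DTD, so the sketch below is a hand-written stand-in for the two declarations above, not a real validator. The function name conforms_to_list_model is invented for the example. It only illustrates what the content models (Item)+ and (#PCDATA) require.

```python
import xml.etree.ElementTree as ET

def conforms_to_list_model(text):
    """Hand-coded check mirroring <!ELEMENT List (Item)+> and
    <!ELEMENT Item (#PCDATA)>. A sketch, not a DTD validator."""
    root = ET.fromstring(text)
    if root.tag != "List":
        return False
    children = list(root)
    # (Item)+ : one or more children, every one of them an Item.
    if not children or any(child.tag != "Item" for child in children):
        return False
    # Element content: no text other than white-space directly inside List.
    stray = [root.text] + [child.tail for child in children]
    if any(s and s.strip() for s in stray):
        return False
    # (#PCDATA): an Item holds text only, no child elements.
    return all(len(child) == 0 for child in children)

good = conforms_to_list_model(
    "<List><Item>Chocolate</Item><Item>Music</Item></List>")
empty = conforms_to_list_model("<List/>")
nested = conforms_to_list_model("<List><Item>a<b/></Item></List>")
stray_text = conforms_to_list_model("<List>oops<Item>a</Item></List>")

print(good, empty, nested, stray_text)
```

A validating parser derives these checks from the DTD itself, which is why a DTD scales to large document types where hand-written checks do not.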
A DTD provides applications with advance notice of what names and
structures can be used in a particular document type. Using a DTD
and a validating editor means you can be certain that all documents
of that particular type will be constructed and named in a consistent
and conformant manner.
DTDs are not required for processing well-formed
documents, but they are needed if you want to take advantage of
XML’s special attribute types like the built-in ID/IDREF cross-reference
mechanism; or the use of default attribute values; or references
to external non-XML files (‘Notations’); or if you simply
want a check on document validity before processing.
There are thousands of DTDs already in existence in all kinds of
areas (see the SGML/XML Web pages for pointers). Many of them can
be downloaded and used freely; or you can write your own (see the
question on creating your own DTD). Old SGML DTDs need to be converted
to XML for use with XML systems: read the question on converting
SGML DTDs to XML, but most popular SGML DTDs are already available
in XML form.
The alternatives to a DTD are various forms of Schema.
These provide more extensive validation features than DTDs, including
character data content validation.
32. Does XML let me make up my own tags?
No, it lets you make up names for your own element types. If you
think tags and elements are the same thing you are already in considerable
trouble: read the rest of this question carefully.
33. How do I create my own document type?
Document types usually need a formal description, either a DTD
or a Schema. Whilst it is possible to process well-formed XML documents
without any such description, trying to create them without one
is asking for trouble. A DTD or Schema is used with an XML editor
or API interface to guide and control the construction of the document,
making sure the right elements go in the right places.
Creating your own document type therefore begins with an analysis
of the class of documents you want to describe: reports, invoices,
letters, configuration files, credit-card verification requests,
or whatever. Once you have the structure correct, you write code
to express this formally, using DTD or Schema syntax.
34. How do I write my own DTD?
You need to use the XML Declaration Syntax (very simple: declaration
keywords begin with
<!ELEMENT Shopping-List (Item)+>
<!ELEMENT Item (#PCDATA)>
It says that there shall be an element called Shopping-List and
that it shall contain elements called Item: there must be at
