Collaborative FAQ for Akara

General

Q: What is the relationship between "Amara" and "Akara"?


Akara is the umbrella project. Amara, the basic XML processing toolkit, is a component of Akara but can be used on its own by developers who only need core XML processing. Akara is Amara plus a lightweight Web server framework that makes functions using Amara available on the Web.

Amara (core XML components)

XPath (e.g. xml_select method) and XSLT patterns do not work with default namespaces?

Q: I am having trouble using XPath to access elements in the default namespace.


Amara follows the XPath standard. In the XPath standard, a QName without a prefix only matches an element that is not in any namespace. Thus if you want to match any of the elements in your sample document, you *must* use a prefix in the XPath. It doesn't matter that your original XML does not use prefixes; XPath requires one.
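The rule can be seen without Amara at all. The following is a minimal sketch using Python 3's standard-library ElementTree, which implements only a subset of XPath but applies the same convention for unprefixed names. The namespace URI and the sample document are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: every element is in a default namespace.
NS = "urn:example:notes"  # assumed namespace URI, for illustration only
XML = '<notes xmlns="%s"><note>Hello World</note></notes>' % NS

root = ET.fromstring(XML)

# No prefix in the path: only elements in no namespace can match.
unprefixed = root.findall("note")

# Prefix declared for the path itself: this matches, even though the
# source XML never uses a prefix.
prefixed = root.findall("n:note", namespaces={"n": NS})

print(len(unprefixed), len(prefixed))  # 0 1
```

The prefix chosen for the path ("n" here) is arbitrary; only the namespace URI it is bound to has to agree with the document.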

This is an XPath FAQ; see e.g. http://www.edankert.com/defaultnamespaces.html

For an example discussion see: "xml_select and namespaces"

And yes, other compliant XPath 1.0 tools have to deal with this restriction e.g.

I can't access Bindery objects with certain name patterns in the XML

I have XML such as:

<note prefix="spam">
  Hello World
</note>

or

<xsl:if xmlns:xsl="http://www.w3.org/1999/XSL/Transform" test="foo">
  Hello World
</xsl:if>

But I get unusual results or errors from e.g. doc.note.prefix or doc.if.


Amara uses a name mangling scheme to deal with clashes between XML names and names that Python or the library reserves. The clash could be with a Python reserved word such as if, or with a name reserved by Amara itself, such as prefix, which is reserved for DOM compatibility.

You can access these objects by using their mangled names:

print doc.note.prefix_
print doc.if_

or using the mapping protocol APIs:

>>> from amara import bindery
>>> XML = '<if test="foo">Hello World</if>'
>>> doc = bindery.parse(XML)
>>> doc.xml_child_pnames
>>> doc[None, u'if']
<if_ at 0x1017bff80: name u'if', 0 namespaces, 1 attributes, 1 children>

>>> XSLTNS = u"http://www.w3.org/1999/XSL/Transform"
>>> XML = '<if xmlns="http://www.w3.org/1999/XSL/Transform" test="foo">Hello World</if>'
>>> doc = bindery.parse(XML)
>>> doc[XSLTNS, u'if']
<if_ at 0x1025b2170: name u'if', 1 namespaces, 1 attributes, 1 children>

or using XPath:

>>> XSLTNS = u"http://www.w3.org/1999/XSL/Transform"
>>> XML = '<if xmlns="http://www.w3.org/1999/XSL/Transform" test="foo">Hello World</if>'
>>> doc = bindery.parse(XML)
>>> doc.xml_select(u'xsl:if', prefixes={u'xsl': XSLTNS}) #Declare prefixes used *in the XPath*

Watch out for cases where XML names contain characters illegal in Python, such as dashes. These are also mangled:

<note x-id="spam">
  Hello World
</note>

You would use:

print note.x_id #"spam"
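To make the mangling rules above concrete, here is a rough Python 3 sketch of such a scheme. It is not Amara's actual implementation: the mangle function and the RESERVED_BY_LIBRARY set are hypothetical, and the set contains only the one reserved name this FAQ mentions.

```python
import keyword
import re

# Assumed for illustration: names the binding layer keeps for itself.
# This FAQ only names "prefix"; the real list in Amara is longer.
RESERVED_BY_LIBRARY = {"prefix"}

def mangle(xml_name):
    """Return a Python-safe attribute name for an XML local name (sketch)."""
    # Replace characters that are illegal in Python identifiers, such as dashes.
    py_name = re.sub(r"[^0-9A-Za-z_]", "_", xml_name)
    # Append an underscore on a clash with a keyword or a reserved name.
    if keyword.iskeyword(py_name) or py_name in RESERVED_BY_LIBRARY:
        py_name += "_"
    return py_name

for name in ("if", "prefix", "x-id", "note"):
    print(name, "->", mangle(name))
# if -> if_
# prefix -> prefix_
# x-id -> x_id
# note -> note
```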

Also watch out for name clashes (which are very rare in the real world):

<note xmlns:msg="urn:bogus:message">
  <msg:id>spam</msg:id>
  Hello <id>World</id>
</note>

Amara disambiguates favoring the first instance in document order:

note.id #"spam"
note.id_ #"World"

Or even:

<note id="spam">
  Hello <id>World</id>
</note>

Amara disambiguates favoring the attribute:

note.id #"spam"
note.id_ #"World"
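Both disambiguation rules can be modelled as "first claimant keeps the plain name, later ones get an underscore". The Python 3 sketch below is an illustration of that idea only, not Amara's code: assign_python_names and the (kind, namespace, local name) triples are hypothetical, and repeating the underscore on a third clash is an assumption the FAQ does not confirm.

```python
def assign_python_names(xml_names):
    """Assign a unique Python name to each distinct XML name (sketch).

    xml_names holds (kind, namespace, local_name) triples in priority
    order: attributes first, then elements in document order.
    """
    taken = {}    # Python name -> XML name that owns it
    mapping = {}  # XML name -> Python name
    for xml_name in xml_names:
        if xml_name in mapping:
            continue  # a repeated element shares its existing binding
        candidate = xml_name[2]
        # Assumed rule: keep appending underscores until the name is free.
        while candidate in taken:
            candidate += "_"
        taken[candidate] = xml_name
        mapping[xml_name] = candidate
    return mapping

# <msg:id> comes before <id> in document order.
first = assign_python_names([
    ("element", "urn:bogus:message", "id"),
    ("element", None, "id"),
])
print(first[("element", "urn:bogus:message", "id")])  # id
print(first[("element", None, "id")])                 # id_

# The id attribute takes priority over the <id> child element.
second = assign_python_names([
    ("attribute", None, "id"),
    ("element", None, "id"),
])
print(second[("attribute", None, "id")])  # id
print(second[("element", None, "id")])    # id_
```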

As Uche sometimes tells users:

Amara can't handle file paths on Windows

Q: After installing Amara on your Windows system, you give it a go and get an error such as:

>>> import amara
>>> doc = amara.parse("f:\\monty.xml")
Traceback (most recent call last):
[SNIP]
amara.lib.iri.UriException: The URI scheme f is not supported by resolver


Turn that fiddly OS path into a proper URL:

import amara
from amara.lib import iri
doc = amara.parse(iri.os_path_to_uri("f:\\monty.xml"))
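The error message makes sense once you look at the path the way a URI parser does: everything before the first colon reads as a scheme. The Python 3 sketch below shows this with the standard library only. The windows_path_to_file_uri helper is hypothetical, handles only absolute drive-letter paths, and is not a substitute for iri.os_path_to_uri.

```python
from urllib.parse import quote, urlsplit

path = "f:\\monty.xml"

# A generic URI parser treats the drive letter as the scheme.
print(urlsplit(path).scheme)  # f

def windows_path_to_file_uri(win_path):
    """Hypothetical helper: absolute drive-letter path -> file URI (sketch)."""
    # Use forward slashes and percent-encode characters unsafe in a URI.
    return "file:///" + quote(win_path.replace("\\", "/"), safe="/:")

uri = windows_path_to_file_uri(path)
print(uri)                   # file:///f:/monty.xml
print(urlsplit(uri).scheme)  # file
```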

How can I compare XML documents without getting hung up on attribute order and such?

Quoted from [http://mail.python.org/pipermail/python-list/2006-March/329664.html "Python version of XMLUnit?"]


One possible approach is to use c14n (XML canonicalization) to normalize the XML so that you can use a regular text comparison. This is not as sophisticated as a full XML diff, but it's definitely a viable approach for testing.

For those who might be interested in that approach, learn more about c14n in [http://www.ibm.com/developerworks/xml/library/x-c14n/ "Introducing XML canonical form"]. It includes a brief example using the c14n module in [http://pyxml.sourceforge.net/ PyXML].

Note: unfortunately, c14n is not yet fully available for Amara 2.x.
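If Python 3.8 or later is available, the c14n approach can be tried with the standard library, whose xml.etree.ElementTree.canonicalize function implements C14N 2.0. This is a swapped-in tool, not Amara or PyXML, and the same_xml helper and sample documents are made up for illustration.

```python
from xml.etree.ElementTree import canonicalize

def same_xml(a, b):
    """Compare two XML strings after C14N 2.0 canonicalization (sketch)."""
    return canonicalize(xml_data=a) == canonicalize(xml_data=b)

doc1 = '<note id="1" lang="en">Hello World</note>'
# Same content, but different attribute order, quoting and spacing.
doc2 = "<note lang='en'   id='1'>Hello World</note>"
# A genuinely different attribute value.
doc3 = '<note id="2" lang="en">Hello World</note>'

print(same_xml(doc1, doc2))  # True
print(same_xml(doc1, doc3))  # False
```

Whitespace inside element content is still significant by default, so differently indented documents will not compare equal without further options.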

Amara's test suite also contains routines (amara.lib.treecompare.xml_compare) for comparing XML and HTML while ignoring non-significant syntactic variations.

Can I control the order of attributes generated by 4Suite?

Most likely the best you can do is take advantage of [http://www.ibm.com/developerworks/xml/library/x-c14n/ Canonical XML], which restricts output in certain ways, including forcing attributes (and namespace declarations, separately) to be in alphabetical order.
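The ordering effect of Canonical XML is easy to observe. The sketch below uses Python 3.8+'s standard-library C14N 2.0 serializer rather than 4Suite, so it only illustrates the canonical form itself; the sample element is invented.

```python
from xml.etree.ElementTree import canonicalize

# Attributes are written as z, a, m in the source.
source = '<note z="1" a="2" m="3"/>'

# The canonical form emits them sorted by name, regardless of input order.
canonical = canonicalize(xml_data=source)
print(canonical)
```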

Akara/FAQ (last edited 2010-06-18 01:30:57 by UcheOgbuji)