Collaborative FAQ for Akara
Please feel free to send recommendations for this FAQ to the mailing list. Please try to draft both the questions and the answers, to help reduce the burden on the maintainers. Perhaps you had a question and found the answer in the archives? That sounds like a good candidate for the FAQ.
Contents
- General
- Amara (core XML components)
- XPath (e.g. xml_select method) and XSLT patterns do not work with default namespaces?
- I can't access Bindery objects with certain name patterns in the XML
- Amara can't handle file paths on Windows
- How can I compare XML documents without getting hung up on attribute order and such?
- Can I control the order of attributes generated by 4Suite?
General
Q: What is the relationship between "Amara" and "Akara"?
Akara is the overall, umbrella project. Amara is a component of Akara, but can be used on its own, by developers who need core XML processing. Amara is the basic XML processing toolkit. Akara is Amara plus a lightweight Web server framework in order to make functions using Amara available on the Web.
Amara (core XML components)
XPath (e.g. xml_select method) and XSLT patterns do not work with default namespaces?
Q: I am having trouble using XPath to access elements in the default namespace.
Amara follows the XPath standard. In the XPath standard, a QName without a prefix matches only an element that is not in any namespace. Thus, to match elements that are in a default namespace, you *must* use a prefix in the XPath. It doesn't matter that your original XML does not use prefixes; XPath requires one.
This is a frequently asked XPath question. See, for example: http://www.edankert.com/defaultnamespaces.html
For an example discussion see: "xml_select and namespaces"
And yes, other compliant XPath 1.0 tools have to deal with this restriction, e.g.
Jaxen (see "How do I write a query for namespace qualified elements/attributes in the default namespace?")
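The rule above can be seen without Amara installed. The following is a minimal sketch using the Python 3 standard library (xml.etree.ElementTree, which supports only a subset of XPath) rather than Amara's xml_select; the document, the namespace URI and the "ex" prefix are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative namespace and document; neither comes from the FAQ
NS = "urn:bogus:example"
XML = '<doc xmlns="urn:bogus:example"><item>spam</item></doc>'

root = ET.fromstring(XML)

# An unprefixed name only matches elements in no namespace, so nothing is found
unprefixed = root.findall("item")

# Declaring a prefix for the query (the source XML does not need to use it) matches
prefixed = root.findall("ex:item", {"ex": NS})

print(len(unprefixed), len(prefixed))
```

The prefix belongs to the query, not to the document: any prefix works as long as it is mapped to the same namespace URI.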
I can't access Bindery objects with certain name patterns in the XML
I have XML such as:
<note prefix="spam">
Hello World
</note>
or
<xsl:if xmlns:xsl="http://www.w3.org/1999/XSL/Transform" test="foo">
Hello World
</xsl:if>
But I get unusual results or errors from e.g. doc.note.prefix or doc.if.
Amara uses a name mangling scheme to deal with clashes between XML names and the naming rules of Python and of the library itself. The cause could be a Python reserved word such as if, or a name reserved by Amara itself, such as prefix, which is reserved for DOM compatibility.
You can access these objects by using their mangled names:
print doc.note.prefix_
print doc.if_
or using the mapping protocol APIs:
>>> from amara import bindery
>>> XML = '<if test="foo">Hello World</if>'
>>> doc = bindery.parse(XML)
>>> doc.xml_child_pnames
>>> doc[None, u'if']
<if_ at 0x1017bff80: name u'if', 0 namespaces, 1 attributes, 1 children>
>>> XSLTNS = u"http://www.w3.org/1999/XSL/Transform"
>>> XML = '<if xmlns="http://www.w3.org/1999/XSL/Transform" test="foo">Hello World</if>'
>>> doc = bindery.parse(XML) #Re-parse so that doc reflects the namespaced XML
>>> doc[XSLTNS, u'if']
<if_ at 0x1025b2170: name u'if', 1 namespaces, 1 attributes, 1 children>
or using XPath:
>>> XSLTNS = u"http://www.w3.org/1999/XSL/Transform"
>>> XML = '<if xmlns="http://www.w3.org/1999/XSL/Transform" test="foo">Hello World</if>'
>>> doc = bindery.parse(XML)
>>> doc.xml_select(u'xsl:if', prefixes={u'xsl': XSLTNS}) #Declare prefixes used *in the XPath*
Watch out for cases where XML names contain characters illegal in Python, such as dashes. These are also mangled:
<note x-id="spam">
Hello World
</note>
You would use:
print note.x_id #"spam"
Also watch out for name clashes (which are very rare in the real world):
<note xmlns:msg="urn:bogus:message">
<msg:id>spam</msg:id>
Hello <id>World</id>
</note>
Amara disambiguates favoring the first instance in document order:
note.id #"spam"
note.id_ #"World"
Or even:
<note id="spam">
Hello <id>World</id>
</note>
Amara disambiguates favoring the attribute:
note.id #"spam"
note.id_ #"World"
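The mangling rules illustrated above can be approximated in a few lines. This is only a sketch in Python 3 of the behavior this FAQ describes, not Amara's actual implementation: the mangle function, the RESERVED set and the taken parameter are hypothetical names, and the only reserved name confirmed here is prefix.

```python
import keyword
import re

# Hypothetical stand-in for the names Amara reserves; this FAQ only confirms "prefix"
RESERVED = {"prefix"}

def mangle(xml_local_name, taken=()):
    """Approximate a Python-friendly attribute name for an XML local name."""
    # Replace characters that are not legal in (ASCII) Python identifiers: "x-id" -> "x_id"
    name = re.sub(r"[^0-9A-Za-z_]", "_", xml_local_name)
    # Append underscores until the name is not a keyword, a reserved name or already taken
    while keyword.iskeyword(name) or name in RESERVED or name in taken:
        name += "_"
    return name

print(mangle("if"), mangle("prefix"), mangle("x-id"), mangle("id", taken={"id"}))
```

Passing the names already bound on a node as taken mimics the clash cases: the first id keeps its name and the later one becomes id_.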
As Uche sometimes tells users:
- Python and XML are very different worlds. There is really no way to avoid surprises going between them.
Amara can't handle file paths on Windows
Q: After installing Amara on your Windows system, you give it a go and get an error such as:
>>> import amara
>>> doc = amara.parse("f:\\monty.xml")
Traceback (most recent call last):
[SNIP]
amara.lib.iri.UriException: The URI scheme f is not supported by resolver
Turn that fiddly OS path into a proper URL:
from amara.lib import iri
doc = amara.parse(iri.os_path_to_uri("f:\\monty.xml"))
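To see why the bare path fails, note that a URI parser reads the drive letter before the colon as a URI scheme. The sketch below uses only the Python 3 standard library, not amara.lib.iri; windows_path_to_file_uri is a hypothetical helper that handles only absolute drive-letter paths and is not a substitute for iri.os_path_to_uri.

```python
from urllib.parse import quote, urlsplit

def windows_path_to_file_uri(path):
    """Hypothetical helper: convert an absolute drive-letter path to a file: URI."""
    # Backslashes become forward slashes; characters unsafe in a URI are percent-encoded
    return "file:///" + quote(path.replace("\\", "/"), safe="/:")

raw_path = "f:\\monty.xml"

# The raw path parses as if "f" were the URI scheme, hence the resolver error
scheme_of_raw_path = urlsplit(raw_path).scheme

# After conversion the scheme is the expected "file"
uri = windows_path_to_file_uri(raw_path)
scheme_of_uri = urlsplit(uri).scheme

print(scheme_of_raw_path, uri, scheme_of_uri)
```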
How can I compare XML documents without getting hung up on attribute order and such?
Quoted from [http://mail.python.org/pipermail/python-list/2006-March/329664.html "Python version of XMLUnit?"]
- I have found XMLUnit to be very helpful for testing Java and Jython code that generates XML. At its heart XMLUnit is an XML-aware diff - it parses expected and actual XML and pinpoints any differences. It is smart enough to ignore things like attribute order, different quoting and escaping styles, and insignificant whitespace. Now I am working on a CPython project and have a similar need. Is there any comparable tool for Python? Basically I'm looking for a tool to compare XML and show diffs in an intelligible fashion that is usable from Python unit tests (using py.test, if it matters).
One possible approach is to use c14n to in effect normalize the XML so that you can use regular text compare. This is not as sophisticated as a full XML diff, but it's definitely a viable approach for testing.
For those who might be interested in that approach, learn more about c14n in [http://www.ibm.com/developerworks/xml/library/x-c14n/ "Introducing XML canonical form"]. It includes a brief example using the c14n module in [http://pyxml.sourceforge.net/ PyXML].
Note: unfortunately, c14n is not yet fully available for Amara 2.x.
Amara's test suite also contains routines (amara.lib.treecompare.xml_compare) for comparing XML and HTML while ignoring non-significant syntactic variations.
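The c14n approach described above (normalize, then do a plain text compare) can be sketched with the Python standard library alone. This assumes Python 3.8 or later, where xml.etree.ElementTree.canonicalize is available; it uses neither Amara nor PyXML, and same_xml is a hypothetical helper name. The strip_text option also discards leading and trailing whitespace in text content, which is an assumption about what counts as insignificant in your documents.

```python
from xml.etree.ElementTree import canonicalize

def same_xml(expected, actual):
    """Compare two XML strings by their canonical (C14N 2.0) text form."""
    # Canonicalization sorts attributes and normalizes quoting;
    # strip_text=True additionally drops whitespace around text content
    return canonicalize(expected, strip_text=True) == canonicalize(actual, strip_text=True)

a = "<note b='2' a=\"1\"><to>World</to></note>"
b = '<note a="1" b="2">\n  <to>World</to>\n</note>'
c = '<note a="1" b="3"><to>World</to></note>'

print(same_xml(a, b), same_xml(a, c))
```

In a unit test, a helper like this can back a simple assertion; unlike a full XML diff, it tells you only that the documents differ, not where.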
Can I control the order of attributes generated by 4Suite?
Most likely the best you can do is take advantage of [http://www.ibm.com/developerworks/xml/library/x-c14n/ Canonical XML] which restricts output in certain ways, including forcing attributes (and namespace declarations, separately) to be in alphabetical order.
