Writing Akara modules for use with Freemix
Akara modules for Freemix are Python files that define a function to gather data; the Akara decorators then wrap that function so it becomes a RESTful end-point. You need access to an Akara server to add a module; you can follow Akara/Quick_start to set one up. If you don't have access to an Akara server but would like to use a module with Freemix, you can use the experimental one at labs.zepheira.com. E-mail uche@zepheira.com about this.
As an example, I'm going to screen-scrape the members list at http://xmlguild.org/members
The first bit you need is the module documentation:
# -*- encoding: utf-8 -*-
'''
Module for use with Akara ( http://wiki.xml3k.org/Akara ).
Scrape the XML Guild member's list to generate JSON suitable for use with
Freemix ( http://Freemix.it ).
'''
The first line is actually the file encoding declaration, which you should always specify, even if you're not immediately using non-ASCII characters in comments, literal strings, etc.
Next the imports and any global scope variables:
import sys
import simplejson
from amara.bindery import html
from amara.lib.iri import absolutize
from akara.services import simple_service, response
GUILD_BASE = u'http://xmlguild.org/'
GUILD_MEMBERS_PAGE = u'http://xmlguild.org/members'
Next you define one or more functions, use a decorator to mark them as RESTful endpoints, and provide a doc string for each.
SERVICE_ID = 'http://purl.org/akara/services/builtin/xmlguild.freemix.js'
@simple_service('GET', SERVICE_ID, 'xmlguild.freemix.js', 'application/json')
def xmlguild_freemix():
    '''
    Scrape the XML Guild member's list to generate JSON suitable for use with
    Freemix ( http://Freemix.it ).
    Sample request:
    curl "http://localhost:8880/xmlguild.freemix.js"
    '''
The service ID identifies the nature of the service. As more than one person follows this tutorial, we end up with multiple instances of the Akara service that scrapes the xmlguild page for data. Each one uses the same code and thus does the same thing. You might actually _invoke_ one at www.example.org/server1/xmlguild.freemix.js and another at www.example.com/server2/xmlguild.freemix.js. These are different service end-points, but since they provide the same service and behave the same way, they should use the same service ID. This enables applications such as service discovery, load-balancing, and fallback.
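To make the fallback idea concrete, here is a minimal sketch of a client that treats the service ID as the key and the end-points as interchangeable. Nothing here is part of Akara: the ENDPOINTS registry, the call_with_fallback function, and the `fetch` callable are all hypothetical names invented for illustration, and the two URLs are just the example end-points mentioned above.

```python
# Hypothetical registry (not an Akara feature): one service ID mapped to
# the end-points that are known to implement it.
SERVICE_ID = 'http://purl.org/akara/services/builtin/xmlguild.freemix.js'
ENDPOINTS = {
    SERVICE_ID: [
        'http://www.example.org/server1/xmlguild.freemix.js',
        'http://www.example.com/server2/xmlguild.freemix.js',
    ],
}

def call_with_fallback(service_id, fetch, registry=ENDPOINTS):
    '''
    Try each end-point registered under service_id, in order, and return
    the first successful response body.
    fetch is any callable that takes a URL and returns the body, raising
    IOError if the end-point cannot be reached.
    '''
    errors = []
    for url in registry.get(service_id, []):
        try:
            return fetch(url)
        except IOError as e:
            #Remember the failure and move on to the next end-point
            errors.append((url, str(e)))
    raise IOError('No end-point answered for %s: %r' % (service_id, errors))
```

Because the caller asks for a service ID rather than a URL, it does not care which server answers. In real use `fetch` would wrap whatever HTTP client you prefer.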
The @simple_service decorator marks the function so that the Akara instance wraps it for RESTful invocation. Here it responds to HTTP GET requests at http://$AKARAROOT/xmlguild.freemix.js. The service response will always have the same Internet media type (IMT), application/json, so it is given in the decorator; if you prefer, you can omit it from the decorator and set the IMT in the function return value.
In Akara terminology, you have mounted the function xmlguild_freemix at the path /xmlguild.freemix.js.
This function accepts no arguments, but in general Akara's @simple_service decorator converts GET parameters to function arguments. So, for example, http://$AKARAROOT/abc?x=1&y=2 would work for a function mounted at abc that takes parameters x and y. Each one is a list because GET parameters can have multiple occurrences (a fact far too many Web frameworks ignore).
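The point about lists can be seen with nothing but the standard library. The sketch below is not Akara's dispatch code; it only uses parse_qs to show what a query string with a repeated parameter looks like once parsed, and the function abc is a hypothetical stand-in for a service mounted at abc.

```python
try:
    from urllib.parse import parse_qs   #Python 3
except ImportError:
    from urlparse import parse_qs       #Python 2

def abc(x, y):
    #Hypothetical service function: x and y each arrive as a list of strings
    return 'x=%s y=%s' % (','.join(x), ','.join(y))

#x occurs twice in the query string, so it parses to a two-item list
params = parse_qs('x=1&y=2&x=3')
result = abc(**params)
```

Here params['x'] holds both values and params['y'] is a one-item list, so a function written to expect lists handles single and repeated parameters the same way.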
The rest of the module is just commented code to parse the XML Guild member page and build a Python structure with the desired information.
    #Parse HTML / "Tag soup" into XHTML-like bindery structure
    #XHTML in all ways except namespace
    doc = html.parse(GUILD_MEMBERS_PAGE)
    members = []
    #Use XPath to find member names
    for mname in doc.xml_select(u'//*[@class="membername"]'):
        #Warning: the HTML parser
        #Watch out for the usual Python/Unicode gotchas
        print >> sys.stderr, 'Processing', repr(unicode(mname))
        member = {}
        #Give each entry a unique ID and a friendly label
        member[u'label'] = unicode(mname)
        #Ideally id should work as an HTML anchor. Here use the existing anchor
        member[u'id'] = mname.id
        #Images are given as relative URLs. Must absolutize them
        picture = absolutize(
            mname.xml_select(u'string(../preceding-sibling::*//@src)'),
            GUILD_BASE)
        member[u'picture'] = picture
        member[u'description'] = mname.xml_select(u'string(following-sibling::p[1])')
        #Use brute-force XPath to grab links
        member[u'links'] = [ l.xml_value for l in mname.xml_select(u'..//@href') ]
        members.append(member)
    #The JSON structure required by Freemix is based on Exhibit, and very simple
    return simplejson.dumps({'items': members}, indent=4)
The code snippets on this page make up the full module, but it's also attached for convenience. See the Akara/Quick_start page for notes on how to deploy this to an Akara instance. This service is running at http://labs.zepheira.com:8880/xmlguild.freemix.js .
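For a sense of what the service returns, here is a rough stand-alone sketch that builds a single item by hand and serializes it the same way. It assumes the standard library's json and urljoin in place of simplejson and Amara's absolutize, so it runs without Akara or Amara installed; every field value is a made-up placeholder, not real Guild data.

```python
import json
try:
    from urllib.parse import urljoin    #Python 3
except ImportError:
    from urlparse import urljoin        #Python 2

GUILD_BASE = u'http://xmlguild.org/'

#One item with the same field names the module produces. All values are
#placeholders invented for this sketch.
member = {
    u'id': u'example-member',
    u'label': u'Example Member',
    #urljoin takes the base first, the reverse of the absolutize call above
    u'picture': urljoin(GUILD_BASE, u'images/example.jpg'),
    u'description': u'Placeholder description.',
    u'links': [u'http://www.example.org/'],
}

#Same top-level shape as the module's return value
payload = json.dumps({'items': [member]}, indent=4)
```

The result is a single JSON object with an items list, one entry per member.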
Using your own data source services in Freemix
From your profile, go to "Upload data" and enter the URL http://labs.zepheira.com:8880/xmlguild.freemix.js . You should get the Guild member information, and at this point you can rename fields, update data type information, etc. If, for example, you specify the "picture" field as an image type, Freemix will reflect this change right away, as illustrated:
The resulting data profile is, for example, at: http://freemix.it/dataprofile/Uche/xml-guild-members/
I've created a Freemix from this data here: http://freemix.it/freemix/Uche/xml-guild-members/
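Before pointing Freemix at a service of your own, it can help to sanity-check the JSON it emits. The following is only a sketch: check_freemix_payload and REQUIRED_FIELDS are hypothetical names, and treating id and label as the required fields is an assumption drawn from the comments in the module above, not from any Freemix specification.

```python
import json

#Assumption: each item needs a unique id and a friendly label, as the
#module's comments suggest. This is not an official Freemix rule set.
REQUIRED_FIELDS = (u'id', u'label')

def check_freemix_payload(text):
    '''
    Return a list of human-readable problems found in the JSON text.
    An empty list means nothing obviously wrong was found.
    '''
    data = json.loads(text)
    items = data.get('items') if isinstance(data, dict) else None
    if not isinstance(items, list):
        return ['top-level object has no "items" list']
    problems = []
    seen = set()
    for n, item in enumerate(items):
        for field in REQUIRED_FIELDS:
            if not item.get(field):
                problems.append('item %d is missing %s' % (n, field))
        item_id = item.get(u'id')
        if item_id is not None:
            if item_id in seen:
                problems.append('item %d repeats an earlier id' % n)
            seen.add(item_id)
    return problems
```

You could feed it the body returned by the curl request in the doc string and fix anything it reports before uploading.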
