Bright Content Performance

Speed became an issue when moving Copia to Bright Content (BC). The script below is a simple attempt to gather profile information when querying the store for a large dataset.

    import cProfile
    import time

    from datetime import datetime
    from dateutil.tz import tzutc

    from brightcontent.amplee.store import AmpleeFileStore


    if __name__ == '__main__':
        # Paths to the store configuration and service documents.
        conf = {
            'config_document': '../docs/store.entries.config.xml',
            'service_document': '../docs/store.entries.service.xml',
        }
        # Date range for the query: 1 Jan 2005 to 1 Jan 2006 (UTC).
        args = {
            'lower_date': datetime(2005, 1, 1, tzinfo=tzutc()),
            'upper_date': datetime(2006, 1, 1, tzinfo=tzutc()),
        }
        # Time store creation separately; it is not part of the profile.
        print 'Creating store'
        start = time.time()
        store = AmpleeFileStore(**conf)
        end = time.time()
        print 'Done ', end - start
        # Profile only the date-range query against the loaded store.
        cProfile.run('store.get_entries(**args)')
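
cProfile.run prints its report ordered by standard name, which makes the expensive calls hard to spot. As a rough sketch, the same kind of measurement can be collected with cProfile.Profile and printed through pstats sorted by cumulative time. The helper name profile_sorted and the stand-in function busy_work are assumptions made for illustration and are not part of Bright Content; in the script above, store.get_entries would take the place of busy_work.

```python
import cProfile
import pstats

try:
    from cStringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO  # Python 3


def profile_sorted(func, *args, **kwargs):
    """Profile one call; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buf = StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    # Drop directory names, sort by cumulative time, keep the top 15 rows.
    stats.strip_dirs().sort_stats('cumulative').print_stats(15)
    return result, buf.getvalue()


def busy_work(n):
    """Hypothetical stand-in for store.get_entries(**args)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == '__main__':
    _, report = profile_sorted(busy_work, 100000)
    print(report)
```

Sorting by cumulative time puts the callers that dominate the run at the top of the report, so the entries worth investigating do not have to be picked out of an alphabetical list.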

Results

The results from my (Eric's) system are below. The most obvious candidate for optimization is the way entries are loaded into Amplee. Each loaded entry is currently copied with the copy module, and in the profile copy.py:144(deepcopy) accounts for about 5.0 of the 10.3 CPU seconds spent in store.get_entries. XML parsing (__init__.py:21(parse)) accounts for most of the rest, about 4.3 seconds.

One option would be to build the feed from actual web requests via recursive calls. That model could also scale in the long term, since the requests could be made asynchronously across a distributed set of store services. HTTP caching could reduce the cost further.
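
To get a feel for why copying every loaded entry is costly, the following sketch times copy.deepcopy against a shallow copy.copy on a small nested structure. The structure built by make_entry is a hypothetical stand-in for a parsed entry, not the real Amplee or bindery object, so the absolute numbers say nothing about Bright Content itself. Only the relative cost is of interest.

```python
import copy
import timeit


def make_entry(n_children=50):
    """Build a hypothetical nested structure standing in for a parsed entry."""
    return {
        'id': 'urn:example:entry',
        'children': [
            {'name': 'element-%d' % i, 'attrs': {'index': i}, 'text': ['x'] * 5}
            for i in range(n_children)
        ],
    }


def compare_copies(entry, repeat=200):
    """Return seconds spent on `repeat` deep copies and `repeat` shallow copies."""
    deep = timeit.timeit(lambda: copy.deepcopy(entry), number=repeat)
    shallow = timeit.timeit(lambda: copy.copy(entry), number=repeat)
    return deep, shallow


if __name__ == '__main__':
    deep, shallow = compare_copies(make_entry())
    print('deepcopy: %.4fs  shallow copy: %.4fs' % (deep, shallow))
```

A deep copy visits every nested object, which matches the large call counts on the copy.py rows in the profile. A shallow copy shares the nested objects with the original, so it is only safe when callers do not mutate them. Whether Amplee could use one here is an open question.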

(bc_env)Macintosh-2:tests ionrock$ python profiler.py 
Creating store
Done  50.575772047
         6439898 function calls (5586596 primitive calls) in 10.315 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   10.315   10.315 <string>:1(<module>)
      844    0.004    0.000    0.010    0.000 Context.py:22(__init__)
     1688    0.001    0.000    0.001    0.000 Context.py:66(copy)
     1688    0.001    0.000    0.001    0.000 Context.py:69(set)
      422    0.003    0.000    0.012    0.000 InputSource.py:342(fromStream)
      422    0.005    0.000    0.009    0.000 InputSource.py:41(__init__)
      422    0.001    0.000    0.002    0.000 InputSource.py:81(_getStreamEncoding)
      422    0.000    0.000    0.000    0.000 ParsedAbsoluteLocationPath.py:16(__init__)
      422    0.002    0.000    0.108    0.000 ParsedAbsoluteLocationPath.py:19(evaluate)
     1266    0.018    0.000    0.098    0.000 ParsedAxisSpecifier.py:100(select)
     2110    0.005    0.000    0.006    0.000 ParsedAxisSpecifier.py:15(ParsedAxisSpecifier)
     2110    0.001    0.000    0.001    0.000 ParsedAxisSpecifier.py:26(__init__)
      422    0.002    0.000    0.002    0.000 ParsedExpr.py:29(__init__)
      422    0.001    0.000    0.001    0.000 ParsedExpr.py:704(__init__)
      422    0.000    0.000    0.000    0.000 ParsedNodeTest.py:136(__init__)
     2110    0.013    0.000    0.017    0.000 ParsedNodeTest.py:17(ParsedNameTest)
     1688    0.001    0.000    0.001    0.000 ParsedNodeTest.py:182(__init__)
    15756    0.016    0.000    0.020    0.000 ParsedNodeTest.py:195(match)
      422    0.001    0.000    0.002    0.000 ParsedPredicateList.py:18(__init__)
      422    0.000    0.000    0.000    0.000 ParsedPredicateList.py:58(__len__)
     1266    0.001    0.000    0.001    0.000 ParsedRelativeLocationPath.py:12(__init__)
 1266/844    0.009    0.000    0.115    0.000 ParsedRelativeLocationPath.py:17(evaluate)
     2110    0.002    0.000    0.002    0.000 ParsedStep.py:17(__init__)
     1266    0.004    0.000    0.103    0.000 ParsedStep.py:23(evaluate)
      422    0.000    0.000    0.000    0.000 ParsedStep.py:50(__init__)
      422    0.000    0.000    0.000    0.000 ParsedStep.py:53(evaluate)
      422    0.001    0.000    0.002    0.000 Uri.py:285(SplitFragment)
      844    0.000    0.000    0.000    0.000 UserDict.py:69(__contains__)
      844    0.013    0.000    0.220    0.000 Util.py:140(Evaluate)
      984    0.001    0.000    0.001    0.000 __init__.py:101(__getitem__)
      422    0.035    0.000    5.321    0.013 __init__.py:108(from_entry)
        1    0.000    0.000    0.000    0.000 __init__.py:116(iterkeys)
      422    0.010    0.000    4.285    0.010 __init__.py:21(parse)
      422    0.006    0.000    0.012    0.000 __init__.py:210(safe_path_join)
      422    0.002    0.000    0.002    0.000 __init__.py:28(__init__)
        1    0.004    0.004    0.235    0.235 __init__.py:429(iterindex)
      422    0.002    0.000    0.165    0.000 __init__.py:443(is_draft)
      422    0.001    0.000    0.001    0.000 __init__.py:7(__init__)
    17946    0.050    0.000    0.154    0.000 bindery.py:1000(__init__)
    49147    0.037    0.000    0.037    0.000 bindery.py:1008(__len__)
   133280    0.212    0.000    0.289    0.000 bindery.py:1028(__setattr__)
     1713    0.001    0.000    0.001    0.000 bindery.py:103(startPrefixMapping)
      422    0.001    0.000    0.008    0.000 bindery.py:111(startDocument)
      422    0.001    0.000    0.003    0.000 bindery.py:120(endDocument)
    17946    0.048    0.000    3.213    0.000 bindery.py:127(startElementNS)
    17946    0.041    0.000    0.319    0.000 bindery.py:135(endElementNS)
    31198    0.060    0.000    0.336    0.000 bindery.py:143(characters)
        3    0.000    0.000    0.000    0.000 bindery.py:150(processingInstruction)
      422    0.001    0.000    0.006    0.000 bindery.py:156(comment)
    17946    0.038    0.000    0.070    0.000 bindery.py:168(_add_rule)
    17946    0.042    0.000    0.120    0.000 bindery.py:180(add_rule)
    17946    0.039    0.000    0.055    0.000 bindery.py:198(remove_rule)
    68359    0.376    0.000    3.733    0.000 bindery.py:211(apply_rules)
      422    0.006    0.000    4.233    0.010 bindery.py:245(read_xml)
    17946    0.218    0.000    0.578    0.000 bindery.py:300(create_element)
    17946    0.141    0.000    1.665    0.000 bindery.py:321(bind_attributes)
    17946    0.016    0.000    0.052    0.000 bindery.py:339(is_element)
    17946    0.185    0.000    0.377    0.000 bindery.py:343(bind_instance)
      844    0.001    0.000    0.001    0.000 bindery.py:38(__init__)
    17946    0.208    0.000    3.064    0.000 bindery.py:401(apply)
    61886    0.067    0.000    0.130    0.000 bindery.py:417(handle_end)
    49172    0.526    0.000    0.592    0.000 bindery.py:42(xml_to_python)
      422    0.002    0.000    0.004    0.000 bindery.py:431(apply)
        3    0.000    0.000    0.000    0.000 bindery.py:439(apply)
      422    0.002    0.000    0.003    0.000 bindery.py:457(apply)
    31198    0.106    0.000    0.142    0.000 bindery.py:473(apply)
    17946    0.024    0.000    0.024    0.000 bindery.py:61(__init__)
      422    0.006    0.000    0.007    0.000 bindery.py:72(__init__)
      422    0.002    0.000    0.002    0.000 bindery.py:850(__init__)
        3    0.000    0.000    0.000    0.000 bindery.py:942(__init__)
      422    0.000    0.000    0.000    0.000 bindery.py:961(__init__)
      422    0.015    0.000    4.268    0.010 binderytools.py:108(bind_stream)
      422    0.005    0.000    4.274    0.010 binderytools.py:132(bind_string)
      422    0.002    0.000    0.009    0.000 binderytools.py:45(setup_binder_)
    25179    0.015    0.000    0.015    0.000 binderyxpath.py:105(_namespaceURI)
    43559    0.023    0.000    0.023    0.000 binderyxpath.py:113(_localName)
     2110    0.011    0.000    0.011    0.000 binderyxpath.py:131(_parentNode)
     2110    0.007    0.000    0.018    0.000 binderyxpath.py:139(_rootNode)
     1266    0.033    0.000    0.060    0.000 binderyxpath.py:147(_childNodes)
     7667    0.007    0.000    0.007    0.000 binderyxpath.py:55(__init__)
      844    0.007    0.000    0.243    0.000 binderyxpath.py:79(xml_xpath)
      844    0.001    0.000    0.001    0.000 collection.py:432(store_container)
      844    0.006    0.000    0.006    0.000 collection.py:562(convert_id)
      422    0.001    0.000    0.018    0.000 collection.py:610(get_meta_data_info)
      422    0.001    0.000    0.442    0.001 collection.py:640(get_meta_data)
      422    0.002    0.000   10.075    0.024 collection.py:855(get_member)
      422    0.006    0.000   10.073    0.024 collection.py:886(load_member)
772388/422    1.899    0.000    5.027    0.012 copy.py:144(deepcopy)
   228388    0.047    0.000    0.047    0.000 copy.py:197(_deepcopy_atomic)
18368/422    0.081    0.000    3.198    0.008 copy.py:223(_deepcopy_list)
    43466    0.653    0.000    0.897    0.000 copy.py:231(_deepcopy_tuple)
45019/422    0.529    0.000    4.988    0.012 copy.py:250(_deepcopy_dict)
   362896    0.419    0.000    0.544    0.000 copy.py:260(_keep_alive)
     8862    0.060    0.000    1.665    0.000 copy.py:276(_deepcopy_inst)
18793/422    0.192    0.000    5.014    0.012 copy.py:299(_reconstruct)
    18793    0.023    0.000    0.023    0.000 copy_reg.py:91(__newobj__)
        1    0.000    0.000    0.235    0.235 indexers.py:16(between)
      984    0.008    0.000    0.228    0.000 indexers.py:17(_between)
      422    0.001    0.000    0.002    0.000 members.py:26(generate_resource_id)
      984    0.000    0.000    0.000    0.000 parser.py:129(__iter__)
    11808    0.013    0.000    0.080    0.000 parser.py:132(next)
      984    0.022    0.000    0.106    0.000 parser.py:138(split)
      984    0.006    0.000    0.009    0.000 parser.py:144(__init__)
     1968    0.002    0.000    0.003    0.000 parser.py:220(jump)
      984    0.001    0.000    0.001    0.000 parser.py:223(weekday)
     1968    0.001    0.000    0.002    0.000 parser.py:231(month)
     1968    0.008    0.000    0.008    0.000 parser.py:239(hms)
      984    0.003    0.000    0.004    0.000 parser.py:245(ampm)
      984    0.000    0.000    0.000    0.000 parser.py:262(convertyear)
      984    0.003    0.000    0.003    0.000 parser.py:272(validate)
      984    0.029    0.000    0.216    0.000 parser.py:294(parse)
      984    0.043    0.000    0.181    0.000 parser.py:341(_parse)
      984    0.003    0.000    0.004    0.000 parser.py:37(__init__)
    11808    0.057    0.000    0.067    0.000 parser.py:51(get_token)
      984    0.004    0.000    0.219    0.000 parser.py:696(parse)
      422    0.001    0.000    0.011    0.000 posixpath.py:168(exists)
      422    0.002    0.000    0.003    0.000 posixpath.py:56(join)
    47293    0.034    0.000    0.034    0.000 sets.py:292(__contains__)
      422    0.001    0.000    0.016    0.000 store.py:116(get_meta_data_info)
        1    0.000    0.000   10.315   10.315 store.py:147(get_entries)
        1    0.001    0.001   10.315   10.315 store.py:155(_do_query)
        1    0.002    0.002   10.314   10.314 store.py:192(get_entries_by_dates)
      422    0.001    0.000    0.440    0.001 store.py:254(fetch_meta_data)
      422    0.002    0.000    0.015    0.000 storefs.py:61(info)
      422    0.012    0.000    0.439    0.001 storefs.py:72(get_content)
      422    0.001    0.000    0.440    0.001 storefs.py:89(get_meta_data)
     3936    0.001    0.000    0.001    0.000 tz.py:33(utcoffset)
      422    0.001    0.000    0.008    0.000 xmlparser.c:508(StartDocument)
      422    0.001    0.000    0.004    0.000 xmlparser.c:532(EndDocument)
     1713    0.002    0.000    0.003    0.000 xmlparser.c:561(StartNamespace)
    17946    0.021    0.000    3.234    0.000 xmlparser.c:678(StartElement)
    17946    0.018    0.000    0.337    0.000 xmlparser.c:715(EndElement)
    31198    0.030    0.000    0.366    0.000 xmlparser.c:740(Characters)
        3    0.000    0.000    0.000    0.000 xmlparser.c:797(ProcessingInstruction)
      422    0.000    0.000    0.006    0.000 xmlparser.c:950(Comment)
      422    0.000    0.000    0.000    0.000 {Ft.Xml.Lib.XmlString.IsXml}
    17946    0.010    0.000    0.010    0.000 {Ft.Xml.Lib.XmlString.SplitQName}
      422    0.006    0.000    0.006    0.000 {Ft.Xml.cDomlettec.CreateParser}
    17946    0.027    0.000    0.027    0.000 {_bisect.insort}
      984    0.003    0.000    0.003    0.000 {built-in method now}
    11401    0.025    0.000    0.025    0.000 {built-in method sub}
     1406    0.001    0.000    0.001    0.000 {cStringIO.StringIO}
    19914    0.005    0.000    0.005    0.000 {callable}
    13280    1.026    0.000    1.026    0.000 {dir}
    90139    0.083    0.000    0.083    0.000 {getattr}
   132342    0.156    0.000    0.167    0.000 {hasattr}
  1270214    0.204    0.000    0.204    0.000 {id}
    70977    0.032    0.000    0.032    0.000 {isinstance}
    18793    0.029    0.000    0.029    0.000 {issubclass}
    74491    0.018    0.000    0.018    0.000 {len}
    18793    0.224    0.000    0.224    0.000 {method '__reduce_ex__' of 'object' objects}
   590959    0.130    0.000    0.130    0.000 {method 'append' of 'list' objects}
      422    0.402    0.001    0.402    0.001 {method 'close' of 'file' objects}
      844    0.005    0.000    0.005    0.000 {method 'copy' of 'dict' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
      844    0.002    0.000    0.002    0.000 {method 'encode' of 'unicode' objects}
      844    0.000    0.000    0.000    0.000 {method 'endswith' of 'str' objects}
      844    0.000    0.000    0.000    0.000 {method 'extend' of 'list' objects}
     2110    0.002    0.000    0.002    0.000 {method 'find' of 'unicode' objects}
  1203256    0.336    0.000    0.336    0.000 {method 'get' of 'dict' objects}
    13272    0.010    0.000    0.010    0.000 {method 'getQNameByName' of 'Ft.Xml.cDomlette.Attributes' objects}
    17946    0.007    0.000    0.007    0.000 {method 'has_key' of 'dict' objects}
    17946    0.013    0.000    0.013    0.000 {method 'items' of 'Ft.Xml.cDomlette.Attributes' objects}
    45019    0.013    0.000    0.013    0.000 {method 'iteritems' of 'dict' objects}
        1    0.000    0.000    0.000    0.000 {method 'iterkeys' of 'dict' objects}
     4920    0.001    0.000    0.001    0.000 {method 'lower' of 'str' objects}
      844    0.054    0.000    0.084    0.000 {method 'parse' of 'Ft.Xml.XPath.XPathParser' objects}
      422    0.253    0.001    4.212    0.010 {method 'parse' of 'Ft.Xml.cDomlette.Parser' objects}
    22866    0.010    0.000    0.010    0.000 {method 'pop' of 'list' objects}
    19680    0.006    0.000    0.006    0.000 {method 'read' of 'cStringIO.StringI' objects}
      422    0.014    0.000    0.014    0.000 {method 'read' of 'file' objects}
    17946    0.013    0.000    0.013    0.000 {method 'remove' of 'list' objects}
      422    0.001    0.000    0.001    0.000 {method 'rfind' of 'str' objects}
    17946    0.013    0.000    0.013    0.000 {method 'rindex' of 'unicode' objects}
      422    0.005    0.000    0.005    0.000 {method 'setContentHandler' of 'Ft.Xml.cDomlette.Parser' objects}
      422    0.000    0.000    0.000    0.000 {method 'setFeature' of 'Ft.Xml.cDomlette.Parser' objects}
      422    0.002    0.000    0.002    0.000 {method 'setProperty' of 'Ft.Xml.cDomlette.Parser' objects}
    17946    0.014    0.000    0.014    0.000 {method 'setdefault' of 'dict' objects}
   134124    0.054    0.000    0.054    0.000 {method 'startswith' of 'str' objects}
    29343    0.046    0.000    0.046    0.000 {method 'update' of 'dict' objects}
      381    0.000    0.000    0.000    0.000 {method 'update' of 'set' objects}
      422    0.010    0.000    0.010    0.000 {posix.stat}
    43466    0.040    0.000    0.040    0.000 {range}
    11528    0.004    0.000    0.004    0.000 {setattr}

Bright_Content/Performance (last edited 2008-11-24 18:46:31 by localhost)