Its major feature: speed, speed and speed. It is faster than any other templating system out there by an order of magnitude.
Some time has passed and careful observers have spotted NKOTB at times. During last year's performance sprint, organized and sponsored by Headnet, a critical mass of people met and tried various approaches to put this ingenious piece of art to some use.
Some time passed again and NKOTB got a proper name and a public presence; it is now known as Spitfire. But when you put too many intelligent people into one room, you won't always get them to push in one direction. So Project Messerschmidt was born: an attempt to mimic the underlying approach of Spitfire but make it vastly simpler.
So where are we today?
Spitfire is under active development, but to this day it has not been integrated into any Zope environment beyond some initial attempts. Spitfire only supports a Cheetah-like syntax today.
What about Project Messerschmidt?
It is nowadays known as z3c.pt and used by projects like Vudo or repoze.bfg. It has a full implementation of the TAL standard including the i18n namespace. While it consciously chooses some default behavior that differs from zope.pagetemplate, it is virtually compatible.
The most notable differences are the lack of support for METAL and the provider expression and, to this day, the missing Five or Zope2 integration. The latter two should be a matter of one focused day of work to finish.
But what about the main objective, speed? There is one established shoot-out test for templating languages called 'bigtable'. The objective is to render an HTML table of ten columns and 1000 rows. It is somewhat like the 'top speed of your car' number. It certainly isn't the only number you care about when buying a car, but it is easy to compare and reasonably objective.
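To give an idea of what such a test measures, here is a minimal sketch in plain Python, not the actual benchmark code of either project. The cell values, the function name `render_bigtable` and the number of timing runs are my own assumptions; only the table dimensions come from the description above. `html.escape` stands in for the older `cgi.escape`.

```python
import timeit
from html import escape

ROWS, COLS = 1000, 10

# Assumed test data: only the dimensions (1000 rows, ten columns) are given.
table = [list(range(1, COLS + 1)) for _ in range(ROWS)]


def render_bigtable(table):
    """Render the table with two nested loops, roughly what a
    compiled template boils down to."""
    out = ["<table>\n"]
    for row in table:
        out.append("<tr>")
        for col in row:
            out.append("<td>%s</td>" % escape(str(col)))
        out.append("</tr>\n")
    out.append("</table>")
    return "".join(out)


if __name__ == "__main__":
    runs = 10  # arbitrary; more runs give a steadier number
    seconds = timeit.timeit(lambda: render_bigtable(table), number=runs)
    print("%.2f ms per page" % (1000 * seconds / runs))
```

A hand-written loop like this is a useful baseline: no template engine should be expected to beat it by much.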
So here are the numbers, judge for yourself:

Update: Thanks to a comment by Alexander, I made the chart a bit more impressive :)
Update 2: Since I've seen Donna experimenting with Mako, I added it to the graph.

12 comments:
For those not in the know, Messerschmidt is a renowned German manufacturer of fighter planes (today they produce small bubble cars). While the Spitfire propelled its way through the skies, the Me 262 was a real jet fighter.
The lack of METAL support is not a problem in the long run, since macros are evil. Instead of including macros, use viewlets, and for theming something like Deliverance should be used.
In the short run it is a problem to do theming in, for example, Plone without macros. But then again it's only useful if you want to be able to actually call the ZPT main template macros, so then you would have to include complete ZPT macro compatibility....
Better to focus on getting a Deliverance-style theming layer into the zc.themehook. ;) (Yeah, I know, I promised to do that a year ago...)
Nice summary!
I'm excited about z3c.pt even though my heart lies with Spitfire. The cleanup and refactoring needed to support z3c.pt puts us in an ideal position to move to Spitfire later.
One important thing to note is that we'll keep compatible syntax across the implementations — tal:content will still be tal:content, regardless of the engine involved.
PS: Just a useful tip for the future when it comes to graph visualization: try to make the tallest graph be the "best/fastest". In this particular case, I would have picked "pages/second" instead of "second/page", so the tallest graph would be fastest. That would also show just how much faster both z3c.pt and Spitfire really are at the bigtable test. :)
PPS: Spitfire is a template compiler, not a template language. That's why it can support multiple input syntaxes. It generates an Abstract Syntax Tree. :)
Thank you for the graphical tip! I updated the chart to show the reversed benchmark, and Spitfire now looks an order of magnitude better than z3c.pt ;)
Good work Hanno. The current trunk build of Spitfire does have an attribute language that did at one time work. :-) I'm busy trying to get Spitfire used in a real production project. Once I have a little time, I will revisit the attribute language and make sure its test cases pass again.
In other news, my recent work on improving the performance of general Python has led me to some strategies that may generate even faster code. Once Spitfire is running in production there should be a wealth of data for further investigation.
Can you please publish the HTML page used for rendering?
@Baiju: The bigtable tests are available in SVN:
svn://svn.zope.org/repos/main/z3c.pt/trunk/benchmark/benchmark/bigtable.py
and
http://spitfire.googlecode.com/svn/trunk/tests/perf/bigtable.py
I haven't tried, since Mako is plenty fast for anyone using it (including reddit), but I wonder what would happen if you ran its generated Python modules through an optimizer like Psyco? I'm not sure how spitfire/z3c.pt are getting these kinds of results, but if it's just a question of post-optimization on generated Python, then this is something that could apply just as well to Mako (and maybe Cheetah). The fact that you're showing bigtable results (which is just two nested loops) suggests this is what's going on.
@mike bayer:
The results for all of the tests don't use psyco, since that is an optimization strategy that applies to all of the templating languages. If you enable psyco, the spitfire result goes up by about 50% for example (spitfire does have built-in support for psyco at the O4 level, I show the result for O2/3, which for this simple test is the same).
I haven't actually tried running z3c.pt with psyco enabled but would expect to get a result of one third to 50 percent better as well.
Huge benefits are gained by not using c/StringIO but a simple list for storing the internal buffer, optimizing the cgi.escape function and merging consecutive write statements into one...
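As a rough illustration of two of these points (the list buffer and the merged writes), here is a simplified sketch. It is not the code z3c.pt or Spitfire actually generates; both function names are made up for the example, and which variant wins, and by how much, depends on the Python version, so measure before relying on it.

```python
import io


def render_stringio(rows):
    """Naive variant: a StringIO buffer and one write per fragment."""
    buf = io.StringIO()
    for row in rows:
        buf.write("<tr>")
        for col in row:
            buf.write("<td>")
            buf.write(str(col))
            buf.write("</td>")
        buf.write("</tr>")
    return buf.getvalue()


def render_list(rows):
    """Optimized variant: a plain list as buffer, joined once at the end."""
    out = []
    append = out.append  # bind once to avoid repeated attribute lookups
    for row in rows:
        append("<tr>")
        for col in row:
            # Three consecutive writes merged into a single one.
            append("<td>%s</td>" % (col,))
        append("</tr>")
    return "".join(out)
```

Both functions produce identical output, so they can be swapped and timed against each other with `timeit`.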
> Huge benefits are gained by not using c/StringIO but a simple list for storing the internal buffer, optimizing the cgi.escape function and merging consecutive write statements into one...
Mako uses a list instead of StringIO and merges write statements into one as well. The 0.2 series also inlines the buffer reference in the template to reduce name lookups.
We haven't optimized cgi.escape but you can disable all usage of cgi.escape (as well as the default call to unicode()) which is useful for benching the raw interpreter speed (but I wouldn't think this would provide a 300% speed improvement....). The Mako compiled template for "bigtable" is literally just the two loops and a couple of buffer.write() statements...just look at the generated source. There's a little bit of execution overhead but not much (i.e. not 300-500%, certainly), and there's no runtime "interpreter" in use either - Python controls the flow of execution directly.
So without my downloading everything and combing through everyone's source (which I guess I'll have to find the time to do), and without any native code optimization present, I'm not seeing any factor that would produce such largely divergent numbers. Mako has a few constructs that slow it down, like '%namespace import="*"', but I doubt you're using that...it also wouldn't explain your huge gains over Cheetah. You should definitely turn off the default "unicode()" filter though, this is a fairly significant performance factor.
I wrote Mako testing against Cheetah's speed at every step (including Cheetah's native C extensions) and could hardly squeeze an extra half second beyond them even without the default unicode() filter.
Any other ideas?
Further discussion of the Mako comparison has been taken to http://groups.google.com/group/pylons-discuss/browse_thread/thread/9d453cc5b9fc6364/6dc5608263c8cd13
Re: the bigtable benchmark... I have also used it, primarily to compare the performance of evoque and mako, employing a number of variations to explore mostly how quoting affects performance, i.e. when it is done automatically, manually, or not at all.
In addition, I also explored another variation, namely a "manually tweaked" version of this simple double-loop bigtable template (basically turning it into a single Python expression, as a list comprehension). It is interesting that both evoque and mako seem to speed up to a max, probably getting very close to a real Python speed limit. The specific results I am referring to are here:
http://evoque.gizmojo.org/benchmark/bigtable/
Being eval-based, evoque pays a higher price on such a loop-intensive template (but performs faster on pure substitutions, as indicated by the companion subs benchmark).
The template source string for the "manually tweaked" variation (since it has just a single expression, the syntax is identical for both evoque and mako) is:
"""<table>
${ "".join([ "<tr>%s</tr>" %("".join(["<td>%s</td>"%(col) for col in row]))
for row in table ]) }
</table>
"""
This variation makes mako go twice as fast, and the gains are even bigger for evoque. Thus, in response to "Any other ideas?" above, another idea is clever loop optimization ;-)
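For readers without either engine at hand, the same idea can be sketched in plain Python. This is only an illustration of the two shapes of code, not what evoque or mako generate; the function names are made up, and no speedup figure is implied for this sketch.

```python
def render_loops(table):
    """Explicit double loop, appending one fragment at a time."""
    out = ["<table>\n"]
    for row in table:
        out.append("<tr>")
        for col in row:
            out.append("<td>%s</td>" % (col,))
        out.append("</tr>")
    out.append("\n</table>\n")
    return "".join(out)


def render_single_expression(table):
    """The same output built as one nested list-comprehension expression."""
    body = "".join(
        ["<tr>%s</tr>" % "".join(["<td>%s</td>" % (col,) for col in row])
         for row in table]
    )
    return "<table>\n%s\n</table>\n" % body
```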
Code for this and all benchmarks is of course included in the evoque distribution...