Sunday, March 01, 2009

Plone trunk continues to get faster

It's been more than two months since I had a detailed look at the state of Plone trunk performance. I spent this weekend evaluating what impact the changes we made during that time had on performance, and worked a bit more on it.

I made another round of improvements to the way we use CMF actions and converted some of the last Python scripts used in every page to browser view based code. Keeping as much untrusted code as possible out of the page rendering proved once again to be a fruitful endeavor.

My current findings indicate that the performance problems we have are caused both by the sheer overhead of security checks for untrusted code and, sometimes more importantly, by code that is spread out and hard to understand. An example I found today was the language selector viewlet: a nice mixture of code and nesting in the template, some code in the viewlet class, both backed by the language tool and querying a language utility in the backend. Each of the pieces looked good in itself, but the combination meant that even for a site with a single language, a large dictionary of all languages in the world was deep copied and iterated over twice, just to determine that the site used no more than one language. The relevant information is available through a simple attribute access on the language tool. Concentrating the control flow from all the different places into the browser view made it far easier to understand what happened and to avoid the insane overhead.
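To make the pattern concrete, here is a minimal Python sketch of the kind of overhead described above. It is not the actual Plone code: `ALL_LANGUAGES`, `LanguageTool`, `supported_langs`, `available_languages`, `show_selector_slow` and `show_selector_fast` are all hypothetical names invented for illustration, and the real viewlet, language tool and utility are structured differently.

```python
import copy

# Hypothetical stand-in for the backend language utility: a large mapping
# of every known language code to its metadata.
ALL_LANGUAGES = {
    "lang%03d" % i: {"name": "Language %d" % i} for i in range(500)
}
ALL_LANGUAGES["en"] = {"name": "English"}


class LanguageTool:
    """Hypothetical, simplified language tool; not the real Plone API."""

    def __init__(self, supported):
        # Codes of the languages enabled for this site.
        self.supported_langs = list(supported)

    def available_languages(self):
        # Defensive deep copy of the whole mapping on every call.
        return copy.deepcopy(ALL_LANGUAGES)


def show_selector_slow(tool):
    """Spread-out variant: deep copy everything, then iterate twice."""
    languages = tool.available_languages()
    # First pass over the full dictionary: collect (code, info) pairs.
    pairs = [(code, info) for code, info in languages.items()]
    # Second pass over all languages: keep only the ones the site uses.
    used = [info for code, info in pairs if code in tool.supported_langs]
    return len(used) > 1


def show_selector_fast(tool):
    """Concentrated variant: a single attribute access on the tool."""
    return len(tool.supported_langs) > 1
```

Both functions give the same answer, but for a single-language site the first one copies and walks hundreds of entries only to return `False`, while the second one looks at a one-element list.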

Long story short, compared to Plone 3.2 as a baseline, we have gone from being about three times faster to about four times faster now.


  1. Awesome! But the given example of the language portlet should be easily backportable to 3.x, no?

    Of course, I doubt that that alone would generate such impressive graphs as yours here :)

  2. I think ms/request is a more useful measure than request/s

  3. Tom: The language selector change in itself probably has an impact of half a millisecond. It was more an example for what kind of problems our former "template-script" soup brought us. Count a dozen of those mistakes and you have a noticeable difference, though.

    Laurence: All these numbers are ms/request, just turned into request/s for the presentation. 1000 divided by the number isn't too hard to calculate, should you be interested.

    P.S. I just made a couple more intrusive experiments and I think I know how to get the next 5ms or so down (yes, that means more than 50req/s). CMF Expressions apparently really suck performance-wise.

  4. Very impressive. I do hope you'll think about backporting what can be easily backported to 3.x. I know it will make your graphs less pretty, but it will also be really nice to be able to say "Plone 3.4: now 2x faster!" :-)