It's been more than two months since I last had a detailed look at the state of Plone trunk performance. I spent this weekend evaluating what impact the changes we made during that time had on performance, and worked a bit more on it.
I made another round of improvements to the way we use CMF actions and converted some of the last Python scripts used on every page to browser view based code. Keeping as little untrusted code as possible in the page rendering once again proved to be a fruitful endeavor.
My current findings indicate that our performance problems come both from the sheer overhead of security checks for untrusted code and, sometimes more importantly, from code that is hard to understand and spread out over many places. An example I found today was the language selector viewlet: a nice mixture of code and nesting in the template, some code in the viewlet class, both backed by the language tool and querying a language utility in the backend. Each of the pieces looked fine in itself, but the combination meant that even for a site with a single language, a large dictionary of all languages in the world was deep copied and iterated over twice, just to determine that the number of languages used in the site was not more than one. The relevant information is easily available, a simple attribute access away on the language tool. Concentrating the control flow from all the different places into the browser view made it far easier to understand what happened and to avoid the insane overhead.
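To make the pattern concrete, here is a minimal sketch in plain Python. It is not the actual Plone code: `ALL_LANGUAGES`, `LanguageTool`, `supported_langs`, `list_available` and the two `show_selector_*` functions are hypothetical stand-ins I made up for illustration. It only shows how a defensive deep copy plus two passes over a big dictionary can hide behind an innocent-looking question, and how the same answer falls out of one attribute.

```python
import copy

# Hypothetical stand-in for the utility's table of all known languages.
ALL_LANGUAGES = {
    "lang%03d" % i: {"name": "Language %d" % i}
    for i in range(500)
}


class LanguageTool:
    """Hypothetical stand-in for the language tool (not the real API)."""

    def __init__(self, supported):
        # The list of languages actually enabled for this site.
        self.supported_langs = list(supported)

    def list_available(self):
        # Defensive deep copy so callers cannot mutate the shared table.
        return copy.deepcopy(ALL_LANGUAGES)


def show_selector_slow(tool):
    """Layered approach: copy everything, then walk the copy twice."""
    languages = tool.list_available()
    # First pass: flag every known language as selected or not.
    for code, info in languages.items():
        info["selected"] = code in tool.supported_langs
    # Second pass: collect the flagged ones, only to count them.
    used = [code for code, info in languages.items() if info["selected"]]
    return len(used) > 1


def show_selector_fast(tool):
    """Direct approach: the answer is one attribute access away."""
    return len(tool.supported_langs) > 1
```

Both functions give the same answer for a single-language site, but the slow one touches every known language on each call, while the fast one only looks at the short list of languages the site actually uses.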
Long story short, compared to Plone 3.2 as a baseline, we went from being three times faster to about four times faster now.