Saturday, May 23, 2009

Visualizing the persistent structure of a page

Inspired by David's blog post, I took his script and modified it slightly for a different purpose. Instead of showing an entire Plone site, I ran it on a single page with only a simple title and text. Here's what a page actually consists of on the persistent data level:



So what you commonly think of as one page object actually consists of at least ten persistent objects. As soon as you add references to it or store certain other information on it, it grows even more.

If you have ever heard complaints about what is wrong with the way we store data in Plone, this might make it more obvious. Another thing you need to know about the ZODB is that, slightly simplified, it is just one big table mapping a unique identifier (oid) to pickle data. While the object references above are important conceptually, on the DB level loading any one of these objects is the same operation. If you have a ZEO setup where loading an object incurs some network latency, you'll end up with ten times the latency for loading what seems to be a simple page object.
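To make the "one big table" picture concrete, here is a minimal sketch in plain Python. It is not ZODB code: `ToyStore`, `Persistent`, the `part_N` attributes and the latency figure are all made up for illustration, and only the standard `pickle` module's persistent-id hooks are used. Unlike the real thing, the toy loads every referenced record eagerly and only handles tree-shaped data, but it shows how a single "page" turns into ten records and ten load operations.

```python
import io
import pickle


class Persistent:
    """Toy stand-in for a persistent object: each instance gets its own record."""

    def __init__(self, **attrs):
        self.oid = None
        self.__dict__.update(attrs)


class _Pickler(pickle.Pickler):
    def __init__(self, file, store):
        super().__init__(file)
        self.store = store

    def persistent_id(self, obj):
        # A referenced persistent object is saved as its own record;
        # only its oid ends up inside the referring pickle.
        if isinstance(obj, Persistent):
            return self.store.save(obj)
        return None


class _Unpickler(pickle.Unpickler):
    def __init__(self, file, store):
        super().__init__(file)
        self.store = store

    def persistent_load(self, oid):
        # Following a reference is just another lookup in the same table.
        return self.store.load(oid)


class ToyStore:
    """Hypothetical model of the storage: one table mapping oid -> pickle."""

    def __init__(self):
        self.records = {}
        self.loads = 0
        self._next_oid = 1

    def save(self, obj):
        if obj.oid is None:
            obj.oid = self._next_oid
            self._next_oid += 1
            state = {k: v for k, v in vars(obj).items() if k != "oid"}
            buf = io.BytesIO()
            _Pickler(buf, self).dump(state)
            self.records[obj.oid] = buf.getvalue()
        return obj.oid

    def load(self, oid):
        self.loads += 1  # one lookup, and with ZEO potentially one round trip
        state = _Unpickler(io.BytesIO(self.records[oid]), self).load()
        obj = Persistent(**state)
        obj.oid = oid
        return obj


# A "page" with a title, a text and nine hypothetical sub-objects.
store = ToyStore()
parts = {f"part_{i}": Persistent(payload=f"data {i}") for i in range(9)}
page = Persistent(title="A page", text="Hello", **parts)
root_oid = store.save(page)

loaded = store.load(root_oid)
ASSUMED_LATENCY_MS = 2  # assumed cost per load, not a measured value
print("records in the table:", len(store.records))
print("loads for one page:", store.loads)
print("total latency (ms):", store.loads * ASSUMED_LATENCY_MS)
```

Every load in the sketch is the same dictionary lookup no matter which object it belongs to, so whatever one lookup costs is paid once per record rather than once per page.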

So if you ever wondered why you need to increase your ZODB cache to numbers in the thousands and Plone needs so much RAM, the above should give you some ideas. Fixing these problems is incredibly hard after they have been introduced, but people like Laurence Rowe keep reminding us of the importance of it. </rant> ;)

6 comments:

  1. Any chance to see the script that you used to produce the graph? Thank you.

  2. Thank you for posting this reminder.

    It's definitely something we need to look into, most likely in Plone 5 or Plone 6, once the new type story is more clear to us.

  3. @mgr: I used David's script, added the size via len(p) into the loop and did the rest manually via some knowledge and OmniGraffle magic. Finding out that the BTree is actually the object stored in the attribute __annotations__ is quite hard to do in the general case.

  4. Maybe you should be using from ZODB.referencesf import referencesf. See this link for an example

  5. that should be from ZODB.serialize import referencesf

  6. Add another persistent object, __vc_info__. This attribute gets persisted once version control is applied via ZopeVersionControl.
