Ted Leung on the air: Open Source, Java, Python, and ...
I spent a bunch of time over the last few days trying to see if I could coax better performance out of my blog. Some of you have probably noticed (or complained) that the blog is pretty slow. This is mostly due to the fact that I'm dynamically rendering the blog for almost 1200 entries. Here are the things that I've done so far:
- Move the comments directory out of the datadir -- this reduced the number of files that needed to be stat'ed by about 1400
- Turn on entry caching (my testing shows that entryshelve is faster than entrypickle; YMMV)
- Implement a simple cache for tools.walk_internal that removes the redundant datadir scans caused by pycategories, pyarchives, and pycalendar
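The third item can be sketched roughly as follows. This is not pyblosxom's actual tools.walk_internal; cached_walk, clear_walk_cache, and the ".txt" default are hypothetical names and values chosen for illustration. The idea is only that several plugins asking for the same directory listing during one request should trigger a single scan of the disk.

```python
import os

# Hypothetical module-level cache: maps (root, extension) to the file list found there.
_walk_cache = {}


def cached_walk(root, ext=".txt"):
    """Return all files under root ending in ext, scanning the disk once per key."""
    key = (os.path.abspath(root), ext)
    if key not in _walk_cache:
        found = []
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                if name.endswith(ext):
                    found.append(os.path.join(dirpath, name))
        _walk_cache[key] = sorted(found)
    # Hand back a copy so callers cannot mutate the cached list.
    return list(_walk_cache[key])


def clear_walk_cache():
    """Forget every cached listing, e.g. after entries are added or removed."""
    _walk_cache.clear()
```

In a CGI setup the cache only lives for one request, which is enough to remove repeated scans within that request. A long-running process would also need to call clear_walk_cache() when the datadir changes.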
Also, somehow Planet Pyblosxom didn't make it into NetNewswire (fixed), so I wasn't seeing some of the pyblosxom-related discussions that have been going on outside of the developers' mailing list. Will has been holding down the fort, but newcomers Bill and Rob have some good ideas. I've been seriously contemplating switching my blog to WordPress due to the performance issues, but since things have improved a lot with only a small amount of effort, I think I'm going to spend some more energy trying to get pyblosxom to go faster. WordPress 1.3 doesn't look like it's showing up soon, and I'm not desperate enough to learn PHP so I can help/extend WordPress.
That doesn't stop me from having fun rolling my own blogging tool in my spare time though :)
Posted by John Lam at Wed Dec 8 03:21:03 2004
Posted by Darryl at Wed Dec 8 05:23:24 2004
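My Persist class has two methods, save and restore, which look like:

    def save(self):
        # update the in-memory copy, then write through to disk
        Persist.object_cache[self.name] = self.data
        pickle.dump(self.data, open(...))

    def restore(self):
        try:
            # fast path: the object is already in memory
            self.data = Persist.object_cache[self.name]
        except KeyError:
            # first access: load from disk and remember it
            self.data = Persist.object_cache[self.name] = pickle.load(open(...))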
I'm running Twisted, so object_cache is just a regular dictionary. It would be easy to drop in a Metakit wrapper, Berkeley DB, or some other solution.
My blog frontpage isn't too dynamic, only the articles are rendered (the right side navi w/ blogroll and archive links is static) but it can do 100 frontpages a second on a desktop Athlon.
With fancier output caching, keep track of which objects define a URL, and if that URL has been rendered before, return the final HTML result. Adding or editing a post invalidates the whole cache (easier than being smart about it). With this turned on it can churn out 600+ frontpages a second, many more than the connection can handle (it caps out at about 80 frontpages/sec).
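A minimal sketch of that kind of output cache, under some loud assumptions: OutputCache and its render callable are hypothetical names, and the cache is keyed on the URL alone instead of tracking which objects define each URL. It only illustrates the "return the final HTML if we've seen this URL, throw everything away on any write" idea.

```python
class OutputCache:
    """Cache rendered HTML per URL; any write drops the whole cache."""

    def __init__(self, render):
        self._render = render  # hypothetical callable: url -> html string
        self._pages = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        try:
            html = self._pages[url]
            self.hits += 1
        except KeyError:
            # Not rendered yet: do the expensive work once and remember it.
            html = self._pages[url] = self._render(url)
            self.misses += 1
        return html

    def invalidate(self):
        """Call after adding or editing a post; simpler than tracking dependencies."""
        self._pages.clear()


# Usage with a stand-in renderer
cache = OutputCache(lambda url: "<html>%s</html>" % url)
cache.get("/")      # rendered
cache.get("/")      # served from the cache
cache.invalidate()  # a post was added or edited
```

Dropping everything on each write costs one re-render per page afterwards, which is cheap on a blog where reads far outnumber writes.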
Micro-optimising tools.walk will only get you so far. Caching in the right place is a much bigger boost.
Posted by Jack Diederich at Wed Dec 8 14:31:05 2004
My blog is seeing about 9k hits/day. Movable Type is different because it statically renders the pages. You are leveraging the fine Apache HTTPD server.
Posted by Ted Leung at Thu Dec 9 00:13:37 2004
I agree the fancier output caching is a big part of the answer. Another of the pyblosxom contributors is working on that, so I'm working on other stuff, like micro-optimising tools.walk.
At some point I'm interested in exploring alternate storage layers, although at that point we'll really have departed from the *blosxom tradition.
Posted by Ted Leung at Thu Dec 9 00:15:18 2004
Posted by Elliot Lee at Fri Dec 10 21:58:16 2004