Batched Indexing in Zimbra
I just want to take a moment to share a little piece of recently gained knowledge with you. If you don't already know about Zimbra's Batched Indexing feature, *turn it on*! This feature isn't on by default yet, but it makes a *huge* difference to performance. It is documented in the Zimbra Large Site Performance Wiki page.
The feature is enabled on a per-COS (class of service) basis, presumably to allow you to group your users based on their need for mailbox indexing, and tune the value appropriately. It can be set on every COS with this shell command:
for cos in $(zmprov gac); do zmprov mc "$cos" zimbraBatchedIndexingSize 20; done
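To confirm the change took effect, something along these lines should list each COS with its current value. This is a minimal sketch, not a tested recipe: the function name show_batch_sizes is made up for illustration, and it assumes zmprov gc prints attributes in the usual "name: value" form.

```shell
# Hypothetical helper: print each COS name and its zimbraBatchedIndexingSize.
# Assumes "zmprov gc <cos> <attr>" prints lines of the form "attr: value".
show_batch_sizes() {
    for cos in $(zmprov gac); do
        size=$(zmprov gc "$cos" zimbraBatchedIndexingSize |
               awk '/^zimbraBatchedIndexingSize:/ {print $2}')
        # Show "unset" when the attribute is not set on this COS.
        printf '%s\t%s\n' "$cos" "${size:-unset}"
    done
}
```

Run show_batch_sizes as the zimbra user after defining it in your shell.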
The "zimbraBatchedIndexingSize" value is the number of unindexed messages that must accumulate in a mailbox before they'll be indexed. If the owner of a mailbox does a text search while there are unindexed messages in the mailbox, all unindexed messages will be indexed on the spot, so there's no danger of messages being missed in search results.
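The behaviour described above can be pictured with a toy model. This is only a sketch of the semantics as I understand them, not Zimbra's actual code; every name in it (BATCH_SIZE, deliver, search, flush_index) is hypothetical.

```shell
# Toy model only: not Zimbra code. All names here are hypothetical.
BATCH_SIZE=20   # stands in for zimbraBatchedIndexingSize
unindexed=0     # messages delivered but not yet indexed
indexed=0       # messages already in the index

flush_index() {
    # Index everything that is pending in one pass.
    indexed=$((indexed + unindexed))
    unindexed=0
}

deliver() {
    # Delivery only bumps the pending count...
    unindexed=$((unindexed + 1))
    # ...and indexing runs once a full batch has accumulated.
    if [ "$unindexed" -ge "$BATCH_SIZE" ]; then
        flush_index
    fi
}

search() {
    # A search forces any pending messages to be indexed first,
    # so results never miss recently delivered mail.
    if [ "$unindexed" -gt 0 ]; then
        flush_index
    fi
    echo "searching $indexed indexed messages"
}
```

With a batch size of 20, nineteen deliveries leave the index untouched, the twentieth triggers one indexing pass, and a search at any point flushes whatever is pending.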
Essentially what this feature does is decouple message delivery from indexing, making them asynchronous. Until last Friday, we were having serious interactive performance issues whenever a message was sent to a substantial number of users (which isn't hard in a university environment). The I/O load spike would saturate the disks enough that all transactions would be affected -- especially sending messages from the web client. We'd see periods of 10-20 second delays while trying to send a message or save a draft. In addition, LMTP throughput wasn't great -- the graphs produced by zmstat-chart showed peak delivery rates of maybe 6-7 messages per second, with latency climbing to over 15 seconds (i.e., 15 seconds to deliver a message) during that time.
Once we made the change, it's like we're looking at completely different servers. We now see LMTP peak delivery rates of 20+ messages per second per mailbox server (and I suspect it can go higher - we just haven't had any really large mailouts since Friday) with latency peaking at 1 second. And interactive latency has dropped to under a second for sending. The peak disk I/O load has dropped from 80-100% busy to 20% busy. Peak IOPS on the index volume have gone from 1000+ to 100.
--
Steve Hillman IT Architect
hillman@sfu.ca IT Infrastructure
778-782-3960 Simon Fraser University