Posted by Rick Dales, VP Product Management
In the world of email archiving, there is an
ongoing argument about the value of stubbing, a process designed to help manage
the storage in Exchange by replacing messages or attachments on an email server
with a link to a copy of the file in an archive. I thought I’d weigh in on this
topic, first by explaining the concept and looking at the pros and cons, and
then, in a second post, providing a list of four best practices that
businesses should follow if they’re relying on stubbing in their organization.
With the growth of
email volume outpacing the decline in the total cost of storage ownership, it
comes as no surprise that IT is struggling to manage Exchange storage. The real frustration for most Exchange
administrators is that the vast majority of their storage is occupied by
content that people almost never read. For performance and reliability reasons, Exchange is usually deployed
on the most expensive storage platforms, which makes this usage pattern extremely
costly. Furthermore, because Exchange is a transactional
system, every piece of data is open to modification. This means that every piece of data needs to
be backed up on a regular basis.
Introducing Stubbing – How it works
All of these factors
have led IT to investigate archiving as a means to address their storage challenges. The idea is simple – focus the Exchange
server on the delivery and management of current mail, and push the older mail
to another repository that can be managed on less expensive infrastructure.
That repository can then use archival storage management processes that allow
for incremental backup of only newly added information, rather than the entire
set.
Moving the data to
another location (the archive) benefits IT; however, training users to change
their behavior and look for this information in a new application (often with
its own user interface and workflows) usually proves too cumbersome for broad
adoption. To address these concerns,
archiving vendors introduced features known as stubbing
or shortcutting. This involves replacing
the messages or attachments in users’ mailboxes with a pointer to the copy in
the archive. From an end user’s
perspective, the email data is still accessible from Outlook, and yet they
run into their mailbox quota less often.
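To make the mechanism concrete, here is a minimal Python sketch of the stubbing idea described above. It is an illustration only, not how any particular archiving product or Exchange itself works: the `Message` class, the `stub_message` and `open_message` helpers, and the reference format are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Message:
    subject: str
    body: str                          # full content, or the stub text once archived
    archive_ref: Optional[str] = None  # hypothetical pointer into the archive


def stub_message(msg: Message, archive: Dict[str, str]) -> None:
    """Copy the body into the archive, then replace it with a short pointer."""
    ref = f"archive-item-{len(archive)}"  # made-up reference format
    archive[ref] = msg.body
    msg.body = f"This item has been archived. Reference: {ref}"
    msg.archive_ref = ref


def open_message(msg: Message, archive: Dict[str, str]) -> str:
    """Return the full content, following the pointer if the message is a stub."""
    if msg.archive_ref is not None:
        return archive[msg.archive_ref]
    return msg.body


# A large message is stubbed; the mailbox copy shrinks to a short pointer.
archive: Dict[str, str] = {}
original = "Quarterly report text. " * 5000
msg = Message(subject="Q3 report", body=original)
stub_message(msg, archive)
```

After `stub_message` runs, the mailbox holds only the short pointer text, while `open_message` still returns the full content, which is the end-user experience that stubbing aims for.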
Stubbing Drawbacks
Stubbing isn’t without
its drawbacks, however. To understand the
impact on storage, you need a solid understanding of Exchange’s single instance
storage model. When a message is
delivered to multiple recipients within the same mailbox database (storage
group), the message body and attachments are only stored once, and the message
entry in each mailbox simply references the single copy of this data.
When a user modifies a
message in their mailbox, Exchange creates a unique copy of the content and
points the message in the user’s mailbox to that copy. Because Exchange doesn’t provide any way to access
the single-instance store of content, stubbing processes behave like end-user
edits, modifying messages on a mailbox-by-mailbox basis. If a message was sent to multiple recipients
on the same mailbox database, but you only stub content for some of them, you
actually increase, not decrease,
storage by implementing stubbing. Furthermore, even though stubs may be small (typically <2K), as the
stubbing process works through each mailbox, it creates a separate item in
the single-instance store for each one.
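The arithmetic behind this is easy to check with a toy model. The Python sketch below is a deliberately simplified simulation, not a description of Exchange internals: the `MailboxDatabase` class and its methods are hypothetical, the 1,000,000-byte message and ten recipients are made-up numbers, and the 2,048-byte stub size is an assumption taken from the "typically <2K" figure above.

```python
class MailboxDatabase:
    """Toy single-instance store: mailboxes reference shared content items."""

    def __init__(self):
        self.contents = {}   # content id -> size in bytes
        self.mailboxes = {}  # user -> content id their message points to
        self._next_id = 0

    def _store(self, size):
        cid = self._next_id
        self._next_id += 1
        self.contents[cid] = size
        return cid

    def deliver(self, users, size):
        """Store the content once and point every recipient at it."""
        cid = self._store(size)
        for user in users:
            self.mailboxes[user] = cid

    def stub(self, user, stub_size=2048):
        """Like an end-user edit: this mailbox gets its own unique item."""
        self.mailboxes[user] = self._store(stub_size)
        # Drop content that no mailbox references any more.
        live = set(self.mailboxes.values())
        self.contents = {c: s for c, s in self.contents.items() if c in live}

    def total_bytes(self):
        return sum(self.contents.values())

    def item_count(self):
        return len(self.contents)


users = [f"user{i}" for i in range(10)]
db = MailboxDatabase()
db.deliver(users, 1_000_000)   # 1 item, 1,000,000 bytes

for user in users[:4]:         # stub only some of the recipients
    db.stub(user)
partial = (db.item_count(), db.total_bytes())   # (5, 1008192): storage went up

for user in users[4:]:         # stub the rest
    db.stub(user)
full = (db.item_count(), db.total_bytes())      # (10, 20480): smaller, but 10 items
```

In this model, the original content is only released once every recipient has been stubbed; until then each stub is pure overhead. Even in the best case, one stored item has become ten.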
Since many
Exchange and data management processes are affected by the number of entries in
the database tables, not just their total size, the unwinding of single-instance storage
in Exchange can be problematic. As it
happens, however, Microsoft Office has a habit of updating attachment metadata
when a user views the item, which in most environments means that
single-instance storage is pretty much non-existent within Exchange. The more of these changes that are made in
Exchange between backups, the longer an incremental backup of the mail system
will take.
Microsoft’s answer to
the storage management problem is Exchange 2007, which supports
dramatically larger mailboxes and changes the way backup processes work so
that managing these larger mailbox databases becomes more practical. While most firms that I’ve talked to plan to
increase mailbox sizes with their conversion to Exchange 2007, few are creating
the 1GB mailboxes that Microsoft touts.
Conclusion
Clearly,
stubbing is not the straightforward Exchange storage management solution that
some vendors would have you believe. That said, when implemented
properly, it can be a valuable tool for managing the growth of Exchange storage with
minimal impact on end-user behavior. In my next post, I’ll talk about four best
practices to make the most of stubbing in your organization.