Proofpoint: Security, Compliance and the Cloud

15 posts categorized "Storage"

August 22, 2011

Cloud Computing and the Law: Gary Steele Discusses Cloud Privacy and Security on NBC's Press Here



There are two kinds of people: Those who get up early enough on Sunday to watch the news and policy wonk shows and, well, those of us who don't. If, like me, you find yourself in the second camp, you might have missed Proofpoint's CEO, Gary Steele, discussing "Cloud Computing and the Law" with reporters from NBC, Forbes and Bloomberg on yesterday's edition of NBC's "press:here" interview show.

In this segment, Gary discusses some of the legal issues around cloud computing, including whether an electronic document stored in the cloud is entitled to the same protection as that same file stored in a physical safe. While this conversation is focused on data privacy and legislative issues, a discussion of some of the security concerns around cloud computing and storage also comes up.

The conversation ranges from basics about "the cloud" to the concerns around data locality, search and seizure of data and the evolving state of privacy legislation. You can watch a video replay below:

 

 

May 20, 2008

Can one email archiving approach meet all your needs? (Part 4 of 4)

Posted by Rick Dales, VP Product Management

In my last three posts, I introduced the idea that there are multiple approaches to archiving and took a deeper look at the two most widely-used methods: mailbox archiving and journaled archiving. I conclude this series of posts by addressing a question that often comes up: Can one email archiving approach both solve your mailbox storage management challenges and meet your legal discovery and compliance requirements?

As I mentioned in my first post, companies may have many goals when they decide to implement an email archive, but some goals may end up being in conflict with others.   For example, the IT group may implement an archive for mailbox storage management purposes and let users control which messages are archived and which ones are deleted.   However, by doing this, they defeat the organization’s retention policy and make the archive a meaningless place to manage preservation orders for a litigation hold. 

Most of the in-house archiving software products implement both mailbox archiving and journal archiving and allow customers to enable both approaches as a way to deal with the limitations of each. Not only is this impractical, it also results in duplicate storage of content (despite what they might tell you about single instance storage).

At Fortiva, we use journal archiving because we wanted to ensure that we could address the litigation readiness and compliance requirements. However, as I mentioned in my previous posts, using journaling as a source of information that you plan to expose to end-users requires additional work (that most archives don’t attempt to do). We do the extra work to understand routing of messages and assignment to end-user mailboxes so that one copy of the message can be used for both end-user access and discovery.

Fortiva offers capabilities such as stubbing, a process similar to mailbox archiving where a periodic scan of mailboxes is performed.  Unlike implementing mailbox archiving on top of journaling, we scan mailboxes and then use our powerful real-time search engine to find the item that already exists in the archive to determine what the stub (or shortcut) in the mailbox should point to.  Doing so allows us to leverage the single copy of the data that is already in the archive via journaling.

It must be noted that Fortiva’s solution is built around a retention policy engine that assigns retention when messages are archived.  This means that neither users nor IT can simply say “I don’t need this anymore” and delete items at will.  As such, while Fortiva provides the added value of addressing storage management challenges, our on-demand archive is most suited for those that have a need for consistent retention as a core business requirement. 

While most modern archiving solutions offer some capabilities to address legal discovery and storage management challenges, each will have limitations in one area or the other, partly because the “optimal” business rules for each problem are in conflict. Thus, knowing what your primary goal is will help you decide which email archiving approach is best suited for your organization.

May 16, 2008

Approach 2: Journaled Archiving (Part 3 of 4)

Posted by Rick Dales, VP Product Management

In my last two posts, I talked about the fact that there are multiple approaches to archiving, each with its pros and cons. I also took a closer look at one of those approaches – mailbox archiving.  In this post, I will dive more deeply into another widely-used approach – journaled archiving – including how it works and what problems it is best suited to address.

Journaled archiving relies on a feature in the mail system that captures a copy of every message in transport (as it is sent/received) and puts a copy in another mailbox.  This copy of the message is stored as an attachment to a message known as a journal report, which contains additional information about the actual recipients of the original message.  The archiving system then uses this “journal mailbox” as a source of messages to be captured (and typically deletes the content once it has been captured).  Some outsourced solutions rely on the customer configuring journaling to deliver to a remote SMTP address.

Strengths

  • Complete capture of email messages
    The journaling process places a copy of every message that is sent/received into a separate mailbox at the same time that a user receives it in their mailbox.  A user choosing to delete the message in their own mailbox has no bearing on whether the message gets archived. 
  • A single, complete picture of each message
    As the journaling process includes BCC information and expansion of distribution lists, the archiving system can provide a full picture of the original message.  While multiple Exchange servers can increase the complexity on this front (because multiple journal reports may be created), the data exists to allow an archiving system to collapse the data into a single message containing all information about the actual recipients.
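The collapsing step mentioned in the last point can be pictured with a short Python sketch. The `JournalReport` structure and its fields are hypothetical simplifications for illustration, not the format Exchange produces or the way Fortiva implements it; the only idea shown is that reports for the same message are grouped and their recipient lists merged.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class JournalReport:
    """Hypothetical, simplified journal report: one per server that handled the message."""
    message_id: str    # identifier of the original message
    recipients: tuple  # actual recipients this server knew about (BCC, expanded lists)


def collapse_reports(reports):
    """Merge all reports for the same message into one recipient list per message."""
    merged = defaultdict(set)
    for report in reports:
        # Normalize case so the same address seen by two servers is counted once.
        merged[report.message_id].update(addr.lower() for addr in report.recipients)
    return {message_id: sorted(addrs) for message_id, addrs in merged.items()}


reports = [
    JournalReport("<msg-1@example.com>", ("alice@example.com", "bob@example.com")),
    JournalReport("<msg-1@example.com>", ("Bob@example.com", "carol@example.com")),
    JournalReport("<msg-2@example.com>", ("dave@example.com",)),
]
collapsed = collapse_reports(reports)
print(collapsed["<msg-1@example.com>"])
```

Here the two reports for the first message end up as a single entry listing alice, bob and carol once each.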

Weaknesses

  • Providing end-user access to their own mail is difficult
    To provide end-users with access to the messages that they sent or received, an archiving system has to determine which mailboxes a message was actually delivered to. The address information on journal reports is insufficient to achieve this, as forwarding and routing rules must be factored into the equation. While it is possible to do this (and Fortiva does), most other journal mail systems do not, resulting in journaled messages being available only to IT or legal staff who have rights to see all mail.
  • No direct ability to modify/stub messages
    There is no connection between a journal report in the journaling mailbox and the messages that live in users’ mailboxes. Replacing message content in users’ mailboxes with a pointer to the message captured using journaling requires the archiving system to use complex lookup routines based upon content similarity. Fortiva uses this approach, but most firms do not.
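To give a feel for what such a lookup involves, here is a deliberately simplified Python sketch. It is an assumption-laden illustration, not Fortiva's routine: the `fingerprint` function, the dictionary fields and the archive IDs are all hypothetical, and it uses an exact hash of normalized content where a real system would need fuzzier similarity matching.

```python
import hashlib


def fingerprint(sender, subject, body):
    """Hypothetical content fingerprint: normalized header fields plus the body text."""
    normalized = "\n".join([
        sender.strip().lower(),
        " ".join(subject.split()).lower(),  # collapse whitespace differences
        " ".join(body.split()),
    ])
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_archive_index(archived_messages):
    """Map fingerprint -> archive ID for every journaled message."""
    return {
        fingerprint(m["sender"], m["subject"], m["body"]): m["archive_id"]
        for m in archived_messages
    }


def find_stub_target(mailbox_item, index):
    """Return the archive ID the stub should point to, or None if nothing matches."""
    key = fingerprint(mailbox_item["sender"], mailbox_item["subject"], mailbox_item["body"])
    return index.get(key)


index = build_archive_index([
    {"archive_id": "A-1001", "sender": "alice@example.com",
     "subject": "Q3 forecast", "body": "Numbers attached."},
])
in_mailbox = {"sender": "Alice@Example.com", "subject": "Q3  forecast ", "body": "Numbers  attached."}
unknown = {"sender": "bob@example.com", "subject": "Lunch?", "body": "Noon?"}
print(find_stub_target(in_mailbox, index), find_stub_target(unknown, index))
```

The mailbox copy differs from the journaled copy only in case and whitespace, so it resolves to the same archive item; the unrelated message finds no match and would be left alone.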

Appropriate Uses of Journaled Archiving

Best suited for: Legal and Regulatory Compliance
Journaled archiving is the Microsoft-recommended approach for capturing data for legal discovery and compliance requirements.  It allows for the complete capture of all messages in a single, unified view.

Not usually well-suited for: Email Storage Management*
Unless the archiving vendor specifically implements other processes to clean up user mailboxes, journaled archiving approaches won’t address storage management challenges. Some journaled archiving solutions, including Fortiva, have implemented attachment stubbing (replacing attachments with a link to the file in the archive) to address this.

Not usually well-suited for: End-user Access*
Unless the archiving vendor specifically implements techniques to determine which users actually received mail, users will either not be able to access their own mail, or will be granted access to only a subset of the messages that they actually received. Some solutions, such as Fortiva, have developed a way to overcome this, allowing end-users to fully access all their archived mail. Because journaled archiving doesn’t work against the users’ mailboxes, it can’t record which folder each user chooses to file the messages into.

* NOTE - As a point of reference (and self-disclosure), Fortiva uses journaled archiving. It overcomes some of the noted limitations with additional address resolution techniques and the use of a periodic scan of users’ mailboxes to allow for the stubbing of older attachments.

May 08, 2008

Wondering how you can afford to go green? Get your numbers straight.

Posted by Justin Wiebe, Fortiva Operations

Given my posting last month on green computing, I found the following statistic based on a new survey by McKinsey and Co. quite interesting.  Apparently, the world’s data centers are projected to surpass the airline industry as a greenhouse gas polluter by 2020.

In my previous post, I wrote about efforts taken by Fortiva to reduce our overall infrastructure power consumption. This has had the dual benefit of reducing both our impact on the environment and our cost of doing business. Since then, I have been thinking more about the challenges of justifying green computing from a dollars and cents perspective.

Until recently, I have rarely been able to create a positive Return On Investment (ROI) for new hardware purchases, especially those related to green computing. It turns out that all along I was missing something – that dollar amount that pushes the cost of existing systems over the top and reduces the payback to less than 3 years (sound familiar?). And what is that key? Power – more importantly, the cost of the power used over its lifetime by the piece of hardware you want to replace. As noted by Mark Monroe, Director of Sustainable Computing at Sun, rarely are power costs included in the IT budget.

Here at Fortiva, we try to roll all of our data center costs into one number. By adding up all co-location costs, power costs, cooling costs and miscellaneous data center costs (but not bandwidth costs) and then dividing this number by the useable power, we obtain the monthly $/VA  cost (or approximately $/W). By then determining the amount of power (VA or W) used by each server, we can calculate the cost per month to keep the server up and running. A sample of the calculation may look something like this:

Co-location Cost ($/VA per month) * Power Used by Server (VA) * Server Life (months) = Cost to Run Server

With the cost of power increasing almost everywhere, the Cost To Run Server is approaching the Cost To Buy Server. At Fortiva, the cost of hosting our dual-cpu servers for three years is approximately 75% of the total cost to purchase the server. If you stretch the life of the server out to five years, it actually costs more to host the server than to buy it.
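For anyone who prefers code to a spreadsheet, the calculation can be sketched in a few lines of Python. Every figure here (the monthly data center cost, usable power, server draw and purchase price) is an illustrative assumption, not one of Fortiva's actual numbers; they were picked only so that the result lands near the ratios described above.

```python
def monthly_cost_per_va(total_monthly_cost, usable_power_va):
    """All-in monthly data center cost divided by usable power, in $/VA per month."""
    return total_monthly_cost / usable_power_va


def cost_to_run_server(cost_per_va, server_power_va, life_months):
    """Co-location cost ($/VA per month) * power used by server (VA) * server life (months)."""
    return cost_per_va * server_power_va * life_months


# Illustrative figures only (assumptions, not real Fortiva data).
rate = monthly_cost_per_va(total_monthly_cost=20_000, usable_power_va=40_000)
purchase_price = 7_200
three_year = cost_to_run_server(rate, server_power_va=300, life_months=36)
five_year = cost_to_run_server(rate, server_power_va=300, life_months=60)

print(f"3-year hosting cost: ${three_year:,.0f} ({three_year / purchase_price:.0%} of purchase price)")
print(f"5-year hosting cost: ${five_year:,.0f} ({five_year / purchase_price:.0%} of purchase price)")
```

With these assumed inputs, three years of hosting comes to 75% of the purchase price, and five years costs more than the server itself.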

So what should you do now? Try the following:

1 – Determine Co-Location Costs:

  • Check your contracts. If you outsource your co-location facilities, you may be able to calculate the $/VA cost from your contracts.
  • Talk to your finance department. If you manage your own co-location facilities, see if you can find out the costs of power, maintenance, security, on-call personnel, etc. Remember, your Co-Location Cost is the total of the costs to run the data center, divided by the useable amount of power.

2 – Determine Power Used by Server:

  • Get yourself a good power meter and find out how much power your servers actually consume when idle and when under load. The numbers provided by the manufacturers tend not to reflect how you use the servers on a daily basis.

3 – Get out the spreadsheet and start crunching the numbers.

Try calculating the ROI on consolidating some of your existing servers onto virtual machines, or replacing some of your older machines with more energy-efficient models. You’ll probably be surprised by the results.
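A minimal sketch of that payback calculation, in Python, might look like the following. The function name and all of the input values are hypothetical, and the model counts only hosting (power-related) savings, ignoring things like licensing, labor and resale value.

```python
def payback_months(new_server_price, old_power_va, new_power_va, cost_per_va_month):
    """Months until the monthly hosting savings cover the price of the replacement."""
    monthly_savings = (old_power_va - new_power_va) * cost_per_va_month
    if monthly_savings <= 0:
        return None  # the replacement never pays for itself on power alone
    return new_server_price / monthly_savings


# Illustrative figures only: a 400 VA server replaced by a 150 VA model at $0.50/VA per month.
months = payback_months(new_server_price=3_000, old_power_va=400,
                        new_power_va=150, cost_per_va_month=0.50)
print(f"Payback period: {months:.0f} months")
```

In this made-up case the replacement saves $125 a month in hosting costs and pays for itself in 24 months, inside the three-year window mentioned earlier.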

Hopefully these numbers can provide you with a better understanding of the true cost of your IT infrastructure. Who knows, they may also help you reduce your overall power consumption, justify some new hardware, or even help you justify outsourcing to someone who already has.

May 06, 2008

Approach 1: Mailbox Archiving (Part 2 of 4)

Posted by Rick Dales, VP Product Management

In a previous post, I introduced the idea that there are multiple approaches to archiving.  In this post, I will dive more deeply into one of the two most common approaches, known as mailbox archiving, including how it works and what problems it is best suited to address.

Mailbox archiving is the process of periodically connecting to a user’s mailbox and looking for content that matches some criteria (an archiving policy) and adding it to the archive.  While a mailbox archiving process might run on a nightly basis, typically the archiving policies are set to only store messages that are older than a certain age (typically 30-90 days).
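As a rough illustration, a single archiving pass under an age-based policy might be sketched like this in Python. The mailbox representation, field names and the 30-day threshold are assumptions made for the example; real products work against the mail server's own interfaces.

```python
from datetime import date, timedelta


def scan_mailbox(mailbox, archive, today, min_age_days=30):
    """One archiving pass: copy messages older than the policy age into the archive."""
    cutoff = today - timedelta(days=min_age_days)
    for msg in mailbox:
        if msg["received"] <= cutoff and msg["id"] not in archive:
            archive[msg["id"]] = msg


# Hypothetical mailbox: the user deletes message "b" before it is old enough to archive.
mailbox = [
    {"id": "a", "received": date(2008, 1, 2)},
    {"id": "b", "received": date(2008, 1, 3)},
]
archive = {}

scan_mailbox(mailbox, archive, today=date(2008, 1, 20))  # nothing is 30 days old yet
mailbox = [m for m in mailbox if m["id"] != "b"]         # user deletes "b"
scan_mailbox(mailbox, archive, today=date(2008, 2, 5))   # "a" is now old enough

print(sorted(archive))
```

Only message "a" reaches the archive. Message "b" was gone before any pass considered it eligible, which is the kind of gap a periodic, policy-driven scan cannot close.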

Strengths

  • Visibility to all content and state information in the mailbox
    By connecting directly to the user’s mailbox, the archiving system can see (and choose to capture) any type of content, including items such as calendar events that wouldn’t be sent to another user. Similarly, it can capture which folder the user has put the item into.
  • Ability to modify messages in the mailbox
    With direct access to the user’s mailbox, the original message can be modified (flagged), deleted or replaced with a pointer to the copy in the archive.
  • Easy to provide end-user access
    As the archive knows which mailbox it found a message in, it can easily provide the appropriate security controls to provide users with access to the messages in their mailbox without granting access to other messages.

Weaknesses

  • Incomplete set of messages are captured
    Similar to backups, any periodic snapshot activity cannot record things that arrived and were subsequently deleted between capture cycles. Given that users read and then delete over 50% of messages on the day they receive them, periodic capture will miss the majority of mail, even if the archiving policy is set to capture messages immediately.
  • Incomplete picture of each message’s recipients
    When a user receives a message they have no visibility to the set of recipients that were BCC’d.  In addition, if the message was sent to a distribution list, the actual set of recipients isn’t stored with the message.  In the period between message receipt and capture, the membership of the distribution list can change materially (or the distribution list can be deleted from the mail system entirely).
  • Duplicate message removal is very difficult
    While digital signatures can be used to find and remove duplication of message bodies and attachments to optimize the storage within the archive, removing duplication of the messages themselves is difficult because the set of recipients may be different and the metadata about when a message was received will vary from mailbox to mailbox. When performing legal discovery across a set of users, duplicate copies of messages from different users’ mailboxes dramatically increase the costs of reviewing messages to be produced for opposing counsel.
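The distinction in that last point, between de-duplicating content and de-duplicating messages, can be shown with a small Python sketch. It uses a SHA-256 digest as the "signature", which is an assumption on my part, and the `SingleInstanceStore` class is a hypothetical toy, not any vendor's storage layer.

```python
import hashlib


class SingleInstanceStore:
    """Hypothetical content-addressed store: identical attachments are kept once."""

    def __init__(self):
        self.blobs = {}       # digest -> content bytes
        self.references = {}  # (mailbox, message_id) -> list of digests

    def add(self, mailbox, message_id, attachments):
        digests = []
        for content in attachments:
            digest = hashlib.sha256(content).hexdigest()
            self.blobs.setdefault(digest, content)  # store the bytes only the first time
            digests.append(digest)
        # The per-mailbox message record is still kept separately, because recipients
        # and received-time metadata can differ from mailbox to mailbox.
        self.references[(mailbox, message_id)] = digests


store = SingleInstanceStore()
report = b"quarterly report bytes"
store.add("alice", "m1", [report])
store.add("bob", "m1", [report])
print(len(store.blobs), len(store.references))
```

The attachment is stored once, but there are still two message records, one per mailbox. Those per-mailbox records are what a reviewer ends up wading through during discovery.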

Appropriate Uses of Mailbox Archiving

Best suited for: Mailbox Storage Management
Mailbox archiving is appropriate for active mailbox storage management. A significant advantage is that mailbox archiving systems can “stub” or “shortcut” messages so that users don’t need to change their behavior to access historical mail. It is important to note, however, that without an active process that removes content from users’ mailboxes, an archive only aids in storage management if combined with tight mailbox quotas, which require users to spend hours each month on manual cleanup tasks.

Not appropriate for: Legal Discovery or Regulatory Compliance
Since mailbox archiving does not ensure that all messages are archived, nor does it provide a complete view of all message traffic, it is not suitable for addressing legal discovery or regulatory compliance requirements.

Click here to read Part 1 of Different Approaches to Archiving Email

April 15, 2008

Note to Vendors: Please Help Us Be Green

Posted by Justin Wiebe, Fortiva Operations

I recently returned from a trip to Europe where I visited data centers in several countries. Almost everyone talked about the environment, carbon neutral computing and the possibility of governments starting to tax businesses based on their computing carbon footprint. This was a refreshing contrast to my experiences in North America where there is lots of press about ‘Green Computing’, but not too much action.

Here at Fortiva, we have been putting a lot of emphasis on minimizing our footprint. Not just because it is good for the environment, but also because the economics make sense. Over the past few years, we have managed to reduce our power consumption in a number of different ways:

  • Using AMD processors and lower-power SATA drives has allowed us to reduce our power usage per GB stored from 0.2 W to 0.05 W
  • Virtualization in our Development and QA environments has reduced the number of servers by approximately one-fifth. 
  • Investing in remote management solutions reduces the number of visits we make to our data centers.
  • And, as I opened with, we are looking into our vendors’ ‘greenness’.

Looking at this list of changes, it seems like we have made a lot of progress over the past few years. Unfortunately, it still feels like we have a long way to go. I have come up with a wish list of changes I would like to see our vendors make to help us achieve our goals:

  • Ship less junk with each server:  For every server we receive, we probably throw away a third of the total weight shipped. Packaging, cable management kits, mounting brackets for non-standard racks, and documentation that no one reads goes directly into the dumpster. Some of our more enlightened data centers encourage recycling, but no one seems to take the time to sort the mess. Add the extra fuel used to ship the servers as a result of this excess weight, and eliminating these extras would save us all money.
  • Act like a global company: One of our server suppliers recently announced that they will no longer ship a server purchased in Canada directly to a US address. What this means for us is that when we order a server for one of our US data centers, the manufacturer ships it from the US factory to our Canadian office, where we turn around and ship it back.
  • Offer us older hardware:  Newer isn’t always greener – or even necessary. In many cases, we can use older, lower power consuming CPUs to power our storage servers. We just can’t get them.

I am sure there are lots of things I have left off this list that may make more sense for your company. Think about them, and the next time your vendor’s rep asks if there is any way they can help you, you’ll have something to share with them.

March 13, 2008

Litigation Hold Loopholes – Preventing End-User Deletion

Posted by Rick Dales, VP Product Management

Last week, an interesting post appeared on StorageSoup, a SearchStorage.com blog that provides commentary on the storage industry. The post, titled FRCP looking like a PITW (Pain in the Wallet), identifies some of the potential loopholes a company can face trying to enforce a litigation hold. It also questions whether technology exists to address these loopholes without forcing an organization to literally keep every email indefinitely.

The quick answer to that question is yes (in fact that’s exactly what Fortiva’s on-demand email archive offers), but I thought it would be worthwhile to address some of the challenges mentioned in the blog entry in a bit more depth. Considering that the post was written by Tory Skyers, a Senior Systems Engineer who has hands-on experience dealing with multiple litigation holds and who regularly writes on storage issues, the confusion around how to best enforce a litigation hold is obviously hitting even the most seasoned IT professionals.

Here’s a quick rundown of Skyers’ main concerns, followed by my thoughts and recommendations:

  1. Some trials last a loooooooong time, and the costs of storing the data requested for litigation hold on WORM are very significant. Despite this, the potential risks and costs of not having the data available can be so high that businesses can’t afford not to store relevant data once a litigation hold comes into effect.

    1. As Skyers mentions, some cases can last five years or more and the cost of storing this data starts adding up quickly. The whole process can also be time-consuming for IT, and there are no guarantees that data won’t be corrupted. So not only is this approach expensive, it’s risky too. Having said that, the risks of not storing the data can be even higher. The key is to find a more cost-effective, reliable way to store the data (i.e., an email archive).
  2. There’s a “Safe Harbor” clause in the FRCP that absolves companies of responsibility if the company has — and strictly follows — a deletion and retention policy. This protects the company from falling afoul of the regulation, but does my act (as an end user) of deleting an email fall under the “Safe Harbor” clause?

    1. The quick answer is no. The “Safe Harbor” clause protects organizations from being penalized for deleting relevant information before a litigation hold comes into effect, assuming the data was deleted according to a stated deletion and retention policy. If an end user is allowed to delete an email (accidentally or intentionally) that is covered by a litigation hold, or that has not yet reached the corporate retention period, it can be considered spoliation of data.

      Spoliation is the withholding, hiding, or destruction of evidence relevant to a legal proceeding and is a criminal act in the United States. It can result in fines and/or incarceration for the parties who engaged in the spoliation. It can also lead to a negative inference ruling that can ultimately lead to a guilty verdict.

      To avoid this, companies should have technology in place to ensure that email data cannot be deleted by an end-user until both of the following criteria are met: a) it has reached its retention period and b) it is not covered by a litigation hold.
  3. I’ve seen some precedent that leads me to believe that simply having and following a policy is not enough… So as it relates to e-discovery, if a company allows [me] to delete my own emails, are [they] implicitly approving of me disobeying retention and deletion policy?

    1. In a way, yes. The key to meeting the FRCP guidelines is having and enforcing a policy. If you believe your end-users can be relied on to accurately enforce your policy (and not make any errors), then it is sufficient to simply have a policy and rely on your employees. Otherwise, you better have some technology in place that enforces your policy (including litigation holds) and prevents human error.

      In fact, a case in point is the recent Intel vs AMD lawsuit. Intel executives were informed of the litigation hold retention requirement, but many of them deleted email anyway. Regardless of whether the email deletion was intentional (or whether it was simply human error), the company was guilty of spoliation.
  4. It seems like I would have to have CDP in place and store every email entering and leaving every mailbox forever to be really covered against every contingency.

    1. Fortunately, it’s not that bad. Once an email reaches the end of the lifecycle outlined in the corporate retention policy, it can (and should) be deleted (assuming it’s not covered by a litigation hold). There is absolutely no need to keep everything forever (in fact that would raise a company’s risk profile significantly).

      The question is, how should you store your email? Skyers accurately points out that relying on a backup process may be insufficient, since any data that is sent or received and then deleted between backup periods may not be retained. Beyond that, it is virtually impossible to apply a consistent retention policy against data on backups, since a single tape necessarily contains emails crossing a wide span of time. Backup tapes also have a high rate of corruption/failure, making them unreliable.

      To keep all the data that enters your corporate email system for as long as necessary (and no longer), you really need an email archive like Fortiva, which captures every email that is sent or received, and keeps multiple copies in unalterable format on spinning disk until they meet the retention policy.
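The deletion rule described in the answers above (an email may be deleted only once it has reached its retention period and is not covered by a litigation hold) is simple enough to express in a few lines of Python. This is a minimal sketch under my own assumptions: the function name and parameters are hypothetical, and retention is measured in days from the date the message was archived.

```python
from datetime import date, timedelta


def can_delete(archived_on, retention_days, on_litigation_hold, today):
    """Allow deletion only when retention has expired AND no litigation hold applies."""
    retention_expired = today >= archived_on + timedelta(days=retention_days)
    return retention_expired and not on_litigation_hold


today = date(2008, 3, 13)
three_years = 3 * 365

# Retention expired, no hold: may (and should) be deleted.
expired_no_hold = can_delete(date(2004, 1, 1), three_years, False, today)
# Retention expired, but a hold applies: must be kept.
expired_on_hold = can_delete(date(2004, 1, 1), three_years, True, today)
# Still inside the retention period: must be kept, hold or no hold.
within_retention = can_delete(date(2007, 6, 1), three_years, False, today)

print(expired_no_hold, expired_on_hold, within_retention)
```

The point of putting the check in software rather than in a policy memo is that neither an end user nor IT can skip it.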

So all this leads to one conclusion: an email archive is really the most foolproof way to avoid the many possible loopholes when dealing with the FRCP requirements for email retention, litigation holds and e-discovery. At the risk of being self-promotional, here’s a run-down of how Fortiva meets all the requirements and addresses the concerns raised by Skyers:

  • Cost-effective storage: Fortiva’s SmartStore archive stores a redundant copy of every email sent and received according to the customer’s retention policy in a centralized location. It requires virtually no effort on the part of IT, and it starts at just $1.10 per user, per month for a 1,000-user company. It also offers storage management features that allow a company to significantly reduce the burden on the Exchange email server.
  • Litigation hold: Fortiva allows legal or IT to enforce a litigation hold against relevant email indefinitely with a click of a button in a web-browser interface.
  • Policy enforcement: Fortiva allows you to develop granular policies (including different retention policies for different departments, individuals, and types of data), and automatically enforces those policies.
  • Redundant storage: Fortiva stores multiple copies of every email in unalterable format on spinning disk, and keeps an additional copy in a secondary location. The system also provides continuous data validation across all archived data.

It’s important to note that not all email archives offer the same functionality. There is a whole class of email archives that were designed primarily to address email storage management issues, and those typically allow end-user deletion/deletion outside the retention policy (introducing many of the problems highlighted above). But that is a topic in itself. In my next post, I’ll explain the different types of email archive, and the situations that each type is best suited for.

March 06, 2008

Is Tape Going the Way of the Dodo?

Posted by Jeremy Hope, VP Operations

I recently got an email from a vendor that I felt I had to comment on, and since it refers to something I have recently been blogging on – storage and backups – I thought I’d dump my thoughts into the blog.  The email I’m referring to was from a vendor inviting me to read a White Paper titled The Risk of a Disk-Only Backup Strategy: the Case for Disk and Tape, extolling the benefits of Tape technology for backups rather than relying only on disk to disk backup solutions.

The synopsis of the report is that disk drives have a higher failure rate (that is, a lower MTBF, or Mean Time Between Failures) in their later years (jeez, technology gets less healthy as it gets older, go figure) and that if you drop disk drives they may break (huh? what a breakthrough!). This is their total justification for why you need tape in your environment rather than relying on disk-to-disk backup alone.

OK, so I concede these two points might be true (I’m not going to try drop-kicking any of my disk drives to prove them wrong) but let’s look at the big picture here. The White Paper fails to mention the MTBF of tape drives and of the physical tapes themselves (how many times have you tried to retrieve data from a tape only to find out it is corrupt?), or the fact that if you drop them they break too (both tape drive and tape). Never mind the headaches you have to go through when you try to restore that 4-year-old tape that was created with a drive you no longer have (it was dropped a while back) and the latest drive won’t read it.

The White Paper also fails to mention readily available technologies and solutions (RAID 6, distributed/cluster file systems, grid computing, multiple redundant copies, etc.) that can be used to improve disk to disk backups.  When these technologies are utilized (if you don’t plan to keep up with technology – get out of the IT business) these simple issues can easily be overcome. In fact there are numerous ways that a disk to disk backup solution can be advantageous and even better than tape for data intensive uses such as archives.

At Fortiva we use current RAID technology, accompanied with a grid computing storage infrastructure to provide multiple redundant copies of data across both Primary and Secondary data centers.  At least 3 copies of data exist at any one time and replication is used to keep the copies current.  This includes copies of disk to disk backups for the various systems.

If data needs to be moved, it is moved via gigabit network or portable disk drives (that now provide over a Terabyte in capacity) and the new instances of data (and its redundant copies) are verified before the original is deleted. This accounts for any possible service or data outages within a primary data center caused by any one set of data as well as providing for Disaster Recovery. Having the backups running on spinning disk also allows for online verification of the data (when is the last time you loaded all of your tapes from tape storage to verify their integrity?). This is done at a very affordable price without spending a cent on tape infrastructure and all of its complexities.
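The "verify before you delete" habit is easy to sketch in Python. This is only an illustration under stated assumptions: the helper names are hypothetical, and comparing SHA-256 checksums is my stand-in for whatever verification a production system actually performs.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def move_with_verification(source, destinations):
    """Copy source to every destination, verify each copy, then delete the original."""
    expected = sha256_of(source)
    for dest in destinations:
        shutil.copyfile(source, dest)
        if sha256_of(dest) != expected:
            raise IOError(f"verification failed for {dest}; original kept")
    Path(source).unlink()  # only reached when every copy matched


with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "archive.dat"
    original.write_bytes(b"example archive data")
    copies = [Path(tmp) / "copy1.dat", Path(tmp) / "copy2.dat"]
    move_with_verification(original, copies)
    original_exists = original.exists()
    copies_match = all(c.read_bytes() == b"example archive data" for c in copies)

print(original_exists, copies_match)
```

If any copy fails its check, the function raises before the original is touched, so a bad transfer never costs you the only good copy.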

In our environment tape isn’t just becoming extinct like the Dodo, it’s already gone and buried.

February 26, 2008

How We Keep Email Archiving Costs Low

Posted by Jeremy Hope, VP Operations

As Rick's blog entry from January 28 noted, Fortiva recently introduced an entry-level archiving solution (SmartStore) that is extremely price-competitive. To help people better understand how this is possible, I thought I’d explain the unique storage challenges that email archiving presents, and how we at Fortiva deal with those challenges in a way that allows us to keep costs low.

The majority of companies implement high performance, highly redundant, high priced storage for their transaction-based applications and slower performing, less redundant, lower cost storage for larger amounts of data within file based applications. The challenge with archived data is that it requires storage with both characteristics, crossing the boundaries between the storage tiers typically implemented within most IT environments.

Archive data necessitates storage with high throughput, not only to be able to write the large amounts of data within a reasonable time, but also to allow for the searching of the data.  High redundancy within the archive data storage environment is expected since in most cases only one copy of the data will exist (making tape copies of hundreds of TBs of data is impractical).   Meanwhile the same characteristic, the sheer quantity of data, begs for less expensive storage to stay economical.

This leaves many IT Managers puzzled about how to provide an archive solution at a practical price with reasonable performance. One solution is the use of a Software-as-a-Service (SaaS) solution like Fortiva, where you let the provider worry about the storage environment. Still, many may wonder how providers such as Fortiva can provide lower cost per TB solutions (such as our recently announced SmartStore solution) without losing money due to the storage costs alone.

For Fortiva, the solution lies in a grid computing infrastructure that uses a large number of 1U or 2U servers with locally attached RAID disk arrays.  This hardware provides a fast, highly redundant and scalable storage infrastructure.  Combined with the Fortiva “secret sauce” (a proprietary Distributed File System at the application layer that tracks where data sits within the grid of distributed servers), it allows Fortiva to keep multiple redundant copies of data at an extremely low cost.  Another advantage is the pooled computing power: every CPU in the grid is also available for search and other application functionality.
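To make the idea of an application-layer index that tracks redundant copies a little more concrete, here is a minimal sketch in Python. It is not Fortiva's Distributed File System, whose design is proprietary and not described here. The class name `PlacementIndex`, the node names, the choice of three copies and the use of rendezvous hashing to pick servers are all assumptions made purely for illustration.

```python
import hashlib


class PlacementIndex:
    """Toy application-layer index: object id -> servers holding a copy."""

    def __init__(self, nodes, copies=3):
        if copies > len(nodes):
            raise ValueError("need at least as many nodes as copies")
        self.nodes = list(nodes)
        self.copies = copies
        self.locations = {}  # object id -> list of node names

    def store(self, object_id):
        # Rendezvous hashing (an assumed technique): rank every node by
        # hash(node, object) and keep the first N, so each copy lands on a
        # different server.
        ranked = sorted(
            self.nodes,
            key=lambda n: hashlib.sha256(f"{n}:{object_id}".encode()).hexdigest(),
        )
        self.locations[object_id] = ranked[: self.copies]
        return self.locations[object_id]

    def locate(self, object_id, failed=()):
        # Return the servers that still hold a copy when some are down.
        return [n for n in self.locations.get(object_id, []) if n not in failed]


grid = [f"node{i:02d}" for i in range(1, 9)]  # eight hypothetical servers
index = PlacementIndex(grid, copies=3)
placed = index.store("msg-0001")
survivors = index.locate("msg-0001", failed={placed[0]})
```

With three copies on three different servers, losing any one server still leaves two readable copies, which is the property that makes inexpensive commodity hardware acceptable for archive data.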

The fact that Fortiva uses a grid environment for all clients distributed throughout a data center provides the economies of scale that no large enterprises can afford to implement themselves – a fact that is reflected in the low pricing Fortiva offers.

December 10, 2007

The Truly Centralized Email Repository: Is it even possible?

Posted by Stephen Prokai, Fortiva Professional Services

Regardless of what the most pressing drivers are behind an email archive project, having a centralized repository for all corporate email is perhaps the most appealing end goal for almost everyone we speak with.  But is it even possible to corral all of that data?

Many archiving solutions start with a “from today” implementation, which means you have to take additional steps to get yesterday’s data into the repository.  To make this task easier there is usually some way to import that data, but old email can have lots of homes.  The obvious ones are Exchange mailboxes, backup tapes, shared network drives, laptops, desktops and even a legacy email archive.

You need to check all of the possible places that email can hide on your network.  Buried PSTs are the trickiest to locate, especially on laptops that are not always connected to your network.  Even the best PST finder utility can’t locate something that it can’t see.  Existing PSTs aren’t the only trouble though.
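As a rough illustration of what a PST sweep involves, the sketch below walks a list of folders and collects every file ending in .pst. It is not any particular vendor's finder utility. The function name `find_pst_files` and the folder layout are made up for the example, and in practice the roots would be mounted shares or user profile folders.

```python
import os
import tempfile
from pathlib import Path


def find_pst_files(roots):
    """Return every .pst file found under the given root folders."""
    found = []
    for root in roots:
        # os.walk skips folders it cannot read instead of raising an error,
        # and a root that does not exist simply yields nothing.
        for folder, _subfolders, files in os.walk(root):
            for name in files:
                if name.lower().endswith(".pst"):
                    found.append(Path(folder) / name)
    return sorted(found)


# Small demonstration against a throwaway folder tree.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "users" / "alice").mkdir(parents=True)
    (base / "users" / "alice" / "Archive.PST").write_bytes(b"")
    (base / "users" / "alice" / "notes.txt").write_bytes(b"")
    (base / "shared").mkdir()
    (base / "shared" / "old-mail.pst").write_bytes(b"")

    # "offline-laptop" stands in for a machine that is not on the network.
    hits = [p.name for p in find_pst_files([base, base / "offline-laptop"])]
```

Note that the unreachable "offline-laptop" folder contributes nothing and raises no error, which is exactly the blind spot described above: the scan only reports what it can see.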

Unless you deactivate the users’ ability to create PSTs, you can bet that there will be more within minutes of the initial deletion.  This quickly becomes a user training/re-education discussion though.  If you are going to remove the ability for users to create PSTs, you had better have an archive that can be deployed quickly and, more importantly, one that is fast and easy to use.

Dealing with legacy email from an archiving system that’s on its way out can also be a major part of the project.  You’ll need a way to first get the old data out, convert it into a format that your new archive can handle, and then import it.  One key thing to note here is that you may not need to include all of that old data in the archive.

Corporate email retention policies usually define at what point an email can be disposed of.  It is likely that an effort to gather all disparately stored email messages will end up with a load of data that has already satisfied the defined retention policy.  Steps should be taken to make sure you do not import anything you don’t need.  Importing it anyway can lead to higher risk exposure as well as higher costs, both for the initial import project and on an ongoing basis.
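The filtering step can be sketched in a few lines of Python. This is only an illustration: the seven-year retention period, the message ids and the function name `split_by_retention` are assumptions, and a real import would read the sent date from each message and apply whatever your own policy specifies.

```python
from datetime import date, timedelta

RETENTION_DAYS = 7 * 365  # assumed policy: keep email for roughly seven years


def split_by_retention(messages, today, retention_days=RETENTION_DAYS):
    """Split (message_id, sent_date) pairs into those to import and those to skip."""
    cutoff = today - timedelta(days=retention_days)
    to_import, to_skip = [], []
    for message_id, sent in messages:
        # Messages sent before the cutoff have already satisfied the policy.
        (to_skip if sent < cutoff else to_import).append(message_id)
    return to_import, to_skip


# Hypothetical legacy messages gathered from old mailboxes and PSTs.
legacy = [
    ("msg-a", date(1999, 3, 14)),
    ("msg-b", date(2004, 11, 2)),
    ("msg-c", date(2007, 6, 30)),
]
keep, skip = split_by_retention(legacy, today=date(2007, 12, 10))
```

Here only the 1999 message falls outside the assumed retention window, so it is left out of the import while the other two are kept.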

Achieving a truly centralized repository for your email is possible, but it requires a concerted effort and some serious planning.  Gathering everything you know is on your network, and even the email you don’t know is out there, is critical.  The company retention policy needs to be clearly defined and enforced, and the users need to know what it is.  The risk factors and storage cost considerations cannot be ignored.

©2012 Proofpoint, Inc.