Archiving vs. Journaling: Saving the Right Stuff

  • Published on Jan 8, 2015
Written by: Innovative Driven

Email threatens to overwhelm us all – over 100 billion business emails were sent every single day in 2013[1], and that number is expected to grow.  Friends, colleagues, and now even inanimate objects all communicate by email.  With this deluge in mind, how can an organization save its email in a comprehensive, defensible, and useful way?  More to the point, how can it do that while saving only what it needs rather than everything?

There are three ways to retain email:

  • Archiving, which retains email by moving older messages to a less costly storage site and, in the case of Exchange, simultaneously removing them from the server;
  • Journaling, which makes an exact and complete copy of each email as it is sent from or received by an account and stores that copy in a separate mailbox, invisible to the business user; and
  • Backup, which makes an exact and complete copy of an entire system at regularly scheduled intervals.

Each has unique characteristics that have positive and negative implications for storage and compliance, and each is used for a different purpose.

Archiving

Archiving is used to store emails that are less frequently needed on less expensive storage than those that are current or mission-critical.  This feature comes standard on many email applications and is also available as a third-party plug-in.  Archiving takes email off the Exchange server and stores it somewhere else.  Many organizations choose to use a tiered system of storage – keeping important and recent email on “first tier” fast hard drives, which are often costly to purchase and maintain, while storing less important and less frequently used material on less expensive “second tier” storage like slower hard drives, optical drives, or tapes.  The archive system is generally indexed and cached and can later be restored to allow for search and recall of email as needed.  Archives are frequently stored by date – for example, you are less likely to need an email from six years ago than one from six months ago, so moving older data to less expensive storage can be automated.  Archiving can be a manual or automated process, depending on the needs of your business.
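The date-based approach described above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the Message class, the split_by_age function, and the 180-day cutoff are hypothetical and do not correspond to any particular archiving product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Message:
    # Hypothetical, minimal stand-in for a stored email.
    message_id: str
    received: datetime
    subject: str


def split_by_age(messages, now, max_age_days=180):
    """Return (live, archive): messages newer vs. older than the cutoff."""
    cutoff = now - timedelta(days=max_age_days)
    live = [m for m in messages if m.received >= cutoff]
    archive = [m for m in messages if m.received < cutoff]
    return live, archive


now = datetime(2015, 1, 8)
mailbox = [
    Message("a1", datetime(2009, 1, 8), "Six years old"),
    Message("b2", datetime(2014, 9, 1), "Recent"),
]
live, archive = split_by_age(mailbox, now)
print([m.message_id for m in live])     # ['b2']
print([m.message_id for m in archive])  # ['a1']
```

A real archiving tool would also index what it moves so that it can be searched later; the sketch only shows the age-based decision.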

Importantly, archived files can be changed, deleted, or otherwise manipulated by an end user who imports them and then alters them.  From an end user’s point of view, archiving isn’t particularly different from live email.  Scheduled archiving creates another legal wrinkle – email can be sent, read, responded to, and deleted in between the scheduled archive times.  Archiving is good for the ongoing operations of a business and for saving money, but for litigation, you need a stronger lock on your potentially relevant emails.
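The gap between scheduled archive runs can be shown with a toy simulation.  The sketch below assumes a simplified model in which each run captures whatever is in the mailbox at that moment; the names run_archive_cycle, mailbox, and archive are invented for illustration.

```python
def run_archive_cycle(mailbox, archive):
    """Capture whatever is in the mailbox at this moment into the archive."""
    for message_id, body in mailbox.items():
        archive.setdefault(message_id, body)


mailbox = {}   # live mailbox: message_id -> body
archive = {}   # archive store: message_id -> body

mailbox["m1"] = "Quarterly numbers attached"
run_archive_cycle(mailbox, archive)      # first scheduled run sees m1

mailbox["m2"] = "Please delete after reading"
del mailbox["m2"]                        # deleted before the next run

run_archive_cycle(mailbox, archive)      # second run never sees m2
print(sorted(archive))                   # ['m1']
```

Message m2 existed, was read, and was deleted entirely between two runs, so the archive has no record that it was ever sent.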

Pro:
  • Saves money on storage
  • Better speeds on server
  • Turned on by default in Exchange
Con:
  • Not everything is saved if it is not caught by the archiving process, including anything transmitted and then deleted between archival cycles.
  • Auto-delete may remove some relevant email from the archive depending on settings.
  • Stores all messages without regard to content, so storage costs can still skyrocket over time.

Journaling

Journaling is a one-way email lockbox.  Journaling is designed to help with compliance, regulatory, and litigation-related document retention.  While archiving stores email for later recall by the user, journaling creates a separate retention mailbox for later recall by legal professionals.  It acts like a silent guardian of email that meets certain rules established by the administrator of the journaling mailbox.  Any email that meets the criteria is saved when it is sent or received, including content and metadata.  Journaling is included in major email platforms such as Microsoft Exchange and IBM’s Lotus Domino, and is available through Google’s Postini service.
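The rule-based, one-way behavior described above might look something like the following sketch.  It is a conceptual model only, not how Exchange, Domino, or Postini implement journaling; the Email and Journal classes, the between rule, and the example.com addresses are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Email:
    # Hypothetical, simplified message; real journal entries carry full metadata.
    sender: str
    recipients: tuple
    subject: str
    body: str


class Journal:
    """Append-only store: copies go in, nothing is edited or removed."""

    def __init__(self, rule):
        self._rule = rule
        self._entries = []

    def on_transport(self, email):
        """Call for every message as it is sent or received."""
        if self._rule(email):
            self._entries.append(email)

    def entries(self):
        return tuple(self._entries)  # read-only view for reviewers


def between(a, b):
    """Hypothetical rule: match mail exchanged between addresses a and b."""
    def rule(email):
        return ((email.sender == a and b in email.recipients)
                or (email.sender == b and a in email.recipients))
    return rule


journal = Journal(between("alice@example.com", "bob@example.com"))
journal.on_transport(Email("bob@example.com", ("alice@example.com",), "Hi", "..."))
journal.on_transport(Email("bob@example.com", ("hr@example.com",), "Leave", "..."))
print(len(journal.entries()))  # 1
```

Only the message that matches the administrator's rule is copied, and the class exposes no method for changing or deleting an entry once it is stored.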

Journaling is effective only from the moment it’s turned on.  This means that any emails sent prior to its activation are not retained in the journaling mailbox.  From that point forward, however, it captures every message that meets its rules, which makes it an excellent choice for compliance with regulations like Sarbanes-Oxley and HIPAA.  Journaling is designed for retention; end users can’t access their journaled email, and full copies are retained.  Organizations should only use journaling on those custodians and emails that are subject to a legal hold.  This is good for ongoing litigation, compliance issues, and internal investigations once an issue is known or litigation is contemplated.

Pro:
  • One way transmission of all data and metadata makes collection and processing easy and defensible.
  • Rules for collection mean that only email identified as potentially relevant to a compliance or litigation issue will be saved.
Con:
  • Only effective once it’s turned on.  Earlier messages will not be retained in this folder.

Backup

Backup is designed for emergencies.  Backup takes a static image of data at a point in time and makes a full copy of the data as it is.  This does not happen constantly, as with journaling.  There are usually many backup copies, which together may hold multiple copies of the same email, in contrast to journaling, which contains just one copy of each email as it was sent.
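To make the duplication concrete, the short sketch below counts how many times each message appears across a set of backups.  The backup dates and message IDs are made up for illustration, and each backup is modeled simply as the set of message IDs present on the system at that time.

```python
from collections import Counter

# Hypothetical nightly backups: each is the set of message IDs on the system then.
backups = {
    "2015-01-05": {"m1", "m2"},
    "2015-01-06": {"m1", "m2", "m3"},
    "2015-01-07": {"m1", "m3"},        # m2 was deleted before this backup
}

copies = Counter(mid for snapshot in backups.values() for mid in snapshot)
print(sorted(copies.items()))  # [('m1', 3), ('m2', 2), ('m3', 2)]
```

A journal covering the same period would hold each of these three messages exactly once.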

A regularly scheduled backup can be there to return your organization to the position it was in before an emergency.  This is frequently used for disaster recovery, whether it’s a fire, an earthquake, or a virus on the company’s network.  Backup can be difficult and costly to restore and usually isn’t kept on a live system, but it does put the system back to the state it was in at the exact moment in time that the backup was taken.  It is generally not seen as a good option for eDiscovery collection for a number of reasons, including but not limited to cost, the difficulty of loading old backups, and the difficulty of identifying the correct backup before it’s restored.  When all else has failed in your business, however, it is a way to get back up and running.  In instances where deletion of email is suspected, a case can be made for restoring from the backup taken right before the suspected deletion.  This is a slow and potentially expensive process, but a valid one, as the backup may contain the deleted emails.
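Choosing the backup taken right before a suspected deletion is a simple lookup once the backup times are known.  The sketch below is one way to do it, assuming the backup timestamps are available as a list; the function name last_backup_before and the sample dates are hypothetical.

```python
from bisect import bisect_left
from datetime import datetime


def last_backup_before(backup_times, suspected_deletion):
    """Return the latest backup taken strictly before the given time, or None."""
    ordered = sorted(backup_times)
    index = bisect_left(ordered, suspected_deletion)
    return ordered[index - 1] if index > 0 else None


backup_times = [
    datetime(2015, 1, 5, 2, 0),
    datetime(2015, 1, 6, 2, 0),
    datetime(2015, 1, 7, 2, 0),
]
target = last_backup_before(backup_times, datetime(2015, 1, 6, 15, 30))
print(target)  # 2015-01-06 02:00:00
```

If no backup predates the suspected deletion, the function returns None, which is a signal that the deleted email may not be recoverable from backups at all.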

Pro:
  • When disaster strikes, a backup can get your business up and running as quickly as your backups can load.
Con:
  • Backups are not well organized for search and retrieval.
  • Backups are snapshots taken at a particular moment, meaning anything not on the system at that moment, including already deleted messages, is lost.

The Case of Paul and Darian

Imagine that one of your company’s employees, Paul, has alleged a pattern of sexual harassment by his boss Darian, who has allegedly been sending him alternately flirtatious and threatening emails, starting about a year ago and continuing to the present day.  Darian denies this.  The HR department, working with IT and Legal, turned on journaling when they first became aware of the issue; IT set a rule that any email between Paul and Darian should be saved.

Archiving: Even though the events started about a year ago, it would be possible for Darian to delete all of the emails that he sent from his own account.  With Paul’s password or administrative rights, he could also get at whatever was on Paul’s account, eliminating key evidence.  Paul could also have been deleting email between archive cycles – sending an email and then recalling it shortly thereafter.

Journaling: Even if Darian has deleted emails in active folders or archives, copies remain of all emails created after journaling was turned on.   These copies, and all their related envelope data, are retained in a separate folder, invisible to both Paul and Darian.

Because journaling is only effective from the moment it’s turned on, any of the emails sent between Paul and Darian in the year prior to HR’s action were not retained in the journaling mailbox.  Emails sent after activation can be easily collected.

Backup: If it is believed that Darian deleted archived and live copies of emails before journaling was activated, restoring backups may be necessary.  This could be valuable in recreating email from before the journaling process was started, as the backups would likely be stored in a location that Darian could not access.  This process would be costly and time consuming, which could raise proportionality concerns, but the data would still be there in case of emergency.  Backups would also be necessary if Darian had taken extraordinary steps to destroy the journaling system, such as releasing a virus or setting fire to the company servers.

Which one should you use?

That depends on your goals, but a combination of all three is sometimes the best course of action.

Archiving is helpful to the budgetary bottom line and keeps data uncluttered for day-to-day operations.  It doesn’t help in litigation and it certainly didn’t help Paul.  If there isn’t a business purpose for keeping an email at the ready, there’s also likely no real reason to keep it available in ready storage.  Storing it somewhere that it can be inexpensively housed but easily retrieved makes good business sense.

On the other hand, for the emails that you are trying to retain copies of for possible litigation or regulatory compliance reasons, journaling is a sensible solution.  It sends email on a one-way trip to long-term unalterable storage.  Users can’t access their own journaled mailboxes, and that means that they can’t change or delete anything, either.

Backing up your email should be standard procedure for any organization regardless of archive or journaling strategy.  If your organization needs to produce documents that were not saved through archiving or journaling, however, you may need to deal with backups.  They may be costly to restore and cumbersome to comb through, but they can be a last line of defense against the destruction of evidence.  As a reminder, World Backup Day is March 31st.

[1] The Radicati Group, Email Statistics Report, 2013-2017, available at http://www.radicati.com/wp/wp-content/uploads/2013/04/Email-Statistics-Report-2013-2017-Executive-Summary.pdf