Is it time to dump RAID for your Exchange 2013 deployment?

Photo credit: Jay Cuthrell, Flickr, http://www.flickr.com/photos/jcuthrell/6341927952

Now that Exchange Server 2013 SP1 is out and CU5 is just around the corner, it might be time to consider a migration to Microsoft's latest and greatest version of their messaging software. And when it's time to upgrade, it's often also time to look for ways to make sure you are getting the best value for money out of your infrastructure.

In this post we'll take a brief look at the history of Exchange and its relationship with RAID technologies, and see why Microsoft are designing Exchange in a way that makes RAID less and less relevant.

I'll follow up in another post with a few thoughts on what Exchange vNext should consider in a world dominated by scale-out storage and advanced virtualization features.

A brief overview of RAID and Exchange

Less than 10 years ago Exchange 2003 represented the pinnacle of messaging systems, and even as it disappears from extended support it remains useful to many organizations. Exchange 2003, however, relied heavily on high-performance disk drives to ensure access to email was responsive and messages were delivered on a timely basis.

Exchange 2003 requires fast disks for two core reasons: first, its 32-bit architecture limits the amount of data it can cache in RAM; second, the way it stores messages on disk means that to retrieve individual messages for users it cannot just search a local cache – it needs to access the underlying database tables themselves, which are often scattered across many physical disks.

As a result, Exchange 2003 required storage that could deliver high numbers of transactions per second and respond quickly to small, random read and write requests. Usually the simple but expensive answer was lots of disk drives (to increase spindle count) configured as a RAID array.

Further cementing the need for RAID, Exchange 2003 also lacked native features to replicate user mailbox content between servers; although multiple servers could be used for high availability, only a single copy of the data, often within a cluster, was available. To reduce the chances of data loss, traditional database techniques such as splitting database and log files onto separate disks were used – and for single-server implementations this technique remains valid today.
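
As a rough sketch of how that database/log split looks in practice (the server name and paths here are hypothetical, and the cmdlets are from the modern Exchange Management Shell rather than Exchange 2003's tooling), the database file and its transaction logs can simply be placed on separate volumes:

    # Create a mailbox database with the EDB file and transaction logs
    # on separate volumes, so losing one volume doesn't lose both.
    New-MailboxDatabase -Name "DB01" -Server "EX01" `
        -EdbFilePath "D:\Databases\DB01\DB01.edb" `
        -LogFolderPath "E:\Logs\DB01"

    # Mount the new database so it can start serving mailboxes.
    Mount-Database -Identity "DB01"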

How Exchange 2007 changed the game

Many might view Exchange 2007 as the Windows Vista of Exchange releases; however, the technologies it introduced provide the functionality that modern Exchange builds on. Two fundamental changes reduced the reliance on fast disks and made an Exchange server in a cluster almost disposable.

The move to 64-bit technology meant that Exchange 2007 servers could use far more RAM than the 32-bit limitations allowed to cache mailbox content. When users accessed the same information repeatedly, Exchange could serve it from the server's memory cache without going back to disk. It also enabled more information to be cached in memory before updating the database, reducing the number of disk operations.

From a resilience perspective, Exchange 2007 introduced continuous replication, the first iteration of what is now the foundation of Database Availability Groups. This technology, typically implemented through the Cluster Continuous Replication (CCR) feature, meant that two Exchange servers could join a cluster together, with one acting as the active node – accessed by users and hosting all Mailbox and Public Folder databases – and the second acting as a passive node. To ensure that physical database corruption couldn't easily be replicated, the transaction log files were shipped to the passive node and replayed into its copies of the individual databases.

This meant that, for the first time, RAID was theoretically no longer a necessary technology to protect Exchange data.

Exchange 2010 expanded on the replication model and solved some of its shortcomings. More than two database copies could be configured, and databases moved at the logical level from the server to the organization, meaning that within a single Database Availability Group (DAG) – the effective successor to the CCR cluster – each server could host a mix of active and passive database copies. Further performance improvements meant that a single disk could often meet the performance needs of mailbox users.
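
To illustrate how simple this is from the administrator's side (the server and database names are hypothetical), adding extra copies of a database to other DAG members takes one cmdlet per copy, with an activation preference ranking which copy should become active first:

    # Add a second and third copy of DB01 on other DAG members.
    # ActivationPreference ranks which copy is activated first on failover.
    Add-MailboxDatabaseCopy -Identity "DB01" -MailboxServer "EX02" -ActivationPreference 2
    Add-MailboxDatabaseCopy -Identity "DB01" -MailboxServer "EX03" -ActivationPreference 3

    # Check the health and replication queues of every copy of DB01.
    Get-MailboxDatabaseCopyStatus -Identity "DB01" |
        Format-Table Name, Status, CopyQueueLength, ReplayQueueLength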

These innovations solved the key issues that made some form of RAID necessary – multiple spindles per database were no longer needed, and if a single database copy failed (for example because the disk underneath died) the database would simply activate on another server in the DAG.

For the first time in Exchange's history a new model was introduced: JBOD. Rather than use expensive RAID technology, just use a bunch of disks and swap one out if it fails, then re-seed from the new active copy. Implemented to Microsoft's guidelines, this combination is known as Exchange Native Data Protection.
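
The disk-swap workflow is refreshingly short. As a sketch (the database and server names are hypothetical): once the failed disk has been physically replaced, suspend the dead copy and re-seed it over the network from the active copy elsewhere in the DAG:

    # Suspend the failed copy on EX02 after swapping the JBOD disk...
    Suspend-MailboxDatabaseCopy -Identity "DB01\EX02" -Confirm:$false

    # ...then re-seed it from the active copy, discarding any
    # leftover files on the replacement disk.
    Update-MailboxDatabaseCopy -Identity "DB01\EX02" -DeleteExistingFiles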

Cloud innovation in your datacentre

In 2007, Microsoft released Exchange Labs. This became Live@EDU, then it grew up (a lot!) and became the Exchange Online part of Office 365. Around the same time, the Business Productivity Online Suite (BPOS) arrived on the market. BPOS, built on Exchange 2007, and Exchange Labs, built on beta Exchange 14 code, gave Microsoft real-world knowledge of running massive services at scale. The challenges Microsoft will have faced included both improving reliability and reducing cost.

The improvements that have steadily appeared over the last few years, first in Exchange 2010 and then in Exchange 2013, show clearly that the experience of running these services has paid off for Microsoft – and will pay off for customers able to embrace the same vision. The latest incarnation of Exchange allows administrators to use larger disks, store multiple databases on each disk to allow fast database re-seeds, and use AutoReseed to take spare disks and start that re-seed process automatically. If this sounds a lot like the features that expensive storage arrays offer, you'd be right.
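
For example (the paths and the copies-per-volume value here are illustrative, not a recommendation), AutoReseed in Exchange 2013 is driven by three properties on the DAG itself, telling Exchange where the database and spare-volume mount points live and how many databases share each physical disk:

    # Configure AutoReseed mount-point roots and database layout on a DAG.
    # If a disk dies, Exchange can map a spare volume in and re-seed
    # the affected copies automatically.
    Set-DatabaseAvailabilityGroup -Identity "DAG1" `
        -AutoDagDatabasesRootFolderPath "C:\ExchangeDatabases" `
        -AutoDagVolumesRootFolderPath "C:\ExchangeVolumes" `
        -AutoDagDatabaseCopiesPerVolume 4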

What the storage vendors don't want you to hear

This leads us to the argument that the storage vendors don't want you to have: should you rely on Exchange to protect your data, or spend a lot of extra money and buy into expensive storage? After all, Microsoft doesn't support using technologies like deduplication, thin provisioning or NAS storage with Exchange.

Most storage vendors will be keen to convince you to take the approach of using expensive solutions that abstract the storage from Exchange. For some organisations, that might actually be the best plan. There's something to be said for standardising on an approach across the board, even if it costs more money: if you run a single storage platform then, with some extra work, you can have a single approach for disaster recovery and backups.

That doesn't mean it's the best solution for an Exchange 2013 deployment, though, and storage vendors will argue the point with you, Exchange MVPs or Microsoft until we all lose the will to live. Microsoft, however, have designed Exchange 2013 to work best in a particular way. If you buy into the same vision that helps Microsoft make money out of Office 365, then you too can provide large mailboxes, reduce backup headaches and save money. If you only half buy into that vision, by putting your Database Availability Group onto expensive storage, you still gain the availability benefits and IO reductions, but potentially lose some of the cost savings, because you can't use all the really cool stuff that your storage vendor provides alongside Exchange.

If the best fit for your organisation is on-premises Exchange, then where you can, stand firm. Exchange is not like a common application server. There are lots of great targets to virtualise and/or attach to expensive storage, but Exchange is not necessarily the best candidate. Exchange uses lots of memory and CPU, and is designed in such a way that it expects to know where on disk it is writing data. It has technologies built in that rival the capabilities of expensive storage arrays rather than complement them.

That said, one size doesn't always fit all, and there are definitely cases where using RAID – for example as part of a smaller Exchange deployment on an existing virtualized platform – can make a lot of sense. These are typically organizations that maintain a much smaller infrastructure and are looking to take advantage of hypervisor-level HA rather than Exchange HA. For larger organizations the use of RAID is often tied to a wider strategy to virtualize everything possible, and in such cases a scale-out approach is often more appropriate.

The ideal approach for large Exchange deployments is the multi-role, building-block approach with physical servers. This allows the use of cheap servers with internal storage rather than external arrays. A great way to think of it: Exchange brings the RAID – a redundant array of inexpensive Exchange servers (RAIES!). Keep it cheap and, where you can, use JBOD.

For further reading, check out Microsoft's Preferred Architecture post on the Exchange Team Blog. Stay tuned for the follow-up to this post, looking at an alternative viewpoint on where Exchange on-premises should head.

 
