Let’s face it, this topic has been in the back of everyone’s mind for quite some time, yet few organisations of scale have achieved it. Tape has been around since the 1950s, when it was pioneered by IBM as a low-cost, offline and portable storage medium. In the last 65 years it has seen significant transformation, with the market now centred almost entirely on the LTO Ultrium cartridge format.
LTO-6 is the current generation, offering 2.5TB native (up to 6.25TB compressed) per cartridge, with a roadmap that extends to LTO-7 in October 2015 and to LTO-8, which will increase capacity even further over the coming years.
The reality is that, since I worked at Quantum between 2000 and 2007, there has been a dramatic change in the paradigm for tape usage. Because of its portability and sequential nature, tape was often the reason people disliked backup. Yet backup need not be so dull!
These days, backups are staged to disk first before being copied to tape. Smart backup solutions are able to electronically copy backup content from one second-tier disk system to another, usually in an alternate site, so that the need for making regular tape copies is significantly diminished.
In CommVault’s terminology this is called a DASH copy. DASH is a horrible acronym for Dedupe-Accelerated Streaming Hash, which is about as bad as all of those terrible acronyms IBM made up in the 1980s for their products. Forget the acronym; DASH just means FAST, and it achieves this by transferring only new and unique sub-blocks of data between the primary and secondary copy of backup content.
This technology means that you can copy backup (or archive) data in any of these scenarios:
- From Production to DR
- From one or more remote sites to head office/data centre
- From any site to a cloud data centre
- Or all of the above together in any combination
The upshot is that if you are copying data between disk arrays at your sites, your reliance on tape is significantly diminished.
When DASH copy is implemented, Perfekt often find that clients today purchase a 1, 2 or 4 drive tape library or autoloader and make just weekly, fortnightly or monthly tape copies, which are more for archival purposes than for traditional restore.
Because of the licensing schemes available with the CommVault Capacity License Agreement and the new Solution Bundles, clients are no longer metered on the back-end capacity of backup data stored. You can retain a day, a month, a year or a decade on disk for no additional license charge. You just need:
- The disk space to retain it
- A sufficiently large dedupe database on your media agent server
What do you need to get DASH Copy Working?
There are a couple of “considerations”. A consideration is a problem if you don’t think it through. If you plan ahead, then you will not run into issues.
The first is how to make the initial copy of data. DASH copy is incredibly efficient at moving backup content between sites. However, there is no special magic. That first copy will take some time to move. How long depends on:
- The data volume
- The network link (and how much of it you can use for this)
- A whole bunch of other “overheads”.
The devil is in the detail, so at Perfekt we have devised a simple formula to help you work this out which provides an approximation of the duration, in days, for the initial copy:
Duration (days) = Data Volume (GB) ÷ Available Link Speed (Mb/sec) × Constant
The constant factors in compression and TCP overhead, as well as the CommVault index and dedupe hash size. The following is a summary of the estimated numbers used for these factors:
- An estimated -15% allowance for the benefits of compression is given
- A +30% overhead for TCP/IP on the link speed
- +5% for the CommVault Index of the Data
- The dedupe database creates a 4K hash for each 128K block of data (+3%)
- Finally a unit conversion is made to account for data in GB and link speed in Mbps to output a duration in days
As an example a site with 500GB of data on a link with 10Mbps available would take at least 5.7 days to complete the initial copy process.
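The estimate can be sketched in a few lines of Python. This is a rough illustration rather than Perfekt's actual spreadsheet: it assumes the listed allowances are simply summed into one multiplier (1 - 0.15 + 0.30 + 0.05 + 0.03 = 1.23) and that decimal units are used (1GB = 8,000 megabits). The function name `initial_copy_days` is made up for this example.

```python
SECONDS_PER_DAY = 86_400
MEGABITS_PER_GB = 8_000  # assumption: decimal units, 1 GB = 8,000 megabits

# Assumption: the allowances listed above are added together.
# -15% compression, +30% TCP/IP, +5% index, +3% dedupe hash
OVERHEAD = 1.0 - 0.15 + 0.30 + 0.05 + 0.03  # roughly 1.23


def initial_copy_days(data_gb: float, link_mbps: float) -> float:
    """Approximate duration, in days, of the first DASH copy."""
    if link_mbps <= 0:
        raise ValueError("link speed must be positive")
    seconds = data_gb * MEGABITS_PER_GB / link_mbps * OVERHEAD
    return seconds / SECONDS_PER_DAY


# The worked example: 500GB over 10Mbps of available link
print(f"{initial_copy_days(500, 10):.1f} days")  # prints "5.7 days"
```

Under these assumptions the constant works out to about 0.114, which reproduces the 5.7 day figure for the example site.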
As an alternative to transferring the initial backups over the WAN, it is possible to seed the data using a portable USB-attached hard drive. In this approach, this hard drive transports the initial data set manually before establishing the regular (eg daily) DASH copy process.
Such a process, however, involves considerable time and effort in handling and shipping the drives. As a result, Perfekt suggests considering USB seeding only if the WAN transfer time would exceed 14 days.
Of course, once the seeding is complete, the ongoing transfers are far smaller, since users do not rewrite entire reports, databases, presentations or spreadsheets every day. What is captured is just the sub-block changes, and these are efficiently replicated to the alternate site after the backup.
You can use the same formula as above, but take the daily sub-block change rate of between 2% and 5% of the data volume to determine the nightly DASH copy duration.
Taking our example of 500GB of data in a site with a link with 10Mbps, we could say that this has 2% or 5% of daily change. Pop that into the formula and you will see that the DASH copy duration on the same 10Mbps link is:
- 2%: 2hrs and 45 mins
- 5%: 6 hrs and 51 mins
These are certainly achievable in an overnight window.
We recommend a minimum link speed of 10Mbps to support DASH copy. This ensures that the first copy can be made in reasonable time, and is also fast enough to handle the nightly copy on the rare occasion where something dramatic pushes the change rate to 10 or 15%. It may then take a day or two to catch up. If the link were too slow, the copy could fall behind for so long that there is an exposure in getting the data off site.
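The nightly figures, and the occasional 10 to 15% spike, can be checked with the same arithmetic. The sketch below is an approximation only: it assumes an overhead multiplier of 1.23 (the listed allowances added together) and decimal units, and `nightly_copy_hours` is a made-up name for this example.

```python
MEGABITS_PER_GB = 8_000  # assumption: decimal units
OVERHEAD = 1.23          # assumption: listed allowances summed


def nightly_copy_hours(data_gb: float, link_mbps: float, daily_change: float) -> float:
    """Approximate hours needed to DASH copy one day's changed sub-blocks."""
    changed_gb = data_gb * daily_change
    seconds = changed_gb * MEGABITS_PER_GB / link_mbps * OVERHEAD
    return seconds / 3600


# 500GB site on a 10Mbps link, at normal and exceptional change rates
for rate in (0.02, 0.05, 0.10, 0.15):
    hours = nightly_copy_hours(500, 10, rate)
    h, m = divmod(round(hours * 60), 60)
    print(f"{rate:.0%} change: about {h}h {m:02d}m")
```

With these assumptions the 2% and 5% cases land within a minute or two of the figures quoted above; the small gap is presumably rounding in the constant. The 10% and 15% cases come out at roughly 14 and 20 hours, which is why a spike can spill past the overnight window and take a day or two to clear.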
With ongoing data growth and general system changes it is important to monitor transfer times of the DASH copies to ensure that they are completing in a reasonable time period and not lagging behind. Perfekt suggests that this is done with Aux Copy Fall Behind Alerts in console progress reporting.
Also the DASH copy summary report should be reviewed each month to monitor the overall health of the copies. This will help identify sites where greater link speeds may be required in the near future.
What if you don’t have a second site? Look up in the sky!
Not a problem. There are oodles (the technical term meaning more than you could imagine) of cloud providers wanting to have you store your backup data with them. There are two ways of storing CommVault backup data in cloud storage. (I hate using the term “the cloud”, as if there is only one. The reality is that there are so many offerings. They are all different. Their costs are not the same, and a good number will be out of business in less than 5 years.)
The first way is to DASH copy to a cloud provider. This is preferred. Using this approach you would stand up a virtual CommVault media agent server in the cloud and purchase some cloud storage. The media agent is doing some hefty work, so the only gotcha here is the compute cost of virtual servers if your chosen cloud provider charges this way. It is best not to use this type of model for backup unless you pilot the process, measure the IOPS and extrapolate this within the costing model of your cloud provider.
The second way is to move data directly to some type of cloud storage without DASH copy. The issue with this is that you usually pay cloud providers per GB per month, and any attempt to push large data volumes to a cloud service without the benefit of dedupe will be unaffordable after a few years of a lengthy backup retention strategy. [It is affordable if you only want 1-6 months of content but that is not the normal business data retention cycle for most organisations, especially if you are looking to remove tape altogether. Any longer than a few years and you will quickly work out that you can buy a small tape library with LTO-6 drives and have plenty of change compared to the cloud costings].
Removing Tape – What Disk is Needed?
In such a topology, tape provides two key functions:
- A point in time complete “archive” copy beyond the longest disk-based retention period
- A copy of data as a last chance of recovery if all else fails
Because deduplication lets you retain many years of data copies on disk quite effectively, it negates the need for point 1. Addressing point 2 is a business decision, and many sites do not have this today.
Back on point 1, there are a few basic factors that will need to be determined in order to estimate the size of disk array to retain your online backup content:
- How large is the first full copy of data: typically we see about 20% reduction due to compression and some deduplication
- Retention: for how many years you will retain backup copies
- Number of backups: eg 5 days per week or 7 days per week, 52 weeks per year
- Daily rate of change: typically between 2% and 5%, depending on the workload
The disk space required can then be approximated using this formula:
Disk Space Required (TB) = Protected Data Volume (TB) × (1 − Allowance for compression & some dedupe) + Protected Data Volume (TB) × Number of backup days retained × Rate of daily change (2-5%)
So in a site with 10TB of data with: normal 20% savings on the first backup, backups occurring 5 days per week, 52 weeks per year, online retention of 10 years, 2% of daily change; the usable disk volume required is then 528TB. Utilising 4TB nearline SAS drives, this could be accomplished in a storage array with dense enclosures in a tidy 9 rack units of footprint!
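As a quick check of the 528TB figure, the sizing can be sketched in Python. This assumes the disk requirement is the compressed first full copy plus one day's change for every retained backup, which is the reading that reproduces the example; the function and parameter names are made up for illustration.

```python
def disk_space_tb(
    protected_tb: float,
    first_copy_saving: float,  # e.g. 0.20 for a 20% saving on the first full
    backups_per_week: int,
    years_retained: int,
    daily_change: float,       # e.g. 0.02 for 2% daily change
    weeks_per_year: int = 52,
) -> float:
    """Approximate usable disk (TB) needed to keep all backups online."""
    first_full = protected_tb * (1 - first_copy_saving)
    backup_days = backups_per_week * weeks_per_year * years_retained
    changes = protected_tb * daily_change * backup_days
    return first_full + changes


# 10TB protected, 20% saving, 5 backups/week, 10 years, 2% daily change
print(round(disk_space_tb(10, 0.20, 5, 10, 0.02)))  # prints 528
```

The same function shows how sensitive the result is to the change rate: at 5% daily change, the identical site would need about 1,308TB.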
Of course, this is simplified. Volumes will start out smaller and grow with increased retention, and understandably there will be primary data growth and fluctuations in usage patterns over the retention period. Even so, it provides an indication of the likely data capacity required.
Aren’t Spreadsheets Wonderful!
To extrapolate running costs of the required backup storage, here is a quick comparison of the disk array outlined above for the second (remote) site copy of the data, retained for 10 years:
|Option|On-premise/co-lo high density storage array|Cloud storage|
|Basis|528TB usable; purchased up front; 10 year vendor support; inclusive of running costs|Ingest tier of $0.0259 per GB/month; storage tier of $0.012 per GB/month; incrementally growing over 10 years; compute (to run Media Agent) at $1.169/hr|
|10 year cost|$334K ex GST|$393K ex GST|
The cloud figure does not include costs for retrievals. Retrievals will be “problematic” at best, and should only be required if all else has failed.
So, not a great deal in it when you factor this over a 10 year period, but it is useful to benchmark the differences between the available options. Of course, this is simply to protect 10TB of data without taking into account its own growth due to new workloads etc. The operational note on retrieving data is important. Restoration from the on-premise storage will be very simple, whereas the cloud-based storage will be very slow (“tape-like”) and only to be used in emergencies.
And if the numbers just don’t work, there is still tape
Full scale recoveries are rare, and most restore jobs are for small data sets. Depending on data volumes, retention requirements and other business methods, we are finding today that tape is still a very low-cost way of creating archival data copies. For a copy made once per month, for example, a single or dual-drive LTO-6 autoloader is all that is needed to push a retention copy to tape. That copy is probably never needed, but it gives surety and provides another process to show strong data governance.
Should any of this be within your thinking, then give the experts at Perfekt a call. We love to help with your backup strategies.