Within Commvault, Indexing is the process that creates and maintains metadata about all content that Commvault protects, whether through backup or archive processes. Indexing is core to browsing backed-up content, facilitates restores, and enables analytics and reporting on protected content.
Prior to the release of Commvault Version 11 in 2015, the Commvault Indexing Architecture had remained largely unchanged since inception. Commvault refers to this technology as Indexing Version 1. In Indexing Version 1, a backup would write the index file to the Index Cache on the MediaAgent and each backup job created its own index. Upon completing the backup to the storage library, the Archive Index Job Phase would copy the index to the storage library. Before each subsequent incremental or differential backup, Commvault restores the Index into the Index Cache (skipping this step if the Index is already present there), appends the new job's index data to it, and then backs up the resulting complete index. As such, the Index for each job increases considerably in size until the next full backup is completed.
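To make that growth pattern concrete, here is a minimal sketch in Python. It is not Commvault code: the function name `v1_index_sizes` and the entry counts are invented for illustration, and it assumes each incremental simply appends its entries to the index carried over from the previous job.

```python
def v1_index_sizes(jobs):
    """Return the size of the index backed up by each job.

    jobs: list of (job_type, new_entries) tuples, where job_type is
    "full" or "incr". A full starts a fresh index; an incremental
    appends to the index carried over from the previous job.
    """
    sizes = []
    current = 0
    for job_type, new_entries in jobs:
        if job_type == "full":
            current = new_entries    # new cycle, new index
        else:
            current += new_entries   # restore previous index, then append
        sizes.append(current)
    return sizes


# Hypothetical cycle: a full, three incrementals, then the next full.
cycle = [("full", 1000), ("incr", 50), ("incr", 80), ("incr", 20), ("full", 1100)]
print(v1_index_sizes(cycle))  # [1000, 1050, 1130, 1150, 1100]
```

Each incremental backs up a larger index than the job before it, and the size only resets when the next full backup runs.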
With Indexing Version 2 technology (introduced in Commvault Version 11), the Index Cache on the MediaAgent holds Action Logs. These are replayed into the Index Cache of an Indexing Server MediaAgent, which is then backed up to the storage library. The transition to Indexing Version 2 has not been an easy one for the developers at Commvault. It has essentially required a re-development of each Commvault “Intelligent DataAgent” (known as iDataAgent).
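The replay idea can be sketched as follows. This is an illustrative model only: the article does not describe the real Action Log format, so the `Action` record, the `"add"` and `"delete"` operation names, and the choice to mark deleted items rather than remove them are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    job_id: int
    op: str    # "add" or "delete"; hypothetical operation names
    path: str


def replay(index, action_log):
    """Apply action log entries, in order, to a cumulative index.

    index maps path -> {"added_in": job_id, "deleted_in": job_id or None}.
    Assumption: deleted items are marked, not removed.
    """
    for action in action_log:
        if action.op == "add":
            index[action.path] = {"added_in": action.job_id, "deleted_in": None}
        elif action.op == "delete" and action.path in index:
            index[action.path]["deleted_in"] = action.job_id
    return index


# Each backup job writes only a small action log; replay happens afterwards.
log_job_1 = [Action(1, "add", "/data/a.txt"), Action(1, "add", "/data/b.txt")]
log_job_2 = [Action(2, "delete", "/data/a.txt"), Action(2, "add", "/data/c.txt")]

cumulative_index = {}
for log in (log_job_1, log_job_2):
    replay(cumulative_index, log)

print(sorted(cumulative_index))
print(cumulative_index["/data/a.txt"])
```

In this model the backup job never has to restore and rewrite a full index; it only emits its own log, and a single cumulative index is updated separately.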
At the time of writing, the iDataAgents supported with Indexing Version 2 include:
- VMware Virtual Server Agent (Early Release)
- Microsoft Windows File Systems Agent
- UNIX File System Agent
- Exchange Mailbox Agent
- Microsoft SQL Server (IntelliSnap and Block-Level backups only)
Whilst many of the original indexing concepts have been preserved in Indexing Version 2, these changes have resulted in a significant improvement in the efficiency of the Indexing process, along with a number of additional features and benefits:
- Index creation occurs asynchronously to the data backup process
- Action logs and a cumulative index
- Deleted items shown across backup cycles: During browse operations, iDataAgents that are configured to use Indexing Version 2 show deleted items across backup cycles. This is a change from Indexing Version 1 agents, which only showed deleted items in the current cycle.
- Better support for GridStor: When GridStor is deployed, Indexing Version 2 removes the time and disk space required to copy an Index between MediaAgents if the backup process fails over to a different MediaAgent.
- Enables additional operations
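The difference in how deleted items appear during a browse can be illustrated with a small Python sketch. The data layout and the `browse_deleted` function are hypothetical and exist only to show the behaviour described above; they do not reflect how Commvault stores or queries its index.

```python
def browse_deleted(items, current_cycle, across_cycles):
    """Return the paths of deleted items that a browse would show.

    items: list of dicts with "path" and "deleted_in_cycle"
           (None if the item has not been deleted).
    across_cycles=True models the Version 2 behaviour,
    across_cycles=False models the Version 1 behaviour.
    """
    visible = []
    for item in items:
        cycle = item["deleted_in_cycle"]
        if cycle is None:
            continue  # not a deleted item
        if across_cycles or cycle == current_cycle:
            visible.append(item["path"])
    return visible


items = [
    {"path": "/data/old.log", "deleted_in_cycle": 1},
    {"path": "/data/tmp.bin", "deleted_in_cycle": 2},
    {"path": "/data/live.db", "deleted_in_cycle": None},
]

print(browse_deleted(items, current_cycle=2, across_cycles=False))
print(browse_deleted(items, current_cycle=2, across_cycles=True))
```

With `across_cycles=False` only the item deleted in the current cycle is listed, whereas `across_cycles=True` also lists the item deleted in the earlier cycle.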
Introducing Indexing Version 2 for VMware – #V2!
The Indexing Version 2 for VMware Virtual Server Agent is currently an Early Release feature. Whilst Early Release features are best suited to new or controlled environments, this blog article focuses on how you can take advantage of the benefits of Indexing Version 2 for VMware right now. For businesses with VMware workloads, the Virtual Server Agent for VMware often has the largest footprint of all the Agents, and can certainly benefit from the Indexing improvements that come with this release.
Commvault have implemented Indexing Version 2 for the VMware Virtual Server Agent as part of a new approach known as “VM-Centric Operations for Virtual Server Agent with VMware”. Previously, when a VMware subclient was configured to protect multiple VMs, each backup would register just one backup job for all the clients protected. That meant it was not possible to drop the retention or delete the backups for individual VMs without impacting the entire protected subclient. This long-standing limitation has finally been overcome with the introduction of this technology, as VMware client backups now spawn a new job number for each VM that is discovered for backup. You also now have the option to temporarily enable or disable backup activity for each VM client instance manually, in addition to the legacy method of modifying the VSA subclient properties. Restores can also be multi-streamed, so you no longer have to manually create separate jobs to restore multiple VMs in the same subclient.
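A small sketch shows why one job per VM matters for retention. It is a simplified model, not the Commvault job manager: the functions `run_backup` and `vms_affected_by_delete`, the VM names, and the job numbers are all made up, and any parent job on the subclient itself is ignored.

```python
from itertools import count


def run_backup(vms, vm_centric, job_ids):
    """Return {job_id: [vm, ...]} for one backup of a subclient."""
    if vm_centric:
        # One job number per discovered VM.
        return {next(job_ids): [vm] for vm in vms}
    # Legacy behaviour: a single job covers every VM in the subclient.
    return {next(job_ids): list(vms)}


def vms_affected_by_delete(jobs, vm):
    """VMs that lose backup data if every job containing `vm` is deleted."""
    affected = set()
    for members in jobs.values():
        if vm in members:
            affected.update(members)
    return sorted(affected)


vms = ["web01", "db01", "app01"]
legacy = run_backup(vms, vm_centric=False, job_ids=count(100))
centric = run_backup(vms, vm_centric=True, job_ids=count(200))

print(vms_affected_by_delete(legacy, "db01"))   # ['app01', 'db01', 'web01']
print(vms_affected_by_delete(centric, "db01"))  # ['db01']
```

In the legacy model, deleting the backup for `db01` takes the other two VMs with it; in the VM-centric model only `db01` is affected.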
Implementation Considerations – V1 backup in the hand vs V2 in the bush.
Brilliant, right? Well yes, but be careful to read the fine print. When you apply a Service Pack to Commvault, your registered clients that were using Indexing Version 1 do not automatically get upgraded to Version 2. The upgrade has to be performed manually and requires running the “Upgrade to Indexing V2 workflow”, which is available from the Commvault Store. Also, not all client agents that already have a Job History currently support the upgrade workflow; for those agents, a new client registration that has not yet been backed up needs to be configured and then upgraded with the workflow. The agents with Job History that support the upgrade workflow are currently limited to:
- Microsoft Windows File Systems Agent
- UNIX File System Agent
- Macintosh Agent
As you can see, the VMware Virtual Server Agent is not on this list, so Indexing Version 2 support for VMware is limited to clients that have no backup job history. The way around this in existing environments is to create a new, uniquely named Virtualisation Client. You can still use your existing storage policies and still preserve the deduplication benefits on the new Virtualisation Client. It is an administrative nuisance, but as long as you are not low on storage library capacity, this alone is not going to be too much of a problem.
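The upgrade rules above boil down to a simple decision, sketched here in Python. The function `upgrade_path`, the agent labels, and the returned strings are illustrative only; the supported-agent set simply mirrors the list given in this article at the time of writing.

```python
# Agents whose existing clients (with job history) support the upgrade
# workflow, as listed in this article. Labels are informal, not product IDs.
WORKFLOW_AGENTS_WITH_HISTORY = {
    "Microsoft Windows File Systems Agent",
    "UNIX File System Agent",
    "Macintosh Agent",
}


def upgrade_path(agent, has_job_history):
    """Describe how a client could be moved to Indexing Version 2."""
    if not has_job_history:
        return "run the upgrade workflow on this client"
    if agent in WORKFLOW_AGENTS_WITH_HISTORY:
        return "run the upgrade workflow on this client"
    return "register a new client with no job history, then run the upgrade workflow"


print(upgrade_path("UNIX File System Agent", has_job_history=True))
print(upgrade_path("VMware Virtual Server Agent", has_job_history=True))
```

An existing VMware client with job history falls through to the last branch, which is why a new Virtualisation Client is needed.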
Also, be aware that the fine print states that if you choose the Synthetic Full option of performing an Incremental backup before or after, only the synthetic full backup operation runs. This means that if you reuse one of your existing schedule policies with an Incremental before the Synthetic Full, you will always miss a Recovery Point. A separate schedule policy should therefore be created with a Daily Incremental and a separate Synthetic Full that runs outside the hours in which the Incremental is expected to complete.
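The missed Recovery Point can be counted in a short sketch. This is a toy model of a weekly schedule, not a Commvault schedule policy: the job labels and the function `new_recovery_points` are invented, and it assumes that a synthetic full only consolidates existing backups and does not capture new data from the VMs.

```python
def new_recovery_points(week, vm_centric):
    """Count the jobs in a week that capture new data from the source VMs.

    week: list of days, each a list of job labels:
      "incr"                   a standalone incremental
      "synth_full"             a standalone synthetic full
      "synth_full+incr_before" a synthetic full with the incremental-before option
    """
    points = 0
    for day_jobs in week:
        for job in day_jobs:
            if job == "incr":
                points += 1
            elif job == "synth_full+incr_before" and not vm_centric:
                points += 1  # legacy: the incremental runs first
            # With VM-centric operations only the synthetic full runs,
            # and a synthetic full alone adds no new recovery point.
    return points


reused_policy = [["incr"]] * 6 + [["synth_full+incr_before"]]
separate_policies = [["incr"]] * 6 + [["incr", "synth_full"]]

print(new_recovery_points(reused_policy, vm_centric=False))     # 7
print(new_recovery_points(reused_policy, vm_centric=True))      # 6
print(new_recovery_points(separate_policies, vm_centric=True))  # 7
```

Reusing the old policy drops one recovery point per week in this model, while a separate daily incremental restores it.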
In addition, you may be confused when Commvault Documentation states that you cannot perform a VM-Centric backup to tape, but this only means the first phase of the backup cannot be a tape copy. Commvault still supports Virtual Machine backup copies on tape, so long as the copy is an Auxiliary Copy made after the VM has first been backed up to disk, which is the most common approach these days. However, just as you can no longer perform a VM-Centric backup directly to tape, you can also no longer restore these VMs directly from tape. This is because the Commvault MediaAgent uses Live Browse technology, which mounts the VM directly from the storage. The way Storage Policy precedence has been developed in Commvault prevents recalling the Tape Copy Job back onto the Primary Disk Copy. The only way around this limitation is an additional disk-based Storage Policy Copy: a Selective Copy of the Virtual Machine onto temporary staging storage, from where it can then be restored from disk using Live Browse technology.
Now you may think you can get around this limitation by using “Cloud Storage Archive”, but support for this feature is limited.
Cloud Storage Archive Recall Support via Commvault Workflow
Recalls for the following agents with Indexing Version 2 clients are supported using the Cloud Storage Archive Recall workflow:
- Microsoft Windows File System
- Macintosh File System
- Unix File System
- NAS NDMP agent support for non-deduplicated data
Cloud Storage Archive Recall Support via Commvault Command Line
- Agents using Indexing Version 2 (use the Commvault Workflow for the ones listed above)
- Agents using Indexing Version 1
- Agents with Live Browse option
- Database Agents
- NAS agent with deduplicated data
- Persistent Recovery used for file system stub recalls
The recall of V1 and V2 Indexed VMware Clients is supported, but currently it must be performed using the legacy Commvault Command Line feature and is not supported with the superior Commvault Workflow method for recalling out of an archive tier. The legacy Command Line method means ALL the deduplicated data in the archive tier will need to be recalled for the restore, which takes extra time and incurs additional cloud storage costs. Perfekt understands that Commvault developers are working on a way to minimise this recall cost, but “Cloud Storage Archive” remains a viable option for those who believe they will restore infrequently from an Archive Tier and who prefer the flexibility that cloud offers over tape backup media.
Finally, there is a significant gotcha that is not easy to pick up from the Commvault Online Documentation and that can affect Selective Storage Policy Copies. Selective Copies give backup administrators the option to protect all, or just some, of the subclients protected in the primary and synchronous Storage Policy copies. Prior to “VM-Centric Operations for VSA with VMware”, you only needed to select the VMware subclient to send, say, your monthly Synthetic Full copies to tape. Now, with the introduction of “VM-Centric Operations for VSA with VMware”, a synthetic full backup for the VSA subclient will kick off a VM-Centric Synthetic Full operation for all the VMs, but there is no Synthetic Full Job ID for the VMware subclient. This means only Traditional Full, Incremental and Differential jobs are logged against the VMware subclient, so under this configuration only the Traditional Full backups (usually just the first backup) would qualify for being copied to tape. Unfortunately, this is a technical oversight and Perfekt is working with Commvault to have this issue patched. For organisations that have a significant number of Virtual Machines and have been streamlining what they Auxiliary Copy, this can potentially be an administrative nightmare, as the backup administrator would have to constantly select and deselect all the VMs that have been discovered from the last backup cycle. Whilst organisations that fall into this category may be put off from transitioning to Indexing Version 2 solely because of this issue, there is a solution: create a new Storage Policy just for the VMware backups and choose all VMs in the Selective Copy.
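The Selective Copy behaviour can be pictured as a filter over job records. The sketch below is a simplified model with made-up job records and a hypothetical `jobs_qualifying_for_copy` function; it assumes a Selective Copy picks up full-type jobs (traditional or synthetic) logged against whichever subclients or VMs have been selected.

```python
FULL_TYPES = {"full", "synth_full"}


def jobs_qualifying_for_copy(jobs, selected_owners):
    """Return the IDs of full-type jobs logged against the selected owners."""
    return [
        job["job_id"]
        for job in jobs
        if job["owner"] in selected_owners and job["type"] in FULL_TYPES
    ]


# Hypothetical history under VM-centric operations: synthetic fulls are
# logged against each VM, never against the VSA subclient itself.
jobs = [
    {"job_id": 1, "owner": "vsa_subclient", "type": "full"},
    {"job_id": 2, "owner": "vsa_subclient", "type": "incr"},
    {"job_id": 3, "owner": "vm:web01", "type": "synth_full"},
    {"job_id": 4, "owner": "vm:db01", "type": "synth_full"},
]

print(jobs_qualifying_for_copy(jobs, {"vsa_subclient"}))        # [1]
print(jobs_qualifying_for_copy(jobs, {"vm:web01", "vm:db01"}))  # [3, 4]
```

Selecting only the subclient picks up just the first traditional full, whereas selecting the VMs picks up their synthetic fulls. A dedicated Storage Policy with all VMs selected avoids having to maintain that selection by hand.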
Perfekt’s position on Indexing Version 2 for VMware – (V2 or not V2? That is the question)
In conclusion, whilst there are quite a few gotchas in this Early Release of Indexing Version 2 for VMware, Commvault developers are certainly committed to reducing the number of limitations and improving the features on offer. Perfekt see Indexing Version 2 with VMware as highly desirable, and we recommend that our clients consider transitioning to it for the protection of VMware workloads. Businesses that are protecting their VMware environment with a Medium specification or greater MediaAgent, and are not running critically low on free space in their Disk Libraries, can certainly benefit by transitioning to Indexing Version 2. There is nothing stopping smaller businesses either, as the ability to purge jobs for individual VMs is going to be just as welcome. Also, bear in mind that this is still an Early Release feature, and if you do feel that you need support, both Commvault Support and the highly trained engineers here at Perfekt (Platinum Partner for Commvault) are well positioned to assist and maximise your organisation's protection potential.
End Note: What does this mean for Microsoft Hyper-V?
There is no official news yet of when Indexing Version 2 for the Virtual Server Agent is going to include Microsoft Hyper-V. Commvault have a strong history in supporting Microsoft’s Hypervisor technology, even going as far as developing their own Changed Block Tracking (CVCBT) driver for Windows Server 2012 R2 Hyper-V before Microsoft introduced Resilient Change Tracking (RCT) in Windows Server 2016. Hyper-V customers should not feel that they are going to be left out for long as this feature is currently under development.