Those experienced with Commvault will, no doubt, have seen the HTML5 ‘Command Center’ getting royal attention every quarter. Gradually it has been transformed into Commvault’s Crown Jewel, so much so that at times one could be forgiven for losing sight of some uncut diamonds in the Java CommCell Console that, hopefully, will be fully migrated into the Command Center in future updates.
In this blog we look at an often-overlooked feature in Commvault, called Monitoring Policies, then show how you can access the Monitoring Policy Dashboards in the Command Center, and finally reflect on why on earth you are not using these gems right now!
Monitoring Policies are categorised under Log, Activity and System, but they can all be collected under a single SOLR IndexStore.
Commvault, I feel, have heavily overstated the requirements for the Index Server. This could frighten off customers wanting to take full advantage of what the product can deliver. In our environment, I have had no problems using a dedicated server with 20GB RAM, and we’ve hardly touched the 550GB disk allocated for the Indexes. This is well below the stated requirements of 64GB RAM and a 2TB SSD for the Index Directory, but you may need to scale up should you fully embrace the Monitoring Policies.
The dedicated monitoring server will need the Index Store Package loaded onto the registered Commvault Client. Once the Index Store Package is installed, you will need to ‘Add an Index Server to your CommCell Environment’, include the ‘Log Monitoring’ Role, and then assign the server with the Index Store Package as the Node, along with the directory to store the Log Monitoring indexes. You may need to wait up to 30 minutes for it to fully prepare Apache SOLR. If it does not come up, it could be because port 20000 is unreachable from the CommServe, or because you have not allocated enough RAM to the client. If you want to tune the amount of memory allocated, you can follow the instructions here.
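Before raising a support case, it is worth confirming that port 20000 is actually reachable from the CommServe. This is a minimal sketch of such a check in Python; the hostname used in the example comment is a placeholder for your own Index Server.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and DNS resolution failures.
        return False


# Example (placeholder hostname, run from the CommServe):
# is_port_open("indexserver.company.local", 20000)
```

If this returns False from the CommServe, fix the firewall or name resolution before looking at SOLR memory tuning.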
Once you have set that up, you just choose the Commvault Clients that you want to monitor, against the Policy Templates shown here.
It’s pretty straightforward: choose the Monitoring Policy Type
Then the Monitoring Type
Give it a name
Choose the Clients and/or Client Groups
Choose the Index Server and the retention for this monitoring component.
Specify the Schedule Details
Review your Configuration, press Finish and repeat for all the Policy Types.
Once you feel like you have set up all the policies and given enough time for the schedules to collect the data you can now pop into the Command Center to see the dashboards.
From here you can choose either the Log Monitoring Policies or System Monitoring.
The Log Monitoring feature is very straightforward and can be very useful when troubleshooting clients without having to pull the logs from them. It also offers some of the most useful log-filtering features of GxTail, Commvault’s fantastic troubleshooting application.
In our environment I was able to centrally pull the Commvault Logs for all clients, with the exception of the Edge Clients, whose Commvault instance does not come with the full File System Agent. These clients only come with File System Core, which is a bit of a bummer. I have raised a CMR with Commvault to see if this can be incorporated in a future release.
Now you may be familiar with the Infrastructure Load report found in the Command Center which reports on System Resources (CPU/RAM usage), however, it is the Commvault System Monitoring Policy feature discussed below that is the reason for this blog. The hidden gemstones under here will currently require you to apply a little elbow grease to cut and polish, in order to make them shine.
In our environment running 11.21.15, I found that the System Monitoring was logging performance statistics without error but many of the dashboards would return errors like these.
It is possible that many have tried, gotten to this point and been dismayed, so I did some research into what was going on. It was apparent that the dashboards had slight errors in the way they were querying SOLR DB facets. For example, the query ‘graphtype timechart avg(cpu) by processname’ worked when changed to ‘graphtype timechart avg(progress_CPU) by processname’, and I found that the System Monitoring Dashboard queries requiring attention were pre-cooked within a stored procedure inside the CommServe DB. When I raised this with Commvault Support, they very kindly compiled a Diagnostic Hotfix (v11SP21_Available_Diag2056_WinX64) that updated the CommServe; now only the Media Agent Data Transferred widget needs one last touch-up from development. So, if you are running a similar build to 11.21.15 and want to see a performance dashboard like this, then reach out to Commvault Support. Note that if you have the diagnostic patch loaded and then update to 11.22, as we have done, some of the dashboards will return empty graphs with the reason “No data found”.
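To illustrate the kind of fix involved, here is a hypothetical sketch that rewrites a broken facet field name in a dashboard query string. Only the `cpu` → `progress_CPU` mapping comes from the example above; any further mappings, and the function itself, are illustrative assumptions to be verified against your own CommServe DB rather than anything Commvault ships.

```python
# Hypothetical sketch: rewrite known-broken SOLR facet field names in a
# dashboard query string. Only cpu -> progress_CPU is taken from the blog's
# example; treat any other mapping you add as an assumption to verify.
FIELD_FIXES = {
    "cpu": "progress_CPU",
}


def fix_dashboard_query(query: str) -> str:
    """Replace avg(<broken_field>) with avg(<fixed_field>) in a query string."""
    for broken, fixed in FIELD_FIXES.items():
        query = query.replace(f"avg({broken})", f"avg({fixed})")
    return query


# Example:
# fix_dashboard_query("graphtype timechart avg(cpu) by processname")
#   -> "graphtype timechart avg(progress_CPU) by processname"
```

In practice the real fix lives in a CommServe stored procedure, which is why the supported route is the Diagnostic Hotfix rather than editing queries yourself.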
Looking at the Dashboard below, suddenly we have an easy-to-use visual insight into how your Commvault processes are performing. If you have ever been frustrated troubleshooting an overnight or weekend performance problem through log bundles, I’m sure you will agree that this #DataIsBeautiful. I especially like the fact that these Monitoring Policies can provide a lot of information about what is happening in your environment without having to license any third-party software. Certainly, it is very reassuring that I now have historical performance and log data that is Commvault-specific, which I can use for comparison should we need to investigate issues on the monitored servers.
Or you can click each graph and drill down into a custom date range to analyse the Commvault Process level statistics.
In summary, System Monitoring policies may at first, unfairly, be seen as forgotten diamonds in the rough, but by putting in a bit of effort you can transform them into shiny diamonds that shed light on your environment. Hopefully soon we will see a product update that fully embraces this fantastic feature within the Command Center for both configuration and dashboard reports.
This week a significant update for Commvault was released within Feature Release 11.22 that will help every customer protecting Exchange Online with Commvault: Point-In-Time Mailbox Recovery. This capability is not provided by Microsoft natively, and they have said that “Point in time restoration of mailbox items is out of scope for the Exchange Online service”. This lack of a native capability has meant third-party developers have had to work very hard to develop a fast, scalable and indexed mailbox backup solution. The technology backbone Commvault chose was a logical one: Apache Lucene SOLR, which has long been used for File/Email Content Indexing, System Monitoring and other Analytics features. For many small-to-mid-sized organisations using just one Index Server and Access Node, the performance when using Feature Release 11.20+ Modern Authentication is excellent, with download throughput figures of up to 2TB/day not uncommon.
However, despite the feature-rich nature of the Commvault Exchange Mailbox Agent, there was no true point-in-time restore technology. The biggest technical challenge to overcome was that, previously, the only way Commvault could perform a Point-In-Time recovery was to restore the SOLR Index Backup and replay it into a new instance (or instances) of an Index Server. Commvault Support have had this process down pat to help out customers who may have been understandably daunted by the procedure, but there had to be a better way, right? Well thankfully, the process of manually creating Index Servers and replaying the Index Backups will soon be no more.
Feature Release 11.22, which at the time of writing is in “Technical Preview” (General Availability is expected in February 2021), has solved this problem by changing the way the SOLR Index does its “Sharding”. What is Sharding, and why do it? Well, it’s Lucene SOLR’s way of scaling out, and your point-in-time results are cached into a new SOLR Core. Commvault now creates an Exchange Mailbox Recovery Point from just the User Mailbox you want to restore, and the data is sharded off into a new SOLR core that will stay around for 30 days or until you delete the recovery point.
Now, at the time of writing, the Point-In-Time recovery still restores messages deleted by the user. The Restore Mailbox Messages UI does give you the option to include/exclude deleted messages, but in my testing that does not work yet. Also, in my testing, if a message was backed up in one folder and then backed up again after being moved into a different folder, the mailbox restore would restore both messages. These were the results in my test lab and test O365 environment, so your mileage may vary in your favour; however, I’d probably recommend holding out for now, as this new Commvault Agent is still under Technical Preview classification. I can confirm that Commvault is correctly recording the common Message Identifiers in the Indexes each time an email message is moved, so we can be confident that this will be resolved without having to re-back up the data protected under this client.
Here are some samples of how point-in-time recovery is performed. Note: this new feature is exclusively in the HTML5 Commvault Command Center.
First you will need at least some backups before your test.
Once the backup is complete, click the Client name.
Click on the Mailbox you want to restore.
A calendar of all the recovery points for each day will be visible after you click on the date. In this instance I have chosen the backup at 5:33PM (Job Id 21) and clicked Create recovery point.
Confirm the Recovery Point creation.
Locate the Recovery Points by clicking on the Recovery Points tab for the client, then tick the mailbox and click Restore > Restore Mailbox (chosen here) or Restore Messages.
For in-place restores, all the messages protected up until this recovery point will be restored in place. Note: at the time of writing, deleted and moved messages will be restored as copies of the original message, but you will not get a double-up of a message if it already exists in the same folder.
Or, restore the data to another Mailbox and Folder. Note: Commvault out-of-Mailbox restores will recover all the messages into sub-folders underneath the folder you specify.
Long-time users of the Commvault CommCell Console who have managed their environment off-host would have been using the Java Web Start application. It was simple and easy: just enter http://yourcommserve/console and open ‘galaxy.jnlp’. Unfortunately, that galaxy is now far, far away. ‘Java Web Start’ and the ‘Java Plug-in’ have been deprecated since March 2019, and the continued use of Java SE requires an Oracle Subscription. So what is the best way to connect to the CommCell Console without having to Remote Desktop into the CommServe each time?
Firstly, I cannot stress this enough: do not use the Commvault installation software to install the Commvault CommCell Console package onto your desktop. The risk here is that any patching of the CommServe means the desktops must be patched at the exact same time, which cannot be guaranteed, and using it this way can cause fatal errors, including data loss.
The best way is to use the netx.jar bootstrap file. The simplest way to get netx.jar is to download it directly from the Commvault Cloud, and conveniently you don’t need a Maintenance Advantage login to download it.
You can also elect to download the netx.jar file directly from your CommServe Web Server at https://yourcommserve.company.local/console/netx.jar. If you are using a Chromium-based browser, you will likely be unable to download the netx.jar file if your CommServe Web Server is using self-signed certificates. If you have direct access to the CommServe, you can instead copy the file located at “%CV_Instance001%\..\GUI\netx.jar” (e.g. “C:\Program Files\Commvault\ContentStore\GUI\netx.jar”).
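If the browser refuses the download because of the self-signed certificate, one workaround is a scripted download that skips TLS verification. This is a minimal Python sketch under the assumption that you trust the CommServe you are connecting to; the hostname is a placeholder, and disabling verification should never be done against servers you do not control.

```python
import ssl
import urllib.request


def insecure_context() -> ssl.SSLContext:
    """TLS context that accepts a self-signed certificate (trusted hosts only)."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # must be disabled before verify_mode
    ctx.verify_mode = ssl.CERT_NONE  # accept the self-signed certificate
    return ctx


def fetch_netx(url: str, dest: str = "netx.jar") -> None:
    """Download netx.jar from the CommServe Web Server, skipping verification."""
    with urllib.request.urlopen(url, context=insecure_context()) as resp:
        with open(dest, "wb") as out:
            out.write(resp.read())


# Example (placeholder hostname):
# fetch_netx("https://yourcommserve.company.local/console/netx.jar")
```

Note that `check_hostname` has to be switched off before `verify_mode` is relaxed, otherwise Python raises a `ValueError`.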
Now, a number of times I’ve seen instances where people are using netx.jar but still launching it with Java 8 SE. At the time of writing it may seem like the Console works, but you may run into Console-related errors or expose yourself to the kind of risks mentioned earlier.
What you should be using instead is OpenJDK 11. For almost two years the Commvault CommCell Console has been compiled for Java 11. Currently OpenJDK 11.0.7 is installed with the CommCell Console and is upgraded periodically with new Feature Releases of Commvault. OpenJDK 11 can be downloaded here thanks to the fantastic contributors of the AdoptOpenJDK community.
You can download the JDK as an MSI, but I prefer to download the zipped binary instead because I would rather choose at runtime which Java version I want to run. In this example, netx.jar and the extracted zipped JDK are in my ‘Downloads’ folder (don’t judge me) and I created a shortcut on my desktop pointing to:
"%HOMEPATH%\Downloads\OpenJDK_11.0.7\bin\java" -jar "%HOMEPATH%\Downloads\netx.jar"
From here, just enter the CommServe Hostname and click OK.
Then wait for Java to run a few commands
and within a few seconds you are prompted to log into your CommCell Console.
Whilst not as simple and convenient as the old Java Web Start way, it is the safest way of running the console remotely without having to Remote Desktop into Commvault Servers.
Back in August, I discovered an issue that impacts Commvault native dump backups of Amazon RDS SQL Server and will affect all users who are backing up these databases in a time zone other than UTC+0. This blog goes into some detail about the problem and why you must be careful how you restore your backups. A Diagnostic Fix is currently available on request from Commvault and will go mainstream with the November 2020 Maintenance Pack, but users need to be aware that this fix will not retrospectively resolve past Amazon RDS SQL Server backups.
How to reproduce the issue
The steps described here are for the Commvault CommCell Console, but the reproduction steps are just as relevant to the Command Center.
Attempt a Browse and Restore by Job
Click View Content
Error: There is no data to restore. Verify that the correct dates have been entered.
So what happens if you really need to restore that backup and you browse and restore by Time Range?
Our first backup job here finished at 7:44:52PM on the 26th of August, 2020, and I have chosen a restore by End Time that is 9:59 ahead of when the backup occurred (9:59 = +10 hours – 1 minute).
However, if I repeat the Browse and Restore by End Time but choose a time 10:01 ahead of the backup…
And voila, Commvault was able to retrieve the list of SQL Server database backups from the Job!
And the problem is?
The “There is no data to restore. Verify that the correct dates have been entered” error only appears if there are no backups on or before the Browse by Date and Time. Whenever an error message comes up, you clearly know that you have to take corrective action. The problem here, however, is that when you browse and restore this way, it is quite likely that you will restore either an older or a newer backup, and the backup operator will not even know until the DBA discovers the error.
So a restore by Date and Time requires the Backup Operator to do a time calculation. For many customers that work in a single time zone, this may be quite straightforward. However, extra care must be taken when restoring databases that could be in different time zones.
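The calculation above can be sketched in a few lines. This assumes, as the reproduction suggests, that the affected builds record the dump time at a UTC offset from the local finish time, so the Browse by End Time must be at least the local finish time plus the UTC offset; the one-minute margin mirrors the 10:01 value that worked in the test.

```python
from datetime import datetime, timedelta


def browse_end_time(local_finish: datetime, utc_offset_hours: int,
                    margin_minutes: int = 1) -> datetime:
    """Earliest Browse by End Time that will find an affected RDS backup.

    Assumption: the job's dump time is stored offset by the zone's UTC
    offset, so the end time must be local finish + offset (plus a margin).
    """
    return local_finish + timedelta(hours=utc_offset_hours,
                                    minutes=margin_minutes)


# Example from this blog: backup finished 7:44:52PM, 26 Aug 2020, UTC+10.
finish = datetime(2020, 8, 26, 19, 44, 52)
end_time = browse_end_time(finish, utc_offset_hours=10)
# end_time -> 2020-08-27 05:45:52, i.e. 10 hours 1 minute after the finish
```

A helper like this (or even a laminated cheat sheet) saves the operator from doing the offset arithmetic by hand during an incident.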
The Good, the Bad and the Ugly
The good news is that there is a Diagnostic Hotfix from Commvault that needs to be installed on the Access Node, and I can confirm that it was prepared for at least Commvault FR11.20.17 and for FR11.21.5. Contact Commvault support if you just want this Diagnostic Fix, or install the November 2020 Maintenance Pack across your CommCell. This will re-enable you to do a Browse and Restore by Job without having to browse by date.
The bad news is that it does not retrospectively fix the job history. Why? Sadly, it is just too risky for Commvault to run an update script against the sqlDBBackupInfo table to correct the Database Backup times, because there is no safe way to do it globally for all time zones.
The ugly is that your backup operators need to be aware of when this patch was applied, so they know for which dates a Browse and Restore by Job will work, and when they must instead restore by providing a date range as described in this blog.
Within Commvault, Indexing is the process that creates and maintains metadata about all content that Commvault protects, either through backup or archive processes. Indexing is core to browsing backed-up content, facilitating restores, and allowing analytics and reporting on protected content.
Prior to the release of Commvault Version 11 in 2015, the Commvault Indexing Architecture had largely remained unchanged since inception. Commvault refers to this technology as Indexing Version 1. In Indexing Version 1, a backup would write the index file to the Index Cache on the MediaAgent and each backup job created its own index. Upon completing the backup to the storage library, the Archive Index Job Phase would copy the index to the storage library. In addition, Commvault restores Indexes into the Index Cache (Commvault skips this step if the Index is present in the Index Cache) before subsequent incremental/differential backups and appends their index data to it, and then backs up the resulting complete index. As such, the Index for each job increases considerably in size until the next full backup is completed.
With Indexing Version 2 technology (introduced in Commvault Version 11), the Index Cache on the MediaAgent holds Action Logs that are replayed into an Indexing Server MediaAgent’s Index Cache, which is then backed up to the storage library. The transition to Indexing Version 2 has not been an easy one for the developers at Commvault. It has essentially required a re-development of each Commvault “Intelligent DataAgent” (known as iDataAgent).
At the time of writing, the iDataAgents supported with Indexing Version 2 include:
VMware Virtual Server Agent (Early Release)
Microsoft Windows File Systems Agent
UNIX File System Agent
Exchange Mailbox Agent
Microsoft SQL Server (IntelliSnap and Block-Level backups only)
Whilst many of the original indexing concepts have been preserved in Indexing Version 2, these changes have resulted in a significant improvement in the efficiency of the Indexing process, along with a number of additional features and benefits:
- Index creation occurs asynchronously to the data backup process
- Action logs and a cumulative index
- Deleted items shown across backup cycles: during browse operations, iDataAgents that are configured to use Indexing Version 2 show deleted items across backup cycles. This is a change from Indexing Version 1 agents, which only showed deleted items in the current cycle.
- Better support for GridStor: when GridStor is deployed, Indexing Version 2 removes the time and disk space required to copy an Index between MediaAgents if the backup process fails over to a different MediaAgent.
- Enables additional operations
Introducing Indexing Version 2 for VMware – #V2!
The Indexing Version 2 for VMware Virtual Server Agent is currently an Early Release feature and, whilst Early Release features are best suited to new or controlled environments, this blog article focuses on how you can take advantage of the benefits of Indexing Version 2 for VMware right now. For businesses with VMware workloads, the Virtual Server Agent for VMware often has the largest footprint of all the Agents, and can certainly benefit from the Indexing improvements that come with this release.
Commvault have implemented Indexing Version 2 for the VMware Virtual Server Agent as part of a new approach known as “VM-Centric Operations for Virtual Server Agent with VMware”. Previously, when a VMware subclient was configured to protect multiple VMs, each backup would register just one backup job for all the clients protected. That meant it was not possible to drop the retention or delete the backups of individual VMs without impacting the entire protected subclient. This long-standing limitation has finally been overcome: VMware client backups now spawn a new job number for each VM that is discovered for backup. You also now have the option to temporarily enable/disable backup activity for each VM client instance manually, in addition to the legacy method of modifying the VSA subclient properties. Restores can also be multi-streamed, so you no longer have to manually create separate jobs to restore multiple VMs in the same subclient.
Implementation Considerations – V1 backup in the hand vs V2 in the bush.
Brilliant, right? Well yes, but be careful to read the fine print. When you apply a Service Pack to Commvault, registered clients that were using Indexing Version 1 do not automatically get upgraded to Version 2. The upgrade has to be performed manually and requires running the “Upgrade to Indexing V2” workflow, which is available from the Commvault Store. Also, not all of the client agents with a Job History currently support the upgrade workflow, which means a new client registration that has not yet been backed up needs to be configured and then upgraded with the workflow. The agents with Job History that support the upgrade workflow are currently limited to:
- Microsoft Windows File Systems Agent
- UNIX File System Agent
- Macintosh Agent
As you can see, Indexing Version 2 support is limited to new VMware clients only, unless the client has no backup job history. The way around this in existing environments is to create a new, uniquely named Virtualisation Client. You can still use your existing Storage Policies and you can still preserve the deduplication benefits on the new Virtualisation Client. It is an administrative nuisance, but as long as you are not low on storage library capacity, this alone is not going to be too much of a problem.
You should also be aware that the fine print states that if you choose the Synthetic Full option of performing an Incremental backup before/after, only the synthetic full backup operation runs. This means that if you reuse one of your existing schedule policies with an Incremental before the Synthetic Full, you will always miss a Recovery Point; a separate schedule policy should therefore be created with a Daily Incremental and a separate Synthetic Full that runs outside the hours in which the Incremental is expected to complete.
In addition, you may be confused when Commvault Documentation states that you cannot perform a VM-Centric backup to tape, but this only means the first phase of the backup cannot be a tape copy. Commvault still supports Virtual Machine backup copies on tape, so long as it is an Auxiliary Copy after first being backed up to disk, which is most common these days. However, just as you can no longer perform a VM-Centric backup directly to tape, you can also no longer restore these VMs directly from tape. This is because the Commvault MediaAgent uses Live Browse technology, which mounts the VM directly from the storage, and the way Storage Policy precedence has been developed in Commvault prevents recalling the Tape Copy Job back onto the Primary Disk Copy. The only way around this limitation is an additional disk-based Storage Policy Copy: a Selective Copy of the Virtual Machine onto temporary staging storage, where it can then be restored from disk using the Live Browse technology.
Now you may think you can get around this limitation by using “Cloud Storage Archive”, but support for this feature is limited.
Cloud Storage Archive Recall Support via Commvault Workflow
Recalls of the following agents with Indexing Version 2 clients are supported using the Cloud Storage Archive Recall workflow:
- Microsoft Windows File System
- Macintosh File System
- Unix File System
- NAS NDMP agent support for non-deduplicated data
Cloud Storage Archive Recall Support via Commvault Command Line
- Agents using Indexing Version 2 (use the Commvault Workflow for the ones listed above)
- Agents using Indexing Version 1
- Agents with Live Browse option
- Database Agents
- NAS agent with deduplicated data
- Persistent Recovery used for file system stub recalls
The recall of V1 and V2 Indexed VMware Clients is supported, but currently it must be performed using the legacy Commvault Command Line feature; it is not supported with the superior Commvault Workflow method for recalling out of an archive tier. The legacy Command Line method means ALL the deduplicated data in the archive tier will need to be recalled for the restore, which takes extra time and incurs additional cloud storage costs. Perfekt is of the understanding that Commvault developers are working on a way to minimise this recall cost, but “Cloud Storage Archive” remains a viable option for those who believe they will restore infrequently from an Archive Tier and who prefer the flexibility that cloud offers over tape backup media.
Finally, there is a significant gotcha that is not easy to pick up from the Commvault Online Documentation and that can affect Selective Storage Policy Copies. Selective Copies give backup administrators the option to protect all or just some of the subclients protected in the Primary and Synchronous Storage Policy Copies. Prior to “VM-Centric Operations for VSA with VMware”, you only needed to select the VMware subclient to send, say, your monthly Synthetic Full copies to tape. Now, with the introduction of “VM-Centric Operations for VSA with VMware”, a synthetic full backup for the VSA subclient will kick off a VM-Centric Synthetic Full operation for all the VMs, but there is no Synthetic Full Job Id for the VMware subclient. This means only Traditional Fulls, Incrementals and Differential jobs are logged against the VMware subclient, and thus under this configuration only the Traditional Full backups (which is usually just the first backup) would qualify to be copied to tape. Unfortunately, this is a technical oversight, and Perfekt is working with Commvault to have the issue patched. For organisations with a significant number of Virtual Machines that have been streamlining what they Auxiliary Copy, this can potentially be an administrative nightmare, as the backup administrator would have to constantly select and deselect all the VMs discovered in the last backup cycle. Whilst organisations who fall into this category may be put off from transitioning to Indexing Version 2 solely because of this issue, there is a solution: create a new Storage Policy just for the VMware backups and choose all VMs in the Selective Copy.
Perfekt’s position on Indexing Version 2 for VMware – (V2 or not V2? That is the question)
In conclusion, whilst there are quite a few gotchas in this Early Release of Indexing Version 2 for VMware, Commvault developers are certainly committed to reducing the number of limitations and improving the features on offer. Perfekt sees Indexing Version 2 with VMware as highly desirable, and we recommend that our clients consider transitioning to it for the protection of VMware workloads. Businesses that are protecting their VMware environment with a Medium specification or greater MediaAgent, and are not running critically low on free space in their Disk Libraries, can certainly benefit by transitioning to Indexing Version 2. But there is nothing stopping smaller businesses either, as the ability to purge jobs for individual VMs is going to be just as welcome. Also, bear in mind that this is still an Early Release feature, and if you do feel that you need support, both Commvault Support and the highly trained engineers here at Perfekt (a Platinum Partner for Commvault) are well positioned to assist and maximise your organisation’s protection potential.
End Note: What does this mean for Microsoft Hyper-V?
There is no official news yet of when Indexing Version 2 for the Virtual Server Agent is going to include Microsoft Hyper-V. Commvault have a strong history in supporting Microsoft’s Hypervisor technology, even going as far as developing their own Changed Block Tracking (CVCBT) driver for Windows Server 2012 R2 Hyper-V before Microsoft introduced Resilient Change Tracking (RCT) in Windows Server 2016. Hyper-V customers should not feel that they are going to be left out for long as this feature is currently under development.