CommVault Simpana Continuous Data Replicator explained

So... I have a customer who is using CommVault Simpana version 9. Within their environment there is one particular server with a lot of data on it. The allocated space on the server is somewhere between 10TB and 12TB. The data type is nothing fancy, just regular files.

They asked me to propose a solution that provides a swifter restore (better RTO & RPO), of course with minimal investment. My first idea was to use storage-based snapshots and/or replication. But for a disaster recovery solution the required investment was not justifiable!

My second idea was to split the backup from the replication part by using CommVault Continuous Data Replicator.
+ Take a backup once a day;
+ Make a replication every 2 hours.
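
To put a number on the expected gain, here is a small back-of-the-envelope sketch in Python. It is only an illustration: the function name is mine, and it assumes that each copy completes instantly and that a failure is equally likely at any moment between two copies.

```python
from datetime import timedelta


def data_loss_window(interval):
    """Return (worst case, average) data loss for copies taken every `interval`.

    Assumptions (mine, not from CommVault): each copy completes instantly,
    and a failure is equally likely at any moment between two copies.
    """
    return interval, interval / 2


for label, interval in [("daily backup", timedelta(days=1)),
                        ("CDR every 2 hours", timedelta(hours=2))]:
    worst, average = data_loss_window(interval)
    print(f"{label}: worst case {worst}, average {average}")
```

In other words, with the replication in place the worst-case loss of changes shrinks from a full day to two hours, while the daily backup remains available for older restore points.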

We continued with an acceptance test in our lab environment to verify and document:
+ how we need to configure it;
+ how VSS is used in the setup;
+ how it works.

Environment description:
+ CommVault Simpana 9;
+ Source Windows 2008R2 installed;
+ Destination Windows 2008R2 installed.

License:
Please note that Continuous Data Replicator needs to be licensed. You can verify whether you have the required license by navigating to “Control Panel > License Administration > License Details”. In the table, search for “Continuous Data Replicator for MS Windows”.

How does it work?
In the documentation we can find the following remarks:

  • “VSS is utilized on the source computer for creating snapshots and the VSS writers are used for performing quiesce/unquiesce of Exchange and SQL data. VSS is also used for creating Recovery Points. For Windows 2000 or Windows XP, snapshots on the source computer during the SmartSync Scan phase are created by QSnap, and the default quiesce method is used for quiescing the applications. Recovery Point snapshots on the destination computer can be created using VSS to create Shadow Copies for the snapshots.”
  • “Snapshots represent a point-in-time of the data that can be used for various data protection operations. A snapshot is essentially an instantaneous set of pointers to the original data (sometimes referred to as a logical view) as it was at a given point-in-time. When the original data is changed, the pointers will trigger a copy of the original data block; this maintains the snapshot, allowing data protection operations to proceed without interruption.”
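
The copy-on-write idea from the second remark can be illustrated with a toy model in Python. This is a sketch of the general technique only, not of how VSS or QSnap are implemented internally; the `Volume` class and its methods are hypothetical names I made up for the example.

```python
class Volume:
    """Toy block device with copy-on-write snapshots (illustration only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def create_snapshot(self):
        # A new snapshot stores nothing yet: every block still
        # "points" at the live data on the volume.
        snap = {}
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Before overwriting, preserve the original block for every
        # snapshot that has not saved its own copy of this block yet.
        for snap in self.snapshots:
            snap.setdefault(index, self.blocks[index])
        self.blocks[index] = data

    def read_snapshot(self, snap, index):
        # Preserved copy if the block changed after the snapshot was
        # taken, otherwise the (unchanged) live block.
        return snap.get(index, self.blocks[index])


vol = Volume(["A", "B", "C"])
snap = vol.create_snapshot()
vol.write(1, "B2")

print(vol.blocks)                                      # ['A', 'B2', 'C']
print([vol.read_snapshot(snap, i) for i in range(3)])  # ['A', 'B', 'C']
print(snap)                                            # {1: 'B'}
```

Creating the snapshot is instantaneous because nothing is copied up front; only the one block that was overwritten afterwards ends up being stored twice.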

So let’s put it to the test!

I installed the Continuous Data Replicator agent on both machines (source & destination) in our lab environment. Once done, I created a Continuous Data Replicator source folder on the source server (D:\CDR-SOURCE) and placed some data in it (approx. 500 MB). On the destination machine I created a Continuous Data Replicator folder (D:\CDR-DESTINATION), which remained empty as it will be filled by Continuous Data Replicator.

The replication is asynchronous (near-real-time). The time between two replication jobs can be configured by navigating to “Control Panel > Job Management > Job Updates > State update interval for ContinuousDataReplicator”. In production we should change this from 15 minutes (the default value) to 120 minutes (2 hours). Please note this is a global parameter which impacts all replication sets within the environment!

In the CommVault Administrative Console I created a Replication Set to include both servers. Afterwards I created a data pair that replicates data from source to destination. Once I had kicked off the initial replication, I started checking and documenting the logs to gain a better understanding of how it works.

Once again, please note we only use regular file data today. Replication of data used by applications (SQL, etc.) can require a different approach. (We’ll put this to the test at a later time!)

Once the replication is initiated, we notice a VSS snapshot being created on the source machine. This surprised me a bit, as I thought it would only use VSS for application data (SQL, AD, etc.).
C:\Users\rendersr>vssadmin list shadows
Contents of shadow copy set ID: {59736a3f-7ecf-40f4-bd62-d5b762081460}
Contained 1 shadow copies at creation time: 5/28/2014 9:26:54 AM
Shadow Copy ID: {08190d29-50a8-437a-9b14-565410162fc3}
Original Volume: (D:)\\?\Volume{d110e211-0da6-11e2-a251-005056a87ba8}\
Shadow Copy Volume: \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy3
Originating Machine: server1.example.com
Service Machine: server1.example.com
Provider: 'Microsoft Software Shadow Copy provider 1.0'
Type: ApplicationRollback
Attributes: Persistent, No auto release, Differential
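
If you want to track these snapshots during a longer test, the output above is easy to parse. Below is a minimal Python sketch that works on the text pasted from my lab; the function name `parse_shadows` is my own. On a live system you could feed it the output of "vssadmin list shadows" captured from an elevated prompt, but I only tried it against this sample, so treat the field handling as an assumption.

```python
import re

# Output of "vssadmin list shadows" as pasted above (indentation removed).
SAMPLE = r"""Contents of shadow copy set ID: {59736a3f-7ecf-40f4-bd62-d5b762081460}
Contained 1 shadow copies at creation time: 5/28/2014 9:26:54 AM
Shadow Copy ID: {08190d29-50a8-437a-9b14-565410162fc3}
Original Volume: (D:)\\?\Volume{d110e211-0da6-11e2-a251-005056a87ba8}\
Shadow Copy Volume: \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy3
Originating Machine: server1.example.com
Service Machine: server1.example.com
Provider: 'Microsoft Software Shadow Copy provider 1.0'
Type: ApplicationRollback
Attributes: Persistent, No auto release, Differential
"""


def parse_shadows(text):
    """Return one dict per shadow copy found in the vssadmin output."""
    shadows, current, set_id = [], None, None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Contents of shadow copy set ID:"):
            # A new shadow copy set starts; remember its ID.
            set_id = line.split(": ", 1)[1]
            current = None
        elif re.match(r"Shadow Copy ID: \{[0-9a-fA-F-]+\}$", line):
            current = {"Shadow Set ID": set_id,
                       "Shadow Copy ID": line.split(": ", 1)[1]}
            shadows.append(current)
        elif current is not None and ": " in line:
            # Every remaining "Key: value" line belongs to the current copy.
            key, value = line.split(": ", 1)
            current[key] = value
    return shadows


shadows = parse_shadows(SAMPLE)
for shadow in shadows:
    print(shadow["Shadow Copy ID"], shadow["Type"], shadow["Attributes"])
```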

In the logfiles we notice the following entries (in CDR.log on the source machine):
“2752 b38 05/28 09:26:44 1919 CDRFileXfer::StartMonitorAndCreateSnap() – [A:59] ConfigId=3, StartLogNo=546, LogOffset=563
2752 b38 05/28 09:26:44 1919 CDRFileXfer::CreateSnap() – [A:59] Creating snap using VSS
2752 b38 05/28 09:26:44 #### CJournalOperations::QueryFSUsnInfo() – Successfully queried change journal on Volume \\?\Volume{d110e211-0da6-11e2-a251-005056a87ba8}, JournalId=0x1CDAC99DB5B13BD (10/17/2012 9:01:47 PM), NextUsn=13085016
2752 b38 05/28 09:26:44 1919 CreateShadow() – Starting with taking snapshot of volume [\\?\Volume{d110e211-0da6-11e2-a251-005056a87ba8}\]
2752 b38 05/28 09:26:44 1919 CreateShadow() – Setting Provider GUID to [b5946137-7b9f-4925-af80-51abd60b20d5]
2752 b38 05/28 09:26:52 1919 CreateShadow() – Selecting volume to snap: [\\?\Volume{d110e211-0da6-11e2-a251-005056a87ba8}\]
2752 b38 05/28 09:26:54 1919 CreateShadow() – Shadow objects size: 1
2752 b38 05/28 09:26:54 1919 CreateShadow() – Successfully created shadow of volume [\\?\Volume{d110e211-0da6-11e2-a251-005056a87ba8}\], Shadow set ID [59736a3f-7ecf-40f4-bd62-d5b762081460], Shadow Copy ID [\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy3]
2752 b38 05/28 09:26:54 #### CJournalOperations::QueryFSUsnInfo() – Successfully queried change journal on Volume \\?\Volume{d110e211-0da6-11e2-a251-005056a87ba8}, JournalId=0x1CDAC99DB5B13BD (10/17/2012 9:01:47 PM), NextUsn=13086536
2752 b38 05/28 09:26:54 1919 CDRFileXfer::StartMonitorAndCreateSnap() – [A:59] Logged DONE entry”

And…
“2752 b38 05/28 09:27:04 1919 DeleteShadow() – Successfully deleted shadow with shadow set ID [59736a3f-7ecf-40f4-bd62-d5b762081460] for volume [\\?\Volume{d110e211-0da6-11e2-a251-005056a87ba8}\]
2752 b38 05/28 09:27:05 1919 CDRFileXfer::SendLogUpdateInfo() – [A:59] Sent log update message to destination with following information: Start log number 546, StartUsn=563
2752 b38 05/28 09:27:05 1919 CDRFileXfer::Done() – [A:59] bSmart=1, bSmartOnly=0, bDelete=0”
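
To see how long the snapshot actually lived, the create and delete entries can be matched on the shadow set ID. The Python sketch below does this for the two relevant lines copied from CDR.log above. The column meanings (process ID, thread ID, job ID) are my guess based on what the lines look like, and the year 2014 is taken from the vssadmin output because the log itself does not record it.

```python
import re
from datetime import datetime

# Assumed layout (my interpretation of the columns):
# <process id> <thread id> <MM/DD> <HH:MM:SS> <job id or ####> <function>() - <message>
LOG_LINE = re.compile(
    r"^(?P<pid>\d+) (?P<thread>[0-9a-f]+) (?P<date>\d\d/\d\d) (?P<time>\d\d:\d\d:\d\d) "
    r"(?P<job>\S+) (?P<func>\S+\(\)) [-\u2013] (?P<msg>.*)$"
)
SET_ID = re.compile(r"[Ss]hadow set ID \[([0-9a-f-]+)\]")

LINES = [
    r"2752 b38 05/28 09:26:54 1919 CreateShadow() – Successfully created shadow of volume [\\?\Volume{d110e211-0da6-11e2-a251-005056a87ba8}\], Shadow set ID [59736a3f-7ecf-40f4-bd62-d5b762081460], Shadow Copy ID [\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy3]",
    r"2752 b38 05/28 09:27:04 1919 DeleteShadow() – Successfully deleted shadow with shadow set ID [59736a3f-7ecf-40f4-bd62-d5b762081460] for volume [\\?\Volume{d110e211-0da6-11e2-a251-005056a87ba8}\]",
]


def parse(line, year=2014):
    """Split one log line into its fields; return None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["when"] = datetime.strptime(
        f"{year}/{record['date']} {record['time']}", "%Y/%m/%d %H:%M:%S")
    return record


def shadow_lifetimes(lines):
    """Map each shadow set ID to the time between its creation and deletion."""
    created, lifetimes = {}, {}
    for record in filter(None, map(parse, lines)):
        match = SET_ID.search(record["msg"])
        if match is None:
            continue
        set_id = match.group(1)
        if record["func"] == "CreateShadow()":
            created[set_id] = record["when"]
        elif record["func"] == "DeleteShadow()" and set_id in created:
            lifetimes[set_id] = record["when"] - created[set_id]
    return lifetimes


lifetimes = shadow_lifetimes(LINES)
for set_id, lifetime in lifetimes.items():
    print(f"{set_id}: snapshot existed for {lifetime.total_seconds():.0f} seconds")
```

For this small test set the snapshot was deleted again 10 seconds after it was created, which fits the idea that it is only needed while the initial scan and transfer run.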

After 15 minutes, we receive the following (in CDR.log on the source machine):
“2752 b80 05/28 10:42:40 #### CJournalOperations::SetCDRFSUsnInfo() – Successfully queried change journal on Volume \\?\Volume{d110e211-0da6-11e2-a251-005056a87ba8}, JournalId=0x1CDAC99DB5B13BD (10/17/2012 9:01:47 PM), FirstUsn=0, NextUsn=13091536
2752 b80 05/28 10:42:40 #### CJournalOperations::SetCDRFSUsnInfo() – Successfully set USN in change journal on Volume \\?\Volume{d110e211-0da6-11e2-a251-005056a87ba8}, JournalId=0x1CDAC99DB5B13BD (10/17/2012 9:01:47 PM), FirstUsn=0, NextUsn=13091536
2752 b3c 05/28 10:42:40 #### CDRLogReader::SendJournalEntry() – Attempting to send log 586, length=0xa00, RepSetMask=0x1, destination=gdcw2k8bk802.dir.ucb-group.com
2752 b3c 05/28 10:42:40 1919 CDRLogReader::InitializeJEProcessing() – Informing PairId: 59 of [MoveIn] operation with USN 729 in log number 586
2752 13a0 05/28 10:42:40 #### StartCDRObject() – Object LOGREADER (3) is already active; will not continue, Object=0x0000000009076400
2752 13a0 05/28 10:42:40 1919 CDRFileXfer::CheckSubstateAndNextAction() – [A:59] Pair is in SubState DONE (1000); Moving the pair to replicating state
2752 13a0 05/28 10:42:40 1919 CDRFileXfer::CheckSubstateAndNextAction() – [A:59] Leaving, State=DONE (8), SubState=DONE (1000)
2752 13a0 05/28 10:42:40 1919 CDRPendingJEsFile::ReadJE() – Read JE in pending JEs file with USN 729; This JE’s offset is 512 and its size is 1024”

Conclusion:
In the first phase, a VSS snapshot is created during a full resync process.
Once done, the change journal is consulted for changes every 15 minutes (a customizable value at CommCell level).
Based on the USN (update sequence number), the set of changed or new files is copied to the CDR destination.
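
The USN-based selection can be pictured with a toy model in Python. This is not the NTFS change journal API and not CommVault's code, just my simplified picture of the idea: every change gets a sequence number, and each cycle only looks at entries at or after the checkpoint saved by the previous cycle. All names and USN values are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class JournalEntry:
    usn: int      # update sequence number, increases with every change
    path: str     # file that changed
    reason: str   # what happened (simplified)


def changes_since(journal, next_usn):
    """Return (files to copy, new checkpoint) for entries at or after next_usn."""
    pending = [entry for entry in journal if entry.usn >= next_usn]
    # A file that changed several times only needs to be copied once.
    paths = sorted({entry.path for entry in pending})
    new_checkpoint = max((entry.usn for entry in pending), default=next_usn - 1) + 1
    return paths, new_checkpoint


journal = [
    JournalEntry(100, "first file.txt", "create"),
    JournalEntry(101, "second file.txt", "create"),
    JournalEntry(102, "first file.txt", "modify"),
]

paths, checkpoint = changes_since(journal, 101)
print(paths, checkpoint)   # ['first file.txt', 'second file.txt'] 103

paths_later, checkpoint_later = changes_since(journal, checkpoint)
print(paths_later, checkpoint_later)   # [] 103
```

The second call returns nothing because no entries were added after the checkpoint, which is why an idle interval costs almost nothing.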
The structure of the destination is as follows:
PS D:\CDR-DESTINATION> gci -recurse
    Directory: D:\CDR-DESTINATION\D\CDR-SOURCE

Mode   LastWriteTime          Length  Name
----   -------------          ------  ----
-a---  5/22/2014  5:24 PM          3  first file.txt
-a---  3/13/2014 11:16 AM  338739200  hp_9_SP14_01302014.tar
-a---  3/13/2014 11:19 AM   14120960  linux_9.0.0B84_JavaUpdate7u45Part4(46617).tar
-a---  3/13/2014 11:19 AM  909096960  linux_9_SP14_01302014.tar
-a---  5/27/2014  5:58 PM          3  second file.txt
-a---  3/13/2014 11:19 AM    4929488  WinX64_9.0.0B84_JavaUpdate7u45Part4(46617).exe
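
Judging from this listing, the destination path appears to be built from the destination folder, the source drive letter and the original path below that drive. The Python sketch below reproduces that mapping. It is inferred from this single test only, and the function name is mine, so other configurations may behave differently.

```python
from pathlib import PureWindowsPath


def destination_path(source, dest_root):
    """Map a source file to where it showed up on the destination in my test."""
    src = PureWindowsPath(source)
    drive_letter = src.drive.rstrip(":")     # "D:" -> "D"
    relative = src.relative_to(src.anchor)   # strip the leading "D:\"
    return PureWindowsPath(dest_root) / drive_letter / relative


mapped = destination_path(r"D:\CDR-SOURCE\first file.txt", r"D:\CDR-DESTINATION")
print(mapped)   # D:\CDR-DESTINATION\D\CDR-SOURCE\first file.txt
```

Keeping the drive letter in the path means several source volumes could land under the same destination folder without colliding.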

Remarks:
+ This is not a replacement for backup but can be used for DR.
+ We need to verify whether the CDR replication process interferes with backup processes running on the same server. If yes, what’s the impact?
+ Performance assessment (how long does it take to replicate 1TB of data, and how long does a delta copy take based on standard change ratios).
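
Until we have real measurements, a rough estimate helps to set expectations for the performance assessment. The Python sketch below is plain arithmetic: the throughput figures and the 2% change ratio are placeholders I picked for illustration, not measured values, and the 10TB volume is the lower end of the customer's server size.

```python
def transfer_hours(size_tb, throughput_mb_s):
    """Hours needed to move size_tb terabytes at a sustained rate in MB/s."""
    size_mb = size_tb * 1024 * 1024
    return size_mb / throughput_mb_s / 3600


def delta_size_tb(total_tb, change_ratio):
    """Amount of changed data for a given change ratio (0.02 means 2%)."""
    return total_tb * change_ratio


# Placeholder throughput values; replace them with measured numbers.
for mb_s in (50, 100, 200):
    full = transfer_hours(1, mb_s)
    delta = transfer_hours(delta_size_tb(10, 0.02), mb_s)
    print(f"{mb_s} MB/s: 1TB full copy ~{full:.1f} h, 2% delta of 10TB ~{delta:.1f} h")
```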
