Deduplication in Alike

One of the most powerful features of Alike is its ability to globally deduplicate data on both the source and target side, saving you storage and money.

Deduplication is seamless and occurs in two places during the 'data acquisition' phase of your backups: on the source side and on the target side.

Source Side Dedup

Source-side deduplication is performed by the ABD or Q-Hybrid agent, depending on your job and platform. The A2 pre-generates a "munge-cache", which the agent/ABD uses to determine whether the data being acquired needs to be transferred to the A2 for storage. This cache is typically based on previous backups of the system being protected. For the "initial" backup of a VM, where no previous backup exists, Alike will try to build a cache from "similar" systems (if the "loose dedup cache matching" option is enabled). If no similar system can be found, source-side deduplication is disabled for that job, and all data from the source system must be transferred to the A2 for storage.
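
The decision described above can be illustrated with a minimal Python sketch. It is not Alike's implementation: the format of the munge-cache is not documented here, so the fixed-size blocks, the SHA-256 fingerprints, and every function name below are assumptions made purely for illustration.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical, tiny block size so the example is easy to follow


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list:
    """Cut the data into fixed-size blocks (an assumed chunking scheme)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(block: bytes) -> str:
    """Identify a block by its SHA-256 digest (an assumed fingerprint)."""
    return hashlib.sha256(block).hexdigest()


def build_cache(previous_backup: bytes) -> set:
    """Stand-in for the pre-generated cache: fingerprints of known blocks."""
    return {fingerprint(b) for b in split_blocks(previous_backup)}


def blocks_to_transfer(current: bytes, cache) -> list:
    """Return the blocks the agent would still have to send.

    cache=None models the case where no cache could be built, so
    source-side deduplication is disabled and every block is sent.
    """
    blocks = split_blocks(current)
    if cache is None:
        return blocks
    return [b for b in blocks if fingerprint(b) not in cache]


previous = b"AAAABBBBCCCC"
current = b"AAAABBBBDDDD"
cache = build_cache(previous)
print(blocks_to_transfer(current, cache))  # only the changed block
print(blocks_to_transfer(current, None))   # everything, no cache available
```

With a cache built from the previous backup, only the one changed block is selected for transfer; without a cache, all three blocks are.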

Target Side Dedup

Once the backup data has been transferred to the A2, the Data Engine component of Alike performs the "target side deduplication" process. This is straightforward: incoming data is stored to your ADS only if it is new to Alike, and identical data that has already been stored is discarded. The savings can be very significant, since similar systems can share large amounts of data.
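
The "store only if new" rule can be sketched as a content-addressed store. This is a simplified illustration, not the Data Engine itself: the DedupStore class, its methods, and the use of SHA-256 as the block identity are all hypothetical.

```python
import hashlib


class DedupStore:
    """Toy stand-in for a deduplicating store: one copy per unique block."""

    def __init__(self):
        self._blocks = {}  # fingerprint -> block contents

    def put(self, block: bytes) -> bool:
        """Store the block if it is new. Return True if it was stored,
        False if an identical block already existed and this one was discarded."""
        key = hashlib.sha256(block).hexdigest()
        if key in self._blocks:
            return False
        self._blocks[key] = block
        return True

    def stored_bytes(self) -> int:
        """Total size of the unique blocks actually kept."""
        return sum(len(b) for b in self._blocks.values())


store = DedupStore()
# Two similar systems send the same OS block; only the first copy is kept.
print(store.put(b"os-image"))  # True
print(store.put(b"os-image"))  # False
print(store.put(b"app-data"))  # True
print(store.stored_bytes())    # 16
```

Three 8-byte blocks arrive, but only 16 bytes are kept, because the second copy of the shared block is discarded.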

Compression vs Deduplication

In addition to the deduplication techniques described above, Alike also compresses all data at its source, so any data protected by Alike is automatically compressed before it is ever sent over the network. This pre-compression can yield additional storage savings that vary from none (in the case of audio/video and other poorly compressible data) to very significant (in the case of text, logs, and other compressible data).
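
How much the savings vary with the type of data can be seen in a short Python sketch. The compression algorithm Alike uses is not named here, so zlib is only a stand-in. The sample log line is made up, and random bytes are used as a rough proxy for data that is already compressed, such as audio or video.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means more savings)."""
    return len(zlib.compress(data)) / len(data)


# Highly repetitive, log-like text compresses very well.
log_like = b"2024-01-01 INFO service started\n" * 1000

# Random bytes behave like already-compressed media: almost no savings.
random_like = os.urandom(32_000)

print(f"log-like data:    {compression_ratio(log_like):.3f}")
print(f"random-like data: {compression_ratio(random_like):.3f}")
```

The log-like sample shrinks to a small fraction of its original size, while the random sample stays at roughly its original size or slightly larger.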