File splitting and compression

As explained in the gateway abstraction section, DataSync can use the file system to store the replicated data in files, which can then be copied to other sites using any tool.

Keep in mind that these files can be large, which increases the time needed to transport them. For this reason, DataSync performs the following operations to prepare them for transport:

Compression

Reduces the amount of data to transport by compressing files into zip archives with the "DSPackage" extension.

Decompression

The reverse process: a "DSPackage" is decompressed to obtain the original files, which are then used in an import process.
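
The compression and decompression steps can be illustrated with a minimal Python sketch. It is not DataSync's implementation: the function names `compress` and `decompress` are hypothetical, and the only detail taken from the text above is that a package is a zip archive carrying the "DSPackage" extension.

```python
import zipfile
from pathlib import Path


def compress(files, package_path):
    """Pack the given files into one zip archive (hypothetical helper).

    The caller chooses the archive name, e.g. "export.DSPackage".
    """
    with zipfile.ZipFile(package_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            # Store only the file name so the archive has no directory prefix.
            zf.write(f, arcname=Path(f).name)
    return Path(package_path)


def decompress(package_path, target_dir):
    """Extract every file from the archive and return the extracted paths."""
    with zipfile.ZipFile(package_path) as zf:
        zf.extractall(target_dir)
        return [Path(target_dir) / name for name in zf.namelist()]
```

Because the package is an ordinary zip archive in this sketch, any standard zip tool could inspect it; whether that holds for real DataSync packages is an assumption.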

Splitting

Divides large files into smaller "DSPart" files, along with a "DSIndex" file that contains metadata about all the parts. The size of each part can be configured in the gateway options.
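
As a rough illustration of splitting, the Python sketch below writes fixed-size parts and an index file. Everything beyond the "DSPart" and "DSIndex" extensions is an assumption: the function name `split_file`, the part naming scheme, and the JSON layout of the index are hypothetical and do not describe DataSync's actual file formats.

```python
import json
from pathlib import Path


def split_file(source, part_size, out_dir):
    """Split source into parts of at most part_size bytes (hypothetical helper)."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    source, out_dir = Path(source), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    with source.open("rb") as src:
        while True:
            chunk = src.read(part_size)
            if not chunk:
                break
            # Assumed naming scheme: <original name>.<sequence>.DSPart
            name = f"{source.name}.{len(parts):04d}.DSPart"
            (out_dir / name).write_bytes(chunk)
            parts.append({"name": name, "size": len(chunk)})
    # Assumed index layout: original file name plus the ordered part list.
    index_path = out_dir / f"{source.name}.DSIndex"
    index_path.write_text(json.dumps({"original": source.name, "parts": parts}))
    return index_path
```

Here `part_size` plays the role of the per-part size that the gateway options control.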

Combining

Reads the information in the "DSIndex" to validate that all required "DSPart" files are present, and then combines them into the original file.
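
The validate-then-combine order can be sketched in Python as follows. The index layout is an assumption made for illustration only: a JSON document with a `parts` list whose entries carry a `name`, stored next to the part files. The function name `combine_parts` is likewise hypothetical.

```python
import json
from pathlib import Path


def combine_parts(index_path, output_path):
    """Rebuild the original file from its parts (hypothetical helper)."""
    index_path = Path(index_path)
    index = json.loads(index_path.read_text())
    # Parts are assumed to sit in the same directory as the index.
    part_paths = [index_path.parent / p["name"] for p in index["parts"]]

    # Validate first, so no partial output is written when a part is missing.
    missing = [p.name for p in part_paths if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"missing parts: {missing}")

    with Path(output_path).open("wb") as out:
        for part in part_paths:
            out.write(part.read_bytes())
    return Path(output_path)
```

Checking for every part before writing anything means a failed transfer is reported up front instead of leaving a truncated file behind.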

Integrity check

This process checks whether files have been damaged. To do so, DataSync calculates a CRC32 code from the file's bytes.
The code is generated during the splitting process and written into the file it belongs to.
During the combining process, the code is recalculated from the file and compared against the stored one. If the two codes do not match, the file was altered or damaged.
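
The write-then-verify cycle can be sketched in Python with the standard `zlib.crc32` function. Where DataSync stores the code inside the file is not specified here, so the layout below (a 4-byte big-endian trailer after the payload) and the function names `append_crc` and `verify_crc` are assumptions for illustration only.

```python
import struct
import zlib


def append_crc(payload: bytes) -> bytes:
    """Return the payload followed by its CRC32 (assumed 4-byte trailer)."""
    return payload + struct.pack(">I", zlib.crc32(payload))


def verify_crc(stored: bytes) -> bytes:
    """Recompute the CRC32 and return the payload, or raise if it differs."""
    if len(stored) < 4:
        raise ValueError("data too short to contain a CRC32 trailer")
    payload, trailer = stored[:-4], stored[-4:]
    (expected,) = struct.unpack(">I", trailer)
    if zlib.crc32(payload) != expected:
        raise ValueError("CRC32 mismatch: data was altered or damaged")
    return payload
```

CRC32 is designed to catch accidental corruption during transport. It is not a cryptographic hash, so it does not protect against deliberate tampering.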