Incremental backup

An incremental backup is one in which successive copies of the data contain only the portion that has changed since the preceding backup copy was made.[1][2][3][4] When a full recovery is needed, the restoration process needs the last full backup plus all the incremental backups up to the point of restoration.[5] Incremental backups are often desirable because they reduce storage space usage and are quicker to perform than differential backups.[6]

Variants

Incremental

The most basic form of incremental backup consists of identifying, recording and thus preserving only those files that have changed since the last backup. Since the volume of changes is typically small, incremental backups are much smaller and quicker than full backups. For instance, following a full backup on Friday, a Monday backup will contain only those files that changed since Friday. A Tuesday backup contains only those files that changed since Monday, and so on. A full restoration of data will naturally be slower, since all increments must be restored. Should any one of the copies created fail, including the first (full), restoration will be incomplete.[7]

A Unix example would be:

    rsync -e ssh -va --link-dest="$dst/hourly.0" "$remoteserver:$remotepath" "$dst/hourly.1"
The use of rsync's --link-dest option is what makes this command an example of incremental backup: files that are unchanged relative to the copy in $dst/hourly.0 are hard-linked to it instead of being copied again, so only changed files are transferred and take up new space.

Multilevel incremental

A more sophisticated incremental backup scheme involves multiple numbered backup levels. A full backup is level 0. A level n backup will back up everything that has changed since the most recent level n-1 backup. Suppose for instance that a level 0 backup was taken on a Sunday. A level 1 backup taken on Monday would include only changes made since Sunday. A level 2 backup taken on Tuesday would include only changes made since Monday. A level 3 backup taken on Wednesday would include only changes made since Tuesday. If a level 2 backup was then taken on Thursday, it would include all changes made since Monday, because Monday's level 1 backup was the most recent level n-1 backup.

Reverse incremental

An incremental backup of the changes made between two instances of a mirror can be forward or reverse. If the oldest version of the mirror is treated as the base and the newest version as the revised version, the incremental produced is a forward incremental. If the newest version of the mirror is treated as the base and the oldest version as the revised (changed) version, the incremental produced is a reverse incremental.

With reverse incremental backups, each time a reverse incremental backup is taken, it is applied (in reverse) to the previous full (synthetic) backup, so the current full (synthetic) backup is always a backup of the most recent state of the system. This is in contrast to forward incremental backups, where the current full backup is a backup of the oldest version of the system, and to get a backup of the most recent state of the system, all of the forward incremental backups have to be applied to that oldest version successively.

Applying a reverse incremental to a mirror yields the previous version of the mirror. This gives a means to revert to any previous version of the mirror.
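The forward/reverse distinction can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular product: a mirror is modelled as a dictionary from path to content, and the names `make_delta`, `apply_delta` and `ReverseIncrementalStore` are hypothetical helpers introduced only for this example.

```python
from typing import Dict, List, Optional

Snapshot = Dict[str, str]          # path -> file content (toy model of a mirror)
Delta = Dict[str, Optional[str]]   # path -> content after applying; None = delete


def make_delta(base: Snapshot, revised: Snapshot) -> Delta:
    """Record what must change to turn `base` into `revised`."""
    delta: Delta = {}
    for path in base.keys() | revised.keys():
        if base.get(path) != revised.get(path):
            delta[path] = revised.get(path)  # None marks a file absent in `revised`
    return delta


def apply_delta(base: Snapshot, delta: Delta) -> Snapshot:
    """Return a new snapshot with `delta` applied to `base`."""
    result = dict(base)
    for path, content in delta.items():
        if content is None:
            result.pop(path, None)
        else:
            result[path] = content
    return result


class ReverseIncrementalStore:
    """Keeps the newest state as a full copy plus reverse deltas to older states."""

    def __init__(self, first_full: Snapshot) -> None:
        self.full: Snapshot = dict(first_full)
        self.reverse_deltas: List[Delta] = []  # oldest first, newest last

    def backup(self, current: Snapshot) -> None:
        # Reverse incremental: the NEW state is the base, the OLD state is the target.
        self.reverse_deltas.append(make_delta(current, self.full))
        self.full = dict(current)  # the stored full is now the most recent state

    def restore(self, versions_back: int = 0) -> Snapshot:
        if not 0 <= versions_back <= len(self.reverse_deltas):
            raise ValueError("no such version")
        snapshot = dict(self.full)
        start = len(self.reverse_deltas) - versions_back
        # Walk backwards in time, newest reverse delta first.
        for delta in reversed(self.reverse_deltas[start:]):
            snapshot = apply_delta(snapshot, delta)
        return snapshot


day0 = {"a.txt": "v1", "b.txt": "v1"}
day1 = {"a.txt": "v2", "b.txt": "v1", "c.txt": "v1"}
day2 = {"a.txt": "v2", "c.txt": "v2"}

store = ReverseIncrementalStore(day0)
store.backup(day1)
store.backup(day2)
```

In this sketch `store.restore()` returns the latest state without applying any delta, while `store.restore(2)` has to apply two reverse deltas to get back to the first backup. A forward scheme would have the opposite cost profile.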
In other words, after the initial full backup, each successive incremental backup applies the changes to the previous full, creating a new synthetic full backup every time, while maintaining the ability to revert to previous versions.

The main advantage of this type of backup is a more efficient recovery process: the most recent version of the data (which is the most frequently restored version) is a (synthetic) full backup, and no incrementals need to be applied to it during its restoration. Reverse incremental backup works for both tapes and disks, but in practice tends to work better with disks. Companies using the reverse incremental backup method include Intronis and Zetta.net.

Incremental forever

This style is similar to the synthetic backup concept. After an initial full backup, only the incremental backups are sent to a centralized backup system. This server keeps track of all the increments and sends the proper data back to the client during restores. This can be implemented by sending each incremental directly to tape as it is taken and then refactoring the tapes as necessary. If enough disk space is available, an online mirror can be maintained along with previous incremental changes, so that the current or older versions of the systems being backed up can be restored. This is a suitable method in the case of banking systems.[citation needed]

In modern cloud architectures, or disk-to-disk backup scenarios, this is much simpler. Data is broken into chunks and placed on a cloud storage system. Metadata about the chunks is stored in a persistent system, which allows the system to assemble a point-in-time backup from these chunks at restore time. There is no need to refactor tape.

Block-level incremental

This method backs up only the blocks within a file that have changed. This requires a higher level of integration between the sender and receiver.
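One common way to find changed blocks is to compare per-block checksums. The following Python sketch assumes fixed-size blocks and SHA-256 digests. The block size and the helper names (`block_hashes`, `changed_blocks`, `apply_blocks`) are assumptions chosen for illustration, not part of any specific backup tool.

```python
import hashlib
from typing import Dict, List

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Digest of every fixed-size block; kept from the previous backup."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


def changed_blocks(old_hashes: List[str], new_data: bytes,
                   block_size: int = BLOCK_SIZE) -> Dict[int, bytes]:
    """Sender side: return only the blocks whose digest differs or is new."""
    changed: Dict[int, bytes] = {}
    for index, offset in enumerate(range(0, len(new_data), block_size)):
        block = new_data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if index >= len(old_hashes) or old_hashes[index] != digest:
            changed[index] = block
    return changed


def apply_blocks(old_data: bytes, changed: Dict[int, bytes], new_length: int,
                 block_size: int = BLOCK_SIZE) -> bytes:
    """Receiver side: rebuild the new file from the old copy plus changed blocks."""
    blocks = [old_data[offset:offset + block_size]
              for offset in range(0, len(old_data), block_size)]
    block_count = (new_length + block_size - 1) // block_size
    blocks = blocks[:block_count]                      # file may have shrunk
    blocks += [b""] * (block_count - len(blocks))      # or grown
    for index, block in changed.items():
        blocks[index] = block
    return b"".join(blocks)[:new_length]


old = bytes(3 * BLOCK_SIZE)                 # three blocks of zero bytes
modified = bytearray(old)
modified[5000] = 1                          # touch one byte inside block 1
new = bytes(modified) + b"tail"             # and append a short fourth block

increment = changed_blocks(block_hashes(old), new)
rebuilt = apply_blocks(old, increment, len(new))
```

The sender needs the previous block digests and the receiver needs the previous copy of the file, which is one concrete form of the sender/receiver integration mentioned above.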
Byte-level incremental

These backup technologies are similar to the "block-level incremental" backup method; however, the byte (or binary) incremental backup method is based on the binary differences between a file and its previous backup. While block-based technologies work with relatively coarse units of change (blocks of 8K, 4K or 1K), byte-based technologies work with the minimum unit, saving space when recording a change to a file.[8] Another important difference is that they work independently of the file system. At the moment, these are the technologies that achieve the highest relative compression of the data, which is a great advantage for backups carried out over the Internet.[citation needed]

Other backup types

Synthetic full backup

A synthetic backup is an alternative method of creating full backups. Instead of reading and backing up data directly from the disk, it synthesizes the data from the previous full backup (either a regular full backup for the first backup, or the previous synthetic full backup) and the periodic incremental backups. As only the incremental backups read data from the disk, these are the only files that need to be transferred during offsite replication. This greatly reduces the bandwidth needed for offsite replication.

Synthetic backup does not always work with the same efficiency. The ratio of data uploaded from the target machine to data synchronized on the storage varies depending on disk fragmentation.[9]

Differential

A differential backup is a cumulative backup of all changes made since the last full or normal backup, i.e., the differences since the last full backup. The advantage of this is a quicker recovery time, since only a full backup and the last differential backup are required to restore the system. The disadvantage is that for each day elapsed since the last full backup, more data needs to be backed up, especially if a significant proportion of the data has changed.
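The trade-off between differential and incremental backups can be made concrete with a small Python sketch. It assumes a toy model in which backups and file modifications are identified by integer timestamps. The `Backup` class and the functions `files_to_back_up` and `restore_chain` are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Backup:
    kind: str   # "full", "incremental" or "differential"
    time: int   # when the backup was taken (arbitrary integer clock)


def reference_time(history: List[Backup], kind: str) -> int:
    """Point in time that the next backup of `kind` compares against."""
    if kind == "incremental":
        return history[-1].time                       # the preceding backup
    if kind == "differential":
        return max(b.time for b in history if b.kind == "full")  # the last full
    raise ValueError(f"unsupported backup kind: {kind}")


def files_to_back_up(mtimes: Dict[str, int], history: List[Backup],
                     kind: str) -> List[str]:
    """Files modified after the reference point for this kind of backup."""
    since = reference_time(history, kind)
    return sorted(path for path, mtime in mtimes.items() if mtime > since)


def restore_chain(history: List[Backup]) -> List[Backup]:
    """Backups that must be read, in order, to restore the latest state."""
    start = max(i for i, b in enumerate(history) if b.kind == "full")
    chain = [history[start]]
    for backup in history[start + 1:]:
        if backup.kind == "differential":
            chain = [history[start], backup]   # replaces everything since the full
        else:
            chain.append(backup)               # each incremental is needed
    return chain


mtimes = {"a.txt": 5, "b.txt": 15, "c.txt": 25}   # last modification times

incremental_history = [Backup("full", 0), Backup("incremental", 10),
                       Backup("incremental", 20)]
differential_history = [Backup("full", 0), Backup("differential", 10),
                        Backup("differential", 20)]

next_incremental = files_to_back_up(mtimes, incremental_history, "incremental")
next_differential = files_to_back_up(mtimes, differential_history, "differential")
```

Under these assumptions the next incremental copies only `c.txt`, while the next differential copies all three files. Restoring then needs three backups in the incremental case but only two (the full and the latest differential) in the differential case.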
Forward incremental-forever

A forward incremental-forever backup[10] allows the synthetic operation that creates a new full backup to be limited to the size of the incremental file, instead of the complete size of a full backup file, as would happen in a “forward mode with synthetic fulls”. The overall I/O consumed is the same as with reverse incremental, but during the backup activity itself only one write I/O is used, and the snapshot of the VM is kept open for less time than with reverse incremental; the remaining two I/Os are used to update the full backup file.