eXo JCR Backup Service
1 Concept
The main purpose of this feature is to restore data after system faults and repository crashes. The backup results may also be used as a content history. The eXo JCR backup service was developed from the JCR 1.8 implementation. It is an independent service available as part of the eXo JCR Extensions project.
The concept is based on the export of a workspace unit in the Full or Full + Incrementals model. A repository workspace can be backed up and restored using a combination of these modes. In all cases, at least one Full (initial) backup must be executed to mark the starting point of the backup history. An Incremental backup is not a complete image of the workspace; it contains only the changes made during some period. It is therefore not possible to perform an Incremental backup without an initial Full backup.
The Backup service may operate as a hot-backup process at runtime on an in-use workspace. In this case the Full + Incrementals model should be used to guarantee data consistency during restoration. The Incremental backup runs from the start point of the Full backup, so it also contains the changes that occurred while the Full backup was running.
A restore operation mirrors a backup operation. At least one Full backup must be restored to obtain a workspace corresponding to some point in time. Incrementals may then be restored in order of creation to reach the required state of the content. If an Incremental contains the same data as the Full backup (hot backup), the changes are applied again as if they were made in the normal way via API calls.
According to the model, there are several modes for the backup logic:
- Full backup only : single operation, runs once
- Full + Incrementals : Start with an initial Full backup and then keep incremental changes in one file. Runs until it is stopped.
- Full + Incrementals(periodic) : Start with an initial Full backup and then keep incrementals with periodic result file rotation. Runs until it is stopped.
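The restore rule described above (one Full image first, then the Incrementals in creation order) can be illustrated with a small toy model. This is only a sketch: the workspace is reduced to a map of paths to values, and the names BackupModelSketch, Change, apply and restore are hypothetical and not part of the eXo JCR API. It also shows why replaying changes already contained in a hot-backup Full image is harmless in this simplified model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these types belong to eXo JCR.
class BackupModelSketch {

    /** One recorded change: set a path to a value, or remove the path when value is null. */
    static final class Change {
        final String path;
        final String value;

        Change(String path, String value) {
            this.path = path;
            this.value = value;
        }
    }

    /** Applies one incremental (an ordered list of changes) to a workspace image. */
    static void apply(Map<String, String> workspace, List<Change> incremental) {
        for (Change c : incremental) {
            if (c.value == null) {
                workspace.remove(c.path);
            } else {
                workspace.put(c.path, c.value);
            }
        }
    }

    /** Restores a workspace: the Full image first, then every Incremental in creation order. */
    static Map<String, String> restore(Map<String, String> full, List<List<Change>> incrementals) {
        Map<String, String> workspace = new LinkedHashMap<>(full);
        for (List<Change> incremental : incrementals) {
            apply(workspace, incremental);
        }
        return workspace;
    }

    public static void main(String[] args) {
        // "/b" was written while the Full backup was running (hot backup),
        // so it is present both in the Full image and in the first Incremental.
        Map<String, String> full = new LinkedHashMap<>();
        full.put("/a", "1");
        full.put("/b", "2");

        List<List<Change>> incrementals = new ArrayList<>();
        incrementals.add(Arrays.asList(new Change("/b", "2"), new Change("/c", "3")));
        incrementals.add(Arrays.asList(new Change("/a", null), new Change("/c", "4")));

        System.out.println(restore(full, incrementals)); // prints {/b=2, /c=4}
    }
}
```

Swapping the two Incrementals in this model would leave "/c" at "3", which is why the order of creation matters during restoration.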
2 How it works
2.1 Implementation details
Full backup/restore is implemented using the JCR SysView Export/Import. Workspace data is exported as SysView XML starting from the root node. Restore is implemented using a special eXo JCR API feature: dynamic workspace creation. Restoring a workspace Full backup creates a new workspace in the repository; the SysView XML data is then imported as the root node.
Incremental backup is implemented using the eXo JCR ChangesLog API. This API records each JCR API call as atomic entries in a changes log. The Incremental backup uses a listener that collects these logs and stores them in a file. Restoring an Incremental backup consists of applying the collected set of ChangesLogs to a workspace in the correct order.
2.2 Work basics
The Backup service is built around the BackupConfig configuration and the BackupChain logical unit. BackupConfig describes the backup operation chain that the service will perform. The configuration must be prepared before the backup is started. It contains the following values:
- types of full and incremental backup (fullBackupType, incrementalBackupType) - Strings with the fully qualified names of the classes that implement each backup type.
- incremental period - the period, in seconds (long), after which the current backup is stopped and a new one is started.
- target repository and workspace names - Strings with the names.
- destination directory for result files - String with the path to the folder where operation result files will be stored.
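Because the configuration has to be ready before the backup starts, it can help to check the values up front. The following is a minimal sketch of such a pre-flight check. BackupSettingsCheck and validate are hypothetical helpers written for illustration, not part of the Backup service, and whether the service performs similar checks itself is not covered here.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: validates the values that would later go into a BackupConfig.
class BackupSettingsCheck {

    /** Returns a list of problems; an empty list means the settings look usable. */
    static List<String> validate(String repository, String workspace,
                                 File backupDir, long incrementalPeriodSeconds) {
        List<String> problems = new ArrayList<>();
        if (repository == null || repository.trim().isEmpty()) {
            problems.add("repository name is missing");
        }
        if (workspace == null || workspace.trim().isEmpty()) {
            problems.add("workspace name is missing");
        }
        if (backupDir == null) {
            problems.add("destination directory is missing");
        } else if (backupDir.exists() && !backupDir.isDirectory()) {
            problems.add("destination path is not a directory: " + backupDir);
        }
        if (incrementalPeriodSeconds < 0) {
            problems.add("incremental period must not be negative");
        }
        return problems;
    }

    public static void main(String[] args) {
        File dir = new File(System.getProperty("java.io.tmpdir"), "backup-ws1");
        // Valid-looking settings, then settings with an empty name and a negative period.
        System.out.println(validate("db1", "ws1", dir, 3600));
        System.out.println(validate("", "ws1", dir, -1));
    }
}
```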
3 Configuration
As an optional extension, the Backup service is not enabled by default; you need to enable it via configuration. Below is an example configuration compatible with JCR 1.9.3 and later:
<component>
  <key>org.exoplatform.services.jcr.ext.backup.BackupManager</key>
  <type>org.exoplatform.services.jcr.ext.backup.impl.BackupManagerImpl</type>
  <init-params>
    <properties-param>
      <name>backup-properties</name>
      <property name="default-incremental-job-period" value="3600" /> <!-- set default incremental period = 60 minutes -->
      <property name="full-backup-type" value="org.exoplatform.services.jcr.ext.backup.impl.fs.FullBackupJob" />
      <property name="incremental-backup-type" value="org.exoplatform.services.jcr.ext.backup.impl.fs.IncrementalBackupJob" />
      <property name="backup-dir" value="target/backup" />
    </properties-param>
  </init-params>
</component>
- incremental-backup-type (since 1.9.3) : the FQN of the incremental backup job class. Must implement org.exoplatform.services.jcr.ext.backup.BackupJob
- full-backup-type (since 1.9.3) : the FQN of the full backup job class. Must implement org.exoplatform.services.jcr.ext.backup.BackupJob
- default-incremental-job-period (since 1.9.3) : the period between incremental flushes (in seconds)
- backup-dir : the path to a working directory where the service will store internal files and chain logs.
4 Usage
4.1 Perform a Backup
In the following example we create a BackupConfig bean for the Full + Incrementals mode, then ask the BackupManager to start the backup process.
// Obtain the backup service from the eXo container.
BackupManager backup = (BackupManager) container.getComponentInstanceOfType(BackupManager.class);
// And prepare the BackupConfig instance with custom parameters.
// full backup & incremental
File backDir = new File("/backup/ws1"); // the destination path for result files
backDir.mkdirs();
BackupConfig config = new BackupConfig();
config.setRepository(repository.getName());
config.setWorkspace("ws1");
config.setBackupDir(backDir);
// Before 1.9.3, you also need to indicate the backup job class FQNs
// config.setFullBackupType("org.exoplatform.services.jcr.ext.backup.impl.fs.FullBackupJob");
// config.setIncrementalBackupType("org.exoplatform.services.jcr.ext.backup.impl.fs.IncrementalBackupJob");
// start backup using the service manager
BackupChain chain = backup.startBackup(config);
// stop backup
backup.stopBackup(chain);
4.2 Perform a Restore
Restoration involves reloading the backup log file into a BackupChainLog and applying the appropriate workspace initialization. The following snippet shows the typical sequence for restoring a workspace:
// find the BackupChain using the repository and workspace names (returns null if not found)
BackupChain chain = backup.findBackup("db1", "ws1");
// get the RepositoryEntry and WorkspaceEntry
ManageableRepository repo = repositoryService.getRepository(repository);
RepositoryEntry repositoryEntry = repo.getConfiguration();
List<WorkspaceEntry> entries = repositoryEntry.getWorkspaceEntries();
WorkspaceEntry workspaceEntry = getNewEntry(entries, workspace); // create a copy entry from an existing one
// load the backup log
File backLog = new File(chain.getLogFilePath());
BackupChainLog bchLog = new BackupChainLog(backLog);
// initialize the workspace
repo.configWorkspace(workspaceEntry);
// run the restoration using the ready RepositoryEntry and WorkspaceEntry
backup.restore(bchLog, repositoryEntry, workspaceEntry);
4.2.1 Restoring into an existing workspace
To restore a backup over an existing workspace, you must first clear its data. Follow these steps:
- remove the workspace
ManageableRepository repo = repositoryService.getRepository(repository);
repo.removeWorkspace(workspace);
- clean the database, value storage and index
- restore (see the snippet above)
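The cleaning step for the file-based parts (value storage and index) amounts to deleting directory trees. Below is a minimal sketch using only the standard java.nio.file API. WorkspaceFilesCleaner and deleteTree are hypothetical names, and the directory used in main is a throwaway temporary one: the real value storage and index locations must be taken from your workspace configuration. Cleaning the database is done with your database tools and is not shown.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: removes a workspace's file-based storage before a restore.
class WorkspaceFilesCleaner {

    /** Recursively deletes a directory tree; does nothing if the path does not exist. */
    static void deleteTree(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        List<Path> ordered;
        try (Stream<Path> paths = Files.walk(root)) {
            // Reverse order so that children come before their parent directories.
            ordered = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : ordered) {
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws IOException {
        // Demonstration on a temporary directory standing in for an index folder.
        Path demo = Files.createTempDirectory("ws1-index");
        Files.createFile(demo.resolve("segments"));
        deleteTree(demo);
        System.out.println(Files.exists(demo)); // prints false
    }
}
```

Run such cleanup only after the workspace has been removed or the repository stopped, so that no open files remain in these directories.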
4.2.2 System workspace
Restoring the JCR System workspace requires shutting down the system and using a special initializer. Follow these steps (this also works for normal workspaces):
- stop the repository (or portal)
- clean the database, value storage and index
- in the workspace configuration, set BackupWorkspaceInitializer to reference your backup
<workspaces>
  <workspace name="production" ... >
    <container class="org.exoplatform.services.jcr.impl.storage.jdbc.JDBCWorkspaceDataContainer">
      ...
    </container>
    <initializer class="org.exoplatform.services.jcr.impl.core.BackupWorkspaceInitializer">
      <properties>
        <property name="restore-path" value="D:\java\exo-working\backup\repository_production-20090527_030434"/>
      </properties>
    </initializer>
    ...
  </workspace>
- start the repository (or portal)
5 Scheduling (experimental)
The Backup service has an additional feature that can be useful for a production-level backup implementation. Backing up a repository in production requires a tool that can create and manage a cycle of Full and Incremental backups periodically. The service has an internal BackupScheduler which can run a configurable cycle of BackupChains as if they had been executed by a user over some period of time. In other words, BackupScheduler is a user-like daemon which asks the BackupManager to start or stop backup operations.
For that purpose BackupScheduler has the method BackupScheduler.schedule(backupConfig, startDate, stopDate, chainPeriod, incrementalPeriod), where:
- backupConfig - a ready configuration which will be given to the BackupManager.startBackup() method
- startDate - the date and time of the backup start
- stopDate - the date and time of the backup stop
- chainPeriod - the period, in seconds, after which the current BackupChain is stopped and a new one is started
- incrementalPeriod - if it is greater than 0, it overrides the same value in backupConfig.
// get the scheduler from the BackupManager
BackupScheduler scheduler = backup.getScheduler();
// schedule a backup using a ready configuration (Full + Incrementals) to run from startTime
// to stopTime. A Full backup will be performed every 24 hours (BackupChain lifecycle),
// and the incremental will rotate result files every 3 hours.
scheduler.schedule(config, startTime, stopTime, 3600 * 24, 3600 * 3);
// it's possible to run the scheduler for an indefinite period of time (i.e. without a stop time).
// schedule a backup to run from startTime until it is stopped manually;
// here the incremental will rotate result files as configured in BackupConfig
scheduler.schedule(config, startTime, null, 3600 * 24, 0);
// to unschedule a backup, simply call the scheduler with the configuration describing the
// already planned backup cycle.
// the scheduler will search its internal task list for a task with the repository and
// workspace names from the configuration and will stop that task.
scheduler.unschedule(config);
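To get a feel for what a given chainPeriod means in practice, the cycle can be laid out on a timeline. The sketch below simply lists when each new BackupChain (and therefore each new Full backup) would begin between a start and a stop time. ChainTimelineSketch and chainStarts are hypothetical names, and the boundary handling (a chain starts only strictly before the stop time) is an assumption made for illustration; the actual BackupScheduler may treat the boundaries differently.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Illustration only: models the timing of a schedule, not the BackupScheduler itself.
class ChainTimelineSketch {

    /** Start times of successive chains: one every chainPeriod from start, strictly before stop. */
    static List<Instant> chainStarts(Instant start, Instant stop, Duration chainPeriod) {
        if (chainPeriod.isZero() || chainPeriod.isNegative()) {
            throw new IllegalArgumentException("chainPeriod must be positive");
        }
        List<Instant> starts = new ArrayList<>();
        for (Instant t = start; t.isBefore(stop); t = t.plus(chainPeriod)) {
            starts.add(t);
        }
        return starts;
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2009-06-08T00:00:00Z");
        Instant stop = Instant.parse("2009-06-11T00:00:00Z");
        // A 24-hour chain period over three days gives three chains.
        for (Instant t : chainStarts(start, stop, Duration.ofSeconds(3600 * 24))) {
            System.out.println("new chain (Full backup) starts at " + t);
        }
    }
}
```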
on 06/08/2009 at 11:25