Partition lifecycle management

Data lifecycle management implies the organization of data to optimize the injection and the query performances, and to preserve the data in the database during the same lifecycle.

Only the Object Types having the same lifecycle and the same partition policy can be defined in the same space. For more information on spaces, see Define Datastore spaces. For more information on partition policy, see Define partitions for Datastore.

Partition lifecycle in Datastore is based on two notions:

  • Online period: During this period, audit information is available in the database. When the period is over, audit information can be archived.
  • The online period must be defined according to business requirements.
  • Retention period: During this period, audit information is not suppressed from the database, information can be remounted. When the period is over, audit information can be suppressed and no longer restored.
  • The retention period must be defined according to legal requirements.

The partition lifecycle should be taken into account during the analysis phase to define spaces and partition policy.

See Define partitions for Datastore for an overview of the topic.

Lifecycle rules

The following rules govern the lifecycle management:

  • Lifecycle management only concerns spaces with partitions having day policy or period policy.
  • Online and retention periods can be defined at space level or at partition level. If they are defined at space level, they are available for all partitions of the space except for the partitions that have specific periods (in this case, periods of the partition will be used).
  • It's recommended that the online and retention periods are managed at space level. The existing partitions of this space have the same online and retention periods defined for the space. The new partitions in this space will automatically have the same online and retention periods.
  • The retention period must be greater than or equal to the online period. If the online period is equal to the retention period, corresponding partitions will be directly purged without being backed-up.
  • The unmount partition date takes into account the collection dates (*) and the object dates (**) of the records contained in the partition. So, the user can execute queries on data that has a collection date earlier than the object date.
  • The unmount partition date is equal to the maximum date amongst all collection dates and object dates contained in the partition, plus the online period.
  • The purging partition date takes into account the collection dates and also the object dates of the records contained in the partition.
  • The purging partition date is equal to the maximum date amongst all collection dates and object dates contained in the partition, plus the retention period.
  • The purging partition date is not recalculated when a partition is remounted.
  • By default, the periods are unlimited. To remove a period, use the default value.
  • The system partitions that contain Error or Comment objects are unmounted at the same time than their referenced original Object when the life cycle is defined at space level. This is why it is recommended to define lifecycle at space level.
  • Example
    • Partition P1 with record collection date = 10-02-2017
    • Online period of the space that contains the partition P1 = 1M
    • Retention period of the space that contains the partition P1 = 1Y
    • The partition P1 will be unmounted the 11-03-2017.
    • The partition P1 will be purged the 11-02-2018.
    • If the partition P1 is remounted the 04-07-2017 with an online period of 7D:
    • The partition P1 will be unmounted the 11-07-2017.
    • The partition P1 will be purged the 11-02-2018.

(*) collection date is the injection date

(**) object date is the functional date of the line in the collection

Lifecycle implementation

To implement a partition lifecycle, you should start by backing up each partition into a dedicated file. All backup files are located on the server side of the database.

The DBA customizes the database parameters. Some default stored procedures are provided. The DBA can create additional ones and use them.

See "Enable the partition lifecycle functionality" in the "Post-installation" chapter of the Datastore Installation Guide for more details.

List partition information

Each partition has its own information. Use the dstools partition list service to get the partition name, online and retention periods.

See List partitions.

Update online and retention periods

Online and retention periods drive lifecycle processes. If the periods are not defined at partition level, partitions use the periods defined at space level.

By default, the periods are unlimited. To remove a period, use the "default" value.

Use the dstools partition update service to update the online period or the retention period of a space or partition.

See Update spaces or partitions.

Maintenance service

Partitions for which the online period is over are identified and unmounted. Partitions for which the retention period is over are purged.

After the online period is over, the UNMOUNT maintenance task backs up partitions in the BACKUP database server directory.

The database backup process must first secure the files. Then, a commitment file is created by the backing up process in the COMMITMENT directory. See Carve partitions for more information about the commitment file.

The database administrator integrates the backup of the produced file and the generation of the commitment file.

The CARVE maintenance task drops the tables and views of one partition: the partition that has the corresponding commitment file in the COMMITMENT directory. The partition is stated as OFFLINE.

When the retention period is over, the PURGE maintenance task purges partitions.

The backup can be mounted to make a partition available for queries.

When the online period is equal to the retention period, partitions are purged without being backed-up.

Tasks for lifecycle management

There are different tasks to unmount and mount partitions in Datastore.

Tasks to unmount partitions

Partitions to be unmounted are identified using online periods and data dates contained in the partition. To unmount partitions, check the list of partitions ready to be unmounted and execute the following maintenance tasks:

UNMOUNT task

Each partition to be unmounted is backed-up using a stored procedure. A dump file per partition to be unmounted is produced in the BACKUP directory.

After the process is done, the backup robot secures the backuped file through a commitment file.

Note   The backup robot process is a shuttled operation not triggered by Datastore.
Note   For security reasons, the dstools service never drops backup files by itself.

CARVE task

The partition is considered as unmounted when the commitment file is received in the COMMITMENT directory.

The carve task drops the database objects of the unmounted partition from the commitment file, but does not drop the partition description.

PURGE task

If the retention period is over, the partition description is removed. The partition cannot be mounted anymore.

Task to mount partitions

To mount partitions, execute the partition MOUNT service to plan the restoration of the partition, and execute the following task:

MOUNT maintenance task

The tool restores the list of partitions defined by the user.

Caution   The corresponding backup files must be present in the RESTORE directory.

Restore a partition

To restore a partition:

  • Put the backup file in the RESTORE database server directory.
  • Use the Mount partition service to plan the restoration of the partition:
  • dstools partition mount --partition PARTITION --online 6M
  • Use the MOUNT maintenance task to perform the restoration.

See Mount partitions for more details.

Customize backup and restore processes

If you want to customize backup and restore processes to secure and compress the dump files, some extension points are available for Oracle database only.

Refer to the section "Customizing a lifecycle management datapump job" of the <Install>/Datastore/DatastoreRuntime/extra/database-oracle/readme.txt file.

Advanced architecture implementation

Depending on your database architecture, a set of additional tasks has to be executed.

Organize your partitions

With an advanced type of architecture, you can adapt the way your partitions are organized.

For instance, you may want to inject a set of files into a single partition to improve the injection performances as described in the Datastore Capacity Planning Guide. In this case, you have to customize the partition service.

At injection time, the partition service java bean computes the partition key that defines the partition to be used.

The process is detailed in the "Custom Partition Service" section in the <Install>/Datastore/DatastoreRuntime/extra/java/readme.txt file.

Manage an injection sequence

Depending on the time frame of the expected sequence and on its architecture, sequence injections may have to be performed in a dedicated partition without maintaining indexes.

When the injection sequence ends, indexes have to be rebuilt to make the partition available for queries.

The "Altering dynamic indexes" section in the <Install>/Datastore/DatastoreRuntime/extra/database-oracle/readme.txt file or the <Install>/Datastore/DatastoreRuntime/extra/database-sqlserver/readme.txt file presents:

  • how to turn indexes off when the partition is created.
  • how to rebuild the indexes after performing the injection sequence (and potentially the table compression)

One instance for several injection chains

To take advantage of the Oracle RAC scalability, it is mandatory to dedicate a set of chains per instance and per partition. This organization avoids instance synchronization.

A possible implementation is:

  • Dedicate one injector machine per instance
  • Install the corresponding chains on each injector machine
  • Use the Datastore profile to define a user-dedicated URL:
    • In Administration, define a profile per instance. For each profile, use the Main database user Datastore Runtime setting to set the URL.
    • Use the profile parameter --as at injection time to connect with the dedicated instance:
      dstools inject --as MAIN_FLOW --collection AUDIT auditTrails.dat
      MAIN_FLOW is the profile to use.
    • Note Use the dstools setting list command to display the URL used by this batch tool.

Table compression

Compressing new tables with Hybrid Columnar Compression (HCC) after injection reduces their disk allocation.

Tables to compress can be identified because their indexes are created as unusable (indexes have to be rebuilt once the table is compressed).

Use the following scripts to identify and compress tables, connected as DSMAIN user:

  • Table identification:
  • Select TABLE_NAME from USER_INDEXES where STATUS='UNUSABLE' group by TABLE_NAME;
  • Table compression:
  • alter table <TABLE_NAME> move compress for query low parallel (degree <PARALLEL>);

Where:

INDEX_NAME: name of the index to rebuild

PARALLEL: number of Oracle processes used to rebuild the index.

Refer to the Oracle documentation for more information.

Related Links