For the list of all supported Decision Insight downloads and releases, see the Downloads page.

How to do a database transformation

For testing or support reasons, it may be needed to get the production data out of the production environment, possibly on a less secure environment. Under those circumstances, it may be necessary to hide sensitive data.

Database transformation allows to define, for eligible attributes, if and how to transform their values so that it does not expose private data. The transformation can be a custom encryption, a replacement by a pseudonym, or the replacement by a generic value : every case is unique and only the user performing the transformation knows what is the right security level.

Database transformation rewrites attribute values into the database. Unlike an update performed via data integration, these changes affect the attribute history so that previous values cannot be retrieved.

For each attribute to transform, the user must provide a function (Groovy function) that generates the new value from the old one. This process is executed from an existing checkpoint and generates a new checkpoint.

Constraints and limitations

Which attributes can be transformed

Must be from user-defined model

Only attributes from the user-defined model can be transformed. That means system attributes (such as entity/relation/attribute/space/... names, precomputings, computings or data integration states, ...) cannot be transformed

There is an exception for computed attributes. An option allows you to transform them.

Must be a supported attribute type

The following attribute types that can be transformed Boolean, Decimal, Integer and String

That means that the following attributes cannot be transformed: RelationBaseline, Classifier, Dispersion Multipliers, Forecast, CalendarDuration, Thresholds, DurationInstant, Interval

Must not be part of a key

An attribute used in an entity key cannot be transformed.

Must not be an encrypted attribute

Encrypted attributes cannot be transformed.

Which transformations can be done

Update only

Only update transformations are possible, than means you cannot perform a database transformation to:

  • Add a new value.
  • Delete an existing value.

Type consistency

The transformed value type must have the same type as the original one. See also Attribute types.

Changing String attribute value size

In the case of string attributes, each transformed attribute may have a different size than the original data. However, if your transformed database is required to have a similar level of performance as the original one, the transformation must generate transformed data which size is similar to the original data.

Consequences of using a transformed database

Computing workflow consistency

If some attributes that participate in a computing workflow – i.e. the attribute is an input or an output of a computing – the transformation will render the computing workflow inconsistent.

For example, if there is a computing represented by attribute A that sums another attribute B:

  • Transforming the attribute A, will make it different than the sum of B.
  • Transforming the attribute B will make the attribute A different than the sum of transformed B.

These inconsistencies could be fixed by recomputing the impacted computed attributes:

  • If the computing is based on a precomputing, then these precomputings must be deleted and recreated.
  • Otherwise, force the recomputation of the computed attribute. 

In both cases, you must recompute attribute A.

Configuration attributes and APPX

Configuration attributes, that is non-partitionned attributes, are exported in as  APPX. So, if after a transformation, an APPX exported from an non-transformed database is imported into your transformed instance of DI, it will override the transformed data with the data from the APPX file that you're importing.

How to use the database transformation

To transform your database, you must:

  1. Generate a database transformation script if not already done. To create one:
    • A template must be generated from a running node. 
    • The template must be filled by writing a transformation function for each attribute to transform.
  2. Execute the database transformation script on a checkpoint:
    • From a node checkpoint:
      • On a live node, the checkpoint must be locked during the transformation process. Performance impact of the node must be anticipated.
      • On a cold node, any checkpoint can be used.
    • From a database checkpoint backup.

Generate the database-transformation.groovy script

The generateDatabaseTransformation command generates the template of the Groovy script that will be used to transform the database. This template contains an identity transformation function for each attributes that is eligible for transformation.

Execute the shell command named generateDatabaseTransformation.

g! generateDatabaseTransformation
File /opt/decision-insight/var/work/radon-administration/database-transformation20180924T131020705Z.groovy has been generated.
g! generateDatabaseTransformation --with-computing
File /opt/decision-insight/var/work/radon-administration/database-transformation20180924T131718468Z.groovy has been generated.

The --with-computing option generates a script with all attributes that are either computed or input of other computed attributes. For more information, see computing workflow consistency

This generates a file like this:

database-transformation.groovy
import groovy.transform.Field

// TODO add global variable, eg @Field Random rnd = new Random()

// Generated on 2018-09-24T13:10:20.705Z
 
// Attribute Account.customerName
def transformAccount_customerName(String account_customerName) {
    return account_customerName
}
 
// Attribute Payment.amount
def transformPayment_amount(Double payment_amount) {
    return payment_amount
}
 
// Do not change lines belows
[
        "00000002-0001-0000-12b5-e760d3000000": this.&transformAccount_customerName,
        "00000002-0001-0000-12b5-e760d3000001": this.&transformPayment_amount,
]

Update database-transformation.groovy

Update the script to write your transformation functions:

database-transformation.groovy
import groovy.transform.Field

// TODO add global variable, eg @Field Random rnd = new Random()
@Field Random rnd = new Random()

// Generated on 2018-09-24T13:10:20.705Z
 
// Attribute Account.customerName
def transformAccount_customerName(String account_customerName) {
    return "customer" + rnd.nextLong()
}
 
// Attribute Payment.amount
def transformPayment_amount(Double payment_amount) {
    return payment_amount
}
 
// Do not change lines belows
[
        "00000002-0001-0000-12b5-e760d3000000": this.&transformAccount_customerName,
        "00000002-0001-0000-12b5-e760d3000001": this.&transformPayment_amount,
]

If the format of the groovy script changes in future releases, there will be no automatic migration to the new format.

Launch the database transformation

To apply the database transformation:

  • on a checkpoint from a node

    bin/tnd-database-transformation --checkpoint <checkpointTT|LAST> --script <database-transformation.groovy> --output <outputFolder>
    
  • on a node checkpoint backup

    bin/tnd-database-transformation --backup <backupFolder> --checkpoint <checkpointTT|LAST> --script <database-transformation.groovy> --output <outputFolder>
    

Using third parties libraries

In order to execute your database transformation, you might need to use third parties libraries to manipulate data. Add the parameter --add-classpath in the command line to use third parties jars.

The classpath separator is the comma, you can also use one --add-classpath per jar.

bin/tnd-database-transformation --checkpoint <checkpointTT|LAST> --script <database-transformation.groovy> --output <outputFolder> --add-classpath <third-party.jar,...>

bin/tnd-database-transformation --checkpoint <checkpointTT|LAST> --script <database-transformation.groovy> --output <outputFolder> --add-classpath third-party.jar --add-classpath other-third-party.jar

Use the database transformation to export application model data only

The --application-model-only  option generates a script that restrict the transformation to platform data only. It means that no data associated to the model created in the application will be available in the generated checkpoint.

If, for confidentiality reason, data cannot be exported with a support archive, this feature can be used as an alternative since no sensitive data will be exported.

As previously described, usage will be in 3 steps:

// step 1: generate the groovy script with --application-model-only option
generateDatabaseTransformation --application-model-only
// step 2: modify the script
// step 3: use the script
bin/tnd-database-transformation --checkpoint <checkpointTT|LAST> --script <database-transformation.groovy> --output <outputFolder>

Related Links