Reduction & Masking

The BCV5 Data Reduction and Masking Facility (BCV5 RM) is an add-on component that allows you to specify regular SQL queries for one or more tables in a copy process. For tables that have a SQL query, BCV5 does not simply copy the source data into the target environment as is. Instead, the target tables are populated with the result sets from these queries.

When you want to provide production data for your test or development environment, it is not always desirable to copy all of the contents of the tables. For example, if developers want to verify the correctness of a code change, they typically want to run unit tests with a small subset of the data because using all the production data would significantly prolong the turnaround time.

Reduction and Masking
With RM you can specify SQL statements with WHERE conditions to thin out the data. You are free to use the full set of features that Db2 offers – BCV5 does not impose any limitations. Only those rows that satisfy your condition will show up in the target table.

In addition, production data should be modified before it is copied into test or development environments in order to remove personally identifiable information (PII). Since BCV5 uses regular SQL queries, you are free to change the contents of columns that are returned by Db2. All built-in functions are available, and you can provide your own masking algorithm by embedding it in a user defined function (UDF).

Ease of use
The BCV5 ISPF interface guides you through the process of specifying your reduction and masking rules. When you specify the table name, BCV5 automatically obtains the structure of the table and generates a default SELECT statement. This statement can be modified as needed. BCV5 then automatically checks if the SQL statement is valid.

You can also assign masking and reduction rules dynamically. Simply specify a pattern for the table name and column name, and BCV5 will automatically apply the selected masking and reduction algorithm to the tables that match your criteria. Dynamic masking and reduction rules are a subsystem-wide setting. They will be applied to all BCV5 copy tasks that read data from that Db2 subsystem. This way, you can enforce masking for all copy processes that read from production data, for example.

After specifying the masking and reduction criteria, BCV5 generates the complete job chain to extract the data. This includes all steps that are necessary to transfer data from one Db2 subsystem to another. The complete process is automated – It generates the DDL for all objects involved and then executes it in the target environment. If the objects already exist in the target, BCV5 will perform a compatibility check. The process also includes the transfer of the data from the source to the target environment in the fastest way possible. When a table is copied with masking or reduction, the modified data is directly written into the target tablespaces, and target indexes are rebuilt automatically. No temporary data sets that might contain sensitive data are created.

Ensuring Referential Integrity
Specifying a reduction or masking rule for one table may also affect other tables. In many data models tables are related to other tables using referential integrity constraints. Rows from one table must have a corresponding row in a parent table, otherwise the referential integrity is violated.

In the following figure you see how rows of the DEPT and EMP tables relate to each other. The primary and foreign keys of the tables indicate these relationships. If you specify a reduction or masking rule for the DEPT table then you must also specify a corresponding rule for the EMP table.

BCV5 can propagate reduction or masking rules for you. If a table has a reduction or masking rule, it searches the Db2 catalog for child tables and generates default rules for those as well. In this example, if you specify a reduction rule for the DEPT table that extracts only the records for department C01, BCV5 will automatically generate a rule for the EMP table so that only employees in department C01 are copied. This mechanism helps you to identify and correctly process tables that are part of a referential integrity relationship.