Copying Db2 tablespaces and indexes at the file system level is fast and consumes less CPU than traditional methods of copying Db2 data, such as Unload/Load. Db2 ships with the DSN1COPY tool, which operates on the VSAM clusters that Db2 uses to store data. Vendor solutions, such as BCV5, build on the same basic idea as DSN1COPY but add flexible selection and renaming mechanisms and provide a scheduler-friendly job chain for copying an arbitrary number of objects. DSN1COPY, BCV5, and other vendor tools usually access the VSAM clusters using standard access methods provided by z/OS. While this is already about 10 times as fast as Unload/Load-based solutions, it might seem tempting to copy the VSAM clusters with a different approach: hardware assisted data set level copies.
The promise of hardware assisted copy tools
Hardware assisted copy tools are most often used to make point-in-time copies of a volume or of a group of volumes. One of the most well-known tools is probably FlashCopy from IBM. Tools from other storage solution vendors work in a similar way – and thus have the same restrictions – but use different names, such as Dell EMC Snap, Hitachi ShadowImage, and others. FlashCopy V2 supports copying data sets (or extents in FlashCopy terminology) in addition to entire volumes, and it also supports track relocation. This makes it possible to make a copy even if the source and target VSAM clusters do not occupy the exact same tracks on their respective volumes. When copying Db2 tablespaces and indexes, data set level FlashCopy may seem like a good fit. Usually, FlashCopy is used in combination with ADRDSSU because FlashCopy itself only copies the physical raw data. It does not perform any data management functions, such as defining or cataloging the target VSAM clusters.
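The track relocation mentioned above can be pictured with a small toy model. The sketch below is not how FlashCopy or ADRDSSU is implemented. It assumes that extents are given as hypothetical (first track, track count) pairs using simple relative track numbers, and that source and target cover the same number of tracks. The function name relocation_map is made up for this illustration.

```python
def relocation_map(source_extents, target_extents):
    """Pair each source track with a target track, in extent order.

    Extents are (first_track, track_count) tuples. This is a simplified,
    hypothetical representation, not a real z/OS extent descriptor.
    """
    def expand(extents):
        # Turn [(start, count), ...] into a flat list of track numbers.
        return [start + k for start, count in extents for k in range(count)]

    src = expand(source_extents)
    tgt = expand(target_extents)
    if len(src) != len(tgt):
        raise ValueError("source and target must cover the same number of tracks")
    # The n-th source track is relocated to the n-th target track.
    return dict(zip(src, tgt))


# Source occupies two extents, target occupies one extent elsewhere.
mapping = relocation_map([(100, 3), (250, 2)], [(40, 5)])
print(mapping)  # {100: 40, 101: 41, 102: 42, 250: 43, 251: 44}
```

The point of the sketch is only that the copy is defined by position within the data set, not by the absolute location of a track on the volume.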
The idea behind FlashCopy is that you establish a relationship between the tracks occupied by the source VSAM clusters on the source volumes and the tracks occupied by the target VSAM clusters on the target volumes. These tracks usually lie in several contiguous areas of your volumes, called track sets, so a single copy operation works on multiple track sets. FlashCopy then operates in the background and copies the specified tracks from the source volumes to the target volumes. After a track set has been copied, its relationship is automatically released.
What happens behind the curtains when FlashCopy makes copies of data sets is not trivial, but a simplified explanation looks like this:
For each source track, the following rules apply:
- Reading, either before or after the physical copy, always returns the current source data.
- Writing before it has been copied causes FlashCopy to copy the track synchronously, then release the relationship and write to the source track. This ensures that the target track reflects the data before the change, which is important for a point-in-time copy.
- Writing after it has been copied modifies the source track right away without any consequences to the target (because the relationship has already been released).
For each target track, the following rules apply:
- Reading before it has been copied returns data from the corresponding source track. Think of the target track as a pointer to the source track.
- Reading after it has been copied returns data from the actual target track.
- Writing before it has been copied causes FlashCopy to copy the track synchronously, then release the relationship and write to the target track.
- Writing after it has been copied modifies the target track right away.
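The rules above can be sketched as a small simulation. This is a toy model of the simplified explanation, not the actual microcode logic. It assumes that tracks are list elements, that the relationship is tracked per track in a set named pending, and that the class and method names are invented for illustration.

```python
class FlashCopyRelation:
    """Toy model of a track-level point-in-time copy relationship."""

    def __init__(self, source, target):
        self.source = source  # list of source track contents
        self.target = target  # list of target track contents, same length
        # Tracks that are still in the relationship (not yet physically copied).
        self.pending = set(range(len(source)))

    def _copy_track(self, i):
        # Physically copy one track and release its relationship.
        if i in self.pending:
            self.target[i] = self.source[i]
            self.pending.discard(i)

    def read_source(self, i):
        # Source reads always return the current source data.
        return self.source[i]

    def write_source(self, i, data):
        # Preserve the point-in-time image on the target before changing the source.
        self._copy_track(i)
        self.source[i] = data

    def read_target(self, i):
        # Before the copy, the target track acts like a pointer to the source track.
        if i in self.pending:
            return self.source[i]
        return self.target[i]

    def write_target(self, i, data):
        # Copy synchronously first, release the relationship, then write.
        self._copy_track(i)
        self.target[i] = data

    def background_copy_step(self):
        # One step of the background copy: move the lowest pending track.
        if self.pending:
            self._copy_track(min(self.pending))


rel = FlashCopyRelation(["A0", "B0", "C0"], [None, None, None])
rel.write_source(0, "A1")      # forces track 0 to be copied first
print(rel.read_target(0))      # A0 (point-in-time image is preserved)
print(rel.read_target(1))      # B0 (still served from the source track)
rel.write_target(2, "X")       # target diverges from the source
while rel.pending:
    rel.background_copy_step()
print(rel.source, rel.target)  # ['A1', 'B0', 'C0'] ['A0', 'B0', 'X']
```

After the loop finishes, no relationship remains, and source and target behave like two independent sets of tracks.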
The logical copy is instantaneous and you can work with both the source and the target data sets as if they were separate, even if FlashCopy is still busy moving data in the background. Eventually, the physical copy will complete, but you are free to read from and write to both source and target at any time.
What you get is an exact 1:1 copy of your source data sets. Every target data set will look identical to its corresponding source data set at the point in time when the copy was triggered. This fact is very important because it is the main reason that FlashCopy is not an ideal choice for copying Db2 tablespaces and indexes.