Loading extract files into DB2
DB2's IMPORT command will be used for loading all the extracted data into their corresponding DB2 tables. While the DB2 LOAD command offers superior performance, the IMPORT command offers more flexibility as it is possible to ensure that existing data in the table is never lost even if the utility or the database server crashes during processing. As the DB2 equivalents of the GENO tables are small in size, insert performance provided by the IMPORT utility will be adequate. Further, the operations of IMPORT are logged and the data is hence recoverable from a previous backup in the event of a database failure whereas LOAD operations are not logged.
The extract files will meet the following requirements to facilitate loading into DB2:
1. Their codepage will be UTF-8 (CCSID 1208 on mainframe). As UTF-8 is the default code page on DB2 LUW, this will avoid codepage conversions.
2. The files will be comma-delimited. To ensure that character data which contains commas does not end up disrupting the file format, all character data will be enclosed by double quotes. In case the actual data can itself contain double quotes, then such double quotes will be preceded by an extra double quote. For example, to import the string: I am 6" tall the file will supply "I am 6"" tall".
3. During IMPORT, the INSERT_UPDATE option will be used to load data. This way, new rows will get inserted while existing rows whose values may have changed will get updated. It will be ensured that all DB2 tables will have primary keys defined
4. A COMMITCOUNT OF 0 will be used (this is the default). This way, the IMPORT utility will only commit all of its inserts and udpates at the very end. Doing this ensures that an application that happens to be reading the DB2 table during the time the import is going on will either see the old snapshot of the data or the new one but not something in between. The data loads will, however, be scheduled such that no other applications try reading the DB2 table at the time of the import. COMMITCOUNT 0 avoids the need to have restart logic built into the IMPORT process.
5. Dates will be extracted in the format YYYYMMDD (DB2 default).
6. A sample import command for an EMP table will look like:
import from emp.csv of del
modified by dateformat="YYYYMMDD" coldel, commitcount 0
messages emp.out
insert_update into emp
When the above command is being invoked from a shell script or unix prompt, the " will be escaped by prefixing with \.
7. At the end of the import, the messages file (emp.out for example) will be inspected to see if all data was successfully imported.