Learnings

Thursday, May 13, 2010

Shell script: To check network and perform some action when dead

#set -x

while [ 1=1 ]
do
sleep 5
ping -n 1 www.google.com > ping.result
n=`cat ping.result | grep -c -e 'could not find' -e 'unreachable' -e 'error'`
# echo "Value of n is $n, disc is $disc"
if [ "$n" -ne "0" ]
then
echo `date | awk '{print $4}'`---
if [ "$disc" != "1" ]
then
echo "Disconnecting..."
`artdial.exe /d`
echo "Disconnected"
fi
disc=1
else
echo `date | awk '{print $4}'` +++
if [ "$disc" != "0" ]
then
# Wait a while before trying to connect
sleep 5
echo "Connecting..."
`artdial.exe /u:ramakv@in.ibm.com /p:nA15ch3n Wecm Bangalore`
dc=$?
disc=$dc
echo "...Connected"
fi
fi
done

Wednesday, May 12, 2010

DB2 Import Unix script with validation

# For bash shell, works on AIX
# It may be required to do a sort before diff in the last command
# This script loads the specified input file to the DB2 table EMP
# and checks the status of the load. It also confirms that the data
# was loaded as expected by exporting back the data from EMP and
# comparing with what is in the input file
if [ !$# -eq 1 ]
then
echo "USAGE: DB2IMPEMP.sh ";
exit 1
fi
# Following line picks out the initial part of the extract file name
# passed as input parameter and stores in variable messagefile
# with an extension .msg. Also sets the name of the output file
# to store exported data
messagefile=${1%.*}".msg"
outputfile=${1%.*}".out"
echo $messagefile
echo "Connecting to database..."
db2 connect to nec
echo "...Connected"
retval=`db2 import from $1 of del modified by coldel\| commitcount 0 messages $messagefile replace into EMPLOYEE`

if [ ! -z "$(echo $retval | awk '/SQL.*N/')" ]
then echo "Import error: "$retval;
exit -1;
fi

if [ ! -z "$(echo $retval | awk '/SQL.*W/')" ]
then echo "Import warning: "$retval;
fi


retval=`db2 export to $outputfile of del modified by coldel\| select \* from EMPLOYEE`

compare=`diff $1 $outputfile`
if [ $? -ne 0 ]
then
echo "Data Validation Error";
exit -2;
fi

Labels:

Friday, May 07, 2010

DB2 Import Considerations

Loading extract files into DB2
DB2's IMPORT command will be used for loading all the extracted data into their corresponding DB2 tables. While the DB2 LOAD command offers superior performance, the IMPORT command offers more flexibility as it is possible to ensure that existing data in the table is never lost even if the utility or the database server crashes during processing. As the DB2 equivalents of the GENO tables are small in size, insert performance provided by the IMPORT utility will be adequate. Further, the operations of IMPORT are logged and the data is hence recoverable from a previous backup in the event of a database failure whereas LOAD operations are not logged.

The extract files will meet the following requirements to facilitate loading into DB2:
1. Their codepage will be UTF-8 (CCSID 1208 on mainframe). As UTF-8 is the default code page on DB2 LUW, this will avoid codepage conversions.
2. The files will be comma-delimited. To ensure that character data which contains commas does not end up disrupting the file format, all character data will be enclosed by double quotes. In case the actual data can itself contain double quotes, then such double quotes will be preceded by an extra double quote. For example, to import the string: I am 6" tall the file will supply "I am 6"" tall".
3. During IMPORT, the INSERT_UPDATE option will be used to load data. This way, new rows will get inserted while existing rows whose values may have changed will get updated. It will be ensured that all DB2 tables will have primary keys defined
4. A COMMITCOUNT OF 0 will be used (this is the default). This way, the IMPORT utility will only commit all of its inserts and udpates at the very end. Doing this ensures that an application that happens to be reading the DB2 table during the time the import is going on will either see the old snapshot of the data or the new one but not something in between. The data loads will, however, be scheduled such that no other applications try reading the DB2 table at the time of the import. COMMITCOUNT 0 avoids the need to have restart logic built into the IMPORT process.
5. Dates will be extracted in the format YYYYMMDD (DB2 default).
6. A sample import command for an EMP table will look like:
import from emp.csv of del
modified by dateformat="YYYYMMDD" coldel, commitcount 0
messages emp.out
insert_update into emp
When the above command is being invoked from a shell script or unix prompt, the " will be escaped by prefixing with \.
7. At the end of the import, the messages file (emp.out for example) will be inspected to see if all data was successfully imported.

Wednesday, May 05, 2010

Glossary model thoughts

IDA provides a Glossary model which has abbreviations for most commonly used terms in data models. This is good for standardization and helps generate consistent column names from attribute names. Only issue is that, sometimes with small words, the whole word may be better than the abbrn. E.g, REMIT is abbr to RMIT. IDA should allow an option wherein we can enter two abbrns for a word and it uses the longer abbr when space allows and switches to shorter one when some limit is reached.