My Internship at the Data Warehouse at the New York State Department of Taxation and Finance

For my internship I am working with data warehouse staff on extract, transfer and load program and scripts. My duties involved in the last one month are:

  • Basics of Database design
  • Extract tools and techniques
  • Transformation techniques and issues
  • Data research
  • Load techniques and efficiencies
  • Monitoring system runs with restart applications

I am using windows XP, MS Office, Microsoft outlook 2010, Lotus Notes, Unix, IBM Mainframe, IBM TSO environment and Advanced Query Tool (AQT)

Detail explanation about the above point:

First thing I come in and I  log in, I check the previous day loads which is automatically set and I check it through the data library called WinSCP, which is a  tool we use in data warehouse to store all our scripts. WinSCP is a SFTP client and FTP client for Windows. It’s main function is the secure file transfer between a local and a remote computer. In WinSCP, we have different folders for Production and Migration. I have learned to fix if there is any error in any of the job.  I have learned and am very familiar with all UNIX commands.  To check daily job I have to go to (PR) production library through mgtrpt, data and out file(pr/mgtrpt/data/out), Where the out files are located on the WinSCP(Unix). All our load scripts are done and run through Unix.

Last one month I have been working with new Datawarehouse schema, Tables and Nicknames(Federated views). I learned how to extract all the data from the tables in IBM Mainframe. Then extract JCL (Job Control Language) with respective column names. After extracting data from the tables I did all the changes according to AQT layout. That means column names and data types. AQT is a Advanced Query Tool. I learned how to run SQL query using AQT. Learned how to sort a file.  Used to put extract file in desired order for loading table.

Learned how FTP(File Transfer Protocol) works. Use JCL to translate load file from IBM Mainframe to AIX box where database loaded and resides.

UNIX: I have been learning a lot UNIX commands and very familiar with it now. Using UNIX command learned how to pre allocates files used in load scripts and set permissions for each file.  Before running the load scripts learned out copy DAT file to TEST file using UNIX command. Once all the column positions changed according to starting and ending position and check for null indicator position. Then to run the load script. This is how we load/replace run using respective file for the table requested.

Over all, it is a great experience and a great place to work! My supervisor and co-workers have been very helpful in making my internship, a very rewarding experience.

Shubha Raj
Information Science

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s