# Lab: DataStage: Import COBOL copybook and read EBCDIC data

IBM DataStage Flow Designer allows you to read data from a mainframe. More specifically, you can specify that the inputs to your DataStage job are in EBCDIC format and import COBOL copybooks as table definitions.

To keep things simple in this lab, we're going to speak generally about mainframes and COBOL. Files from a mainframe are usually saved as binary files and transferred over SFTP to a server where DataStage can access them. The binary files alone are not enough for DataStage to read the contents. A COBOL copybook is required to translate the data from binary to ASCII. Both files are available here: <https://github.com/IBM/datastage-standalone-workshop/tree/master/data/mainframe>.
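
To get a feel for why mainframe bytes need translating, here is a minimal Python sketch that is separate from DataStage. It assumes the text is encoded with EBCDIC code page 037 (Python's built-in `cp037` codec); the lab's file may use a different code page, and the helper name `ebcdic_to_text` is made up for illustration.

```python
# Minimal sketch: decode EBCDIC bytes to text.
# Assumption: code page 037 (cp037); other EBCDIC code pages exist.

def ebcdic_to_text(raw: bytes, codepage: str = "cp037") -> str:
    """Decode EBCDIC-encoded bytes into a Python string."""
    return raw.decode(codepage)


# The bytes below spell "HELLO" in EBCDIC cp037, not in ASCII.
sample = bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6])
print(ebcdic_to_text(sample))
```

Read as ASCII, the same bytes would be unreadable, which is why the character set has to be stated explicitly when the file is read.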

This lab consists of the following steps:

1. [Create job layout](#1-create-job-layout)
2. [Add COBOL copybook as a table definition](#2-add-cobol-copybook-as-a-table-definition)
3. [Customize the job](#3-customize-the-job)
4. [Compile, run, view output](#4-compile-run-view-output)

## About the data

The example binary data should be downloaded to the server. Switch to the server and run:

```bash
cd /opt/IBM/InformationServer/Server/Projects/dstage1
wget https://raw.githubusercontent.com/IBM/datastage-standalone-workshop/master/data/mainframe/example.bin
```

The copybook can be downloaded to the client machine by going to the [GitHub repo](https://github.com/IBM/datastage-standalone-workshop/blob/master/data/mainframe/copybook.cob) and saving the file to the desktop. The copybook we're using looks like this:

```
01  RECORD.
    05  ID                        PIC S9(4)  COMP.
    05  COMPANY.
        10  SHORT-NAME            PIC X(10).
        10  COMPANY-ID-NUM        PIC 9(5) COMP-3.
        10  COMPANY-ID-STR
    05  METADATA.
        10  CLIENTID              PIC X(15).
        10  REGISTRATION-NUM      PIC X(10).
        10  NUMBER-OF-ACCTS       PIC 9(03) COMP-3.
```
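
The `COMP` and `COMP-3` clauses in the copybook describe binary and packed-decimal storage, which is why the file cannot be read as plain text. The Python sketch below shows how such fields are commonly laid out on IBM mainframes (big-endian binary, two digits per byte with a trailing sign nibble for packed decimal). It is an illustration under those assumptions, not DataStage's implementation, and the function names are hypothetical.

```python
import struct


def decode_comp(raw: bytes) -> int:
    """Decode a 2-byte big-endian signed binary field, e.g. PIC S9(4) COMP."""
    return struct.unpack(">h", raw)[0]


def decode_comp3(raw: bytes) -> int:
    """Decode a packed-decimal (COMP-3) field.

    Every nibble holds one digit, except the last nibble, which holds the sign.
    """
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    digits.append(raw[-1] >> 4)
    sign_nibble = raw[-1] & 0x0F

    value = 0
    for digit in digits:
        value = value * 10 + digit
    # 0xB and 0xD mark negative values; other sign nibbles are treated as positive.
    return -value if sign_nibble in (0x0B, 0x0D) else value


def comp3_length(num_digits: int) -> int:
    """Bytes needed to store a COMP-3 field with the given number of digits."""
    return num_digits // 2 + 1


print(decode_comp(b"\x00\x2A"))
print(decode_comp3(bytes([0x12, 0x34, 0x5C])))
print(comp3_length(5), comp3_length(3))
```

Under these conventions, `COMPANY-ID-NUM` (`PIC 9(5) COMP-3`) occupies 3 bytes and `NUMBER-OF-ACCTS` (`PIC 9(03) COMP-3`) occupies 2 bytes.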

## Before you start: Launching DataStage Flow Designer

Before we start the lab, let's switch to the `iis-client` VM and launch `Firefox`.

![Switch to iis-client](https://1138345240-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MEzw3DrGZcdmHwc51yL%2Fsync%2F4af2f82e080f3cf332a10c188968c608c54c7c11.png?generation=1613580975614829\&alt=media)

Launch the desktop client by going to the start menu and searching for `DataStage Designer`.

![DataStage Designer](https://1138345240-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MEzw3DrGZcdmHwc51yL%2Fsync%2F627f2c577fdcb43187153c8175b96a2b3a746379.png?generation=1613580975682951\&alt=media)

## 1. Create job layout

Start a new `Parallel Job` project and create a job that looks like the image below. Remember to wire the elements together. It should have:

* 1 x Complex Flat File
* 1 x Peek
* 1 x Sequential File

![mainframe job](https://1138345240-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MEzw3DrGZcdmHwc51yL%2Fsync%2Fc4af9c6519fecbdc36e962850276b2159c299d8d.png?generation=1613580990640055\&alt=media)

## 2. Add COBOL copybook as a table definition

In the toolbar click on `Import` > `Table Definitions` > `COBOL File Definitions`.

![COBOL definitions](https://1138345240-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MEzw3DrGZcdmHwc51yL%2Fsync%2F1a97088144fe0080df302c2a26758d97aa0003b0.png?generation=1613580985648364\&alt=media)

Specify the downloaded copybook file and click `Import`.

![Import copybook](https://1138345240-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MEzw3DrGZcdmHwc51yL%2Fsync%2F634f707f8682865672dd45820032f7cb6fe95d4c.png?generation=1613580991364844\&alt=media)

You have just imported your copybook definitions!

## 3. Customize the job

The first step is to double-click the Complex Flat File node, go to the `File options` tab, and specify the example binary file as the input. Critically, we must specify `Fixed block` as the record type.

![file options](https://1138345240-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MEzw3DrGZcdmHwc51yL%2Fsync%2F99382a6523743dfc2e66366ee3dcc42abdf5f5cf.png?generation=1613580992454141\&alt=media)
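
A fixed-block file has no line breaks or delimiters between records, so a reader has to cut the byte stream at a known record length. The Python sketch below illustrates the idea only; `split_fixed_records` and the record length used here are hypothetical, and in the lab the real layout comes from the copybook.

```python
from typing import List


def split_fixed_records(data: bytes, record_length: int) -> List[bytes]:
    """Cut a byte buffer into consecutive records of equal length."""
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    if len(data) % record_length != 0:
        raise ValueError("data length is not a multiple of record_length")
    return [
        data[start:start + record_length]
        for start in range(0, len(data), record_length)
    ]


# Hypothetical 4-byte records, for illustration only.
records = split_fixed_records(b"AAAABBBBCCCC", 4)
print(records)
```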

Go to the `Record options` tab and choose the `Binary` data format and `EBCDIC` as the character set.

![record options ](https://1138345240-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MEzw3DrGZcdmHwc51yL%2Fsync%2Ff4c3b6f1d0cfb991e5c759a3e9daa25e9d2d6f27.png?generation=1613580958582548\&alt=media)

Go to the `Records` tab and click the `Load` button, which gives us the option to specify a copybook.

![load copybook](https://1138345240-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MEzw3DrGZcdmHwc51yL%2Fsync%2F70aba36d4c9ddb5a2ae20b250889727939d8015e.png?generation=1613580966755881\&alt=media)

Choose the copybook that was imported in the previous step.

![find copybook](https://1138345240-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MEzw3DrGZcdmHwc51yL%2Fsync%2F49384ad887bb0d6996508a5df9a2c4f4c106a8e9.png?generation=1613580995667126\&alt=media)

Select the `>>` icon to use all fields from the copybook.

![specify fields](https://1138345240-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MEzw3DrGZcdmHwc51yL%2Fsync%2Fce215b9fca617557f7a9a307e3f2ada15b2400ac.png?generation=1613581002772708\&alt=media)

The `Records` tab should now show the various column names from the copybook.

![add records](https://1138345240-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MEzw3DrGZcdmHwc51yL%2Fsync%2F07028678570fa999ed114be95b832e2557a7bec1.png?generation=1613580988547765\&alt=media)

Double clicking on the `Peek` node allows us to map output from the `Complex Flat File`. Click on the `Output` section and choose the `Columns` tab.

Enter the following new columns:

* ID
* SHORT\_NAME
* CLIENTID
* COMPANY\_ID\_NUM
* COMPANY\_ID\_STR
* REGISTRATION\_NUM
* NUMBER\_OF\_ACCTS
* ACCOUNT\_NUMBER
* ACCOUNT\_TYPE\_X

![add columns](https://1138345240-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MEzw3DrGZcdmHwc51yL%2Fsync%2F34480549b4003b73f10cadcdd8ad68fe6b1ddbe7.png?generation=1613580979559599\&alt=media)
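
The column names above are the copybook field names with hyphens replaced by underscores (for example, `SHORT-NAME` becomes `SHORT_NAME`). A small Python sketch of that renaming is shown here as an illustration only; `cobol_to_column_name` is a hypothetical helper and not part of DataStage.

```python
def cobol_to_column_name(cobol_name: str) -> str:
    """Turn a COBOL field name into an underscore-separated column name."""
    return cobol_name.strip().upper().replace("-", "_")


for field in ["SHORT-NAME", "COMPANY-ID-NUM", "NUMBER-OF-ACCTS"]:
    print(cobol_to_column_name(field))
```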

Still in the `Output` section, click the `Mapping` tab and choose `Auto-Match`.

![auto-match](https://1138345240-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MEzw3DrGZcdmHwc51yL%2Fsync%2Fc4bda0ff3853fca3721cad12cb036757530057d1.png?generation=1613580987553048\&alt=media)

Double-clicking the `Sequential File` node brings up a single option: the filename. Enter a name such as `mainframe.csv`.

![file name](https://1138345240-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MEzw3DrGZcdmHwc51yL%2Fsync%2F40462bf44ff0fc01ad51fd5b453e70ff225814ab.png?generation=1613580966229372\&alt=media)

## 4. Compile, run, view output

Compile and run the job using the usual icons from the toolbar.

After running the job you can view the output from the Designer tool by clicking on the Sequential File node and clicking the `View Data` button. Click `OK` on the next dialog.

![output file contents](https://1138345240-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MEzw3DrGZcdmHwc51yL%2Fsync%2Fd80fd5b33d60118b5d343b3711a050cf14625e4b.png?generation=1613580986524451\&alt=media)

You'll be able to see one row of data with an ID, SHORT\_NAME and a few other fields.

![shows the data](https://1138345240-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MEzw3DrGZcdmHwc51yL%2Fsync%2F7fdec4f7dc0457413fe3f8a38f4edaabad50c648.png?generation=1613580994447445\&alt=media)

The output file is also written to the server. Switch to the server VM by clicking the first icon on the `Environment VMs panel` and selecting `iis-server`. Log in as the `root` user with the password `inf0Xerver`.

![Switch to server VM](https://1138345240-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MEzw3DrGZcdmHwc51yL%2Fsync%2F1220da958ddbc17ed19ad67ceb1e6c9639dd7561.png?generation=1597955733364644\&alt=media)

Change your directory using `cd` to the location where you stored the file.

```bash
cd /opt/IBM/InformationServer/Server/Projects/<project-name>/
```

Finally, output your results using the `cat` command.

```bash
cat mainframe.csv
```

![in a terminal](https://1138345240-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MEzw3DrGZcdmHwc51yL%2Fsync%2Feddb01e16c323f202fc2a9eec7ec8353565de41d.png?generation=1613580970432726\&alt=media)
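
If you would rather inspect the output programmatically than with `cat`, a minimal Python sketch could look like the following. It assumes the file is comma-delimited with double-quoted values and no header line. The sample row is made up for illustration and is not the lab's actual output, and `read_rows` is a hypothetical helper.

```python
import csv
import io
from typing import Dict, List


def read_rows(stream, column_names: List[str]) -> List[Dict[str, str]]:
    """Parse CSV text and pair each value with its column name."""
    reader = csv.reader(stream)
    return [dict(zip(column_names, row)) for row in reader]


# Made-up sample row standing in for the contents of mainframe.csv.
sample = io.StringIO('"1","EXAMPLE","CLIENT01"\n')
rows = read_rows(sample, ["ID", "SHORT_NAME", "CLIENTID"])
print(rows[0]["SHORT_NAME"])
```

To read the real file, pass an opened file object, for example `open("mainframe.csv", newline="")`, in place of the `io.StringIO` sample.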

**CONGRATULATIONS!!** You have completed this lab!
