When working with file data in TIBCO BW, you will often come across large files that contain duplicate records, while you want to process each record only once. In this step-by-step TIBCO tutorial, I will explain how to read data from a file, parse it based on a Data Format, and then use a For-Each-Group statement in a Mapper to remove the duplicate records.
We have a file (orders.txt) that contains order data as Product Name and Product Price in comma-separated format, as shown below:
As you can see in the example data above, there are duplicate entries for products such as mobile and watch. We will write a TIBCO process that reads this data from the file, parses it, and then uses a Mapper to produce duplicate-free data.
Let’s proceed step by step with the tutorial.
Step 1: Create Data Format for Comma Separated File Data
Since we will be reading comma-separated data from the orders.txt file, we need to define a Data Format in our project using the Data Format resource from the Parse palette.
In the Configuration tab of the Data Format, specify Delimiter Separated as the Format Type and Comma (,) as the Column Separator. Also choose Carriage Return/Line Feed as the Line Separator, as shown in the screenshot below:
Next, move to the Data Format tab and define the format of the data. In our case, each order has a Product Name and a Price value, so we define the Data Format schema accordingly, as shown in the screenshot below:
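To make the Data Format settings concrete, here is a minimal Python sketch of what a delimiter-separated format with a comma column separator and a CR/LF line separator amounts to. It is not how BW implements Parse Data. The `Order` class, the `parse_orders` function, and the sample rows are hypothetical, and Price is kept as text because the field type chosen in the schema is an assumption here.

```python
from dataclasses import dataclass


@dataclass
class Order:
    product_name: str
    price: str  # kept as text; the actual field type is an assumption


def parse_orders(text: str) -> list[Order]:
    """Split text into lines on CR/LF and each line into two comma-separated fields."""
    orders = []
    for line in text.split("\r\n"):
        if not line.strip():
            continue  # skip blank lines, e.g. after a trailing line separator
        name, price = line.split(",")
        orders.append(Order(name.strip(), price.strip()))
    return orders


# Hypothetical sample content, not the actual orders.txt from the tutorial
SAMPLE = "mobile,500\r\nwatch,200\r\nmobile,500\r\n"
orders = parse_orders(SAMPLE)
```

Each parsed row becomes one record with a product name and a price, and repeated rows are still present at this stage.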
Step 2: Create TIBCO Process to Read File, Parse Data and Remove Duplicate Records
Create a new process in your Designer project. Add a Read File activity to read the orders data. In the Input tab of the Read File activity, specify the full path of the file to be read, as shown below:
After the Read File activity, add a Parse Data activity from the Parse palette to your process. In the configuration of the Parse Data activity, choose the Data Format that we created in Step 1, as you can see in the screenshot below:
The next step is the one that actually removes the duplicates. We will add a Mapper to our process and then use a For-Each-Group statement to remove duplicate orders from the data we have already parsed.
In the Input Editor tab of the Mapper, define a complex element containing the order detail elements (ProductName and Price), as shown below:
Now go to the Input tab and right-click the Order element. Choose Statement -> Surround with For-Each-Group. Then map the Order element from the Parse Data output to it. Since we want only unique orders based on the product name, we use the ProductName element in the Grouping field. Also map the ProductName and Price elements to the schema.
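The effect of grouping on ProductName can also be sketched outside BW. In XSLT 2.0, which the Mapper's For-Each-Group statement corresponds to, groups are processed in order of first appearance and the context item inside the loop is the first item of each group, so the mapping keeps one record per product name. The following Python sketch mimics that behaviour; the function name and the sample records are hypothetical.

```python
def first_per_group(records, key):
    """Return the first record of each group, in order of first appearance."""
    groups = {}  # dicts preserve insertion order (Python 3.7+)
    for record in records:
        groups.setdefault(key(record), []).append(record)
    return [members[0] for members in groups.values()]


# Hypothetical parsed orders as (ProductName, Price) pairs
parsed = [("mobile", "500"), ("watch", "200"), ("mobile", "500"),
          ("laptop", "900"), ("watch", "200")]
unique = first_per_group(parsed, key=lambda order: order[0])
```

Because the grouping key is the product name alone, two rows that share a name but differ in price would also collapse into one, with only the first row kept.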
The complete input mapping is shown below:
This completes the configuration and mapping for all the activities in our process. The complete process now looks like this:
Now let’s proceed to the next step, in which we will run the process in the Designer tester and see the results.
Step 3: Test TIBCO Process to Remove Duplicate Data
Load the process in the Designer tester so that a job is created for it. As you can see below, the process has run successfully and Parse Data has output all the orders from the file (duplicates included):
And now, if you look at the output of the Mapper, you can see that it has produced duplicate-free output:
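A quick way to confirm the result outside the tester is to look for product names that occur more than once, before and after the Mapper. The sketch below is a hypothetical check in Python, with made-up records standing in for the Parse Data and Mapper outputs.

```python
from collections import Counter


def duplicate_names(records):
    """Return product names that occur more than once, in order of first appearance."""
    counts = Counter(name for name, _price in records)
    return [name for name, n in counts.items() if n > 1]


# Hypothetical stand-ins for the two outputs seen in the tester
parse_data_output = [("mobile", "500"), ("watch", "200"),
                     ("mobile", "500"), ("watch", "200")]
mapper_output = [("mobile", "500"), ("watch", "200")]

before = duplicate_names(parse_data_output)
after = duplicate_names(mapper_output)
```

An empty list for the Mapper output indicates that every product name appears only once.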
This completes the tutorial. I hope it was beneficial for you and you liked it. Feel free to contact me for any further help.