Thursday, March 27, 2014

ODI Series - Pulling file names and timestamp from directory

Recently, I was asked a question by my colleague, Linga (he always comes up with something or the other) - Can we pull file names from a directory and process the latest one out of them, while keeping the history of processed files ?

I could recall that one other colleague Parasuram PJP did something similar. When I tried out his code, that did not work for me due to some reason. Anyways, he was kind enough to share some pearls of wisdom and I sat down searching details about Jython to fetch file names and additional information. To find out the latest file out of all, we need to capture the timestamp of its last modification.

Since we need to keep a log of all files which gets processed, we will be maintaining a RDBMS table (SQL Server in my case) -  FILENAMES with following structure

Along with this comes a twist where initial part file names will be same having different suffix. Just to simulate the scenario, I could think of some files which were on my system used while exporting Essbase data. The directory looks something like this. We have multiple files with their names starting with 'DtDemo'. The task is to load these files into SQL table and then find out the latest file to process it.

This can be achieved by writing few lines of code in Jython, call it as a procedure and then put it in a package. 
We start with creating a new procedure to use Jython technology and add a step 'PullFile' to it

Define an option for procedure to store the source directory name - SOURCE_DIR

'Command on Source' uses the RDBMS schema where the FILENAMES table exists and 'Command on Target' uses Jython with code to pull filenames, timestamp for last modification. This information is pushed to SQL table.

The SQL table have all files with names starting with 'DtDemo'. I played around with 'DtDemoBasic.txt' to save it at different times to get different timestamps for this file. Every time the procedure is executed, it loads the names of the files with their last modified timestamp. The filename for the latest one can be pulled by doing a MAX on timestamp.

It was not less that Linga added another requirement to optimize his integration - Process the file only if the file got updated, else ignore it

All I could think of  - compare the timestamp of latest two files and then decide whether to process it or not. Lets create procedure which executes an update command to update the PRCSD_FLAG to 'Y' for the latest file. This procedure should get executed only when the timestamp of latest file is different from second latest file. We would do this check in an ODI  variable and then  call the procedure based on variable's value

The variable would be executed on the RDBMS schema with table FILENAMES and have the code to compare the timestamps of top first file with the top second file when files are arranged in descending order of timestamps

The complete flow can be arranged in a package with following steps:

  • Pull all files from directory in a table
  • Refresh variable with SQL to compare timestamps of top two files
  • Evaluate the variable to check if value is '0' (zero)
  • Execute procedure to update flag if variable value is zero

Jython code and SQL queries can be downloaded from here.

Hope this helps !!!

Sunday, March 9, 2014

Automation of SmartView reports

The most commonly used reporting tool with Hyperion setup is the Financial Reporting aka HFR. HFR brings with it, the ability to design reports using various datasources - Essbase, Planning, SAPBW, MSOLAP, HFM. It has been quite popular to burst static PDF reports using inbuilt scheduler or using the command line utility for Essbase, Planning and HFM. Along with HFR, SmartView is another tool used for reporting.

SmartView provide options to download HFR reports in spreadsheets and carryout analysis as most analyst love to do. One common requirement which HFR do not offer as an out-of-box solution, is the ability to burst reports in spreadsheets. This post delves into an alternate approach to cater such requirements.

The Smart View has been enhanced lately to include lot of new capabilities - Retaining Excel based formatting, butterfly reports, multi-grid reports, etc. - which combined with its macros functions help analysts design, automate and burst spreadsheet based reports in an easy and convenient way. Rather than focusing on new features, we would focus on its automation capability and take a small example to demonstrate the same.

Let's consider a SmartView template for the sample application Demo-Basic. The template pulls data for all accounts by periods for all products & all markets. The formatting includes few conditions and relative icons to give template some cosmetic look and feel. We can make sure conditional formatting is retained by enabling excel formatting from SmartView options. To automate refreshing of such templates, SmartView uses functions which can be embedded within excel macros. 

Navigate to Developer tab in Excel and open Visual Basic to open VBA editor.

The left pane of editor shows the project which consists of sheets and modules. Sheets refer to the tabs in workbook and modules store code. The modules are files with extension .bas, usually function library or custom code.

Before using the SmartView functions we need to import the library of SmartView functions under modules. Right click on Modules -> Import and navigate to smartview.bas. This file is located under SmartView\Bin folder, usually C:\Oracle\Middleware\SmartView\Bin. This gets imported as SmartViewVBA
We need to add a new module where we will use these SmartView function as excel macros. Right click on Modules -> Insert -> Module and name it anything. We named it Module5.

The code to retrieve sheet is shown below
  • The code first checks existence of connection with name "MyConn". We can name it anything, more appropriately application-dbname
  • In case the connection does not exist, it creates a connection
  • Login to Essbase using the parameters passed
  • Retrieve sheet and logout. We can all multiple sheets to retrieve

In order to schedule it, we create a VBScript file which makes call to the subroutine "RetrieveSheets" and refresh sheet(s). 

We can also add some VBScript code to email the spreadsheet to users (code not included here). You can download these files here.


Run_SmartView.vbs.txt: Remove the .txt extension (I added it to avoid being blocked by server).
This is a vbscript file which makes call to the Excel sheet and executes the Macro embedded in it. You are required to update the path of Excel spreadsheet, username, password, appname & dbname in this file. This username, password, appname and dbname will be passed on to the Excel macro when it is called. Double clicking this file or calling it from cmd will run it. Right click -> Edit to edit file

SmartView_Retrieve.xlsm: Macro enabled spreadsheet. Please make sure you have macros enabled to run and view code.

Hope this helps !!!