Recently I went through the website of Qliktech and read where it says "And unlike traditional BI, QlikView delivers immediate value with payback measured in days or weeks rather than months, years, or not at all.". It made me think for a while. what are they referring to ? Do they mean Qlikview as a BI solution can be implemented in weeks? Are they talking only about small department level deployments or just the development phase of the BI project life cycle or is it just a sales pitch? I am just wondering how much part a tool plays in a BI project Implementation life cycle and how does a tool effect the development phase.
Whether you use Qlikview or some traditional BI like OBIEE for that matter, about 70% of the project life cycle remains unaffected or untouched.
Then I thought about my recent Oracle BI Project. If we would have replaced OBIEE with Qlikview, will the whole project gets implemented in weeks instead of months or years. I don't think so.
Well, The topic is BI Project Implementation Life Cycle and not Qlikview.
I have worked on some small Qlikview project Implementations which generally span from 2-4 Months and also large OBIEE implementations ranging from 9 months to close to 1 year.
My latest assignment was a large Oracle BI (OBIEE) Global Implementation Project.
I would like to share my knowledge and learnings from the current as well as previous projects. I will also discuss how large deployments differ from smaller ones and also the different phases involved in a BI project.
Big deployments generally includes ETL as well as BI and takes somewhere around 9 months to 2 Years to implement the solution. But when we compare it with small deployments using tools like Qlikview, it has a small life cycle ranging from somewhere around 3 months to 6 months because they generally do not involve a datawarehouse.
There are various reasons for not having a datawarehouse like
1. Time and Cost: For smaller organisations the IT budgets are generally small. Including Datawarehouse in a BI initiative increases the project cost significantly. The organisations needs to Purchase additional license for a new database. Even if they already have a database in place, it can not be used for various reasons and is not recommended as well. The organisation also needs to purchase a license for a ETL tool and required hardware. Other than the software and Hardware, the cost is involved in hiring resources for database and ETL.
2. Small data size: As small organisations do not have large or huge data sets (Even if they have multiple applications) and hence generally neglect the relevance and importance of having a datawarehouse in place. Not only it complements BI by providing faster response times for BI end users but also it reduces the development cycle of the BI by reducing the level of complexities of the logic. Using a datawarehouse helps in having much simpler BI data models which requires less development time and maintainance.
3. Less complex data models: Small organisations generally have small customised business applications running which do not have very complex OLTP data structures and hence it becomes relatively easy to design BI data models eliminating the need for a datawarehouse solution.
4. Less number of Users: Organisations having smaller number of BI users (Using the application at a given point of time) generally have this option of building their BI solution directly on top of their OLTP database. They can either allow BI users to directly query the OLTP database which in some way reduces the performance of the business application users already connected or they can use tools like Qlikview or Hyperion Essbase which employs this technique of storing the data in their proprietery data file formats and allow a disconnected type analysis. The later one is a much better option as it provides a much faster performance as against querying an OLTP database as for some reason unknown to me these proprietary data files are highly optimized to be used by their respective applications and also this do not impact the performance of the OLTP application.
Now let's discuss a typical BI Project life cycle which comprises of following phases:
1. Project Initiation and Analysis:
For any project to start a business need is a must. for example "replacing an old customized reporting system with a new reporting and analysis system which is more flexible, cost effective, fast and requires less maintainance" or "Need to consolidate data from disparate sources and a common standardized platform for reporting and analysis" could be a business need.This business need is evaluated at a very high level as to how critical is the need and how it well it falls in line with the strategic objectives of an Organization. Formally a Project manager is identified and appointed who further investigate and perform some first level of analysis.
It is then Project manager who creates a document with details like Business case, initial risk assessment, scheduling and budgeting, stakeholders identification etc taking help from the available resources and then a formal approval is taken.
2. Planning and Designing:
Requirements gathering is the most important and critical part of the whole project. A requirement document is created for this purpose and all the requirements are documented in much details. The requirements are finalized and then project scope is created. To fulfill these requirements, various tools are evaluated and whichever is the best fit is selected. Now the team resources are identified. All the logistics, hardware and software requirements are identified and procured.
Data models are designed and it is documented in a Technical design document which also specifies other technical details like features and functionalities of the design, security and authentication, best practices, repository variables, layers, database connections, connection pools etc etc..
Prototypes are designed to show the features and functionalities as documented in the requirement document and is formally approved. Project kickoff.
Before starting the development of the data models and reports, the project scope is divided into small more manageable modules or batches.
It is a good practice to divide it on the basis of functional areas and subject areas.
So let's suppose we decided to do it for functional area Human Resources among other functional areas like Sales and Distribution, finance, Inventory and Manufacturing etc under which we are planning to create different subject areas like
Employee Compensation and Bonus, Learning and Development, Employee movements and transfers etc. You may have one data model for complete HR or seperate data models for each subject areas.
For smaller organizations with less complex OLTP data structures, it is possible and feasible to have a single data model for complete Human Resources.
For large and complex OLTP structures, it is generally not possible as otherwise the size of the fact table will be extremely large horizontally as well as vertically. This will give an extremely slow performance as well as from maintainance perspective also the time taken to load the fact table will be more and unpractical.
Once we decide on our strategy for the development, we start with developing the data model as per the designs created in the previous phase i.e Planning and Designing.
The data model is developed generally in layers. In Oracle OBIEE, there are three layers and Cognos allows you to build as many layers as you wish and BO provides 2 layers(I am not very sure on this and would request some comments on this).
In Qlikview, we can make it single layered or 2 layered by renaming the column names in the script.
For all practical purposes, upto 3 layers is a good idea but you may agree or disagree on that. Based on your
requirements of maintainance you can decide on that.
OBIEE has 3 predefined layers namely Physical Layer, Business and Modeling layer and Presentation layer.
Physical layer is where we simply make connections to the database and import the metadata for database objects like tables, columns, views, primary and foreign key relationships. Now we do not make any changes related to changing the names of the columns which help the administrator and developers from maintainance perspective.
Based on the available physical layer objects we create our OLAP models in Business layer by defining dimensions, facts and Hierarchies.
In the presentation layer, we categorize the objects based on subject areas from the objects available in OLAP model in Business layer. We rename the objects present in Presentation layer from end users perspective or business terminology.
This whole process really helps the developers to understand and visualize the complete model and saves lot of time in debugging or day to day maintainance activities. This process oriented approach is again an attempt to divide and rule and making our life a bit simpler.
Once you are done with your model, the next step is to start developing your reports and bringing in the identified resources for report development into the team.
The reports based on the subject area are divided among the team and with the help of report specifications available in Technical design document created in the previous phase.The reports are generally designed by report developers.
While the report designers are engaged, the data model developers work developing the next subject area. Initially the team size is less and as the work keeps on growing, more people are added in the team.
The development also includes setting up the object level as well as data level security , creating users and groups and creating variables as per the technical design document.
Testing is one of the most critical phase and also sometimes most time consuming phase of a BI project Implementation.
There are 3 types of testing done on a project namely Unit Testing (UT), System Integration Testing (SIT) and User Acceptance Testing (UAT).
Unit Testing is generally done by the developers and they test that the code or report they have developed is working as per the requirement or specifications. This is performed in Development environment where Developers had developed the
report. Developers prepare a test case plan for themselves listing all the cases they would like to test. These cases could be testing the font size, color, spell check, prompts, filters, data values etc.
Sometimes developers may exchange the reports with their team members to perform a peer unit testing and this is a good practice as it is little easier to find out mistakes in other's report than your own.
Once Unit testing is complete, the code (data models and reports) are transferred to Test Environment.
This Testing Environment is generally similar to the production environment but that is not the case always. Having the test environment same as production environment allows us to anticipate the performance and behavior of the developed solution in a much better way.
Once the code is transferred to Test environment, System Integration Testing is performed. SIT checks how all the individually pieces works collectively and are integrated well to produce the desired output. This test is performed by the IT team members or by identified testers from the client side. However before they perform the test a sampling based dry run is required to be performed by the development team.
Once the testing team start testing the application, they put all the defects in a defect log sheet mentioning the
details of the defect.
At this point of time, it is recommended to appoint some dedicated members from the development team to fix those identified defects and update the defect log sheet. While this activity is going on, other team members are assigned next set of development work and they keep working on developing next batch of reports. It may happen that the same team will fix the defects by allocating some portion of their time and rest of the time in developing next batch reports. But this may bring some imbalance or turbulence in the system as it will become very difficult to really work on two things simultaneously. Bug fixing involves lot of coordination with ETL team as well as testers and sometimes consumes time more than what was anticipated which ultimately may impact the development activities. Having a dedicated team for bux fixing activities would be very useful and effective.
Once all bugs are identified and fixed, the Business users are asked to perform the User acceptance test. The test cases are prepared by business users and they check if all the requirements are fulfilled and they really getting what they want. Here business users compare the result set with the result set from their existing system and verify the results.
One of the biggest challenge in SIT or UAT is if any data related inaccuracies are found, it becomes really difficult to find the root cause. The developer needs to check which version is true. There may be some internal logic or transformations or formulas applied in the existing application and this analysis consumes whole lot of time requiring lot of coordination with the team supporting the existing application or system.
After the code is tested and verified completely, it is transferred to the production environment and opened for the real end users. Before that general awareness sessions and training sessions are held for the end users to use the new system. For some time the new system is put on a stabilization period (Generally ranges from 15 days to 2 or 3 months but it could be even more) where in case a bug occurs, it is fixed by the same development team.
During this time the new system or application is made to run parallel with the existing system or application and once the stabilization period is over, the old system is replaced partially or completely.
Once the stabilization period is over and the system gets stabilized, the support team is provided with all the project related knowledge and the development team is rolled off.
The implementation of BI Project gets over.
Please feel free to share your comments.