Saturday, April 21, 2012

Business Intelligence and Data warehouse

So far I had been thinking to start blog on technical aspects in terms of Business Intelligence and Data warehouse, specifically on Cognos. It is said “It is never too late until you start”.
To start with basics, the first question that comes into mind is what is Business Intelligence or a data warehouse.
Business Intelligence (referred as BI) is somewhat a theoretical description of data ware house. BI corresponds to extraction of important data and providing it to the decision makers so that they can decide business strategy. Consider an example if I am owner of an Industry which produces and sells FMCG products. How do I get to know that which of my product is most liked by people or most sold? Which product is popular across different geographies and which product is not being liked by people? Likewise I as a part of leadership team would have plenty of questions and to decide company’s strategy across various geographies and other parameters I will need answers to such questions. The answer is provided by Business Intelligence. Business comprises of multiple tools, of them the most used are ETL and Reporting. Reporting tool provides exact data to the end user (Management team) in the form of spreadsheets, HTML report or PDFs.
Coming to Data warehouse, whatever stated in the above paragraph is practically implemented starting with data warehouse. All of us know what an OLTP (Online transactional processing) system is. For those unaware of it, OLTP is the primary database where actual data is stored. So if you purchase a face wash from a superstore and the person at counter makes entry it directly gets stored into the OLTP system. OLTP systems are pretty much complicated with excessive number of tables and columns and detailed information is stored in them this leads to the concept of Data warehouse. Data warehouse is a denormalised or say simplified version of the OLTP system. When I say simplified that means In a data warehouse only required tables and columns are used, most of the tables are merged to create proper Dimensions, calculations are performed to generate summarized data in the form of fact. Since data is summarized the queries executed on a data warehouse are less expensive as compared to those executed on the OLTP system. Moreover we as a best practice avoid playing with OLTP system for reporting purpose. However, there are specific cases where we do need to report out of OLTP system.
I have used some terms like Dimensions and fact in above paragraph, what do they denote. Dimension is entity with respect to which we measure the data, in simple term tables which store data like year, quarter, month, week, day (Time dimension)or Country, region, state, city (Geography dimension) or Managerial hierarchy or any other aspect for which we need to see the data.
Fact on other hand contains all the measurable data. Revenue, count, profit etc, all these details can be stored in a single fact table. A data warehouse in ideal scenario comprises of multiple dimension tables and a fact table. However, there are multiple cases where we get to see multi fact environment but that would be discussed in detail later.
This is just a quick overview of BI. Next we will start with data warehouse concepts. Along with theoretical concepts this blog will surely cover the complex scenarios faced by me and solution for same.