Picking Up Where We Left Off: The Financial Reporting Cube
As we recall, we set out in Part I of this article to build a basic Financial Reporting cube. Along the way, we discussed some of the high-level requirements one would expect to find in standard financial reporting, including balance sheets, income statements, accounts receivable and accounts payable reports. Beginning with the creation of a core Expense cube, we looked at the components of a cube that could be used to partially support a simple Income Statement, introducing new concepts to our growing knowledge about Analysis Services as they became useful. Throughout the lesson, we re-traced many of the common steps we transition in building any cube, as we began the construction of a sample finance cube for our hypothetical financial reporting department.
In Part I we performed setup of the core cube, focusing first on the expense side of the Income Statement. As we noted, the sample database that we used does not include a "financial statements fact table;" it focuses more on the revenues side of the equation in the construction of the ubiquitous Sales cube that it presents (and that virtually all subject matter authors select) as a handy model for tutorial and other purposes. Our focus is the construction of a cube in a way that reflects reality - and that presents scenarios where the data that we want to analyze are scattered in various tables throughout the database.
As we begin the second half of the lesson, our next step is to pull the sales data into our Financial Reporting cube, and show how we can then integrate the expense and revenue data in a way that revenues are matched with the direct costs of generating those revenues. (Sales are matched with the costs of sales, i.e. the "direct" costs of goods that have been sold, etc.), and netted together to result in a Gross Margin amount. Other Expenses (often called "fixed" expenses) then need to be subtracted out of the Gross Margin to come to the overall Net Profit or Loss.
Our objective is to illustrate the construction of a basis, in the form of an OLAP cube, for the production of meaningful statements that reflect an organization's operating results. There are many approaches available to us to accomplish our objective within MSSQL Server 2000 Analysis Services, using only the FoodMart 2000 database provided as a sample in implementation of MSSQL Services Analysis Services. The route we take, while perhaps circuitous from the perspective of many, will represent an attempt to cover various aspects of one general approach. There are certainly other, perhaps more efficient / performance oriented methods for combining data from diverse sources, some of which we will mention as we progress.
Adding the Revenue Component
As we have mentioned above, financial reporting often pairs Revenues / Sales with the associated Cost of Goods Sold, to arrive at a Gross Margin. As many of us learned in school, an Income Statement, upon which this article will focus, displays the income (in this case, used synonymously with the term "Gross Margin"), expenses, and net earnings / net loss for a particular period of time, such as a three-month period (a fiscal quarter) or an annual period (a fiscal year). While income statements can become quite detailed and sophisticated, depending upon the needs and reporting requirements that drive their production, we will use a simple format within this article, so as to focus more upon the OLAP aspects, and less upon accounting terminology and nuances.
To simplify, the Gross Margin represents the gross profit from sales activity. This gross profit then has operating expenses (the "fixed" costs of operating the business) applied. After subtracting these operating expenses (which include overhead, among many other possible costs) from the contribution made by the Gross Margin, we can determine the Net Income / Net Loss of the operation. In our simple income statement structure, the Gross Margin will be assembled from data in the sales fact table (in the form of store sales and store costs - assumed here to be directly related, and appropriately matched from an accounting perspective, for the sake of simplicity). We will then derive Net Income from the combination of the Gross Margin and the Expenses that we intend to derive from the Expense cube we created in Part I of this lesson.
Suffice it to say that these, as well as far more complex, accounting treatments can be complicating factors in the creation of a financial reporting cube. Our case, however, is a great illustration of the steps we need to take to assemble a reporting cube from multiple tables - a scenario that is pervasive in the business world. Many large clients with whom I have acted as a consultant, such as brokerages and large manufacturing concerns, maintain extensive sales data in one or more data stores and detailed operating expenses in completely separate stores, for both analysis and operational purposes. The complexity only grows when multiple operations are combined to consolidate financial operations on a yet higher level, and so forth. While centralized data sources (such as warehouses or data marts) are typically constructed to provide a more unified reporting platform, as often as not, I come across scenarios much like that in our lesson, where parts of the reporting whole must be drawn from multiple sources into an OLAP cube. MSSQL Server 2000 Analysis Services provides excellent options for handling these scenarios, as we will see in this article and throughout this series.
Creating the Cube - Revenue Focus
Because our intent is to "marry" the Revenue and Cost of Goods Sold (and the Gross Margin that results from the combination of the two) with the Operating Expense that is summarized in the Expense cube, we need to plan a cube structure for revenues that mirrors the structure, at least with regard to the elements upon which we seek to report, of the Expense cube. When we review the Expense cube as we created it in Part I, we see the following structure:
Illustration 1: The Schema for the Expense Cube
Now, let's take a look at how sales_fact_1997, the table containing the revenue data from which we will build the core Revenue cube, stacks up to the structure of the Expense cube, from the perspective of dimensions and measures, by reviewing the table structure, as displayed in Illustration 2 below.
Illustration 2: The Design View of the sales_fact_1997 Table (FoodMart.mdb)
It becomes readily apparent that there are differences that must be overcome in creating a Revenue cube that closely resembles the Expense cube. This is probably why most of the articles and books that are readily available to us, at this youthful point in the life of Analysis Services, tend to supply surrogate databases with tables that make objectives like our own easier to accomplish, or to simply focus on simplistic objectives that can be accommodated conveniently by the existing FoodMart 2000 tables. The fact is that the requirement we have identified in this comparison, a requirement to align divergent sources, is a reality that most of us face everyday in the real world, for the reasons we have noted earlier, and a host of other drivers.
My focus in this and other articles is to provide practical options for dealing with requirements that approach reality - while constraining the lessons to the sample databases and other objects that are common to anyone who would be taking the time to join our sessions - not to simply teach what is "easily possible" within the FoodMart sample, or to require anyone who participates to purchase or otherwise obtain a database that has been, in effect, created to produce the desired results easily. Limitations of the samples aside, it is still possible to illustrate many concepts in a realistic manner without reliance upon outside tools or artificial environments.
To handle the situation confronting us, we might take numerous approaches, but one that I found both straightforward and "self-contained" within the FoodMart 2000 database and Analysis Services will be presented. To restate the detailed objectives in building the Revenue cube, we need to create a cube whose structure resembles the Expense cube with regard to any dimensions or measures that will exist in the ultimate central cube. The Revenue cube would, on the surface, appear to need the following fields to mirror the Expense cube enough to align itself for "apples to apples" reporting purposes:
We can safely forego further consideration of currency_id, as we will assume, for the sake of our lesson, that both tables are denominated in the same currency. The data fields above will be useful to us in Revenue cube design and creation, however, and would appear to need to be addressed. Let's discuss each item briefly to understand the nature of the road ahead more fully.
First, the account_id field simply does not exist in the sales_fact_1997 table. However, we know that all the dollar amounts contained therein either represent sales, or the costs of the sales (Costs of Goods Sold, in our earlier discussion). The Expense cube relies upon the accounts table in the FoodMart 2000 database to provide the data used in the definition of its account dimension. The simple chart of accounts used in the company's reporting is represented in the table view of the account table, as partially pictured below:
Illustration 3: Table View of the account Table (FoodMart2000.mdb)
We can reasonably assume that the accounts assigned to the summarized cost of sales for a given date-row is matched to the corresponding sales from the same date. We can also see, from the chart of accounts in the account table above, the account to which the sales are assigned (account_id 3100, Gross Sales), together with the account to which Costs of Sales are assigned (account_id 3200, Cost of Goods Sold). We therefore know the account_id to assign to each value. However, herein lies our next complication.
The values themselves appear on the same row within the table, which means we cannot assign a single account_id to each line to allow us to become comparable with the layout of the Expense cube, which is generated from a fact table where one account_id exists per row. A breakout of the data into a more accommodating fact table design would be desirable, of course, but perhaps not be an option. I encounter more complicated renditions of this issue within many client engagements, particularly where the source system happens to be a relational or hybrid database. Often these clients do not want to hear that they "need a data warehouse." Whatever the scenario within which this complication is found, we will explore one way of approaching it, realizing that many others are within reach, but with an eye toward working within the constraints and data structures that I have outlined earlier.
Next stop on the list of alterations for alignment of the Revenue cube to the Expense cube is the assignment of the name "amount" to the measure, versus the currently assigned "store_sales" or "store_cost." Once the values are rearranged in separate lines, we will see that this is simple enough.
Category_id does not exist in the sales_fact_1997 table, although this would be simple enough to add. We will forgo this item, however, as we can see (through a review of the category table) that, although it appears to have been designed to contain ACTUAL, BUDGET, and other such classifications (and probably does for a year with which we are not concerned for our present purposes), a quick review of the expense_fact table reveals that only ACTUAL appears to be used anyway. As we don't stand to gain much from adding this to our Financial Reporting Cube (keep in mind that an added dimension often means an exponential increase in processing requirements, etc., especially as the warehouse / database grows over the years), we will simply pass on adding it to the modifications list we are compiling. Finally, the store_id and time_id requirements can be obtained from the existing sales_fact_1997 table without further manipulation.
Preparing the Source Data for the Revenue Cube
In order to meet our outlined objectives of working within the existing FoodMart 2000 database, we will visit the Access environment briefly to establish a means of creating a suitable "virtual" fact table to support the desired Revenue cube design.
As most of us are aware by this point in the series, FoodMart 2000.mdb is located, by default in the [Install Directory]:\Program Files\Microsoft Analysis Services\Samples directory; Once we have located the database on our PC's, we will navigate to it, and open it in MS Access, where we see the layout shown in Illustration 4 below.
Illustration 4: Selecting the FoodMart 2000 Database