### Introduction

In The CROSSJOIN() Function: Breaking Bottlenecks, we introduced our efforts to enhance CROSSJOIN() by discussing how the function can contribute to a degradation in processing for queries that rely upon it in scenarios where we encounter medium- to large-sized data sets. We acknowledged the utility of the CROSSJOIN() function in scenarios where we wish to generate a cross-product of members in two different sets, noting that the capability that it gives us to specify "all possible combinations" is convenient - indeed, the most straightforward way to perform such a combination of two sets.

As we noted next, however, the indiscriminate use of the CROSSJOIN() function, like many other MDX functions, can slow reporting and analysis dramatically. The consequential degradation of processing, we added, is often due to a failure to understand how the function performs set combinations, and how its action can lead to huge results datasets when applied to large cubes.

NOTE: For detailed introductory information on the CROSSJOIN() function, see my Database Journal article MDX Essentials: Basic Set Functions: The CrossJoin() Function.

In The CROSSJOIN() Function: Breaking Bottlenecks, we discussed that, in combining two sets, CROSSJOIN() combines every member of the first set (all from a single dimension) with every member of the second, creating a "Cartesian" effect as a result. We further noted that combining two sets, for example, with the following query illustrates a scenario, (on the scale of the small Warehouse sample that installs with MSAS), where we experience an appreciation for the consequences when the number of members in set 1, times the number of members in set 2, times the member population in set 3, results in many combinations.

```
SELECT
{[Measures].[Warehouse Profit]} ON COLUMNS,
{CROSSJOIN([Warehouse].[Warehouse Name].Members,
CROSSJOIN([Store].[Store Name].Members,
[Product].[Product Name].Members))} ON ROWS
FROM
[WAREHOUSE]
```

We stated that this query would generate 898,560 combinations (24 individual Warehouses times 24 Stores times 1,560 distinct Products). The number of combinations in a cube of the vastly larger sizes that occur in today's business environment can obviously be crippling to performance. This is aggravated by the fact that the sparsity that is prevalent in large cubes naturally leads to CROSSJOIN() results with an even higher sparsity factor. (The results dataset produced by the above query yields only a tiny fraction of combinations with non-empty measures.)

We learned that the process of generating all possible combinations, empty or not, lies behind the performance drag, and becomes even more pronounced when we attempt to perform query operations that must wait for the combinations to be assembled, and then be applied to the resulting dataset. The time consumed in assembling a large number of empty combinations in these scenarios is largely wasted when a large percentage are "tossed" in a subsequent step in the march toward the ultimate results. We will see an example of this within this article, where we will begin with a query that is suffering from the kind of degradation we are talking about.

For purposes of our practice procedure, we will assume that we have been asked by management of a hypothetical client, once again, to investigate degradation in performance of a query. The query was originally constructed at the request of a group of information consumers in the Corporate Planning department, shortly after the implementation of MSAS at the FoodMart organization. The creator of the query, who initially wrote the MDX in a way that seemed intuitive, intended to optimize it later. When his position evaporated with the movement of many IT functions to an offshore organization, the query was left in its original, suboptimal state.

Attempts to communicate with the offshore support team were abandoned when it was learned that the building housing the group was destroyed in a recent regional disaster. While most of the documentation, code, and other collateral for in-process enterprise projects, together with corporate financial information and customer information files, has been reported missing, we are able to locate the query details among several files left by the original author on a development server.

We discuss the requirement with the information consumers, and gain insight into a potentially important factor that might affect our optimization strategy. According to the consumers, the production version of the Warehouse cube under consideration contained an additional calculated member, added to meet a specific reporting requirement. We develop a plan to examine the query under consideration, before offering options for improving its performance.

As is often the case, we decide to work with a copy of the development version of the Warehouse cube, to allow the original to remain isolated. It is to this copy of the cube that we will make the modification required to bring it into alignment with the cube under consideration, adding the calculated member to the clone in preparation for our optimization exercises.