Getting Started with SQL Server 2012 Cube Development
上QQ阅读APP看书,第一时间看更新

Choosing the deployment mode

As you understand, the choice you make about the deployment mode is very important, it is a choice that you make during installation of the server.

So what determines the right model for you? From a user perspective, the models look the same; the user can use both of them from any of the user tools that can connect to Analysis Services. The difference between the models is mainly something that the developer gets exposed to. As you saw in the architectural diagram previously, there are some differences on the languages that you use when developing the logic and the queries.

In the multidimensional world, you use a language called MDX (Multidimensional Expressions) to query the model and to write the business logic. MDX is a hard language to learn but it is extremely powerful when you really master it.

Tip

More information on how to use MDX to query your cube will be covered in Chapter 5, QueringYour Cube, in this book.

In the tabular world, on the other hand, you work with tables and relations just like you would in a relational database. The native query language is not like SQL in the relational database, but instead a language called DAX (Data Analysis Expressions). This language is a formula-based language that is very similar to Excel formulas.

To most novices DAX is much easier to start with since it is so similar to Excel formulas, and almost every developer knows how to work with a relational database.

Based on this, you may think that choosing the deployment mode is an obvious decision favoring the tabular mode; however, what really determines the model of your choice is the business problem that you want to solve. The multidimensional model contains more possibilities for the developer and allows for more complex tasks.

The following table describes the differences between the multidimensional model and the tabular mode:

To the new user, the features in this list may be hard to understand, so I will take some time explaining what all these functions and features do, and why you would use them in some of your projects.

Actions

When building a BI solution there is sometimes a requirement that you should be able to invoke an external command from within the analysis program. This is something that you have the ability to do by using the feature called actions in multidimensional mode. You define the action as part of your cube; as an example you can invoke a web page or report from your action.

The most common action is a drillthrough action that allows the users to go from an aggregated level in the cube directly down to the lowest level of the cube. The developers have the ability to define which measures and dimension attributes should be returned to the user when executing the action. This can be extremely useful to a business user who quickly wants to see which values build up a certain aggregation.

In the previous screenshot, you can see how to invoke a custom action from within Excel 2013. When this is done, an additional sheet will open in Excel showing the first 1000 rows of the values building up the aggregation.

If this type of functionality is important to you, then your only choice today is the multidimensional model.

Aggregations

The tabular model is built to be an in-memory model that has the ability to quickly scan and process billion of rows, and aggregate the values quickly when the user queries it. The multidimensional model, on the other hand, is a file-based model that needs aggregations to work well. This means that the developer defines pre-calculated values that will be used by the engine when users query the model. Adding more aggregations to the cube both increases processing time as well as how much storage is used by the cube, so the developer needs to be careful when defining the aggregations, and define it on the levels that the users query the most.

In many cases, the in-memory model can be much quicker than the multidimensional model. This is especially true when users often query a multidimensional model on a level that has no aggregates. In other cases, the multidimensional cube is preferred for performance reasons; for example, when your users only query the cubes on aggregated values, or if you have a large amount of data that will not fit into the memory of your server.

Custom assemblies

In the multidimensional model, you have the possibility to run custom assemblies. Custom assemblies are a way to extend the functionality of Analysis Services. You can write .NET programs that can perform a specialized task that will be loaded into and executed by Analysis Services. An example of common custom assemblies are specific security models that cannot be implemented easily with the normal security mechanism in Analysis Services, but that can be solved using .NET code.

Custom rollups

In cubes you have the possibility of defining measures, which is a calculation on a value. Most often they use common formulas such as sum, average, or count. In some cases, you may want to add custom rollups that change the way a measure is calculated on a specific level. As an example, think of a measure summarizing a value over a time dimension; for business reasons, you may want to have a specific formula for the year level and not a summary of all the months or quarters. Custom rollups allows you to write such logic and they only exist in the multidimensional world.

Distinct count

In multidimensional cubes, you have the possibility of creating distinct count measures. This special type of measure counts the distinct values instead of counting all the values. In the tabular world, you have to define this using a simple DAX formula instead. The formula is very simple and looks like the following: [Measure] = DISTINCTCOUNT(Table[Column]). In many cases, distinct count measures in tabular models can be much more effective than in the multidimensional world.

In multidimensional models, there are a lot of special considerations you need to take in order to get distinct count to perform well. For more information about this, please see the following whitepaper: http://www.microsoft.com/en-us/download/details.aspx?id=891.

Linked objects

In Analysis Services' multidimensional cubes, you have the concept of linked objects. They allow you to include dimensions, or measure groups from other databases to a cube without duplicating the data. This can be useful if you want to include objects that could be important to end users analyzing your cube without wasting disk space. This concept is missing in the tabular mode, where everything is stored compressed in the memory of the server.

Many-to-many relationships

To many the most crucial feature that is missing from the tabular cubes, is the ability to create many-to-many relationships between tables. This is something that, in many cases, pushes the project towards using multidimensional modeling instead. However, even though it is not officially supported, there is a possibility of creating a model containing many-to-many relations through DAX scripts. There are several blogs on the web that describe how to do it. One of them is the following: http://www.sqlbi.com/articles/many2many/.

Parent-child hierarchies

Parent-child hierarchies are dimensions that contain hierarchies that reference themselves through a self-join; this is a feature that is missing from tabular models. In the multidimensional model, you have it as a native dimension type, and it can be used to describe objects such as an organizational structure. However, in the tabular world, you have the ability to create a parent-child hierarchy through DAX code. However, the code gets very complex as can be seen in the following blog post: http://www.powerpivotblog.nl/powerpivot-denali-parent-child-using-dax.

Translations

In many global organizations, there is a need to support multiple languages to support users in different countries with the same solution. In the multidimensional cubes, you have the possibility of defining a translation of the metadata in the cube as well as defining text values, such as the dimension attributes. The text values should point to different values depending on what language the user has on its client computer. This means that you can build one cube that can contain several representations of the cube structure, but only have the data once. In the tabular model, this is something that is missing, if you have the business requirement that you need to support several languages you have to create different models for different languages.

Writeback

One important feature that you have in multidimensional cubes, is the possibility of writing back values from the client to the cube and even to the relational database. This can be used to create solutions such as budgeting and planning applications, or to facilitate what-if analysis against data that resides in your cube. This is a feature that is missing from the tabular model and there is no real way of working around it, except for using the client-based tabular model that you have in Excel through PowerPivot, and using a feature called linked worksheets. In this case, you do not write the data down to the database, instead you are writing the value into the Excel workbook.

Note

If you want to understand how to create a writeback enabled application, refer to Chapter 6, Adding Functionality to Your Cube.

Tool support

When it comes to the different end-user tools that can be used to query Analysis Services, most of them work both against multidimensional cubes as well as tabular models. There is only one exception and that is a new tool from Microsoft called Power View. Power View is built to be a very interactive analytic tool that allows end users to slice and dice data in a very visual way. Charts and objects are automatically linked to each other; this means that if you filter one chart, all other related objects on the report will be filtered as well. You also have the ability of using visualizations, such as maps and moving scatter charts when creating your report. The following screenshot of a sales analysis dashboard contains several of these visualizations.

This new tool only works against tabular models, both server models as well as models created in PowerPivot, since it does not use MDX—instead of DAX—to query the model. Currently only tabular models understand DAX but this will change in the future.

Note

Microsoft has announced that support for Power View against multidimensional models will come in future versions of SQL Server. To read more about this announcement please see the following blog post: http://blogs.msdn.com/b/analysisservices/archive/2012/11/29/power-view-for-multidimensional-models-preview.aspx.

The MSDN (Microsoft Developer Network) documentation contains more information on the considerations that are applicable, as we can see in the following link: http://msdn.microsoft.com/en-us/library/hh212940.aspx.

As you have seen, the choice that you need to do on the model is not something that can be taken lightly. On one hand, you have the possibilities of in-memory models and the simple relational model in tabular; on the other hand, you have the more mature and advanced multidimensional model with its advantages. What you end up choosing is a decision that you need to take during the pre-study phase of the project. This book will mainly focus on how to create multidimensional models. There will be a tutorial on how to create tabular models in Chapter 9, In-memory, the Future, of this book.