Tuesday, January 14, 2014

Ctrl+Shift+Enter - Welcome To The Wild World Of Excel Arrays

One of the most obscure and least documented features in Excel is also one of the most powerful. If you have not yet encountered array calculations in Excel, I’ll let you in on the secret. An array formula is one that performs multiple calculations in one cell. Here is a simple example:

In order to calculate the value of inventory, the Cost should be multiplied by the Units On-Hand for each Item. Once the four multiplications have taken place, the results should be summed.

A common approach to this calculation is to add a new column to perform the multiplications. The formula in cell D2 is =B2*C2, in cell D3 is =B3*C3, and so on.

After each row’s Cost Value is calculated, a SUM Function adding column D will total $97.00. This result was obtained in two steps and used five cells, one for each Item and one for the SUM. If there is more data, this approach will use more cells to calculate a result. The accumulation of enough of these intermediate calculations can bloat a data model to a size that is difficult to open or email.

An Array Formula can achieve the same result of $97.00 by performing this calculation in one step using one cell. The Array Formula is: =SUM(B2:B5*C2:C5)

Each of the four cells in column B is multiplied by the corresponding cell in column C. The four multiplications are then enclosed in a SUM Function to add the four results. The only problem is that after typing in this formula and pressing the Enter key, a #VALUE! error appears. Since this formula is an array calculation, it needs to be entered using a special combination of keys: Ctrl+Shift+Enter. Hold Ctrl and Shift down while pressing Enter, doing this will enclose the formula in curly brackets. This is Excel’s method of showing that it understands that an array operation is performed. The curly brackets are never typed in, they result from entering an array formula with Ctrl+Shift+Enter.

There is another Array solution for this calculation that does not require Ctrl+Shift+Enter, the SUMPRODUCT Function. You can see in the function ScreenTip that an array is expected. =SUMPRODUCT(B2:B5,C2:C5) and then Enter will multiply corresponding cells then add the results. Ctrl+Shift+Enter was required in the previous example because the SUM function is not expecting arrays to be multiplied inside.

This example is a simple introduction to the existence of array calculations. The real benefit of an array formula is in situations where there are thousands of Items. The array formula can be performed in one cell without the need for thousands of intermediate calculations. This results in smaller data models and can help avoid 30-60 Megabyte file sizes.

Also, array formulae open up amazing new possibilities…

Where is the MAXIF Function? Where is the MINIF Function?

A common gateway to the world of array calculations is when you want to retrieve the maximum value based on a condition, a MAXIF function. Even though Excel has provided functions for SUMIF, COUNTIF and even AVERAGEIF, there is no MAXIF. Here is an example to show how an array formula can simulate MAXIF until the day when Microsoft adds the MAXIF function to Excel.

This data set stores the sales value for each salesperson on each day through the year. Imagine the table starts in cell A1 and ends in cell C2000. What is the largest daily sales value for Tara? The solutions I see most often are:

1) Manually sort or filter the data to calculate the MAX Function for the “Tara” records. This is a good solution for a quick one-time calculation but is not ideal for calculating the MAX for each salesperson.

2) In column D, add an IF Function for each row to only display sales values where the salesperson is “Tara” then use the MAX Function on the new column. This solution is not much better than the first one.

3) Use a PivotTable to calculate the MAX value for Tara and the other salespeople. This is a good solution and has much more flexibility than the first two.

4) Use an array formula for maximum flexibility.

I prefer formula-based solutions because they automatically recalculate when cell inputs change. Here is the array formula to calculate Tara’s maximum daily sales (don’t forget Ctrl+Shift+Enter):

=MAX(IF($B$2:$B$2000=E2,$C$2:$C$2000,""))

IF($B$2:$B$2000=E2,$C$2:$C$2000,"") will perform 1,999 IF calculations, each comparing a value in column B to “Tara” in cell E2. Wherever the column B value is “Tara”, the corresponding value in column C is returned, otherwise the empty string “” is returned. Once all these IF calculations are finished, they are enclosed inside a MAX Function which will extract the largest value. The spreadsheet has been spared the use of 1,999 cells.




















This formula can easily be copied and pasted downwards to the other salespeople. The formula can also be easily modified to determine the MIN, MEDIAN or QUARTILE of Tara’s sales values. Any summary function can be used to quickly understand the data.

One caution though, since almost 2,000 calculations are happening in one cell, the array formula takes a little longer to calculate than a simple formula, but only one cell was required. It is not advisable to take this array calculation and copy and paste it thousands of times, you will have wandered into the realm of databases at that point.

Knowing how to use arrays will take your data models to the next level of efficiency. Array calculations work in different versions of Excel and can help you create better data models. There are many more applications of these ideas, especially those simulating database functionality on a small scale in Excel.

If you are interested in learning more about Excel Array Calculations, let me show your organization how it can improve its Excel models. For any analytics training or analytical solutions, please contact:

Email: scottjcuthbertson@gmail.com
LinkedIn: http://ca.linkedin.com/in/scottcuthbertson
Blog: http://theanalyticspro.blogspot.ca/

Scott Cuthbertson BBA

Scott is an Analytics Professional who has worked with over 100 clients in the retail and financial services industries. He has helped organizations improve their labour scheduling, inventory replenishment, performance management, organization design and pricing. He has been a practitioner of analytics since childhood, using data models to win Hockey Pools. His expertise lies in developing data models, reporting, algorithms and solutions for any organization that wants to use their data to improve decision making. He also teaches at the CMA Professional Development Institute.

© Scott Cuthbertson 2014 Blog: Words From The Analytics Pro