Taking Advantage of Custom Arrays in Google Sheets

One feature in Google Sheets that I find incredibly helpful that isn’t well understood is the ability to build custom arrays/ranges within formulas. The Google Sheets documentation has a brief & unassuming callout to this capability that does not really acknowledge its power or potential.

Let’s dig into how this capability works and how to use it to super charge your spreadsheets…

Building Custom Ranges

The documentation above shows some pretty simple examples.

Fundamentals

In essence, you can use brackets { } to create multi-cell ranges.

Separating values with commas (e.g. ={ 1 , 2 } creates an array with columns.

image

Separating values with semicolons (e.g. ={ 1 ; 2 } creates an array with rows.

image

These two concepts can be combined to create two-dimensional arrays-within-arrays (e.g. ={ { 1, 2 } ; { 3 , 4 } } creates an array with rows.

image

Arrays of Ranges

While it is sometimes helpful to create static custom arrays as in the examples above :point_up:, this capability becomes more powerful when used to stitch together cell ranges in your sheets into custom arrays.

For example, using the formula ={A1:A3,C1:C3}, you can see that our custom array pulls just the first and third columns from the three columns of data above it (it skips the middle column, column B).

Applications

So now that we understand how to build custom arrays, how can we use them to improve our spreadsheets? Below, I’ve provided a few examples of the ways I often use them.

Single-cell Header + Formula Rows

When possible, I try to embedded key column-spanning formulas in my header rows. This approach allows users to add and delete rows in the table body without disrupting the formulas.

Let’s understand how this formula works:
={"Double Value";ARRAYFORMULA(2*A2:A)}

You can see that we create a custom array with the first row being the static text "Double Value". After that, a semicolon separates data vertically (i.e. new row) and below that we use an ARRAYFORMULA() to dynamically calculate allow content below the header as the data in column A but multiplied by 2.

This multiply-by-two formula is simple, but you can create much more elaborate formulas with this approach that included nested if-statements and VLOOKUPS(). For example, the output can been made more pretty and performant like this:
={"Double Value";ARRAYFORMULA(IF(ISBLANK(A2:A),IFERROR(1/0),2*A2:A))}

Less Brittle VLOOKUP()

The VLOOKUP() function can be a powerful way to find linked data from other sources but, because of the third parameter which identifies the column index, it can be brittle when a user adds or removes columns within its search range. Let’s look at this basic example:
=VLOOKUP(C2,A2:B,2,FALSE)

If the user were to add a Middle Name column between columns A and B, you can imagine that the direct reference to the range in the VLOOKUP() would expand from A2:B to A2:C. But Sheets isn’t smart enough to update the static number 2 for the column index. So, essentially, if we added this middle-name column, the VLOOKUP() would change to lookup the middle name from the first name provided in column C (D after the imaginary column insertion if you are spatially intuitive and following along closely :wink:).

The solution to this problem is to build your VLOOKUP() range from a custom array in this way:
=VLOOKUP(C2,{A2:A,B2:B},2,FALSE)

In this approach, rather than providing a block range, we pull in just the columns that are needed for our lookup. The search column is always first. The result column is always second. So the static column index (2) remains unchanged. If the user were to add a Middle Name column between columns A and B, you can imagine that the direct references would update as follows (and that the lookup would still correctly find the last name value):
=VLOOKUP(D2,{A2:A,C2:C},2,FALSE)

Reverse VLOOKUP()

Have you ever noticed that the search/lookup column in a VLOOKUP() formula must always be the first? What if you want to find data using a column in the middle of your range and return a value from a column to the left of that? The solution is pretty straightforward using the approach above.

Using the following formula, we can use a last name to lookup a first name:
=VLOOKUP(D2,{C2:C,A2:A},2,FALSE)

P.S. I learned this trick from Ben Collins long ago.

Dynamic Data Ranges with ARRAYFORMULA()

Another cool usage of custom ranges is the capability to dynamically generate derived data and to include that virtual, formula-created array as a range. In the example below, our VLOOKUP() function searches for the first name in D2 in the first-column range (A2:A) but the returned value is a dynamically-generated full name column created within the formula using ARRAYFORMULA():
=VLOOKUP(D2,{A2:A,ARRAYFORMULA(C2:C&", "&A2:A&IF(ISBLANK(B2:B),""," "&B2:B&"."))},2,FALSE)

This approach works great with VLOOKUP() but dynamically-generated data ranges can be used as source ranges for all sorts of functions including QUERY() :point_down:.

Custom QUERY() Ranges

Similar to the "Less Brittle VLOOKUP()" example, the QUERY() function can be easily compromised by the addition or removal of columns in its range parameter. Let’s take a look at this example:
=QUERY(A2:C7,"SELECT C, A ORDER BY C, A LIMIT 4")

Here, we are using the QUERY() function to return the first 4 last- then first-names in alphabetical order. We are referencing columns by their letters (e.g. C and A) in the provided range.

As with the brittle VLOOKUP() example, if a user deleted middle-initial column B, you can imagine that the direct reference in the formula would update to:
=QUERY(A2:B7,"SELECT C, A ORDER BY C, A LIMIT 4")

The problem is that Sheets isn’t quite smart enough to update the static query text, so the query will try to select data from column C which is outside of the (new) range and will crash the query.

What is the solution? You’re getting the hang of this, right? It’s to package our data ranges within a custom array like this (we can skip over the middle-initial column since our query doesn’t use it):
=QUERY({A2:A7,C2:C7},"SELECT Col2, Col1 ORDER BY Col2, Col1 LIMIT 4")

An important note here is that we cannot reference columns using their Sheets column letters if our data is not provided from a cell range. You can see in the formula above, we reference column C using Col2 based on the provided column order in the custom array and column A is referenced using Col1.

Randy,
Very useful information on a powerful feature that seems to have scarce documentation. The reverse VLookup is something I just recently discovered by experimentation and was pleasantly surprised it worked!

Thanks.

1 Like