Use The Onion Framework To Approach Complex Formulas

Complex Formulas? The Onion Framework? Huh?

I’m talking about the idea that complex formulas in Google Sheets are a lot like onions.

They have layers.

And they sometimes make you cry. ?

The Onion Method For Complex Formulas

If you’re building complex formulas, then I advocate following a one-action-per-step approach.

What I mean by this is that you build your formulas in a series of steps, and only make one change with each step.

The Onion Method is a framework by which to approach hard formulas, and consists of these three elements:

  1. Put each new step of the formula in a new cell
  2. Label each step with a simple “Step 1”, “Step 2”, etc. in adjacent cells
  3. Change the background color of each formula cell, so they can be easily found

This lets you see the formula progress in an incremental way and is really helpful when you’re building or tyring to understand complex formulas.

Sometimes a step might result in an error (typically a #N/A or #REF!), but that’s ok, provided it gets fixed in a subsequent step, as shown in this SUMPRODUCT example:

Advanced Formula steps example

Each of these intermediary formulas in the above image moves us forward incrementally, until the final answer is obtained in step 6.

Similarly, if you’re trying to understand complex formulas, peel the layers back until you reach the core (which is hopefully a function you understand!). Then, build it back up in steps to get back to the full formula.

Example 1: Building Complex Formulas With The Onion Method

Let’s start with this job positions dataset and use the QUERY function to summarize the results:

Job position dataset

Step 1

Setup the first, simple QUERY formula to select columns A and B:

=QUERY(A1:B,"select A, B")

This doesn’t change the data, but it’s always a good idea to set up a basic query first to ensure you have the correct dataset selected as the input to your QUERY function.

Step 2

Summarize the data by job position, using a GROUP BY clause in the QUERY function:

=QUERY(A1:B,"select A, sum(B) group by A")

Onion Method for complex formulas

Step 3

Filter out the blank rows using the WHERE clause: “is not null”, as follows:

=QUERY(A1:B,"select A, sum(B) where A is not null group by A")

Step 4

Use an ORDER BY clause to sort the table by total in descending order:

=QUERY(A1:B,"select A, sum(B) where A is not null group by A order by sum(B) desc")

Step 5

Fix the header of the total column using the LABEL clause:

=QUERY(A1:B,"select A, sum(B) where A is not null group by A order by sum(B) desc label sum(B) 'Total Positions'")

Onion Method for complex formulas

Good work!

We’ve created a pivot table using the QUERY function rather than an actual pivot table. Building it in steps, where the formula evolves slightly with each step, was key to making this work.

Let’s continue, and see how to add a total row to this QUERY formula.

Step 6

Using array literals, add a placeholder line for the total row:

={QUERY(A1:B,"select A, sum(B) where A is not null group by A order by sum(B) desc label sum(B) 'Total Positions'");{"TOTAL","TBC"}}

Step 7

Our final step is to convert this placeholder to an actual formula, to give the correct total. As with the data input to the query function, we leave the range reference open-ended to ensure it remains dynamic and will include new data automatically:

={QUERY(A1:B,"select A, sum(B) where A is not null group by A order by sum(B) desc label sum(B) 'Total Positions'");{"TOTAL",SUM(B1:B)}}

The result is:

Onion Method for complex formulas

Example 2: Deconstructing Complex Formulas With The Onion Method

If you’re trying to understand complex formulas in Google Sheets that someone else has shared with you, you can still approach it with this Onion Method.

Simply peel back the layers until you reach the innermost function. Copy that into a new cell and start from the inside and work out, building up to the full formula again.

Let’s see an example.

Suppose we’re given this worksheet with US State names:

dataset of US State Names

And we’re also given this formula:

=ArrayFormula(INDEX(A1:A20,MODE(MATCH(A1:A20,A1:A20,0))))

which gives an output of Texas.

But how does this formula work?

Applying The Onion Method, we peel back the layers to the core function and then build it up in steps again.

Step 1

In a new cell, add the innermost MATCH function:

=MATCH(A1:A20,A1:A20,0)

Step 2

=ArrayFormula(MATCH(A1:A20,A1:A20,0))

which outputs an array of the position of the first occurrence of the words in column A. We see a 2 next to every occurrence of Texas for example, because the first time it occurred was in position 2.

Step 3

Now, we wrap it with the MODE function to find the most frequently occurring position:

=ArrayFormula(MODE(MATCH(A1:A20,A1:A20,0)))

By definition, the MODE function takes a range of numbers for input and finds the most commonly occurring value.

However, what happens if we have a range of text values and want to find the most frequent?

In this case, MATCH has been used to create a range of numbers for the MODE function.

By now, we’ve probably deduced that this formula finds the most frequent word in a list.

Step 4

Finally, we can retrieve the actual text value, i.e. the most frequent State name, by adding the INDEX function to get the full original formula, like this:

=ArrayFormula(INDEX(A1:A20,MODE(MATCH(A1:A20,A1:A20,0))))

This will give the output Texas in this specific example.

Nice!

Complex Formulas Onion Method Template For Your Use

Click here to open a read-only copy of the template >>

This template contains both examples from this tutorial.

To make your own editable copy, please go to File > Make a copy… under the File menu.

Complex Formulas Onion Method Conclusion

The Onion Method is a framework that allows you to approach complex formulas in a systematic way.

Even if you’re presented with an “impossible” challenge to answer or an “impossible” formula to decipher, just follow this framework. If required, peel back the layers and then work from the inside out in an incremental fashion.

You’ll be amazed at how quickly your understanding of challenging formulas broadens and deepens. You’ll encounter and understand brand new functions that you’ve never heard of before. Plus, you’ll find out all sorts of secret tricks with existing formulas.

Who knows, you might even cry tears of joy instead of despair…

Radial Bar Charts in Google Sheets

In this post, I’m going to show you how to create radial bar charts in Google Sheets.

Radial Bar Chart in Google Sheets

They look great and grab your attention, which is important in this era of information overload.

But they should be used sparingly because they’re harder to read than a regular bar chart (because it’s harder to compare the length of the curved bars).

How To Create A Radial Bar Chart In Google Sheets

Let’s begin with the data.

In this example, we’ll create a radial bar chart in Google Sheets with 3 series.

We need a column of values for these 3 series, for example, products with a number of units sold.

Next, we need some upper limit (max value) for our bars. This allows us to scale the bars properly.

Lastly, we need a helper column that calculates the difference between the max value and the actual value.

Here’s the data for the radial bar chart, in cells E3:H6:

Radial bar chart data table with Formula

Ok, I’m going to let you in on a little secret now…

This is not a single chart. No sir, it’s three charts overlaid on top of each other.

And yes, this means it takes three times as long to create!

Step 1: Create the inner circle

Highlight the first row of data but exclude the max value column. In the example dataset above, highlight E4:G4 and insert a chart.

Select a doughnut chart.

Under the Setup menu, make sure to check the “Switch rows/columns” checkbox, so your chart looks like this:

Donut chart in Google Sheets

Under the customize menu of the chart tool, set the following conditions:

  • Background color: None
  • Chart border color: None
  • Donut hole size: 67%
  • Set Slice 2 color to none
  • Remove the chart title
  • Set the legend to none

This is what the inner donut should look like:

Donut chart in Google Sheets

Step 2: Create the middle circle

Repeat the steps above for the inner circle, but use the next row of data, choose a different color, and set the donut hole size to 77% (you may have to experiment with these percentages to line everything up at the end).

Drag the second donut chart on top of the first and line up the radial bars to get:

Radial Bar Chart in Google Sheets

Step 3: Create the outer circle

Again, repeat the steps above from the inner circle to create a third donut chart, using the third row of data, a different color, and setting the donut hole size to 81% (again, this might need tweaking to line everything up).

Drag this third donut chart on top of the other two and you have a radial bar chart in Google Sheets!

Radial Bar Chart in Google Sheets

Note on editing charts:

Since the charts are placed on top of each other, you’ll only be able to access the top chart to edit. You’ll have to move it to the side to access the chart underneath, and then move that one if you want to access the inner chart.

Step 4: Add the data labels

It gets messy to add the data labels to each chart through the chart editor, so I opted to create formulas to add my data labels into the cells next to each bar of the radial bar chart.

To access cells underneath the charts, click on a cell outside of the chart area and then use the arrow keys on your keyboard to reach the desired cell.

Once there, add the following formula:

=E6&": "&TEXT(F6,"#,0")

This formula uses the TEXT function to combine text and numbers in Google Sheets.

This shows the series name and value alongside each bar:

Data labels for radial bar chart in Google Sheets

To finish, remove the gridlines from your Sheet to give the chart a clean look.

Radial Bar Chart in Google Sheets

Can I see an example worksheet for the radial bar chart?

Yes, here you go.

Real World Examples of Radial Bar Charts

My friend Jeff Sauer, who founded Data Driven U to teach people data-driven marketing, contacted me recently about creating a radial bar chart for one of his workshops.

He is graciously sharing his report here, so you can see a radial bar chart with six rings:

Data Driven U chart

This is a screenshot of his Google Sheet!

(If you’re looking for top draw digital marketing, then you should definitely check out Jeff’s site: DataDrivenU.com This is not an affiliate link, just a personal recommendation!)

You’ve probably also seen a radial bar chart in the wild with the Apple Watch Rings Chart!

Apple Watch Ring Chart

Google Sheets REGEX Formulas

This post explores the Google Sheets REGEX formulas with a series of examples to illustrate how they work.

Regular expressions, or REGEX for short, are tools for solving problems with text strings. They work by matching patterns.

You use REGEX to solve problems like finding names or telephone numbers in data, validating email addresses, extracting URLs, renaming filenames containing the word “Application” etc.

They have a reputation for being hard, but once you learn a few basic rules and understand how they work you can use them effectively.

There are three Google Sheets REGEX formulas: REGEXMATCH, REGEXEXTRACT, and REGEXREPLACE. Each has a specific job:

‚ÄĘ REGEXMATCH will confirm whether it finds the pattern in the text.

‚ÄĘ REGEXEXTRACT will extract text that matches the pattern.

‚ÄĘ REGEXREPLACE will replace text that matches the pattern.

Let’s understand them with a series of examples.

Table Of Contents

Example 1: Google Sheets REGEX Formula REGEXMATCH

The REGEXMATCH function returns a TRUE if it matches the pattern you provide anywhere in the text and FALSE if there are no matches in the text.

For example, suppose we have this dataset of vintage wines, where each row has a mix of text and/or numbers:

REGEX Formula Introduction

Let’s create a simple REGEXMATCH to test whether a cell contains a number, i.e. the year vintage is given.

In cell A2, add this formula:

=REGEXMATCH(A2,"[0-9]")

This will give a TRUE output if it finds a number in the string, or FALSE if there are no numbers.

[0-9]+ matches any number from 0 to 9 in the input string.

So, provided there is one number in the input string, this pattern will give us a match:

Google Sheets REGEXMATCH example

See how the text without any numbers “House white wine” gives a FALSE output because there is no match.

Important Note

This pattern matches a single number. It doesn’t care what else might be in the cell.

For example, it returns TRUE for a meaningless string containing numbers, which is probably not the behavior you’re looking for in this case.

Google Sheets REGEX formula example

By the end of this tutorial, you’ll understand enough to know how to fix this yourself.

(For the solution, see the formula at the end of this tutorial.)

Example 2: Google Sheets REGEX Formula REGEXEXTRACT

Using the same wine dataset as above, we want to create a new column in our dataset with the vintages, i.e. a column with the year only.

This is a perfect example of when to use a Google Sheets REGEX formula. We’ll create a regular expression pattern to match any numbers and then use REGEXEXTRACT to extract them.

As with everything in spreadsheets, there are multiple REGEX patterns that could solve this.

We saw the [0-9] pattern above, but we can also use the named character class for digits \d

This matches any digits (i.e. numbers 0 to 9).

So the REGEXEXTRACT formula to extract the year looks like this:

=REGEXEXTRACT(A2,"[0-9]+")

or

=REGEXEXTRACT(A2,"\d+")

The + means get one or more.

Both formulas return a result of “1999”, because the text in A2 is “Red Bordeaux 1999”.

If no numbers are found, the formula returns a #N/A error.

Google Sheets REGEX Regexextract Formula

Two Important Notes

Note 1:

The REGEX formulas require text inputs and they give you text outputs back. So the 1999 output above is formatted as text. To convert to a number you need to wrap the result with a VALUE function. See Example 4 below for more details.

Note 2:

If there were more numbers in the text string e.g. “Red Bordeaux 1999 or 2001” only 1999 is returned by the REGEXEXTRACT formula because it doesn’t match the space or letters between the numbers.

It only matches the numbers, so it matches the first number it sees, then keeps matching numbers until it hits the first non-number where it stops matching, i.e. the space at the end of 1999.

Example 3: Google Sheets REGEX Formula REGEXREPLACE

The REGEXREPLACE will replace all sets of numbers in the text with a new value, for example, this formula:

=REGEXREPLACE(A2,"\d+","2021")

will replace 1999 in the sentence “Red Bordeaux 1999” with “2021” and return the answer: Red Bordeaux 2021.

Important Note

The REGEXREPLACE function replaces ALL sets of numbers in the text, unlike the REGEXEXTRACT which just extracts the first pattern it matches.

Example 4: Use REGEXEXTRACT And VALUE To Extract Numbers From Text

Consider this data about a supertanker ship:

REGEX Formula data

Is it possible to extract those numbers with a REGEX formula, even though they’re formatted differently?

You bet!

This formula will extract numbers with or without thousand separators and/or decimal places:

=REGEXEXTRACT(A2,"[\d,.]+")

The REGEX pattern [\d,.]+ means match any digits, commas, or periods and extract them.

So the REGEXEXTRACT formula matches the first digit it finds then keeps going with the match provided the next character is either another digit or a comma or a period, When it reaches something else, like a space or a letter, the match breaks, and the extract is completed.

We’re not quite done yet though.

Using The VALUE Function To Convert The Output To Numbers

The output of the REGEXEXTRACT formula is a string, not a number.

So we need to convert the output to a number by wrapping the result with the VALUE function like this:

=VALUE(REGEXEXTRACT(A2,"[\d,.]+"))

The formula above is not foolproof, however.

Improving The Pattern Match

If the text string has a period or comma before the first digit then this will be extracted as the match.

For example, if the input text string was: “The ship is huge. It’s 630,962 tonnes full load.” then the REGEXEXTRACT formula from above will extract the first period only.

It matches the period after “huge” and then stops the match because of the space character that follows.

How do we modify the formula to ensure the extract begins with a number?

Well, we change the REGEX to match a number first, before anything else, like this:

=REGEXEXTRACT(A2,"\d[\d,.]*")

Here the REGEX matches on a digit first, before looking for more digits, commas, or periods.

If you’re eagle-eyed you’ll notice that the plus + has changed to a star * after the square bracket. This means zero or more of the characters in the square brackets, to account for a situation where there is a single-digit number that we want to match.

The REGEX pattern \d[\d,.]* matches a digit, followed by zero or more characters that are digits, commas, or periods.

Now the result of the formula extraction is 630,962, which is the correct answer.

Remember, the output of the REGEXEXTRACT formula is a string, so you’ll need to wrap it with the VALUE function to convert it to a number, e.g.

=VALUE(REGEXEXTRACT(A2,"\d[\d,.]*"))

Example 5: Check Telephone Numbers With REGEXMATCH

For this example, I’m going to consider US phone numbers with dashes between the sections, i.e. numbers of this format: XXX-XXX-XXXX

It’s 3 digits, then a dash, 3 digits, dash, then 4 digits.

By the end of this tutorial, you’ll have enough information to modify the example to other regions of the world.

Here’s the data:

REGEXMATCH telephone numbers Google Sheets

Let’s build a REGEX formula to check whether the string in column A is a valid US phone number.

Using what we learned above, we know that the expression \d matches digits. So our first attempt is this formula:

Step 1:

=REGEXMATCH(A2,"\d\d\d-\d\d\d-\d\d\d\d")

This matches 3 digits, then a dash, 3 digits, dash, then 4 digits and it works ok. It shows TRUE when it matches a telephone number and FALSE otherwise.

But it’s verbose. We can simplify it by using a quantifier clause.

Step 2:

=REGEXMATCH(A2,"\d{3}-\d{3}-\d{4}")

The {3} means match exactly 3 of the preceding pattern, i.e. match exactly 3 digits.

This works great, except it still matches the final number on row 8. It matches the 3-3-4 and doesn’t care about the extra digits that come at the end.

But we know this isn’t a valid phone number, so how do we get it to stop after the 4 digits and discount anything with more than 4 digits in the final set?

Word Boundaries

We wrap the expression with a special expression called a word boundary, denoted by \b before and after the main expression.

Technically, this matches a “zero-width nothing”. What that means is that it marks a boundary between a word character (e.g. letter or digit or _) and a non-word character. So it will match the digits up to 4 and then look for a boundary. A fifth digit will break the match, but a space won’t because it defines a boundary.

Our new expression looks like this, with a \b at the beginning and end.

Step 3:

=REGEXMATCH(A2,"\b\d{3}-\d{3}-\d{4}\b")

The final thing we might do is to wrap this with an IFERROR function to handle number inputs like row 6 above that cause an error output (since the REGEX formulas only work with text strings).

Step 4:

=IFERROR(REGEXMATCH(A2,"\b\d{3}-\d{3}-\d{4}\b"),FALSE)

REGEXMATCH telephone numbers Google Sheets

Example 6: Reorder Name Strings With REGEXREPLACE

In this example, we’re looking at the REGEXREPLACE function and a key concept in regular expressions: capturing groups.

Suppose we have a list of names in this format: “Surname, First Name” and we want to switch the order to “First Name Surname”.

REGEXREPLACE in Google Sheets

Here’s the formula to switch the order of the names:

=REGEXREPLACE(A2,"(\w+)(?:, )(\w+)","$2 $1")

Let’s break this down:

REGEXREPLACE finds all substrings that match the pattern and replaces them with the value given. It takes 3 arguments: 1) the input text, 2) the pattern to match, and 3) the replacement value.

Let’s look at each in turn:

Input text

Surname, First name e.g. Collins, Ben

Matching pattern

(\w+)(?:, )(\w+)

\w+ matches word characters

(\w+) creates a numbered capturing group. It matches the first word up to the comma.

(?: opens a non-capturing group, which essentially means match but don’t capture the text in this group

(?:, ) is the non-capturing group that matches a comma and space but doesn’t capture them

(\w+) creates a second numbered capturing group. It matches the second word after the comma and space.

Replacement Value

$2 $1

Now, this is where it gets interesting!

Our matching pattern captured each of the names as a numbered group, which we’re now able to refer to with $1 or $2

Group $1 captured the surname and group $2 captured the first name.

To reverse the names, we put group $2 first, then $1

The output looks like this:

REGEXREPLACE in Google Sheets

SOLUTION From Example 1

Modify the REGEXMATCH formula to be:

=REGEXMATCH(A2,"\b[0-9]{4}\b")

It uses a quantifier {4} and word boundaries \b to only match 4-digit numbers (see example 5 for more information on quantifiers and word boundaries).

REGEX Google Sheets

Learn More

The Google Sheets REGEX Formula Cookbook

There are hundreds of more REGEX examples in my Google Sheets REGEX Formula Cookbook course.

Resources

For the full list of allowed syntax, check the re2 syntax page.

Use The SWITCH Function to Categorize Data Efficiently

The SWITCH function is a useful tool for categorizing data. In the right circumstances, it can save you from messy, nested IF functions.

The SWITCH function is used to test an expression against a list of cases. It returns a value when the expression is equal to one of the cases.

It has some similarities to the IFS function but differs because SWITCH tests for exact matching rather than whether a condition is true (e.g. X > Y). SWITCH also has a default option to return a value if no match is found.

Let’s see an example. Suppose we¬†have this data set of student grades and we¬†want to add context to each grade level.

SWITCH function with student grades

Use this SWITCH function to categorize these grades:

=SWITCH(B2,
"A","Top marks! Great job!",
"B","Keep up the good work",
"C","Could do better",
"Failed. Remedial study needed")

Let’s break it down:

B2 is the condition we’re going to test. It’s the grade letter from column B¬†and it’s the input to our SWITCH function.

We check the value of B2 against the first case “A”. If they match (i.e. the grade in B2 was also “A”) then SWITCH returns the string “Top marks! Great job!”.

If the value from B2 doesn’t match “A”, we move on and test it against “B”¬† and if that fails, test against “C”.

If the condition doesn’t match any of the cases, the last string is returned: “Failed. Remedial study needed”. It’s our catch-all solution. Anything that’s not “A”, “B” and “C” will return this answer.

This is what the solution looks like:

Switch Function in Google Sheets

Give it a try and SWITCH things up!

The FACT Function in Google Sheets (And Why A Shuffled Deck of Cards Is Unique)

Let’s start with a mind-blowing fact, and then use the FACT function in Google Sheets to explain it.

Pick up a standard 52 card deck and give it a good shuffle.

The order of cards in a shuffled deck will be unique.

One that has likely never been seen before in the history of the universe and will likely never be seen again.

I’ll let that sink in.

Isn’t that mind-blowing?

Especially when you picture all the crazy casinos in Las Vegas.

Let’s understand why, and in the process learn about the FACT function and basic¬†combinatorics (the study of counting in mathematics).

Four Card Deck

To keep things simple, suppose you only have 4 cards in your deck, the four aces.

You can create this deck in Google Sheets with the CHAR function:

CHAR cards

The formulas to create these four cards are:

Ace of Clubs:

=CHAR(127185)

Ace of Spades:

=CHAR(127137)

Ace of Hearts:

=CHAR(127153)

Ace of Diamonds:

=CHAR(127169)

Let’s see how many different combinations exist with just these four cards.

Pick one of them to start. You have a choice of four cards at this stage.

Once you’ve chosen the first one, you have three cards left, so there are 3 possible options for the second card choice.

When you’ve picked that second card, you have two cards left. So you have a choice of two for the third card.

The final card is the last remaining one.

So you have 4 choices * 3 choices * 2 choices * 1 choice = 4 * 3 * 2 * 1 = 24

There are 24 permutations (variations) with just 4 cards!

Visually, we can show this in our Google Sheet by displaying all the different combinations with the card images from above:

Fact function card combinations in Google Sheets

(I’ve just shown the first 6 rows for brevity.)

You can see for example, when moving from row 1 to row 2, we swapped the position of the two red suits: the Ace of Hearts and the Ace of Diamonds.

Five Card Deck

This time there are 5 choices for the first card, then 4, then 3, then 2, and finally 1.

So the number of permutations is 5 * 4 * 3 * 2 * 1 = 120

Already a lot more! I have not drawn this out in a Google Sheet and leave that as an optional exercise for you if you wish.

The FACT function

The FACT function in Google Sheets (see documentation) is a math function that returns the factorial of a given number. The factorial is the product of that number with all the numbers lower than it down to 1.

In other words, exactly what we’ve done above in the above calculations.

Four:

The 4 card deck formula is:

=FACT(4)

which gives an answer of 24 permutations.

Five:

The 5 card deck formula is:

=FACT(5)

which gives an answer of 120 permutations.

Six:

A 6 card deck is:

=FACT(6)

which gives an answer of 720 permutations.

Twelve:

A 12 card deck has  479,001,600 different ways of being shuffled:

=FACT(12)

(You’re more likely to win the Powerball lottery at 1 in 292 million odds than to get two matching shuffled decks of cards, even with just 12 cards!)

Fifty Two:

Keep going up to a full deck of 52 cards with the formula¬†and it’s a staggeringly large number.

=FACT(52)

Type it into Google Sheets and you’ll see an answer of¬†8.07E+67, which is 8 followed by 67 zeros!

(This number notation is called scientific notation, where huge numbers are rounded to the first few digits multiplied by a 10 to the power of some number, 67 in this case.)

This answer is more than the number of stars in the universe (about 10 followed by 21 zeros).

Put another way if all 6 billion humans on earth began shuffling cards at 1 deck per minute every day of the year for millions of years, we still wouldn’t even be close to finding all possible combinations.