Complex Formulas? The Onion Framework? Huh?

I’m talking about the idea that complex formulas in Google Sheets are a lot like onions.

They have layers.

And they sometimes make you cry. ?

## The Onion Method For Complex Formulas

If you’re building complex formulas, then I advocate following a **one-action-per-step approach**.

What I mean by this is that you build your formulas in a series of steps, and only make one change with each step.

The Onion Method is a framework by which to approach hard formulas, and consists of these three elements:

- Put each new step of the formula in a new cell
- Label each step with a simple “Step 1”, “Step 2”, etc. in adjacent cells
- Change the background color of each formula cell, so they can be easily found

This lets you see the formula progress in an incremental way and is really helpful when you’re building or tyring to understand complex formulas.

Sometimes a step might result in an error (typically a #N/A or #REF!), but that’s ok, provided it gets fixed in a subsequent step, as shown in this SUMPRODUCT example:

Each of these intermediary formulas in the above image moves us forward incrementally, until the final answer is obtained in step 6.

Similarly, if you’re trying to understand complex formulas, peel the layers back until you reach the core (which is hopefully a function you understand!). Then, build it back up in steps to get back to the full formula.

Learn advanced formula techniques in the new Extreme Formulas in Google Sheets course

## Example 1: Building Complex Formulas With The Onion Method

Let’s start with this job positions dataset and use the QUERY function to summarize the results:

### Step 1

Setup the first, simple QUERY formula to select columns A and B:

`=QUERY(A1:B,"select A, B")`

This doesn’t change the data, but it’s always a good idea to set up a basic query first to ensure you have the correct dataset selected as the input to your QUERY function.

### Step 2

Summarize the data by job position, using a GROUP BY clause in the QUERY function:

`=QUERY(A1:B,"select A, sum(B) group by A")`

### Step 3

Filter out the blank rows using the WHERE clause: “is not null”, as follows:

`=QUERY(A1:B,"select A, sum(B) where A is not null group by A")`

### Step 4

Use an ORDER BY clause to sort the table by total in descending order:

`=QUERY(A1:B,"select A, sum(B) where A is not null group by A order by sum(B) desc")`

### Step 5

Fix the header of the total column using the LABEL clause:

`=QUERY(A1:B,"select A, sum(B) where A is not null group by A order by sum(B) desc label sum(B) 'Total Positions'")`

Good work!

We’ve created a pivot table using the QUERY function rather than an actual pivot table. Building it in steps, where the formula evolves slightly with each step, was key to making this work.

Let’s continue, and see how to add a total row to this QUERY formula.

### Step 6

Using array literals, add a placeholder line for the total row:

`={QUERY(A1:B,"select A, sum(B) where A is not null group by A order by sum(B) desc label sum(B) 'Total Positions'");{"TOTAL","TBC"}}`

### Step 7

Our final step is to convert this placeholder to an actual formula, to give the correct total. As with the data input to the query function, we leave the range reference open-ended to ensure it remains dynamic and will include new data automatically:

`={QUERY(A1:B,"select A, sum(B) where A is not null group by A order by sum(B) desc label sum(B) 'Total Positions'");{"TOTAL",SUM(B1:B)}}`

The result is:

## Example 2: Deconstructing Complex Formulas With The Onion Method

If you’re trying to understand complex formulas in Google Sheets that someone else has shared with you, you can still approach it with this Onion Method.

Simply peel back the layers until you reach the innermost function. Copy that into a new cell and start from the inside and work out, building up to the full formula again.

Let’s see an example.

Suppose we’re given this worksheet with US State names:

And we’re also given this formula:

`=ArrayFormula(INDEX(A1:A20,MODE(MATCH(A1:A20,A1:A20,0))))`

which gives an output of Texas.

But how does this formula work?

Applying The Onion Method, we peel back the layers to the core function and then build it up in steps again.

### Step 1

In a new cell, add the innermost MATCH function:

`=MATCH(A1:A20,A1:A20,0)`

### Step 2

`=ArrayFormula(MATCH(A1:A20,A1:A20,0))`

which outputs an array of the position of the first occurrence of the words in column A. We see a 2 next to every occurrence of Texas for example, because the first time it occurred was in position 2.

### Step 3

Now, we wrap it with the MODE function to find the most frequently occurring position:

`=ArrayFormula(MODE(MATCH(A1:A20,A1:A20,0)))`

By definition, the MODE function takes a range of numbers for input and finds the most commonly occurring value.

However, what happens if we have a range of text values and want to find the most frequent?

In this case, MATCH has been used to create a range of numbers for the MODE function.

By now, we’ve probably deduced that this formula finds the most frequent word in a list.

### Step 4

Finally, we can retrieve the actual text value, i.e. the most frequent State name, by adding the INDEX function to get the full original formula, like this:

`=ArrayFormula(INDEX(A1:A20,MODE(MATCH(A1:A20,A1:A20,0))))`

This will give the output Texas in this specific example.

Nice!

## Complex Formulas Onion Method Template For Your Use

Click here to open a read-only copy of the template >>

This template contains both examples from this tutorial.

To make your own editable copy, please go to *File > Make a copy…* under the File menu.

## Complex Formulas Onion Method Conclusion

**The Onion Method** is a framework that allows you to approach complex formulas in a systematic way.

Even if you’re presented with an “impossible” challenge to answer or an “impossible” formula to decipher, just follow this framework. If required, peel back the layers and then work from the inside out in an incremental fashion.

You’ll be amazed at how quickly your understanding of challenging formulas broadens and deepens. You’ll encounter and understand brand new functions that you’ve never heard of before. Plus, you’ll find out all sorts of secret tricks with existing formulas.

Who knows, you might even cry tears of joy instead of despair…

Excellent post on complex formulas. What are your thoughts on using an outside program in the Google suite to put together epic long formulas in Sheets? I’ve tried this with limited success in Docs since some of the formatting for the formula appears to be lost copying and pasting from Docs to Sheets.

Hey Dave,

Yes, I think it can be helpful! I actually use Sublime Text myself (a fancy text editor for coding) to write out really long formulas. It’s helpful to split them over multiple lines, indent etc. The formatting ports across no problem, but I know copying and pasting formulas from the web can sometimes be problematic, e.g. with quotation signs messing up.

Cheers,

Ben

Thanks Ben

Something I found really helped me get through very complex layered formulas is to essentially build them layer by layer IN the actual formula bar with 1 line per layer rule (ish!).

The formula bar itself can be pulled down to make it a bigger working window and then you can write each later independently

e.g. I have a formula that looks for names in an agent table and then repeats them 10 times before positioning the next name…

So where the actual formula is this:

=sort(transpose(split(concatenate(arrayformula(split(rept( unique((filter(LinkedDataTable!E6:E,(LinkedDataTable!F6:F =E3)

))) & “\|”, 10), “|”))), “\”)))

I built it and READ it this way, which makes it far easier to find and replace any issues/errors or to edit it for other uses or updates.

= sort

(transpose(

split(

concatenate(

arrayformula(

split(

rept(

unique(

(filter

(LinkedDataTable!E6:E,(LinkedDataTable!F6:F =E3)

)))

& “\|”, 10),

“|”)))

, “\”)))

Hope others find it helpful!

Thanks for sharing, Joel!

I agree that writing formulas over multiple lines can be really helpful when they get really long. Something I’ve found recently though is that the line breaks are not persisted by Google. When you exit the formula and come back, it’s reverted to the formula all being squashed up again. Haven’t figured out if there’s a way to avoid this yet…

Cheers,

Ben

This is strange behaviour. I tried it in one of my Sheets. =IMPORTRANGE(SheetLastMonth,”TDC!H5:J8″) in one cell remains on one line no matter what, but elsewhere on the same tab I have no trouble getting =IMPORTRANGE(

SheetLastYear,”TDC!F14:F29″) to persist over two lines.

Scratch that. I reopened the file and found the formula on one line is now on two and deleting the newline doesn’t work; it stays on two lines. Bizarre, and too random to rely on π

Yes! I agree. I’ve found that it works sometimes, but often not. If I write a formula in an external editor, with line breaks, and copy that into the formula bar then I find they persist and stay in place then.

…. make an irrelevant/obvious change. Press enter. Go back and it’s formatted as you left it. Re-instate the obvious change. Press enter. Go back in and it’s all still formatted as you left it. I think it’s a Google cache optimisation. FWIW. Thx for all you insights Ben.

D

I’ve found using spaces (extra spaces wont impact the actual formula) after each line break helps remove this issue.

Using a more systemic approach like this also makes it easier for other people to maintain documentation as it’s easier to understand what the creator of the formula had in mind by following the different steps.

This is important to keep in mind when we create spreadsheets that ware expected to work for our teams whether we are part of the team/project or not.

Totally! That’s a great point, Davo. Thanks for sharing.

Thanks for the post, excellent information. I am a big fan of the IFERROR to clean up data and provide a structure to my tables.

You’re welcome! IFERROR is a very handy function.

Great.

I laughed or cried, complex formulas I wrote

Or both at the same time

Thank you very much Ben, for these great examples.

I often use the onion construction when I have to use more than 2 functions because for some reason, it never works properly the first time. I found it easier to understand how formulas work, and go easier to the result.

Your work is soooo interesting and so helpful.

Thank you for sharing so much!

You’re welcome! It definitely makes things easier to understand π

Great process and great timing. I recently spent a bit creating a custom function in apps script. Super easy to read and maintain. I was pretty happy with it. Only to find that it constantly errors with

βError

Service invoked too many times in a short time: exec qps. Try Utilities.sleep(1000) between calls. (line 0).β

So now I get to recreate it with built in functions, outside of the script editor. This process will help. Thanks!

Here a challenge I can not resolve …and maybe you could explain to me how resolve it ..with The Onion Method For Complex Formulas

this is like a special transpose data

here is my public url spreadsheet

https://docs.google.com/spreadsheets/d/1p9OBBYDCkGbfMCDutFdNRptW7z650n0UjEFgsdHbIUA/edit#gid=0

THANKS in advance to everyone could help me or could give me a link to solve it.

Ben, the formatting of the source data has changed.

Instead of suffixing the population with the contents of the square brackets, it is now prefixing it. As such the regex no longer works as is.

I believe I was able to resolve this (after many, much more complicated attempts, having overlooked the simply solution) by changing the regex from

[0-9,]+

to

[0-9,]+$

Thanks for the website by the way. Really helping with my new job!

I came up with grouping solution, here it is:

(?:])([0-9,]+)

Both Matts and mine seem to be working as per 11.02.2020

Peace!

Requesting help in putting together an “=ImportHTML” formula for Google Sheets to get the “Earnings Date” for ticker symbol EA (or any other valid ticker symbol) from CNBC (https://www.cnbc.com/quotes/?symbol=EA&qsearchterm=). In this case, I am trying to import the date “2020-07-30.”