How To Draw The Cantor Set In Google Sheets

The Cantor set is a special set of numbers lying between 0 and 1, with some fascinating properties.

It’s created by removing the middle third of a line segment and repeating ad infinitum with the remaining segments, as shown in this gif of the first 7 iterations:

Cantor Set Iterations

The formulas used to create the data for the Cantor set in Google Sheets are interesting, so it’s worth exploring for that reason alone, even if you’re not interested in the underlying mathematical concepts.

But let’s begin by understanding the set in more detail…

What Is The Cantor Set?

Cantor Set in Google Sheets

The Cantor set was discovered in 1874 by Henry John Stephen Smith and subsequently named after German mathematician Georg Cantor.

The construction shown in this post is called the Cantor ternary set, built by removing the middle third of a line segment and repeating ad infinitum with the remaining segments.

It is sometimes known as Cantor dust on account of the dust of points that remain after repeatedly removing the middle thirds. (Cantor dust also refers to the multi-dimensional version of the Cantor set.)

The set has some fascinating, counter-intuitive properties:

  • It is uncountable. That is, there are as many points left behind as there were to begin with.
  • It’s self-similar, meaning each subset looks like the whole set.
  • It’s fractal with a dimension that is not an integer.
  • It has an infinite number of points but a total length of 0.

Wow!

How To Draw The Cantor Set In Google Sheets

To be clear, the Cantor set is the set of numbers that remain after removing the middle third an infinite number of times. That’s hard to comprehend, let alone do in a Google Sheet 😉

But we can create a picture representation of the Cantor set by repeating the algorithm ten times, as shown in this tutorial:

Create The Data

Step 1:

In a blank sheet called “Data”, type the number “1” into cell A1.

Step 2:

In cell B1, type this formula:

={ FILTER(A1:A,A1:A<>"") ;
SUM(FILTER(A1:A,A1:A<>"")) ;
FILTER(A1:A,A1:A<>"") }

Step 3:

Drag this across your sheet up to column J, which creates the data for the first 10 iterations.

Each formula references the column to the left. For example, the formula in cell D will reference column C.

Your data will look like this:

Data For Cantor Dust in Google Sheets

How does this formula work?

It combines array literals and the FILTER function.

Let’s break it down, using the onion framework.

The innermost formula is:

=FILTER(A1:A,A1:A<>"")

This formula grabs all the data from column A and returns any non-blank entries, in this case just the value “1”.

Now we combine two of these together with array literals:

={ FILTER(A1:A,A1:A<>"") ;
FILTER(A1:A,A1:A<>"") }

Here the array literals { ... ; ... } stack these two ranges.

In this first example, it puts the number “1” with another “1” beneath it in column B.

Then we add a third FILTER and also SUM the middle FILTER range to create our final Cantor set algorithm:

={ FILTER(A1:A,A1:A<>"") ;
SUM(FILTER(A1:A,A1:A<>"")) ;
FILTER(A1:A,A1:A<>"") }

As we drag this formula to adjacent columns, the relative column references will change so that it always references the preceding column.

In column B, the output is:

1,1,1

Then in column C, we get:

1,1,1,3,1,1,1

And in column D:

1,1,1,3,1,1,1,9,1,1,1,3,1,1,1

etc.

This data is used to generate the correct gaps for the Cantor set.

Draw The Cantor Set

We’ll use sparklines to draw the Cantor set in Google Sheets.

Cantor Dust In Google Sheet
Click to enlarge

Step 4:

Create a new blank sheet and call it “Cantor Set”.

Step 5:

Next, create a label in column A to show what iteration we’re on.

Put this formula in cell A1 and copy down the column to row 10:

="Cantor Set "&ROW()

This creates a string, e.g. “Cantor Set 1”, where the number is equal to the row number we’re on.

Step 6:

The next step is to dynamically generate the range reference. As we drag our formula down column B, we want this formula to travel across the row in the “Data” tab to get the correct data for this iteration of the Cantor set.

Start by generating the row number for each row with this formula in cell B1 and copy down the column:

=ROW()

(I set up my sheet with the data in columns because it’s easier to create and read that way. But then I want the Cantor set in a column too, hence why I need to do this step.)

Step 7:

Use the row number to generate the corresponding column letter with this formula in cell C1 and copy down the column:

=ADDRESS(1,ROW(),4)

This uses the ADDRESS function to return the cell reference as a string.

Step 8:

Remove the row number with this formula in cell D1 and copy down the column:

=SUBSTITUTE(ADDRESS(1,ROW(),4),"1","")

Step 9:

Combine these two references to create an open-ended range reference for the correct column of data in the “Data” sheet.

Put this formula in cell E1 and copy down the column:

="'Data'!"&ADDRESS(1,ROW(),4)&":"&SUBSTITUTE(ADDRESS(1,ROW(),4),"1","")

This returns range references e.g. 'Data'!A1:A

Step 10:

Put this formula in cell F1 and copy down the column:

=INDIRECT("'Data'!"&ADDRESS(1,ROW(),4)&":"&SUBSTITUTE(ADDRESS(1,ROW(),4),"1",""))

This will show #REF! errors: “Array result was not expanded because it would overwrite data in…”

However, don’t worry, these are only temporary as we’ll dump this data into the sparkline formula next.

Step 11:

In column G, create a default sparkline formula:

=SPARKLINE(INDIRECT("'Data'!"&ADDRESS(1,ROW(),4)&":"&SUBSTITUTE(ADDRESS(1,ROW(),4),"1","")))

This shows the default line chart (except for the first row where it shows a #N/A error).

Step 12:

In column H, convert the line chart sparkline to a bar chart sparkline by specifying the charttype in custom options:

=SPARKLINE(INDIRECT("'Data'!"&ADDRESS(1,ROW(),4)&":"&SUBSTITUTE(ADDRESS(1,ROW(),4),"1","")),{"charttype","bar"})

Step 13 (optional):

Finally, in column I, change the colors to a simple black and white scheme, by specifying color1 and color2 inside the sparkline:

=SPARKLINE(INDIRECT("'Data'!"&ADDRESS(1,ROW(),4)&":"&SUBSTITUTE(ADDRESS(1,ROW(),4),"1","")),{"charttype","bar";"color1","black";"color2","white"})
Cantor Set Data Formulas
Click to enlarge

Feel free to delete any working columns once you have finished the formula showing the Cantor set.

Finished Cantor Set In Google Sheets

Here are the first 10 iterations of the algorithm to create the Cantor set:

Cantor Set In Google Sheet
Click to enlarge

Of course, this is a simplified representation of the Cantor set. It’s impossible to create the actual set in a Google Sheet since we can’t perform an infinite number of iterations.

Can I see an example worksheet?

Yes, here you go.

See Also

You might enjoy my other mathematical Google Sheet posts:

PI Function in Google Sheets And Other Fun π Facts

Complex Numbers In Google Sheets

How To Draw The MandelBrot Set In Google Sheets, Using Only Formulas

The FACT Function in Google Sheets (And Why A Shuffled Deck of Cards Is Unique)

How To Draw The Sierpiński Triangle In Google Sheets

Exploring Population Growth And Chaos Theory With The Logistic Map, In Google Sheets

Use The Onion Framework To Approach Complex Formulas

Complex Formulas? The Onion Framework? Huh?

I’m talking about the idea that complex formulas in Google Sheets are a lot like onions.

They have layers.

And they sometimes make you cry!

The Onion Method For Complex Formulas

If you’re building complex formulas, then I advocate following a one-action-per-step approach.

What I mean by this is that you build your formulas in a series of steps, and only make one change with each step.

The Onion Method is a framework by which to approach hard formulas, and consists of these three elements:

  1. Put each new step of the formula in a new cell
  2. Label each step with a simple “Step 1”, “Step 2”, etc. in adjacent cells
  3. Change the background color of each formula cell, so they can be easily found

This lets you see the formula progress in an incremental way and is really helpful when you’re building or tyring to understand complex formulas.

Sometimes a step might result in an error (typically a #N/A or #REF!), but that’s ok, provided it gets fixed in a subsequent step, as shown in this SUMPRODUCT example:

Advanced Formula steps example

Each of these intermediary formulas in the above image moves us forward incrementally, until the final answer is obtained in step 6.

Similarly, if you’re trying to understand complex formulas, peel the layers back until you reach the core (which is hopefully a function you understand!). Then, build it back up in steps to get back to the full formula.

Example 1: Building Complex Formulas With The Onion Method

Let’s start with this job positions dataset and use the QUERY function to summarize the results:

Job position dataset

Step 1

Setup the first, simple QUERY formula to select columns A and B:

=QUERY(A1:B,"select A, B")

This doesn’t change the data, but it’s always a good idea to set up a basic query first to ensure you have the correct dataset selected as the input to your QUERY function.

Step 2

Summarize the data by job position, using a GROUP BY clause in the QUERY function:

=QUERY(A1:B,"select A, sum(B) group by A")

Onion Method for complex formulas

Step 3

Filter out the blank rows using the WHERE clause: “is not null”, as follows:

=QUERY(A1:B,"select A, sum(B) where A is not null group by A")

Step 4

Use an ORDER BY clause to sort the table by total in descending order:

=QUERY(A1:B,"select A, sum(B) where A is not null group by A order by sum(B) desc")

Step 5

Fix the header of the total column using the LABEL clause:

=QUERY(A1:B,"select A, sum(B) where A is not null group by A order by sum(B) desc label sum(B) 'Total Positions'")

Onion Method for complex formulas

Good work!

We’ve created a pivot table using the QUERY function rather than an actual pivot table. Building it in steps, where the formula evolves slightly with each step, was key to making this work.

Let’s continue, and see how to add a total row to this QUERY formula.

Step 6

Using array literals, add a placeholder line for the total row:

={QUERY(A1:B,"select A, sum(B) where A is not null group by A order by sum(B) desc label sum(B) 'Total Positions'");{"TOTAL","TBC"}}

Step 7

Our final step is to convert this placeholder to an actual formula, to give the correct total. As with the data input to the query function, we leave the range reference open-ended to ensure it remains dynamic and will include new data automatically:

={QUERY(A1:B,"select A, sum(B) where A is not null group by A order by sum(B) desc label sum(B) 'Total Positions'");{"TOTAL",SUM(B1:B)}}

The result is:

Onion Method for complex formulas

Example 2: Deconstructing Complex Formulas With The Onion Method

If you’re trying to understand complex formulas in Google Sheets that someone else has shared with you, you can still approach it with this Onion Method.

Simply peel back the layers until you reach the innermost function. Copy that into a new cell and start from the inside and work out, building up to the full formula again.

Let’s see an example.

Suppose we’re given this worksheet with US State names:

dataset of US State Names

And we’re also given this formula:

=ArrayFormula(INDEX(A1:A20,MODE(MATCH(A1:A20,A1:A20,0))))

which gives an output of Texas.

But how does this formula work?

Applying The Onion Method, we peel back the layers to the core function and then build it up in steps again.

Step 1

In a new cell, add the innermost MATCH function:

=MATCH(A1:A20,A1:A20,0)

Step 2

=ArrayFormula(MATCH(A1:A20,A1:A20,0))

which outputs an array of the position of the first occurrence of the words in column A. We see a 2 next to every occurrence of Texas for example, because the first time it occurred was in position 2.

Step 3

Now, we wrap it with the MODE function to find the most frequently occurring position:

=ArrayFormula(MODE(MATCH(A1:A20,A1:A20,0)))

By definition, the MODE function takes a range of numbers for input and finds the most commonly occurring value.

However, what happens if we have a range of text values and want to find the most frequent?

In this case, MATCH has been used to create a range of numbers for the MODE function.

By now, we’ve probably deduced that this formula finds the most frequent word in a list.

Step 4

Finally, we can retrieve the actual text value, i.e. the most frequent State name, by adding the INDEX function to get the full original formula, like this:

=ArrayFormula(INDEX(A1:A20,MODE(MATCH(A1:A20,A1:A20,0))))

This will give the output Texas in this specific example.

Nice!

Another Example To Deconstruct Complex Formula

Here’s another example of the Onion method to deconstruct complex formulas:

Get A Unique List Of Items From A Column With Grouped Words

Complex Formulas Onion Method Template For Your Use

Click here to open a read-only copy of the template >>

This template contains both examples from this tutorial.

To make your own editable copy, please go to File > Make a copy… under the File menu.

Complex Formulas Onion Method Conclusion

The Onion Method is a framework that allows you to approach complex formulas in a systematic way.

Even if you’re presented with an “impossible” challenge to answer or an “impossible” formula to decipher, just follow this framework. If required, peel back the layers and then work from the inside out in an incremental fashion.

You’ll be amazed at how quickly your understanding of challenging formulas broadens and deepens. You’ll encounter and understand brand new functions that you’ve never heard of before. Plus, you’ll find out all sorts of secret tricks with existing formulas.

Who knows, you might even cry tears of joy instead of despair…

Radial Bar Charts in Google Sheets

In this post, I’m going to show you how to create radial bar charts in Google Sheets.

Radial Bar Chart in Google Sheets

They look great and grab your attention, which is important in this era of information overload.

But they should be used sparingly because they’re harder to read than a regular bar chart (because it’s harder to compare the length of the curved bars).

How To Create A Radial Bar Chart In Google Sheets

Let’s begin with the data.

In this example, we’ll create a radial bar chart in Google Sheets with 3 series.

We need a column of values for these 3 series, for example, products with a number of units sold.

Next, we need some upper limit (max value) for our bars. This allows us to scale the bars properly.

Lastly, we need a helper column that calculates the difference between the max value and the actual value.

Here’s the data for the radial bar chart, in cells E3:H6:

Radial bar chart data table with Formula

Ok, I’m going to let you in on a little secret now…

This is not a single chart. No sir, it’s three charts overlaid on top of each other.

And yes, this means it takes three times as long to create!

Step 1: Create the inner circle

Highlight the first row of data but exclude the max value column. In the example dataset above, highlight E4:G4 and insert a chart.

Select a doughnut chart.

Under the Setup menu, make sure to check the “Switch rows/columns” checkbox, so your chart looks like this:

Donut chart in Google Sheets

Under the customize menu of the chart tool, set the following conditions:

  • Background color: None
  • Chart border color: None
  • Donut hole size: 67%
  • Set Slice 2 color to none
  • Remove the chart title
  • Set the legend to none

This is what the inner donut should look like:

Donut chart in Google Sheets

Step 2: Create the middle circle

Repeat the steps above for the inner circle, but use the next row of data, choose a different color, and set the donut hole size to 77% (you may have to experiment with these percentages to line everything up at the end).

Drag the second donut chart on top of the first and line up the radial bars to get:

Radial Bar Chart in Google Sheets

Step 3: Create the outer circle

Again, repeat the steps above from the inner circle to create a third donut chart, using the third row of data, a different color, and setting the donut hole size to 81% (again, this might need tweaking to line everything up).

Drag this third donut chart on top of the other two and you have a radial bar chart in Google Sheets!

Radial Bar Chart in Google Sheets

Note on editing charts:

Since the charts are placed on top of each other, you’ll only be able to access the top chart to edit. You’ll have to move it to the side to access the chart underneath, and then move that one if you want to access the inner chart.

Step 4: Add the data labels

It gets messy to add the data labels to each chart through the chart editor, so I opted to create formulas to add my data labels into the cells next to each bar of the radial bar chart.

To access cells underneath the charts, click on a cell outside of the chart area and then use the arrow keys on your keyboard to reach the desired cell.

Once there, add the following formula:

=E6&": "&TEXT(F6,"#,0")

This formula uses the TEXT function to combine text and numbers in Google Sheets.

This shows the series name and value alongside each bar:

Data labels for radial bar chart in Google Sheets

To finish, remove the gridlines from your Sheet to give the chart a clean look.

Radial Bar Chart in Google Sheets

Can I see an example worksheet for the radial bar chart?

Yes, here you go.

Real World Examples of Radial Bar Charts

My friend Jeff Sauer, who founded Data Driven U to teach people data-driven marketing, contacted me recently about creating a radial bar chart for one of his workshops.

He is graciously sharing his report here, so you can see a radial bar chart with six rings:

Data Driven U chart

This is a screenshot of his Google Sheet!

(If you’re looking for top draw digital marketing, then you should definitely check out Jeff’s site: DataDrivenU.com This is not an affiliate link, just a personal recommendation!)

You’ve probably also seen a radial bar chart in the wild with the Apple Watch Rings Chart!

Apple Watch Ring Chart

Google Sheets REGEX Formulas

This post explores the Google Sheets REGEX formulas with a series of examples to illustrate how they work.

Regular expressions, or REGEX for short, are tools for solving problems with text strings. They work by matching patterns.

You use REGEX to solve problems like finding names or telephone numbers in data, validating email addresses, extracting URLs, renaming filenames containing the word “Application” etc.

They have a reputation for being hard, but once you learn a few basic rules and understand how they work you can use them effectively.

There are three Google Sheets REGEX formulas: REGEXMATCH, REGEXEXTRACT, and REGEXREPLACE.

Each has a specific job:

• REGEXMATCH will confirm whether it finds the pattern in the text.

• REGEXEXTRACT will extract text that matches the pattern.

• REGEXREPLACE will replace text that matches the pattern.

Let’s understand them with a series of examples.

Continue reading Google Sheets REGEX Formulas

Use The SWITCH Function to Categorize Data Efficiently

The SWITCH function is a useful tool for categorizing data. In the right circumstances, it can save you from messy, nested IF functions.

The SWITCH function is used to test an expression against a list of cases. It returns a value when the expression is equal to one of the cases.

It has some similarities to the IFS function but differs because SWITCH tests for exact matching rather than whether a condition is true (e.g. X > Y). SWITCH also has a default option to return a value if no match is found.

Let’s see an example. Suppose we have this data set of student grades and we want to add context to each grade level.

SWITCH function with student grades

Use this SWITCH function to categorize these grades:

=SWITCH(B2,
"A","Top marks! Great job!",
"B","Keep up the good work",
"C","Could do better",
"Failed. Remedial study needed")

Let’s break it down:

B2 is the condition we’re going to test. It’s the grade letter from column B and it’s the input to our SWITCH function.

We check the value of B2 against the first case “A”. If they match (i.e. the grade in B2 was also “A”) then SWITCH returns the string “Top marks! Great job!”.

If the value from B2 doesn’t match “A”, we move on and test it against “B”  and if that fails, test against “C”.

If the condition doesn’t match any of the cases, the last string is returned: “Failed. Remedial study needed”. It’s our catch-all solution. Anything that’s not “A”, “B” and “C” will return this answer.

This is what the solution looks like:

Switch Function in Google Sheets

Give it a try and SWITCH things up!