Pivot Table Maps in Google Sheets…yes, really!

Let me show you a unique use case for pivot tables – Pivot Table Maps!

Pivot Table Map of Washington DC

Can you guess which city this is?

It’s Washington D.C. and it’s also a pivot table in Google Sheets. The image on the left is the Pivot Table map built with a pivot table. The image on the right is a screenshot of Washington D.C. from Google maps.

Look closely and you might just be able to see the Google Sheet row and column headings around the map.

Wait, what?

I downloaded a dataset of DC crime data, which had latitude and longitude data, and plotted it as a pivot table with conditional formatting.

Pivot Table Maps of Washington DC

The dark areas in the pivot table map represent Rock Creek Park (huge wooded area with few people, hence few crimes) and two rivers, the Anacostia, which runs through the SE of the city, and the Potomac river, which delineates the W edge of the city.

Annotated Pivot Table Map of Washington DC

And here’s a close up of the Sheet, to convince you it’s really a pivot table (click to enlarge) 😉

Pivot Table map in Google Sheets

I first saw this technique implemented in Excel (see this example) but I haven’t seen anything in Google Sheets before.

How to make Pivot Table Maps

It’s actually pretty simple and can be performed on any dataset with latitude and longitude data, to varying degrees of success (some datasets are too large, some not granular enough).

  1. Download your chosen dataset.
  2. Round your latitude and longitude data columns to 1, 2 or 3 decimal places (depending on how granular your data is, trial and error).
  3. Highlight the whole data table and create a pivot table.
  4. Insert latitude as rows.
  5. Sort the latitude rows as descending, since larger positive numbers are further North.
  6. Insert longitude as columns.
  7. Highlight all the longitude columns and reduce the column width of all of them in one go, so all the cells are square.
  8. Insert your measure into the values section of your pivot report builder, in the DC map example above I inserted “Offense” and used COUNTA to count them.
  9. Highlight whole pivot table and choose Conditional Formatting –> Color Scale (I chose variations of yellow).
  10. Change the background of the pivot table to black.
  11. Zoom out on your Sheet as far as you need to see the full “map”.

Want to see this template?

Click here >>

Feel free to make your own copy (File > Make a copy…).

When could we use Pivot Table Maps?

It’s a quick and dirty way of visualizing geographic data sets.

It’s not really suitable for deep analysis or final charts, where a tool like Tableau would really prove it’s worth, but for a quick look with only a few minutes work, it’s not too bad!

Here’s a few more Pivot Table Maps of my own:

A pivot table map of 7,000 breweries and brew pubs in the USA:

A pivot table map of breweries and brew pubs in the USA

Houston, Texas as a pivot table map, created from 311 data:

Houston pivot table map

Subway stations of New York City pivot table map:

NYC Subway pivot table map

Let me know if you try this yourself! I’d love to see any creations you come up with.

How To Make a Table in Google Sheets, and Make It Look Great

This post is a collection of formatting tips for how to make a table in Google Sheets.

Let’s start with a simple table, completely devoid of any formatting:

Table no format

The goal of this article is to show you how to make a table in Google Sheets look great, like this:

How to make a table in Google Sheets and make it look great

Continue reading How To Make a Table in Google Sheets, and Make It Look Great

18 Best Practices for Working with Data in Google Sheets

This article outlines 18 best practices for working with data in Google Sheets.

It’s a compilation of my own experiences of working with data in spreadsheets for 15+ years, along with the opinions of others I’ve worked with and reports and articles I’ve read online.

By no means is it meant to be exhaustive or the last word on the subject, but if you follow these guidelines, you should have a robust data workflow.

Following these best practices for working with data will make you more efficient and reduce the chance of errors creeping in. It’ll make your work easier to follow and understand and add value to your team’s or client’s workflow process. It’s a good habit to have, and it’ll serve you well as you progress with your data career.

Continue reading 18 Best Practices for Working with Data in Google Sheets

How to Create a Scatter Plot in Google Sheets

Whenever I’ve taught data analysis classes or data visualization classes, for General Assembly or privately or online, I find that the humble scatter plot is often poorly understood.

Perhaps it’s because they’re less common than simple bar charts, line charts or pie charts? Or maybe it’s because they take a bit more mental effort to understand what they’re telling us?

Regardless, they’re a crucial tool for analyzing data, so it’s important to master them. This post looks at the meaning of scatterplots and how to create them in Google Sheets.

What is a scatter plot?

Simply put, a scatter plot is a chart which uses coordinates to show values in a 2-dimensional space.

In other words, there are two variables which are represented by the x- and y-axes.

scatterplot in google sheets

In this example, the scatter plot shows the relationship between pageviews of a website and the number of signups that website received. As you can see, when the number of pageviews increases, the number of signups tends to also increase. They are positively correlated, but more on that in a minute.

Often the variable along the x-axis is the independent variable, which is the variable under the control of the experimenter, and the variable up the y-axis is called the dependent variable, or measured variable, because it’s the variable being observed to see how it changes when the independent variable changes.

It’s possible for both variables to be independent, in which case it doesn’t matter which axis they’re plotted on and the scatter plot shows any correlation between the two.

Why is a scatter plot useful?

A scatter plot is incredibly useful because it can show you, at a glance, what the big picture is, what the overall relationship is, what the trend is, between two variables.

Looking at the numbers alone is not particularly intuitive. It’s hard, impossible often, to determine how they’re related to each other.

Scatter plot example

Let’s take a look at a real-world example, using data showing property sales in Manhattan. I’ve extracted the data for properties between 1,000 sq.ft. and 5,000 sq.ft. and removed any without a sales price listed.

This leaves 250 values in a dataset, like so:

Scatter plot data

To create a scatter plot, highlight both columns of data (including the header row).

Then click Insert > Chart

Initially it’ll create a terrible bar chart, where each of the 250 rows of data is represented by a bar. Yikes!

bad bar chart

It’s a very simple fix to transform it into a scatterplot. On the chart menu, on the Data tab, simply choose the Scatter option, as shown in this image:

scatter plot menu

There you have a nice scatter plot!

Focus on a single point for a moment (shown in red in this image):

Reading a scatterplot

You can read off a pair of values, in this case 3,000 sq. ft. and $3,750,000, which tell us that we have a data point (representing a property sold in Manhattan) which was 3,000 square foot and had a sales price of $3.75 million.

We can write it as a coordinate pair: (3,000 , 3,750,000)

So each point, each plot, in our chart represents a coordinate pair of area and sales price, each plotted according to the rows of data in our dataset.

This is the real power and beauty of a scatter plot. It shows all of those rows of data in a single chart, so we can absorb something about the dataset as a whole.

Interpreting a scatter plot (correlation)

Well all those points on your scatter plot are pretty and they show something, but what exactly? And is there anything else we can glean from the scatter plot?

They show trends within our dataset.

But it’s hard to see this from just the points, so we can add a trendline like so (shown in red):

scatterplot with trendline

Ah ha! That’s interesting and useful.

It shows a general upward trend, which is what we’d expect. As the size of a property increases, so does it’s sales price.

Now, if we want to predict a sales price for a given area, say 4,500 sq. ft., we can use this line.

Start at the 4,500 sq. ft. mark on the x-axis, trace up to the line and then across to the y-axis and read off the value:

scatter plot and trendline

I can read off a value of $5,900,000 as the predicted value of a 4,500 sq. ft. property.

You might be wondering how to do that a bit more “scientifically”?

Well, we can use the equation of the trend line to calculate the number.

The line equation takes the basic form:

y = ax + b

So to predict y, we need to know the value of x (4,500 sq.ft. in our example) multiplied by the value of a (which is the slope of the line) and adding on the value of b (the intercept, or where the line crosses the y-axis).

We calculate a from our data using the SLOPE function:

=SLOPE( B2:B277 , A2:A277 )

which gives us: 1166.42218

We calculate b from our data using the INTERCEPT function:

=INTERCEPT( B2:B277 , A2:A277 )

which gives us: 712264.7317

Then I can calculate my predicted y-value using the equation:

y = 1166.42218 x + 712264.7317

into which I plug in the x-value of 4,500 sq. ft.:

y = 1166.42218 * 4500 + 712264.7317

to get the answer: $5,961,165

All that from a humble scatterplot.

How do we know if this line is a good fit? Will it give us “good” predictions?

Stay tuned for the next post, where we’ll look at how to answer that question.

See Also

How to make a Histogram in Google Sheets and overlay a Normal Distribution Curve

Want to learn more about Data Analysis?

There’s a lot more to scatterplots, and my new course, Data Analysis with Google Sheets, does a deep dive into scatterplots and how to use them to understand your data better.

Understanding Average In Google Sheets With The World’s Richest Person

This is a story about a bar, 10 regular folks, and the world’s richest man, to explore different measures of average in Google Sheets.

Somewhere along the way, we’ll seek to demonstrate the robustness of the different average measures, but more on that in a minute.

I want you to picture your favourite bar or pub.

For me, it might be a pint of ale at The Dickens Inn, near the River Thames in London:

Dickens Inn London pub

I should just finish this blog post here, and we could all spend the rest of the day in happy reverie, supping our favourite tipple.

Alas, that won’t do! We have work to do and things to learn, so let’s get started.

Continue reading Understanding Average In Google Sheets With The World’s Richest Person