Learn SQL code from the comfort of a Google spreadsheet

Have you heard of SQL, Structured Query Language, the code used to communicate with databases?

You know it’s a super important data skill to master but it’s intimidating because you’ve never seen it before.

You want to learn SQL, but don’t know where to start?

Well, here’s a first step you can take:

Try SQL out in a Google spreadsheet – a familiar and easy environment!

(Note 1: There are a few minor differences which I explain in this tutorial.)

(Note 2: You may find you need to “straighten” the quotes in the formulas in this tutorial if you copy and paste them directly into your Google sheet, to avoid curly quotes which won’t work. You need straight quotes.)

Setting up your Google sheet

1. Go to this Google sheet containing the data we need.

2. Click into cell A1 and hit Ctrl + A (on PC) or Cmd + A (on Mac) to highlight the whole table.

3. Copy the data Ctrl + C (on PC) or Cmd + C (on Mac).

4. Open your own blank Google sheet and paste the table data with your cursor at cell A1, so you have your own copy of the data:

Country population data table

5. Ensure you have the whole table selected (repeat step 2 in your own sheet):

table highlighted

6. Go to the menu: Data > Named ranges… and click this menu. A new pane will show in the right side of your spreadsheet as follows:

Named range menu

7. In the first input box, enter the word “countries”, as shown in the image below. This names our table of data so we can refer to it easily.

Named range menu detail

8. In Google sheets we must use the QUERY() function and write our pseudo-SQL code inside this function. So, we’ll enter all of our SQL code inside a QUERY function in cell G1.

=QUERY(countries,"our SQL code goes here between the quotes",1)

Ok, now we’re set up, let’s start writing SQL code!

SELECT all the data

The SQL code SELECT * retrieves all of the columns from our data table.

9. To the right side of the table, type the following formula into cell G1:

=QUERY(countries,” SELECT * “,1)

10. The output from this query is our full table again, because SELECT * retrieves all of the columns from the countries table:

select star

Wow, there you go! You’ve written your first piece of SQL code! Pat yourself on the back.

The equivalent real world SQL code would be:

FROM countries

Spot the difference: We don’t include a FROM clause with the QUERY function. This is different to SQL in the real world, where we must specify a FROM clause to select our table.

SELECT specific columns only

11. What if we don’t want to select every column, but only certain ones? Modify your query formula to read:

=QUERY(countries,”SELECT B, D”,1)

12. This time we’ve selected only columns B and D, so our output will look like this:

select specific columns

Equivalent real world SQL code:

SELECT country, population
FROM countries

WHERE clause

The WHERE clause specifies a condition that must be satisfied. It filters our data. It comes after the SELECT clause.

13. Modify your QUERY formula in cell G2 to select only countries that have a population greater than 100 million:

=QUERY(countries,”SELECT B, D WHERE D>100000000″,1)

14. Our output table is:

select where clause 1

Equivalent real world SQL code:

SELECT country, population
FROM countries
WHERE population > 100000000

15. Let’s see another WHERE clause example, this time selecting only European countries. Modify your formula to:

=QUERY(countries,”SELECT B, C, D WHERE C = ‘Europe’ ” ,1)

16. Now the output table is:

select where clause 2

Equivalent real world SQL code:

SELECT country, continent, population
FROM countries
WHERE continent = 'Europe'

ORDER BY clause

The ORDER BY clause sorts our data. We can specify column(s) and direction (ascending or descending). It comes after the SELECT and WHERE clauses.

17. Let’s sort our data by population from smallest to largest. Modify your formula to add the following ORDER BY clause, specifying an ascending direction with ASC:

=QUERY(countries,”SELECT B, C, D ORDER BY D ASC”,1)

18. The output table:

Select with order by ascending

Equivalent real world SQL code:

SELECT country, continent, population
FROM countries
ORDER BY population ASC

19. Modify your formula in cell G1 to sort the data by country in descending order, Z – A:

=QUERY(countries,”SELECT B, C, D ORDER BY B DESC”,1)

20. Output table:

Select with order by descending

Equivalent real world SQL code:

SELECT country, continent, population
FROM countries

LIMIT clause

The LIMIT clause restricts the number of results returned. It comes after the SELECT, WHERE and ORDER BY clauses.

21. Let’s add a LIMIT clause to our formula in G1 and return only 10 results:

=QUERY(countries,”SELECT B, C, D ORDER BY D ASC LIMIT 10″,1)

22. This now returns only 10 results from our data:

limit clause

Equivalent real world SQL code:

SELECT country, continent, population
FROM countries
ORDER BY population ASC

Arithmetic functions

We can perform standard math operations on numeric columns.

So let’s figure out what percentage of the total world population (7.16 billion) each country accounts for.

23. We’re going to divide the population column by the total (7,162,119,434) and multiply by 100 to calculate percentages. So, modify our formula to read:

=QUERY(countries,”SELECT B, C, (D / 7162119434) * 100″,1)

I’ve divided the values in column D by the total population (inside the parentheses), then multiplied by 100 to get a percentage.

24. The output table this time is:

arithmetic operations in SQL

Note – I’ve applied formatting to the output column in Google Sheets to only show 2 decimal places.

Equivalent real world SQL code:

SELECT country, continent, (population / 7162119434) * 100
FROM countries

Quick aside:

That heading for the arithmetic column is pretty ugly right? Well, we can rename it using the LABEL clause (however, careful as this is not part of SQL syntax). Try this out:

=QUERY(countries,"SELECT B, C, (D / 7162119434) * 100 LABEL (D / 7162119434) * 100 'Percentage'",1)

Equivalent real world SQL code:

SELECT country, continent, (population / 7162119434) * 100 AS Percentage
FROM countries

Aggregation functions

We can use other functions in our calculations, for example min, max and average.

25. To calculate the min, max and average populations in your country dataset, use aggregate functions in your query as follows:

=QUERY(countries,”SELECT max(D), min(D), avg(D)”,1)

26. The output returns three values – the max, min and average populations of the dataset, as follows:

aggregate SQL functions

Equivalent real world SQL code:

SELECT max(population), min(population), avg(population)
FROM countries

GROUP BY clause

Ok, take a deep breath. This is the most challenging concept to understand. However, if you’ve ever used pivot tables in Google Sheets (or Excel) then you should be fine with this.

The GROUP BY clause is used with aggregate functions to summarize data into groups, in the same way a pivot table does.

27. Let’s summarize by continent and count out how many countries per continent. Change your query formula to include a GROUP BY clause and use the COUNT aggregate function to count how many countries, as follows:

=QUERY(countries,”SELECT C, count(B) GROUP BY C”,1)

Note, every column in the SELECT clause (i.e. before the GROUP BY) must either be aggregated (e.g. counted, min, max) or appear after the GROUP BY clause (e.g. column C in this case).

28. The output for this query is:

select query with group by

Equivalent real world SQL code:

SELECT continent, count(country)
FROM countries
GROUP BY continent

29. Let’s see a more complex example, incorporating many different types of clause. Modify the formula in G1 to read:

=QUERY(countries,”SELECT C, count(B), min(D), max(D), avg(D) GROUP BY C ORDER BY avg(D) DESC LIMIT 3″,1)

This may be easier to read broken out onto multiple lines:

SELECT C, count(B), min(D), max(D), avg(D)

This summarizes our data for each continent, sorts by highest to lowest average population and finally limits the results to just the top 3.

The code is very similar to real SQL and it’s looking pretty complex now! Good job!

30. The output of this query is:

complex query

Equivalent real world SQL code:

SELECT continent, count(country), min(population), max(population), avg(population)
FROM countries
GROUP BY continent
ORDER BY avg(population) DESC

Well, that’s all I got for this tutorial folks!

I hope this tutorial allowed you to understand and write some basic SQL code, and see that it needn’t be intimidating.

And I hope you feel inspired to keep learning!

Please feel free to leave comments or questions below.


Official documentation for the QUERY() function.

Official documentation for Google’s Visualization API Query Language.

Want to keep learning SQL? Try this online tutorial from W3 Schools.

Prefer video? Try this free 1 hour introduction to the SQL language.

21 thoughts on “Learn SQL code from the comfort of a Google spreadsheet”

    1. Hi Alex – no, unfortunately you can’t use the column titles inside the QUERY function in Google sheets (see this Stack Overflow thread for some discussion on this subject). The only other variation I’ve seen is the ability to use “Col1”, “Col2”, “Col3” etc. when you combine the QUERY function with one of the IMPORT functions, e.g. try this:

      =query(importhtml("https://en.wikipedia.org/wiki/The_Simpsons","table",3),"select Col1, Col4, Col8 where Col8 > 25",1)

      For details on this, check out this introduction on Stack Overflow.

  1. This was really useful, thank you.
    I’m struggling to use arithmetic operators in a query. I want to minus the sum of one column from the sum of another. My instinct says
    query(data, “select D, (Sum(E) – Sum(G)) group by D where E>G”,1) but that doesn’t work :/
    Any guidance would be gratefully received!

    1. Hey Jonathan,

      You were really close but the WHERE clause needs to come before the GROUP BY, as the WHERE is applied to each row. In SQL there is a HAVING clause that comes after the GROUP BY but it’s not available in this Google Visualization API Query Language. So, the formula would be e.g.:

      =query(data,"select C, sum(D), sum(E), sum(D) - sum(E) where D>E group by C")

      Have a look at Step 3 in this sheet: https://docs.google.com/spreadsheets/d/1YGaB2odHHDVjFSUv-KY10iplxfFaQ4Wxs4dbBYj1ng0/edit?usp=sharing

      Feel free to make your own copy (File > Make a copy…) so you can edit it.


      1. Great!
        I’m getting all fancy now, then I do something that breaks it and I have to decipher the error codes (I’m actually beginning to understand those now too!)
        Self-taught, no experience, so thanks again. Your article and reply have helped me improve.
        The latest iteration after your input, look at me doing all SQL stuff!
        =query(data,”select D, Sum(E) – SUM(G) where E>G group by D order by sum(E) – Sum(G) desc label Sum(E) – Sum(G) ‘Difference'”,1)

  2. Great summary. Thanks much. I am trying to do a query to return duplicate results, and I seem to be doing something wrong with the syntax. It looks like this:

    Select I, count(F)
    WHERE count(F) > 1
    Order by count(F)

    The error I get looks like this:

    Unable to parse query string for Function QUERY parameter 2: CANNOT_BE_IN_WHERE: COUNT(F)

    Any idea what I am doing wrong here?

    1. Hi David,

      The problem here is that you’re not allowed aggregate functions (in this case COUNT) in the WHERE clause. The WHERE clause filters data at the row-level and won’t work with aggregated data. In real-world SQL, there’s a filter for aggregate groups called HAVING that comes after the GROUP BY, but sadly it’s not available in Google Sheets Query language.

      So the solution here is to remove the WHERE line altogether and then wrap your query inside of another one, which will filter on the groups greater than 1. It’ll look something like this:

      =query(query(,"select I, count(F) group by I order by count(F)"),"select * where Col2 > 1")

      You may need to tweak this to get it to work in your particular scenario.

      Here’s a screenshot of a quick example:

      How to do a Query without the HAVING function in Google Sheet


  3. Awesome. I was wondering what to do without the HAVING clause. Query a sub-query. Good thinking. And thanks for the Col2 syntax. Don’t know where I’d have found that. After a little tweaking to reach my goals, I achieved this, which works like a charm:

    =query(query(‘Form Responses 1’!F:I,”
    select I, count(F)
    where F = ‘Yes’
    group by I
    order by count(F)
    Label count(F) ‘Instances’
    select *
    where Col2> 1

  4. Is there anyway to use OR:

    iferror(query(importrange(AM3,”my_movies!C6:DF”),”select Col1 where ‘Col107’ OR ‘Col108’ = ‘george'”)))

    Any help would be awesome

  5. Hi Ben,

    I’m a TC (top contributor) on the Google Sheet forum and this post is probably the best intro to Query that I’ve seen.

    I’ve developed fairly extensive sheets and sheet systems for clients and friends, but have remained shy about Queries – the syntax has always seemed too delicate and I just spend too much time in trial and error… “maybe THIS clause goes first, maybe I need a comma here or there” etc.

    I’ll be coming back here to review your tutorial often, as I think it contains the answers I need.

    Thanks for the post!

Leave a Reply

Your email address will not be published. Required fields are marked *