Google Sheets Query function: Learn the most powerful function in Google Sheets

Have you heard of the Google Sheets query formula?

It allows you to use SQL (Structured Query Language, the code used to communicate with databases) style code to manipulate your data in Google Sheets and it’s super powerful. Arguably the most powerful function in Google Sheets.

Maybe you want to learn SQL, but don’t know where to start? You know it’s a super important data skill to master but it’s intimidating because you’ve never seen it before.

Well, here’s a first step you can take:

Try SQL out in a Google spreadsheet using the Google Sheets QUERY function – a familiar and easy environment!

(Note 1: There are a few minor differences which I explain in this tutorial.)

(Note 2: You may find you need to “straighten” the quotes in the formulas in this tutorial if you copy and paste them directly into your Google sheet, to avoid curly quotes which won’t work. You need straight quotes.)

Setting up your Google sheet

1. Go to this Google sheet containing the data we need.

2. Click into cell A1 and hit Ctrl + A (on PC) or Cmd + A (on Mac) to highlight the whole table.

3. Copy the data Ctrl + C (on PC) or Cmd + C (on Mac).

4. Open your own blank Google sheet and paste the table data with your cursor at cell A1, so you have your own copy of the data:

Google Sheets QUERY function data

5. Ensure you have the whole table selected (repeat step 2 in your own sheet):

table highlighted

6. Go to the menu: Data > Named ranges… and click this menu. A new pane will show in the right side of your spreadsheet as follows:

Named range menu

7. In the first input box, enter the word “countries”, as shown in the image below. This names our table of data so we can refer to it easily.

Named range menu detail

8. In Google sheets we use the Google Sheets QUERY function and write our pseudo-SQL code inside this function. So, we’ll enter all of our SQL code inside a QUERY function in cell G1.

=QUERY(countries,"our SQL code goes here between the quotes",1)

Ok, now we’re set up, let’s start writing SQL code!

SELECT all the data with the Google Sheets QUERY function

The SQL code SELECT * retrieves all of the columns from our data table.

9. To the right side of the table, type the following Google Sheets QUERY function into cell G1:

=QUERY(countries,"SELECT *",1)

10. The output from this query is our full table again, because SELECT * retrieves all of the columns from the countries table:

select star

Wow, there you go! You’ve written your first piece of SQL code! Pat yourself on the back.

The equivalent real world SQL code would be:

SELECT *
FROM countries

Spot the difference: We don’t include a FROM clause with the Google Sheets QUERY function. This is different to SQL in the real world, where we must specify a FROM clause to select our table.

SELECT specific columns only

11. What if we don’t want to select every column, but only certain ones? Modify your Google Sheets QUERY function to read:

=QUERY(countries,"SELECT B, D",1)

12. This time we’ve selected only columns B and D, so our output will look like this:

select specific columns

Equivalent real world SQL code:

SELECT country, population
FROM countries

WHERE clause

The WHERE clause specifies a condition that must be satisfied. It filters our data. It comes after the SELECT clause.

13. Modify your Google Sheets QUERY function in cell G2 to select only countries that have a population greater than 100 million:

=QUERY(countries,"SELECT B, D WHERE D > 100000000",1)

14. Our output table is:

select where clause 1

Equivalent real world SQL code:

SELECT country, population
FROM countries
WHERE population > 100000000

15. Let’s see another WHERE clause example, this time selecting only European countries. Modify your formula to:

=QUERY(countries,"SELECT B, C, D WHERE C = 'Europe' ",1)

16. Now the output table is:

select where clause 2

Equivalent real world SQL code:

SELECT country, continent, population
FROM countries
WHERE continent = 'Europe'

ORDER BY clause

The ORDER BY clause sorts our data. We can specify column(s) and direction (ascending or descending). It comes after the SELECT and WHERE clauses.

17. Let’s sort our data by population from smallest to largest. Modify your formula to add the following ORDER BY clause, specifying an ascending direction with ASC:

=QUERY(countries,"SELECT B, C, D ORDER BY D ASC",1)

18. The output table:

Select with order by ascending

Equivalent real world SQL code:

SELECT country, continent, population
FROM countries
ORDER BY population ASC

19. Modify your formula in cell G1 to sort the data by country in descending order, Z – A:

=QUERY(countries,"SELECT B, C, D ORDER BY B DESC",1)

20. Output table:

Select with order by descending

Equivalent real world SQL code:

SELECT country, continent, population
FROM countries
ORDER BY country DESC

LIMIT clause

The LIMIT clause restricts the number of results returned. It comes after the SELECT, WHERE and ORDER BY clauses.

21. Let’s add a LIMIT clause to our formula in G1 and return only 10 results:

=QUERY(countries,"SELECT B, C, D ORDER BY D ASC LIMIT 10",1)

22. This now returns only 10 results from our data:

limit clause

Equivalent real world SQL code:

SELECT country, continent, population
FROM countries
ORDER BY population ASC
LIMIT 10

Arithmetic functions

We can perform standard math operations on numeric columns.

So let’s figure out what percentage of the total world population (7.16 billion) each country accounts for.

23. We’re going to divide the population column by the total (7,162,119,434) and multiply by 100 to calculate percentages. So, modify our formula to read:

=QUERY(countries,"SELECT B, C, (D / 7162119434) * 100",1)

I’ve divided the values in column D by the total population (inside the parentheses), then multiplied by 100 to get a percentage.

24. The output table this time is:

arithmetic operations in SQL

Note – I’ve applied formatting to the output column in Google Sheets to only show 2 decimal places.

Equivalent real world SQL code:

SELECT country, continent, (population / 7162119434) * 100
FROM countries

Quick aside:

That heading for the arithmetic column is pretty ugly right? Well, we can rename it using the LABEL clause (however, careful as this is not part of SQL syntax). Try this out:

=QUERY(countries,"SELECT B, C, (D / 7162119434) * 100 LABEL (D / 7162119434) * 100 'Percentage'",1)

=QUERY(countries,"SELECT B, C, (D / 7162119434) * 100 LABEL (D / 7162119434) * 100 'Percentage' ",1)

Equivalent real world SQL code:

SELECT country, continent, (population / 7162119434) * 100 AS Percentage
FROM countries

Aggregation functions

We can use other functions in our calculations, for example min, max and average.

25. To calculate the min, max and average populations in your country dataset, use aggregate functions in your query as follows:

=QUERY(countries,"SELECT max(D), min(D), avg(D)",1)

26. The output returns three values – the max, min and average populations of the dataset, as follows:

aggregate SQL functions

Equivalent real world SQL code:

SELECT max(population), min(population), avg(population)
FROM countries

GROUP BY clause

Ok, take a deep breath. This is the most challenging concept to understand. However, if you’ve ever used pivot tables in Google Sheets (or Excel) then you should be fine with this.

The GROUP BY clause is used with aggregate functions to summarize data into groups, in the same way a pivot table does.

27. Let’s summarize by continent and count out how many countries per continent. Change your query formula to include a GROUP BY clause and use the COUNT aggregate function to count how many countries, as follows:

=QUERY(countries,"SELECT C, count(B) GROUP BY C",1)

Note, every column in the SELECT clause (i.e. before the GROUP BY) must either be aggregated (e.g. counted, min, max) or appear after the GROUP BY clause (e.g. column C in this case).

28. The output for this query is:

select query with group by

Equivalent real world SQL code:

SELECT continent, count(country)
FROM countries
GROUP BY continent

29. Let’s see a more complex example, incorporating many different types of clause. Modify the formula in G1 to read:

=QUERY(countries,"SELECT C, count(B), min(D), max(D), avg(D) GROUP BY C ORDER BY avg(D) DESC LIMIT 3",1)

This may be easier to read broken out onto multiple lines:

=QUERY(countries,
"SELECT C, count(B), min(D), max(D), avg(D)
GROUP BY C
ORDER BY avg(D) DESC
LIMIT 3"
,1)

This summarizes our data for each continent, sorts by highest to lowest average population and finally limits the results to just the top 3.

The code is very similar to real SQL and it’s looking pretty complex now! Good job!

30. The output of this query is:

complex query

Equivalent real world SQL code:

SELECT continent, count(country), min(population), max(population), avg(population)
FROM countries
GROUP BY continent
ORDER BY avg(population) DESC
LIMIT 3

Well, that’s all I got for this tutorial folks!

I hope this tutorial allowed you to understand and write some basic SQL code, and see that it needn’t be intimidating.

And I hope you feel inspired to keep learning!

Please feel free to leave comments or questions below.

Advanced techniques

> How to add a total row to your Query formulas

> How to use dates as filters in your Query formulas

Resources

Official documentation for the QUERY() function.

Official documentation for Google’s Visualization API Query Language.

Want to keep learning SQL? Try this online tutorial from W3 Schools.

Prefer video? Try this free 1 hour introduction to the SQL language.

84 thoughts on “Google Sheets Query function: Learn the most powerful function in Google Sheets”

    1. Hi Alex – no, unfortunately you can’t use the column titles inside the QUERY function in Google sheets (see this Stack Overflow thread for some discussion on this subject). The only other variation I’ve seen is the ability to use “Col1”, “Col2”, “Col3” etc. when you combine the QUERY function with one of the IMPORT functions, e.g. try this:

      =query(importhtml("https://en.wikipedia.org/wiki/The_Simpsons","table",3),"select Col1, Col4, Col8 where Col8 > 25",1)

      For details on this, check out this introduction on Stack Overflow.

  1. This was really useful, thank you.
    I’m struggling to use arithmetic operators in a query. I want to minus the sum of one column from the sum of another. My instinct says
    query(data, “select D, (Sum(E) – Sum(G)) group by D where E>G”,1) but that doesn’t work :/
    Any guidance would be gratefully received!

    1. Hey Jonathan,

      You were really close but the WHERE clause needs to come before the GROUP BY, as the WHERE is applied to each row. In SQL there is a HAVING clause that comes after the GROUP BY but it’s not available in this Google Visualization API Query Language. So, the formula would be e.g.:

      =query(data,"select C, sum(D), sum(E), sum(D) - sum(E) where D>E group by C")

      Have a look at Step 3 in this sheet: https://docs.google.com/spreadsheets/d/1YGaB2odHHDVjFSUv-KY10iplxfFaQ4Wxs4dbBYj1ng0/edit?usp=sharing

      Feel free to make your own copy (File > Make a copy…) so you can edit it.

      Cheers,
      Ben

      1. Great!
        I’m getting all fancy now, then I do something that breaks it and I have to decipher the error codes (I’m actually beginning to understand those now too!)
        Self-taught, no experience, so thanks again. Your article and reply have helped me improve.
        The latest iteration after your input, look at me doing all SQL stuff!
        =query(data,”select D, Sum(E) – SUM(G) where E>G group by D order by sum(E) – Sum(G) desc label Sum(E) – Sum(G) ‘Difference'”,1)

  2. Great summary. Thanks much. I am trying to do a query to return duplicate results, and I seem to be doing something wrong with the syntax. It looks like this:

    Select I, count(F)
    WHERE count(F) > 1
    GROUP BY I
    Order by count(F)

    The error I get looks like this:

    Error
    Unable to parse query string for Function QUERY parameter 2: CANNOT_BE_IN_WHERE: COUNT(F)

    Any idea what I am doing wrong here?

    1. Hi David,

      The problem here is that you’re not allowed aggregate functions (in this case COUNT) in the WHERE clause. The WHERE clause filters data at the row-level and won’t work with aggregated data. In real-world SQL, there’s a filter for aggregate groups called HAVING that comes after the GROUP BY, but sadly it’s not available in Google Sheets Query language.

      So the solution here is to remove the WHERE line altogether and then wrap your query inside of another one, which will filter on the groups greater than 1. It’ll look something like this:

      =query(query(,"select I, count(F) group by I order by count(F)"),"select * where Col2 > 1")

      You may need to tweak this to get it to work in your particular scenario.

      Here’s a screenshot of a quick example:

      How to do a Query without the HAVING function in Google Sheet

      Thanks,
      Ben

  3. Awesome. I was wondering what to do without the HAVING clause. Query a sub-query. Good thinking. And thanks for the Col2 syntax. Don’t know where I’d have found that. After a little tweaking to reach my goals, I achieved this, which works like a charm:

    =query(query(‘Form Responses 1’!F:I,”
    select I, count(F)
    where F = ‘Yes’
    group by I
    order by count(F)
    Label count(F) ‘Instances’
    “),”
    select *
    where Col2> 1
    “)

  4. Is there anyway to use OR:

    iferror(query(importrange(AM3,”my_movies!C6:DF”),”select Col1 where ‘Col107’ OR ‘Col108’ = ‘george'”)))

    Any help would be awesome

    1. Hi Robert,

      If you change:

      ”select Col1 where ‘Col107’ OR ‘Col108’ = ‘george'”

      to:

      ”select Col1 where Col107 = 'george' OR Col108 = 'george'”

      (i.e. remove the quotes around Col107 and Col108 and add the = ‘george’ for both) then this should work for you.

      Thanks,
      Ben

  5. Hi Ben,

    I’m a TC (top contributor) on the Google Sheet forum and this post is probably the best intro to Query that I’ve seen.

    I’ve developed fairly extensive sheets and sheet systems for clients and friends, but have remained shy about Queries – the syntax has always seemed too delicate and I just spend too much time in trial and error… “maybe THIS clause goes first, maybe I need a comma here or there” etc.

    I’ll be coming back here to review your tutorial often, as I think it contains the answers I need.

    Thanks for the post!
    Best,
    Lance

    1. No, SQL-style joins are not possible in the QUERY function. I’d suggest running two separate QUERY functions to get the data you want and then “joining” in the spreadsheet sense with VLOOKUP or INDEX/MATCH.

      Cheers,
      Ben

  6. So, if we can’t use Labels in the =QUERY function, does that mean we are limited to only the first 26 columns? After Z are we stuck?

    1. No, you can keep using letters in the same way as columns are labeled in the spreadsheet. So AA, AB, AC etc. becomes your next set of headings…

  7. Excellent tutorial! Many thanks.

    I have a question.
    I have 2 tabs in a spreadsheet.
    As a simplified example of my problem:
    Tab 1 has columns: country, population, inflation
    Tab 2 has columns: country, area
    I cannot combine the information all onto one tab.

    How do I create a query which will give me the equivalent of:
    select area, population
    from Tab1, Tab2
    where Tab1.country = Tab2.country

    Any advice hugely appreciated.

    1. Hi John,

      Unfortunately you can’t perform “joins” (bringing data from two different tables together) with the QUERY function, so you won’t be able to do what you’re trying to do with the QUERY function.

      So it sounds like a scenario for using a lookup formula (e.g. VLOOKUP or INDEX/MATCH) against the country name and retrieving the area and population into a single tab that way.

      Thanks,
      Ben

  8. I have the following data-set. I just want to have data from it for the financial year 2015-2016 only. How to do it? Please suggest. Thanks.
    Col A Col B
    2015 1
    2015 2
    2015 3
    2015 4
    2015 5
    2015 6
    2015 7
    2015 8
    2015 9
    2015 10
    2015 11
    2015 12
    2016 1
    2016 2
    2016 3
    2016 4
    2016 5
    2016 6
    2016 10
    2016 11
    2016 12
    2017 1
    2017 2
    2017 3
    2017 4
    2017 5
    2017 6
    2017 7

    1. Hi Rajnish,

      You should be able to do with this QUERY function:

      =QUERY(A1:B31,"select * where A = 2015 or A = 2016")

      Thanks,
      Ben

  9. Thank you for the tutorial! It has helped me a lot these past couple of days.

    I was wondering if it was possible to format the results of the query as they are returned. For example, using your sample spreadsheet:

    =QUERY(countries,”SELECT B, C”,1)

    And instead of Asia, Europe, Africa, North America, South America, we would get AS, EU, NA, SA.

    Thank you.

    1. Hi Kenny,

      The short answer is not really. The easiest solution is to create a small lookup table of the continents and their 2-letter codes and then just add a new lookup column next to the QUERY output columns.

      However, it is possible to create a QUERY formula version for this specific example. It works by taking the first two letters of the continent as shorthand, or for the two word continents (e.g. North America) taking the first letter of each word (i.e. NA).

      Here’s the full formula (warning, it’s a beast!):

      =ArrayFormula({query(countries,"select B",1),if(if(isnumber(SEARCH(" ",QUERY(countries,"SELECT UPPER(C)",1)
      )),2,1)=2,left(QUERY(countries,"SELECT UPPER(C)",1),1)&mid(QUERY(countries,"SELECT UPPER(C)",1),search(" ",QUERY(countries,"SELECT UPPER(C)",1))+1,1),left(QUERY(countries,"SELECT UPPER(C)",1),2))})

      which gives an output like this:

      Query and Array formula

      Whew, that was a sporting challenge! Here’s a working version in a Google Sheet.

      Cheers,
      Ben

      1. Wow, that’s exactly what I was looking for! Thank you so much! Looking at the query, I don’t feel so bad about not being able to figure this out myself.

        Now to sit here for the next few hours/days trying to understand how this thing works.

        Again, thank you and Happy Holidays!

          1. I’ve got to tell you that the extras you put into the sheet helped tremendously in helping me understand what you did. Thank you again!

  10. hi, how in the sql statement do a reference for a cell using where
    example,=query(libro;”SELECT C WHERE B=’A1′”;1). a1 doesn’t work. a1 has a id

    1. Hey Jonathan,

      You’ll need to move the A1 cell reference outside of the quotes to get this to work. Something like this:

      =query(libro,"select C where B = " & A1,1)

      Hope that helps!

      Cheers,
      Ben

      1. It works when it’s a number but doesn’t work when it’s a text. May you help me and tell also how could I use wildcards. Thank’s

    1. Great question. You can nest the QUERY functions to get a subquery type of behaviour. So use the first subquery to create a subset of your data, then insert this as your data table in the second query:

      =QUERY(QUERY(yourData,"your select"),"your 2nd select")

      Hope that helps!

      Ben

  11. Ben,
    As always, excellent work! Thank you for putting together such well documented tutorials. I will often add your name to my google searches to get results from you, specifically.

    I was hoping QUERY could be the solution for a particular challenge I have now. The problem is best illustrated here: http://imgur.com/a/Kbkm1

    I’m trying to write a formula that will count the items in one column that also appear in a separate column. For other, similar situations, I’ve added a helper column as seen in column D and the formula bar in the screenshot. I would then sum that entire column. I’m trying to find a way to do this without the helper column—putting all of this into one cell, such as cells E3 and E6, in the example (whose desired results would be 3 and 2, respectively).

    Is there some sort of query (or other function) that would provide this? Perhaps SELECT…COUNT…WHERE…(IN?) Or does QUERY only work with data structured as a database, where each row is a record and linked all the way across?

    An alternative would be to build a custom formula with GAS. That would be a fun project, but if there’s something easier, that would be nice.

    A secondary goal for this project (in a separate formula) is to return those values that were repeated in other columns, e.g., E3 (and overflow) would be:
    monkey
    pen
    magic

    (if you want the sample data to experiment with, you can find it here: https://goo.gl/e1acwu. Hopefully will save typing out some dummy data.)

    If you have a chance to take a crack at this, I’d be interested to see what you find. Thanks again for your work!

    1. Hey Andy,

      This is a perfect problem to solve with array formulas.

      So these ones give you the counts:

      =ArrayFormula(count(iferror(match(C2:C1000,A2:A1000,0),"")))

      =ArrayFormula(count(iferror(match(C2:C1000,B2:B1000,0),"")))

      and this one will output the matching values:

      =ArrayFormula(query(sort(vlookup(C2:C1000,A2:A1000,1,FALSE),1,TRUE),"select * where Col1 <> '#N/A'"))

      Hope this helps!

      Cheers,
      Ben

      1. Ben, thank you! This is awesome. I definitely need to explore array formulas more. I have a feeling knowing their uses could open up a lot of possibilities (or simplifications, at least).

        1. Yes, they’re awesome and super handy when you get the hang of them, although they can be much slower than regular formulas if you have big datasets.

  12. My years as a SQL dabase programmer are still useful, I see! Thanks Very Much for showing this, now my Sheet data is truly useful!

    1. Hi Ben. How can i use the AND function with query.
      e.g I want all states in columnA which are >5 from column B AND >10 from column C.

      1. Hey Goa,

        Not sure what your data range looks like, but you can simply have multiple WHERE clauses, like so:

        =QUERY(A1:E100, "select A,B,C where B > 5 and C > 10")

        Cheers,
        Ben

    1. The QUERY function can’t modify the underlying data, it can only output a copy of the data after doing any filter/aggregation etc. to it. So it’s only like SQL in the querying your data sense, not in a database admin sense.

  13. Hi Ben,
    I have been using query for quite sometime now and I have recently encountered an issues that i could use your expertise. So I use d the query function to pull data from multiple tabs in the same worksheet. The data in these tabs change everyday but the problem arises when one of these tabs do not have any data. I have tried both ” where Col1 ‘ ‘ ” and “where Col1 is not null” not avoid the empty tabs but it doesnt seem to work.

    The query function is as follows:
    =QUERY({‘ BD CR Control’!$A$16:$J;’ BB CR Control’!$A$16:$J;’ CD CR Control’!$A$16:$J;’ DB CR Control’!$A$16:$J;’ GB CR Control’!$A16:$J;’ RB CR Control’!$A$16:$J},”select Col4,sum(Col7),sum(Col10),sum(Col9),sum(Col8) where Col1 is not null group by Col4 label Col4 ‘Device’,sum(Col7) ‘Users’,sum(Col10) ‘Reg Visits’,sum(Col9) ‘Regs’,sum(Col8) ‘FTDs’ “)

    Sid

    1. Hi Sid, what’s the error message say? Can you share an example sheet? I couldn’t replicate this error myself ¯\_(ツ)_/¯

      1. Sid,

        I do something similar to this, but in two separate steps. Might be more work than you are interested in, but here’s what I do. I have a worksheet that does the combining, and I use a sort function to surround it all. That way any blank lines from any of the tabs will sort to the bottom. Like:

        =sort({‘ BD CR Control’!$A$16:$J;’ BB CR Control’!$A$16:$J;’ CD CR Control’!$A$16:$J;’ DB CR Control’!$A$16:$J;’ GB CR Control’!$A16:$J;’ RB CR Control’!$A$16:$J},1)

        Then I run the query based on THAT data, in which I can refer to columns by letter rather than Col#.

        In case that helps…

        -David

        1. Hi David,
          I did try this but the issue is that the 2mill cell limit will be reached. This is why I wanted to query directly from multiple tabs rather than querying from the combined tab which doubles the size of the data.

          Sid

      2. Hi Ben,
        Like you said. The error occurs in a weird inconsistent way. But I did come up with a temporary fix by passing a script that populates dummy values to the first row of the empty sheet.

        Sid

  14. Hi Ben,

    I’m looking for a way to produce a ‘search’ function using the query tool, is this possible? for someone to look up any word or number and only the results in the database containing that word would be displayed. I’ve seen examples of this but unsure where to begin.

    Any help is much appreciated,
    Thank you!
    Lucy

    1. Hi Lucy,

      Sounds like you could use Data Validation (see here for an example) to create drop down lists from which the user could select what they want, e.g. which category.

      Then your query function would look like something like this:

      =QUERY(data,"SELECT A, B, C WHERE C = '" & A1 & "' ",1)

      where the reference A1 is the cell with the data validation in, or the cell where someone would type in whatever they wanted to ‘search’ for. Note this can be any cell, doesn’t have to be in A1.

      Hope that helps!

      Ben

      1. How do you return any rows that contain a certain word in a particular column?

        So if we say we are searching for the word “food” would it be something like this:

        =QUERY(data,"SELECT A, B, C WHERE C = '" & food & "' ",1)

        Thanks in advance Ben. I know I’m close but not finding much helpful documentation anywhere else.

        1. You’re close! You just need to modify your QUERY function to be, with the word you want to match inside the single quotes:

          =QUERY(data,"SELECT A, B, C WHERE C = 'food'",1)

          Cheers!
          Ben

  15. Hi
    great tutorial !
    I need some help with the date
    =QUERY(Display;”SELECT B,N,C,E,F,I,O WHERE E>today() and N starts with ’36′”;1)
    it doesn’t work…

    Thanks 🙂

  16. Here is a handy apps script function that extends the ability of query() so that you can specify the columns by header rather than ‘D’ or Col5 or whatever. Much more intuitive


    /**
    * Enhances Google Sheets' native "query" method. Allows you to specify column-names instead of using the column letters in the SQL statement (no spaces allowed in identifiers)
    *
    * Sample : =query(data!A1:I,SQL("data!A1:I1","SELECT Owner-Name,Owner-Email,Type,Account-Name",false),true)
    *
    * Params : rangeName (string) : A quoted form of the range of the headers like "data!A1:I1" not data!A1:I1.
    * queryString (string) : The SQL-like query using column names instead of column letters or numbers
    * useColNums (boolean) : false/default = generate "SELECT A, B, C" syntax
    * true = generate "SELECT Col1, Col2, Col3" syntax
    */
    function SQL(rangeName, queryString, useColNums){
    var ss = SpreadsheetApp.getActiveSpreadsheet();
    var range = ss.getRange(rangeName);
    var row = range.getValues()[0];

    for (var i=0; i<row.length; i++) {
    if (row[i].length < 1) continue;
    var re = new RegExp("\\b"+row[i]+"\\b","gm");
    if (useColNums) {
    var columnName="Col"+Math.floor(i+1);
    queryString = queryString.replace(re,columnName);
    }
    else {
    var columnLetter=range.getCell(1,i+1).getA1Notation().split(/[0-9]/)[0];
    queryString = queryString.replace(re,columnLetter);
    }
    }
    //Logger.log(queryString);
    return queryString;
    }

    1. This was a great question, and much harder than it first appeared. Assuming I have the population dataset in columns A to D, then this formula will give me the countries with the minimum population for each continent:

      =ArrayFormula(
        {
          query(A1:D234,"select C, min(D) group by C order by min(D)",1),
          {
            "Country";
            vlookup(query(A1:D234,"select min(D) group by C order by min(D) label min(D) ''",1),{D1:D234,B1:B234},2,0)
          }
        }
      )

      What a beast! There may be other, more succinct, formulas out there…

      I’ll probably turn this into a formula challenge/blog post 🙂

      Cheers,
      Ben

  17. Hi Ben

    I have a dataset:
    Quarter 1 1 1 2 2 2
    Month 1 2 3 4 5 6
    Item1 8 6 5 9 2 4
    Item2 3 1 4 7 3 5
    Item1 4 7 2 6 2 4

    I need to sum Item1 where quarter = 1.

    Will query allow me to do this?

    1. Query is part of the answer but won’t get you there on it’s own, because you can’t filter on columns. If you know Quarter 1 was always the first three columns you could do it with a single query, but I’ve taken it to be more general, wherever quarter = 1 in row 1. Assuming your dataset was in the range A1:G5, then this formula gives the sum of Item1’s where Quarter = 1:

      =sum(query(transpose(query(A1:G5,"select * where A = 'Item1'",1)),"select sum(Col2), sum(Col3) where Col1 = '1' group by Col1",1))

      Essentially, it’s just a query to select item 1’s, then that gets transposed, then I run a second query to select Quarter 1’s, then sum the answer!

      Cheers,
      Ben

  18. So, I’m trying to use Data Validation for user selection to compare certain cells to one another (trying is the keyword). I’m new to a lot of what sheets has to offer and not too knowledgeable, but I can normally fumble my way through it to figure it out. However, I’m trying to get query to output a single cell instead of a row or column from the data set range. Maybe I’m using this in a way it’s not intended to used.

    So, in the example table above with the countries, could you set up a Data Validation to have the user select a specific country and have query spit out just the population of that country?

    1. Hey Jeremy,

      In answer to your specific question, if you have a data validation in one cell that creates a drop-down list of countries then you could just use a VLOOKUP to retrieve the country population from that data table, e.g.

      =vlookup(F2,B:D,3,false)

      where the data table was in columns A:D and the data validation was in cell F2.

      If you want a single cell answer back from your query function, then you’ll want to return a single column with an aggregation function.

      Hope that helps!

      Ben

  19. Hi Ben, I used this function, but i am not getting the result correctly. I have created the range. it has dates and order number and relative information. and total lines are 20. I used the function as below. =QUERY(Data,”select K where Q >3″,1), all the Q column contains 10(above 3 as per the formula. but it is showing the result of only 18. 2 row result is not appearing.

    can you help me figure it out. and am i using this wrong.

    Reagards
    Mahesh

    1. Hi Mahesh,

      Hmm, sounds like it maybe an issue with the data range you’ve included in the query, so make sure the full data table is referenced in the query function. The 1 at the end of the query function means you have a header row, so make sure you’ve included that in your query range too.

      Also, check that the two missing rows have 10 in them, and that’s the number 10 and not stored as a text value. (To fix, copy in one of the other 10 values.)

      Cheers,
      Ben

  20. Hi,
    Great Tutorial! Thanks.

    I had noticed that, in the original dataset, the value for ‘Rank’ was not always an integer, rather an ’emdash’ (I think, whatever). See Puerto Rico.

    However, the result for those countries for SELECT * and beyond was empty! I wondered why this was the case and I poked around and had almost given up. Then I stumbled upon this:

    https://support.google.com/docs/answer/3093343?hl=en

    Under the collapsible fieldset with the legend:
    ‘Missing data when you query text and numbers’

    This anomaly is explained well.

    Regards!

  21. Is it possible to select multiple columns and group by multiple columns?

    Here’s what I have so far and it works:
    =query(Attendance!A1:D,” select B, count(A) where A is not null group by B label count(A) ‘Count’ “,1)

    But I’d like it to say this, which does not work:
    =query(Attendance!A1:D,” select B, C count(A) where A is not null group by B, C label count(A) ‘Count’ “,1)

    Column A is a time stamp.

    I’m looking to select first and last name from google form responses and group by first and last name.

  22. I have adapted some of this help as well as off another site. I am close but need to clean it up a bit more for my uses.

    This is a zillow scraper for realtors.

    In column A1 is the zillow URL for a home.

    In column B1
    =ArrayFormula(QUERY(QUERY(IFERROR(IF({1,1,0},IF({1,0,0},INT((ROW($A:A)-1)/20),MOD(ROW($A:A)-1,20)),importXML($A:$A,”//span[@class=’snl phone’] | //meta[@property=’og:zillow_fb:address’]/@content | //meta[@property=’product:price:amount’]/@content| //div[@class=’hdp-fact-ataglance-value’] “))),”select min(Col3) where Col3 ” group by Col1 pivot Col2″,0),”offset 1″,0))

    The output is 19 columns
    address,price, 9 columns of data i dont need, and 8 columns of phone numbers sorted by min value

    What I need is

    Address, price, 8 phone numbers as pulled from import order 1 2 3 4 5 6 7 8, then the rest of the data in any order.

    Because I can search the 10-19 columns for data i need, but sometimes there is no extra data, and the phone numbers get put in different columns and messes up my references.

    If I cant get special sorting via querries, can we turn off sorting and just go with 8 phones up front, then price then the rest of the data?

Leave a Reply

Your email address will not be published. Required fields are marked *