How to extract email addresses tagged with a specific Gmail label

Extract email from Gmail

Everyone uses email in different ways. For me, email is at the very centre of my business. Everything flows through my Gmail account.

Being able to categorize all work enquiries or questions with a specific label, for example Queries (which I nest under my Work label) is one of the most useful features of Gmail.

Recently I needed to extract all of the email addresses for the hundreds of messages under this label. Super tedious to do manually, but thankfully there’s a much quicker way using Apps Script.

In this post, we’ll see how to setup a Google Sheet to extract all the email address for a specific Gmail label. I’ll share the code and walk through it with you.

How to use the Gmail Service with Apps Script and Google Sheets

Step 1: Set up the Google Sheet

The Google Sheet for this example is super simple for a change.

Cell B1 is where we type the label that we want to extract emails from, and then on row 3 are the two column headings, Name and Email.

The sheet looks like this:

Setup sheet for gmail app

Really important note

In this example I have a nested label, where I want email addresses from the label “queries” which belongs to the parent label “work”. Hence I need to write the label with a dash to show this relationship, with the parent label listed first, hence: “work-queries”.

If you are just looking at a single label with no relationship with other labels, then you can just type that label, e.g. “work”.

Step 2: Add the code

Go to Tools > Script editor...

Rename the project and clear out the boiler plate code so your editor window is blank:

Blank script editor window

Then paste in the following code (this same code is here on GitHub):

This script uses the Gmail Service to retrieve all the messages with the chosen label, and from there extract the names and email addresses.

Lines 2 – 7 add the custom menu to our spreadsheet. Specifically, line 5 adds the selector item to the menu, Extract emails..., which, when clicked, runs the function called extractEmails.

This extractEmails function begins on line 10.

In lines 13 to 16 we set up some variables to hold the active sheet and the label we’ve entered in cell B1:

Line 18 searches my Gmail with this label and gets all email threads tagged with that label.

Line 21 extracts all the messages from those threads and puts them into an array.

Lines 26 to 30 extract all of the To and From email addresses from all of these emails (so it doesn’t matter whether I sent or received, we’ll still grab the email addresses).

Lines 33 to 35 de-duplicate the email array, by filtering out any items in the array that occur more than once. Only the first instance will satisfy the filter condition (that the item index equals its position) and thus be included in our new array, uniqueEmailArray.

Lines 37 to 53 (up to the period) parse the email addresses to separate out the name and the email address and throw away the quotation marks and angled brackets. In other words, they turn this:

"Ben Collins" <benlcollins@gmail.com>

into this:

Ben Collins &nbsp , &nbsp benlcollins@gmail.com

The most difficult line to understand in this section of code is probably the regular expression on line 41 (which also messes up the code colors in this wordpress plugin!):

Let’s break it down, piece by piece:

  •    /   indicates the start of the regular expression
  •    \s*   means zero or more whitespace
  •    "?   means zero or one quotation mark "
  •    [^"]*   matches any character that is not a quotation, zero or more times
  •    ([^"]*)   is a numbered capturing group, which groups the regex between the parentheses
  •    "?   means zero or one quotation mark "
  •    \s+   means one or more white spaces
  •    & l t ;   means a left-angled bracket < (because this the format of our email addresses we receive from the Gmail Service
  •    (.+)   means one or more of any character (this is the email address)
  •    & g t ;   means a right-angled bracket > (because this the format of our email addresses we receive from the Gmail Service
  •    /   indicates the end of the regular expression

Click here for the full list of re2 regular expression syntax.

The rest of line 53 down to line 61 simply filters out some email addresses I’m not interested in. You’ll find there will still be some tidy up to do with the outputted list, but this step helps reduce that a little.

Line 64 clears out any old data that’s already in our sheet. It clears columns A and B from row 4 down to wherever the last bit of data is in columns A or B.

Finally, line 67 pastes in the array of names and email addresses that we’ve created, into columns A and B, starting from row 4, underneath the headings.

Right, that’s the script dealt with, let’s put this thing to work for us!

Step 3: Grant permission to Apps Script to access the necessary resources

Select the onOpen function in the menu and hit run:

run onOpen function

The script will send a message to tell you it’s preparing to execute:

Preparing script message

Next up, a message to alert you the script is running:

Script running message

However, it won’t run all the way through yet.

The authorization dialog popups will appear, to confirm it’s ok to access our spreadsheets and our Gmail.

Click Review Permissions to continue:

authorization required

The next authorization popup will show you details of what resources the script wants to access, in this case the spreadsheet service and Gmail.

Click Allow to continue:

authorization script resources

Finally you’ll see the yellow popup messages again at the top of your browser to say the script is preparing to run and then running.

When they disappear, head back to your spreadsheet and you should now see a new, custom menu called Extract Emails:

Custom email menu

Step 4: Extract the email addresses

Now it’s simply a case of running your script from that custom menu we created above.

Select Extract Emails... and click it to run your email extractor function.

Assuming it all works ok, then you should see names and emails populate your Sheet, as shown in the animated image at the top of this screen.

41 thoughts on “How to extract email addresses tagged with a specific Gmail label”

  1. Hi Ben,
    I’m so glad you provide this App Script.
    I needed it badly.
    I have gmail account of my customer conversation and 85000 email msgs.

    How do I use this script to read all To, From, Body to look for email address and save it in sheet.
    Please help me out on it.
    Thanks

    1. Hey Vinit,

      So this script already pulls out all the To and From addresses. Finding email addresses in the body copy will require you extract the body (using the getBody() method or the getPlainBody() method) and then parsing that, which may be more tricky than the To or From field because it’s more data and could contain @ symbols not necessarily indicating an email address.

      Thanks,
      Ben

      1. Hi Ben, wow, the script worked like a charm! But I would like more details on how to tweak it to find addresses in the body copy. I’m not a developer so I’m on shaky ground, and this would be incredibly helpful. We lost our older customer database when our office was struck by lightning (seriously – I kid you not). All we have are gmails that were forwarded to us by our old shop system, with the customer address in the body. Your script grabbed customers who happened to reply to those emails, which was about 10%. Thanks in advance.

  2. This is great! Thank you! Some of the emails I’m pulling come from a contact form on my website. I figured out how to add getRepyTo() to get the right email from the form (since it has to send from my email domain from the form), but I”m struggling getting the name. I’ve set the contact form up so that I’m getting the person’s name in the subject field, and I know Google has a getSubject() script where I can pull that info. How would I add a script here using getSubject() and add it in the name field if getFrom() = a specific email address (contact@emmersionlearning.com)?

  3. Hi Ben,

    this is awesome!!! thanks for that.
    I was wondering, is it possible to edit the scrip to also take out subject line so I would have name, email and subject in each row?

    thanks again for this script

  4. Hi is this possible with this script to extract all incoming emails? I run the script and it extracted only 242 emails. What i am doing wrong.
    Thanks

  5. Thank you so much for the script Ben 🙂
    I too am able to extract only a limited amount (about 500+)of emails from my sent folder, is there a way to extract more emails?

  6. Thanks for the script, very useful. Regarding only getting 242 emails, remember these are already de-duplicated and there may be many on each line. I got 850+ rows in the spreadsheet, many lines had multiple email addresses on them. My challenge is to read each line and place each email onto a separate line or row and de-duplicate the lot, so that I can then remove those I do not want and be left with a list I can use. Thanks again, I learnt a lot from going through your explanation.

      1. Hi Ben.

        Mine is also just 400+ from the supposedly 3000+..
        I have also double checked a few email addresses that are inside the emails with that label but not captured by the script.

        In any case, thank you for sharing this!

  7. This is my code
    -> when i am running from play button in the code screen it is log my required output( here Street)
    but when i am calling from cell like ” =final() ”
    it is giving error that you can’t perform action at given line ( which is GmailApp.Serach() line”

    i will be really happy if you can help me.

    Code :

    function final(){

    var threads1 = GmailApp.search("subject:submitted via SalesRabbit",0,100);

    for (var i = 0; i < threads1.length; i++) {
    var messages=threads1[i].getMessages();
    for(var j=0;j<messages.length;j++){
    var body=messages[i].getPlainBody();

    var AreaId=body.substring(body.indexOf("Area Id")+8,body.indexOf("Autopay"));
    var Autopay=body.substring(body.indexOf("Autopay")+8,body.indexOf("City"));
    var City=body.substring(body.indexOf("City")+5,body.indexOf("Contactdate"));
    var Contacdate=body.substring(body.indexOf("Contactdate")+11,body.indexOf("Email"));
    var Email=body.substring(body.indexOf("Email")+6,body.indexOf("Fname"));
    var Fname=body.substring(body.indexOf("Fname")+6,body.indexOf("Lead Id"));
    var Lead_Id=body.substring(body.indexOf("Lead Id")+9,body.indexOf("Lname"));
    var Lname=body.substring(body.indexOf("Lname")+6,body.indexOf("Numbertvs"));
    var Numbertvs=body.substring(body.indexOf("Numbertvs")+10,body.indexOf("Package"));
    var Package=body.substring(body.indexOf("Package")+10,body.indexOf("Phone"));
    var Phone=body.substring(body.indexOf("Phone")+10,body.indexOf("Promo Price"));
    var PromoPrice=body.substring(body.indexOf("Promo Price")+10,body.indexOf("Reciever Configuration"));
    var Reciever_Configuration=body.substring(body.indexOf("Reciever Configuration")+10,body.indexOf("Rep Id"));
    var RepId=body.substring(body.indexOf("Rep Id")+10,body.indexOf("Sale Date"));
    var SaleDate=body.substring(body.indexOf("Sale Date")+10,body.indexOf("Satelliteprovider"));
    var Satelliteprovider=body.substring(body.indexOf("Satelliteprovider")+10,body.indexOf("State"));
    var State=body.substring(body.indexOf("State")+10,body.indexOf("Street"));
    var Street=body.substring(body.indexOf("Street")+7,body.indexOf("Zip"));
    Logger.log(Street);
    return Street;

    }
    }

    }

  8. Thanks Ben – it worked like a charm!

    I’m delighted as I have no clue about scripts. Your instructions were very clear and helpful.

    I even managed to amend line 28 so as to only pull in replies as opposed to generating a list of everyone I had contacted.

    Thanks very much,

    Ben G

  9. This was a very helpful starting point. I tweaked it a bit to find messages from a specific person instead. One problem I was running into were messages that were sent to multiple people. The .getTo() method was returning a comma-delimited string instead of individual addresses. I compensated by changing the one push to the array with this:

    thisMessage.getTo().split(“,”).forEach(function (item) {
    emailArray.push(thisMessage.getFrom(),item);
    });

  10. Thanks for the tips and brief tutorial , quite helpful in getting the help while working with the gmail and provide the better way that can extract the email from the gmail.

  11. I am running into all sorts of trouble using this app. I only have 7,999 emails in a folder but the app is extracting over 20,000 email addresses. I tested it by created a new label and put 2 emails in it. It generated over 38 emails. Something is wrong with the code. Has anyone else had this problem?

  12. Thanks for this, Ben! Similar to some of the other folks, I’m trying to figure out why it’s pulling out a limited number of e-mail addresses (284 out of 38,000+ emails). Is it scoped to a certain date range by chance?

    Any help you have for this would be infinitely appreciated!

  13. Oh, the reason why I mention the date is I noticed that all the e-mail addresses that were extracted were from e-mails from February 2 or later. Nothing from February 1, 2018 or earlier is extracted.

  14. Thanks for script. Same issue for me as others, it only pulls out a limited number. In my case I have 600 emails and script pulls 351. I removed the to() and just get the from()..maybe its a time out on the script?

  15. Hello Ben,

    Thank a lot for the script and tutorial, they were excellent and a good starting point. Can you help me a little at least with some directions? I am looking to add the date of the specific email and I added “d.getData()” in the first array and I intend to get it somehow in the Sheet instead of the Name variable as I added that together with the email one in the same cell. Now my intuition is that I have to somehow pull that info in the regular expression syntax but I don’t know how as I am not good with regular expressions. Now there is also the high chance that it’s not done from there so that’s why I am asking for directions.

    P.S.: Without using d.getDate() as I don’t know how to include it in this code it gives me this error: “TypeError: Cannot find function match in object Mon Mar 19 2018 22:10:26 GMT+0200 (EET).” from where I understand the format of the date but I want only the “DD/MM/YYYY” and mostly how to add this info instead of the name variable.

    Thanks in advance!

  16. Great stuff.

    Initially I had some problems because the e-mails I wanted to extract came from a contact form, but then I was able to change line 26 and the “GetTo” field to “GetReplyTo” and it worked perfectly.

    Thanks a lot for making this!

  17. Hi Ben,
    This is great and very helpful!
    One thing though, it seems that the script cant handle labels with more than 500 threads.
    Is that really the case? If so, is there a way to workaround it?

  18. This one is really nice .. but I need Vice versa of this,

    I need the label of email, by putting a specific email address,
    how can we do that?

  19. Hi, thanks for the script, but there is some problem with the script. im getting error when running this script, i don’t the ERR:- The coordinates or dimensions of the range are invalid. (line 67, file “Code”).

    Could you please help me with this?

    1. Its because their is no data in the spreadsheet.
      Comment out this line
      // clear any old data
      // sheet.getRange(4,1,sheet.getLastRow(),2).clearContent();
      // Functional Programming is not all that….
      I’m use to Classes and or procedual programming background.. Functional programming is not for me.. Although i like it for some cases… Consider this: Its actually less efficient and for me harder to read..

  20. Hi Ben, Great Stuff!

    How would i tweak this code for it to apply to several labels in cells B1,C1,D1… as opposed to just one specific label.

    Cheers
    ARJ

  21. Hi there,

    Thanks a lot for your code, which was very helpful to me (and for others, I guess) !!!

    Reading other comments, I did bring some changes to your code =
    _ added CC and BCC fields
    _ splitted several emails for CC and BCC fields (separated by commas)
    _ removed the “space” and the “” before email adresses (case when the name is empty)

    Since I did not manage to do this with the .map() and .filter() functions, I have to change somehow the code.

    Here is my code, in case you need it :
    // add menu to Sheet
    function onOpen() {
    var ui = SpreadsheetApp.getUi();
    ui.createMenu(‘Extract Emails’)
    .addItem(‘Extract Emails…’, ‘extractEmails’)
    .addToUi();
    }

    // extract emails from label in Gmail
    function extractEmails() {

    // get the spreadsheet
    var ss = SpreadsheetApp.getActiveSpreadsheet();
    var sheet = ss.getActiveSheet();
    var label = sheet.getRange(1,2).getValue();

    // get all email threads that match label from Sheet
    var threads = GmailApp.search (“label:” + label);

    // get all the messages for the current batch of threads
    var messages = GmailApp.getMessagesForThreads (threads);

    var emailArray = [];

    // get array of email addresses
    messages.forEach(function(message) {
    message.forEach(function(currentEmail) {

    // get email FROM
    emailArray.push(currentEmail.getFrom());

    // get email TO
    currentEmail.getTo().split(“,”).forEach(function (item) {
    emailArray.push(item);
    });

    // get email CC
    currentEmail.getCc().split(“,”).forEach(function (item) {
    emailArray.push(item);
    });

    // get email BCC
    currentEmail.getBcc().split(“,”).forEach(function (item) {
    emailArray.push(item);
    });

    });
    });

    // de-duplicate the array
    // var uniqueEmailArray = emailArray.filter(function(item, pos) {
    // return emailArray.indexOf(item) == pos;
    // });

    var cleanedEmailArray = {} ;

    emailArray.forEach(function(el) {
    var name = “”;
    var email = “”;

    var matches = el.match(/\s*”?([^”]*)”?\s+/);

    if (matches) {
    name = matches[1];
    email = matches[2];
    }
    else {
    name = “”;
    email = el.replace(” “, “”).replace(“>”, “”).replace(“<", "");
    }

    // add only if email is not empty AND (( email is not already in cleanedEmailArray OR cleanedEmailArray has an empty name ))
    if ( email != "" && ( !(email in cleanedEmailArray) || cleanedEmailArray[email] === "" ) ) { cleanedEmailArray[email] = name } ;

    });

    var uniqueEmailArray = {} ;

    // remove duplicated entry
    for (var key in cleanedEmailArray) {
    if ( key in uniqueEmailArray ) { if ( uniqueEmailArray[ key ] === "" ) { uniqueEmailArray[ key ] = cleanedEmailArray[ key ] ; } }
    else { uniqueEmailArray[ key ] = cleanedEmailArray[ key ] }
    };

    // logger to check results
    /* Logger.log("cleanedEmailArray = " + cleanedEmailArray + "\n" ) ;
    for ( key in cleanedEmailArray) {
    Logger.log("cleanedEmailArray [ " + key + " ] = " + cleanedEmailArray[key] + "\n" ) ;
    };*/

    // clear any old data
    sheet.getRange(4,1,sheet.getLastRow(),2).clearContent();

    // paste in new names and emails and sort by email address A – Z
    var CurrentRow = 4 ;
    var Values = [] ;
    for ( key in uniqueEmailArray) {
    sheet.getRange(CurrentRow,1).setValue( key ) ;
    sheet.getRange(CurrentRow,2).setValue( uniqueEmailArray[key] ) ;
    CurrentRow = CurrentRow + 1 ;
    };
    }

Leave a Reply

Your email address will not be published. Required fields are marked *