Programmatically send personalized emails


How to programmatically send personalized emails to 200 decision makers of Fortune 500 companies (with code samples)

Steven Goh

I want to close bigger deals by reaching decision-makers directly

I am the founder of Nubela. I lead sales for Proxycurl, and I do this alone. So it is vital that whatever I do is leveraged and effective. Luckily, I can code.
I figured that generic emails do not get to decision-makers at larger companies, and I want to move up the value chain. I want to close bigger deals, and I think I can achieve that by reaching decision-makers directly, with emails personalized with a unique problem-solution statement for their company. That is to say, I want to do what other sales reps are doing, and more.
To be specific, I want to reach 200 decision-makers in a single burst, and send 25 personalized follow-ups per decision-maker.
In this blog post, I will share how I accomplished this with code.

Getting a list of companies with Crunchbase Pro

I want to target larger companies that have the budget to purchase Proxycurl to build data-driven products. In particular, I have shortlisted companies such as sales automation tools, job boards, and talent sourcing companies as our target market. Crunchbase Pro is excellent for building such a list.

With Crunchbase Pro, I started a Company Search for a list of companies that

  1. match Proxycurl's target industries
  2. have revenues of more than $5M per annum

Then, I exported the search results into a CSV file, making sure it includes a column with each company's LinkedIn profile URL.
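As a minimal sketch of reading that export back in (the linkedin_url column name is an assumption; check it against your CSV's actual header):

import csv

def load_profile_urls(csv_path: str) -> list:
    # 'linkedin_url' is an assumed column name; Crunchbase may label it differently
    with open(csv_path, newline='', encoding='utf-8') as f:
        return [row['linkedin_url'] for row in csv.DictReader(f) if row.get('linkedin_url')]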

Once I have the list, I want to enrich each company with its name and corresponding corporate website. This is the Python script I used to enrich the list of companies I exported from Crunchbase Pro:

import httpx

# Config constants assumed defined elsewhere:
# PROXYCURL_HOST, PROXYCURL_API_KEY, RETRY_COUNT, PROXYCURL_XHR_DEFAULT_TIMEOUT

async def get_company(company_profile_url: str):
    api_endpoint = f'{PROXYCURL_HOST}/api/linkedin/company'
    header_dic = {'Authorization': 'Bearer ' + PROXYCURL_API_KEY}

    for _ in range(RETRY_COUNT):
        try:
            async with httpx.AsyncClient() as client:
                r = await client.get(api_endpoint,
                                     params={'url': company_profile_url},
                                     headers=header_dic,
                                     timeout=PROXYCURL_XHR_DEFAULT_TIMEOUT)
                assert r.status_code == 200
                return r.json()
        except Exception:
            continue

    return None


async def enrich_companies(lis):
    for profile_url in lis:
        coy = await get_company(profile_url)
        if coy is None:
            # Skip companies that failed to enrich instead of aborting the run
            continue
        website = coy.get('website', None)
        coy_name = coy.get('name', None)

        # todo - (task for reader) save `website` and `coy_name` in a file
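For completeness, this is roughly how I would kick off the enrichment, reusing the hypothetical load_profile_urls() helper sketched earlier; the filename is a placeholder:

import asyncio

profile_urls = load_profile_urls('crunchbase_export.csv')  # placeholder filename
asyncio.run(enrich_companies(profile_urls))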

Find decision makers with Proxycurl API

Now that I have companies, I need decision-makers. Decision-makers in this exercise mean people in the roles of CEO, COO, CTO, and VP of Product.

To accomplish this, I will search for them on Google. For example, if I want to find the LinkedIn profile of the CEO of Cognism, I will enter the following search phrase in Google:

linkedin.com/in ceo cognism

Chances are, the correct profile will be in the search results. I will then repeat the query with different roles until I have a list of profiles. And it works.

To perform Google searches at scale, I will use Proxycurl's "Crawling other pages" endpoint. This is how I programmatically make Google Search queries with Proxycurl:

import traceback
from typing import List
from urllib.parse import quote

import httpx
from bs4 import BeautifulSoup

# Config constants assumed defined elsewhere:
# OVERLORD_ENDPOINT, OVERLORD_USERNAME, OVERLORD_PASSWD

async def google_search_async(search_term, retry_count=5) -> List[str]:
    """
    Perform a Google Search via Overlord and return the list of result URLs
    from the first page.
    """
    for _ in range(retry_count):
        try:
            search_url = f"https://www.google.com/search?q={quote(search_term)}"
            payload = {'url': search_url,
                       "type": 'xhr',
                       }

            async with httpx.AsyncClient() as client:
                r = await client.post(f"{OVERLORD_ENDPOINT}/message",
                                      auth=(OVERLORD_USERNAME,
                                            OVERLORD_PASSWD),
                                      json=payload,
                                      timeout=PROXYCURL_XHR_DEFAULT_TIMEOUT)
                if r.status_code != 200:
                    print(
                        f"Google search failed with {r.status_code}, retrying.")
                assert r.status_code == 200

            html_src = r.json()['data']
            soup = BeautifulSoup(html_src, features="html.parser")
            result_lis = soup.select(".g a[ping]")
            href_lis = []
            for result in result_lis:
                href = result['href']
                if '//webcache.googleusercontent.com/search' in href:
                    continue
                if 'https://translate.google.com/translate' in href:
                    continue
                href_lis += [href]
            if len(href_lis) == 0:
                continue
            return href_lis
        except Exception:
            traceback.print_exc()
            continue
    raise Exception(f"Google search failed for '{search_term}'")

However, my computer is not smart enough to understand that "CEO" is the same as "Chief Executive Officer," or that "Engineering Head" and "Chief Engineering" are very much alike. For that, I have an algorithm I call is_string_similar(), which checks if two strings are similar.
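My actual implementation is more involved, but here is a hypothetical sketch of the idea: normalize common title synonyms, then fall back on fuzzy string matching. The synonym table and the 0.8 threshold are assumptions for illustration:

from difflib import SequenceMatcher

# Hypothetical synonym table; extend it with the title variants you encounter
_SYNONYMS = {
    'chief executive officer': 'ceo',
    'chief operating officer': 'coo',
    'chief technology officer': 'cto',
    'engineering head': 'vp engineering',
}

def _normalize(s: str) -> str:
    s = s.lower().strip()
    return _SYNONYMS.get(s, s)

def is_string_similar(a: str, b: str, threshold: float = 0.8) -> bool:
    a, b = _normalize(a), _normalize(b)
    if a == b or a in b or b in a:
        return True
    # Fuzzy fallback: similarity ratio between the two normalized strings
    return SequenceMatcher(None, a, b).ratio() >= threshold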

Once I have a list of LinkedIn profiles, I need to ensure that:

  1. The profile's current employment belongs to the company that I am googling for (Google gets this wrong sometimes)
  2. The profile's current role at the company matches the decision-making roles.

To perform the checks above, I will:

  1. Enrich the LinkedIn profiles with Proxycurl's Person Profile Endpoint to get each profile's list of experiences.
  2. Verify that their active employment matches up.

This is how I accomplish the above in Python code:

import asyncio
from typing import Dict, List, Tuple

import util  # provides is_string_similar()

# Reuses the config constants and imports from the earlier snippets.

async def get_person_profile(profile_url):
    api_endpoint = f'{PROXYCURL_HOST}/api/v2/linkedin'
    header_dic = {'Authorization': 'Bearer ' + PROXYCURL_API_KEY}

    for _ in range(RETRY_COUNT):
        try:
            async with httpx.AsyncClient() as client:
                r = await client.get(api_endpoint,
                                     params={'url': profile_url},
                                     headers=header_dic,
                                     timeout=PROXYCURL_XHR_DEFAULT_TIMEOUT)
                if r.status_code == 404:
                    return None
                assert r.status_code == 200
                return r.json()
        except Exception:
            continue

    print(f"{profile_url} retried {RETRY_COUNT} times but still failing")
    return None


# google_search_async() is the same function defined in the previous section.


async def find_people_in_roles(coy_name: str, li_coy_profile_url: str = None) -> List[str]:
    MAX_WORKERS = 10
    ROLES = ['ceo',
             'cto',
             'coo',
             'vp engineering'
             ]

    def does_role_match(role: str, person_profile: Dict) -> bool:
        for exp in person_profile['experiences']:
            # Only consider current positions (ends_at is None) at this company
            if not (exp['ends_at'] is None and util.is_string_similar(coy_name, exp['company'])):
                continue

            if not util.is_string_similar(role, exp['title']):
                continue

            return True
        return False

    async def search_li_profile(role: str) -> Tuple[str, List[str]]:
        url_result_lis = await google_search_async(f"linkedin.com/in {role} {coy_name}", retry_count=RETRY_COUNT)
        profile_url_lis = list(filter(lambda x: 'linkedin.com/in' in x,
                                      url_result_lis))
        return (role, profile_url_lis)

    print("Performing google search for Linkedin profiles")
    tasks = [search_li_profile(role) for role in ROLES]
    search_results = await asyncio.gather(*tasks)

    profile_url_lis = []
    for _, profile_lis in search_results:
        for profile_url in profile_lis:
            if profile_url not in profile_url_lis:
                profile_url_lis += [profile_url]
    print(f"Total of {len(profile_url_lis)} profiles to query")

    profile_dic = {}
    working_lis = []
    for idx, profile_url in enumerate(profile_url_lis):
        working_lis += [profile_url]

        if (idx > 0 and len(working_lis) > 0 and idx % MAX_WORKERS == 0) or idx == (len(profile_url_lis) - 1):
            print(f"Working on {len(working_lis)} profiles..")
            tasks = [get_person_profile(url) for url in working_lis]
            profile_data_lis = await asyncio.gather(*tasks)
            for i, url in enumerate(working_lis):
                profile_dic[url] = profile_data_lis[i]
            working_lis = []

    print("Done crawling profiles")

    # todo - (task for reader) filter profile_dic with does_role_match() and save the matches

And now, I have a list of LinkedIn profiles of decision-makers.

Get email addresses of decision-makers

There are two parts to this problem: the first is getting an email address, and the second is verifying it.

To get email addresses, I use Clearbit because they have an API to resolve a domain name and a person's name into an email address.
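As a sketch of this step (the endpoint and parameter names below are my assumptions based on Clearbit's Prospector API and should be verified against their current docs; CLEARBIT_API_KEY is a placeholder):

import httpx

CLEARBIT_API_KEY = '...'  # placeholder

async def get_email(domain: str, full_name: str):
    # Assumed endpoint/params; verify against Clearbit's Prospector docs
    async with httpx.AsyncClient() as client:
        r = await client.get('https://prospector.clearbit.com/v1/people/search',
                             params={'domain': domain, 'name': full_name},
                             auth=(CLEARBIT_API_KEY, ''))
    if r.status_code != 200:
        return None
    results = r.json().get('results', [])
    return results[0].get('email') if results else None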

And while Clearbit is good at fetching email addresses with high accuracy, their data is quite stale. In my experience, up to 30-40% of email addresses fetched from Clearbit bounce, and that is terrible for my cold outreach domain's reputation. Cold emails don't work if they end up in Spam. Mine don't, because I make sure my emails are personal and that they do not bounce.

To verify email addresses that I retrieve from Clearbit, this is how I do it:

import neverbounce_sdk

# NEVERBOUNCE_API_KEY and NEVERBOUNCE_TIMEOUT are config constants.

def verify_email(email):
    client = neverbounce_sdk.client(
        api_key=NEVERBOUNCE_API_KEY,
        timeout=NEVERBOUNCE_TIMEOUT)
    resp = client.single_check(email)
    # Keep only addresses NeverBounce marks as deliverable
    return resp['result'] == 'valid' and resp['status'] == 'success'

With this, I have a list of email addresses of decision-makers from my list of target companies.
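Chaining the two steps together, the per-prospect flow looks roughly like this (get_email() is the hypothetical Clearbit sketch above):

async def fetch_verified_email(domain: str, full_name: str):
    email = await get_email(domain, full_name)
    if email and verify_email(email):
        return email
    return None  # no deliverable address found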

Send sequenced cold emails

I have tried sending templated one-liner emails, and while they did work, I wanted better hit rates. Everything I have set out to do so far can be automated. But the one thing I cannot automate is understanding my target company's business, figuring out how Proxycurl's API can fit into their product, and writing it down in clear, simple problem-solution statements across 25 personalized email templates.

So while my emails are templated, 80% of the content in each email is variables customized by my input. And this is how I do it.

Simple web-app to customize email templates

I created a simple web app that iterates through the list of prospects so I can personalize an email for each specific company. For 200 decision-makers across about a hundred companies, this took me a few hours. Personalizing emails was the worst part of the process.
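Under the hood, the substitution itself is simple. As a hypothetical sketch (the template text and variable names are made up for illustration), Python's string.Template does the job:

from string import Template

# Hypothetical template; in practice ~80% of the body is variables.
TEMPLATE = Template(
    "Hi $first_name,\n\n"
    "$problem_statement\n\n"
    "$solution_statement\n\n"
    "Steven"
)

def render_email(prospect: dict) -> str:
    return TEMPLATE.substitute(
        first_name=prospect['first_name'],
        problem_statement=prospect['problem_statement'],
        solution_statement=prospect['solution_statement'])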

Did it work?

Unfortunately, I got better results with one-liner cold email outreaches to general email addresses. Here are my takeaways:

As smart as I thought I was, I think I overreached. I was overly verbose with the problem-solution statement: the first paragraph of the email was 3-4 lines long. Secondly, I was wrong in assuming that CEOs or COOs cared about Proxycurl. Proxycurl is, after all, a solution to be vetted by a developer/product team and not the CEO.

I am not giving up. Next up: I am crawling developers of product-oriented companies and reaching out to them directly with short one-liners. I hope it works better, and I will keep you updated in a follow-up post.
