How to programmatically send personalized emails to 200 decision makers of Fortune 500 companies (with code samples)
I want to close bigger deals by reaching decision-makers directly
I am the founder of Nubela. I lead sales for Proxycurl, and I do this alone. So it is vital that whatever I do is leveraged and effective. Luckily, I can code.
I figured that generic emails do not get to decision-makers in larger companies, and I want to move up the value chain. I want to close bigger deals, and I think I can achieve that by reaching decision-makers directly, with emails personalized with a unique problem-solution statement for their company. That is to say, I want to do what other sales reps are doing, and more.
To be very specific, I want to reach 200 decision-makers in a single burst, and I want to send 25 personalized follow-ups per decision-maker.
In this blog post, I will share how I accomplished this with code.
Getting a list of companies with Crunchbase Pro
I want to target larger companies that have the budget to purchase Proxycurl to build data-driven products. In particular, I have shortlisted companies such as sales-automation tools, job boards, and talent-sourcing companies as our target market. Crunchbase Pro is excellent for building a list like this.
With Crunchbase Pro, I started a Company Search for a list of companies that
- matched Proxycurl's target industries
- have revenues of more than $5M per annum
Then, I exported the search results into a CSV file, making sure it includes a column with each company's LinkedIn profile URL.
Once I have the list, I want to enrich it with company names and their corresponding corporate websites. This is the Python script I used to enrich the list of companies I exported from Crunchbase Pro:
import httpx

# PROXYCURL_HOST, PROXYCURL_API_KEY, RETRY_COUNT, and
# PROXYCURL_XHR_DEFAULT_TIMEOUT are configuration constants
async def get_company(company_profile_url: str):
    api_endpoint = f'{PROXYCURL_HOST}/api/linkedin/company'
    header_dic = {'Authorization': 'Bearer ' + PROXYCURL_API_KEY}
    for _ in range(RETRY_COUNT):  # retry on network errors or non-200 responses
        try:
            async with httpx.AsyncClient() as client:
                r = await client.get(api_endpoint,
                                     params={'url': company_profile_url},
                                     headers=header_dic,
                                     timeout=PROXYCURL_XHR_DEFAULT_TIMEOUT)
            assert r.status_code == 200
            return r.json()
        except Exception:
            continue
    return None


async def enrich_companies(lis):
    for profile_url in lis:
        coy = await get_company(profile_url)
        if coy is None:
            continue  # skip companies that could not be enriched
        website = coy.get('website', None)
        coy_name = coy.get('name', None)
        # todo - (task for reader) save `website` and `coy_name` in a file
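To feed enrich_companies(), the LinkedIn profile URLs can be read straight out of the Crunchbase export. Here is a sketch; the file name and the 'LinkedIn Profile' column name are assumptions about your particular export:

import asyncio
import csv

# load the LinkedIn company profile URLs exported from Crunchbase Pro
with open('crunchbase_export.csv') as f:
    lis = [row['LinkedIn Profile'] for row in csv.DictReader(f)]

asyncio.run(enrich_companies(lis))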
Find decision-makers with the Proxycurl API
Now that I have companies, I need decision-makers. In this exercise, a decision-maker is someone in the role of CEO, COO, CTO, or VP of Product.
To accomplish this, I will search for them on Google. For example, if I want to find the LinkedIn profile of the CEO of Cognism, I will enter the following search phrase into Google:
linkedin.com/in ceo cognism
Chances are, the correct profile will be in the search results. I will then repeat the query with different roles until I have a list of profiles. And it works.
To perform Google searches at scale, I will use Proxycurl's "Crawling other pages" endpoint. This is how I programmatically make Google Search queries with Proxycurl:
import traceback
from typing import List
from urllib.parse import quote

import httpx
from bs4 import BeautifulSoup

async def google_search_async(search_term, retry_count=5) -> List[str]:
    """
    Perform a Google Search via Overlord and return the list of result URLs
    from the first page.
    """
    for _ in range(retry_count):
        try:
            search_url = f"https://www.google.com/search?q={quote(search_term)}"
            payload = {'url': search_url,
                       'type': 'xhr',
                       }
            async with httpx.AsyncClient() as client:
                r = await client.post(f"{OVERLORD_ENDPOINT}/message",
                                      auth=(OVERLORD_USERNAME,
                                            OVERLORD_PASSWD),
                                      json=payload,
                                      timeout=PROXYCURL_XHR_DEFAULT_TIMEOUT)
            if r.status_code != 200:
                print(f"Google search failed with {r.status_code}, retrying.")
            assert r.status_code == 200
            html_src = r.json()['data']
            soup = BeautifulSoup(html_src, features="html.parser")
            result_lis = soup.select(".g a[ping]")
            href_lis = []
            for result in result_lis:
                href = result['href']
                # skip Google's cached and translated copies of result pages
                if '//webcache.googleusercontent.com/search' in href:
                    continue
                if 'https://translate.google.com/translate' in href:
                    continue
                href_lis += [href]
            if len(href_lis) == 0:
                continue  # empty result page; retry
            return href_lis
        except Exception:
            traceback.print_exc()
            continue
    raise Exception(f"Google search failed for: {search_term}")
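For example, the Cognism query from earlier can be run like so (a usage sketch, assuming the Overlord configuration constants above are set):

import asyncio

urls = asyncio.run(google_search_async("linkedin.com/in ceo cognism"))
print(urls)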
However, my computer is not smart enough to understand that "CEO" is the same as "Chief Executive Officer," or that "Engineering Head" and "Chief Engineering" are very much alike. For that, I have an algorithm which I call is_string_similar(). You can find the algorithm to check if two strings are similar here.
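The linked article has the actual algorithm, but to make the idea concrete, here is a minimal sketch of such a check built on Python's difflib. The alias table and the 0.8 threshold are illustrative assumptions, not the real implementation:

from difflib import SequenceMatcher

# hypothetical alias table so abbreviations can match their long forms
ROLE_ALIASES = {
    'ceo': 'chief executive officer',
    'coo': 'chief operating officer',
    'cto': 'chief technology officer',
}

def is_string_similar(a: str, b: str, threshold: float = 0.8) -> bool:
    a, b = a.lower().strip(), b.lower().strip()
    if a in b or b in a:  # substring match, e.g. "ceo" in "founder & ceo"
        return True
    # expand known abbreviations so "ceo" can match "chief executive officer"
    a = ROLE_ALIASES.get(a, a)
    b = ROLE_ALIASES.get(b, b)
    if a in b or b in a:
        return True
    # otherwise fall back to a fuzzy similarity ratio
    return SequenceMatcher(None, a, b).ratio() >= threshold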
Once I have a list of LinkedIn profiles, I need to ensure that:
- The profile's current employment belongs to the company I am googling for (Google sometimes gets this wrong)
- The profile's current role at the company matches one of the decision-making roles.
To perform the checks above, I will:
- Enrich the LinkedIn profiles with Proxycurl's Person Profile Endpoint to get each profile's list of experiences.
- Verify that their active employment matches up.
This is how I accomplish the above in Python code:
import asyncio
from typing import Dict, List, Tuple

import httpx

async def get_person_profile(profile_url):
    api_endpoint = f'{PROXYCURL_HOST}/api/v2/linkedin'
    header_dic = {'Authorization': 'Bearer ' + PROXYCURL_API_KEY}
    for _ in range(RETRY_COUNT):
        try:
            async with httpx.AsyncClient() as client:
                r = await client.get(api_endpoint,
                                     params={'url': profile_url},
                                     headers=header_dic,
                                     timeout=PROXYCURL_XHR_DEFAULT_TIMEOUT)
            if r.status_code == 404:
                return None  # profile does not exist
            assert r.status_code == 200
            return r.json()
        except Exception:
            continue
    print(f"{profile_url} retried {RETRY_COUNT} times but still failing")
    return None
# google_search_async() is the same function defined in the previous section
async def find_people_in_roles(coy_name: str, li_coy_profile_url: str = None) -> List[str]:
    MAX_WORKERS = 10
    ROLES = ['ceo',
             'cto',
             'coo',
             'vp engineering',
             ]

    def does_role_match(role: str, person_profile: Dict) -> bool:
        # a role matches if the profile has a *current* job (ends_at is None)
        # at coy_name with a similar title; util.is_string_similar() is the
        # fuzzy string matcher discussed above
        for exp in person_profile['experiences']:
            if not (exp['ends_at'] is None and util.is_string_similar(coy_name, exp['company'])):
                continue
            if not util.is_string_similar(role, exp['title']):
                continue
            return True
        return False

    async def search_li_profile(role: str) -> Tuple[str, List[str]]:
        url_result_lis = await google_search_async(
            f"linkedin.com/in {role} {coy_name}", retry_count=RETRY_COUNT)
        profile_url_lis = list(filter(lambda x: 'linkedin.com/in' in x,
                                      url_result_lis))
        return (role, profile_url_lis)

    print("Performing google search for LinkedIn profiles")
    tasks = [search_li_profile(role) for role in ROLES]
    search_results = await asyncio.gather(*tasks)

    # de-duplicate profile URLs across roles
    profile_url_lis = []
    for _, profile_lis in search_results:
        for profile_url in profile_lis:
            if profile_url not in profile_url_lis:
                profile_url_lis += [profile_url]

    print(f"Total of {len(profile_url_lis)} profiles to query")

    # crawl profiles in batches of roughly MAX_WORKERS concurrent requests
    profile_dic = {}
    working_lis = []
    for idx, profile_url in enumerate(profile_url_lis):
        working_lis += [profile_url]
        if (idx > 0 and len(working_lis) > 0 and idx % MAX_WORKERS == 0) \
                or idx == (len(profile_url_lis) - 1):
            print(f"Working on {len(working_lis)} profiles..")
            tasks = [get_person_profile(url) for url in working_lis]
            profile_data_lis = await asyncio.gather(*tasks)
            for batch_idx, url in enumerate(working_lis):
                profile_dic[url] = profile_data_lis[batch_idx]
            working_lis = []

    print("Done crawling profiles")
    # keep only profiles whose current employment and title check out
    return [url for url, profile in profile_dic.items()
            if profile is not None
            and any(does_role_match(role, profile) for role in ROLES)]
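A usage sketch, putting it together for one of the enriched companies:

decision_maker_urls = asyncio.run(find_people_in_roles('Cognism'))
print(decision_maker_urls)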
And now, I have a list of LinkedIn profiles of decision-makers.
Get email addresses of decision-makers
There are two parts to this problem. The first part is getting an email address. The second part is verifying that email address.
To get email addresses, I use Clearbit, because they have an API that resolves a domain name and a person's name into an email address.
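Here is a rough sketch of what that lookup can look like with httpx. I am writing the endpoint, parameters, and response shape from memory, so treat them as assumptions and check Clearbit's documentation for the actual contract:

from typing import Optional

import httpx

def find_email(domain: str, full_name: str) -> Optional[str]:
    # assumed endpoint and parameters for Clearbit's Prospector API;
    # CLEARBIT_API_KEY is a hypothetical config constant
    r = httpx.get('https://prospector.clearbit.com/v1/people/search',
                  params={'domain': domain, 'name': full_name},
                  auth=(CLEARBIT_API_KEY, ''))
    if r.status_code != 200:
        return None
    results = r.json()
    return results[0].get('email') if results else None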
And while Clearbit is good at fetching email addresses with high accuracy, their data is quite stale. In my experience, up to 30-40% of the email addresses fetched from Clearbit bounce, and that is terrible for my cold-outreach domain's reputation. Cold emails don't work if they end up in Spam. Mine don't, because I make sure my emails are personal and that they do not bounce.
To verify email addresses that I retrieve from Clearbit, this is how I do it:
import neverbounce_sdk

def verify_email(email):
    client = neverbounce_sdk.client(api_key=NEVERBOUNCE_API_KEY,
                                    timeout=NEVERBOUNCE_TIMEOUT)
    resp = client.single_check(email)
    # an address is usable only if the check succeeded and NeverBounce
    # rates the address as deliverable ('valid')
    return resp['result'] == 'valid' and resp['status'] == 'success'
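With that, filtering a candidate list down to deliverable addresses is a one-liner (candidate_emails here is a hypothetical list of addresses fetched from Clearbit):

valid_emails = [email for email in candidate_emails if verify_email(email)]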
With this, I have a list of email addresses for decision-makers at my target companies.
Send sequenced cold emails
I have tried sending templated one-liner emails, and while they did work, I wanted better hit rates from my emails. Everything I have set out to do so far can be automated. But the one thing I cannot automate is understanding my target company's business, figuring out how Proxycurl's API can fit into their product, and writing it down in clear, simple problem-solution statements across 25 personalized email templates.
So while my emails are templated, 80% of the content in each email consists of variables customized by my input. And this is how I do it.
I created a simple web app that iterates through the list of prospects so I can personalize an email for each specific company. For 200 decision-makers, this took me a few hours, going through about a hundred companies. Personalizing emails was the worst part of the process.
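I have not published the web app, but its core is plain template substitution. Here is a minimal sketch with Python's built-in string.Template; the template text and variable names are made up for illustration:

from string import Template

# each of the 25 follow-ups is a template like this one
template = Template(
    "Hi $first_name,\n\n"
    "$problem_statement\n\n"
    "$solution_statement\n"
)

email_body = template.substitute(
    first_name='Jane',
    problem_statement='Keeping your sales database fresh is expensive.',
    solution_statement='Proxycurl can enrich stale records on demand.',
)
print(email_body)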
Did it work?
Unfortunately, I got better results with one-liner cold-email outreaches to general email addresses. Here is my takeaway:
As smart as I thought I was, I think I overreached. I was overly verbose with the problem-solution statement; the first paragraph of each email ran 3-4 lines long. Secondly, I was wrong to assume that CEOs or COOs care about Proxycurl. Proxycurl is, after all, a solution to be vetted by a developer or product team, not the CEO.
I am not giving up. Next up: I am crawling developers at product-oriented companies and reaching out to them directly with short one-liners. I hope it works better, and I will keep you updated in a follow-up post.