Problem

I got on a call with the CEO of a company that helps businesses recruit tech talent. Let's call this company TechTalentRecruitment. The ideal customer for TechTalentRecruitment is a small or medium-sized company that is hiring software engineers.

Getting a list of small and medium companies

The prospect's ideal customer meets three requirements:

  1. Is a small or medium-sized business. Venture capital funding signals can identify such companies.
  2. Is looking to hire software engineers.
  3. Is located in the US, Germany, or Canada.

We will leave item 2 for the next step. In the meantime, I used Crunchbase Pro to export a list of companies that match items 1 and 3.

Use Proxycurl's LinkedIn API to filter for companies that are hiring software engineers

To use the Job Listings endpoint in Proxycurl's LinkedIn API, I need the company's search_id. You can fetch a company's search_id using Proxycurl's LinkedIn Company Profile API endpoint.

For each company in the exported CSV file, I will fetch its corresponding search_id like this:

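import requests


def get_search_id(coy_url):
    """Look up a company's search_id from its LinkedIn profile URL."""
    api_key = 'API_KEY'  # replace with your own Proxycurl API key
    host = 'https://nubela.co/proxycurl'
    api_endpoint = f'{host}/api/linkedin/company'
    header_dic = {'Authorization': 'Bearer ' + api_key}
    response = requests.get(api_endpoint,
                            params={'url': coy_url},
                            headers=header_dic)
    response.raise_for_status()  # fail loudly on HTTP errors
    dic = response.json()
    return dic['search_id']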

With the search_id, I will query for jobs posted by each company like this:

def get_all_jobs(search_id):
    jobs = []
    page_no = 0
    while True:
        data = get_job_listing(search_id, page_no)
        if len(data['job']) == 0:
            break

        jobs += data['job']

        # Stop when there is no further page; otherwise the loop would
        # keep requesting the same page forever.
        if data['next_page_no'] is None:
            break
        page_no = data['next_page_no']

    return jobs

  
def get_job_listing(search_id, page_no=0):
    api_key = 'API_KEY'
    host = 'https://nubela.co/proxycurl'
    api_endpoint = f'{host}/api/linkedin/company/job'
    header_dic = {'Authorization': 'Bearer ' + api_key}
    response = requests.get(api_endpoint,
                            params={'search_id': search_id,
                                    'page': page_no,
                                    },
                            headers=header_dic)
    return response.json()
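
The pagination logic can be checked offline before spending any API credits. The sketch below is only an illustration: collect_all_pages(), fake_get_job_listing(), and the FAKE_PAGES data are hypothetical names and made-up values, and the response shape (a 'job' list plus 'next_page_no') is assumed from the snippets above.

```python
def collect_all_pages(fetch_page, search_id):
    """Follow next_page_no until the response reports no further page."""
    jobs = []
    page_no = 0
    while page_no is not None:
        data = fetch_page(search_id, page_no)
        jobs += data.get('job', [])
        # None ends the loop; any other value is the next page to request.
        page_no = data.get('next_page_no')
    return jobs


# Made-up two-page response, keyed by page number.
FAKE_PAGES = {
    0: {'job': [{'job_title': 'Software Engineer'}], 'next_page_no': 1},
    1: {'job': [{'job_title': 'Account Executive'}], 'next_page_no': None},
}


def fake_get_job_listing(search_id, page_no=0):
    """Stand-in for the real endpoint call; no network access."""
    return FAKE_PAGES[page_no]


jobs = collect_all_pages(fake_get_job_listing, 'hypothetical-search-id')
print([job['job_title'] for job in jobs])
```

Passing the fetch function in as an argument keeps the loop testable; in real use you would pass get_job_listing instead of the stub.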

The job title can be found in the response returned by the Job Listings endpoint. To verify that a job posted by a company is a software engineering role, I call the does_job_qualify() function on it. (This function relies on a function I built that compares similar strings; you can find the code for is_string_similar() here.)

def does_job_qualify(job):
    keywords = ['software engineer', 'backend engineer',
                'frontend engineer', 'software developer', 'software architect']
    for kw in keywords:
        if is_string_similar(kw, job['job_title']):
            return True
    return False
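
For readers who want something to run right away, here is a minimal sketch of what a helper like is_string_similar() could look like. It is not the original implementation: the use of difflib, the sliding word window, and the 0.8 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def is_string_similar(keyword, text, threshold=0.8):
    """Return True if `keyword` appears in `text`, exactly or approximately.

    Hypothetical stand-in: the threshold and the sliding-window approach
    are assumptions, not the original helper.
    """
    keyword_words = keyword.lower().split()
    text_words = text.lower().split()
    if not keyword_words or not text_words:
        return False

    # Compare the keyword against every window of the same word count, so
    # "Senior Software Engineer (Remote)" can still match "software engineer".
    window = min(len(keyword_words), len(text_words))
    target = " ".join(keyword_words)
    for start in range(len(text_words) - window + 1):
        candidate = " ".join(text_words[start:start + window])
        if SequenceMatcher(None, target, candidate).ratio() >= threshold:
            return True
    return False
```

A lower threshold catches more misspelled titles but also lets through more near-misses, so it is worth spot-checking the matches on a sample of real job titles before trusting the output.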

Putting it all together, this is how I take a list of companies exported from Crunchbase and identify the ones that are hiring software engineers:

def run():
    company_lis = get_company_lis()
    for idx, company_url in enumerate(company_lis, start=1):
        print(f"{idx}/{len(company_lis)}: Searching {company_url}")
        search_id = get_search_id(company_url)
        jobs = get_all_jobs(search_id)
        print(f"Found {len(jobs)} jobs")
        # A single matching job is enough to qualify the company.
        if any(does_job_qualify(job) for job in jobs):
            print("SUCCESS! Adding to CSV.")
            append_valid_company(company_url)
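
The run() function calls get_company_lis() and append_valid_company(), which are not shown above. A minimal sketch of the two helpers follows, assuming the Crunchbase export is a CSV with a column holding each company's LinkedIn URL. The file names and the 'LinkedIn' column name are assumptions, so adjust them to match your own export.

```python
import csv

# Assumed file and column names; change these to match your own export.
INPUT_CSV = 'crunchbase_export.csv'
OUTPUT_CSV = 'companies_hiring_engineers.csv'
LINKEDIN_COLUMN = 'LinkedIn'


def get_company_lis(path=INPUT_CSV, column=LINKEDIN_COLUMN):
    """Return the non-empty LinkedIn company URLs found in the export."""
    urls = []
    with open(path, newline='', encoding='utf-8') as f:
        for row in csv.DictReader(f):
            # Rows with a missing or blank URL are skipped.
            url = (row.get(column) or '').strip()
            if url:
                urls.append(url)
    return urls


def append_valid_company(company_url, path=OUTPUT_CSV):
    """Append one qualifying company URL as a new row in the output CSV."""
    with open(path, 'a', newline='', encoding='utf-8') as f:
        csv.writer(f).writerow([company_url])
```

Appending one row at a time means a crash partway through a long run does not lose the companies already found.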

Results

Out of 1000 Series A/B/C companies based in the US that I had exported from Crunchbase Pro, I found 286 companies that were actively hiring software engineers. Not bad: a 28.6% hit rate of prospect companies.


PS: Looking to wield LinkedIn datasets for your company's growth efforts? Please send me an email at [email protected]