The Ultimate Guide to LinkedIn Search: Jobs, People, Profiles, Candidates, & More

We heard you want to do a LinkedIn search. Awesome, that's what we're here for!

In this article, we'll walk you through several ways to search LinkedIn, including both user interfaces and APIs. We'll also provide some code samples in case you're interested in checking out a third-party API solution called Proxycurl which provides a developer-friendly interface for searching LinkedIn.

UI tools

First, let's talk about some UI tools that can help you search LinkedIn. These range from free to paid, and the quality of the results you get will vary with price.

Use Google to search LinkedIn

You can use Google to search LinkedIn with the site keyword. For example, we could search for the company Apple by typing site:linkedin.com apple into Google. If you have any terms that you expect to appear on the LinkedIn page of the target you're seeking, you can add them, freeform, to your Google search.

This approach is convenient if you're seeking a single page and already know its name, but it generally breaks down beyond that. Try searching site:linkedin.com/company cupertino: you'll get the City of Cupertino and Cupertino Union School District, neither of which has anything to do with Apple. There's no way to provide structured data here, and structured data is what we need.
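If you find yourself running many of these site-restricted searches, a small helper can build the URLs for you. This is a minimal sketch: the function name is hypothetical, and it assumes Google's usual https://www.google.com/search?q=... URL format. It only constructs the link; it doesn't scrape results.

```python
from urllib.parse import urlencode

def google_site_search_url(terms, site="linkedin.com"):
    """Return a Google search URL restricted to `site` via the site: operator."""
    query = f"site:{site} {terms}".strip()
    return "https://www.google.com/search?" + urlencode({"q": query})

# The two searches mentioned above
print(google_site_search_url("apple"))
print(google_site_search_url("cupertino", site="linkedin.com/company"))
```

Open the printed URLs in a browser to see the same freeform results described above.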

Limited LinkedIn searching capability on Google for company profiles and structured data

LinkedIn's built-in search bar is also free to use. For example, if you want to find the company Apple, you type "Apple," and the company will most likely be the top result. Click it, and the next thing you know, you're on Apple's LinkedIn profile.

Say, though, that you wanted to look for a technology company located in the United States. Then you could instead use the filters along the top of the page. They look like this:

LinkedIn search filters, none selected

Here's how they look once you've selected some of them:

Search within LinkedIn with filters for people and company profiles
Several LinkedIn search filters selected

But what if this isn't enough granularity?

LinkedIn Sales Navigator

The mecca of Boolean search. It's a paid product, but if you only need it for one month, you can cash in on your once-in-a-lifetime free trial, and it's a pretty excellent tool at that price.

You get access to nearly unlimited ("nearly" is necessary, it's not truly unlimited) Boolean logic on a large number of fields, and you can also save the lists you've made for future access. For more information on the LinkedIn Sales Navigator, do check out our LinkedIn Sales Navigator deep dive.

Advanced searching capabilities in LinkedIn Sales Navigator including Boolean

UpLead

UpLead is the only unofficial UI option in our list, and it's here because it's just that good (but keep in mind how much you'll be spending on it). Its UI contains a lot of the fields you're used to in Sales Navigator, like "technologies used," but UpLead enriches with data from sources other than LinkedIn.

The catch? UpLead is expensive; in addition to a monthly fee, you have to pay a credit cost per individual profile that you unlock, and each profile costs as much as $0.60 on their low-commitment plan or $0.30 on their high-commitment plan. Go there for the GUI, but don't stay for volume, and certainly don't stay for their API.

Solutions for developers

Now, we'll talk about a couple of tools aimed at developers. We'll first compare and contrast two competing search products, Proxycurl and People Data Labs (PDL), and we'll show you why we think Proxycurl is the better option. And just for fun, we'll show off a couple of other features that Proxycurl has.

Let's get right into it!

Proxycurl vs PDL

How much does each one cost?

Before we look at the code, let's discuss the costs. PDL gives you a few free credits, but after that, this request normally costs you $0.23 - $0.16 per result. Assuming you're making queries at scale (think 200+ results per search), Proxycurl's flat fee per query ($0.70 - $0.32, depending on your plan) will be fully amortized across your result set and effectively disappear. At that point, you're paying somewhere between $0.20 and $0.09 per result, depending on volume, and even less than that if you're on an enterprise plan.

To summarize:

  • PDL has no fee per query, but each result costs more.
  • Proxycurl has a fee per query, but each result costs less.
  • When you query a lot of results, the fee per query is amortized into the Price Per Result, and so overall Proxycurl is charging you a lot less.

Even more TL;DR: Proxycurl costs less than PDL. Always do the full math for yourself when you see sites with different credit arithmetic.
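To make "do the full math" concrete, here's a minimal sketch of the arithmetic. It assumes the $0.20 figure above is Proxycurl's marginal price per result on the plan with the $0.70 query fee, and it compares that against PDL's $0.23 tier; that pairing is our assumption for illustration, so plug in the numbers from your own plans. Prices are in whole cents to avoid floating-point noise, and both function names are hypothetical.

```python
def total_cost_cents(n_results, per_result_cents, per_query_fee_cents=0):
    """Cost in cents of one search that returns n_results."""
    return per_query_fee_cents + n_results * per_result_cents

def break_even_results(fee_cents, fee_vendor_per_result, no_fee_vendor_per_result):
    """Smallest result count at which the vendor charging a per-query fee is strictly cheaper."""
    saving_per_result = no_fee_vendor_per_result - fee_vendor_per_result
    if saving_per_result <= 0:
        return None  # the per-query fee is never recovered
    return fee_cents // saving_per_result + 1

# Assumed pairing of tiers: PDL at 23c/result vs. Proxycurl at 70c/query + 20c/result
n = 200
pdl = total_cost_cents(n, 23)
proxycurl = total_cost_cents(n, 20, per_query_fee_cents=70)

print(f"PDL: {pdl / n:.2f}c per result")
print(f"Proxycurl: {proxycurl / n:.2f}c per result")
print("Break-even at", break_even_results(70, 20, 23), "results")
```

Under these assumptions, the per-query fee adds well under one cent per result at 200 results, which is what "amortized" means in the summary above.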

The case study: Searching for SWE alumni of Discord

In this case study, your company needs to fill developer positions with candidates who are alumni of some specific company or set of companies (in this case, we're using Discord as our example company).

We'll omit the results in both cases, since we want to avoid committing people's PII to a blog post. The Proxycurl example prints JSON, which you could easily keep processing by dropping the json.dumps call and working with response.json() directly. The PDL example generates a CSV.

Proxycurl option

First, we'll look at the Proxycurl solution, where about half the file is imports and defining constants:

import json, os, requests
api_key = os.environ['PROXYCURL_API_KEY']
headers = {'Authorization': 'Bearer ' + api_key}

api_endpoint = 'https://nubela.co/proxycurl/api/search/person'
params = {
    'current_company_linkedin_profile_url': 'https://www.linkedin.com/company/discord/',
    'current_role_title': r'(?i)(software engineer|\bswe\b)',  # raw string: a plain '\b' would be a backspace
}
response = requests.get(api_endpoint, params=params, headers=headers)
print(json.dumps(response.json()))  # json.dumps is only for formatting

Note the (?i) flag to make our regular expression case-insensitive.
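If you want to sanity-check a title pattern before spending credits on it, you can try it locally first. The sketch below uses Python's re module as a stand-in; we're assuming the server-side regex engine treats (?i) and \b the same way, which holds for most common engines but isn't something this check can prove. The helper name is hypothetical.

```python
import re

# The pattern from the request above, written as a raw string. The r prefix
# matters: in a regular Python string, '\b' is a backspace character, not a
# word-boundary assertion.
TITLE_PATTERN = r'(?i)(software engineer|\bswe\b)'

def title_matches(title):
    """Return True if the pattern is found anywhere in the title."""
    return re.search(TITLE_PATTERN, title) is not None

for title in ['Senior Software Engineer', 'SWE II', 'Sweeper', 'Product Manager']:
    print(f'{title!r}: {title_matches(title)}')
```

The word boundaries around swe are what keep a title like "Sweeper" out of the results, while the first alternative still picks up "Senior Software Engineer" because it matches anywhere in the title.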

PDL option

Now we'll take a look at the People Data Labs solution. Here's how you do it using their Elasticsearch-based approach. The advantage of Elasticsearch is more control over each query. The disadvantage is, well, queries that look like this (and note that we're still missing out on job titles like "staff software engineer" etc.).

# This code was generated by People Data Labs's Query Builder
import os, requests, json, time, csv

API_KEY = os.environ['PDL_API_KEY']

MAX_NUM_RECORDS = 1 # a small number here since we don't have unlimited API access there

# NO CHANGES NEEDED BELOW HERE
PDL_URL = "https://api.peopledatalabs.com/v5/person/search"
request_header = {
    "Content-Type": "application/json",
    "X-API-key": API_KEY
}

ES_QUERY = {
    "query": {
        "bool": {
            "must": [
                {
                    "term": {
                        "job_company_linkedin_url": "linkedin.com/company/discord"
                    }
                },
                {
                    "terms": {
                        "job_title": [
                            "software engineer",
                            "senior software engineer",
                            "swe"
                        ]
                    }
                }
            ]
        }
    }
}

num_records_to_request = 100
params = {
    "dataset": "all",
    "query": json.dumps(ES_QUERY),
    "size": num_records_to_request,
    "pretty": True
}

# Pull all results in multiple batches
batch = 1
all_records = []
start_time = time.time()
while batch == 1 or params["scroll_token"]:
    if MAX_NUM_RECORDS != -1:
        # Update num_records_to_request
        # Compute the number of records left to pull
        num_records_to_request = MAX_NUM_RECORDS - len(all_records)
        # Clamp this number between 0 and 100
        num_records_to_request = max(0, min(num_records_to_request, 100))

    if num_records_to_request == 0:
        break

    params["size"] = num_records_to_request
    response = requests.get(PDL_URL, headers=request_header, params=params).json()
    
# snip a bunch more result-processing code

I'd rather stick to my params dict, but you're welcome to give it a try.
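If you do go the Elasticsearch route, one way to tame the verbosity is to generate the query from a list of titles rather than editing the nested dict by hand. This is only a sketch: build_pdl_query is a hypothetical helper, not part of PDL's tooling, and it reproduces the same bool/must structure and field names shown in the generated code above.

```python
import json

def build_pdl_query(company_linkedin_url, job_titles):
    """Hypothetical helper: build the bool/must query used in the PDL example."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"term": {"job_company_linkedin_url": company_linkedin_url}},
                    {"terms": {"job_title": list(job_titles)}},
                ]
            }
        }
    }

# Exact-match titles have to be enumerated one by one
titles = [
    "software engineer",
    "senior software engineer",
    "staff software engineer",
    "swe",
]
query = build_pdl_query("linkedin.com/company/discord", titles)
print(json.dumps(query, indent=2))
```

Every title variant still has to be listed explicitly, which is the trade-off against the single regular expression in the Proxycurl version.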

Some more Proxycurl examples

In the final section, we'll look at a couple of interesting examples with Search using Proxycurl. If you want more, you can find a whole bunch more in our Proxycurl Search blog announcement.

Search for Stripe SWEs

This was the toy example I used in our LinkedIn Sales Navigator article, but in that article, I never provided code. Let's fix this immediately!

Specifically, we'll search for:

  • Stripe SWEs
  • With 6 years of experience at Stripe
  • Who aren't in the Bay Area (here, we'll use San Francisco)

And here's the code:

import json, os, requests
api_key = os.environ['PROXYCURL_API_KEY']
headers = {'Authorization': 'Bearer ' + api_key}

api_endpoint = 'https://nubela.co/proxycurl/api/search/person'
params = {
    'city': '^(?!.*San Francisco)',
    'current_company_linkedin_profile_url': 'https://www.linkedin.com/company/stripe/',
    'current_role_before': '2018-01-01',
}
response = requests.get(api_endpoint, params=params, headers=headers)
print(json.dumps(response.json()))  # json.dumps is only for formatting
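The current_role_before value is how the "6 years" requirement is expressed: it's a cutoff date, so it needs updating as time passes. Here's a small sketch for computing it. We're assuming the 2018-01-01 above was obtained by counting back six years from the start of 2024 (the reference date isn't stated), and the helper name is hypothetical.

```python
from datetime import date

def role_start_cutoff(min_years, today):
    """Return the ISO date that lies `min_years` before `today`."""
    try:
        cutoff = today.replace(year=today.year - min_years)
    except ValueError:
        # today is Feb 29 and the target year is not a leap year
        cutoff = today.replace(year=today.year - min_years, day=28)
    return cutoff.isoformat()

print(role_start_cutoff(6, date(2024, 1, 1)))
# For a live query, pass date.today() instead of a fixed date
```

You could then set params['current_role_before'] to the returned string instead of hard-coding the date.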

Don't recognize the regex syntax in the city field? That's called "negative lookahead," and it means that the city field cannot contain "San Francisco." Even if you don't recognize lookaround, you might be familiar with \b, another so-called "zero-width assertion." Zero-width assertions state (or "assert") a condition that must be true of the input at the current position without actually consuming any characters (so their "width" is "zero"). They're super useful!
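Here's a quick way to watch the negative lookahead at work locally. As before, this uses Python's re module as a stand-in for the server-side engine, which is an assumption on our part, and the helper name is hypothetical.

```python
import re

CITY_PATTERN = r'^(?!.*San Francisco)'

def city_allowed(city):
    """Return True if the city does NOT contain 'San Francisco'."""
    return re.search(CITY_PATTERN, city) is not None

for city in ['New York', 'San Francisco', 'South San Francisco', 'Seattle']:
    print(f'{city!r}: {city_allowed(city)}')

# The match is zero-width: it starts and ends at position 0
print(re.search(CITY_PATTERN, 'Seattle').span())
```

Note that the successful match consumes no characters at all: the pattern only asserts something about what follows the start of the string.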

Anyway, since this query returns people, we won't print the results here; you know the drill. But feel free to run it on your own - or modify it to your heart's content.

Find alumni of your school

Here's another one you can personalize and do at home. Find alumni of your school with your same major working at top companies - maybe this can help you get an introduction somewhere.

Here's the base code:

import json, os, requests
api_key = os.environ['PROXYCURL_API_KEY']
headers = {'Authorization': 'Bearer ' + api_key}

api_endpoint = 'https://nubela.co/proxycurl/api/search/person'
params = {
    'education_school_name': 'Caltech',
    'education_field_of_study': 'Mathematics',
}
response = requests.get(api_endpoint, params=params, headers=headers)
print(json.dumps(response.json()))  # json.dumps is only for formatting

Parameters you can add to this include:

  • current_company_name or past_company_name with a list of companies you're interested in separated by | and with ^()$ wrapping the expression. For example, ^(Amazon|Apple)$.
  • Alternatively, if you're only interested in one company, use the current_company_linkedin_profile_url or past_company_linkedin_profile_url field.
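If your list of companies is long, you can build the ^(...)$ expression from a Python list instead of typing it out. This is a minimal sketch with a hypothetical helper name. It uses re.escape so that names containing characters like . or + are treated literally; we're assuming the server-side engine accepts the same escapes as Python, so check the result for names with unusual punctuation.

```python
import re

def exact_company_pattern(names):
    """Build ^(A|B|...)$ so that only exact company names match."""
    return '^(' + '|'.join(re.escape(name) for name in names) + ')$'

pattern = exact_company_pattern(['Amazon', 'Apple'])
print(pattern)

# Local check: the anchors prevent partial matches
print(bool(re.search(pattern, 'Apple')))
print(bool(re.search(pattern, 'Apple Music')))
```

To use it, add the pattern to the base request, for example params['current_company_name'] = pattern.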

Once again, since this query returns a list of names, we won't show the example output.

Conclusion

Last time, I said there was no silver bullet if you wanted a company API. And the same is true if you want to do something as general as "search LinkedIn." Boolean search? LinkedIn Sales Navigator - or maybe UpLead - or perhaps you know of a better tool. But if you want to search LinkedIn as a developer, you want to go with Proxycurl. Ever since our Search API launched, we've been hard at work making it a best-in-class experience for developers to automate their way through thousands of results. So why wait? Sign up for an account and claim your free credits now!

Megan Cutrofello