Due Diligence Simplified: Automate the Enrichment of People and Companies

Colton Randolph

There are several reasons why you might need to enrich a data profile on a person or company, and they vary by industry.

For example:

  • A sales team trying to better understand a prospect before reaching out and trying to close a deal
  • HR doing research on a candidate's job history before presenting an offer
  • A startup doing their due diligence on a list of VCs they’re interested in reaching out to for investment opportunities
  • The inverse, a VC performing due diligence on a list of startups they're interested in

And beyond...

Regardless of industry, you want to be able to know who you’re talking to. That’s where due diligence comes in.

Plus, even if you already have a solid dataset, it doesn’t hurt to have another data source you can use to verify and validate your existing one.

Which brings up a valid question:

What’s the best data provider to enrich people and companies with?

I’m glad you asked.

I might be slightly biased here (I’ll prove it later, though), but I’d like to introduce you to Proxycurl.

We’re a B2B data provider that you can plug into your existing due diligence process to enrich and/or validate data profiles on people and companies.

Proxycurl is specifically designed to be developer-friendly and to integrate with your existing systems.

Most of our customers use our API, but we also sell our entire dataset (it’s called LinkDB).

A big benefit of using our API to enrich data profiles, though, is that we offer a 29-day freshness guarantee, and in many cases, the data is scraped live.

So, that means you’re getting an extremely fresh source of data to enrich your data profiles with.

That said, by the end of this article, my goal is for you to walk away knowing a much simpler way to enrich just about any data profile without needing any additional information.

But first, I need to explain what enriching a data profile really means…

What does enriching a data profile mean?

Enriching a data profile refers to the process of adding more information or details to an existing set of data about an individual or company.

The aim is to get a clearer, more comprehensive picture of the person or organization in question.

This could involve adding data such as employment history, education, professional connections, social media activity, and more.

In essence, it's about making the profile “richer” in terms of information.
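
To make that concrete, here’s a purely illustrative sketch in Python – the field names below are invented for this example rather than taken from Proxycurl’s actual response schema:

# A sparse record you might start with...
basic_profile = {
    'name': 'Jane Doe',
    'email': 'jane@example.com',
}

# ...and the same record after enrichment (hypothetical fields, for illustration only)
enriched_profile = {
    **basic_profile,
    'current_employer': 'Acme Corp',
    'past_roles': ['Data Analyst at Initech', 'Marketing Intern at Globex'],
    'education': ['B.Sc. Computer Science'],
    'linkedin_url': 'https://www.linkedin.com/in/janedoe/',
    'skills': ['SQL', 'Python', 'Copywriting'],
}

print(enriched_profile)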

Now, let's tackle the purpose of due diligence.

What’s the purpose of due diligence?

Due diligence, in its simplest form, is a thorough investigation or audit of a potential investment, business partnership, or any other situation where understanding the full picture of a person or company is crucial.

It's essentially the research phase that ensures all parties involved have a clear understanding of what they're getting into.

This can range from understanding the financial health of a company you're considering investing in, to vetting a potential employee's background and qualifications.

More or less, due diligence is simply about mitigating risks and making informed decisions.

By conducting due diligence, you can avoid potential pitfalls, protect your reputation, ensure you’re complying with all relevant regulations and standards, and beyond.

For many industries, KYC (Know Your Customer) is a mandated form of due diligence, ensuring that businesses understand the true identity of their clients and can screen for potential risks.

But beyond the regulatory requirements, due diligence also provides businesses with valuable insights that can inform their strategies, tailor their services, and ultimately build stronger, more trusting relationships with their clients or partners.

In this article, we’ll primarily focus on that aspect of due diligence.

How can Proxycurl simplify my due diligence process?

Proxycurl offers a non-intrusive platform to pull extensive data about people and companies.

Through Proxycurl's API, you can effortlessly access details such as work experience history, personal email addresses, phone numbers, and beyond without the need for additional KYC methods (take one data point, such as an email, and expand it into many).

This ensures that there's a perfect equilibrium between the depth of data and not annoying your customers, clients, and prospects.

Also, if you’re curious about the legality of Proxycurl, all of our data is scraped from publicly available sources and then packaged together into an easily usable medium (our API).

Meaning, Proxycurl is 100% legal. We also comply with CCPA and GDPR.

Now that I’ve given you a bit of background, let’s break down the actual due diligence use cases of Proxycurl and how you can implement them.

First up:

HR and employment due diligence

HR teams, recruitment agencies, and employers in general have to perform due diligence practically every time they hire someone.

You can’t employ someone you don’t know anything about. It’s important to validate and verify; you can save yourself months of hassle that way.

The good news is, using Proxycurl, there are two different ways you could easily look up a work history.

Verifying work history and skills with a social media profile URL

The first way to verify employment history is by using our People API, specifically our Person Profile Endpoint.

Our Person Profile Endpoint only requires a social media URL, so you could look up a work history and more with either a Twitter (X), Facebook, or LinkedIn profile URL.

Here’s a quick Python example:


import requests
import json

api_key = 'Your_API_Key_Here'
headers = {'Authorization': 'Bearer ' + api_key}
api_endpoint = 'https://nubela.co/proxycurl/api/v2/linkedin'

params = {
    'linkedin_profile_url': 'https://linkedin.com/in/johnrmarty/',
    'extra': 'include',
    'github_profile_id': 'include',
    'facebook_profile_id': 'include',
    'twitter_profile_id': 'include',
    'personal_contact_number': 'include',
    'personal_email': 'include',
    'inferred_salary': 'include',
    'skills': 'include',
    'use_cache': 'if-recent',
}

response = requests.get(api_endpoint,
                        params=params,
                        headers=headers)

# Pretty-print the JSON response
formatted_data = json.dumps(response.json(), indent=4)
print(formatted_data)

Because of the additional parameters used here, the response includes the following (an abridged example is sketched after the list):

  • Full name
  • Occupation
  • Summary of the individual
  • The country they’re from
  • Their current employer
  • Past employers (with dates and job titles)
  • Skills

Not bad, right?

Now, let’s do it without a social media URL.

Verify work history with just a first and last name

By using our Search API, specifically our Person Search Endpoint, you can look up and enrich an identity based on a couple of varying identifiers, such as:

  • Name
  • Current company
  • Current job role
  • Education
  • And more

You can use one or any of those search parameters from the full list available.

Let’s say we want to hire someone named “Bob Dylan” and his role is going to be a content writer.

Using some simple Python, we could look up that identity and enrich it:

import requests
import json

api_key = 'Your_API_Key_Here'
headers = {'Authorization': 'Bearer ' + api_key}
api_endpoint = 'https://nubela.co/proxycurl/api/search/person/'

params = {
    'country': 'US',
    'enrich_profiles': 'enrich',
    'page_size': '10',
    'past_role_title': '(?i)content writer',
    'current_role_title': '(?i)content writer',
    'last_name': '(?i)Dylan',
    'first_name': '(?i)Bob'
}

response = requests.get(api_endpoint,
                        params=params,
                        headers=headers)

formatted_data = json.dumps(response.json(), indent=4)
print(formatted_data)

After running that script, we actually got one result back:

Bob Dylan LinkedIn profile

It’ll also return other available data points like:

  • City, state
  • Full work history
  • Skills
  • And beyond

With our Search API, all you need is one or two available identifiers, and then you can enrich practically an entire identity.
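
If you’d rather pull those fields out programmatically instead of eyeballing the JSON, here’s a minimal sketch – it assumes the response object from the search script above, with enrich_profiles set to 'enrich' so each result carries a profile object:

# Minimal sketch: summarize each enriched result from the Person Search response
data = response.json()
for result in data.get('results', []):
    profile = result.get('profile') or {}
    print(result.get('linkedin_profile_url'))
    print('Location:', profile.get('city'), profile.get('state'))
    print('Skills:', ', '.join(profile.get('skills') or []))
    for exp in profile.get('experiences') or []:
        print(' -', exp.get('title'), 'at', exp.get('company'))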

Now, let me give you another example outside of verifying employment history.

Due diligence on people

Our API works with both people and companies, but let’s start with the use cases focused on people rather than companies.

Validating any given email address for free

Using our Disposable Email Address Check Endpoint, you can check if any email address belongs to a free or disposable email service.

In other words, you can validate if there are any worthless email addresses on any of your prospecting lists before you send – and this endpoint is entirely free.
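
To get a feel for the response before wiring it into a whole workflow, a single check looks something like this (the email address is just a placeholder; is_free_email and is_disposable_email are the fields the batch script below branches on):

import requests

api_key = 'Your_API_Key_Here'
headers = {'Authorization': 'Bearer ' + api_key}
api_endpoint = 'https://nubela.co/proxycurl/api/disposable-email'

# Check a single address; the response flags free and disposable domains
response = requests.get(api_endpoint,
                        params={'email': 'someone@mailinator.com'},
                        headers=headers)
print(response.json())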

Here’s an example of how you could use some Python with it to automatically validate a list of email addresses:

import requests
import csv

api_key = 'YOUR_API_KEY'  # Replace with your actual API key
headers = {'Authorization': 'Bearer ' + api_key}
api_endpoint = 'https://nubela.co/proxycurl/api/disposable-email'

valid_emails = []
free_emails = []
disposable_emails = []

# Open the input CSV file and read the emails
with open('emails_list.csv', 'r') as input_file:
    reader = csv.DictReader(input_file)
    for row in reader:
        email = row['email']
        params = {'email': email}
        response = requests.get(api_endpoint, params=params, headers=headers)
        data = response.json()

        # Handle rate limit error
        if 'error' in data:
            print(f"Error for {email}: {data['error']}")
            continue

        # Categorize the email
        if data.get('is_disposable_email') == True:
            disposable_emails.append(email)
        elif data.get('is_free_email') == True:
            free_emails.append(email)
        else:
            valid_emails.append(email)

# Write the emails to a new CSV file in separate columns
with open('categorized_emails.csv', 'w', newline='') as output_file:
    writer = csv.writer(output_file)
    writer.writerow(['Valid Emails', 'Free Emails', 'Disposable Emails'])  # Write the headers

    # Write emails row by row
    max_rows = max(len(valid_emails), len(free_emails), len(disposable_emails))
    for i in range(max_rows):
        valid_email = valid_emails[i] if i < len(valid_emails) else ''
        free_email = free_emails[i] if i < len(free_emails) else ''
        disposable_email = disposable_emails[i] if i < len(disposable_emails) else ''
        writer.writerow([valid_email, free_email, disposable_email])

print("Export of categorized emails completed!")

That’ll take a list of emails within “emails_list.csv” (in the same folder as your Python script) and automatically validate them for you, placing them in one of three columns:

  • Valid emails
  • Free emails
  • Disposable emails

Again, this is entirely free – there’s no reason not to do this before you ever send out any cold emails.

Now, let me show you an enrichment example.

Enriching leads with only an email

Using our Reverse Email Lookup, you could enrich an identity using nothing but an email.

So, go ahead and create a new folder, then create a new .CSV named “input_emails.csv” containing a single column of email addresses, one per row, like so (of course, replacing these with the emails you’d like to enrich):


[email protected]
[email protected]
[email protected]

Next, create another Python script in that same folder and paste the following code:


import json
import requests
import csv

# Your API key
API_KEY = 'Your_API_Key_Here'

# API endpoint
api_endpoint = 'https://nubela.co/proxycurl/api/linkedin/profile/resolve/email'

# Headers for the API request
headers = {'Authorization': 'Bearer ' + API_KEY}

# Input and output CSV file names
input_file = 'input_emails.csv'
output_file = 'enriched_data.csv'

# Read email addresses from the input CSV file
with open(input_file, 'r') as csvfile:
    reader = csv.reader(csvfile)
    email_addresses = [row[0] for row in reader]

# Open the output CSV file for writing
with open(output_file, 'w', newline='') as csvfile:
    fieldnames = [
        'email', 'full_name', 'profile_picture', 'current_occupation',
        'country', 'city', 'state', 'linkedin_profile', 'twitter_profile',
        'facebook_profile', 'past_work_experiences', 'linkedin_posts',
        'personal_emails', 'personal_phone_numbers', 'skills_interests'
    ]
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()

    # Loop through each email address
    for email_address in email_addresses:
        params = {
            'lookup_depth': 'deep',
            'email': email_address,
            'enrich_profile': 'enrich',
        }
        response = requests.get(api_endpoint, params=params, headers=headers)
        result = response.json()

        # Extract data from the API response
        # (fall back to an empty dict if no profile was found)
        profile = result.get('profile') or {}
        experiences = profile.get('experiences', [])
        past_work_experiences = '; '.join([exp.get('title', '') + ' at ' + exp.get('company', '') for exp in experiences])

        writer.writerow({
            'email': email_address,
            'full_name': profile.get('full_name', ''),
            'profile_picture': profile.get('profile_pic_url', ''),
            'current_occupation': profile.get('occupation', ''),
            'country': profile.get('country_full_name', ''),
            'city': profile.get('city', ''),
            'state': profile.get('state', ''),
            'linkedin_profile': result.get('linkedin_profile_url', ''),
            'twitter_profile': result.get('twitter_profile_url', ''),
            'facebook_profile': result.get('facebook_profile_url', ''),
            'past_work_experiences': past_work_experiences,
            'linkedin_posts': '',  # This data is not provided in the shared response
            'personal_emails': '; '.join(profile.get('personal_emails', [])),
            'personal_phone_numbers': '; '.join(profile.get('personal_numbers', [])),
            'skills_interests': '; '.join(profile.get('skills', []))
        })

print(f"Data exported to {output_file}")

In return, that’ll export the enriched identities to a file in the same folder named “enriched_data.csv”:

A CSV file of enriched identities

Note that any of the other response fields can be added to this script; you’d just include the extra fields you need, depending on the information required.
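
For instance, assuming you also wanted each person’s education history (the education list in the returned profile object), a small helper like this could be slotted into the loop above:

# Sketch: flatten a profile's education entries into one CSV-friendly string.
# Assumes 'education' entries carry 'degree_name' and 'school' fields,
# as in Proxycurl's documented person profile schema.
def format_education(profile):
    return '; '.join(
        f"{edu.get('degree_name') or ''} at {edu.get('school') or ''}".strip()
        for edu in (profile.get('education') or [])
    )

# Then add 'education' to the fieldnames list and
# 'education': format_education(profile) to the writer.writerow() dict.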

We could also enrich leads with a social media profile URL.

Enriching leads with social media profile URLs

If you don’t have an email yet but were able to find their social media URL, you could pull their personal email address as well as other information by using our Person Profile Endpoint.

For example, let's say we have a list of LinkedIn profile URLs but nothing else.

First, we need to create a new file named “input_profiles.csv” in its own folder, formatted as follows:


linkedin_profile_url
https://www.linkedin.com/in/colton-randolph/
https://www.linkedin.com/in/stevengoh/

Then, if you don’t already have pandas (a Python library) installed, you’ll need to install it with pip install pandas to run the following script.

After installing pandas, we need to create another Python script in the same folder as “input_profiles.csv”:


import pandas as pd
import requests

# Define your Proxycurl API key and API endpoint
api_key = 'YOUR_API_KEY'
api_endpoint = 'https://nubela.co/proxycurl/api/v2/linkedin'

# Load the input CSV file with LinkedIn profile URLs
input_csv = 'input_profiles.csv'
output_csv = 'enriched_profiles.csv'

# Create an empty list to store enriched data
enriched_data = []

# Read the CSV file into a DataFrame
try:
    df = pd.read_csv(input_csv)
except FileNotFoundError:
    print(f"Error: Input CSV file '{input_csv}' not found.")
    exit(1)

# Iterate through the LinkedIn profile URLs in the DataFrame
for index, row in df.iterrows():
    linkedin_profile_url = row['linkedin_profile_url']

    # Make API request without the 'extra' parameter
    headers = {'Authorization': 'Bearer ' + api_key}
    params = {
        'linkedin_profile_url': linkedin_profile_url,
        'personal_email': 'include',  # Include personal email
        'personal_contact_number': 'include',  # Include personal contact number
    }
    response = requests.get(api_endpoint, params=params, headers=headers)

    # Check if the API request was successful
    if response.status_code == 200:
        api_data = response.json()

        # Extract the relevant information from the API response
        full_name = api_data.get('full_name', '')
        profile_picture = api_data.get('profile_pic_url', '')
        current_occupation = api_data.get('occupation', '')
        country = api_data.get('country', '')
        city = api_data.get('city', '')
        state = api_data.get('state', '')

        # Extract personal email and contact number if available
        personal_emails = api_data.get('personal_emails', [])
        personal_contact_numbers = api_data.get('personal_numbers', [])

        # Get the first personal email if available
        personal_email = personal_emails[0] if personal_emails else ''
        # Get the first personal contact number if available
        personal_contact_number = personal_contact_numbers[0] if personal_contact_numbers else ''

        # Create a dictionary with the extracted data
        enriched_profile = {
            'full_name': full_name,
            'profile_picture': profile_picture,
            'current_occupation': current_occupation,
            'country': country,
            'city': city,
            'state': state,
            'linkedin_profile': linkedin_profile_url,
            'personal_email': personal_email,
            'personal_contact_number': personal_contact_number,
        }

        # Append the enriched data to the list
        enriched_data.append(enriched_profile)
    else:
        print(f"Failed to fetch data for profile at index {index}: {linkedin_profile_url}")

# Create a new DataFrame from the enriched data
enriched_df = pd.DataFrame(enriched_data)

# Save the enriched data to a new CSV file
enriched_df.to_csv(output_csv, index=False)

print(f"Enriched data saved to {output_csv}")

After running that script, you’ll see it turns LinkedIn profile URLs into an enriched dataset, including email, personal phone number, country, and more.

It all depends on the required fields you want to pass over to the .CSV. You could add more data than demonstrated here. All of the available responses are listed here.

Again, as I mentioned earlier, you can also use Facebook as well as Twitter (X) with this endpoint, but LinkedIn will be the most reliable.
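
For example, if all you had was someone’s Twitter (X) profile, the request might look like this instead – this assumes the endpoint’s twitter_profile_url parameter (swap in facebook_profile_url for Facebook), and the URL below is just a placeholder:

import requests
import json

api_key = 'Your_API_Key_Here'
headers = {'Authorization': 'Bearer ' + api_key}
api_endpoint = 'https://nubela.co/proxycurl/api/v2/linkedin'

# Same Person Profile Endpoint, keyed off a Twitter (X) profile URL instead
params = {
    'twitter_profile_url': 'https://twitter.com/example_handle',
    'personal_email': 'include',
    'personal_contact_number': 'include',
}

response = requests.get(api_endpoint, params=params, headers=headers)
print(json.dumps(response.json(), indent=4))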

This just goes to show that you can enrich an identity with just about any data point – an email, a social media URL, whatever.

Who needs extra KYC?

You just need one tiny identifier, and we can do all of the rest.

Now, let’s move on to companies.

Due diligence on companies

We have much of the same functionality for companies as we do for people.

Enriching a company

Using our Company Profile Endpoint, you can insert any LinkedIn company URL and automatically enrich it.

So, let’s say we want to do a competitive analysis on Coresignal, as they’re one of our (cough, cough, inferior and expensive) competitors.

Here’s a quick Python example of just that:


import requests
import json

api_key = 'Your_API_Key_Here'
headers = {'Authorization': 'Bearer ' + api_key}
api_endpoint = 'https://nubela.co/proxycurl/api/linkedin/company'

params = {
    'url': 'https://www.linkedin.com/company/coresignal/',
    'resolve_numeric_id': 'true',
    'categories': 'include',
    'funding_data': 'include',
    'extra': 'include',
    'exit_data': 'include',
    'acquisitions': 'include',
    'use_cache': 'if-recent',
}

response = requests.get(api_endpoint,
                        params=params,
                        headers=headers)

# Check if the request was successful
if response.status_code == 200:
    # Parse and print the content of the response in a formatted manner
    data = response.json()
    print(json.dumps(data, indent=4))
else:
    print(f"Error: {response.status_code}")

After running the script, we received the following response:


{

    "linkedin_internal_id": "43181787",

    "description": "Coresignal was founded with a goal to make large amounts of fresh public web data accessible to any company worldwide.\n\nWe offer parsed, ready-to-use data from 20 public web data sources for investment, HR tech, and sales tech companies, placing a special emphasis on data freshness.\n\n740M employee data records, 104M company data records, and millions of other records split into 8 data categories enable companies to build data-driven products and extract actionable insights.\n\nYou can choose a data solution that suits your needs, pick between multiple delivery frequency options and get data in different formats: JSON, CSV, or HTML. We also offer six convenient APIs for easy search and retrieval of fresh data records.\n\nThe Coresignal team includes some of the industry\u2019s most experienced web data extraction professionals coming from big data, lead generation, and e-commerce backgrounds.\n\nOur combined experience and a culture of knowledge sharing allow us to help businesses utilize our data in the most efficient way. This is one of the main reasons why over 400 data-driven companies have already chosen Coresignal as their public web data provider.\n",

    "website": "http://coresignal.com",

    "industry": "Information Technology & Services",

    "company_size": [

        51,

        200

    ],

    "company_size_on_linkedin": 34,

    "hq": {

        "country": "US",

        "city": "New York",

        "postal_code": "1001",

        "line_1": "630 3rd Ave",

        "is_hq": true,

        "state": "NY"

    },

    "company_type": null,

    "founded_year": 2016,

    "specialities": [

        "Firmographic data",

        "Employee data",

        "Job posting data",

        "Startup data",

        "Technographic data",

        "Company employee review data",

        "Company funding data",

        "Tech product review data",

        "Professional network data",

        "Company API",

        "Employee API",

        "Jobs API",

        "Company scraping API",

        "Employee scraping API",

        "Jobs scraping API"

    ],

    "locations": [

        {

            "country": "US",

            "city": "New York",

            "postal_code": "1001",

            "line_1": "630 3rd Ave",

            "is_hq": true,

            "state": "NY"

        }

    ],

    "name": "Coresignal",

    "tagline": "Freshest public firmographic and talent data for 360\u00b0 competitive intelligence and data-driven products.",

    "universal_name_id": "coresignal",

    "profile_pic_url": "https://s3.us-west-000.backblazeb2.com/proxycurl/company/coresignal/profile?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=0004d7f56a0400b0000000001%2F20231016%2Fus-west-000%2Fs3%2Faws4_request&X-Amz-Date=20231016T184242Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=287ec7802cbb10b3ff438ee5862f7b33789be65414cc6fb3278737a31baf5c98",

    "background_cover_image_url": "https://s3.us-west-000.backblazeb2.com/proxycurl/company/coresignal/cover?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=0004d7f56a0400b0000000001%2F20231016%2Fus-west-000%2Fs3%2Faws4_request&X-Amz-Date=20231016T184242Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=46d9685e7445a9661703f9b5f5cf1a8877d6a8412c3c7cb7f39dc7a1f6888551",

    "search_id": "43181787",

    "similar_companies": [

        {

            "name": "Oxylabs.io",

            "link": "https://www.linkedin.com/company/oxylabs-io/",

            "industry": "Information Technology & Services",

            "location": "Vilnius, Lithuania"

        },

        {

            "name": "Smartproxy",

            "link": "https://www.linkedin.com/company/smartproxy/",

            "industry": "Information Technology & Services",

            "location": "New York"

        },

        {

            "name": "Nord Security",

            "link": "https://www.linkedin.com/company/nordsecurity/",

            "industry": "Computer & Network Security",

            "location": null

        },

        {

            "name": "Kilo Health",

            "link": "https://www.linkedin.com/company/kilo-health/",

            "industry": "Health, Wellness & Fitness",

            "location": "Vilnius, Vilniaus"

        },

        {

            "name": "Attention Insight",

            "link": "https://www.linkedin.com/company/attention-insight/",

            "industry": "Market Research",

            "location": "Hamburg, Hamburg"

        },

        {

            "name": "DueDil (now FullCircl)",

            "link": "https://www.linkedin.com/company/duedil/",

            "industry": "Information Services",

            "location": "London"

        },

        {

            "name": "Surfshark",

            "link": "https://www.linkedin.com/company/surfshark/",

            "industry": "Information Technology & Services",

            "location": "Vilnius"

        },

        {

            "name": "People Data Labs",

            "link": "https://www.linkedin.com/company/peopledatalabs/",

            "industry": "Computer Software",

            "location": "San Francisco, CA"

        },

        {

            "name": "Tesonet",

            "link": "https://www.linkedin.com/company/tesonet/",

            "industry": "Computer Software",

            "location": "Vilnius, Vilniaus"

        },

        {

            "name": "Zyte",

            "link": "https://www.linkedin.com/company/zytedata/",

            "industry": "Information Technology & Services",

            "location": "Ballincollig, Cork"

        },

        {

            "name": "VV Solar Solutions LLC",

            "link": "https://www.linkedin.com/company/vv-solar-solutions/",

            "industry": "Mechanical Or Industrial Engineering",

            "location": null

        },

        {

            "name": "CyberCare",

            "link": "https://www.linkedin.com/company/cybercarecompany/",

            "industry": "Information Technology & Services",

            "location": "Vilnius, Vilniaus"

        }

    ],

    "affiliated_companies": [],

    "updates": [

        {

            "article_link": null,

            "image": null,

            "posted_on": {

                "day": 13,

                "month": 10,

                "year": 2023

            },

            "text": "Just like ships in the open sea, firms need some sort of North Star\u2728 to navigate the competitive business landscape toward success. In business, #benchmarks provide such reference points for finding direction. \ud83e\udded \n\n#BusinessBenchmarking comes in various shapes and can be tailored to suit firms of all sizes. But most importantly, it can help firms improve their current procedures, allocate their assets more effectively, and create #data-based business strategies. \n\nTake a closer look at the different types of #benchmarking and how anyone from big corporations to small businesses benefit from it. \u2b07\ufe0f",

            "total_likes": 6

        },

        {

            "article_link": "https://www.linkedin.com/in/ACoAAAnlb-UBck__0dt_mQm64NYcMOoj5QDN-Ok",

            "image": "https://media.licdn.com/dms/image/D5622AQEwgzPapxTTZg/feedshare-shrink_2048_1536/0/1697036143908?e=1700092800&v=beta&t=8iNUw_bvbXVpPY-Ti_Y49YsaBu0qlKMO3wk4yP5rAXc",

            "posted_on": {

                "day": 12,

                "month": 10,

                "year": 2023

            },

            "text": "Hey there, it's Ugnius and Karolis saying hello from the HR Technology Conference & Exposition 2023 in Las Vegas! \ud83d\udc4b\n\nAre you joining us at this year's event? If you're here or planning to be, come say hi!\n\nAnd if you're curious about how #data can supercharge your #HR strategies and would like to have a 1:1 consultation, don't hesitate to let us know here \u27a1\ufe0f https://lnkd.in/d_nmMVix\n\n#HRTechConf #HRtech",

            "total_likes": 107

        },

        {

            "article_link": "https://www.linkedin.com/company/hr-technology-conference/",

            "image": "https://media.licdn.com/dms/image/D4D22AQGp-k_9-MS5JA/feedshare-shrink_2048_1536/0/1696858497052?e=1700092800&v=beta&t=OKKL176H0qVmZz_Q7myLhagdYm-WEqew0Dum6cdWdMQ",

            "posted_on": {

                "day": 10,

                "month": 10,

                "year": 2023

            },

            "text": "\ud83d\udce3 Let's meet in Las Vegas TOMORROW! \ud83d\udce3\n\nFrom October 10th to 13th, you can find us at the HR Technology Conference & Exposition in Las Vegas, where you will have the opportunity to:\n\n\ud83d\udc65 Meet the people behind our brand\u00a0\n\ud83d\udcca Conveniently identify your #data needs\n\u2753 Ask any questions and receive immediate answers\n\ud83d\udcbc Discover how public talent data can help enhance #HR solutions\n\nBook a meeting with us \u27a1\ufe0f https://lnkd.in/d_nmMVix\u00a0\n#HRTechConf #HRtech",

            "total_likes": 15

        }

    ],

    "follower_count": 1268,

    "acquisitions": {

        "acquired": [],

        "acquired_by": null

    },

    "exit_data": [],

    "extra": {

        "crunchbase_profile_url": "https://www.crunchbase.com/organization/coresignal",

        "ipo_status": "Private",

        "crunchbase_rank": 144694,

        "founding_date": null,

        "operating_status": "Active",

        "company_type": "For Profit",

        "contact_email": "[email protected]",

        "phone_number": null,

        "facebook_id": null,

        "twitter_id": null,

        "number_of_funding_rounds": 0,

        "total_funding_amount": null,

        "stock_symbol": null,

        "ipo_date": null,

        "number_of_lead_investors": 0,

        "number_of_investors": 0,

        "total_fund_raised": 0,

        "number_of_investments": null,

        "number_of_lead_investments": 0,

        "number_of_exits": null,

        "number_of_acquisitions": null

    },

    "funding_data": [],

    "categories": [

        "big-data",

        "data-mining",

        "database"

    ]

}

Nice.

As you can see, depending on the amount of information they’ve made available, we can provide you with many data points like:

  • The amount of funding a company has received
  • The rounds of investments
  • Similar companies
  • Exits, acquisitions, and beyond

There’s more than enough information here to be worth integrating into your workflow.

The example above is a competitive analysis of Coresignal, but you could use the same endpoint for prospecting due diligence as well.

It’ll enrich any company profile you give it, giving you a better understanding of that company.
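
And if you’d rather pull out just the due diligence highlights instead of printing the entire JSON, a small sketch like this (reusing the data dict from the script above) does the trick:

# Sketch: summarize the due diligence highlights from the company response above
extra = data.get('extra') or {}
print(f"{data.get('name')}, founded {data.get('founded_year')} ({extra.get('ipo_status')})")
print(f"Employees on LinkedIn: {data.get('company_size_on_linkedin')}")
print(f"Funding rounds: {extra.get('number_of_funding_rounds')}, "
      f"total raised: {extra.get('total_funding_amount')}")
print("Acquisitions:", (data.get('acquisitions') or {}).get('acquired'))
print("Similar companies:")
for company in data.get('similar_companies') or []:
    print(f"  - {company.get('name')} ({company.get('industry')})")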

Searching for companies based on certain criteria and enriching them

It’s very simple to automatically search for companies using specific filtering parameters with our Company Search Endpoint.

It’s why many VC firms and the like use Proxycurl for due diligence on deal sourcing.

Let’s say we’re looking for early-stage, San Francisco-based financial services companies.

With our Company Search Endpoint, we can use the following script to search for matching startups and then export them to a .CSV:


import requests
import csv

# API endpoint
endpoint = "https://nubela.co/proxycurl/api/search/company"

# API headers
headers = {
    "Authorization": "Bearer Your_API_Key_Here"
}

# Parameters for the API request
params = {
    "country": "us",
    "city": "(?i)San Francisco",
    "funding_amount_max": "100000",
    "employee_count_min": "10",
    "industry": "(?i)Financial Services",
    "enrich_profiles": "enrich",
    "page_size": 10
}

# Make the API request
response = requests.get(endpoint, headers=headers, params=params)
data = response.json()

# Check if 'results' key exists in the data
if 'results' in data:
    # Extract the results
    results = data['results']

    # Print the companies matching the criteria
    print("Companies matching the criteria:")
    print("--------------------------------")

    # Prepare data for CSV export
    csv_data = [["Company Name", "LinkedIn URL", "HQ", "Industry", "Description"]]

    for result in results:
        profile = result.get('profile', {})
        company_name = profile.get('name', "N/A")
        linkedin_url = result.get('linkedin_profile_url', "N/A")
        hq_value = profile.get('hq', {})
        hq = hq_value.get('city', "N/A") if hq_value else "N/A"
        industry = profile.get('industry', "N/A")
        description = profile.get('description', "N/A")

        print(f"Company Name: {company_name}")
        print(f"LinkedIn URL: {linkedin_url}")
        print(f"HQ: {hq}")
        print(f"Industry: {industry}")
        print(f"Description: {description}")
        print("")

        # Append to CSV data
        csv_data.append([company_name, linkedin_url, hq, industry, description])

    # Export to CSV
    with open("sanfran_startups.csv", "w", newline="", encoding="utf-8") as file:
        writer = csv.writer(file)
        writer.writerows(csv_data)
else:
    print("No companies found matching the criteria.")

It’ll find any relevant companies that match the search criteria, enrich them, and export them to a .CSV named “sanfran_startups.csv” in the same folder as your Python script – of course, you just need to slightly modify this to fit your needs.

When you use our Search API, you don’t need any existing identifiers at all, which comes in handy for a lot of use cases.

By the way, we wrote a whole article talking about proven methods you can use as a VC firm for deal sourcing. It might be of value to you.

A quick recap

Whew, that was a lot. Here’s what we’ve covered so far:

  • Employment due diligence
  • Due diligence on people
  • Due diligence on companies

The examples above are only a few possibilities out of hundreds of different possible due diligence use cases.

So, what do you think?

Can you think of any workflows where our different endpoints could be of value?

Maybe in your HR department? Sales? Account management?

Between our different endpoints, you can seamlessly integrate the data we provide into your existing systems – for whatever purpose you need it.

More specifically, with the data we provide and our enrichment capabilities, you can automate much of the mundane work of performing due diligence across a range of tasks, giving you and your entire team clear-cut data to run your business functions on.

The next step:

If you’re seeing the value here, which I think you are…

The next step is to click right here and create your Proxycurl account.

It’s free, and you start with 10 credits, which allows you to test out our free endpoints as well as perform a few paid actions (but not many).

After that, $10 gets you 100 credits, which covers quite a few different due diligence actions, and you can top up from there (or sign up for a subscription; it’s up to you).

Our full Proxycurl API pricing policy is available here.

Simply stated: Due diligence is important – you need to know who you’re talking to, but it doesn’t have to be difficult.

Integrate Proxycurl with your due diligence processes today, and you’ll thank yourself later.

P.S. Have any questions about implementing Proxycurl with your business? No problem, we’ll be glad to help; reach out to “[email protected]”.
