The Blueprint to Building a Successful Sales Prospecting Application
Take a look around the B2B sales data and general prospecting software-as-a-service space...
What kind of companies do you see?
- Apollo
- Lusha
- Clearbit
- RocketReach
- ZoomInfo
- UpLead
- Coresignal
- UserGems
- Reply.io
All of them are successful in their own right, and they all do more or less the same thing a little differently: providing B2B data while charging a profitable recurring premium for it.
Is selling B2B data profitable?
Yep. Many companies require rich B2B data to function, particularly internal sales teams that use it for sales intelligence.
Using Latka, the popular SaaS database tool, we can take a look at the B2B data market and see:
- Apollo raised just over $140M over five rounds (including seed and pre-seed), and its revenue run rate hit roughly $23M as of 2023.
- Clearbit raised $37M over three rounds (including pre-seed), and its revenue run rate hit roughly $41M as of 2023.
- Lusha raised $245M over two rounds, and its revenue run rate hit roughly $29M as of 2023.
Clearly, there are plenty of investors and clients to go around the B2B data space. In my opinion, it's just as great of an industry to enter as something like AI is, especially once you incorporate the power of AI with rich B2B data.
Once you find your product/market fit, there's a lot of growth potential within the B2B data space.
Only some B2B data companies scrape and acquire their own data
Quite a few of the SaaS products in the general B2B data and sales prospecting tool space acquire their datasets from other data providers -- and there's a reason for that: it's hard and time-consuming to scrape massive amounts of data.
This is exactly why we'll be avoiding web scraping in this article.
We'll be showing you a way that you can build a sales prospecting application fed with rich B2B data -- all without having to scrape a single page.
But first, let me give you a background on how web scraping works, so you understand the full picture.
Everything starts somewhere
At Proxycurl, we're extremely experienced with scraping and collecting data on a massive scale.
When it comes to collecting B2B data, one of the biggest sources we scrape from is LinkedIn. We've crawled an extensive amount of LinkedIn's public pages to extract information from them.
Everything first starts with a seed. From the seed, you can branch out, collecting more and more data. The crawler's job is to branch out from the original seed.
Before you know it, you've followed thousands of new links and collected thousands of data points.
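To make that concrete, here's a minimal sketch of the seed-and-branch pattern in Python. It's illustrative only: the seed URL is a placeholder, and a real crawler adds politeness rules (robots.txt, rate limits), large-scale deduplication, and proxy rotation:
# Minimal seed-and-branch crawler sketch -- illustrative, not production code.
from collections import deque
from urllib.parse import urljoin
import requests
from bs4 import BeautifulSoup

def crawl(seed_url, max_pages=100):
    seen = {seed_url}            # URLs we've already discovered
    queue = deque([seed_url])    # frontier of pages still to visit
    while queue and len(seen) < max_pages:
        url = queue.popleft()
        try:
            html = requests.get(url, timeout=10).text
        except requests.RequestException:
            continue  # skip unreachable pages, keep branching
        # Every link on the page becomes a new branch to explore
        for a in BeautifulSoup(html, 'html.parser').find_all('a', href=True):
            link = urljoin(url, a['href'])
            if link.startswith('http') and link not in seen:
                seen.add(link)
                queue.append(link)
    return seen

print(len(crawl('https://example.com')))  # placeholder seed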
Google's web crawlers work exactly the same way
They start from a seed and work outward from the seed site, following every link and ultimately browsing the entire web as it continuously expands.
The problem is that scraping and running crawlers is extremely difficult at scale.
In Google's case, everyone wants its crawlers on their sites because that brings in more business. In everyone else's case, businesses like LinkedIn actively try to prevent crawling to protect their source of monetization (data).
LinkedIn has strict limits on the amount of data you can pull from it, even on paid plans like Sales Navigator (2,500 records per search to be exact).
(See: How To Bypass LinkedIn Search Limits in 2023)
That's understandable, because LinkedIn is in fact one of the best sources of B2B data. They just don't like to give it out without a very hefty premium, and they hardly even provide API access.
Should you scrape your own B2B data?
You're not LinkedIn. You don't have a giant B2B social media site to feed you data.
You're also not Google. People don't want you crawling their sites and collecting their content, because unlike with Google, there's nothing mutually beneficial about it for them.
So, you'll need to think about the logistics of hardware, accounts, IP addresses, and beyond to scrape data at scale. That's the scary part.
That's why many B2B data and sales prospecting tools simply opt out of doing their own scraping and instead rely on background data providers.
There are a few other benefits to doing this, the first being a faster time to market.
Not having to build data collection infrastructure is major, but even bigger is that you won't waste any time actually scraping and building a dataset. You can skip that entire process.
But, you will still need to acquire a rather large dataset to use as your data foundation for your application.
Lucky for you, I know just the right company for the job... us.
We power a lot of the B2B data and sales prospecting tools you see on the market
For example, Reply.io, mentioned above, is actually one of our customers.
They add their own twist on things by putting an emphasis on automating and integrating AI with sales outreach:
They approached us when they wanted to roll out their "Data" section:
At its core, that "Data" section is one giant dataset that lets users filter prospects and build prospecting lists.
Other sales prospecting applications work similarly
For example, Lusha provides their own twist to things, but the central premise remains around filtering a large dataset to be used for prospecting.
You can also enrich your data:
Next, let's look at Apollo since it's a fairly large prospecting tool:
Same thing again: it's basically one giant dataset with its own twist on things.
Like Lusha, they also offer enrichment:
Across all of these tools, the value lies in the data, not necessarily the product.
Building the actual tool isn't that hard. Acquiring the data that's then distributed through the application is the hard part.
Good thing we've done it for you.
Introducing LinkDB: your new data foundation
Reply.io is powered by LinkDB, which is our dataset of over 472 million public LinkedIn profiles.
(Is that enough profiles for you?)
We've scraped a massive amount of data for you to build your application on, and we provide it at a very, very competitive price.
How do we compare to our competitors?
It varies, but here's a comparison between us and Coresignal, which focuses largely on selling datasets rather than enrichment:
But still, don't get it confused: while its price tag tends to be lower than our competitors', LinkDB isn't inherently cheap. It's designed for real businesses with real capital, and obtaining it is a worthwhile investment.
You can click here to see all of LinkDB's pricing. We're very transparent, and there's no hidden pricing or gotchas.
We sell people profiles segmented by the following countries:
- United States (264+M)
- India (24+M)
- United Kingdom (20+M)
- Brazil (17+M)
- Canada (13+M)
- France (11+M)
- Germany (9.3+M)
- Australia (8.3+M)
- Mexico (6+M)
- Italy (5.3+M)
- Spain (5.2+M)
- Indonesia (4.4+M)
- Netherlands (4.2+M)
- China (3.8+M)
- South Africa (3.3+M)
Our US People Profile segment is our most frequently purchased dataset, and it's also our largest. For company data specifically, we also have our global company dataset available.
Specifically, LinkDB is delivered as Parquet files. You can view sample data.
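If you haven't worked with Parquet before, it's a columnar file format that pandas (with pyarrow installed) can read in one line. Here's a quick sketch; the file name is a placeholder, not our actual delivery file name:
# Peek inside a LinkDB Parquet file with pandas.
# 'us_person_profiles.parquet' is a placeholder file name.
import pandas as pd

df = pd.read_parquet('us_person_profiles.parquet')
print(df.shape)             # (rows, columns)
print(df.columns.tolist())  # inspect the available fields
print(df.head())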
If you're interested in making a purchase or learning more, please send us an email at "hello@nubela.co".
Can LinkDB be used as a source for my B2B data application?
Yes, you can use LinkDB data to power your application, but you may not disclose more than 33% of the LinkDB data you've purchased from us to any given customer.
In other words, you just can't directly resell LinkDB.
Now that we've covered:
- Whether the B2B data industry is profitable
- The hard way to obtain B2B data
- The easy way to obtain B2B data
Let's get into the technicals...
Feeding your application B2B data
With LinkDB, you can segment by nearly anything you want. For example, here's a query to segment by school:
-- Find everyone in LinkDB who studied at Caltech. Filtering the joined
-- table in the WHERE clause makes this LEFT JOIN act like an INNER JOIN.
SELECT profile.id,
       first_name,
       last_name,
       field_of_study
FROM profile
LEFT JOIN profile_education
  ON profile.id = profile_education.profile_id
WHERE profile_education.school = 'Caltech';
We also provide an API that serves data on people and companies. It's partially powered by LinkDB, but the two work differently.
Our API equivalent is our Person Search Endpoint query:
import json, requests

api_key = 'Your_API_Key_Here'
headers = {'Authorization': 'Bearer ' + api_key}
person_endpoint = 'https://nubela.co/proxycurl/api/search/person'
params = {
    'country': 'US',
    # Regular expression: case-insensitive exact match on the school name
    'education_school_name': '(?i)^caltech$',
}
response = requests.get(person_endpoint, params=params, headers=headers)
result = response.json()
print(json.dumps(result, indent=2))
But the LinkDB version is more powerful. Why is it more powerful?
- We don't have to limit ourselves to a country
- We can select whichever fields we want
- With a database in our backend, there's no need to wait for expensive network calls to complete
You can check out sample data for LinkDB, including 10,000 random US profiles in the Parquet format, to see how LinkDB can be embedded into your application.
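A low-friction way to prototype with that sample is to query the Parquet file in place with DuckDB, no loading step required. This is just a sketch: the file name is a placeholder, and the column names follow the profile table from the SQL example above:
# Query the sample Parquet file in place with DuckDB -- handy for
# prototyping before loading LinkDB into your production database.
# 'us_sample_10000.parquet' is a placeholder file name.
import duckdb

rows = duckdb.sql("""
    SELECT first_name, last_name
    FROM 'us_sample_10000.parquet'
    LIMIT 5
""").fetchall()
print(rows)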
Building on your data foundation
With LinkDB you now have your data foundation to power your application.
Using our API mentioned above, you can get fancy and do the other things you saw with Lusha and Apollo, like enriching data (and more).
Enrichment, by the way, is basically just taking whatever data points you already have, searching our dataset for a match, and returning a richer, fresher profile.
Pulling fresher datasets with our API
Our API returns profiles with varying levels of freshness, depending on how recently we scraped them.
If you use the `use_cache=if-recent` parameter with our API, you're requesting a profile freshness of 29 days or less. About 88% of these profiles are fetched live on the spot; the other 12% are popular profiles served from cache. This parameter guarantees fresh data, but requests take longer to complete.
If you use the `use_cache=if-present` parameter with our API, it'll serve a cached profile when one exists. If not, the profile is fetched from LinkDB where possible; failing that, the request is fulfilled with a live fetch. This option returns a response nearly immediately, which usually makes it a better fit for B2B data applications with a user interface that needs quick response times.
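As a sketch, here's what passing `use_cache` looks like against our Person Profile Endpoint; the endpoint path follows our API docs, and the profile URL is just an example:
# Fetch a profile with use_cache=if-present for a fast, UI-friendly response.
import json, requests

api_key = 'Your_API_Key_Here'
headers = {'Authorization': 'Bearer ' + api_key}
profile_endpoint = 'https://nubela.co/proxycurl/api/v2/linkedin'
params = {
    'url': 'https://www.linkedin.com/in/williamhgates/',  # example profile
    'use_cache': 'if-present',  # use 'if-recent' when freshness beats latency
}
response = requests.get(profile_endpoint, params=params, headers=headers)
print(json.dumps(response.json(), indent=2))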
What can you do with our API?
With our API you can:
- Look up people
- Look up companies
- Enrich people profiles
- Enrich company profiles
- Look up the contact information of people and companies
- Check if an email address is disposable
Let me show you an example.
How to use our API to enrich data
Using our Reverse Email Lookup Endpoint, we can take only an email address and enrich it.
Here's a quick Python example:
import json, requests

api_key = 'Your_API_Key_Here'
headers = {'Authorization': 'Bearer ' + api_key}
api_endpoint = 'https://nubela.co/proxycurl/api/linkedin/profile/resolve/email'
params = {
    'lookup_depth': 'deep',      # search beyond our own database
    'email': 'danial@nubela.co',
    'enrich_profile': 'enrich',  # return the full enriched profile
}
response = requests.get(api_endpoint, params=params, headers=headers)
result = response.json()
print(json.dumps(result, indent=2))
The `lookup_depth=deep` parameter extends the search past our database and is particularly useful for work emails. The `enrich_profile=enrich` parameter returns information such as:
- Full name
- LinkedIn headline
- LinkedIn summary
- Country
- City
- State
- Education
- Experiences
- Industries
- Personal phone number
- All of the other response fields listed here
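Continuing the Reverse Email Lookup example above, here's a sketch of pulling a few of those fields out of `result`. The exact key names ('profile', 'full_name', and so on) are assumptions to verify against the response object in our docs:
# Extract a few enriched fields from the Reverse Email Lookup response.
# Key names are assumptions -- check the documented response object.
profile = result.get('profile') or {}
print(profile.get('full_name'))
print(profile.get('headline'))
print(profile.get('city'), profile.get('state'), profile.get('country_full_name'))
for exp in (profile.get('experiences') or [])[:3]:
    print(exp.get('title'), '@', exp.get('company'))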
There are several other endpoints and several different ways you can use our API to pull fresh, accurate, and rich data to complement LinkDB.
Our API is entirely self-serve, and you can start using it today.
You can create your Proxycurl account for free here.
Or, you can view our pricing here.
Do I need API access or just LinkDB?
LinkDB is a massive database of hundreds of millions of people and companies scraped from LinkedIn. As I mentioned earlier, it's a huge data foundation.
That being said, LinkDB and the Proxycurl API are intended to work in tandem: LinkDB to surface profiles of interest, and the Proxycurl API to enrich and refresh profile data.
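In code, that tandem might look something like this sketch: surface candidates from your local LinkDB copy, then refresh each one through the API before presenting it. It assumes LinkDB is loaded into Postgres with the table and column names from the earlier SQL example, plus a hypothetical linkedin_url column; adjust to your actual schema:
# Sketch of the LinkDB + API tandem. Assumes LinkDB lives in Postgres
# and profiles have a hypothetical `linkedin_url` column.
import requests
import psycopg2

api_key = 'Your_API_Key_Here'
headers = {'Authorization': 'Bearer ' + api_key}
profile_endpoint = 'https://nubela.co/proxycurl/api/v2/linkedin'

conn = psycopg2.connect('dbname=linkdb')
with conn.cursor() as cur:
    # Step 1: surface profiles of interest from LinkDB
    cur.execute("""
        SELECT p.linkedin_url
        FROM profile p
        JOIN profile_education e ON p.id = e.profile_id
        WHERE e.school = 'Caltech'
        LIMIT 10
    """)
    urls = [row[0] for row in cur.fetchall()]

# Step 2: refresh each surfaced profile through the API
for url in urls:
    params = {'url': url, 'use_cache': 'if-present'}
    fresh = requests.get(profile_endpoint, params=params, headers=headers).json()
    print(fresh.get('full_name'), '-', fresh.get('headline'))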
Closing thoughts
If you can add your own twist on things and have some technical competency, or know someone who does (a software engineer), there's money to be made carving your own slice out of the B2B data and sales prospecting application market.
Ideally, you don't want to have to hire a data science or web scraping team, either...
That's where we come in -- and that's who we replace, for way less than they would cost.
We've already scraped all of the B2B data you could possibly need, and now we'd like to give it to you.
LinkDB can act as a perfect data foundation for your application just like it does for Reply.io.
And if you need profiles enriched, or your application demands the absolute freshest data, you can use our API to accomplish that and complement LinkDB.
If you have any questions at all, please feel free to reach out to us at "hello@nubela.co".
P.S. If you need it, our software engineers can help you implement LinkDB with your application. Reach out here to learn more details.