Scrape 1M webpages per day without trouble from reCAPTCHA and bot detection

Proxycurl is a web crawling and scraping API that scrapes webpages in real time with one line of code.

Scrape 1M LinkedIn profiles per day with Proxycurl's LinkedIn API

Get structured data from LinkedIn profiles

Screenshot of Bill Gates's LinkedIn profile
    
   {
       'public_identifier': 'williamhgates',
       'profile_pic_url': 'https://media-exp1.licdn.com/dms/image/C5603AQHv9IK9Ts0dFA/profile-displayphoto-shrink_800_800/0?e=1604534400&v=beta&t=Lvh0ACqZ78o1BnS3RLKTfB0DWAYXTMXTwegz-9O_EMY',
       'first_name': 'Bill',
       'last_name': 'Gates',
       'occupation': 'Co-chair, Bill & Melinda Gates Foundation',
       'headline': 'Co-chair, Bill & Melinda Gates Foundation',
       'summary': 'Co-chair of the Bill & Melinda Gates Foundation. Microsoft Co-founder. Voracious reader. Avid traveler. Active blogger.',
       'country': 'us',
       'birth_date': {
           'month': 10,
           'day': 28
       },
       'address': 'None',
       'wechat_contact_info': 'None',
       'primary_twitter_handle': 'None',
       'twitter_handles': [],
       'phone_numbers': [],
       'email_address': 'None',
       'websites': [],
       'experiences': [
           {
               'company': 'Bill & Melinda Gates Foundation',
               'url': 'https://www.linkedin.com/company/bill-&-melinda-gates-foundation/',
               'title': 'Co-chair',
               'starts_at': {
                   'month': 'None',
                   'year': 2000
               },
               'ends_at': 'None'
           },
           {
               'company': 'Microsoft',
               'url': 'https://www.linkedin.com/company/microsoft/',
               'title': 'Co-founder',
               'starts_at': {
                   'month': 'None',
                   'year': 1975
               },
               'ends_at': 'None'
           }
       ],
       'education': [
           {
               'school': 'None',
               'degree_name': 'None',
               'field_of_study': 'None'
           },
           {
               'school': 'Harvard University',
               'degree_name': 'None',
               'field_of_study': 'None'
           }
       ],
       'languages': [],
       'organisations': []
   }
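A response like the one above is easier to work with once its placeholder values are normalized. The sketch below assumes the payload has already been loaded into a Python dict shaped like the sample, and that the string 'None' marks a missing value, as it appears to in the sample. The helper names `clean` and `current_positions` are illustrative and are not part of Proxycurl's API.

```python
def clean(value):
    """Recursively replace the placeholder string 'None' with Python's None."""
    if isinstance(value, dict):
        return {key: clean(item) for key, item in value.items()}
    if isinstance(value, list):
        return [clean(item) for item in value]
    return None if value == 'None' else value


def current_positions(profile):
    """Return (title, company) pairs for experiences that have no end date."""
    return [
        (exp['title'], exp['company'])
        for exp in profile.get('experiences', [])
        if exp.get('ends_at') is None
    ]


# A trimmed copy of the sample profile shown above.
sample = {
    'first_name': 'Bill',
    'last_name': 'Gates',
    'email_address': 'None',
    'experiences': [
        {'company': 'Bill & Melinda Gates Foundation', 'title': 'Co-chair',
         'starts_at': {'month': 'None', 'year': 2000}, 'ends_at': 'None'},
        {'company': 'Microsoft', 'title': 'Co-founder',
         'starts_at': {'month': 'None', 'year': 1975}, 'ends_at': 'None'},
    ],
}

profile = clean(sample)
print(profile['email_address'])    # None
print(current_positions(profile))
```

Normalizing once, right after the response is received, keeps later code free of string comparisons against 'None'.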
    

Proxycurl Web Crawling API

For teams that care about cost and time efficiency.

Bypass reCAPTCHA and bot detection

Crawl popular websites such as Google or Amazon without running into reCAPTCHA challenges.

Real-time crawls

Crawls dispatched through Proxycurl run in real time.

Crawl 1M pages in 1 day

Proxycurl scales with your request volume: to scrape more pages, make more API requests concurrently.
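To make the 1M-pages-per-day figure concrete: a day has 86,400 seconds, so the target works out to roughly 11.6 requests per second, sustained. How many requests must be in flight at once depends on how long each crawl takes, which this page does not state. The 5-second latency below is purely an assumed figure for illustration, and `required_rate` and `required_concurrency` are hypothetical helpers.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def required_rate(pages_per_day):
    """Sustained requests per second needed to hit a daily page target."""
    return pages_per_day / SECONDS_PER_DAY


def required_concurrency(pages_per_day, latency_seconds):
    """Requests that must be in flight at once (rate x latency), rounded up."""
    return math.ceil(required_rate(pages_per_day) * latency_seconds)


rate = required_rate(1_000_000)
print(round(rate, 2))                        # 11.57
print(required_concurrency(1_000_000, 5.0))  # 58 (the 5 s latency is an assumption)
```

Measure the latency of a few real crawls and substitute it for the assumed value before sizing a worker pool.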

Made for developers

Proxycurl is a distributed crawling service that helps circumvent most, if not all, of the rate-limiting techniques employed by complex websites.

                        
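import requests

# Proxycurl API endpoint and your API key
api_endpoint = 'https://nubela.co/proxycurl/api'
api_key = 'YOUR_API_KEY'
# The API key is sent as a bearer token
header_dic = {'Authorization': 'Bearer ' + api_key}
# The page to crawl goes in the 'url' field of the payload
payload = {'url': 'https://api.ipify.org?format=json'}
# Dispatch the crawl; the result comes back in `response`
response = requests.get(api_endpoint,
                        json=payload,
                        headers=header_dic)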
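Since throughput comes from issuing requests concurrently, one way to fan the request above out over many URLs is a thread pool. This is a minimal sketch and not an official Proxycurl client: `crawl` and `crawl_many` are hypothetical helpers, the request shape is copied from the sample above, and the worker count and timeout are arbitrary assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

API_ENDPOINT = 'https://nubela.co/proxycurl/api'  # endpoint from the sample above


def crawl(url, api_key='YOUR_API_KEY'):
    """Crawl one URL, mirroring the single-request sample above."""
    import requests  # third-party; imported lazily so crawl_many loads without it
    response = requests.get(API_ENDPOINT,
                            json={'url': url},
                            headers={'Authorization': 'Bearer ' + api_key},
                            timeout=60)  # timeout value is an assumption
    response.raise_for_status()  # surface HTTP errors instead of ignoring them
    return response.text


def crawl_many(urls, fetch=crawl, max_workers=32):
    """Run fetch over urls concurrently; return {url: result or exception}."""
    urls = list(urls)

    def safe_fetch(url):
        try:
            return fetch(url)
        except Exception as exc:  # keep one failure from aborting the batch
            return exc

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as its input
        return dict(zip(urls, pool.map(safe_fetch, urls)))
```

Calling `crawl_many(['https://example.com/a', 'https://example.com/b'])` returns a dict keyed by URL, with an exception object in place of the body for any crawl that failed. Passing a different `fetch` callable makes the fan-out logic easy to exercise without network access.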
                        
                    

Our customers

We partner with organizations large and small to transform their products with big data.

diffbot
canddi
alore

Try Proxycurl now with a trial API key