Scrape 1M webpages without trouble from reCAPTCHA and bot detection

Proxycurl is a web crawling and scraping API that scrapes webpages in real time with one line of code.

Scrape 1M LinkedIn profiles with Proxycurl's LinkedIn API

Get structured data from LinkedIn profiles

Screenshot of Bill Gates's LinkedIn profile
    
        {
            'public_identifier': 'williamhgates',
            'profile_pic_url': 'https://media-exp1.licdn.com/dms/image/C5603AQHv9IK9Ts0dFA/profile-displayphoto-shrink_800_800/0?e=1604534400&v=beta&t=Lvh0ACqZ78o1BnS3RLKTfB0DWAYXTMXTwegz-9O_EMY',
            'first_name': 'Bill',
            'last_name': 'Gates',
            'occupation': 'Co-chair, Bill & Melinda Gates Foundation',
            'headline': 'Co-chair, Bill & Melinda Gates Foundation',
            'summary': 'Co-chair of the Bill & Melinda Gates Foundation. Microsoft Co-founder. Voracious reader. Avid traveler. Active blogger.',
            'country': 'us',
            'birth_date': {
                'month': 10,
                'day': 28
            },
            'address': 'None',
            'wechat_contact_info': 'None',
            'primary_twitter_handle': 'None',
            'twitter_handles': [],
            'phone_numbers': [],
            'email_address': 'None',
            'websites': [],
            'experiences': [
                {
                    'company': 'Bill & Melinda Gates Foundation',
                    'url': 'https://www.linkedin.com/company/bill-&-melinda-gates-foundation/',
                    'title': 'Co-chair',
                    'starts_at': {
                        'month': 'None',
                        'year': 2000
                    },
                    'ends_at': 'None'
                },
                {
                    'company': 'Microsoft',
                    'url': 'https://www.linkedin.com/company/microsoft/',
                    'title': 'Co-founder',
                    'starts_at': {
                        'month': 'None',
                        'year': 1975
                    },
                    'ends_at': 'None'
                }
            ],
            'education': [
                {
                    'school': 'None',
                    'degree_name': 'None',
                    'field_of_study': 'None'
                },
                {
                    'school': 'Harvard University',
                    'degree_name': 'None',
                    'field_of_study': 'None'
                }
            ],
            'languages': [],
            'organisations': []
        }
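Once a profile arrives in this shape, it can be post-processed with plain Python. The sketch below is illustrative only: the helper names `clean` and `summarize_experiences` are hypothetical and not part of Proxycurl, and it assumes that the string 'None' in the sample marks a missing value. The sample dict is a trimmed copy of the profile above.

```python
from typing import Any, Dict, List


def clean(value: Any) -> Any:
    """Recursively replace the placeholder string 'None' with Python's None.

    Assumption: the API uses the literal string 'None' for missing values,
    as the sample profile suggests.
    """
    if isinstance(value, dict):
        return {key: clean(item) for key, item in value.items()}
    if isinstance(value, list):
        return [clean(item) for item in value]
    return None if value == 'None' else value


def summarize_experiences(profile: Dict[str, Any]) -> List[str]:
    """Build one readable line per entry in profile['experiences']."""
    lines = []
    for job in profile.get('experiences', []):
        start = job.get('starts_at') or {}
        year = start.get('year')
        since = f" (since {year})" if year else ""
        lines.append(f"{job['title']}, {job['company']}{since}")
    return lines


# Trimmed copy of the sample profile shown above.
sample_profile = {
    'first_name': 'Bill',
    'last_name': 'Gates',
    'email_address': 'None',
    'experiences': [
        {
            'company': 'Bill & Melinda Gates Foundation',
            'title': 'Co-chair',
            'starts_at': {'month': 'None', 'year': 2000},
            'ends_at': 'None',
        },
        {
            'company': 'Microsoft',
            'title': 'Co-founder',
            'starts_at': {'month': 'None', 'year': 1975},
            'ends_at': 'None',
        },
    ],
}

cleaned_profile = clean(sample_profile)

if __name__ == "__main__":
    for line in summarize_experiences(cleaned_profile):
        print(line)
```

Normalizing the placeholders first means later code can test for missing fields with `is None` instead of comparing strings.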
    

Proxycurl Web Crawling API

For teams that care about cost and time efficiency.

Bypass reCAPTCHAs and bot detection

Crawl popular websites such as Google or Amazon without running into reCAPTCHA challenges.

Real-time crawls

Crawls dispatched from Proxycurl are made in real time.

Crawl 1M pages in 1 day

Proxycurl scales trivially. Make more API requests concurrently to scrape more pages.
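One way to issue requests concurrently from Python is a thread pool. The following is a minimal sketch rather than an official client: `crawl_many` and `fake_fetch` are hypothetical names, and `fake_fetch` is a stand-in that makes no network calls. In real use you would pass a function that performs one Proxycurl API request for a given URL, and pick `max_workers` to suit your plan's limits.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def crawl_many(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    max_workers: int = 8,
) -> Dict[str, object]:
    """Call fetch(url) for every URL on a thread pool.

    Returns a dict mapping each URL to the value fetch returned,
    or to the exception it raised, so one failure does not abort the batch.
    """
    results: Dict[str, object] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit all jobs up front so they run concurrently.
        futures = {url: pool.submit(fetch, url) for url in urls}
        for url, future in futures.items():
            try:
                results[url] = future.result()
            except Exception as exc:
                results[url] = exc
    return results


def fake_fetch(url: str) -> str:
    """Stand-in for a real API call (hypothetical, no network access)."""
    if "bad" in url:
        raise ValueError("simulated failure for " + url)
    return "body of " + url


if __name__ == "__main__":
    pages = crawl_many(
        ["https://example.com/a", "https://example.com/bad"],
        fake_fetch,
        max_workers=2,
    )
    for url, outcome in pages.items():
        print(url, "->", outcome)
```

Threads suit this workload because each request spends most of its time waiting on the network; raising `max_workers` increases the number of requests in flight.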

Made for developers

Proxycurl is a distributed crawling service that helps to circumvent most (if not all) rate-limiting techniques employed by complex websites.

                        
        import requests

        # Proxycurl endpoint and credentials
        api_endpoint = 'https://nubela.co/proxycurl/api'
        api_key = 'YOUR_API_KEY'
        header_dic = {'Authorization': 'Bearer ' + api_key}

        # The page to crawl is passed in the request payload
        payload = {'url': 'https://api.ipify.org?format=json'}
        response = requests.get(api_endpoint, json=payload, headers=header_dic)
                        
                    

Our customers

We partner with organizations large and small to transform their products with big data.

diffbot
hiretual
canddi
alore

Try Proxycurl now with a trial API key