Using cURL In 5 Actual Scenarios (+ Extensive Code Samples)
Every application you have ever interacted with goes through the client-server, or request-response, connection cycle. This connection is made possible by protocols such as HTTP, SMTP, Telnet, LDAP, and many others, along with their methods. The HTTP protocol provides popular methods like GET, POST, PUT/PATCH, and DELETE to send requests to a server and get responses.
cURL is a powerful tool that provides a generic and language-agnostic way of transferring data via the earlier-mentioned protocols. For example, it takes a search or download request from you and sends it to a server holding the data you need. It retrieves this data by creating a connection with the server using a protocol method based on your request type and then returning your requested data as a response.
In this article, we'll explore cURL in detail: what it's about, why it's so popular, and when and how to use it. Let's go!
Overview
- What Is cURL & Its Protocols?
- Why Is cURL So Popular?
- How and When To Use cURL
- cURL - A Powerful Tool
What Is cURL?
cURL, which stands for "Client URL," is a command-line tool for transferring data to (request) and from (response) a server via various protocols. In its simplest form, cURL lets you communicate with a server by sending request data to a URL or API endpoint and getting a set of data back as a response.
cURL is powered by libcurl, an open-source, client-side URL transfer library. Daniel Stenberg released the first version of cURL in March 1998. It is versatile and ideal for communication across different devices due to its support for many protocols, including HTTP and HTTPS.
cURL Protocols
Before we dive in, let's briefly explain protocols.
A protocol is a set of rules that defines how data is transferred over a network. Protocols come with methods that tell a server how your request should be treated.
We already established that cURL supports many protocols but defaults to HTTP if you do not specify a protocol.
Here are the protocols cURL currently supports:
HTTPS, DICT, FILE, FTP, HTTP, IMAP, IMAPS, FTPS, SCP, POP3, POP3S, RTMP, SFTP, TELNET, TFTP, SMTP, SMTPS, SMBS, LDAP, LDAPS, SMB, GOPHER, GOPHERS, RTSP, MQTT
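As a rough illustration of how a URL selects the protocol, here is a small Python sketch. The function name guess_protocol and the prefix table are assumptions made for this example. They mimic the documented behavior where cURL guesses from common host name prefixes (such as ftp.) and otherwise falls back to HTTP, but they are not cURL's actual implementation.

```python
# Simplified sketch of how a URL maps to a protocol; not cURL's real code.
# Host name prefixes used for guessing when no scheme is given
# (an illustrative subset, assumed for this example).
GUESSABLE_PREFIXES = ("ftp", "dict", "ldap", "imap", "smtp", "pop3")


def guess_protocol(url: str) -> str:
    """Return the protocol name a URL would be handled with."""
    if "://" in url:
        # An explicit scheme always wins, e.g. "https://example.com".
        return url.split("://", 1)[0].lower()
    host = url.split("/", 1)[0].lower()
    for prefix in GUESSABLE_PREFIXES:
        if host.startswith(prefix + "."):
            return prefix
    # No scheme and no recognizable prefix: fall back to HTTP.
    return "http"


for example in ("https://httpbin.org/get", "ftp.example.com/file.tar.gz", "example.com"):
    print(example, "->", guess_protocol(example))
```

In practice it is safer to always write the scheme out in full so nothing is left to guessing.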
Why Is cURL So Popular?
Besides its support for many protocols, cURL's portability is one of the many attributes responsible for its popularity. cURL is compatible with almost every operating system and connected device.
Not just that: cURL is also useful for user authentication, SSL connections, proxying, and web scraping.
When it comes to HTTP/HTTPS requests, it's a quick and easy way to test API endpoints. Debugging is stress-free too: with the verbose option (-v), cURL logs the details of the request and response, so you can quickly find the error without a debugger.
How and When to Use cURL
We have talked a lot about what cURL is, so you're probably wondering how to get started with this powerful tool.
How to Use cURL
cURL is a command-line tool, which means you run it from a terminal or command prompt.
Run curl --version to check whether you already have cURL installed. To pull up the cURL manual, run curl --manual.
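If you would rather check from a script, the same test can be sketched in Python. The helper name curl_version is an assumption for this example; it looks cURL up on your PATH and, if found, asks it for its version.

```python
import shutil
import subprocess


def curl_version():
    """Return the first line of `curl --version`, or None if curl is absent."""
    path = shutil.which("curl")  # search the PATH for the curl executable
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


version = curl_version()
print(version if version else "curl is not installed or not on PATH")
```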
On macOS (formerly OS X), if your system is less than ten years old, you most likely have cURL installed.
On Windows, cURL may already be installed depending on the version you're using or what you've downloaded (such as Git for Windows and other dev tools).
On Linux, cURL ships with most distributions. If yours doesn't include it, you can get it from your package manager.
If you still don't have cURL installed, follow this guide for macOS, Windows, and Linux.
When to Use cURL
Now that you have cURL installed, let's use it! Let's try out some popular cURL use cases:
1. Web Scraping With cURL
Before trying this out, make sure you abide by the website’s rules, and in general do not try to access password-protected content without permission, as that can be illegal.
Alright, let's scrape. I'll be using Python for this. We will be scraping the HTML of httpbin's sample form page.
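It's very straightforward: import Python's subprocess module, pass it your cURL command, and print the captured output.
import subprocess
# -s hides the progress meter; capture_output collects the HTML that curl writes to stdout
result = subprocess.run(["curl", "-s", "https://httpbin.org/forms/post"], capture_output=True, text=True)
print(result.stdout)
And we get something like this: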
<!DOCTYPE html>
<html>
<head>
</head>
<body>
<!-- Example form from HTML5 spec http://www.w3.org/TR/html5/forms.html#writing-a-form's-user-interface -->
<form method="post" action="/post">
<p><label>Customer name: <input name="custname"></label></p>
<p><label>Telephone: <input type=tel name="custtel"></label></p>
<p><label>E-mail address: <input type=email name="custemail"></label></p>
<fieldset>
<legend> Pizza Size </legend>
<p><label> <input type=radio name=size value="small"> Small </label></p>
<p><label> <input type=radio name=size value="medium"> Medium </label></p>
<p><label> <input type=radio name=size value="large"> Large </label></p>
</fieldset>
<fieldset>
<legend> Pizza Toppings </legend>
<p><label> <input type=checkbox name="topping" value="bacon"> Bacon </label></p>
<p><label> <input type=checkbox name="topping" value="cheese"> Extra Cheese </label></p>
<p><label> <input type=checkbox name="topping" value="onion"> Onion </label></p>
<p><label> <input type=checkbox name="topping" value="mushroom"> Mushroom </label></p>
</fieldset>
<p><label>Preferred delivery time: <input type=time min="11:00" max="21:00" step="900" name="delivery"></label></p>
<p><label>Delivery instructions: <textarea name="comments"></textarea></label></p>
<p><button>Submit order</button></p>
</form>
</body>
</html>
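Fetching the page is only half of scraping; you usually want to pull something out of the HTML. As a minimal sketch, the snippet below lists the form field names found in markup like the output above, using Python's built-in html.parser. The class FieldNameCollector and the function form_field_names are names invented for this example, and the sample input is a short excerpt of the form.

```python
from html.parser import HTMLParser


class FieldNameCollector(HTMLParser):
    """Collect the unique `name` attributes of form controls, in order."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("input", "textarea", "select"):
            return
        name = dict(attrs).get("name")
        # Radio buttons and checkboxes repeat a name; keep it only once.
        if name and name not in self.names:
            self.names.append(name)


def form_field_names(html: str) -> list:
    collector = FieldNameCollector()
    collector.feed(html)
    return collector.names


# A short excerpt of the httpbin form, used as sample input.
sample = """
<form method="post" action="/post">
<p><label>Customer name: <input name="custname"></label></p>
<p><label> <input type=radio name=size value="small"> Small </label></p>
<p><label> <input type=radio name=size value="medium"> Medium </label></p>
<p><label>Delivery instructions: <textarea name="comments"></textarea></label></p>
</form>
"""
print(form_field_names(sample))
```

In a real script you would pass the HTML captured from cURL to form_field_names instead of the sample string.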
2. Send API Requests
cURL can be used to send API requests via the HTTP protocol. We will simulate the request-response cycle using the popular HTTP methods. I'll be demonstrating with Proxycurl's API.
We will use the GET method, which retrieves resources from an endpoint or URL. Using Proxycurl's job count endpoint, we will retrieve the number of jobs posted by companies on LinkedIn.
This endpoint requires an authorization key to access the resource. cURL supports HTTP headers, so you can pass authorization keys or tokens with the -H option. Cool stuff!
curl \
-X GET \
-H "Authorization: Bearer ${YOUR_API_KEY}" \
'https://nubela.co/proxycurl/api/v2/linkedin/company/job/count?when=past-month&flexibility=remote&geo_id=92000000&keyword=software+engineer&search_id=1035'
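Long query strings are easy to mistype by hand. One possible approach, sketched below in Python, is to build the same command from a dictionary of parameters. The helper build_get_command and the fallback key "placeholder-key" are assumptions for this example; the endpoint and parameters are the ones from the command above.

```python
import os
import shlex
from urllib.parse import urlencode


def build_get_command(base_url: str, params: dict, api_key: str) -> list:
    """Build the argument list for an authenticated curl GET request."""
    url = f"{base_url}?{urlencode(params)}"  # urlencode escapes spaces as "+"
    return ["curl", "-X", "GET", "-H", f"Authorization: Bearer {api_key}", url]


command = build_get_command(
    "https://nubela.co/proxycurl/api/v2/linkedin/company/job/count",
    {
        "when": "past-month",
        "flexibility": "remote",
        "geo_id": "92000000",
        "keyword": "software engineer",
        "search_id": "1035",
    },
    # YOUR_API_KEY is read from the environment; the fallback is a placeholder.
    os.environ.get("YOUR_API_KEY", "placeholder-key"),
)
print(shlex.join(command))  # shell-quoted form of the command
# To actually send it: subprocess.run(command, capture_output=True, text=True)
```

Passing the command as a list, rather than one long string, avoids shell-quoting problems when a parameter contains spaces or ampersands.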
3. Download A File Using cURL
Downloading files is cooler when you can do it from a command line. Yes, cURL gives you that superpower and it's very easy. This can be done in multiple ways but these two can suffice:
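First, we can download a file using the HTTP protocol. Just get the URL where the file is located and run the command below. The -O option saves the file under its remote name; without it, cURL writes the file's contents to your terminal.
curl -O https://your-domain/file.pdf
Or use the FTP or SFTP protocol. Just run:
curl -O ftp://ftp-your-domain-name/file.tar.gz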
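When downloads are scripted, it helps to decide the local file name explicitly. The sketch below is one way to do that in Python; download_command is a hypothetical helper, and the URLs are the placeholders used above. It relies on cURL's -o (write to a named file), -L (follow redirects), and -f (fail on HTTP errors) options.

```python
import posixpath
from urllib.parse import urlparse


def download_command(url: str, dest: str = "") -> list:
    """Build a curl argument list that saves `url` to a local file."""
    # Default to the last path segment of the URL as the local file name.
    filename = dest or posixpath.basename(urlparse(url).path)
    if not filename:
        raise ValueError("URL has no file name; pass dest explicitly")
    # -f: fail on HTTP errors instead of saving an error page
    # -L: follow redirects; -o: write the body to the given file
    return ["curl", "-f", "-L", "-o", filename, url]


print(download_command("https://your-domain/file.pdf"))
print(download_command("ftp://ftp-your-domain-name/file.tar.gz", "backup.tar.gz"))
# To run one of them: subprocess.run(download_command(...), check=True)
```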
4. User Authentication
Right from your command prompt, you can authenticate with a web app if you have a user account.
This works through an HTTP POST request, which means you send some extra data alongside the URL. In this scenario, the server receiving the request checks whether your credentials exist in its database and grants or denies access accordingly.
curl -X POST http://www.nubela.co/proxycurl/login/ -d 'username=yourusername&password=yourpassword'
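One detail worth sketching: the -d option sends the data as written, so a password containing characters like & or @ needs to be percent-encoded first. The Python snippet below shows one way to prepare the body. The helper login_command and the password "p@ss&word" are made up for illustration; the URL is the one from the command above.

```python
import shlex
from urllib.parse import urlencode


def login_command(url: str, username: str, password: str) -> list:
    """Build a curl POST command with a form-encoded body."""
    # urlencode percent-encodes characters such as "&" and "@" so they are
    # not mistaken for field separators in the form body.
    body = urlencode({"username": username, "password": password})
    return ["curl", "-X", "POST", url, "-d", body]


command = login_command(
    "http://www.nubela.co/proxycurl/login/",
    "yourusername",
    "p@ss&word",  # hypothetical password with special characters
)
print(shlex.join(command))
```

Without the encoding step, the server would read everything after the & as a separate, unrelated field.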
5. Retrieve the HTTP Headers of a URL
The cURL command has a -I option that fetches only the HTTP headers of a page by sending a HEAD request. This is useful when you need metadata about a resource, such as its content type or size, before accessing it.
curl -I https://www.nubela.co/proxycurl
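Once you have the headers as text, a script usually wants them as key-value pairs. Here is a minimal Python sketch of that step. The function parse_headers is a hypothetical helper, and the sample response is made up for illustration; it is not real output from any server.

```python
def parse_headers(raw: str):
    """Split `curl -I` style output into a status line and a header dict."""
    lines = [line for line in raw.splitlines() if line.strip()]
    if not lines:
        raise ValueError("no header lines found")
    status_line, header_lines = lines[0], lines[1:]
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return status_line, headers


# Made-up sample response, not real output from any server.
sample = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html; charset=utf-8\r\n"
    "Content-Length: 1024\r\n"
    "\r\n"
)
status, headers = parse_headers(sample)
print(status)
print(headers["content-type"])
```

This sketch assumes a single response; if you add -L to follow redirects, cURL prints one header block per hop, and you would need to split on the blank lines first.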
cURL - A Powerful Tool
We've explained in detail what cURL is about, when and how to use it, and its many advantages. While cURL is a powerful web scraping tool, collecting and cleaning data with it takes a lot of development time.
Proxycurl is a fully managed layer that takes the hassle out of scraping and processing data at scale while you focus on building your applications. It has a whopping 401M profiles of people and companies to scale up your business. Most importantly, its APIs are fully compliant with major regulatory frameworks, so you can scale data without worrying about landing in legal trouble.