We’re treated to yet another legal drama in the world of web scraping. This time the verdict is out on Meta vs. Bright Data.
Bright Data won, and Meta lost. Yay to the web scraping community.
This win came off the back of a similarly high-profile, years-long legal battle between hiQ and LinkedIn, which hiQ won and LinkedIn lost.
That case ended in 2022 and was a huge win for the web scraping community because it further cemented that scraping a website for data is in fact legal, provided you meet certain requirements.
We highly recommend that you read the hiQ v. LinkedIn case alongside this Meta v. Bright Data case, because both address important facts about web scraping.
Now, time to start unpacking the details of this case:
(By the way, if you're already pretty familiar with the web scraping legal world, you can jump straight to the ruling and the key points here.)
Who are the players?
For those of you who want to know in detail who the parties in this case are, here's the background:
Meta / Facebook
You should know who Meta is unless you’re a late Gen Z or Gen Alpha or a pure TikToker.
Either way, Meta is a gigantic technology company that owns multiple popular social networking products such as Facebook, Instagram, WhatsApp, and most recently Threads.
With these popular social-related products naturally come large user bases, making Meta a popular scraping target because of that vast and valuable user data.
Bright Data
Bright Data is a web data company headquartered in Israel. The company started out as a proxy and IP provider, and since 2022 it has expanded into web scraping and selling the scraped data as datasets.
Here’s a full walkthrough of Bright Data as a scraping service.
Senior U.S. District Judge Edward Chen
Legal commentaries usually look only at the companies or parties involved. In this case, though, the presiding judge, Judge Chen, is an intriguing figure in his own right. Coincidentally (or not), he also presided over the hiQ v. LinkedIn case, in which he ruled in favor of the scraping company and against LinkedIn.
His involvement might not, and should not, be the reason for the ruling in Meta v. Bright Data, but it's interesting nevertheless.
Timeline of events in Meta v. Bright Data
Now that we’re clear on who was involved, let’s dive into the drama.
There were four main events:
- Nov’22: Cease-and-desist initiated by Meta
- Jan’23: Meta sued, then Bright Data countersued
- May’23: First hearing
- Jan’24: Ruling issued: Bright Data won
29 November 2022 - Cease-and-desist initiated by Meta
Meta initiated contact with Bright Data via 1) a video call and 2) an email back in November 2022 for a cease-and-desist. In these communications, Meta demanded that Bright Data stop any scraping activity on Meta’s platforms, in this case, Facebook and Instagram.
In our (non-legal) opinion, Meta was mostly going through the motions here to prove that contact had been made, so that it could file a court action later. It surely didn't expect Bright Data to comply with its demands.
And naturally, Bright Data didn’t. This prompted Meta’s next action:
6 January 2023 - Meta sued Bright Data
Meta sent a reminder letter to Bright Data that day, informing them of their "illegal" scraping activities and demanding that Bright Data stop these activities immediately.
On the same day, Meta proceeded to sue Bright Data. It filed a complaint with the US District Court for the Northern District of California, claiming breach of contract and tortious interference.
This is a very important fact: in Meta's view, Bright Data was a user, so its scraping activities violated the Terms of Service (ToS) it had accepted as a user. This was also a key consideration for Judge Chen and explains the rationale of his ruling - more on that later.
20 January 2023 - Bright Data countersued Meta
In return, Bright Data filed its own complaint with the US District Court for the District of Delaware.
Happenings in the subsequent months
The usual legal administrative and procedural steps followed from here on. To save you the trouble of going through these months, here are some highlights and references:
- Meta’s motion was submitted on 6 Jan 2023
- Exact timeline of the case
- Case reassigned to legendary web-scraping judge - Judge Chen
- The first hearing on 9 May 2023
- Covid-19 has transformed the legal world too: the proceedings in this case were mostly held over Zoom
D-Day: the ruling
Finally, on 23 Jan 2024, Judge Chen issued his ruling, in favor of Bright Data.
Now what did all this mean for web scraping? Is it finally legal? Fully legal, half? Has the case concluded for good?
That's what we're going to discuss next:
Key points and summaries - what it meant for web scraping
Nothing is set in stone, and there are still many unanswered questions and grey areas. But from what we've seen so far, you appear to be on much safer legal ground as long as your scraping:
- Involves only public data and not private data
- Isn’t done while logged in, i.e. you're not acting as a user of the platform that owns the data
As mentioned earlier, keep in mind that Meta sued for breach of contract, i.e. it sued Bright Data as a user of its products. This was a key point in Judge Chen's ruling.
That said, here are three highlights followed by the judge’s opinions/ruling:
Highlight #1: Bright Data scraped Meta while being a user
How was Bright Data a user? Well, they basically just had some Facebook and Instagram accounts for their corporate and branding purposes as any company would.
As a user of any platform, you generally need to comply with its Terms of Service (ToS). According to Meta, this meant that Bright Data, being a user and scraping the platforms as a user, had violated Meta’s ToS.
In Bright Data’s view, however, those Facebook and Instagram accounts were meant as company profiles for branding and marketing purposes. They had nothing to do with Bright Data’s scraping activities.
The judge duly agreed.
Note: due to the case, Bright Data has since deleted all their Facebook & Instagram accounts. Try searching for them and you'll find user-created "fake" accounts under the name of Bright Data.
Highlight #2: Bright Data scraped Meta while not being a user
Now, once Bright Data had deleted all its Facebook and Instagram profiles, it was no longer a user.
Still, Meta argued that Bright Data remained bound by the ToS as an ex-user or non-user, and thus its scraping activities on Meta's platforms were still illegal.
Interestingly, Judge Chen again ruled in favor of Bright Data on this point.
Why interestingly? The common view has been that any scraping activity, by a user or not, constitutes a violation of the ToS, simply because the ToS is readily available and accessible, so scrapers are expected to know about it. After all, the ToS is intended as a protection for platforms.
So the fact that the judge ruled in favor of the web scraping company, Bright Data, on this point is a huge win for the scraping community.
Highlight #3: Bright Data scraped public data without being logged in
This remains a very important aspect of web-scraping-related cases, as it was in hiQ v. LinkedIn. Because precedent has already been set that web scraping of public data is legal, social platforms can’t limit access to public data: as the name suggests, it belongs to the public.
To quote Bright Data’s motion:
"This case is all about public data: whether the public has the right to search public information, or whether Meta can use the courts as a tool to eviscerate that right, even where Meta does not own the data at issue and has no property rights in it...
If Bright Data loses this case, the losers are not just Bright Data but the public, whose rights are being taken away."
The last line is quite epic.
Parallels between Meta v. Bright Data, hiQ v. LinkedIn, and another 2008 case: Facebook v. Power Ventures
Looking at the three high-profile web scraping cases of hiQ v. LinkedIn, Facebook v. Power Ventures (which started as far back as 2008), and this Meta v. Bright Data case, it is very clear that scraping public data while logged out is legal in the eyes of the courts.
In Facebook v. Power Ventures, the scraping itself wasn’t what was found illegal. Rather, it was Power Ventures' wholesale copying of Facebook’s copyrighted page designs and interface that caused Power Ventures to lose the case.
In hiQ v. LinkedIn, similarly, the scraping was deemed legal. But it was hiQ’s creation of fake accounts that got them into trouble.
And now in Bright Data’s case, they emerged the full winner because they steered clear of creating fake accounts for scraping purposes and, unlike Power Ventures, avoided any copyright infringement. In other words, they scraped public data as a non-user, without being logged in.
Also, did you know that Meta actually employed Bright Data’s services before to scrape other websites? That's a turn of events.
By the way, the worst of it was done by Mantheos, another web scraping company, which got sued and forced to close down because they fraudulently created fake LinkedIn accounts and fabricated fake debit cards to get access to LinkedIn Sales Navigator and then scraped millions of those private profiles.
Ultimately, we’re still treading along a fine line when it comes to web scraping
And that is why platforms continue to sue scraping companies year after year after year.
Even with this clear win by Bright Data, and by extension the scraping community, we shouldn’t expect Meta to accept defeat just like that.
As with any legal case, the losing party will usually submit an appeal, which will lead to another few months or even years of legal battle before it all settles for good.
That happened in hiQ v. LinkedIn too: LinkedIn appealed the court’s decision, the case changed courts a few times, the ruling was backtracked, and it all went on for years before both parties settled privately in December 2022.
Web scraping aside, there are better ways of getting data at scale from platforms like LinkedIn and Facebook without all the headaches - one such way is via APIs like ours at Proxycurl, or buying datasets directly.
Vendors like us take care of the complicated, dangerous, and ever-changing landscape of web scraping, including circumventing creative scraping blockers employed by these platforms like CAPTCHAs & IP blocking, while you focus on getting quality data to build your applications.
We hope this non-legal legal commentary helped shed light on this pivotal case and what it means for you if you're into web scraping.
The next step?
Use a data vendor like Proxycurl or Bright Data instead. Create your Proxycurl account, grab your API key, and test out the APIs for yourself.
P.S. Have any questions? No problem, we have answers. Just reach out to us at "[email protected]" and we'll get you taken care of ASAP!