Meta Platforms, the parent company of Facebook and Instagram, is often in the news over its handling of user data, and matters have only gotten worse for the firm over the last couple of years. Spawning yet another major controversy, Meta Platforms admitted that users in Australia cannot opt out of its mass scraping of their public posts and images on Instagram and Facebook. Here we’ll discuss data policies in some of the major regions and look at ways to keep your data secure online.
The revelations about Meta Platforms scraping the data of Australians without their explicit consent came out when a parliamentary committee grilled the company’s global privacy director, Melinda Claybaugh. Asked whether Meta had been scraping Australian posts from as far back as 2007 for its artificial intelligence (AI) products, Claybaugh initially responded, “We have not done that”.
Meta Platforms Was Scraping Through Public Posts of Australians
Greens senator David Shoebridge probed Claybaugh further and asked “The truth of the matter is that unless you have consciously set those posts to private since 2007, Meta has just decided that you will scrape all of the photos and all of the texts from every public post on Instagram or Facebook since 2007, unless there was a conscious decision to set them on private. That’s the reality, isn’t it?” Claybaugh responded in the affirmative to that question.
The photos, videos and posts of millions of Australians are being used to train Meta's Artificial Intelligence, and today a parliamentary committee has been told that unlike Europeans we can't opt out. pic.twitter.com/CQrfiz3pv2
— 10 News First Sydney (@10NewsFirstSyd) September 11, 2024
Claybaugh did not commit to whether Australians would be given an option to “opt out” like their counterparts in the European Union, Britain, the European Economic Area, and Switzerland. She said that the provisions in the region were “in response to a very specific legal frame” – alluding to the General Data Protection Regulation (GDPR) privacy rules that barred the Facebook parent from training its large language models (LLMs) using data from users based in Europe.
Notably, US tech companies like Meta Platforms, Apple, and Elon Musk’s X (formerly Twitter) have faced intense scrutiny in the EU to varying degrees. For instance, Apple is not offering its flagship Apple Intelligence technology in the region over “regulatory uncertainties.”
Europe Has Stringent Data Privacy Laws
Apple argues, “Specifically, we are concerned that the interoperability requirements of the DMA (Digital Markets Act) could force us to compromise the integrity of our products in ways that risk user privacy and data security. We are committed to collaborating with the European Commission in an attempt to find a solution that would enable us to deliver these features to our EU customers without compromising their safety.” The company also agreed to allow third-party app stores on iOS devices in the EU and, earlier this year, approved AltStore PAL.
Meta, on the other hand, had to offer an ad-free paid version of Facebook and Instagram in the EU amid concerns over user privacy.
Can US Users Opt Out from Meta’s Scraping of Their Data?
To be sure, Australia is not the only place where Meta scrapes data without offering an opt-out. The company offers either limited or no options in most regions. There are some exceptions, like Brazil, which recently barred Meta from training its AI models using the personal data of Brazilians, citing “risks of serious damage and difficulty to users.” Meta subsequently appealed the decision and provided the Brazilian data protection authority (ANPD) with a new compliance plan, after which the suspension was lifted. Earlier this month, Meta said that it would inform Brazilian users about how it uses their data for AI training.
Meta to Increase Transparency on Data Usage in Brazil Following ANPD Ban Lift
Meta has agreed to be more transparent about its data usage policies in Brazil, particularly regarding how it uses user data for AI training. #TechNews #DataProtection
⬇️https://t.co/PzNfc2ZtUI— The Tech Report (@thetechreport) September 5, 2024
Other Latin American countries still haven’t refined their privacy policies to restrain tech companies from scraping the data of their citizens. India, which otherwise has strict data laws, doesn’t have any rules barring Meta Platforms from scraping the public posts of its citizens either, even though it’s the biggest market for Facebook and WhatsApp.
Meta Platforms does not provide an opt-out option to users based in the US and has been scraping their data from an unspecified date. However, users have an option to set their account to private. In its statement earlier this year, Meta said, “While we don’t currently have an opt-out feature, we’ve built in-platform tools that allow people to delete their personal information from chats with Meta A.I. across our apps.”
Separately, Facebook provides a form named “Data Subject Rights for Third Party Information Used for AI at Meta” through which users can control how Meta uses their information from third parties. One of the options in that form is to ask Facebook to delete any personal information it might have sourced from third parties and not use it to train its AI models.
Meta Says It Only Scrapes Public Data
It is worth noting that Meta is not scraping through private chats and images and only using public posts. It also isn’t scraping accounts of minors but it still uses their images if found in public posts of adults.
Meta Platforms doesn’t seem to be breaking the law, especially outside of the EU. However, many users aren’t enthused by the fact that their data is being used for profit without the option to opt out.
Leading AI models need an exorbitant amount of data, so most AI companies have simply been scraping public content online, often without obtaining permission of any kind. This is likely a copyright violation, as uploading content to the internet doesn’t suddenly make it free for anyone to use for profit.
Publishers are on the frontlines of this battle, arguing that AI companies are stealing their hard work for profit. Last year, The New York Times filed a lawsuit accusing AI giants OpenAI and Microsoft of brazenly copying millions of articles from their websites without due permission to train their AI systems.
In a more recent but similar case, journalists Andrea Bartz, Kirk Wallace Johnson, and Charles Graeber have filed a class action suit against AI giant Anthropic in a California court accusing it of using their work without permission to train the company’s Claude chatbot.
Last year, Google also updated its privacy policy to add that it can use scraped web data to train its AI models. Incidentally, in June the company defeated a class action lawsuit that accused it of misusing online social media content and information shared on Google platforms for training its AI systems. Previously, in May, OpenAI and Microsoft also defeated a similar lawsuit.
Data Scraping Regulations Are Sorely Needed
Tech companies have been able to scrape so much public data without repercussions because privacy laws are lax. As Senator Shoebridge aptly said, “There’s a reason that people’s privacy is protected in Europe and not in Australia; it’s because European lawmakers made tough privacy laws. Meta made it clear today that if Australia had similar laws, Australians’ data would also have been protected.”
While some steps have been taken – such as OpenAI giving websites an option to stop its crawlers from scraping their content – much more needs to be done. In the absence of regulations, public data might continue to be cannon fodder for training AI models.