PogoWasRight.org

Menu
  • About
  • Privacy
Menu

Article: The Great Scrape: The Clash Between Scraping and Privacy

Posted on December 2, 2024 by Dissent

Law professors Daniel J. Solove and Woodrow Hartzog have written an article on an important topic in privacy.

Abstract

Artificial intelligence (AI) systems depend on massive quantities of data, often gathered by “scraping” – the automated extraction of large amounts of data from the internet. A great deal of scraped data is about people. This personal data provides the grist for AI tools such as facial recognition, deep fakes, and generative AI. Although scraping enables web searching, archival, and meaningful scientific research, scraping for AI can also be objectionable or even harmful to individuals and society.

Organizations are scraping at an escalating pace and scale, even though many privacy laws are seemingly incongruous with the practice. In this Article, we contend that scraping must undergo a serious reckoning with privacy law.  Scraping violates nearly all of the key principles in privacy laws, including fairness; individual rights and control; transparency; consent; purpose specification and secondary use restrictions; data minimization; onward transfer; and data security. With scraping, data protection laws built around these requirements are ignored.

Scraping has evaded a reckoning with privacy law largely because scrapers act as if all publicly available data were free for the taking. But the public availability of scraped data shouldn’t give scrapers a free pass. Privacy law regularly protects publicly available data, and privacy principles are implicated even when personal data is accessible to others.

This Article explores the fundamental tension between scraping and privacy law. With the zealous pursuit and astronomical growth of AI, we are in the midst of what we call the “great scrape.” There must now be a great reconciliation.

You can download the full article for free from SSRN:

Solove, Daniel J. and Hartzog, Woodrow, The Great Scrape: The Clash Between Scraping and Privacy (July 03, 2024). 113 California Law Review (forthcoming 2025), Available at SSRN: https://ssrn.com/abstract=4884485 or http://dx.doi.org/10.2139/ssrn.4884485

 

 

No related posts.

Category: BreachesMiscOnlineU.S.

Post navigation

← Atrium Health Substitute Notice Concerning Tracking Pixels Breach
Ninth Circuit rejects challenge to Idaho’s abortion trafficking ban — with some caveats →

Now more than ever

Search

Contact Me

Email: [email protected]

Mastodon: Infosec.Exchange/@PogoWasRight

Signal: +1 516-776-7756

Categories

Recent Posts

  • Department of Justice Subpoenas Doctors and Clinics Involved in Performing Transgender Medical Procedures on Children
  • Google Settles Privacy Class Action Over Period Tracking App
  • ICE Is Searching a Massive Insurance and Medical Bill Database to Find Deportation Targets
  • Franklin, Tennessee Resident Sentenced to 30 Months in Federal Prison on Multiple Cyber Stalking Charges
  • On July 7, Gemini AI will access your WhatsApp and more. Learn how to disable it on Android.
  • German court awards Facebook user €5,000 for data protection violations
  • Record-Breaking $1.55M CCPA Settlement Against Health Information Website Publisher

RSS Recent Posts on DataBreaches.net

  • Avantic Medical Lab hacked; patient data leaked by Everest Group
  • Integrated Oncology Network victim of phishing attack; multiple locations affected (2)
  • HHS’ Office for Civil Rights Settles HIPAA Privacy and Security Rule Investigation with Deer Oaks Behavioral Health for $225k and a Corrective Action Plan
  • HB1127 Explained: North Dakota’s New InfoSec Requirements for Financial Corporations
  • Credit reports among personal data of 190,000 breached, put for sale on Dark Web; IT vendor fined
©2025 PogoWasRight.org. All rights reserved.