PogoWasRight.org

Article: The Great Scrape: The Clash Between Scraping and Privacy

Posted on October 10, 2025 by Dissent

Abstract

Artificial intelligence (AI) systems depend on massive quantities of data, often gathered by “scraping”—the automated extraction of large amounts of data from the internet. A great deal of scraped data contains people’s personal information. This personal data provides the grist for AI tools such as facial recognition, deep fakes, and generative AI. Although scraping enables web searching, archiving of records, and meaningful scientific research, scraping for AI can also be objectionable and even harmful to individuals and society.
Organizations are scraping at an escalating pace and scale, even though many privacy laws are seemingly incongruous with the practice. In this Article, we contend that scraping must undergo a serious reckoning with privacy law. Scraping violates nearly all of the key principles of privacy laws, including fairness, individual rights and control, transparency, consent, purpose specification and secondary use restrictions, data minimization, onward transfer, and data security. Scraping ignores the data protection laws built around these requirements.
Scraping has evaded a reckoning with privacy law largely because scrapers act as if all publicly available data were free for the taking. But the public availability of scraped data shouldn’t give scrapers a free pass. Privacy law regularly protects publicly available data, and privacy principles are implicated even when personal data is accessible to others.
This Article explores the fundamental tension between scraping and privacy law. With the zealous pursuit and astronomical growth of AI, we are in the midst of what we call the “great scrape.” There must now be a great reconciliation.
Citation and Free Download of Article at:
Solove, Daniel J. and Hartzog, Woodrow, The Great Scrape: The Clash Between Scraping and Privacy (July 03, 2024). 113 California Law Review 1521 (2025), Available at SSRN: https://ssrn.com/abstract=4884485 or http://dx.doi.org/10.2139/ssrn.4884485

Category: Misc

