Jordan Pearson reports:
Yahoo Labs, the research wing of Yahoo, just released what the company is calling the “largest ever” machine learning dataset, free for artificial intelligence researchers to use in their work, for example to create a Facebook-like recommendation algorithm.
In doing so, Yahoo also released information that researchers who download the dataset, and anyone they share it with, could use to identify Yahoo customers.
The behemoth dataset consists of 13.5 terabytes of user interactions with news items from some 20 million users, which the company says have been “anonymized.” While there are no names attached to the data, for seven million of those users the dataset also includes their age, gender, the city they were in when they accessed the page, whether they used a mobile device or a desktop, and a timestamp of when they accessed the news item.
Read more on Motherboard.