PogoWasRight.org

Menu
  • About
  • Privacy
Menu

Article: Linking Users Across Domains with Location Data: Theory and Validation

Posted on April 15, 2016June 26, 2025 by Dissent

Chris Riederer, Yunsung Kim, and Augustin Chaintreau of Columbia University, and Nitish Korula, and Silvio Lattanzi of Google Research have published an article. Here’s the abstract:

Linking accounts of the same user across datasets – even when personally identifying information is removed or unavailable – is an important open problem studied in many contexts. Beyond many practical applications, (such as cross domain analysis, recommendation, and link prediction), understanding this problem more generally informs us on the privacy implications of data disclosure. Previous work has typically addressed this question using either different portions of the same dataset or observing the same behavior across thematically similar domains. In contrast, the general cross-domain case where users have different profiles independently generated from a common but unknown pattern raises new challenges, including difficulties in validation, and remains under-explored.

In this paper, we address the reconciliation problem for location-based datasets and introduce a robust method for this general setting. Location datasets are a particularly fruitful domain to study: such records are frequently produced by users in an increasing number of applications and are highly sensitive, especially when linked to other datasets. Our main contribution is a generic and self-tunable algorithm that leverages any pair of sporadic location-based datasets to determine the most likely matching between the users it contains. While making very general assumptions on the patterns of mobile users, we show that the maximum weight matching we compute is provably correct. Although true cross-domain datasets are a rarity, our experimental evaluation uses two entirely new data collections, including one we crawled, on an unprecedented scale. The method we design outperforms naive rules and prior heuristics. As it combines both sparse and dense properties of location-based data and accounts for probabilistic dynamics of observation, it can be shown to be robust even when data gets sparse.

You can access the full article here.

h/t, Joe Cadillic

No related posts.

Category: Misc

Post navigation

← SJWs Build New Database to Shame People For “Insulting” Online Comments
Oral argument in garbage inspection case →

Now more than ever

Search

Contact Me

Email: info@pogowasright.org

Mastodon: Infosec.Exchange/@PogoWasRight

Signal: +1 516-776-7756

Categories

Recent Posts

  • Microsoft’s controversial Recall feature is now blocked by Brave and AdGuard
  • Trump Administration Issues AI Action Plan and Series of AI Executive Orders
  • Indonesia asked to reassess data privacy terms in new U.S. trade deal
  • Meta Denies Tracking Menstrual Data in Flo Health Privacy Trial
  • Wikipedia seeks to shield contributors from UK law targeting online anonymity
  • British government reportedlu set to back down on secret iCloud backdoor after US pressure
  • Idaho agrees not to prosecute doctors for out-of-state abortion referrals

RSS Recent Posts on DataBreaches.net

  • BreachForums — the one that went offline in April — reappears with a new founder/owner
  • Fans React After NASCAR Confirms Ransomware Breach
  • Allianz Life says ‘majority’ of customers’ personal data stolen in cyberattack
  • Infinite Services notifying employees and patients of limited ransomware attack
  • The safe place for women to talk wasn’t so safe: hackers leak 13,000 user photos and IDs from the Tea app
©2025 PogoWasRight.org. All rights reserved.
Menu
  • About
  • Privacy