In A Nutshell
We’ve been told that our transaction records are “anonymized” by removing our names and other personal details before our credit card companies share the information with outside organizations. But researchers from the Massachusetts Institute of Technology have shown that the locations and dates of just four purchases are sufficient to correctly identify you with over 90 percent accuracy in a database of 1.1 million people with three months of information, even though the data is anonymized. The researchers need only three purchases to identify you if they also have price information. In other words, you have little privacy regardless of what you’ve been told.
The Whole Bushel
Our credit card companies may not understand the extent to which they’re violating our privacy while telling us they aren’t. That’s true even when our transaction records are “anonymized” by removing our names and other personal details before our credit card companies share the information with outside organizations.
The problem stems from metadata, a technical term for data that describes other data. For example, if you call someone from your mobile phone, the metadata would include the time, date, and location of the call. With computers now powerful enough to quickly perform statistical analysis on large amounts of data, it’s not necessary to have the names or numbers of the people involved in the calls. With a surprisingly small amount of metadata, computers can use the information in giant databases to determine, with amazing accuracy, who you are and whom you called.
In fact, about two years ago, researchers from the Massachusetts Institute of Technology (MIT) and the Université catholique de Louvain in Belgium published the results of a study showing that only four data points were needed to identify a mobile phone user with 95 percent accuracy. They used a European database with 1.5 million mobile phone users and 15 months of anonymized data. They could also identify about 50 percent of the mobile phone users from just two data points.
In this case, data points were generated by pings from the users’ mobile phones to nearby cell towers as these people traveled or when they sent or received text messages and calls.
Fast-forward to today and there’s an even more frightening reality. Using a database of about 1.1 million people with three months of information, MIT researchers have found they need the locations and dates of just four of your credit card purchases to correctly identify you with over 90 percent accuracy, even though the data is anonymized. If they also have price information, the researchers need only three purchases to identify you with about 94 percent accuracy.
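One way to read percentages like these is as the share of people whose records become unique once a handful of their purchases are known. The sketch below estimates that share on a toy data set. It is only an illustration under stated assumptions: the function name `unicity`, the data layout, and the shop names are all hypothetical, and this is not the researchers' actual code or data.

```python
import random

def unicity(traces, p, seed=0):
    """Estimate the share of people pinned down by p known purchases.

    traces maps a pseudonymous ID to a set of (shop, date) points.
    For each person with at least p points, draw p of them at random
    and count how many people in the data set contain all p points.
    """
    rng = random.Random(seed)
    eligible = 0
    unique = 0
    for points in traces.values():
        if len(points) < p:
            continue  # not enough purchases to draw p points
        eligible += 1
        sample = set(rng.sample(sorted(points), p))
        # The person always matches themselves, so 1 means unique.
        matches = sum(1 for other in traces.values() if sample <= other)
        if matches == 1:
            unique += 1
    return unique / eligible if eligible else 0.0

# Hypothetical toy data: no names, only pseudonymous IDs.
toy_traces = {
    "id_001": {("bakery", "09-23"), ("restaurant", "09-24")},
    "id_002": {("bakery", "09-23"), ("cinema", "09-24")},
}
share = unicity(toy_traces, p=2)
```

In this toy data both people bought from the same bakery on the same day, so one known purchase may not be enough to tell them apart, while two known purchases separate them completely.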
But it doesn’t stop there. If they have just one of your credit card receipts, your tweet about your new phone, and an Instagram picture of you socializing with your buddies (all data points with location information), they can identify you with about 94 percent accuracy in that million-person database. They can also extract the records of your other credit card transactions, even if no individual in that database is identified by personal information such as name, address, or credit card number.
The researchers gave this example in their study: “Let’s say that we are searching for Scott in a simply anonymized credit card data set. We know two points about Scott: he went to the bakery on 23 September and to the restaurant on 24 September. Searching through the data set reveals that there is one and only one person in the entire data set who went to these two places on these two days . . . Scott is re-identified, and we now know all of his other transactions, such as the fact that he went shopping for shoes and groceries on 23 September, and how much he spent.”
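The search described in that example can be sketched in a few lines. This is a minimal illustration, assuming a toy table of made-up transactions keyed by a pseudonymous ID; the function name `candidates`, the IDs, the shops, and the amounts are all hypothetical and do not come from the study.

```python
from collections import defaultdict

# Hypothetical "anonymized" records: (pseudonymous ID, shop, date, amount).
transactions = [
    ("u1", "bakery", "09-23", 5.50),
    ("u1", "restaurant", "09-24", 42.00),
    ("u1", "shoe store", "09-23", 79.99),
    ("u1", "grocery", "09-23", 31.20),
    ("u2", "bakery", "09-23", 3.10),
    ("u2", "cinema", "09-24", 12.00),
    ("u3", "restaurant", "09-24", 18.75),
    ("u3", "grocery", "09-25", 54.30),
]

def candidates(transactions, known_points):
    """Return the IDs whose records contain every known (shop, date) point."""
    visits = defaultdict(set)
    for user, shop, date, _amount in transactions:
        visits[user].add((shop, date))
    return sorted(u for u, pts in visits.items() if known_points <= pts)

# What we know about Scott from outside the data set.
known = {("bakery", "09-23"), ("restaurant", "09-24")}
matches = candidates(transactions, known)

# A single match means the pseudonym is Scott's, which exposes
# every other transaction recorded under it.
scott_history = []
if len(matches) == 1:
    scott_history = [t for t in transactions if t[0] == matches[0]]
```

Note that the bakery visit alone matches two people in this toy table. It is the combination of two ordinary facts that narrows the search to one.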
It’s even easier to identify women and high-income individuals, possibly because they have distinctive shopping habits.
The lack of privacy with anonymized data, combined with its widespread availability in government and commercial databases, also raises a larger concern: how this information can be used against us, without our knowledge or consent, in matters such as loan applications, insurance dealings, and divorce actions.