Genetic profiling service 23andMe has confirmed that private user data is circulating for sale online after being scraped off its website.
Friday’s confirmation comes five days after an unknown entity took to an online crime forum to advertise the sale of private information for millions of 23andMe users. The forum posts claimed that the stolen data included origin estimation, phenotype, health information, photos, and identification data. The posts claimed that 23andMe’s CEO was aware the company had been “hacked” two months earlier and never revealed the incident.
23andMe officials on Friday confirmed that private data for some of its users is, in fact, up for sale. The cause of the leak, the officials said, is data scraping, a technique that essentially reassembles large amounts of data by systematically extracting smaller amounts of information available to individual users of a service. Attackers gained unauthorized access to the individual 23andMe accounts, all of which had been configured by the user to opt in to a DNA relative feature that allows them to find potential relatives.
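To see why a handful of compromised accounts can expose far more people than the accounts themselves, consider a toy model of the aggregation described above. This is only a sketch: the account names, the `relatives_of` mapping, and the `scraped_profiles` function are hypothetical and do not reflect 23andMe's actual data model or the attackers' tooling.

```python
# Toy model of scraping by aggregation. All names and data are invented
# for illustration; nothing here reflects 23andMe's real systems.

def scraped_profiles(compromised, relatives_of):
    """Return every profile visible from a set of compromised accounts.

    compromised:  iterable of account names the attacker can log in to.
    relatives_of: dict mapping an account to the set of opted-in
                  relatives whose profiles that account can view.
    """
    exposed = set()
    for account in compromised:
        exposed.add(account)                            # the account itself
        exposed.update(relatives_of.get(account, ()))   # plus visible relatives
    return exposed


# Hypothetical relative matches among opted-in users.
relatives_of = {
    "alice": {"bob", "carol", "dave"},
    "erin": {"carol", "frank"},
    "bob": {"alice"},
}

# Two compromised logins are enough to reach six profiles.
exposed = scraped_profiles(["alice", "erin"], relatives_of)
print(len(exposed))
```

In this example, two reused passwords reveal six profiles, four of which belong to people whose own credentials were never compromised.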
In a statement, the officials wrote:
We do not have any indication at this time that there has been a data security incident within our systems. Rather, the preliminary results of this investigation suggest that the login credentials used in these access attempts may have been gathered by a threat actor from data leaked during incidents involving other online platforms where users have recycled login credentials.
We believe that the threat actor may have then, in violation of our terms of service, accessed 23andme.com accounts without authorization and obtained information from those accounts. We are taking this issue seriously and will continue our investigation to confirm these preliminary results.
The DNA relative feature allows users who opt in to view basic profile information of others who also allow their profiles to be visible to DNA Relative participants, a spokesperson said. If the DNA of one opted-in user matches another's, each can access the other's ancestry information.
The crime forum post claimed the attackers obtained “13M pieces of data.” 23andMe officials have provided no details about the leaked information available online, the number of users it belongs to, or where it’s being made available. On Friday, The Record and Bleeping Computer reported that one leaked database contained information for 1 million users of Ashkenazi heritage, all of whom had opted in to the DNA relative service. The Record said a second database included 300,000 users of Chinese heritage who also had opted in.
The data included profile and account ID numbers, names, gender, birth year, maternal and paternal genetic markers, ancestral heritage results, and an indication of whether each user had opted in to 23andMe's health data.
The Record also reported that a researcher recently discovered a flaw on the 23andMe website that allows people who know the profile ID of a user to view that user’s profile photo, name, birth year, and location.
By now, it has become clear that storing genetic information online carries risks. In 2018, MyHeritage revealed that email addresses and hashed passwords for more than 92 million users had been stolen through a breach of its network that occurred seven months earlier.
That same year, law enforcement officials in California said they used a different genealogy site to track down a long-sought suspect in a string of grisly murders that occurred 40 years earlier. Investigators matched DNA left at a crime scene with the suspect’s DNA. The suspect had never submitted a sample to the service, which is known as GEDMatch. Instead, the match was made with a GEDMatch user related to the suspect.
While there are benefits to storing genetic information online so people can trace their heritage and track down relatives, there are clear privacy threats. Even if a user chooses a strong password and uses two-factor authentication as 23andMe has long urged, their data can still be swept up in scraping incidents like the one recently confirmed. The only sure way to protect it from online theft is to not store it there in the first place.