An unknown hacker gained administrative control of Sourcegraph, an AI-driven service used by developers at Uber, Reddit, Dropbox, and other companies, and used it to provide free access to resources that normally would have required payment.
In the process, the hacker(s) may have accessed personal information belonging to Sourcegraph users, Diego Comas, Sourcegraph’s head of security, said in a post on Wednesday. For paid users, the information exposed included license keys and the names and email addresses of license key holders. For non-paying users, it was limited to email addresses associated with their accounts. Private code, emails, passwords, usernames, or other personal information were inaccessible.
Free-for-all
The hacker gained administrative access by obtaining an access token that a Sourcegraph developer accidentally included in code published to a public Sourcegraph instance hosted on Sourcegraph.com. After creating a normal Sourcegraph user account, the hacker used the token to elevate the account’s privileges to those of an administrator. The access token appeared in a pull request posted on July 14, the user account was created on August 28, and the elevation to admin occurred on August 30.
“The malicious user, or someone connected to them, created a proxy app allowing users to directly call Sourcegraph’s APIs and leverage the underlying LLM [large language model],” Comas wrote. “Users were instructed to create free Sourcegraph.com accounts, generate access tokens, and then request the malicious user to greatly increase their rate limit. On August 30 (2023-08-30 13:25:54 UTC), the Sourcegraph security team identified the malicious site-admin user, revoked their access, and kicked off an internal investigation for both mitigation and next steps.”
The resource free-for-all generated a spike in calls to Sourcegraph programming interfaces, which are normally rate-limited for free accounts.
A graph showing API usage from July 31 to August 29, with a major spike at the end. (Image: Sourcegraph)
“The promise of free access to Sourcegraph API prompted many to create accounts and start using the proxy app,” Comas wrote. “The app and instructions on how to use it quickly made its way across the web, generating close to 2 million views. As more users discovered the proxy app, they created free Sourcegraph.com accounts, adding their access tokens, and accessing Sourcegraph APIs illegitimately.”
Sourcegraph personnel eventually identified the surge in activity as “isolated and inorganic” and began investigating the cause. Comas said the company’s automated code analysis and other internal control systems “failed to catch the access token being committed to the repository.” Comas didn’t elaborate.
The token gave users the ability to view, modify, or copy the exposed data, but Comas said the investigation didn’t determine whether that actually happened. While most of the data was accessible for all paid and community users, the number of license keys exposed was limited to 20.
The inadvertent posting by developers of private credentials in publicly available code has been a problem plaguing online companies for more than a decade. These credentials can include private encryption keys, passwords, and authentication tokens. In the age of publicly accessible code repositories like GitHub, credentials should never be included in commits. Instead, they should be stored only on restricted servers.
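One common safeguard is to scan text for credential-like strings before it is committed. The following is a minimal sketch of that idea, not a description of Sourcegraph's tooling. The pattern names, the regular expressions, and the `find_secrets` helper are illustrative assumptions, and real scanners use far larger rule sets.

```python
import re

# Hypothetical detection rules for illustration only; real token formats
# vary by service, and production scanners use many more patterns.
PATTERNS = {
    # A key-like variable name assigned a long quoted string.
    "token_assignment": re.compile(
        r"""(api[_-]?key|secret|token|password)\w*\s*[:=]\s*['"][A-Za-z0-9_\-]{20,}['"]""",
        re.IGNORECASE,
    ),
    # The header line of a PEM-encoded private key.
    "private_key_header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_secrets(text):
    """Return (line_number, pattern_name) pairs for lines that look like credentials."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


if __name__ == "__main__":
    # Made-up sample content; the token value is a placeholder, not a real credential.
    sample = 'name = "demo"\naccess_token = "abcdefghijklmnopqrstuvwxyz123456"\n'
    for lineno, name in find_secrets(sample):
        print(f"line {lineno}: possible credential ({name})")
```

A check like this could run before each commit and reject changes that produce findings. Pattern matching of this kind catches only known shapes of secrets, so it complements keeping credentials out of source files rather than replacing that practice.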