Tor is an imperfect privacy platform. Ars meets the researchers trying to replace it.
J.M. Porup – 31/8/2016, 07:25
Since Edward Snowden stepped into the limelight from a hotel room in Hong Kong three years ago, use of the Tor anonymity network has grown massively. Journalists and activists have embraced the anonymity the network provides as a way to evade the mass surveillance under which we all now live, while citizens in countries with restrictive Internet censorship, like Turkey or Saudi Arabia, have turned to Tor in order to circumvent national firewalls. Law enforcement has been less enthusiastic, worrying that online anonymity also enables criminal activity.
Tor’s growth in users has not gone unnoticed, and today the network first dubbed “The Onion Router” is under constant strain from those wishing to identify anonymous Web users. The NSA and GCHQ have been studying Tor for a decade, looking for ways to penetrate online anonymity, at least according to documents leaked by Snowden. In 2014, the US government paid Carnegie Mellon University to run a series of poisoned Tor relays to de-anonymise Tor users. A 2015 research paper outlined an attack effective, under certain circumstances, at decloaking Tor hidden services (now rebranded as “onion services”). Most recently, 110 poisoned Tor hidden service directories were discovered probing .onion sites for vulnerabilities, most likely in an attempt to de-anonymise both the servers and their visitors.
Cracks are beginning to show; a 2013 analysis by researchers at the US Naval Research Laboratory (NRL), who helped develop Tor in the first place, concluded that “80 percent of all types of users may be de-anonymised by a relatively moderate Tor-relay adversary within six months.”
Despite this conclusion, the lead author of that research, Aaron Johnson of the NRL, tells Ars he would not describe Tor as broken—the issue is rather that it was never designed to be secure against the world’s most powerful adversaries in the first place.
“It may be that people’s threat models have changed, and it’s no longer appropriate for what they might have used it for years ago,” he explains. “Tor hasn’t changed, it’s the world that’s changed.”
Tor’s weakness to traffic analysis attacks is well-known. The original design documents highlight the system’s vulnerability to a “global passive adversary” that can see all the traffic both entering and leaving the Tor network. Such an adversary could correlate that traffic and de-anonymise every user.
But as the Tor project’s cofounder Nick Mathewson explains, the problem of “Tor-relay adversaries” running poisoned nodes means that a theoretical adversary of this kind is not the network’s greatest threat.
“No adversary is truly global, but no adversary needs to be truly global,” he says. “Eavesdropping on the entire Internet is a several-billion-dollar problem. Running a few computers to eavesdrop on a lot of traffic, a selective denial of service attack to drive traffic to your computers, that’s like a tens-of-thousands-of-dollars problem.”
At the most basic level, an attacker who runs two poisoned Tor nodes—one entry, one exit—is able to analyse traffic and thereby identify the tiny, unlucky percentage of users whose circuit happened to cross both of those nodes. At present the Tor network offers, out of a total of around 7,000 relays, around 2,000 guard (entry) nodes and around 1,000 exit nodes. So the odds of such an event happening are one in two million (1/2000 x 1/1000), give or take.
But, as Bryan Ford, professor at the Swiss Federal Institute of Technology in Lausanne (EPFL), who leads the Decentralised/Distributed Systems (DeDiS) Lab, explains: “If the attacker can add enough entry and exit relays to represent, say, 10 percent of Tor’s total entry-relay and exit-relay bandwidth respectively, then suddenly the attacker is able to de-anonymise about one percent of all Tor circuits via this kind of traffic analysis (10 percent x 10 percent).”
“Given that normal Web-browsing activity tends to open many Tor circuits concurrently (to different remote websites and HTTP servers) and over time (as you browse many different sites),” he adds, “this means that if you do any significant amount of Web browsing activity over Tor, and eventually open hundreds of different circuits over time, you can be virtually certain that such a poisoned-relay attacker will trivially be able to de-anonymise at least one of your Tor circuits.”
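The arithmetic behind both figures can be checked with a short sketch. It rests on two simplifying assumptions that are not from the interviews: relays are picked in proportion to the attacker's share, and every circuit is an independent draw. Real Tor weights relays by bandwidth and keeps the same guard for a while, so treat the output as a rough illustration rather than a measurement. The function names are hypothetical.

```python
# Illustrative arithmetic only; function names are hypothetical, not part of Tor.


def circuit_compromise_probability(entry_share: float, exit_share: float) -> float:
    """Chance that one circuit uses both a poisoned entry and a poisoned exit."""
    return entry_share * exit_share


def at_least_one_compromised(per_circuit: float, circuits: int) -> float:
    """Chance that at least one of `circuits` independent circuits is compromised."""
    return 1.0 - (1.0 - per_circuit) ** circuits


if __name__ == "__main__":
    # One poisoned guard among ~2,000 and one poisoned exit among ~1,000.
    single_pair = circuit_compromise_probability(1 / 2000, 1 / 1000)
    print(f"single poisoned pair: 1 in {1 / single_pair:,.0f}")

    # Ford's scenario: 10 percent of entry bandwidth and 10 percent of exit bandwidth.
    p = circuit_compromise_probability(0.10, 0.10)
    for n in (1, 100, 300, 500):
        print(f"{n:>3} circuits: {at_least_one_compromised(p, n):.1%}")
```

Under these assumptions, a one percent chance per circuit grows to roughly 63 percent after 100 circuits and passes 99 percent by 500, which is the sense in which heavy browsing makes at least one de-anonymised circuit "virtually certain."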
For a dissident or journalist worried about a visit from the secret police, de-anonymisation could mean arrest, torture, or death.
As a result, these known weaknesses have prompted academic research into how Tor could be strengthened or even replaced by some new anonymity system. The priority for most researchers has been to find better ways to prevent traffic analysis. While a new anonymity system might be equally vulnerable to adversaries running poisoned nodes, better defences against traffic analysis would make those compromised relays much less useful and significantly raise the cost of de-anonymising users.
The biggest hurdle? Despite the caveats mentioned here, Tor remains one of the better solutions for online anonymity, supported and maintained by a strong community of developers and volunteers. Deploying and scaling something better than Tor in a real-world, non-academic environment is no small feat.
What Tor does really well
Tor was designed as a general-purpose anonymity network optimised for low-latency, TCP-only traffic. Web browsing was, and remains, the most important use case, as evidenced by the popularity of the Tor Browser Bundle. This popularity has created a large anonymity set in which to hide—the more people who use Tor, the more difficult it is to passively identify any particular user.
But that design comes at a cost. Web browsing requires low enough latency to be usable. The longer it takes for a webpage to load, the fewer the users who will tolerate the delay. In order to ensure that Web browsing is fast enough, Tor sacrifices some anonymity for usability and to cover traffic. Better to offer strong anonymity that many people will use than perfect anonymity that’s too slow for most people’s purposes, Tor’s designers reasoned.
“There are plenty of places where if you’re willing to trade off for more anonymity with higher latency and bandwidth you’d wind up with different designs,” Mathewson says. “Something in that space is pretty promising. The biggest open question in that space is, ‘what is the sweet spot?’
“Is chat still acceptable when we get into 20 seconds of delay?” he asks. “Is e-mail acceptable with a five-minute delay? How many users are willing to use that kind of a system?”
Mathewson says he’s excited by some of the anonymity systems emerging today but cautions that they are all still at the academic research phase and not yet ready for end users to download and use.
Ford agrees: “The problem is taking the next big step beyond Tor. We’ve gotten to the point where we know significantly more secure is possible, but there’s still a lot of development work to make it really usable.”
Can Tor be replaced?
After interviewing numerous leading anonymity researchers for this article, one thing becomes clear: Tor is not going away any time soon. The most probable future we face is a world in which Tor continues to offer a good-but-not-perfect, general-purpose anonymity system, while new anonymity networks arrive offering stronger anonymity optimised for particular use-cases, like anonymous messaging, anonymous filesharing, anonymous microblogging, and anonymous voice-over-IP.
Nor is the Tor Project standing still. Tor today is very different from the first public release more than a decade ago, Mathewson is quick to point out. That evolution will continue.
“It’s been my sense for ages that the Tor we use in five years will look very different from the Tor we use today,” he says. “Whether that’s still called Tor or not is largely a question of who builds and deploys it first. We are not stepping back from innovation. I want better solutions than we have today that are easier to use and protect people’s privacy.”
The following five projects are breaking new ground in developing stronger anonymity systems. Here’s a rundown of the big ideas in this space, the current status of each project, and speculation from the researchers about when we might expect to see real-world deployment.
Herd: Signal without the metadata
The twin Aqua/Herd projects look closest to real-world deployment, so let’s start there. Aqua, short for “Anonymous Quanta,” is an anonymous file-sharing network design. Herd, based on Aqua, and with similar anonymity properties, is an anonymous voice-over-IP network design—“Signal without the metadata,” as its project leader Stevens Le Blond, a research scientist at the Max Planck Institute for Software Systems (MPI-SWS) in Germany, explains to Ars.
Le Blond reports that his team has implemented a working Herd prototype at MPI-SWS and, together with colleagues from Northeastern University in the United States, has just raised half a million dollars from the US National Science Foundation to deploy Herd, Aqua, and other anonymity systems over the next three years. With funding in hand, Le Blond hopes to see the first Herd nodes online and ready for users in 2017.
Both Herd and Aqua work by padding traffic with “chaff”—random noise that makes a user’s traffic indistinguishable from any other user on the network. Unlike Tor, which can, with difficulty, be used for VoIP in a CB-radio kind of way, Herd promises usable, secure, anonymous VoIP calls.
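The idea of chaff can be illustrated with a generic constant-rate padding sketch. This is not Herd's or Aqua's actual protocol: the cell size, the one-byte length header, and every function name below are assumptions made for the example. A client emits exactly one fixed-size cell per tick, carrying real data when it has some and random filler when it does not.

```python
import os
from collections import deque

CELL_SIZE = 64  # hypothetical fixed cell size in bytes, header included


def make_cell(payload: bytes = b"") -> bytes:
    """Build one fixed-size cell: 1-byte length header, payload, random padding."""
    if len(payload) > CELL_SIZE - 1:
        raise ValueError("payload does not fit in one cell")
    padding = os.urandom(CELL_SIZE - 1 - len(payload))
    return bytes([len(payload)]) + payload + padding


def read_cell(cell: bytes) -> bytes:
    """Recover the payload; a chaff cell yields an empty payload."""
    return cell[1:1 + cell[0]]


def constant_rate_stream(payloads, ticks: int) -> list:
    """Emit exactly one cell per tick: real data if queued, chaff otherwise."""
    queue = deque(payloads)
    return [make_cell(queue.popleft()) if queue else make_cell() for _ in range(ticks)]


if __name__ == "__main__":
    busy = constant_rate_stream([b"hi", b"there"], ticks=5)
    idle = constant_rate_stream([], ticks=5)
    # Both users produce the same number of cells, all of the same size.
    print(len(busy), len(idle), {len(c) for c in busy + idle})
```

In a real deployment each cell would also be encrypted, so the length header would be hidden and an observer could not tell a busy user from an idle one by count, size, or content.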
“Aqua and Herd attempt to reconcile efficiency and anonymity by designing, implementing, and deploying anonymity networks which provide low latency and/or high bandwidth without sacrificing anonymity,” Le Blond says.
According to Ford, the Herd/Aqua projects offer the most incremental advances in anonymity technology. “It’s not inconceivable that something like Aqua or Herd could replace Tor,” he says.
Vuvuzela/Alpenhorn: Metadata-free chat
Vuvuzela, named after the buzzy horn common at soccer matches in Africa and Latin America, and its second iteration, Alpenhorn, aim to offer anonymous, metadata-free chat. The best available metadata-free chat application today is Ricochet, while the abandoned Pond project also looked promising for a while. Alpenhorn, however, will offer stronger privacy guarantees, according to project leader David Lazar.
“Pond and Ricochet rely on Tor, which is vulnerable to traffic-analysis attacks,” Lazar says. “Vuvuzela is a new design that protects against traffic analysis and has formalised privacy guarantees.”
“Our experiments show that Vuvuzela and Alpenhorn can scale to millions of users,” he adds, “and we’re currently working on deploying a public beta.”
Developed by a team of researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), Vuvuzela’s approach to anonymous chat is to encrypt the metadata that it can, add noise to the metadata that it can’t, and use differential privacy to analyse how much anonymity this noise provides.
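The noise step can be sketched with the Laplace distribution, a standard tool in differential privacy. Whether Vuvuzela's mechanism matches this sketch is an assumption, and the mean, scale, and function names are hypothetical. The idea is that a server pads each round with a random number of dummy messages, so the count an observer sees reveals little about how many real messages were sent.

```python
import random


def laplace_sample(scale: float, rng: random.Random) -> float:
    """Draw from a zero-centred Laplace distribution.

    The difference of two independent exponential draws with mean `scale`
    is Laplace-distributed with that scale.
    """
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def dummy_message_count(mean: float, scale: float, rng: random.Random) -> int:
    """Number of dummy messages to add this round; never negative."""
    return max(0, round(mean + laplace_sample(scale, rng)))


def observed_count(real_messages: int, mean: float, scale: float,
                   rng: random.Random) -> int:
    """What a network observer sees: real traffic plus noise."""
    return real_messages + dummy_message_count(mean, scale, rng)


if __name__ == "__main__":
    rng = random.Random()
    # Hypothetical parameters: about 100 dummies per round, scale 20.
    for real in (0, 1, 10):
        print(real, observed_count(real, mean=100.0, scale=20.0, rng=rng))
```

With noise this large relative to one user's contribution, rounds with zero, one, or ten real messages produce overlapping observed counts. Differential privacy is the framework for quantifying how much an observer can still infer from them.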
Alpenhorn checks in at just over 3,000 lines of Go. At scale, the network promises latency of ~37 seconds per message, assuming around a million simultaneous users, with a throughput of 60,000 messages per second.
The Alpenhorn team will present their research at the 2016 Usenix Symposium on Operating Systems Design and Implementation (OSDI) in November.
“We are currently working on the final version of the Alpenhorn paper and on making the Vuvuzela and Alpenhorn code ready for production,” Lazar says. “In the meantime, users can sign up to our e-mail list for updates on our progress.”
Dissent: The strongest-available anonymity
With strong anonymity guarantees come trade-offs in latency and bandwidth. Bryan Ford’s Dissent Project made a bit of a splash a few years ago by pushing the slider to 11 on the anonymity scale. Dissent’s proof-of-concept offers cryptographically provable anonymity—but at a high cost in terms of scalability and usability.
Unlike Tor’s onion-routing model, Dissent is based on a dining cryptographers algorithm, or “DC-net.” Combined with a verifiable-shuffle algorithm, Dissent offers the most anonymous design currently under investigation by researchers.
Dissent’s high latency and low bandwidth make it most suitable for, well, dissent. The project’s optimal use-case is one-to-many broadcasting that does not require real-time interaction, such as blogging, microblogging (anonymous Twitter, anyone?), or even IRC.
DC-nets work because when one client wishes to broadcast a message, all other clients on the network must broadcast a message of the same size. This creates a large bandwidth-overhead, and at present Dissent only scales to a few thousand users—although Ford says his team is working on improving the efficiency of the algorithm.
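A single round of a textbook dining-cryptographers network can be sketched in a few lines. This is the classic XOR construction, not Dissent's actual protocol, which adds verifiable shuffles and other machinery; the function names are hypothetical. Every pair of clients shares a secret random pad. Each client broadcasts the XOR of its pads, and the sender also XORs in the message. Since each pad appears in exactly two broadcasts, the pads cancel when everything is combined and only the message remains.

```python
import secrets
from itertools import combinations


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def dc_net_round(num_clients: int, sender: int, message: bytes) -> list:
    """Return every client's broadcast for one round; all have len(message) bytes."""
    size = len(message)
    # One shared secret pad per pair of clients (assumed agreed in advance).
    pads = {pair: secrets.token_bytes(size)
            for pair in combinations(range(num_clients), 2)}
    broadcasts = []
    for client in range(num_clients):
        out = bytes(size)
        for pair, pad in pads.items():
            if client in pair:
                out = xor_bytes(out, pad)
        if client == sender:
            out = xor_bytes(out, message)  # only the sender mixes in the message
        broadcasts.append(out)
    return broadcasts


def combine(broadcasts: list) -> bytes:
    """XOR all broadcasts together; the pairwise pads cancel out."""
    result = bytes(len(broadcasts[0]))
    for b in broadcasts:
        result = xor_bytes(result, b)
    return result


if __name__ == "__main__":
    outputs = dc_net_round(num_clients=5, sender=2, message=b"hello")
    print(combine(outputs))
```

The sketch also shows where the bandwidth overhead comes from: five clients each transmit five bytes to deliver one five-byte message, and the number of shared pads grows with the square of the number of clients.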
Dissent also has the potential to offer what Ford calls “PriFi”—integrated into a corporate or campus WiFi network, the platform could offer provably anonymous Web browsing within that set of users. A passive observer would know someone on campus was viewing a certain website, for example, but they would not be able to identify which user it was. PriFi traffic connecting to the Tor network would offer even stronger anonymity.
The Dissent team is redesigning Dissent and rewriting it in Go. Some of the new components, Ford says, are available on GitHub but are not yet ready as part of a complete system.
“Unfortunately none of this code is really ready for users who want a full ‘anonymity system’ to play with,” Ford says, “but strong hacker-types who might want to help us further develop the pieces and/or help us put them together into usable applications are of course welcome to try them and get in touch.”
Dissent has become something of a touchstone among anonymity researchers. The following two projects were both inspired, in part, by Dissent and a desire to make a more efficient anonymity system while still retaining much of Dissent’s strength.
Riffle: Anonymous filesharing
Like Aqua, Riffle’s main use-case is anonymous filesharing. Contrary to some reports that its new anonymity system could replace Tor, Riffle would, if successfully deployed, not only complement Tor, but perhaps even make it faster by giving users sharing large files a more secure alternative.
“[Riffle is] not a replacement for Tor but complementary to Tor,” says Albert Kwon, a graduate student at MIT and the project’s lead researcher. “We have a very different goal. Our goal is to provide the strongest level of practical anonymity we could think of.”
Kwon’s interest in developing anonymous filesharing has nothing to do with enabling copyright infringement, he explains, but rather a desire to help journalists anonymously share large files and to make it easier for whistleblowers to submit large document sets to publishers.
Over Tor, “sending a very large file in a very short period of time is drastically different from Web browsing,” he says, “and much easier to fingerprint in that sense. I’d like to set up a filesharing group that wants to be anonymous; a lot of journalists are willing to have something like this running.”
Riffle was inspired by the Dissent project and, like Dissent, uses a verifiable shuffle algorithm (hence the name “Riffle”), but forgoes the DC-net crypto primitive in order to make the network more efficient. It could also be used for anonymous microblogging, Kwon says, though as an academic prototype, it is not usable by a “regular person.” He plans to spend the next semester building a public alpha release.
Riposte: An anonymous Twitter
Like Riffle, Riposte was inspired by Dissent, but the design has been fine-tuned for one specific use-case: microblogging.
“This is an example of where, if you’re willing to tailor your system design to an application, you can get much better performance,” says Henry Corrigan-Gibbs, a graduate student at Stanford’s Applied Crypto Group and Riposte’s lead researcher. “You can’t solve all the problems at the same time.”
Riposte maintains the strong anonymity properties of a DC-net, including resistance to both traffic analysis and to disruption attacks by malicious clients, but scales to millions of users. The tradeoff, again, is much higher latency—but, Corrigan-Gibbs says, that may be acceptable for a Twitter-like service.
“Low-latency anonymity is inherently problematic when you’re looking at an adversary that is able to see large parts of—or the interesting parts of—the network,” he explains.
Riposte currently exists as an academic prototype. Corrigan-Gibbs says the team is working on improving the system’s anonymity and security properties. “My hope,” he tells Ars, “is to get some of the ideas from Riposte (if not the code itself) integrated into existing communication platforms for privacy-sensitive users.”
The Riposte team has no plans to deploy the network on its own, at least for now. “I come up with design ideas and prototype the system to show that it works,” he explains. “And it takes a whole ’nother set of important skills to build something. The Tor Project is very impressive to keep this massive distributed system running … and with relatively little funding.”
From research to production
The gap between academia and real-world deployment poses a challenge for researchers wanting to scale their next-gen anonymity prototype in production. Academics in search of tenure face an incentive structure that rewards publishing new ideas and proofs-of-concept—not building software, attracting users, and scaling adoption.
And, as the researchers themselves acknowledge, the skillset required to deploy software at scale is unrelated to their core research background. “Most of the next-generation anonymity work is coming from the research community, which is not normally very good at producing widely usable products,” Ford says. “In my group at EPFL I’m trying to change that, at least locally.”
Mathewson is sympathetic. Tor began as a research paper that he says he expected to deploy for a few years before handing over to someone else. More than 10 years later, the Tor Project has developed deep experience in maintaining a network that, for many dissidents, is critical infrastructure when it’s the only barrier between a seditious tweet or blog post and a visit by the secret police.
His counsel to researchers is this: eat your own dog food.
“I’ve said this publicly before; for me the biggest achievement, the thing I’m waiting to hear from every one of these research groups is, ‘not only did we design it, and used it for testing, but we’re actually using it for our own communications in the lab,’” Mathewson says. “The two best choices we made when we started out were that we aimed to deploy and share it with the world as soon as we could.”
“What you learn about software from running it is like what you learn from food by tasting it,” he explains. “You can’t be a cook who makes recipes and never tastes them. You can’t actually know whether you’ve made a working solution for humans unless you give it to humans, including yourself.”
Without anonymity, democracy crumbles
Today, three years post-Snowden, strong encryption has grown increasingly ubiquitous, channelling more Web traffic than ever before and enabling end-to-end secure communication for a billion WhatsApp and Signal users.
“But the unfortunate thing is that encryption can only help you so much when metadata leaks who you’re talking to, when you’re talking to them, and even suggests what you’re talking to them about,” Chris Soghoian, the principal technologist at the Speech, Privacy, and Technology Project at the American Civil Liberties Union (ACLU), tells Ars.
“We desperately need metadata protection because the kinds of users who need privacy the most—whether it’s journalists, or activists, or LGBT teens in the closet—merely revealing who you’re talking to can be enough to sink you,” he says. “And if people don’t feel free to communicate, feel free to read and to organise and to speak, then democracy crumbles.”
JM Porup is a freelance cybersecurity reporter who lives in Toronto. When he dies his epitaph will simply read “assume breach.” You can find him on Twitter at @toholdaquill.