13 November 2009
I am currently listening to the presentation of:
- Dominik Herrmann, Rolf Wendolsky, Hannes Federrath, “Website Fingerprinting: Attacking Popular Privacy Enhancing Technologies with the Multinomial Naive-Bayes Classifier” ACM CCSW 2009. Chicago, US.
This work looks at the classic problem of identifying browsed pages through SSL encryption or low-latency anonymization networks like Tor. The authors cast the problem of identifying web pages as a supervised learning and classification problem. Instead of applying a Jaccard coefficient or a plain naive Bayes classifier, the authors use a multinomial naive Bayes classifier to learn and identify the pages from their sizes. Furthermore, they filter the raw page sizes using techniques inspired by information retrieval, such as taking the logarithms of the sizes to avoid large bias, or cosine normalisation to make the overall size of the website irrelevant.
That is yet another fine example of casting a traffic analysis problem in terms of machine learning and Bayesian inference.
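To make the idea concrete, here is a minimal sketch of that kind of pipeline in Python. It is not the authors' implementation: the toy traces, the labels `site_a` and `site_b`, the `features` helper and the smoothing parameter `alpha` are all my own assumptions, chosen only to show a log transform, cosine normalisation and a multinomial naive Bayes decision rule working together on size histograms.

```python
import math
from collections import Counter


def features(sizes):
    """Histogram of observed sizes, log-scaled and cosine-normalised."""
    counts = Counter(sizes)
    # The log transform dampens the influence of very frequent sizes.
    logged = {s: math.log(1 + n) for s, n in counts.items()}
    # Cosine normalisation: scale the vector to unit Euclidean length.
    norm = math.sqrt(sum(v * v for v in logged.values()))
    return {s: v / norm for s, v in logged.items()} if norm else {}


class MultinomialNB:
    """Multinomial naive Bayes over (fractional) size counts."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)

    def fit(self, traces, labels):
        self.priors = Counter(labels)
        self.weights = {c: Counter() for c in self.priors}
        self.vocab = set()
        for trace, label in zip(traces, labels):
            for size, value in features(trace).items():
                self.weights[label][size] += value
                self.vocab.add(size)
        self.totals = {c: sum(w.values()) for c, w in self.weights.items()}
        return self

    def predict(self, trace):
        x = features(trace)
        n_docs = sum(self.priors.values())
        best, best_score = None, -math.inf
        for c in self.priors:
            denom = self.totals[c] + self.alpha * len(self.vocab)
            score = math.log(self.priors[c] / n_docs)
            for size, value in x.items():
                p = (self.weights[c].get(size, 0.0) + self.alpha) / denom
                score += value * math.log(p)
            if score > best_score:
                best, best_score = c, score
        return best


# Hypothetical training traces: lists of sizes seen while loading each site.
train = [
    ([1500, 1500, 1500, 600, 52], "site_a"),
    ([1500, 1500, 1500, 1500, 600, 52, 52], "site_a"),
    ([300, 300, 52, 52, 52, 980], "site_b"),
    ([300, 52, 52, 980, 980], "site_b"),
]
clf = MultinomialNB().fit([t for t, _ in train], [l for _, l in train])
guess_a = clf.predict([1500, 1500, 600, 52])
guess_b = clf.predict([300, 52, 52, 980])
```

On this toy data the two unseen traces are attributed to `site_a` and `site_b` respectively. Real traces are of course far noisier, which is where the normalisation steps earn their keep.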
A very interesting comparison of different single-hop and multi-hop anonymity systems is presented. Single-hop systems are broken with probability over 95%. Multi-hop systems are pretty robust, with JonDonym traffic being identified only 20% of the time, and Tor about 3% of the time. Why is Tor doing so well? It turns out that the cell quantization adds noise and messes up the attack. It is not clear if this is a fundamental finding, or whether a better / more specific attack could do better.
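One intuition for why quantization hurts a size-based classifier can be sketched in a few lines of Python. This is only an illustration under my own assumptions: it models nothing but rounding each size up to a whole number of fixed 512-byte cells (Tor's cell size), and the two size lists are made up. It does not reproduce the paper's experiment, and it ignores every other source of noise.

```python
CELL_SIZE = 512  # bytes; Tor's fixed cell size, used here as the quantum


def quantize(size, cell=CELL_SIZE):
    """Round a size up to a whole number of cells (ceiling division)."""
    return -(-size // cell) * cell


# Two hypothetical pages whose raw sizes are clearly different...
sizes_a = [700, 1400, 90]
sizes_b = [600, 1100, 300]

# ...become indistinguishable once every size is padded to full cells.
quantized_a = [quantize(s) for s in sizes_a]
quantized_b = [quantize(s) for s in sizes_b]
```

Both lists end up as `[1024, 1536, 512]`, so a classifier that only sees quantized sizes has less to work with. Whether a more specific attack can recover the lost information is exactly the open question above.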
Datasets should be available soon. Sadly I cannot find the paper on-line. Put your work on the web, people!
11 November 2009
I am currently sitting in the anonymity techniques session of ACM CCS 2009 in Chicago, and thought I might as well provide a roadmap to what is being presented here regarding traffic analysis and anonymity. Aside from the main event, WPES was also hosted on Monday and contained quite a few relevant papers.
- On the risks of serving whenever you surf: Vulnerabilities in Tor’s blocking resistance design
Jon McLachlan and Nicholas J. Hopper (WPES)
It turns out that altruism does not pay: if you become a relay while also using an anonymity network, your attack surface grows, and novel traffic analysis attacks can be used to guess what you are surfing. This paper presents a long-term intersection attack, where the times you are on-line can be correlated with observed browsing events (like tweets), as well as an extension of what has now been named clogging attacks, where traffic can be modulated on the Tor client, and observed on a remote server. Very cool! This paper seems to be part of the new strategy of Nick Hopper's group, of presenting an attack paper at WPES, and a solution at CCS (this is understandable as he has been penalised, by not being able to publish at PETS 2010, as he is a PC chair). Part of the solution against this attack is presented in “Membership-concealing overlay networks” later in the conference.
- XPay: Practical anonymous payments for Tor routing and other networked services (not on-line!)
Yao Chen, Radu Sion and Bogdan Carbunar (WPES)
A controversial payment mechanism for paying routers that relay traffic in Tor. More details will follow when I manage to get a copy of the paper, as the proceedings of WPES are not available yet (*cough*cock-up*cough*).
- Hashing it out in public: Common failure modes of DHT-based anonymity schemes
Andrew Quoc Tran, Nicholas J. Hopper and Yongdae Kim (WPES)
Many have tried to build anonymity systems over DHTs, providing some of us with an endless supply of systems on which to perform traffic analysis. This work looks at most of the DHT-based anonymity systems, and points out that queries in DHTs are easy to observe, manipulate and DoS given only a small number of corrupt nodes. This is the attack paper that motivated the Torsk system, which will be presented later in the CCS conference.
- NISAN: Network Information Service for Anonymization Networks (not on-line?)
Andriy Panchenko, Arne Rache and Stefan Richter
Directory services are a bit of a drag on anonymity systems, and NISAN is a proposal to use a DHT to distribute the directory functionality. Special care is taken to ensure that the DHT routing resolution mechanism yields truly random nodes, even against an active adversary that controls a fraction of the nodes in the DHT. The authors also controversially argue that the observability of queries is not important, and suggest that bridging and fingerprinting are not a big deal so far. DHTs are scary, and NISAN is likely to come under close scrutiny in the next few years (as long as the authors put the paper on-line!)
- Certificateless Onion Routing
Dario Catalano, Dario Fiore and Rosario Gennaro
The paper proposes the use of certificateless encryption to build circuits in onion routing (or mix) systems. It is unclear how useful this is, as the list of nodes already has to be signed to avoid directory and Sybil attacks, but having different options for doing crypto in anonymity networks is always interesting.
- ShadowWalker: Peer-to-peer Anonymous Communication using Redundant Structured Topologies
Prateek Mittal and Nikita Borisov
After many attack papers, Nikita and Prateek decide to jump in at the deep end, and propose their own DHT-based path construction mechanism for anonymity networks. ShadowWalker is a system that allows secure sampling in a DHT. It relies on commitments on routing tables shared amongst some “shadow” nodes to ensure the adversary cannot adaptively lie about their routing tables, and a tit-for-tat strategy to avoid DoS attacks. Very cool ideas, and a very cool name.
- The Bayesian Traffic Analysis of Mix Networks
Carmela Troncoso and George Danezis
Given a trace of a mix network, what is the probability of Alice talking to Bob? This seemingly simple question turns out to be remarkably hard to answer given the constraints of path selection and a real-world trace of traffic. More details on how to do this are also available in our report: The Application of Bayesian Inference to Traffic Analysis. I will write a post about how I hope this will become the standard way in which to assess the passive information leakage of anonymity networks.
- AS-awareness in Tor Path Selection
Matthew Edman and Paul Syverson
The story so far was that a bigger Tor network is a more secure Tor network. This paper points out it is not so simple: despite having more nodes, Tor still routes most traffic over a small number of ASes (Autonomous Systems), each of which can see both ends of a connection and should be able to do timing attacks. A method to choose paths to mitigate this problem is proposed; more would be welcome.
- Membership-concealing overlay networks
Eugene Vasserman, Rob Jansen, James Tyra, Nicholas Hopper and Yongdae Kim
Finding out remotely who is using an anonymity network might lead to trouble for some users. This work proposes a method to anonymize traffic in a peer-to-peer manner, but without ever connecting to many strangers. It's very nice to see the idea of using social networks for infiltration resistance taking off.
Papers to come tomorrow:
- A New Cell Counter Based Attack Against Tor
Zhen Ling, Junzhou Luo, Wei Yu, Xinwen Fu, Dong Xuan and Weijia Jia
- Scalable Onion Routing with Torsk
Jon McLachlan, Andrew Tran, Nicholas Hopper and Yongdae Kim