I am at PETS 2016 writing this summary in real-time as Tao Wang presents his work with Ian Goldberg on “On Realistically Attacking Tor with Website Fingerprinting”. This is the latest paper addressing challenges identified by the Tor project to make website fingerprinting (WF) research more informative about real-world threats. We have a stake in this since Jamie Hayes from UCL will be presenting a joint paper on “k-fingerprinting: a Robust Scalable Website Fingerprinting Technique”.
The key problem is that a number of reported website fingerprinting results are evaluated in laboratory conditions. Website fingerprinting considers an adversary looking at an encrypted and anonymized stream, and the task is to recognize which website is being accessed through the protections. However, the evaluation of the attacks considers a restricted and stylized setting, where a single client accesses a single (usually landing) page from a server. This is restrictive: it does not consider accessing multiple pages, and it assumes that the start and end of each page load are known and that the training data is up to date.
So this paper considers how to deal with such settings. First, how to deal with noisy data? For example, what if another tab creates some continuous traffic, aka noise, that may confuse the attacker? In this case the attacker will see a combined stream of the signal from a page load superimposed on the noise. Tao illustrates that as the volume of noise increases the accuracy of attacks drops radically. Trivial solutions such as trying to subtract noise are not likely to succeed. So the actual solution is to train on the noisy version of the stream, and that indeed increases the accuracy of the attack back to dangerous levels.
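To make the "train on noisy data" idea concrete, here is a minimal sketch of how a clean training set could be turned into a noisy one. It is not the method from the paper: the trace format (a list of `(timestamp, signed_size)` packets), the constant-rate noise model and all function names are assumptions made for illustration.

```python
import random

def add_noise(page_trace, noise_trace):
    """Superimpose noise on a page load: merge both packet lists by timestamp.

    Each packet is (timestamp_seconds, signed_size); the sign encodes the
    direction (hypothetical convention: + outgoing, - incoming).
    """
    return sorted(page_trace + noise_trace, key=lambda pkt: pkt[0])

def synthetic_noise(duration, rate, rng):
    """Assumed noise model: `rate` incoming packets per second at random times."""
    count = int(duration * rate)
    return [(rng.uniform(0.0, duration), -512) for _ in range(count)]

def noisy_training_set(labelled_traces, rate, seed=0):
    """Return (noisy_trace, label) pairs to feed to any WF classifier."""
    rng = random.Random(seed)
    out = []
    for trace, label in labelled_traces:
        duration = max(t for t, _ in trace)
        noise = synthetic_noise(duration, rate, rng)
        out.append((add_noise(trace, noise), label))
    return out

# One clean page load lasting one second, plus 5 noise packets per second.
clean = [([(0.0, 512), (0.1, -1500), (0.4, -1500), (1.0, -800)], "example.com")]
noisy = noisy_training_set(clean, rate=5.0)
```

The classifier itself is unchanged; only the examples it is trained on now look like what the attacker will see on the wire.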
The second problem is splitting: how can an attacker detect the start and end of loading a specific bundle of resources corresponding to a web page? There can either be a lot of quiet time between loads, no time, or even a negative time when the loads overlap. It turns out that a large gap can be handled with little effect on the true positive rate, as the split is obvious. However, the problem has a generic solution for the other cases: you simply train a classifier to learn where to split, and then apply it. I do wonder if using both incoming and outgoing channels could improve this.
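The easy case, a long quiet period between two page loads, can be sketched in a few lines. This only illustrates time-gap splitting; the `min_gap` threshold is an assumed value, and the harder cases (no gap, or overlapping loads) need the learned splitter described above.

```python
def split_on_gaps(timestamps, min_gap=1.0):
    """Split sorted packet timestamps wherever the quiet time between two
    consecutive packets exceeds min_gap seconds."""
    segments = []
    current = []
    for t in timestamps:
        if current and t - current[-1] > min_gap:
            segments.append(current)
            current = []
        current.append(t)
    if current:
        segments.append(current)
    return segments

# Three bursts of packets separated by several seconds of silence.
loads = split_on_gaps([0.0, 0.1, 0.3, 5.0, 5.2, 5.3, 12.0], min_gap=2.0)
```

Each returned segment would then be handed to the fingerprinting classifier as one candidate page load.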
Finally, they address the issue of maintaining a training set, since it is expensive to collect such data continuously. So either you have a small fresh training set, or a larger but out of date training set. There is a trade-off, but Tao demonstrates that even old-ish data (10 days old) still provides good accuracy. In conclusion, it seems that those roadblocks can be tackled to make WF practical, and there is an increased call to implement defenses.
My meta-conclusion is twofold: first, it is clear that the roadblocks first presented in the Tor project blog post are indeed issues, but they are not a show stopper for website fingerprinting. It took the community about 2 years to mature into using more modern machine learning (ML) techniques, and this work illustrates this. Yet, my second conclusion is that the maturity of the ML usage for fingerprinting is still low: it is pretty self-evident that to be robust to noise one should train on noisy data, and I am glad this is now explicit. It is also self-evident that we can take small data sets of pages and network conditions and synthesize a very large set of training examples to make learning even more robust to noise and better at detecting splits. No one has done this yet; maybe that would be a nice paper for PETS next year.
3 February 2016
(This is an extract from my contribution to Harper, Richard. “Introduction and Overview”, Trust, Computing, and Society. Ed. Richard H. R. Harper. 1st ed. New York: Cambridge University Press, 2014. pp. 3-14. Cambridge Books Online. Web. 03 February 2016. http://dx.doi.org/10.1017/CBO9781139828567.003)
Cryptography has been used for centuries to secure military, diplomatic, and commercial communications that may fall into the hands of enemies and competitors (Kahn 1996). Traditional cryptography concerns itself with a simple problem: Alice wants to send a message to Bob over some communication channel that may be observed by Eve, but without Eve being able to read the content of the message. To do this, Alice and Bob share a short key, say a passphrase or a poem. Alice then uses this key to scramble (or encrypt) the message, using a cipher, and sends the message to Bob. Bob is able to use the shared key to invert the scrambling (or “decrypt”) and recover the message. The hope is that Eve, without the knowledge of the key, will not be able to unscramble the message, thus preserving its confidentiality.
It is important to note that in this traditional setting we have not removed the need for a secure channel. The shared key needs to be exchanged securely, because its compromise would allow Eve to read messages. Yet, the hope is that the key is much shorter than the messages subsequently exchanged, and thus easier to transport securely once (by memorizing it or by better physical security). What about the cipher? Should the method by which the key and the message are combined not be kept secret? In “La Cryptographie Militaire” in 1883, Auguste Kerckhoffs stated a number of principles, including that only the key should be considered secret, not the cipher method itself (Kerckhoffs 1883). Both the reliance on a small key and the fact that other aspects of the system are public is an application of the minimization principle we have already seen in secure system engineering. It is by minimizing what has to be trusted for the security policy to hold that one can build and verify secure systems – in the context of traditional cryptography, in principle, this is just a short key.
Kerckhoffs argues that only the key, not the secrecy of the cipher is in the trusted computing base. But a key property of the cipher is relied on: Eve must not be able to use an encrypted message and knowledge of the cipher to recover the message without access to the secret key. This is very different from previous security assumptions or components of the TCB. It is not about the physical restrictions on Eve, and it is not about the logical operations of the computer software and hardware that could be verified by careful inspection. It comes down to an assumption that Eve cannot solve a somehow difficult mathematical problem. Thus, how can you trust a cipher? How can you trust that the adversary cannot solve a mathematical problem?
To speak the truth, this was not a major concern until relatively recently, compared with the long history of cryptography. Before computers, encoding and decoding had to be performed by hand or using electromechanical machines. Usability, speed, cost of the equipment, and lack of decoding errors were the main concerns in choosing a cipher. When it comes to security, it was assumed that if a “clever person” proposes a cipher, then it would take someone much cleverer than them to decode it. It was even sometimes assumed that ciphers were of such complexity that there was “no way” to decode messages without the key. The assumption that other nations may not have a supply of “clever” people may have to do with a colonial ideology of the nineteenth and early twentieth centuries. Events leading to the 1950s clearly contradict this: ciphers used by major military powers were often broken by their opponents.
In 1949, Claude Shannon set out to define what a perfect cipher would be. He wanted it to be “impossible” to solve the mathematical problem underlying the cipher (Shannon 1949). The results of this seminal work are mixed. On the positive side, there is a perfect cipher that, no matter how clever an adversary is, cannot be solved – the one-time pad. On the down side, the key of the cipher is as long as the message, must be absolutely random, and can only be used once. Therefore the advantage of short keys, in terms of minimizing their exposure, is lost and the cost of generating keys is high (avoiding bias in generating random keys is harder than expected). Furthermore, Shannon proves that any cipher with smaller keys cannot be perfectly secure. Because the one-time pad is not practical in many cases, how can one trust a cipher with short keys, knowing that its security depends on the complexity of finding a solution? For about thirty years, the United States and the UK followed a very pragmatic approach to this: they kept the cryptological advances of World War II under wraps; they limited the export of cryptographic equipment and know-how through export regulations; and their signal intelligence agencies – the NSA and GCHQ, respectively – became the largest worldwide employers of mathematicians and the largest customers of supercomputers. Additionally, in their roles in eavesdropping on their enemies’ communications, they evaluated the security of the systems used to protect government communications. The assurance in cryptography came at the cost of being the largest organizations that know about cryptography in the world.
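The one-time pad is simple enough to sketch in a few lines, which also makes its cost visible: the key is as long as the message and must never be reused. This is an illustrative sketch using Python's standard `secrets` module; the function names are made up for this example.

```python
import secrets

def otp_keygen(length):
    """Generate a uniformly random key of exactly `length` bytes."""
    return secrets.token_bytes(length)

def otp_xor(key, data):
    """Encrypt or decrypt by XOR; the same operation does both."""
    if len(key) != len(data):
        raise ValueError("the key must be exactly as long as the message")
    return bytes(k ^ d for k, d in zip(key, data))

message = b"attack at dawn"
key = otp_keygen(len(message))        # as long as the message, used only once
ciphertext = otp_xor(key, message)
recovered = otp_xor(key, ciphertext)  # XOR with the same key undoes the encryption
```

Reusing `key` for a second message would let Eve XOR the two ciphertexts together and cancel the key out, which is why the pad can only be used once.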
The problem with this arrangement is that it relies on a monopoly of knowledge around cryptology. Yet, as we have seen with the advent of commercial telecommunications, cryptography becomes important for nongovernment uses. Even the simplest secure remote authentication mechanism requires some cryptography if it is to be used over insecure channels. Therefore, keeping cryptography under wraps is not an option: in 1977, the NSA approved the IBM design for a public cipher, the Data Encryption Standard (DES), for public use. It was standardized in 1979 by the US National Institute for Standards and Technology (NIST).
The publication of DES launched a wide interest in cryptography in the public academic community. Many people wanted to understand how it works and why it is secure. Yet, the fact that the NSA tweaked its design, for undisclosed reasons, created widespread suspicion in the cipher. The fear was that a subtle flaw was introduced to make decryption easy for intelligence agencies. It is fair to say that many academic cryptographers did not trust DES!
Another important innovation in 1976 was presented by Whitfield Diffie and Martin Hellman in their work “New Directions in Cryptography” (Diffie & Hellman 1976). They show that it is possible to preserve the confidentiality of a conversation over a public channel, without sharing a secret key! This is today known as “Public Key Cryptography,” because it relies on Alice knowing a public key for Bob, shared with anyone in the world, and using it to encrypt a message. Bob has the corresponding private part of the key, and is the only one that can decode messages used with the public key. In 1977, Ron Rivest, Adi Shamir, and Leonard Adleman proposed a further system, the RSA, that also allowed for the equivalent of “digital signatures” (Rivest et al. 1978).
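The Diffie-Hellman idea of agreeing on a secret over a public channel can be sketched with modular exponentiation. The parameters below are toy values chosen for readability (an assumption of this sketch, far too small and too simply chosen for real use), and the function names are illustrative.

```python
import secrets

P = 2**127 - 1   # a Mersenne prime used as a toy modulus (not a secure choice)
G = 3            # public base, assumed for illustration

def dh_keypair():
    """Pick a private exponent and derive the public value G^private mod P."""
    private = secrets.randbelow(P - 3) + 2   # in the range [2, P - 2]
    public = pow(G, private, P)
    return private, public

def dh_shared(private, other_public):
    """Combine our private exponent with the other party's public value."""
    return pow(other_public, private, P)

# Alice and Bob only exchange a_pub and b_pub over the public channel.
a_priv, a_pub = dh_keypair()
b_priv, b_pub = dh_keypair()
alice_secret = dh_shared(a_priv, b_pub)
bob_secret = dh_shared(b_priv, a_pub)
```

Both sides end up with the same value because (G^a)^b and (G^b)^a are equal modulo P, while Eve only sees G^a and G^b.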
What is different in terms of trusting public key cryptography versus traditional ciphers? Both the Diffie-Hellman system and the RSA system base their security on number theoretic problems. For example, RSA relies on the difficulty of factoring integers with two very large factors (hundreds of digits). Unlike traditional ciphers – such as DES – that rely on many layers of complex problems, public key algorithms base their security on a handful of elegant number theoretic problems.
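A textbook RSA example with tiny primes shows how directly the security rests on factoring. This is a sketch for intuition only (no padding, toy key size, Python 3.8+ for the modular inverse); anyone who can factor `n` can recompute the private exponent, as the last lines show.

```python
# Toy key generation with two small primes (real keys use primes of hundreds of digits).
p, q = 61, 53
n = p * q                  # public modulus
phi = (p - 1) * (q - 1)
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent: inverse of e modulo phi

def rsa_encrypt(m):
    return pow(m, e, n)

def rsa_decrypt(c):
    return pow(c, d, n)

def rsa_sign(m):
    return pow(m, d, n)

def rsa_verify(m, s):
    return pow(s, e, n) == m

def factor(modulus):
    """Trial division: feasible only because the modulus is tiny."""
    f = 2
    while modulus % f:
        f += 1
    return f, modulus // f

# Eve's attack: factor n, then recompute the private exponent from public data.
f1, f2 = factor(n)
eve_d = pow(e, -1, (f1 - 1) * (f2 - 1))
```

With a modulus of realistic size the `factor` step is the part nobody knows how to do efficiently, and that is the whole assumption.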
Number theory, a discipline that G.H. Hardy argued at the beginning of the twentieth century was very pure in terms of its lack of any practical application (Hardy & Snow 1967), quickly became the deciding factor on whether one can trust the most significant innovation in the history of cryptology! As a result, a lot of interest and funding directed academic mathematicians to study whether the mathematical problems underpinning public key cryptography were in fact difficult and how difficult the problems were.
Interestingly, public key cryptography does not eliminate the need to totally trust the keys. Unlike traditional cryptography, there is no need for Bob to share a secret key with Alice to receive confidential communications. Instead, Bob needs to keep the private key secret and not share it with anyone else. Maintaining the confidentiality of private keys is simpler than sharing secret keys safely, but it is far from trivial given their long-term nature. What needs to be shared is Bob’s public key. Furthermore, Alice needs to be sure she is using the public key associated with Bob’s private key; if Eve convinces Alice to use an arbitrary public key to encrypt a message to Bob, then Eve could decrypt all messages.
The need to securely associate public keys with entities has been recognized early on. Diffie and Hellman proposed to publish a book, a bit like the phone register, associating public keys with people. In practice, a public key infrastructure is used to do this: trusted authorities, like Verisign, issue digital certificates to attest that a particular key corresponds to a particular Internet address. These authorities are in charge of ensuring that the identity, the keys, and their association are correct. The digital certificates are “signed” using the signature key of the authorities that anyone can verify.
The use of certificate authorities is not a natural architecture in many cases. If Alice and Bob know each other, they can presumably use another way to ensure Alice knows the correct public key for Bob. Similarly, if a software vendor wants to sign updates for their own software, they can presumably embed the correct public key into it, instead of relying on public key authorities to link their own key with their own identity.
The use of public key infrastructures (PKI) is necessary in case Alice wants to communicate with Bob without them having any previous relationship. In that case Alice, given only a valid name for Bob, can establish a private channel to Bob (as long as she trusts the PKI). This is often confused: the PKI ensures that Alice talks to Bob, but not that Bob is “trustworthy” in any other way. For example, a Web browser can establish a secure channel to a Web service that is compromised or simply belongs to the mafia. The secrecy provided by the channel does not, in that case, provide any guarantees as to the operation of the Web service. Recently, PKI services and browsers have tried to augment their services by only issuing certificates to entities that are verified as somehow legitimate.
Deferring the link between identities and public keys to trusted third parties places this third party in a system’s TCB. Can certification authorities be trusted to support your security policy? In some ways, no. As implemented in current browsers, any certification authority (CA) can sign a digital certificate for any site on the Internet (Ellison & Schneier 2000). This means that a rogue national CA (say, from Turkey) can sign certificates for the U.S. State Department, that browsers will believe. In 2011, the Dutch certificate authority Diginotar was hacked, and their secret signature key was stolen (Fox-IT 2012). As a result, fake certificates were issued for a number of sensitive sites. Do CAs have incentives to protect their key? Do they have enough incentives to check the identity of the people or entities behind the certificates they sign?
Cryptographic primitives like ciphers and digital signatures have been combined in a variety of protocols. One of the most famous is the Secure Socket Layer SSL or TLS, which provides encryption to access encrypted Web sites on the Internet (all sites following the https:// protocol). Interestingly, once secure primitives are combined into larger protocols, their composition is not guaranteed to be secure. For example a number of problems have been identified against SSL and TLS that are not related to the weaknesses of the basic ciphers used (Vaudenay 2002).
The observation that cryptographic schemes are brittle and could be insecure even if they rely on secure primitives (as did many deployed protocols) led to a crisis within cryptologic research circles. The school of “provable security” proposes that rigorous proofs of security should accompany any cryptographic protocol to ensure it is secure. In fact “provable security” is a bit of a misnomer: the basic building blocks of cryptography, namely public key schemes and ciphers cannot be proved secure, as Shannon argued. So a security proof is merely a reduction proof: it shows that any weakness in the complex cryptographic scheme can be reduced to a weakness in one of the primitives, or a well-recognized cryptographic hardness assumption. It effectively proves that a complex cryptographic scheme reduces to the security of a small set of cryptographic components, not unlike arguments about a small Trusted Computing Base. Yet, even those proofs of security often work at a certain level of abstraction and often do not include all details of the protocol. Furthermore, not all properties can be described in the logic used to perform the proofs. As a result, even provably secure protocols have been found to have weaknesses (Pfitzmann & Waidner 1992).
So, the question of “How much can you trust cryptography?” has in part itself been reduced to “How much can you trust the correctness of a mathematical proof on a model of the world?” and “How much can one trust that a correct proof in a model applies to the real world?” These are deep epistemological questions, and it is somehow ironic that national, corporate, and personal security depends on them. In addition to these, one may have to trust certificate authorities and assumptions on the hardness of deep mathematical problems. Therefore, it is fair to say that trust in cryptographic mechanisms is an extremely complex social process.
7 November 2015
The recently unveiled UK Draft IP Bill imposes all sorts of obligations on telecommunications operators, including obligations to collaborate with warrants to facilitate surveillance and hacking, with notices to retain data and hand it out in bulk, and even obligations to implement back doors, as well as gagging orders. Despite their centrality, it is surprisingly difficult to clearly understand who exactly is a “telecommunication operator”, and therefore to whom these obligations apply.
The scope of the legislation would be vastly different if it only applies to traditional telecommunication companies that control physical infrastructure, such as BT or cable companies, versus more widely to any internet service that allows messaging in any form, such as Google Chat, Facebook, WhatsApp and Tinder (or any other dating app). What if it also applied to general purpose software and hardware companies, or free software projects? As ever, it is unwise to rely on the explanatory notes, or the announcements of politicians, to elucidate this question: they have no legal validity. So I turn to the legislation itself, to try to get some insights.
S.193 provides definitions, and specifically S.193(8) to S.193(14) defines telecommunication operators, public and private, telecommunication services and finally telecommunication systems. We will take them in turn. I am always surprised how obscure, subtle, and wide-ranging, such definitions are.
S.193(10) defines a telecommunications operator as being one of two things: they either offer a telecommunications “service” to persons in the UK; or they control or provide a telecommunication “system” which is at least in part in the UK, or controlled from the UK. Note the subtle difference between a “service” and a “system”, as well as between “offer”, “provide” and “control”.
S.193(11) defines what a telecommunications service is: it is anything that provides, accesses, or facilitates the use of a telecommunication system. Helpfully, it points out that a service may be using a system provided by someone else: presumably this is intended to label as operators those providing services over infrastructure, logical or physical, provided by others; or software and hardware provided by others.
There is a further clarification in S.193(12): something is a telecommunications service if it is involved in the facilitation of the creation, management or storage of communications transmitted by a telecommunication system. Particularly troubling is the mention of “creation”: it might be used to argue that client side applications do facilitate the creation of communications (and their storage), and therefore are a telecommunication service. Providing them thus potentially makes creators of software and apps, and for sure those providing web-mail and instant messaging services, telecommunication operators.
Finally, S.193(13) defines as a telecommunications system a system that in any way transmits communications using electric or electromagnetic energy, including the communication apparatus (machinery) that is used to do this. The definition is very wide ranging, and includes all communications, except postal (which are dealt with separately), and all telecommunication equipment in use.
I am not a lawyer (but neither are most MPs — only about 15% are legally trained).
My reading of the telecommunications operator definition is that it encompasses everyone that is somehow related to communications: their creation, management, storage, transmissions, processing, routing, etc. In my view this covers internet services and phone apps that allow private messaging at least: social networks, instant messaging applications, dating websites, on-line games, etc. Of course it also trivially covers traditional telephony, mobile or fixed, Internet Service Providers and cable providers.
It is less clear whether only messaging and internet services, or also suppliers of hardware and software, are covered by this definition. For example, one could argue that a software vendor “provides a telecommunications system (S.193(10)(b))”, if by system we mean the software used to facilitate transmissions. In fact the definition of “system” includes the “apparatus comprised in it” (S.193(13)), namely software and hardware. Following that argument, software and hardware vendors of general computing equipment may be considered telecommunications operators when their kit is used in the context of telecommunications. If I consider this argument reasonable, then judges in secret courts, secretaries of state, and judicial commissioners may well be convinced too.
This ambiguity has far reaching consequences: if an enacted Investigatory Powers Bill is interpreted to cover suppliers of communications software and hardware, then they may be coerced by notice to provide “interception capabilities” (government backdoors) in their software and hardware, and further to facilitate “interference warrants” (hacking) against the customers of their products. Operating system manufacturers, and even processor manufacturers, may not be safe from this legislation, which will discredit any assertion they make about the security of their products in an international market.
20 August 2015
This post presents a quick opinion on a moral debate that seems to have taken on large proportions at this year’s SIGCOMM, the premier computer networking conference, related to the following paper:
Encore: Lightweight Measurement of Web Censorship with Cross-Origin Requests
by Sam Burnett (Georgia Tech) and Nick Feamster (Princeton).
The paper was accepted to be presented, along with a public review by John W. Byers (Boston) that summarizes very well the paper, and then presents an account of the program committee discussions, primarily focused on research ethics.
In a nutshell the paper proposes using unsuspecting users browsing a popular website as measuring relays to detect censorship. The website would send a page to the users’ browser — that may be in a censored jurisdiction — that actively probes potentially blocked content to establish whether it is blocked. Neat tricks to side-step and use cross domain restrictions and permissions may have other applications.
Most of the public review reflected an intense discussion on the program committee (according to insiders) about the ethical implications of fielding such a system (two thirds of the one-page review is devoted to this topic). The substantive worry is that, if such a system were to be deployed, the probes may be intercepted and interpreted as a willful attempt to bypass censorship, and lead to harm (in “a regime where due process for those seen as requesting censored content may not exist”). Apparently this worry nearly led to the paper being rejected. The review goes on to disavow this use case, on behalf of the reviewers, and even calls such measurements unethical.
I find this rather lengthy, unprecedented and quite forceful statement a bit ironic, not to say somewhat short-sighted or even hypocritical. Here is why.
1 July 2015
One of my key annoyances while doing work in privacy technologies is the poor support of key cryptography libraries in my favorite Python programming language. Today, I would like to share my work on building petlib, a more or less pythonic wrapper around the OpenSSL low level crypto and math libraries, as well as numerous privacy enhancing technologies (PETs) that I have implemented as examples.
The needs of someone doing research in the PETs field are quite different from those of other developers: on one hand we need access to low level primitives (such as block cipher and hash function operations), as well as low level mathematical functions on big integers and elliptic curves on finite fields. A number of available libraries try to hide those primitives from developers behind abstractions such as “signed envelopes” or “secure sockets”, which do not serve those who try to build different abstractions. On the other hand, issues such as tight memory management and absolute control over other low-level aspects of the library are not essential; in fact a clean programming interface that leads to beautifully clear reference code for proposed protocols is preferable.
The petlib library is available for everyone to use and, after installing the OpenSSL prerequisites, can be installed from the Python package repository with:
pip install petlib
The petlib library was used as the basis for teaching the labs of the Privacy Enhancing Technologies course at UCL, and thus has extensive documentation, and is properly version controlled, packaged and tested:
- petlib github repository
- petlib installation and programming documentation (read the docs)
- petlib listing on pypi
The best way to get a feel for how the library can be used to build PETs prototypes is to browse the examples in the source tree:
- A toy RSA example
- A simple Schnorr Zero-Knowledge Proof
- An additively homomorphic public key encryption scheme
- A generic engine for building zero-knowledge proofs using sigma protocols
- The Groth-Kohlweiss ring signature and zero-cash scheme
- The algebraic MAC scheme by Chase, Meiklejohn and Zaverucha
- An anonymous credential scheme based on the aMAC scheme above
- The Baldimtsi and Lysyanskaya anonymous credentials light scheme
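To give a flavour of what such reference code looks like, here is a sketch of a non-interactive Schnorr proof of knowledge of a discrete logarithm. It deliberately does not use petlib (whose API is not reproduced here) and instead relies on plain modular arithmetic in a toy group; the parameters and function names are assumptions for illustration, and the group is far too small for real use.

```python
import hashlib
import secrets

# Toy group: P = 2Q + 1 with P and Q prime; G = 4 generates the subgroup of order Q.
P, Q, G = 2039, 1019, 4

def challenge(*values):
    """Fiat-Shamir challenge: hash the public values into an integer mod Q."""
    data = ",".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def prove(x):
    """Prove knowledge of x such that y = G^x mod P, without revealing x."""
    y = pow(G, x, P)
    k = secrets.randbelow(Q)            # fresh random nonce
    commitment = pow(G, k, P)
    c = challenge(G, y, commitment)
    response = (k + c * x) % Q
    return y, (commitment, response)

def verify(y, proof):
    """Check that G^response == commitment * y^c (mod P)."""
    commitment, response = proof
    c = challenge(G, y, commitment)
    return pow(G, response, P) == (commitment * pow(y, c, P)) % P

secret_x = 123
public_y, proof = prove(secret_x)
ok = verify(public_y, proof)
```

The verifier learns that the prover knows the secret behind `public_y`, but the response is masked by the random nonce `k`, so it reveals nothing about `secret_x` itself.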
In terms of more real-world research projects, we have already used petlib for implementing prototypes for a few projects and labs:
- The centrally banked cryptocurrency framework (with Sarah Meiklejohn)
- A private stats collection system for Tor (with Melis and De Cristofaro)
- Exercises in Privacy Enhancing Technologies (for UCL COMPGA17)
One key missing component from the underlying OpenSSL crypto library is support for computations on pairings of elliptic curves. This limits the types of protocols that can currently be implemented with petlib, until such functionality becomes available in the underlying libraries (please contribute!) Bug reports and pull requests with fixes to the code and documentation are very welcome.
30 June 2015
The course covers principally, and in some detail, engineering aspects of PETs and caters for an audience of CS / engineering students that already understands the basics of information security and cryptography (although these are not hard prerequisites). Students were also provided with a working understanding of legal and compliance aspects of data protection regimes, by guest lecturer Prof. Eleni Kosta (Tilburg); as well as a world class introduction to human aspects of computing and privacy, by Prof. Angela Sasse (UCL). This security & cryptographic engineering focus sets this course apart from related courses.
The taught part of the course runs for 20 hours over 10 weeks, split into 10 topics:
- Introduction and privacy in communication. (01-GA17-IntroComms)
- Anonymous communications & Traffic analysis (02-GA17-Anonymous-Comms)
- Private Computations with homomorphic encryption and secret sharing (03-GA17-Private-Computations)
- Privately checking inputs using Zero-Knowledge Proofs (04-GA17-ZeroKnowlegde)
- Private authorization using selective disclosure credentials (05-GA17-Selective-Disclosure)
- Data anonymization & de-anonymization attacks (08-GA17-Data-Anonymization)
- Private Storage, queries and lookups (09-GA17-Storage-Retrieval)
- Privacy by design case-studies (10-GA17-Privacy-by-design-case-studies – Copy)
- Guest lectures: Human aspects (Angela Sasse)
- Guest lectures: Data Protection (Eleni Kosta)
Most importantly the course includes 10 hours of labs (20 next year!), split into 5 exercises, that give students (and their teachers!) hands on experience implementing extremely advanced privacy enhancing technologies. More generally the course provides an introduction to solid cryptographic engineering, test-driven development, testing & QA tools and code audits. The programming language used was Python on a Linux environment, with the petlib library that was specially developed for this course.
For each lab exercise students in pairs were provided with a partial code file, and a set of unit tests, and were asked to fill in the remaining code to fulfill the task, and at least make the unit tests pass. The topics of the exercises track the first 5 lecture topics:
- Private communications and basic programming with petlib
- Building a simple mix server and client
- Building a private polling system with homomorphic encryption
- Basics of zero-knowledge proofs of knowledge, equality and linear statement
- A basic selective disclosure authorization credential system
Finally, part of the grading was based on students performing a code review of other groups, looking for code defects leading to security or other bugs.
Overall, I am very proud of the progress everyone made. The course was attended by 16 MSc students and 2 MEng students. Everyone was eventually able to complete all lab assignments, which was not a given considering the advanced nature of the tasks at hand. It was evident while discussing the final exercise with students, on building a selective disclosure credential, that many had developed an intuitive understanding of how to build solutions based on zero-knowledge protocols, and all had definitely overcome their initial fear of these more advanced concepts in PETs.
I was also very impressed with the many students who were able to tackle the hardest questions in the exam. One of those questions basically asked students to re-invent a variant of the privacy preserving genomic testing protocol we presented at WPES 2014, and many did so successfully. Similarly, they were asked to de-anonymize a mechanism very similar to the 15:15 rule in place in California to “protect” smart meter readings, and again many did so successfully under the time constraints and high pressure environment of exams. As ever, the great engagement from students was the most rewarding part of teaching the course.
All material is available online (see links to slides, and git repositories), and I would be delighted to share / receive any additional exercises by others finding this material relevant to their courses.
18 December 2014
Last week I had the opportunity to attend a joint US National Academy of Sciences and UK Royal Society event on cyber-security in Washington DC. One of the speakers, a true expert that I respect very much, described how they envision building (more) secure systems, and others in the audience provided their opinion (Chatham House Rule prevents me from disclosing names). The debate was of high quality; however, it did strike me that it remained at the level of expert opinion. After 40 years of research in cyber-security, should we not be doing better than expert opinion when it comes to understanding how to engineer secure systems?
First, let me say that I have a great appreciation for craftsmanship and the deep insights that come from years of practice. Therefore when someone with experience tells me to follow a certain course of action to engineer a system, in the absence of any other evidence, I do listen carefully. However, expert opinion is only one, and in some respects the weakest, form of evidence in what researchers in other disciplines have defined as a hierarchy of evidence. Stronger forms of evidence include case studies, case-control and cohort studies, double-blind studies with good sample sizes and significant results, and systematic meta-analyses and reviews.
In security engineering we have quite a few case reports, particularly relating to specific failures, in the form of design flaws and implementation bugs. We also have a set of methodologies as well as techniques and tools that are meant to help with security engineering. Which work, and at what cost? How do they compare with each other? What are the non-security risks (cost, complexity, training, planning) associated with them? There is remarkably little evidence, besides at best expert opinion, at worst flaming, to decide. This is particularly surprising, since a number of very skilled people have spent considerable time advocating for their favorite engineering paradigms in the name of security: static analysis, penetration testing, code reviews, strong typing, security testing, secure design and implementation methodologies, verification, pair-coding, use of specific frameworks, etc. However, besides opinion it is hard to find much evidence of how well these work in reducing security problems.
I performed a quick literature survey, which I add here for my own future benefit:
28 November 2014
It takes quite a bit of institutional commitment and vision to build a strong computer security group. For this reason I am delighted to share here that UCL computer science has in 2014 hired three amazing new faculty members into the Information Security group, bringing the total to nine. Here is the line-up of the UCL Information Security group, whose members also teach the MSc in Information Security:
- Prof. M. Angela Sasse is the head of the Information Security Group and a world expert on usable security and privacy. Her research touches upon the intersection of security mechanisms or security policies and humans — mental models they have, the mistakes they make, and their accurate or false perceptions that lead to security systems working or failing.
- Dr Jens Groth is a cryptographer renowned for his work on novel zero-knowledge proof systems (affectionately known as Groth-Sahai), robust mix systems for anonymous communications and electronic voting and succinct proofs of knowledge. These are crucial building blocks of modern privacy-friendly authentication and private computation protocols.
- Dr Nicolas Courtois is a symmetric key cryptographer, known for pioneering work on algebraic cryptanalysis, and an extraordinary hacker of real-world cryptographic embedded systems, who has recently developed a keen interest in digital distributed currencies such as bitcoin.
- Prof. David Pym is both an expert on logic and verification, and also applies methods from economics to understand complex security systems and the decision making in organizations that deploy them. He uses stochastic processes, modelling and utility theory to understand the macro-economics of information security.
- Dr Emiliano De Cristofaro researches privacy and applied cryptography. He has worked on very fast secure set intersection protocols, which are key ingredients of privacy technologies, and is one of the leading experts on protocols for privacy friendly genomics.
- Dr George Danezis (me) researches privacy technologies, anonymous communications, traffic analysis, peer-to-peer security and smart metering security. I have lately developed an interest in applying machine learning techniques to problems in security such as anomaly detection and malware analysis.
- Dr Steven Murdoch (new!) is a world expert on anonymous communications, through his association with the Tor project, and on banking security, including the design of fielded banking authentication mechanisms. He is a media darling when it comes to explaining the problems of real-world deployed cryptographic systems in banking.
- Dr Gianluca Stringhini (new!) is a rising star in network security, with a focus on the technical aspects of cyber-crime and cyber-criminal operations. He studies honest and malicious uses of major online services, such as social networks, email services and blogs, and develops techniques to detect and suppress malicious behavior.
- Dr Sarah Meiklejohn (new!) has an amazing dual expertise in theoretical cryptography on the one hand, and digital currencies and security measurements on the other. She has developed techniques to trace stolen bitcoins, built cryptographic compilers, and contributed to fundamental advances in cryptography such as malleable proof systems.
One key difficulty when building a security group is balancing cohesion, to achieve critical mass, with diversity to cover a broad range of areas and ensuring wide expertise to benefit our students and research. I updated an interactive graph illustrating the structure of collaborations amongst the members of the Information Security Group, as well as their joint collaborators and publication venues. It is clear that all nine faculty members both share enough interest, and are complementary enough, to support each other.
Besides the nine full-time faculty members with a core focus on security, a number of other excellent colleagues at UCL have a track record of contributions in security, supporting teaching and research. Here is just a handful:
- Prof. Brad Karp is an expert in networking and systems and has made seminal contributions to automatic worm detection and containment.
- Dr David Clark specializes in software engineering with a core interest in information flow techniques for confidentiality, software security and lately malware.
- Dr Earl Barr researches software engineering, and has worked on security bugs and malware, as well as ideas for simple key management.
- Prof. Ingemar Cox (part-time at UCL) is a world expert in multimedia security, watermarking and information hiding.
- Prof. Yvo Desmedt (part-time at UCL) is a renowned cryptographer with key contributions in group key exchange, zero-knowledge and all fields of symmetric and asymmetric cryptography.
The full list of other colleagues working in security, including visiting researchers, post-doctoral researchers and research students, includes many more people, making UCL one of the largest research groups in Information Security in Europe.
19 December 2013
I had the opportunity to speak as part of a panel at the London Crypto Festival on November 30th 2013. My main point was that we have not one, but many ways to protect privacy in on-line services. Therefore consumers and citizens should demand from their service and software providers strong protections for their privacy, and come to expect them. The examples I used are from what I know best, namely smart metering privacy for which we have proposed in the past very credible protocols for privacy friendly billing and statistics.
My Crypto Party Presentation can be found here.
31 October 2013
I am joining a fantastic team of researchers: Angela Sasse heads the group and is doing pioneering work on human aspects of security; Jens Groth is an expert on cryptography, and zero knowledge; Nicolas Courtois is a leading cryptanalyst, and has hit the news many times in the past by demonstrating vulnerabilities in deployed systems. Alongside myself, Emiliano De Cristofaro, who works on applied cryptography and privacy, and David Pym, who has a dual interest in formal methods and economics of security, are also joining the group.
One of my first non-research tasks at UCL is to teach the Computer Security 1 course, which is a broad introduction to the basics of computer security. As a matter of principle, namely that the highest levels of quality of protection are achieved when computer security is discussed in public, I consider the class to be a public event, open to anyone who would like to attend (subject to space restrictions). So if you are based in London, and would like access, just let me know.