Some may have noticed I have gone part-time at University College London, and plan to spend the next two years engineering and launching distributed systems as part of the Chainspace project, Vega Protocol and Kryptik. A number of people have asked me “Why would an expert on Privacy Enhancing Technologies develop such an interest in distributed ledgers and blockchains?”. In this post, I will lay down why I think this space is interesting technologically, and also how it interacts and influences the engineering, adoption, and deployment of Privacy Technologies; and  how a number of  societal issues in Privacy may in fact be deeply linked with developments in Blockchains in unexpected ways.

Why are Privacy Enhancing Technologies so interesting in the first place?

Privacy is a multi-faceted field, and I have spent about 20 years working on, evaluating, and advising on systems that broadly protect privacy better than current engineering practices. Those who follow my work may notice a pattern: I research what I call in class “hard privacy” systems, namely those that assume there is no central party that may be trusted to manage honestly user data. Instead, I have used cryptography, and distributed trust to ensure that users can do useful work in a peer-to-peer manner — without disclosing any data to third parties — or, when there is a need for infrastructure, it is distributed and data is encrypted in such a way that multiple entities need to conspire to violate user privacy properties.

“Soft privacy” systems, such as those that help organizations manage privacy of data they hold better, guide cookie and retention policies, prevent discriminatory uses, and implement internal controls are very important and useful — but they are just not my main thing. Why is that? I am fascinated about how privacy interacts with power, be it corporate or government power, and see privacy as a means to ensure both liberty and the possibility of political dissent as well as economic equity through undermining what is now broadly called surveillance capitalism. Collection and access of vast personal datasets already creates such a huge imbalance of power and potential for abuse, that I would rather prevent it through “hard privacy” technologies, rather than manage it through alternatives.

It should already be obvious that the engineering tools we have to use to build cryptographic and distributed systems for privacy, are related to those used to build distributed ledgers: they involve top-notch networking and distributed systems security, including prevention of byzantine faults; the most cutting edge of zero-knowledge and other modern crypto that was initially developed for privacy applications; and a good dose of security economics to ensure all participants in the system have incentives to do the right thing. As such the technical tools between  my type of Privacy Engineering and Distributed Ledgers are one and the same.

So it is obvious why, purely on the basis of technical affinity and curiosity, I may be attracted to distributed ledgers. But it turns out there are also deeper reasons.

So what are distributed ledgers good for?

This is the question that many ask today, as the hype around this family of systems is somewhat tempered by the falling prices of tokens. And the reason the answer is harder than one might expect for a technology that has been in the public eye for over a year, is that many have promised that blockchains are a panacea for all social and technical evils that no technology can possibly be.

So let’s start with what current distributed ledgers are not particularly good for — research challenges that are best summarized in an article by Sarah Meiklejohn. Blockchains do not scale particularly well, and in particular open Nakamoto consensus based on PoW or even PoS does not seem to scale. A number of proposals are on the table, but right now they are all experimental. Blockchains are not particularly usable, either in terms of software but also due to their semantics: probabilistic finality, and a resulting high latency to get assurance. Blockchains also struggle to provide data privacy guarantees for transactional data in the chain — zcash is a gold standard in that space, and recent research shows most transactions are not very private despite best-of-breed techniques being employed. Particularly under the light of the last issue, what could possibly make blockchains interesting to a Privacy Engineer, besides technical curiosity?

There are two interesting aspects of blockchains in that respect: (1) they decentralize transaction systems; and (2) they provide an alternative monetization strategy for providing on-line services. Those, it turns out, could — and the path of technology and its impact on social life is always contingent — be extremely influential when it comes to privacy. I will explain those in turn.

Decentralized Transactions are hard.

As many in the redecentralize movement have pointed out the web and other internet systems, such as email, were initially decentralized. Then something happened in the years 2000-2010, and somehow large quasi-monopolistic service providers appeared to dominate most interactions (lets not be coy we are talking about Facebook, Google, Amazon, and smaller friends). What happened is no conspiracy: we have very good tools to decentralize the distribution of static, read-only, content. In 2000 research on peer-to-peer systems was all the rage, and led to bittorrent on the edgy side, to the DHTs that underpin today interesting projects such as IPFS. Cloudflare could be distributed with the right technical model using those techniques, and we could serve most static content in a decentralized manner.

What we do not know how to decentralize — in general — are “transaction systems”. Those are system, in which users from different security domains (not just one natural owner) may “write” to objects, and mutate their state. The core business of Amazon, Uber, AirBnB as well as more traditional industries such as banking, travel, HR, etc are based on keeping such records and mutating them safely. Traditional databases from oracle to mysql have been the traditional workhorse behind such systems. Cloud platforms allow databases to scale, using complex techniques such as crash-fail resilient consensus and replication — but they require all the infrastructure to be under a single authority.

This technical reality, namely that it is hard to build scalable decentralized transaction systems is one of the reasons why internet monopolies came to dominate most of our interactions.  However, this cannot explain everything (nothing can): techniques for achieving decentralized byzantine fault tolerance have been known since the 1980s-1990s, with PBFT and all that. So how comes they did not catch up? And when I say they did not catch up, I mean pretty much not at all: a couple of years ago one would struggle to find a single PBFT usable library in most languages. I would argue this is related to economics.

Lack of Economic model.

When the googles and facebooks of this earth established themselves it was not clear how they would make any money. Therefore they innovated, and established the ad-based model by which the user is not the customer, but rather their attention is a product to be targeted and sold for adverts. This led to the establishment of surveillance capitalism that now seems to run deeper, and also interacts in questionable ways with the advancements in machine learning — another fascinating field of computer science.

So for the years 2000-2010 the main monetization vision of online services launched was the collection of personal information, and the resulting generation of ad revenue. However, it turns out that personal information is more akin to tar sands than oil: an organization needs a lot of it to refine it into something useful — and that also leads to a monopolistic situation in which platforms that already have captured the ad revenue and a lot of personal data are difficult to dislodged. I think this has now sank in the psyche of engineers and entrepreneurs and I hardly hear any new businesses aiming to dislodge Google in terms of ad revenue — for which I am thankful. That market is captured and closed.

For a while an alternative model, based on selling mobile apps first, and then selling in-app purchases gave a glimpse of hope that on line services may be able to monetize, but that also did not last. The most monetized app were akin to “digital crack”, seeking to develop users with strange addictions; eventually others such as Instagram and Whatsapp got bough by large incumbents. Today most independent app developers make little money are are subject to the whims of every app store that may delist them at will — also controlled by large incumbents that take up to a 30% cut on revenues.

How could one have built decentralized alternatives to those? What is WhatsApp or Instagram were to operate their systems on the basis of decentralization — they would each require not one, by over four authorities to operate. What would be the incentives of such authorities to do a good job? And how could they cover their costs? This was a problem. The Tor network, for example, operates on the basis of volunteers running relays, research, foundation and state department funding — a model that cannot scale to run significant infrastructures.

Blockchains provide both technical and economic answers to enable decentralization.

This is where blockchains start to become interesting. Distributed ledgers provide platforms to technically make building decentralized apps minimally humane. Now, writing larger correct decentralized apps in solidity for Ethereum, for example, cannot be described as the most pleasant development experience. But take my word for it, it is much better than having to start from scratch writing your own byzantine consensus protocols, and ensuring they are correct. Thus they offer a technical alternative for writing transaction systems that are decentralized.

Secondly, blockchain systems integrate a system of incentives and monetization. Nodes operating the infrastructure are remunerated in many ways, from mining rewards to fees, using a micropayment system that is integrated into the platform. This solves the question of “who are the decentralized authorities and why would they work for me?” that blocked the deployment of such decentralized systems before.

At the same time, for all its ills, the ICO model — by which projects would issue their own tokens for use in a system — also provided incentives for founders and developers to initiate a project, fund development and often maintenance. Thus services on blockchains do not have to rely on ad revenue, but can instead rely on fees to survive and be sustainable, at least in theory. The open access model of those platforms also ensure that both infrastructure nodes and developers can invest in building apps without fear they may be arbitrarily excluded.

Those two features of blockchain platforms have profound implication for privacy, even though the platforms themselves today do not protect data privacy very well. First, they undermine the monopolistic position of large service providers — that by virtue of accumulating masses of user data, and using it as part of an advertising based economic model, fundamentally cannot be privacy friendly. Undermining those large silos of data both frees people from the whims of those platforms, the wide use of data to manipulate them, but also the secondary threats of governments (domestic and foreign) then dipping into those databases for their own purposes. Allowing users to pay for services they access also ensures that those services can survive and be sustainable, in ways other than selling out their users’ data.

So in brief, blockchains align incentives correctly: control over the service is decentralized, and usually subject to code (smart contracts) to ensure users are not subject to the arbitrary and opaque decision making of large online service monopolies; and secondly that payments are made to those that maintain infrastructure and services, to ensure they do not need to be tempted to pry as a business model. If this model is a success — subject to a number of contingencies — it may provide a good foundation for better, more open and humane, transaction systems that could actually redecentralize the internet.

Challenges ahead.

While the incentives are aligned this does not mean that current blockchains actually achieve all those great goals, and in particular that they provide strong privacy guarantees. For this reason I have spent some of my research time in the past looking at how we can use efficient zero-knowlege to protect privacy in transaction systems, as well as how to scale up cryptographic monetary systems and smart contract platforms with Chainspace. Scaling up those platforms to make them truly competitive with large on-line service providers and other ‘sharing economy’ silos is indeed the subject of my new start-up chainspace.io.

With their fall the adoption of Privacy Enhancing Technologies can finally be unblocked. And already we see some of the most advanced cryptographic techniques, including zero-knowledge and selective disclosure credentials, being fielded in the context of blockchains, where they have seen little traction elsewhere in the past 20 years. While large internet services make most of their money from mining user data for ads or optimization those technologies stand no chance to see the light of day at a large scale.

I would welcome anyone working in the fields of threshold crypto-systems, multi-party computation, and censorship circumvention — techniques that intrinsically require multiple authorities to work together, to consider how their distributed infrastructures could be both engineered and incentivized using and adapting ideas from the distributed ledger research community.

Are you off for good?

The final question people ask: am I going away from academia and UCL for good? Rest assured I am coming back. My roles in all ventures involve research; I am committed to my PhD students, joint projects and colleagues at UCL; and I feel deeply passionate about teaching new generation both security engineering and privacy technologies. UCL is my natural home in the UK, being a truly open university to the world, traditionally progressive in its outlook, and in the heart of London. My adventures in industry, however successful, will only make me a stronger scholar.

Advertisements

I am at the Privacy Enhancing Technologies Symposium in Minnesota. As it is customary the conversations with other privacy researchers and developers are extremely engaging. One topic I discussed over beer with Mike Perry and Natalie Eskinazi is whether secure group messaging protocols could be simplified by relaxing some of their security and systems assumptions.

In the past I questioned whether group key agreements need to be contributory of symmetric. In this post I argue that plausible deniability properties can be relaxed, leaderlessness assumptions may be too strong, full asynchrony unnecessary, and high-scalability unlikely to make a difference. In brief, protocol designers should stop looking for the perfect protocol, with all features, and instead design a protocol that is good enough. This is controversial, so I will discuss those one by one.

Plausible deniability within a group?

Off-the-record messaging protocols, from OTR to the protocols underlying signal / whatsapp, provide cryptographic plausible deniability. In a nutshell, this means that when Alice talks to Bob, after a certain point in time Bob will not be able to prove cryptographically to any other party that Alice sent a particular message. This is technically achieved by either publishing the signature keys used after a while, or using symmetric message authentication codes to ensure Bob could have forged any authenticatior.

Before moving to the group setting it is worth thinking what plausible deniability buys you in the 2-party case: It protects Alice from an adversary, that needs cryptographic proof that she said a particular thing to Bob. However, the adversary — at the time of the conversation cannot be Bob. Bob is convinced that it is Alice that sent a message — and thus there is no need for other evidence. Similarly, if Bob is fully trusted by the adversary — for example to keep a tamper proof log of all messages sent and received and all secrets — then the defense is ineffective.

Plausible deniability protects Alice in a very narrow setting: When Alice sent a message to Bob, who at the time was honestly following the protocol, and that in the future decides to work with an adversary (that does not fully trust Bob) to convince the adversary that Alice said something. Even in the 2-party case, this is quite eclectic.

However, this property becomes even more elusive in the multi-party case. Imagine that Alice sends a message to a group of 7 participants, though a channel providing plausible deniability. At the time the message is received by 6 parties, that are all convinced that it came from Alice. For plausible deniability to be meaningful, all those parties at that time should not be adversaries. Then, down the line one or more (incl. Alice) transcripts are leaked, and the those transcripts do not contain any cryptographic authenticators for Alice.

However, for adversaries not performing legal attacks (when a court needs to be convinced of a fact) the lack of authenticators has never been a barrier to believe in a transcript of a conversation. Similarly, courts in the UK, do not rely on cryptographic authenticators to ensure authenticity: a record of a conversation, backed by an expert witness that states that it is unlikely to have been modified (through forensics) is sufficient. Furthermore, in the context of a group chat there will be multiple transcripts, all matching — unless users engaged in a very wide conspiracy to make the discussion look like something else. No software facilitates this, and without this facility, plausible deniability would provide a meaningless protection.

So my conclusion is as follow: a weak form of plausible deniability may be provided by ensuring honest parties delete authenticators (incl. signatures) once they have been checked. This ensures that honest captured devices do not contain cryptographic evidence. However, as we discussed above, dishonest parties can anyway bypass plausible deniability — and thus providing it through crypto tricks increases complexity without providing much protection.

So I advocate for the simple options of using signatures and deleting them upon reception. If your chat partners are bad, and do not delete it at this point, you are in trouble anyway.

Groups will always be small.

People want to have private conversations within groups. However, how large can these groups be until at least one of the participants or their device is compromized? The size of the group is often used in secure group chat protocols as a parameters N, and the cost of the  protocol in terms of number of messages or their size can be O(N) or O(1) or O(logN). However, this is meaningless since N will never become too big — for systems or operational security reasons.

Thus secure group chat protocols that work well for small number of users are likely to be good enough — no need to optimize in terms of N. Optimizations that reduce the constant factors are likely to be as important.

Submitting to a leader & asynchrony

Significant complexities arise in secure group chat protocols from the implicit assumption that all members of the group are equal, they talk through point-to-point messages, and that no infrastructure exists that can be trusted to anything relating to the chat. However, this is not how popular chat programs work: irc, xmpp, signal, whatsapp all rely on servers to mediate communications, and provide some services!

The most important service that can be provided by a central party is ordering: a server or a special participant can assume a “leader” role to order messages, in a total order that others will accept. They may be prevented from doing this arbitrarily: clients can include enough information to ensure that the total order imposed is compatible with causal ordering of messages (as Ximin Luo points out). However, since there is no canonical total ordering, why not rely on the ordering imposed by one of the members of the group (the leader) or a services mediating the communications.

Of course, a key challenge when relying on a leader is what happens when the leader dies or goes off line. When there is a graceful exit the leader can handover to the next leader. When there is a crash-failure things become more complex: group member need to detect the failure and replace the leader. This is complex in systems assuming high degrees of asynchrony, where there is no cut-off time out after which one can be sure another node it dead. In that case paxos or pBFT protocols need to be used to do leader election and consensus. However, if one ditches this assumption, and instead assumes that there exists a perfect failure detector — then nodes can detect failures and replace the leader.

Now, in practice users do not expect a chat session to survive the demise of the service they use (be it skype, whatsapp, or facebook). So it seems that providing such a property for secure group chat is unnecessarily complex.

Conclusion.

I would now advocate group secure messaging systems on the following lines:

  • Ditch full plausible deniability; use signatures, and rely on honest parties to delete them upon checking.
  • Design systems for optimal operation with group sizes of 3-50. There is no point in having much larger groups, or killing designs on the basis they do not perform when N goes to infinity.
  • Rely on an infrastructure server or a leader within the group to offer consistent total ordering to all. Assume that others can detect its liveness through a timeout (synchronous), and either elect or rotate another one in. Or simply, stop the chat session once they become unavailable.

You know you live in 2017 when the top headline on national newspapers relates to a ransomware attack on the National Heath Service, the UK Prime minister comments on the matter, and the the security researchers dealing with the outbreak are presented as heroic figures. As ever, The Register, has the most detailed and sophisticated technical article on the matter. But also strangely the most informative in terms of public policy. As if somehow, in our days, technical sophistication is a prerequisite also for sophisticated political comment on those matters. Other news outlets present a caricature, of the bad malware authors, the good security researcher and vendors working around the clock, the valiant government defenders, and a united humanity trying to beat the virus. I want to break that narrative open in this article, and discuss the actual political and social lessons we should be learning. In part to avoid similar disasters in the future.

First off, I am always surprised when such massive systemic outbreaks of malware, are blamed squarely on the author(s) of the malware itself, and the blame game ends there. It is without doubt that the malware author has a great share of responsibility. I personally think it is immoral to deploy ransomware in the wild, deny people access to their data, and seek to benefit from this. It is also a crime in the UK and elsewhere.

However, it is strange that a single author, or a small group of authors, without any major resources can have such a deep and widespread effect on major technological infrastructures. The absurdity becomes clear if we transpose the situation into the world of traditional engineering. Imagine all skyscrapers in major cities had to be evacuated, because a couple of teenagers with rocks were trying to blackmail business owners to pay up, to protect their precious glass windows. The fragility of software and IT systems seems to have no parallel in any other large scale engineering infrastructure — and this is not inherent, but the result of very specific micro-political, geo-political and economic decisions.

Lets take the WannaCrypt outbreak and look at the political and other social decisions that lead to the disaster — besides the agency of the malware authors:

  • The disaster was possible in part, and foremost, because IT systems within the UK critical NHS infrastructure are outdated — and for example rely on Windows XP that is not any more being maintained by Microsoft. Well, actually this is not strictly true: Microsoft does make security updates for Windows XP, but does not provide them for free — and instead Microsoft expects organizations that are locked in the OS to pay up to get patches and stay safe. So two key questions need to be asked …
  • Why is the NHS not upgrading to a new versions of Windows, or any other modern operating system? The answer is simple: line of business applications (LOB: from heath record management, specialist analysis and imaging software, to payroll) may not be compatible with new operating systems. On top of that a number of modern medical devices, such as large X-ray scanners or heart monitors, come with embedded computers running Windows XP — and only Windows XP. There is no way of upgrading them. The MEDJACK cyber-attacks were leveraging this to rampage through hospitals in 2015.
  • Is having LOB software tying you to an outdated OS, or medical devices costing millions that are not upgradeable, a fact of nature? No. It is down to a combination of terrible and naive procurement processes in health organizations, that do not take into account the need and costs if IT and security maintenance — and do not entrench it into the requirements and contracts for services, software and devices. It is also the result of the health software and devices industries being immature and unsophisticated as to the needs to secure IT. They reap the benefits of IT to make money, but without expending much of it to provide quality and security. The tragic state of security of medical devices has built the illustrious career of my friend Prof. Kevin Fu, who has found systemic attacks against implanted heart devices that could kill you, noob security bugs in medical device software, and has written extensively on the poor strategy to tackle these problem. So today’s attacks were a disaster waiting to happen — and expect more unless we learn the right lessons.
  • So given the terrible state of IT that prevents upgrading the OS, why is the NHS not paying up Microsoft to get security patches? That is because the government, and Jeremy Hunt in particular, back in 2014 decided to not pay up the money necessary to keep receiving security updates for Windows XP, despite being aware of the absolute reliance of the NHS on the outdated software. So in effect, a deliberate political decision was taken, at the highest level of the government to leave the NHS open to cyber attack. This is unlikely to be the last Windows XP security bug, so more are presumably to come.
  • Then there is the question of how malware authors, managed to get access to security bugs for windows XP? How did they get the tools necessary to attack such a mature, and rather common system, about 15 years after Windows XP was released, and only after it went out of maintenance? It turn out that the vulnerabilities they used, were in fact hoarded by the NSA as a cyber weapon — which was lost or stolen by hackers or leakers, and released into the wild! (The tool was codenamed EternalBlue). For may years, the computer security research community has been warning that stockpiling vulnerabilities in very common software for cyber-offense purposes, is dangerous. When those cyber weapons are lost, leaked, or even just used, there is proliferation of the technology necessary to attack, which criminals and foreign states can turn against critical infrastructure. This blog commented on the matter as recently as March 8, 2017 in a post entitled “What the CIA hack and leak teaches us about the bankruptcy of current “Cyber” doctrines”. This now feels like an unfortunately fulfilled prophesy, but the NHS attack was just the expected outcome of the US/UK and now common place doctrine around cyber — that contributes to and leverages insecurity rather than security. Alternative public policy options exist of course.

So to summarize, besides the author of the malware, a number of other social and systemic factors contribute to making such cyber attacks possible: from poor security standards in heath informatics industries; poor procurement processes in heath organizations; lack of liability on any of the software vendors (incl. Microsoft) for providing insecure software or devices; cost-cutting from the government on NHS cyber security with no constructive alternatives to mitigate risks; and finally the UK/US cyber-offense doctrine that inevitably leads to proliferation of cyber-weapons and their use on civilian critical infrastructures.

It it those systemic factors that need to change to avoid future failures. Bad people wishing to make money from ransomware, or other badness, will always exist. There is a discipline devoted to preventing this, and it is called security engineering. It is time industry and goverment start taking its advice seriously.