The London Times is reporting that UK government officials are considering a centralised database to hold all traffic data gathered under data retention legislation. While the plan has been criticised as a privacy disaster (which it is), the centralised approach comes as no surprise and is in line with the economics of data retention.

As early as December 2001, when the traffic retention plans were still young, I commented on trends that are likely to follow a mandatory traffic-retention regime. There are three of them:

  1. On-line traffic databases: The idea that traffic data will be stored on tapes archived in a locked room is simply a fantasy. In order to minimize the costs associated with complying with Law Enforcement (LE) requests, service providers have incentives to keep records on live, spinning storage. Similarly, an easy-to-use (and easy-to-abuse) interface is likely to be provided for technical personnel to process the queries and provide results. This is already the case in telephony: the Ericsson interception GUI manual is publicly available.
  2. Ahead-of-time indexing: Trawling through a huge volume of call data takes time. In order to speed up the process (thus minimizing cost), as well as to support legitimate business needs, service providers have incentives to index the call records for efficient retrieval. This means creating tables pointing to records by the network identifier of the caller / callee, or by time of call. This is the initial phase of an analysis process, and effectively creates an efficient decentralized surveillance infrastructure.
  3. Out-sourcing and centralisation: Communication and service providers are not in the business of running complex and costly data retention regimes and answering requests. As with other activities that are not at the core of their business model (like cleaning or catering), they are likely to outsource the task to a specialized company. Since the main activity of such a company is processing information, there are enormous economies of scale, most likely leading to (at best) an oligopoly of a few providers. Those providers can at any point be in a “special relationship” with LE, or other government agencies, effectively providing a full feed in real time (as the current proposal suggests).
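To illustrate the second trend, here is a minimal sketch of what ahead-of-time indexing of call records might look like. All names, record fields, and the hourly time bucket are hypothetical assumptions for illustration; the point is that once such tables exist, looking up everyone a target communicated with becomes a cheap dictionary lookup rather than a scan of the archive.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class CallRecord:
    caller: str     # network identifier of the calling party (hypothetical field)
    callee: str     # network identifier of the called party
    timestamp: int  # time of call, e.g. seconds since epoch

class TrafficIndex:
    """Ahead-of-time indexes over retained call records."""

    def __init__(self):
        self.records = []
        # Each index maps a key to positions in the record store.
        self.by_caller = defaultdict(list)
        self.by_callee = defaultdict(list)
        self.by_hour = defaultdict(list)  # coarse time bucket -> positions

    def ingest(self, rec: CallRecord) -> None:
        """Store a record and update every index as it arrives."""
        pos = len(self.records)
        self.records.append(rec)
        self.by_caller[rec.caller].append(pos)
        self.by_callee[rec.callee].append(pos)
        self.by_hour[rec.timestamp // 3600].append(pos)

    def calls_involving(self, identifier: str) -> list:
        """All records in which the identifier appears as caller or callee."""
        positions = sorted(set(self.by_caller[identifier]) |
                           set(self.by_callee[identifier]))
        return [self.records[p] for p in positions]
```

Note that the indexes are maintained at ingestion time, not at query time: the cost of a subsequent LE request is a constant-time lookup, which is precisely the economic incentive described above.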

It is interesting to note that LE, as well as the agencies, have incentives to push as hard as possible for those trends to materialize, since they increase the efficiency of the request process and push towards centralisation – meaning that it is easier to get at the data in bulk [1]. This can be done by imposing ‘quality standards’ on operators, perhaps even to guarantee the security and privacy of the data, that will in effect push up the cost of in-house retention management.

[1] I realize that some people are not so cynical about the agencies being interested in the data in bulk, even if it is acquired illegally. Those people should definitely read the account of the blanket telegraph interception.