On the matter of the British Library cyber incident

The most important lesson to figure out is why it is taking so long to restore services. That will tell us how to prevent such a calamity in other vital national institutions.

Jan 20, 2024

Introduction, apology, caveat, and then another apology

The introduction: For nearly three months, the British Library has been close to unusable because of what has invariably been called “a cyber incident”. Lots of people have asked me in recent months: “what on earth is going on with the BL and why isn’t it getting more attention?”

At the start of this week, the BL announced the partial restoration of its capabilities. So it seems a good time to take stock of one of the most impactful cyber incidents in British history.

The apology: This is the first post on this Substack for, well, a very long time. Apologies to those who supported me when it started. Many of you will know there were various circumstances which meant 2023 was a hard year for me to sustain it. I won’t make any promises about how often I will post, but I will try to reactivate it, with an article at least once a month. Consider this January’s offering. And feedback on content and ideas for more is always welcome.

The caveat: This post is based on open-source information, my own judgments, and nothing else. I used to run the UK’s National Cyber Security Centre, but I stopped doing that in 2020 and left public service. I have not asked former colleagues about the case. Nor have I spoken to the British Library. No one should assume anything I say here reflects the position of the Government or any part of it.

The other apology: Pursuant to that caveat, I am acutely conscious that this is an article about an extremely hard-pressed organisation trying its best to serve its users under the most extraordinary pressure. Being at the centre of a cyber crisis is absolutely horrible. It normally also means something has gone wrong, somewhere. In commenting on some of those potential causes of the problems, I do not mean to criticise those working flat out to fix things. I apologise if any of this post inadvertently comes across that way. Indeed I’d want to thank BL staff for what appears to have been an extraordinary effort in a long slog to get to this important recovery point this week. I would encourage anyone else commenting on this or other cyber incidents to remember the human beings at the centre of the crisis. A paper from the Royal United Services Institute this week rightly identified psychological damage to staff as a consequence of these types of attacks. We should always remember this.

What happened at the British Library?

To the issue at hand, and first, some facts. In early January, Alex Scroxton at the indispensable Computer Weekly wrote a superb overview of the British Library cyber incident. In the interests of brevity, the following points are the most important:

on the last weekend of October, the British Library fell victim to what it called, inevitably, a ‘cyber incident’, acknowledging disruption to its services;
serious disruption to services continued throughout November, with all the hallmarks of a ransomware attack (for the uninitiated, this is when a hacker locks you out of your network and demands payment to let you back in, normally via cryptocurrency);
towards the end of November the fact that this was ransomware was confirmed. A new(ish) criminal group calling itself Rhysida claimed the attack on their (so-called dark) webpage. In doing so they confirmed this event was also what is known as ‘double extortion ransomware’; that is when the demand for payment to decrypt the network is accompanied by a threat to release stolen data from the network, or sell it to other criminals, if the ransom isn’t paid;
Rhysida listed the ransom and the price of the stolen data set at 20 bitcoin. At the time, this was worth about £600,000. With presumably no ransom paid, and presumably no buyer (the data is worth far less than £600K to a criminal; the awful bluff that is data extortion is a subject for another day) Rhysida then dumped 573GB of British Library corporate data, including staff details, onto the dark web;
as 2023 gave way to 2024, the costs of the crisis both to the BL and its users became more and more apparent, as the disruption continued. The Financial Times reported that the BL would have to burn through nearly half its reserves to cover the costs, which, at an estimated £6m-£7m, were some ten times the demanded ransom. Meanwhile, The Guardian reported on the plight of authors who missed out on valuable royalties payable when their books were borrowed. Media coverage of what was a disaster for academic research went global;
the BL this week announced a partial restoration of the main catalogue, but in read-only, and therefore much less useful, form. So the crisis continues, but this is a significant mitigation for users.

Two key points flow from these events. The first is that it can safely be inferred that neither the BL nor anyone else paid the ransom (though no one has, to my knowledge, commented officially on this). If the ransom had been paid but the criminals had failed, for whatever reason, to restore access to the BL’s network we would know about that by now, one way or the other.

The second, and most important part of the whole story, is that for more than two and a half months this vital national resource has been essentially unusable. At the start of the crisis it seems that nothing at all worked: the basic staff computers, the phones, and even the public Wi-Fi for a bit. But the longer-term damage was caused by the total inaccessibility of the main BL catalogue, described by the BL’s own boss as “one of the most important datasets for researchers around the world” with its record of some 170 million items dating back centuries.

A particular problem is understood to be that most of the collection is stored in a giant facility belonging to the BL in West Yorkshire. Users are supposed to order from the catalogue and the item will be transported south in a few days. Without the catalogue, this became impossible. Whilst some workarounds could be done in the BL’s magnificent London headquarters, if the text you wanted was in Yorkshire, and it probably was, no one had any way of knowing where it was, and how to get it.

Although plenty of people have asked why this episode hasn’t received more national attention, it is clearly one of the worst cyber incidents in British history. So what are the lessons of it?

For me, there are three. None are new, but not all of them receive enough attention. And the last one needs to resonate thunderously throughout all organisations.

Lesson 1: The perpetrators are in Russia. They will likely never appear in a British court. We have to work within this reality

The British Library cyber crisis has nothing and everything to do with geopolitics. Nothing, in that the only motivation for it is money. Everything, in that the only reason it can happen with impunity is that the Rhysida group, like nearly all the major ransomware groups, is based in Russia.

It is well documented that the Russian state has no interest in shutting these groups down and putting the leaders in prison, providing they don’t harm Russian interests and cooperate with the state when required. It is against current Russian law for the state to extradite its own citizens (this must be the first time I’ve linked to Tass). So these people are almost certain never to appear in a British court, and very unlikely to face even a Russian one anytime soon.

But, like all comparable democracies, the British state is configured to treat this type of incident as an arrestable and prosecutable crime. “Cyber crime is just the same as other crime” is something I heard a lot from law enforcement colleagues in Government. But there is one crucial difference. For the first time in human history, it is possible to inflict sustained, large-scale criminal damage on another country without the perpetrator or a single accomplice setting foot in it.

We have consistently underestimated just how much cyber crime breaks our model of policing. In rule of law democracies, the contract between citizen and police is based in part on an assumption that when someone is a victim of crime, the police will pursue the perpetrator. And with cybercrime, there are some in the UK we can go after. And with the Russians, every so often some idiotic cyber criminal goes on holiday to a Western country, or contracts for a criminal service with someone in East London, and the police can do what police are supposed to do. But these are the exceptions.

What police forces are doing increasingly well - normally via multinational operations led by the FBI - is orchestrating takedowns of digital infrastructure used by the criminals. But these interventions, while welcome, are invariably whack-a-mole operations and the criminals reappear in another guise with new infrastructure.

Can anything be done? Things got so bad in 2021, with the attack on Colonial Pipeline in the US, alongside serious healthcare disruption in the US and Europe, that President Biden used his Geneva summit with Vladimir Putin in June of that year to demand Russia clamp down on the rampant ransomware crime emanating from its territory. For a brief period, this seemed to have some effect, with the somewhat theatrically broadcast arrest of the REvil gang, one of the most notorious groups.

But then came the invasion of Ukraine. A dictatorship willing to defy the White House over the invasion of a neighbour is unlikely to be swayed by American demands about criminals on its own territory. And a West that would support Ukraine but not take direct military action on its behalf is not going to take direct action against individuals protected within Russia’s vast borders. Both the Russian state and the criminals know that.

Therefore, the brief period when some of Russia’s ransomware thugs flew a bit too close to the sun and became a nuisance to the Kremlin is now over. All the evidence of 2023 suggests that the criminal safe haven has been fully restored. There will come a time in the future when Washington, London, Brussels and others can talk to Moscow about dealing with this scourge. But that time is not now, or soon.

It is of no benefit to pretend otherwise. Australia’s otherwise hugely impressive response to the disastrous theft by cyber criminals of more than a third of the population’s medical records - what I’ve called elsewhere (£) a masterclass in devaluing a stolen dataset to the criminal - provides a case in point. In a press conference in November 2022, the head of the Australian Federal Police claimed the identities of the hackers were known to the AFP and pledged to bring the perpetrators to justice in Canberra via cooperation with Russian law enforcement. Any reasonable Australian watching could have concluded the police thought they had a good chance of locking up the villains. But, as was widely predicted at the time, this has not happened, and there appears next to no chance that it ever will.

It is always hard for Governments and public authorities to admit they can’t do something, especially when the ‘thing’ is being able to catch and convict criminals who’ve laid waste to something that’s very important to lots of citizens. But the lesson from Australia, the British Library, and countless other ransomware crises is that normal policing doesn’t work in most of these cases because the suspects are safely holed up in Russia.

So we should stop pretending that conventional policing can do much about this, and look instead at other things we might be able to do. This article is long enough already without prescribing in detail what the approach should be: that is for another day. However, here are three starting points:

in the short term at least, serious policy needs to eschew basing our strategy on fantasies of “striking back” or “imposing costs” on criminals who just want to make money and currently shelter in the world’s largest safe house. Impose costs when we can: there are things we can do to harass and harry cyber criminals. But this will not be a strategic solution for as long as the Russia safe haven exists;
the question of ransom policy and law cannot be avoided forever. The UK’s de facto position is that state bodies like the British Library will never pay, but private entities can, with no questions asked (even if the Government pretends to discourage them). This stands in marked contrast to Britain’s uncompromising approach to terrorist kidnappings, where ransoms are never paid, whatever the (sometimes terrible) consequences. But the Government has yet to publish any analysis or evidence as to why it takes a hardline approach for kidnaps and a soft one for cybercrime. Indeed, it doesn’t really have a cyber ransom policy at all (a detailed look at the ransom policy question is an issue for another article, but policymakers must take a hard look at it);
nor does the state really have a counter-ransomware strategy. A rich seam of possible policy measures to explore has been provided by Parliament’s Joint Committee on National Security Strategy report of December 2023. The Government could do worse than start there.

Many of the reforms in that report have merit and deserve consideration. But the Committee’s overarching point is that countering ransomware needs serious political leadership and attention.

Lesson 2: The BL case perfectly exemplifies the sort of area where the UK is most vulnerable to cyber disruption

That sort of strategic review of our approach to ransomware requires us to look hard at our own national vulnerabilities. Here the BL crisis provides some valuable lessons.

Harm happens in cyberspace because we have a three decades-long legacy of weak security in our software, hardware and wider digital infrastructure. Famously, the Internet was not built with security in mind - and we’re plagued with poor incentives for providers and users to do anything about it. That is slowly changing, but it is improving much more for newer technologies than for our existing tech stack.

As all IT security professionals know, legacy systems in old organisations pose the hardest problems. There are no really transformative options until new systems come along. There are only mitigations. These mitigations require a lot of high quality technical and human resources. So they are expensive. They also require a lot of skilled people, as well as management attention and sponsorship. But it’s hard to explain the benefits of these measures to hard-pressed management facing many other pressures. And security reforms are often unpopular with staff and users because they add complexity to everyday work.

So it’s easy to see why some organisations are incentivised to take cyber security and resilience seriously, and some aren’t. Any service where public safety is at risk will invest heavily in security, safety and resilience, and test it all the time. The system and the organisation will probably be inspected. A regulatory licence to operate might well depend on that evaluation. Put simply, no one should ever do something where their physical safety is dependent only on a computer staying connected, and most regulatory systems rightly don’t allow this.

So, for example, when part of the UK’s National Air Traffic Control system failed (accidentally) last August, there was no risk to the safety of the planes already in the air because of the way much-tested backups work. Because the air traffic control system is such an obvious part of critical national infrastructure it is highly likely that Government agencies will pay attention to and assist with the cyber protection and resilience of the service. Similarly, in the private sector, banks invest heavily in cyber security capabilities and people because they know the risks of large scale financial loss are existential. Moreover, they can afford to. And the Government and regulator will want to help too, to avoid systemic risk within the financial system.

But consider the British Library in this context. It is a very important national institution, for sure. But if you’re tasked with identifying the most important national IT networks for protection against attack, the British Library will not get anywhere near the top of the list for attention. As we have seen, no one gets hurt or dies if the BL goes down. The health service will still function. So will the banks. The lights will still be on. People’s bills will still be accurate. The data of vulnerable populations will not have leaked. And so on.

As a cultural institution, the BL is important and famous. It is also a public body. It is not, however, a political or budgetary priority. Constrained by public sector budgets and salaries, it will find it hard to source the people and capabilities it needs for cyber security (the British Treasury was widely mocked for advertising for a head of cyber security with an annual salary of between £51,000 and £57,000 when the industry standard is multiples of that figure). It is hard to imagine the BL being able to pay more, or finding it easy to recruit cyber security professionals.

This matters, because it is within hundreds of networks like the one the BL depended on that serious national risk lies.

The history of cyber security is pockmarked with warnings of mass casualty digital apocalypses threatening civilisation as we know it. It turns out that’s the wrong problem: as the brilliant work of Lennart Maschmeyer has shown, hacking into, say, a power grid and depriving civilians of supply even for a short time via cyber means is possible, but it is painfully slow and hugely resource intensive for the aggressor. Moreover, for cyber security and other security reasons these systems are better protected than ‘normal business’ networks, and have manual or other backups. That’s why cyber attacks don’t directly kill people.

It turns out, however, that our more immediate cyber security problem is that by crippling these so-called ‘normal business’ networks an aggressor can hugely harm a society without that much effort. We now know you can shut down a crucial oil pipeline in the United States not by attacking the pipeline, but by shutting down the ordinary software systems that support its administration. It turns out you can cripple the entire healthcare system of a rich EU nation not by touching hospital equipment or systems but by locking out the network of the body that allocates doctors’ appointments and schedules surgeries. And it turns out that you can bring part of the British academic sector to a crashing halt by taking a massive library catalogue offline.

So what else can an aggressor do to networks that don’t look to be of ‘strategic’ importance? That is the question we should be asking ourselves in the light of the BL fiasco. We should then be moving resources, expertise and monitoring accordingly as best we can. We also need to think about how we better incentivise the leaders of these organisations to improve basic security and resilience, because ransomware attacks are not, in general, sophisticated.

This is an election year in the UK, and after the votes are counted we can expect someone to try to form a stable administration with a five year horizon. Much is made of short-termism in politics, but we have to work with the world as it is, not as we’d like it to be. In that spirit, here are two planning assumptions on national cyber risk for the next five-year Parliament:

a devastating, highly sophisticated, threat-to-life cyber attack against the UK in the next five years is unlikely, and if it happens, its impact will be mitigated so long as we continue to ensure that safety-critical systems are not wholly dependent on computer networks;
by way of contrast, serious economic and social disruption, including an incident that could threaten public order or safety arising from a cyber operation (the disruption of healthcare administration, the criminal justice system, or food or oil distribution being some examples) is very likely. Indeed, an incident of the severity of the BL attack is likely in each of the next five years.

This lesson of national vulnerability from the BL case, and these assumptions, would make a good starting point for the sort of serious discussion about ransomware that is urgently needed.

Lesson 3: Organisations, whether public or private, must be able to recover far more quickly than the BL did

And the one thing above all else that would make a difference to the problem is finding a way of forcing organisations to be able to recover more quickly than the British Library did.

To understand why, we have to divert back briefly to ransoms. As noted earlier, the British state doesn’t pay ransoms, and most other Governments don’t. Throughout this crisis the Government did not come under any serious pressure to pay (unlike, for example, the Irish Government during the healthcare cyber crisis of 2021 because of the huge impact on health services).

The private sector is another matter. Because there is no requirement in most jurisdictions, including the UK, to report when a ransom has been paid, there are no reliable figures for how many organisations pay (the cyber security company Coveware has made as decent a fist as any of tracking trends over time, and the latest figures show a significant decrease to fewer than half of organisations in 2022, down from seven out of every eight a few years earlier).

The blunt reason why the private sector often pays, but governments hardly ever do, is that Governments can throw far more resources and support at recovery. That was certainly the case in Ireland, where the military and a number of major cyber security companies were deployed with no expense spared. Private companies cannot afford to surge in capabilities like this, and they can’t call in the Army. And unlike the state, they can go bankrupt.

So for private organisations, paying can be more effective than not paying. In this case, the cost to the BL was far more than the ransom. This is not always the case: a BBC File on Four documentary in 2021 tracked the impressive response of the Harris Federation of London, a major schools provider. They held their nerve and the overall cost to them was less than the ransom demanded. And paying does not mean avoidance of harm: Colonial Pipeline paid the ransom, but the pipeline was still out for several days.

But Governments do not want to pay ransoms and, certainly in Britain, it is unlikely that taxpayers want them to. The crucial point is that not paying the ransom only works if the organisation can recover quickly.

The heroic efforts of Irish healthcare workers, and IT professionals from the civil service, the military and the private sector, got the system back up and running to some sort of basically acceptable level in about the same amount of time as it takes a victim who paid to recover. Similarly, the Joint Committee on National Security Strategy heard from the leader of Redcar and Cleveland about how staff from the National Cyber Security Centre slept in the council’s offices during a ransomware crisis to ensure that the system dealing with the cases of at-risk children was recovered quickly.

Moreover, we need to ask ourselves: what if there is no ransom? What if a hostile hacker working for a nation state does exactly the same thing as a ransomware attacker, but the objective is to damage the UK by destroying the network, rather than to extort money by temporarily locking it?

In such a scenario, recovery is the only option. Ransomware highlights our digital vulnerabilities to others who have motives even worse and more strategically damaging than the criminals. And if there is no effective system backup that can easily be deployed, or no way of restoring the old system in some way - in other words, if there’s no way of recovering quickly - then we’re stuffed. Recovery capability is paramount for national security.

Here, the obvious point to make about the British Library is that it has taken - and is taking - an inordinately long time for its catalogue, one of its most important services, to be restored. That’s even, presumably, with help from Government experts and others. Of all the high profile ransomware cases throughout the world, it is hard to think of many that have dragged on for this long with this degree of severity.

This slowness to recover is the most painful and most important lesson from the British Library cyber incident.

There are, no doubt, very good specific reasons for it. A 170 million item catalogue is bound to be very complicated. A replica backup would no doubt be very expensive and hard to maintain. (And, as stated at the start, this analysis implies no criticism of those working round the clock and over Christmas to try to get services back up and running; it is impossible to retrofit a solution that did not exist before the crisis).

But faced with the likelihood and potency of this threat to myriad public and private entities, we must no longer accept a situation where important national organisations, public or private, cannot withstand the loss of their enterprise computer network for such a long period of time. If we tolerate this, the likely consequences in terms of economic and social disruption will prove intolerable. Planning for the loss of a key network, and being able to recover quickly from it, needs to be a core part of good public and corporate governance that every organisation models and practices.

The way to get to this point is not to indulge in the classic British tradition of holding a what-went-wrong-and-who-can-we-hang-out-to-dry inquiry. This is not the Post Office IT scandal. There is not a single allegation of malice, bad faith or wilful negligence. Instead, an organisation with a reputation for being well-run and held in high public esteem found itself without the systems and plans in place to recover from being the victim of criminals. It deserves sympathy and support.

But we have to figure out why. What constraints were there (and what incentives weren’t) that prevented this otherwise capable organisation from protecting itself and recovering quickly? Where else is this a risk? And what can be done about it?

The American answer to this conundrum has been to establish a Cyber Safety Review Board, based on the successful model in aviation safety. The aim is not to hunt for blame but to look at the rational explanations for why things went wrong and make constructive recommendations to address them. Such an approach could work in this case. The last thing we need are hours of theatrical hearings in a courtroom or committee room of Parliament, with exhausted witnesses defensive and humiliated. That makes for good TV and terrible public policy.

Summary

The UK has, by and large, suffered less major harm from ransomware than most comparable nations. But the British Library case is a warning. The critical lessons of it are:

ransomware is now a national security issue, likely to cause significant and possibly dangerous disruption in the near future;
this requires a strategic national response, which, due to the Russia safe haven problem, has to be predicated on policing not being able to do what it normally does to deter and punish crime;
there are a bunch of useful policy mitigations on the table, which need to be brought together in a coherent way. This has to include a thought-through, publicly articulated posture on ransoms;
but once an attack gets through, the one thing that matters above all else is the ability of the victim organisation to recover quickly. All organisations, whether public or private, need to test their ability to withstand the loss of a key network and show that they can recover at least partially within an acceptable amount of time.

This work is not easy. But it is vital, and urgent. And it is doable, with the right focus and leadership. Otherwise, in the well-chosen title of the Parliamentary report, national security is a hostage to fortune.

Alan Rew

Jan 22, 2024

You say

"A 170 million item catalogue is bound to be very complicated. A replica backup would no doubt be very expensive and hard to maintain"

The process of backing up & restoring large databases is a very old problem solved long ago. I don't know which database software the BL uses, but any database system worth its salt incorporates both full & incremental backups (& the ability to restore them) as a standard feature. I worked on developing such software in the early 1980s.

Backups are not "hard to maintain". The underlying complexity of the process should be hidden from the user by the database software. If backups are hard to maintain that suggests either poor choice of software or inadequate training of IT staff.

I hope that the BL now ensures that all its data is backed up in a way that enables it to be restored quickly in future.

I do sympathise with BL staff. It's not their fault if the systems provided to them are vulnerable or difficult to restore. Government needs to take BL IT infrastructure more seriously. It doesn't help that most government ministers appear to be IT illiterate.


Stephen Greig

Feb 6, 2024

I am very inadequately qualified in computing to pontificate on the cause of this disaster but my immediate thought was why wasn't the catalogue backed up? 170 million records seems a very small number. I have been playing around with about 2 million records of plant names on my very cheap (£500 ish) computer. I would very much like to read a more technical discussion (but not too technical or I won't understand it!) rather than broad brush political observations. My next immediate thought is why isn't the catalogue in the public domain anyway?? Surely if it is a catalogue us plebs wouldn't benefit that much from being able to see it and we might be able to send the BL a copy next time they lose it.


Ciaran's Crispy Cogitations
