The OT Security Revolution

From IT security inspiration to OT security innovation, we are living in the OT security revolution.

In recent years the way people are thinking about OT security has changed forever and continues to change – in part because of exponential growth in the number of cyberattacks that cause facility shutdowns.

In the first decade of the ICS / OT security discipline, practitioners took inspiration from IT security, because that’s all we had. Back then, the question was “how much like IT security can we make our industrial / OT security program?”

Today, equipped with improved tools and knowledge, the OT security mindset is shifting to ask different questions to achieve the necessary level of connectivity and security demanded by industrial enterprises and indeed by society at large.

4 Key Questions you should be asking if you are working in OT security in 2023

1. How to be protected from data?

2. What is the worst that can happen?

3. How to access OT data without exposing OT systems?

4. How to get engineering-grade security?

Digging deeper…

1. How to be Protected from Data?

In the beginning, the job was to protect the information – the confidentiality, integrity and availability of information (CIA). But – in the industrial space we recognized even in the beginning that this wasn’t the right goal. So we tweaked it a little – we re-prioritized. We said maybe the priorities should be AIC for reliability-critical systems, or even IAC for safety-critical systems?

How did we protect data? Encryption! Only – that didn’t work so well. Modern attacks arrive inside encrypted and authenticated connections. We pull malware, for example, from our email servers inside encrypted connections, don’t we? Malware that gains a foothold on an OT network beacons out to a command-and-control center (C2) over an encrypted connection, and attack information comes back into the OT network inside that connection.

IT networks somewhere in the world are compromised every day by ransomware, most of which arrived from the Internet, inside of TLS-encrypted connections. In short, modern attacks exploit permissions more often than they exploit vulnerabilities. For details, check out our full video explaining the propagation of modern malware.

Today, the world of OT security no longer asks, “how can I protect the information?” but rather asks, “how can I protect safe, reliable and efficient physical operations from information?” Because after all, all cyber-sabotage attacks are information.

The only way a power plant, passenger rail switching system, or petrochemical pipeline can change from an uncompromised to a compromised state is if attack information enters the system. And in almost all physical operations, our goal is safe, reliable, and efficient operations, almost always in that order.

Steps to getting protection from information?

How do we get protection from information? We start with an inventory – a list of all of the ways that any information can enter our control systems. A comprehensive list of all possible incoming information flows is also a comprehensive inventory of all possible attack vectors.

The good news? The list is almost always small – almost always fewer than two dozen vectors. With that list in hand, we can set about systematically controlling as many of those information flows and attack vectors as we can – preferably with physical, engineering-grade mitigations, not “hopeful” cybersecurity mitigations – see “How to get engineering-grade security?” below.
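To make the inventory idea concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the InboundFlow type, the three control labels, and the example entries are hypothetical, not a recommended or complete list for any real site.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InboundFlow:
    """One way that information can enter the control system."""
    name: str
    # Assumed labels: "engineering-grade", "software", or "none".
    control: str


# Hypothetical inventory entries, for illustration only.
inventory = [
    InboundFlow("Connection from the IT network", "engineering-grade"),
    InboundFlow("Removable media (USB drives)", "none"),
    InboundFlow("Vendor remote access", "software"),
    InboundFlow("Technician laptops", "none"),
    InboundFlow("Software and firmware updates", "software"),
]


def still_open(flows):
    """Return flows not yet controlled by an engineering-grade mitigation."""
    return [f for f in flows if f.control != "engineering-grade"]


open_vectors = still_open(inventory)
for flow in open_vectors:
    print(f"{flow.name}: current control is {flow.control}")
```

Because every incoming flow is also a possible attack vector, the same short list doubles as a work list: each entry that still_open returns is a vector waiting for a stronger control.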

2. What is the worst that can happen?

In the beginning, we modelled risk as (consequence x likelihood), and we spent a lot of time trying to figure out how likely a successful attack on our OT systems might be.

Today, we recognize that worst-case consequences of compromise, not likelihood, determine network criticality. The degree of protection we must provide a network is determined by the worst-case consequences of compromising it:

Power plant DCS networks and high-voltage substation SCADA networks are reliability-critical – worst-case consequences of compromise are generally unacceptable because they put public safety at risk in the case of a prolonged lack of electric power.
Railway switching system and Safety Instrumented System (SIS) networks are safety-critical – worst-case consequences of compromise are unacceptable, because such compromise risks worker casualties or even public-safety / mass-casualty incidents.
Business networks are business-critical – worst-case consequences of compromise tend to be acceptable – lawsuits for leaking personally-identifiable information (PII) and restore-from-backup clean-up costs.

What about in-between cases? What are worst-case consequences of compromise for a small shoe factory? Well, if the machines and robots on the production line all have manual interlocks to prevent injury when technicians crawl into the machines to repair them, then the worst-case consequences of cyber compromise might well be only business-critical.
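The classification described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Criticality levels follow the three categories named in the text, but their numeric ordering, the consequence labels, and the SEVERITY mapping are hypothetical. The point to notice is that likelihood is not an input anywhere.

```python
from enum import IntEnum


class Criticality(IntEnum):
    # Ordering is an assumption made for this sketch.
    BUSINESS = 1
    RELIABILITY = 2
    SAFETY = 3


# Hypothetical mapping from a worst-case consequence to a criticality level.
SEVERITY = {
    "PII lawsuit": Criticality.BUSINESS,
    "restore-from-backup costs": Criticality.BUSINESS,
    "lost production": Criticality.BUSINESS,
    "prolonged power outage": Criticality.RELIABILITY,
    "worker casualties": Criticality.SAFETY,
}


def classify(worst_case_consequences):
    """A network is as critical as its most severe worst-case consequence."""
    return max(
        (SEVERITY[c] for c in worst_case_consequences),
        default=Criticality.BUSINESS,
    )


shoe_factory = classify(["lost production", "restore-from-backup costs"])
substation = classify(["prolonged power outage", "restore-from-backup costs"])
rail_switching = classify(["lost production", "worker casualties"])

print(shoe_factory.name, substation.name, rail_switching.name)
```

In this sketch the shoe factory with manual interlocks comes out business-critical because none of its listed consequences rises above the business level. Add "worker casualties" to its list and the classification changes, no matter how unlikely anyone judges the attack to be.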

How do we use this knowledge?

When consequences are business-critical all the way through, then we can use IT-style cybersecurity programs all the way through. When consequences differ, though, for example for safety-critical passenger rail switching systems, things get interesting.

Engineering Change Control (ECC) is the discipline by which safety-critical and reliability-critical networks are managed. Every change to any system or component poses a risk, and the task of ECC is to evaluate, address and manage those risks.

Security updates are changes and are therefore risks. By deploying security updates to reduce the risk of downtime or malfunction due to attacks exploiting known defects, we increase the risk of downtime or malfunction due to changed software. The cure can be worse than the disease. Even anti-virus signature updates are a risk – a risk that the update malfunctions and we quarantine enough of the control system to cause unacceptable consequences.

And once we understand consequences, consequence boundaries become apparent. The IT/OT interface is most often the most important such boundary – a connection between two networks with very different criticalities, very different worst-case consequences of compromise.
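Once each network has a criticality, finding consequence boundaries is mechanical: a boundary is any connection whose two ends have different criticalities. The sketch below is a hypothetical illustration; the network names, their labels, and the links between them are invented for the example.

```python
# Hypothetical networks and their assumed criticality labels.
criticality = {
    "enterprise IT": "business",
    "plant DMZ": "business",
    "control network": "reliability",
    "safety system": "safety",
}

# Hypothetical connections between those networks.
links = [
    ("enterprise IT", "plant DMZ"),
    ("plant DMZ", "control network"),
    ("control network", "safety system"),
]


def consequence_boundaries(criticality, links):
    """Return the links that join networks of different criticality."""
    return [(a, b) for a, b in links if criticality[a] != criticality[b]]


boundaries = consequence_boundaries(criticality, links)
for a, b in boundaries:
    print(f"{a} ({criticality[a]}) <-> {b} ({criticality[b]})")
```

Here the link between the two business-critical networks is not a boundary, while the link into the control network, the IT/OT interface in this toy layout, is.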

3. How to access OT data without exposing OT systems

Once we understand consequence boundaries (see above), we need to understand how to secure those boundaries. The OT security revolution recognizes that continued increases in efficiency are important.

To increase efficiencies, we deploy automation and computers – an ever-increasing number of targets for cyber attacks. And data in motion is the lifeblood of modern automation – both industrial automation and business automation. But – every information flow is also an attack opportunity, because all cyber-sabotage attacks are information.

To enable the ever-increasing number of targets to carry out modern automation, we need to connect those targets, increasing the number of opportunities to attack those targets. Neither of these trends is disappearing any time soon. The OT security imperative is only going to increase in the decades ahead.

How to provide business automation with necessary OT data without providing attackers access to OT systems?

Well, most people answer “Firewalls!”, assuming that firewalls provide access to industrial data, while still protecting industrial systems / targets. This is not true. Firewalls forward some network packets, while dropping others. If we send a polite query through a firewall into an industrial system, we get back the OT data that we need for our business automation. If we send an impolite query through, we are attacking the OT network right through the firewall.

To launch our attack packets through the firewall, we need only disguise our attacks enough so that the firewall does not recognize the packet as an attack – and forwards the attack packet. Firewalls provide access to systems, not to data.
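A toy model shows why. The sketch below assumes a deliberately simplified firewall that decides only on source, destination, and port. Real firewalls can inspect more than this, and the host names, payload strings, and the single allow rule are hypothetical. Port 502 is used only because it is the registered port for Modbus/TCP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    dst_port: int
    payload: str


# Hypothetical rule: the IT historian may reach the OT server on port 502.
ALLOW = {("it-historian", "ot-server", 502)}


def firewall_forwards(pkt):
    """Simplified filter: the decision never looks at the payload."""
    return (pkt.src, pkt.dst, pkt.dst_port) in ALLOW


polite = Packet("it-historian", "ot-server", 502, "read tank level")
impolite = Packet("it-historian", "ot-server", 502, "write unsafe setpoint")
stranger = Packet("unknown-host", "ot-server", 502, "read tank level")

for pkt in (polite, impolite, stranger):
    verdict = "forwarded" if firewall_forwards(pkt) else "dropped"
    print(f"{pkt.src} -> {pkt.dst}:{pkt.dst_port} [{pkt.payload}] {verdict}")
```

The stranger is dropped, but the polite and impolite packets are treated identically, because the rule grants a path to the OT server rather than to a piece of data. Deeper inspection narrows the gap without closing it, since the attacker needs only a packet that the inspection does not recognize as an attack.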

Does this mean firewalls are useless?

Of course not – far from it. Firewalls play important roles inside our reliability-critical and safety-critical networks. Firewalls play important roles in our enterprise networks as well.

Where firewalls are problematic is at consequence boundaries – at the connection between networks that we must secure to very different standards, and that we must manage in very different ways.

At consequence boundaries, modern standards such as ANSSI’s critical infrastructure standards and Israel’s critical infrastructure standards demand something stronger than firewalls. Even NERC CIP encourages stronger-than-firewall protections at consequence boundaries, even though the standard does not require such protections.

The imperative to increase automation will never go away. At consequence boundaries, we need to provide access to the OT data that is vital to business automation, without providing access to OT systems, so that Internet-exposed business-critical networks cannot propagate attacks into those OT systems. Check out our guide to Firewalls vs Unidirectional Gateways.

4. How to get engineering-grade security

The Detect, Respond and Recover pillars of the NIST Cybersecurity Framework (NIST CSF) are seen by many practitioners as the pinnacle of enterprise security programs. Practitioners recognize that compromise is inevitable on Internet-exposed networks, and so enterprise security teams set up people, process, and technology (PPT) to systematically detect compromised equipment, scramble practiced incident response teams to deal with that compromise, eventually restoring compromised equipment from backups and recovering normal operations.

Reliability-critical and safety-critical sites generally deploy the Detect, Respond and Recover pillars of the NIST CSF as well, but those pillars mean something very different in networks where worst-case consequences of compromise are unacceptable.

In control-critical networks, the Detect pillar means we hope we can detect cyber attacks before we suffer unacceptable consequences. Then we hope we can scramble our practiced incident response teams in time, and that those teams can identify attacks in progress as real and not false alarms, in time to prevent unacceptable consequences. And when we’ve identified the compromised machines, we hope we can restore correct functionality, again before we suffer unacceptable consequences. Because – we cannot restore human lives, damaged equipment, or lost production “from backups.”

Hope?

Would you drive across a bridge every day if you knew that the design engineer for the bridge “hoped” that the bridge would stand up to the specified load, for the specified number of decades? Design engineers don’t “hope,” they design bridges according to engineering best practices – practices rooted in a deep mathematical and physical understanding of how materials and stresses work. Engineering solutions are deterministic – build the bridge the same way every time, and it will stand up to the same load every time.

The OT security revolution is starting to recognize that there are powerful engineering solutions for managing cyber risks to physical operations – solutions that do not exist in the enterprise space. For instance, if your life depended on a 4-story boiler not blowing up in your face because a cyber attack mis-operated the boiler, would you prefer a spring-loaded over-pressure valve on the boiler to address that risk? Or would you prefer a longer password on the PLC operating the boiler?

I’d prefer the valve. The valve operates deterministically, with mathematically predictable failure modes. And safety engineers have been designing these kinds of physical mitigations into physical processes for generations. But – where is the over-pressure valve in the NIST CSF? It’s not there – enterprise security techniques are blind to engineering-grade solutions for protecting physical operations from cyber risks.

A new discipline - security engineering

A new discipline is emerging – cybersecurity engineering. This is a body of knowledge including deterministic, engineering-grade mitigations drawn from process (safety) engineering, automation (protection) engineering and network engineering disciplines, among others. If you would like to dig deeper, we have a report with a lot more detail about OT security engineering.

The OT Security Revolution

The revolution continues. We need to protect safe and reliable physical operations from attack information that might enter our systems, more than we need to protect the CIA, AIC or IAC of the information itself. We classify our networks by worst-case consequences of cyber compromise, not by attack “likelihood.” We need our network designs to provide access for our business systems – access to OT data that comes from our safety-critical and reliability-critical networks, without providing access to OT systems in those same networks. This means that firewalls are not enough – not at criticality boundaries. And we need our network and automation designs to stand up to the threat “load” these systems will face in the years and decades ahead. We need deterministic, engineering-grade solutions to enable those systems to stand up to the load, not “hope.”

The OT security revolution continues. The next step in the revolution is on its way – watch this space and sign up to be the first to know about our latest innovation in OT security – the WF-600 family of products.

Author

Andrew Ginter

Andrew Ginter is Waterfall’s Vice President of Industrial Security. He holds B.Sc. of Applied Mathematics and M.Sc. of Computer Science degrees from the University of Calgary, as well as ISP, ITCP, and CISSP accreditations.