Securing Azure datacenters with continuous IoT/OT monitoring

February 22, 2021 TH Author

Figure 1: Industrial cooling system for datacenters.

As more intelligent devices and machinery become connected to the internet, Operational Technology (OT) and the Internet of Things (IoT) have become part of your enterprise network infrastructure—and a growing security risk. With every new factory sensor, wind turbine monitoring device, or smart building, the attack surface grows. Analysts estimate that there will be 37 billion industrial IoT (IIoT) devices by 2025. Even more alarming for business leaders, Gartner predicts that 75 percent of CEOs will be personally liable for cyber-physical incidents by 2024.

We’ve spent 15 to 20 years adding layers of telemetry and monitoring for IT security. However, most chief information security officers (CISOs) and security operations center (SOC) teams have little or no visibility into their OT risk. It’s clear that a new approach is needed, one that includes IoT and OT-specific incident response and best practices for bringing the two teams together to defend against increasingly sophisticated cyber threats.

A changing threat landscape

In every area of our lives, cyber-physical systems (CPS) go mostly unseen as they quietly monitor building automation, industrial robots, gas pipelines, HVAC systems, turbines, automated warehousing and logistics systems, and other industrial systems. In the past, OT risk was minimized through “air-gapping,” meaning that a physical divide was maintained between OT and IT networks. But digital transformation has disrupted all that. Now devices in the warehouse, refinery, and factory floor are connected directly to corporate IT networks and often to the internet.

Microsoft offers end-to-end IoT security solutions for new, or “greenfield,” IoT deployments, but most of today’s IoT and OT devices are still considered “unmanaged” because they’re not provisioned, tracked in a configuration management database (CMDB), or consistently monitored. These devices typically don’t support agents and lack built-in security such as strong credentials and automated patching—making them soft targets for adversaries looking to pivot deeper into corporate networks.

For OT security, the key priorities are safety and availability. Production facilities need to be up and running to keep generating revenue. However, beyond revenue losses, there’s a risk for catastrophic damage and possible loss of life when OT systems are breached. And like IT attacks, an OT breach also poses a risk for theft of intellectual property (IP). According to the Verizon Data Breach Investigations Report (DBIR), manufacturers are eight times more likely to be breached for theft of IP. OT security translates directly into three main types of business risks:

Revenue impact: In 2017, WannaCry malware shut down major automotive manufacturers and affected more than 200,000 computers across 150 countries, with damages ranging into billions of dollars. The same year, NotPetya ransomware nearly shut down the mighty Maersk shipping company and several consumer packaged goods (CPG) companies, and it crippled Merck’s production facilities, resulting in losses of $1.3 billion. In 2019, LockerGoga shut down the systems of Norwegian aluminum manufacturing company Norsk Hydro and several other plants. In 2020, Ekans (“snake” spelled backward) ransomware became the latest OT threat by specifically shutting down industrial control systems (ICS).
IP theft: IP includes proprietary manufacturing processes, formulas, designs, and more. In one instance, Microsoft Security Response Center (MSRC) discovered hackers were compromising vulnerable IoT devices using their default credentials. Once inside, the hackers scanned the network to see what other systems they could access to get sensitive IP. One in five North American-based corporations reports having had IP stolen within the last year.
Safety risks: The Triton attack on a petrochemical facility targeted safety controllers with the intent to cause major structural damage and possible loss of life. The attackers gained a foothold in the IT network, then used living-off-the-land (LOTL) tactics to gain remote access to the OT network, where they deployed their purpose-built malware. As this attack demonstrated, increased connectivity between IT and OT networks gives adversaries new avenues of attack for compromising unmanaged OT devices.

The U.S. Cybersecurity and Infrastructure Security Agency (CISA) reports that adversaries are still using many of the tactics seen in the Triton cyberattack to compromise embedded devices in OT systems. CISA has issued three basic recommendations for securing OT:

Create an up-to-date, detailed inventory and map of your OT network.
Use the asset inventory or map to prioritize risks, such as unpatched systems, unauthorized connections between subnets, or unauthorized connections to the internet.
Implement continuous monitoring with anomaly detection.
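
As a rough illustration of the second recommendation, the sketch below ranks a small asset inventory by risk. Everything in it is hypothetical: the `Asset` fields, the `ALLOWED_LINKS` allow-list, the device names, and the score weights are assumptions chosen for the example, not values from CISA or from any Microsoft product.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    name: str
    subnet: str
    patched: bool
    internet_facing: bool
    peers: set = field(default_factory=set)  # subnets this asset was seen talking to


# Hypothetical allow-list of permitted (source subnet, destination subnet) links.
ALLOWED_LINKS = {("ot-cooling", "ot-power")}


def risk_score(asset, allowed_links=ALLOWED_LINKS):
    """Return an illustrative score; the weights are arbitrary assumptions."""
    score = 0
    if not asset.patched:
        score += 2  # unpatched system
    if asset.internet_facing:
        score += 3  # connection to the internet
    for peer in asset.peers:
        # Cross-subnet traffic that is not on the allow-list counts as unauthorized.
        if peer != asset.subnet and (asset.subnet, peer) not in allowed_links:
            score += 2
    return score


def prioritize(assets):
    """Sort assets from highest to lowest risk score."""
    return sorted(assets, key=risk_score, reverse=True)


inventory = [
    Asset("chiller-plc-01", "ot-cooling", patched=False, internet_facing=False,
          peers={"ot-power"}),
    Asset("badge-reader-07", "ot-physical", patched=True, internet_facing=True,
          peers={"it-corp"}),
    Asset("hmi-03", "ot-cooling", patched=True, internet_facing=False),
]

ranked = prioritize(inventory)
for asset in ranked:
    print(asset.name, risk_score(asset))
```

In this toy inventory the internet-facing badge reader with an unapproved link to the corporate subnet ranks first, ahead of the unpatched PLC. A real deployment would derive both the inventory and the weights from its own risk model.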

Azure datacenters—a strategic resource

Through our cloud, Microsoft serves more than a billion customers and more than 20 million businesses across 60 regions worldwide. Today we help secure more than 400,000 customers across 120 countries. These range from small businesses to large enterprises, with 90 of the Fortune 100 using four or more of our security, compliance, identity, and management solutions. Our SOCs process 8 trillion global signals daily.

Datacenters are the building blocks of the cloud, and Microsoft has been building datacenters for more than 30 years. Microsoft datacenters are complex industrial-scale facilities sitting at the intersection of operational technologies (OT) and information technologies (IT). They include industrial control systems managing climate, power, and water; physical security systems; diverse Microsoft and non-Microsoft personnel managing the servers and equipment; various networks including LAN, WAN, and WiFi; and diverse software tools. IT security solutions alone are insufficient to secure datacenters because OT systems have a long lifespan, implement network segregation, rely on proprietary protocols, and cannot always be patched without disrupting operations, which can lead to safety risks.

Figure 2: Microsoft datacenters.

The biggest risks in securing complex, heterogeneous datacenter environments and generations are the lack of visibility into the full datacenter stack and the lack of incident-response (IR) plans and playbooks that span OT and IT. To address this, we have implemented an end-to-end security monitoring system using Azure Defender for IoT and Azure Sentinel while integrating with Microsoft’s central SOC.

To strengthen its datacenters’ operational resiliency worldwide, Microsoft’s Azure datacenter security team selected CyberX’s purpose-built IoT and OT cybersecurity platform in mid-2019. Microsoft subsequently acquired CyberX in June 2020 and recently released Azure Defender for IoT, which is based on CyberX’s agentless security platform.

Incorporating IoT and OT-aware behavioral analytics and threat intelligence, Azure Defender for IoT delivers continuous IoT and OT asset discovery, vulnerability management, and threat detection. As a Network Detection and Response (NDR) platform that uses passive monitoring and Network Traffic Analysis (NTA), it has zero performance impact on the OT network.
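
To make the idea of passive asset discovery concrete, here is a minimal sketch that builds an inventory purely from observed flow records, without sending any packets to the devices. It is not how Azure Defender for IoT works internally; the `discover_assets` function, the record format, and the IP addresses are assumptions for illustration. The port numbers are the standard ones for the named industrial protocols.

```python
from collections import defaultdict

# Standard ports for common industrial protocols.
OT_PORTS = {
    102: "S7comm",
    502: "Modbus/TCP",
    20000: "DNP3",
    44818: "EtherNet/IP",
    47808: "BACnet/IP",
}


def discover_assets(flows):
    """Build an inventory from (src_ip, dst_ip, dst_port) records.

    The records are assumed to come from a mirror (SPAN) port or a tap,
    so the monitored network never receives traffic from the sensor.
    """
    assets = defaultdict(lambda: {"serves": set(), "talks_to": set()})
    for src, dst, port in flows:
        assets[src]["talks_to"].add(dst)
        entry = assets[dst]  # touching the key records the destination as an asset
        proto = OT_PORTS.get(port)
        if proto is not None:
            entry["serves"].add(proto)
    return dict(assets)


# Hypothetical flow records for the example.
flows = [
    ("10.0.1.5", "10.0.2.10", 502),
    ("10.0.1.5", "10.0.2.11", 47808),
    ("10.0.1.6", "10.0.2.10", 502),
    ("10.0.1.5", "10.0.3.1", 443),
]

assets = discover_assets(flows)
for ip, info in sorted(assets.items()):
    print(ip, sorted(info["serves"]), sorted(info["talks_to"]))
```

Because the sketch only reads copies of traffic, it adds no load to the devices it inventories, which is the property the paragraph above attributes to passive monitoring.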

Azure Defender for IoT is now deeply integrated with Azure Sentinel and is available for on-premises, Azure-connected, and hybrid environments. By using both Azure Defender for IoT and Azure Sentinel as a unified, end-to-end IT and OT security solution, the Azure datacenter security team has been able to reduce complexity and prevent gaps that can lead to vulnerabilities.

Figure 3: Microsoft datacenters: Ingestion, detection, and investigation.

How it works

Azure Sentinel processes alerts from both IT and OT. This includes alerts from Azure Defender for IoT for OT devices such as HMIs, PLCs, biometrics, and badge readers, as well as alerts for IT devices such as physical hosts, firewalls, virtual machines, routers, and more. All information is integrated with our incident-response system and our central SOC (including OT and IT playbooks), where machine learning reduces false positives and makes our alerts richer. This creates a feedback loop with Azure Sentinel, which further refines and improves our alerting capabilities.

Microsoft datacenter security monitoring and response:

Improves the quality of critical environment inventory for risk-based analysis.
Correlates significant security events across multiple sources.
Advances detections across industrial control system (ICS) networks for known malware, botnet, and command/control traffic.
Enables machine learning support for insider threat-detection via user and entity behavior analytics (UEBA).
Deploys OT and IT incident-response playbooks using Azure Logic Apps integrated with Microsoft SOC. For example, we implement OT and IT playbooks for scenarios like ransomware or malware, botnet, insider threat, and untracked data-bearing devices.
Detects anomalous activity while reducing noise.
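
The correlation and playbook steps in the list above can be sketched in a few lines. This is a simplified illustration, not the Azure Sentinel or Logic Apps API: the `Alert` fields, the shared `entity` key, the 15-minute window, and the playbook names are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Alert:
    source: str    # "IT" or "OT"
    entity: str    # hypothetical shared key, such as a site or host identifier
    category: str
    time: datetime


# Hypothetical mapping from alert category to playbook name.
PLAYBOOKS = {
    "ransomware": "playbook-ransomware",
    "botnet": "playbook-botnet",
    "insider": "playbook-insider-threat",
    "untracked-device": "playbook-untracked-device",
}


def correlate(alerts, window=timedelta(minutes=15)):
    """Pair IT and OT alerts on the same entity that occur within `window`."""
    it_alerts = [a for a in alerts if a.source == "IT"]
    ot_alerts = [a for a in alerts if a.source == "OT"]
    return [
        (i, o)
        for i in it_alerts
        for o in ot_alerts
        if i.entity == o.entity and abs(i.time - o.time) <= window
    ]


def pick_playbook(alert):
    """Choose a playbook by category, falling back to manual triage."""
    return PLAYBOOKS.get(alert.category, "playbook-triage")


t0 = datetime(2021, 2, 22, 9, 0)
alerts = [
    Alert("IT", "dc-east-07", "ransomware", t0),
    Alert("OT", "dc-east-07", "ransomware", t0 + timedelta(minutes=10)),
    Alert("OT", "dc-west-02", "botnet", t0 + timedelta(minutes=5)),
]

pairs = correlate(alerts)
for it_alert, ot_alert in pairs:
    print(it_alert.entity, pick_playbook(ot_alert))
```

Only the two alerts on the same entity within the window are paired. The unmatched OT alert would still be handled on its own, just without the added IT context.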

In addition, the Microsoft cloud security stack, including the Microsoft Threat Intelligence Center (MSTIC), is being expanded with OT capabilities and threat intelligence.

OT and IT: Bridging the cultural divide

OT and IT have traditionally worked on separate sides of the air gap as laid out in the Purdue Model. But as I mentioned at the top, that physical divide has vanished into the cloud. Thinking in terms of an IT and OT persona that enables both teams to collaborate seamlessly is the security challenge for our time. Here are a few insights that can help bridge the gap:

Mature and boost IT security practices for OT: Patching an OT system isn’t the same as updating IT; there can be dangerous repercussions in the form of factory downtime or safety risks. Empathy is important; the liberties enjoyed in the IT world can’t be blindly applied to OT. However, don’t throw away IT security best practices; boost them with OT capabilities.
Embrace the security journey: Whether you’re in OT or IT, security improvements move like a dial, not a switch. Agree on your guiding principles and tenets, then constantly improve collaboration between OT and IT teams.
Understand the OT persona: IT teams should get to know what a day in the life of an OT person looks like. Our team shadowed OT activity by making site visits, which helped build understanding and establish working relationships.
Appreciate the other team’s priorities: When working with OT, this means understanding the importance of safety and availability. What might be a simple system patch in IT could cause downtime or a safety issue in OT. Establish a common vocabulary and metrics to work out issues together.
Acknowledge preconceptions: OT often feels like the IT security approach will cause disruptions and downtime, leading to audits, escalations, or worse. For that reason, our approach became: “Hey, we found a problem. Let’s solve it together.”
Be proactive versus reactive: Do security assessments together and keep the right people in the loop. Set up two-way trainings, such as joint tabletop or red team exercises, and plan for “worst day” scenarios. Create dedicated websites and SharePoint sites where people can reach out with confidence that their concerns will be addressed.

For more information on securing smart buildings and bridging the IT and OT gap, watch my SANS webinar presentation titled “Securing Building Automation & Data Centers with Continuous OT Security Monitoring.”

Learn more

To learn more about Microsoft Security solutions, visit our website. Bookmark the Security blog to keep up with our expert coverage on security matters. Also, follow us at @MSFTSecurity for the latest news and updates on cybersecurity.

