IoT security: how Microsoft protects Azure Datacenters

Azure Sphere first entered the IoT Security market in 2018 with a clear mission—to empower every organization on the planet to connect and create secure and trustworthy IoT devices. Security is the foundation for durable innovation and business resilience. Every industry investing in IoT must consider the vulnerabilities of the cyberthreat landscape. For our customers, Azure Sphere has helped unlock opportunities for new insights and to deliver magical new experiences simply by providing a secured foundation for IoT.

Our customers are leading innovations across industries, and they are our strongest resource when it comes to security needs. One of the most significant blockers for customers is the risk assumed by connecting business-critical devices and equipment to the internet. Datacenters are a notable example. When you look at the datacenter’s essential infrastructure, the most critical functions of maintaining the environment have been intentionally kept offline to protect and preserve them. While the servers and network of a datacenter function as a powerful hub of innovation that drives global computing, the mechanical and electrical systems that they depend on are, out of necessity, air-gapped.

Mike Czamara, a General Manager at Microsoft, leads a team dedicated to the critical environment and availability of Azure Datacenters worldwide. “We approach datacenters with a necessarily conservative methodology. There’s the shell and there’s the critical space,” he says. Mike describes the shell as the building, the walls, the roof, the electrical system, the mechanical systems: everything that functions around the critical spaces or in service of them. The critical space, or core, is the servers and all the networking. The shell’s multiple systems operate simultaneously, but not always symbiotically, since they are not digitally connected. Connecting critical equipment is a substantial risk for a datacenter focused on reducing, if not eliminating, downtime.

However, disruptions happen. Outages happen. Mike’s team was finding that there were sometimes problems across building automation systems or power monitoring systems running code written by a third party. These issues sometimes led to breakdowns. But, because the code at the heart of the issue was controlled by a third party, as Mike puts it, “Part of our destiny, and that of our customers, was out of our control.” Having greater control over the datacenter environment promised better outcomes for customers. The need for more control over the datacenter environment was nested in a larger challenge: the datacenter ecosystem itself.

Taking the first step

“We’re at the very beginning. We’re just walking up to the starting line. IoT was the first step,” says Mike. Really, the first step was an email. Adolfo Ferreira, a Senior Principal Technical Program Manager on Mike’s team, learned about Azure Sphere from the public announcement in April 2018. Adolfo immediately emailed Galen Hunt, the Managing Director of Azure Sphere. “I wrote him, begging him to give me a development kit. I told him what I wanted to do with it, and he took a kit away from one of his developers to give to me.” As Mike puts it, “From that point, it was game on.”

“Azure Sphere really triggered this big opportunity for us,” says Adolfo. At the time he discovered Azure Sphere, Adolfo and his team were looking to develop secured data acquisition from the mechanical and electrical systems, which have always been “read-only” systems. Azure Sphere gave them a way to securely connect these systems. The end-to-end solution includes secured hardware, the custom-built Azure Sphere OS, the cloud-based Azure Sphere Security Service, and ongoing servicing by Microsoft security experts for more than ten years. “I understood what Azure Sphere was trying to do, I knew the security was the highest level in the industry. I knew nothing could come close to the level of security Azure Sphere could offer,” says Adolfo.

For every Azure datacenter, security is the greatest priority, and the security requirements are spectacularly stringent. “Our datacenters are not just running Microsoft’s businesses, but tens of thousands of other companies’ businesses within them. The Azure Sphere guardian module has layers and layers of security. The guardian module had no problem meeting our bar,” says Mike.

With Azure Sphere, the team started connecting mechanical and electrical systems—air handling units, power distribution units—to collect telemetry from the devices. In parallel, they started collecting data from servers and network devices. By using guardian modules powered by Azure Sphere, the team was able to confidently connect their most critical equipment when before the risk had been too great.

The team is exploring multiple scenarios that Azure Sphere has made possible. Maintenance, for example, is probably the most substantial commitment required of a datacenter. The standard approach is to have a regular, planned maintenance schedule to prevent problems. Sometimes it’s necessary, but often it’s just scheduled and so it just happens even when there’s no apparent need. Mike estimates that by staying on top of this sort of “blind maintenance” routine, only about 15 percent of maintenance will be reactive, meaning in response to an immediate need.

Informed by telemetry from connected systems, maintenance can become incisive, truly predictive, and can reduce reactive maintenance to as little as five percent. This can make a dramatic difference for organizations that forecast a budget one to five years out. Says Mike, “We are not spending money in hopes of preventing an outage. Our spend can become more targeted.”

Unlocking insights

Mike envisions a future of diagnostics in the datacenter. He sees a cache of information in every piece of equipment, “When we unlock that, it’s data that can create a wealth of knowledge. When I can see that a specific component in a certain generator is acting funny, and I can see how it affects performance health, I can make a more informed choice of what to do.” But he is thinking bigger than just generators or even just one datacenter. The knowledge gained from a single issue or incident in one datacenter can inform and improve performance for all the other datacenters located around the world.

But Mike is still thinking bigger than that—bigger than Microsoft. Having access to diverse sets of data, from partners and, maybe one day, from other organizations running equipment securely connected with Azure Sphere, can drive more informed decisions, and improve safety.

Smarter and safer

Mike’s team has been pioneering new safety measures enabled by Azure Sphere. Anytime a person must go into a datacenter to work on a piece of equipment, it is a point of risk. “There’s a problem of human error when a person goes into the wrong panel. They might turn off the wrong panel, which disrupts our customers.” In addition to the risk to uptime, there is also a serious risk to personal safety. Datacenters use a ton of power. A single datacenter uses between thirty-two and forty megawatts of power, roughly equivalent to six thousand homes. Panels have power sensors that will trip a warning siren when necessary, but a person’s instinctive reaction is to immediately shut the panel to turn off the alarm, potentially leaving problems unresolved. The team had to think about the problem, the safety risk, and human behavior together.

The team paired a klaxon siren with an Andon light, using a board built with Azure Sphere that connects to the power sensor and the datacenter control system. This setup made it possible to send the step-by-step instructions of a work order, called a digital method of procedure (DMOP), directly to the panel requiring work. When a DMOP is released, the Andon light for the specific panel will change color to identify it as the panel requiring work. As the person goes through the DMOP for the work order, step by step, the light will reflect their progress. If the person misses a step, the light will signal the mistake and the klaxon will sound. Says Mike, “It’s exactly like bowling with bumpers.”
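The DMOP behavior described above can be pictured as a small state machine. The following Python sketch is purely illustrative and is not the team's implementation or any Azure Sphere API: the `Panel` class, the `Light` states, the color names, and the step names are all hypothetical, chosen only to show how a light and klaxon could track progress through an ordered procedure.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Light(Enum):
    """Hypothetical Andon light states; the article does not name actual colors."""
    OFF = "off"                # no work order released for this panel
    WORK_REQUIRED = "blue"     # DMOP released, no step completed yet
    IN_PROGRESS = "yellow"     # at least one step completed
    COMPLETE = "green"         # every step completed in order
    ERROR = "red"              # a step was missed or done out of order


@dataclass
class Panel:
    """Tracks one panel's progress through a released DMOP (illustrative only)."""
    panel_id: str
    steps: List[str] = field(default_factory=list)
    next_step: int = 0
    light: Light = Light.OFF
    klaxon_on: bool = False

    def release_dmop(self, steps: List[str]) -> None:
        # Releasing a DMOP marks this panel as the one requiring work.
        self.steps = list(steps)
        self.next_step = 0
        self.klaxon_on = False
        self.light = Light.WORK_REQUIRED

    def report_step(self, step: str) -> Light:
        # A step counts only if it is the next expected step in the procedure.
        expected = self.steps[self.next_step] if self.next_step < len(self.steps) else None
        if step == expected:
            self.next_step += 1
            self.klaxon_on = False
            done = self.next_step == len(self.steps)
            self.light = Light.COMPLETE if done else Light.IN_PROGRESS
        else:
            # Skipped or unexpected step: signal the mistake and sound the klaxon.
            self.light = Light.ERROR
            self.klaxon_on = True
        return self.light


panel = Panel("panel-A")  # hypothetical panel identifier
panel.release_dmop(["open breaker", "verify zero energy", "replace fuse"])
panel.report_step("open breaker")   # light reflects progress
panel.report_step("replace fuse")   # skipped a step: error light, klaxon sounds
```

In this sketch the klaxon stays on until the worker performs the expected step, which mirrors the "bumpers" idea: the panel nudges the person back onto the procedure and does not simply log the deviation.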

The team went a step further and integrated their electrical power monitoring system and their incident monitoring system. If a person working in the datacenter opens the wrong panel, a security alert is automatically sent, and a ticket is cut to a manager. “We immediately know when something has gone off-script if someone has put themselves or the datacenter in jeopardy. We can stop all work and figure out what’s going on,” says Mike.
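As a rough illustration of the wrong-panel check described above, the Python sketch below compares a panel-open event against the set of panels that currently have a released work order and, on a mismatch, records a ticket. Every name here (`Ticket`, `on_panel_opened`, the panel and worker identifiers) is hypothetical; the article does not describe how the power monitoring and incident monitoring systems are actually integrated.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Ticket:
    """Hypothetical incident record routed to a manager."""
    panel_id: str
    worker: str
    summary: str


def on_panel_opened(
    panel_id: str,
    worker: str,
    released_panels: Set[str],
    tickets: List[Ticket],
) -> Optional[Ticket]:
    """Return a new ticket if the opened panel has no released work order."""
    if panel_id in released_panels:
        return None  # expected panel: nothing to report
    ticket = Ticket(
        panel_id=panel_id,
        worker=worker,
        summary=f"Panel {panel_id} opened without a released work order",
    )
    tickets.append(ticket)  # stand-in for sending the alert and cutting the ticket
    return ticket


tickets: List[Ticket] = []
released = {"panel-A"}  # panels with an active DMOP (assumed data)
on_panel_opened("panel-A", "worker-1", released, tickets)  # expected panel, no ticket
on_panel_opened("panel-B", "worker-1", released, tickets)  # wrong panel, ticket recorded
```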

Azure Sphere made it possible to securely coordinate multiple systems to create a new safety process. The connected panels do more than just help ensure correct and safe execution of processes; they also capture data when things go wrong so that the team can learn from incidents and resolve problems. “We’re creating systems that will keep us within the lines of safety and security and that help us adjust and refine those lines,” says Mike.

Impressive too is that Adolfo’s team developed the first of these safer electrical panels in only two months. “The Azure Sphere SDK made it possible for us to move fast and develop a complete solution from scratch, that was fully integrated with Azure Cloud Services,” he says. “With Azure Sphere, we can quickly turn any idea into a proof of concept.”

Strategic advantages

Adolfo’s team is focused on developing systems to increase reliability, security, and safety, and to optimize the building and systems that make up the “shell” of the datacenters. The total Azure Sphere offering, particularly the ongoing servicing by Microsoft security experts for more than ten years, has amplified the team’s ability to deliver business value. The cloud-based Azure Sphere Security Service automatically delivers OS and security updates to every device, so Adolfo and his team never have to worry about patching. “That’s all taken care of by Azure Sphere,” he says. And when the team needs to push new firmware to devices, Adolfo says it’s incredibly straightforward to do that at scale. Plus, Azure Sphere attestation guarantees the right firmware version is running on all their devices. “The services and support that Azure Sphere just provides have taken away the burden on my team,” he says.

Handling all that work at scale, especially security, would have required building out a dedicated team. “Having a whole team just for upkeep doesn’t actually add business value. Instead, we can spend our time on how to implement technology to improve availability, to reduce costs, to increase visibility into operations—that’s really how we add value. It’s a huge advantage. We have the opportunity to set the new standard in the datacenter industry, using Azure Sphere,” says Adolfo.

The business case for creativity

Mike sees the true value of Azure Sphere in how it enables innovation on a much larger scale of influence: “This tiny little thing is enabling us to evolve—not iterate anymore—evolve our space, our industry. It’s going to make our datacenters much more predictable, more usable, so that our customers reap the benefits and rewards of everything we’re doing.”

Mike started out by giving one engineer, Adolfo, total freedom to innovate with that first Azure Sphere development kit. Now Adolfo leads a team of ten whose only job is to create, to invent, to explore. “Because we were seeing such gains with one, two, then three people driving innovation, I was able to make a legitimate business case to bring on more people,” says Mike.

One reason Mike can confidently turn his team loose, without rails (“you can’t really have rails if you want to innovate,” he says), is that Azure Sphere offers a secured platform. The team’s grounding principles are safety, security, uptime, and cost. It must be safe. It must be secure. It cannot impact the customer. And it has to be affordable. Says Mike, “Azure Sphere delivers it all. It gives us this great foundation to work through wild ideas and opportunities.”

Get started with Azure Sphere today to build and test innovative, secured solutions for your organization, even while you’re working remotely.
