The Register

Self-replicating botnet attacks Ray clusters

Malefactors are actively attacking internet-facing Ray clusters and abusing the open source AI framework to spread a self-replicating botnet that mines for cryptocurrency, steals data, and launches distributed denial of service (DDoS) attacks.

Oligo Security bug hunters say the ongoing campaign, which they’ve named ShadowRay 2.0, has been active since at least September 2024. The attacks exploit CVE-2023-48022, a critical – and unpatched – vulnerability in Ray, an open source distributed computing framework for AI workloads that’s used by major tech companies, including Amazon, Apple, and OpenAI.

This is the same flaw Oligo previously reported as being under exploitation in late 2023. At the time, the application security firm dubbed the vulnerability ShadowRay.

The security hole, which received a 9.8 CVSS rating, allows remote attackers to execute arbitrary code via an exposed Ray dashboard API. It remains unpatched because Anyscale, the vendor that developed the framework, maintains that Ray is not intended for use outside a “strictly controlled network environment,” and, as such, the bug report is irrelevant.

In October, Anyscale handed off Ray to the Linux Foundation’s PyTorch Foundation, which is now responsible for maintaining the open source project.

The Register reached out to both Anyscale and the Linux Foundation for comment, and to see if there are any plans to fix the flaw. We will update this story if and when we hear back from either organization.

A major evolution

“The latest campaign represents a major evolution from our initial ShadowRay discovery,” Oligo researchers Avi Lumelsky and Gal Elbaz wrote in a Tuesday blog post.

While it shares some similarities with the initial attacks, it’s being carried out by a different attacker or group using different techniques.

“The attackers, operating under the name IronErn440, have turned Ray’s legitimate orchestration features into tools for a self-propagating, globally cryptojacking operation, spreading autonomously across exposed Ray clusters,” the duo added.

Considering that there are more than 230,000 internet-facing Ray clusters worldwide, this puts organizations at major risk.

“The attackers have reached every Ray server we manually inspected, and their activity has been ongoing for weeks,” Lumelsky told The Register. “They have targeted organizations across the globe, focusing especially on large clusters and expensive GPU environments. We identified several exposed clusters worth millions of dollars in annual compute capacity, and, in each case, the attackers were already present. The impacted organizations span multiple industries and sizes, from the US to China and beyond.” 

In addition to abusing the flaw for illicit cryptomining operations, the attackers have used the compromised clusters to move laterally within organizations, pivot to non-internet-facing nodes, and infect additional machines on internal networks, Lumelsky added. 

“These compromised machines were then used to launch DDoS attacks against external websites,” he said. “In several instances, the attackers also accessed proprietary company assets, including AI models, datasets, and application source code, cloud credentials, database credentials, and access to retained user data from production environments.”

IronErn440 originally used GitLab as a platform for updating and delivering region-aware malware in this campaign. Region-aware malware detects the victims’ countries, and uses that info to adapt delivery methods such as using regional proxies and optimizing for network conditions. 

Oligo reported the malicious activity, and GitLab removed the attacker’s repository and account on November 5.

Shortly after, however, the attacker moved to GitHub and created multiple accounts and new repos. The campaign remains active, we’re told.

How the attacks work

The attackers first used open source vulnerability detection tool interact.sh to identify which Ray dashboard IPs were exploitable, and then waited for callbacks to track which servers executed their commands. This allowed them to discover vulnerable targets at scale. 

“Instead of manual scanning, they let victims identify themselves by calling back,” the AI security researchers wrote. “This approach also helps evade traditional scanning detection.”
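For defenders, the flip side of the callback trick is that a compromised node has to phone home. A minimal sketch of how a team might sweep DNS query logs for such lookups follows. It assumes the queried names have already been extracted from the logs, and only the interact.sh suffix comes from the researchers' account; the function name and the suffix list are illustrative and should be extended from your own threat intelligence.

```python
from typing import Iterable, List

# Only "interact.sh" comes from the article; extend this tuple with whatever
# callback domains your own threat intelligence lists.
CALLBACK_SUFFIXES = ("interact.sh",)


def find_callback_lookups(queried_names: Iterable[str],
                          suffixes: Iterable[str] = CALLBACK_SUFFIXES) -> List[str]:
    """Return queried DNS names that equal or fall under a watched suffix."""
    watched = tuple(s.lower().strip(".") for s in suffixes)
    hits = []
    for name in queried_names:
        # Normalize case and drop the trailing dot of fully qualified names.
        normalized = name.lower().rstrip(".")
        if any(normalized == s or normalized.endswith("." + s) for s in watched):
            hits.append(name)
    return hits
```

Matching on a leading dot rather than a bare suffix keeps unrelated domains that merely end in the same characters out of the results.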

Then, they got to work exploiting unauthenticated Ray job submission APIs on exposed dashboards, submitting everything from simple reconnaissance to complex multi-stage payloads.

This gets to the heart of the security flaw: Because Ray’s dashboard is designed for trusted internal networks, it doesn’t have authentication built in. When clusters are exposed to the internet – as they frequently are – this poses a huge security hole, as attackers don’t need to authenticate to gain access.

As the researchers note: “The attackers didn’t need to exploit a vulnerability, they just used Ray’s features as designed. This is a configuration vulnerability at scale.”
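Since the exposure comes down to where the dashboard listens, a configuration audit can start with the bind address. The sketch below is one hedged way to classify an address; it assumes you can obtain the configured value yourself (for example, whatever was passed to Ray's `--dashboard-host` option), and the helper name is hypothetical. It says nothing about firewalls or security groups, which need their own checks.

```python
import ipaddress


def is_exposed_bind(host: str) -> bool:
    """Return True if a bind address listens beyond the local machine."""
    if host == "localhost":
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Hostnames we cannot parse as an IP are flagged for manual review.
        return True
    # Anything other than loopback (including 0.0.0.0 and private ranges)
    # is reachable from at least some other machine.
    return not addr.is_loopback
```

A private address such as 10.0.0.5 is still flagged: it may not face the internet, but the attackers described here also pivoted to internal nodes.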

One of the payloads abused Ray’s scheduling API, NodeAffinitySchedulingStrategy, to execute malware on every alive node in the cluster, thus enabling lateral movement. “It’s essentially using the victim’s infrastructure as intended, using python code – like the applications that are already running, just for malicious purposes,” Lumelsky and Elbaz said.

Then the attackers deployed a multi-stage Python payload that discovers CPUs and GPUs in the clusters, limits usage to 60 percent to evade detection, and submits a takeover job with those exact resource requirements.
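That fixed 60 percent figure is itself a potential tell. As a rough sketch, a monitoring script could compare each job's resource request against the cluster totals and flag requests that land near that fraction across the board. The function name, the dictionary shape, and the 5 percent tolerance are all assumptions for illustration, not anything Oligo published.

```python
from typing import Mapping


def matches_fractional_grab(cluster_totals: Mapping[str, float],
                            job_request: Mapping[str, float],
                            fraction: float = 0.6,
                            tolerance: float = 0.05) -> bool:
    """Return True if every requested resource is roughly `fraction` of the cluster total."""
    if not job_request:
        return False
    for name, requested in job_request.items():
        total = cluster_totals.get(name, 0)
        if total <= 0:
            # The job asks for a resource the cluster does not report.
            return False
        if abs(requested / total - fraction) > tolerance:
            return False
    return True
```

On small clusters, rounding to whole CPUs or GPUs can push the ratio well away from 0.6, so the tolerance would need widening there, and a match is a prompt for investigation rather than proof of compromise.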

According to the researchers, the GitLab payloads are most likely AI-generated. They base this assessment on the malware’s “structure, comments, and error handling patterns.”

The criminals also deployed multiple interactive reverse shells to AWS-hosted command-and-control servers, which provide redundancy but may also indicate multiple attack operators.

After the crims moved the operation to GitHub on November 10, the security researchers spotted compromised clusters with thousands of machines, reportedly worth $4 million per year in compute, running at 100 percent CPU utilization on 60 percent of the Ray cluster.

“One of the servers had a network NFS mount, which included 240GB of Source Code, AI Models and Datasets. Everything the company is doing for the past few years, exposed to the internet,” the Oligo duo wrote.

The attackers also used a new, more efficient cryptominer that checks for GPU machines through the “nvidia-smi” utility.
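Details like the nvidia-smi probe can feed a quick triage pass over the commands submitted to a cluster. The following sketch assumes you have already exported job entrypoint strings as plain text; the function and indicator names are hypothetical. Only the nvidia-smi indicator is drawn from the report, and since plenty of legitimate GPU jobs call that utility too, hits are leads to review, not verdicts.

```python
import re
from typing import Dict, Iterable, List

# "nvidia-smi" is the only indicator taken from the article; the other
# pattern is a generic assumption about download-and-run one-liners.
INDICATORS = {
    "gpu_probe": re.compile(r"\bnvidia-smi\b"),
    "pipe_to_shell": re.compile(r"\b(curl|wget)\b[^|]*\|\s*(ba)?sh\b"),
}


def triage_entrypoints(entrypoints: Iterable[str]) -> Dict[str, List[str]]:
    """Map each indicator name to the entrypoint strings that match it."""
    hits: Dict[str, List[str]] = {name: [] for name in INDICATORS}
    for command in entrypoints:
        for name, pattern in INDICATORS.items():
            if pattern.search(command):
                hits[name].append(command)
    return hits
```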

Although GitHub blocked the attackers’ account on November 17, the criminals opened a new account on the same day and, within two hours, the campaign was back up and running.

“We believe this campaign is automated due to the pace of recovery and stealthy operation across providers and worldwide,” the AI threat hunters wrote.
