Sharing the first SimuLand dataset to expedite research and learn about adversary tradecraft

August 5, 2021 TH Author

Last month, we introduced the SimuLand project to help security researchers around the world deploy lab environments to reproduce well-known attack scenarios, actively test detections, and learn more about the underlying behavior and implementation of adversary techniques. Since the release of the project, we have worked on a second phase to improve the current documentation and collect the telemetry generated after running the simulation plans in the lab guides.

Today, we are excited to release a dataset generated from the first simulation scenario to provide security researchers with an option to access data mapped to attack behavior without deploying the full environment.

Sharing SimuLand data to expedite research

In our previous blog post, we showed a basic threat research methodology and where the SimuLand project fits. One of the next steps after a simulation is the collection and analysis of the data generated. We believe we can help expedite the research process by sharing the security events generated during testing.

Figure 1: Map of a threat research methodology emphasizing SimuLand and Security Datasets.

What security events?

The data you can collect from a SimuLand scenario depends on the adversary tradecraft simulated and the security controls in place. Based on the first simulation scenario, these are some of the security events you can collect and map to adversary behavior:

Figure 2: Adversarial techniques mapped to sources of data.

How are security events collected?

Security events generated during a simulation can be collected using the following APIs:

Where can I download the dataset from?

Rather than creating a new GitHub repository to store all the security events generated, we are contributing every dataset to the GitHub repository of the Security Datasets project. Security Datasets is a community-driven effort to share pre-recorded datasets with the Information Security (InfoSec) community and expedite data analysis and threat research. Like SimuLand, it is an open-source initiative created and maintained by the Open Threat Research community.

You can find our first dataset here.

What can you do with the dataset?

Besides empowering security researchers to understand the underlying behavior of attack techniques, sharing a dataset also helps to:

Expedite the development and validation of detection rules.
Identify and validate a chain of events to model adversary behavior.
Provide labeled and unlabeled data for initial research and feature development.
Automate simulation exercises by injecting pre-recorded events into data pipelines.
Complement training material and expedite the creation of data analysis use cases.
Expedite the creation of internal or community events, such as capture-the-flag competitions or hackathons, where the data is used to create challenges and encourage collaboration.
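As a rough illustration of two of the uses above (validating a detection rule and injecting pre-recorded events into a pipeline), here is a minimal Python sketch. It assumes the events are stored as JSON Lines, one JSON object per line. The field names (`timestamp`, `action`), the sample events, and the rule are hypothetical placeholders for illustration only, not the schema of the published dataset.

```python
import io
import json
from typing import Callable, Dict, Iterable, Iterator, List

Event = Dict[str, object]


def read_events(lines: Iterable[str]) -> Iterator[Event]:
    """Yield one event per non-empty line of JSON Lines input."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def find_matches(events: Iterable[Event], rule: Callable[[Event], bool]) -> List[Event]:
    """Return the events for which the detection rule returns True."""
    return [event for event in events if rule(event)]


def replay(events: Iterable[Event], sink: Callable[[Event], None]) -> int:
    """Send each pre-recorded event to a pipeline sink; return how many were sent."""
    count = 0
    for event in events:
        sink(event)
        count += 1
    return count


# Hypothetical sample events; a real dataset would be read from a file instead.
SAMPLE = "\n".join(
    json.dumps(event)
    for event in [
        {"timestamp": "2021-08-05T10:00:00Z", "action": "user_signin"},
        {"timestamp": "2021-08-05T10:05:00Z", "action": "credential_added"},
        {"timestamp": "2021-08-05T10:10:00Z", "action": "mail_read"},
    ]
)


def is_credential_added(event: Event) -> bool:
    """Hypothetical detection rule: flag events that add a credential."""
    return event.get("action") == "credential_added"


if __name__ == "__main__":
    events = list(read_events(io.StringIO(SAMPLE)))
    hits = find_matches(events, is_credential_added)
    print(f"{len(hits)} of {len(events)} events matched the rule")

    # A list stands in for a real pipeline endpoint (queue, HTTP collector, etc.).
    pipeline: List[Event] = []
    sent = replay(events, pipeline.append)
    print(f"replayed {sent} events")
```

To work against a downloaded dataset, pass an open file handle to `read_events` instead of the in-memory sample, and replace the rule and the sink with ones that match your own detection logic and pipeline.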

What’s next

With this first dataset, we commit to releasing the security events we generate in our lab environment along with new lab guides. We also hope you can help us identify new sources of data to improve the project and its data collection strategy.

Learn more

To learn more about this open-source initiative, visit the SimuLand GitHub repository.

To learn more about Microsoft Security solutions, visit our website. Bookmark the Security blog to keep up with our expert coverage on security matters. Also, follow us at @MSFTSecurity for the latest news and updates on cybersecurity.

