{"id":60348,"date":"2026-03-20T16:19:00","date_gmt":"2026-03-20T16:19:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/security\/blog\/?p=146042"},"modified":"2026-03-20T16:19:00","modified_gmt":"2026-03-20T16:19:00","slug":"cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents","status":"publish","type":"post","link":"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/","title":{"rendered":"CTI-REALM: A new benchmark for end-to-end detection rule generation with AI agents"},"content":{"rendered":"<p class=\"wp-block-paragraph\"><strong>Excerpt:<\/strong> CTI-REALM is Microsoft\u2019s open-source benchmark for evaluating AI agents on real-world detection engineering\u2014turning cyber threat intelligence (CTI) into validated detections. Instead of measuring \u201cCTI trivia,\u201d CTI-REALM tests end-to-end workflows: reading threat reports, exploring telemetry, iterating on KQL queries, and producing Sigma rules and KQL-based detection logic that can be scored against ground truth across Linux, AKS, and Azure cloud environments.<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\">\n<p class=\"wp-block-paragraph\"><em>Security is Microsoft\u2019s top priority. Every day, we process more than 100 trillion security signals across endpoints, cloud infrastructure, identity, and global threat intelligence. That\u2019s the scale modern cyber defense demands, and AI is a core part of how we protect Microsoft and our customers worldwide. At the same time, security is, and always will be, a team sport. <\/em><\/p>\n<p class=\"wp-block-paragraph\"><em>That\u2019s why Microsoft is committed to AI model diversity and to helping defenders apply the latest AI responsibly. 
We created CTI\u2011REALM and open\u2011sourced it so the broader industry can test models, write better code, and build more secure systems together.<\/em><\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\">\n<p class=\"wp-block-paragraph\">CTI-REALM (Cyber Threat Real World Evaluation and LLM Benchmarking) is Microsoft\u2019s open-source benchmark that evaluates AI agents on end-to-end detection engineering. Building on work like <a href=\"https:\/\/www.microsoft.com\/en-us\/security\/blog\/2025\/10\/14\/microsoft-raises-the-bar-a-smarter-way-to-measure-ai-for-cybersecurity\/\">ExCyTIn-Bench<\/a>, which evaluates agents on threat investigation, CTI-REALM extends the scope to the next stage of the security workflow: detection rule generation. Rather than testing whether a model can answer CTI trivia or classify techniques in isolation, CTI-REALM places agents in a realistic, tool-rich environment and asks them to do what security analysts do every day: read a threat intelligence report, explore telemetry, write and refine KQL queries, and produce validated detection rules.<\/p>\n<p class=\"wp-block-paragraph\">We curated 37 CTI reports from public sources (Microsoft Security, Datadog Security Labs, Palo Alto Networks, and Splunk), selecting those that could be faithfully simulated in a sandboxed environment and that produced telemetry suitable for detection rule development. The benchmark spans three platforms: Linux endpoints, Azure Kubernetes Service (AKS), and Azure cloud infrastructure, with ground-truth scoring at every stage of the analytical workflow.<\/p>\n<h3 class=\"wp-block-heading\" id=\"why-cti-realm-exists\">Why CTI-REALM exists<\/h3>\n<p class=\"wp-block-paragraph\">Existing cybersecurity benchmarks primarily test parametric knowledge: can a model name the MITRE technique behind a log entry, or classify a TTP from a report? These are useful signals. 
However, they miss the harder question: can an agent operationalize that knowledge into detection logic that finds attacks in production telemetry?<\/p>\n<p class=\"wp-block-paragraph\">No current benchmark evaluates this complete workflow. CTI-REALM fills that gap by measuring:<\/p>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\"><strong>Operationalization, not recall:<\/strong> Agents must translate narrative threat intelligence into working Sigma rules and KQL queries, validated against real attack telemetry.<\/li>\n<li class=\"wp-block-list-item\"><strong>The full workflow:<\/strong> Scoring captures intermediate decision quality (CTI report selection, MITRE technique mapping, data source identification, iterative query refinement), not just the final output.<\/li>\n<li class=\"wp-block-list-item\"><strong>Realistic tooling:<\/strong> Agents use the same types of tools security analysts rely on: CTI repositories, schema explorers, a Kusto query engine, and MITRE ATT&amp;CK and Sigma rule databases.<\/li>\n<\/ul>\n<h3 class=\"wp-block-heading\" id=\"business-impact\">Business impact<\/h3>\n<p class=\"wp-block-paragraph\">CTI-REALM gives security engineering leaders a repeatable, objective way to assess whether an AI model improves detection coverage and analyst output.<\/p>\n<p class=\"wp-block-paragraph\">Traditional benchmarks tend to provide a single aggregate score: a model either passes or fails, but the score doesn\u2019t always tell the team why. CTI-REALM\u2019s checkpoint-based scoring answers this directly. It reveals whether a model struggles with CTI comprehension, query construction, or detection specificity. 
This helps teams make informed decisions about where human review and guardrails are needed.<\/p>\n<h3 class=\"wp-block-heading\" id=\"why-cti-realm-matters-for-business\">Why CTI-REALM matters for business<\/h3>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\"><strong>Measures operationalization, not trivia:<\/strong> Focuses on translating narrative threat intel into detection logic that can be validated against ground truth.<\/li>\n<li class=\"wp-block-list-item\"><strong>Captures the workflow:<\/strong> Evaluates intermediate steps (e.g., technique extraction, telemetry identification, iterative refinement) in addition to the final rule quality.<\/li>\n<li class=\"wp-block-list-item\"><strong>Supports safer adoption:<\/strong> Helps teams benchmark models before considering any downstream use and reinforces the need for human review before operational deployment.<\/li>\n<\/ul>\n<h3 class=\"wp-block-heading\" id=\"latest-results\">Latest results<\/h3>\n<p class=\"wp-block-paragraph\">We evaluated 16 frontier model configurations on CTI-REALM-50 (50 tasks spanning all three platforms).<\/p>\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"903\" height=\"446\" src=\"https:\/\/www.microsoft.com\/en-us\/security\/blog\/wp-content\/uploads\/2026\/03\/image-1.gif\" alt=\"Animated Gif Image\" class=\"wp-image-146045\"><figcaption class=\"wp-element-caption\">Model performance on CTI-REALM-50, sorted by normalized reward.<\/figcaption><\/figure>\n<h3 class=\"wp-block-heading\" id=\"what-the-numbers-tell-us\">What the numbers tell us<\/h3>\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Anthropic models lead across the board. Claude occupies the top three positions (0.587\u20130.637), driven by significantly stronger tool-use and iterative query behavior compared to OpenAI models.<\/li>\n<li class=\"wp-block-list-item\">More reasoning isn\u2019t always better. 
Within the GPT-5 family, medium reasoning consistently beats high across all three generations, suggesting overthinking hurts in agentic settings.<\/li>\n<li class=\"wp-block-list-item\">Cloud detection is the hardest problem. Performance drops sharply from Linux (0.585) to AKS (0.517) to Cloud (0.282), reflecting the difficulty of correlating across multiple data sources in APT-style scenarios.<\/li>\n<li class=\"wp-block-list-item\">CTI tools matter. Removing CTI-specific tools degraded every model\u2019s output by up to 0.150 points, with the biggest impact on final detection rule quality rather than intermediate steps.<\/li>\n<li class=\"wp-block-list-item\">Structured guidance narrows the gap. Providing a smaller model with human-authored workflow tips closed about a third of the performance gap to a much larger model, primarily by improving threat technique identification.<\/li>\n<\/ul>\n<p class=\"wp-block-paragraph\">For complete details on techniques and results, see the paper: <a href=\"https:\/\/arxiv.org\/abs\/2603.13517\">[2603.13517] CTI-REALM: Benchmark to Evaluate Agent Performance on Security Detection Rule Generation Capabilities<\/a>.<\/p>\n<h3 class=\"wp-block-heading\" id=\"get-involved\">Get involved<\/h3>\n<p class=\"wp-block-paragraph\">CTI-REALM is open source and free to access. It will be available soon in the Inspect AI evals repository: <a href=\"https:\/\/github.com\/UKGovernmentBEIS\/inspect_evals\" target=\"_blank\" rel=\"noreferrer noopener\">UKGovernmentBEIS\/inspect_evals: Collection of evals for Inspect AI<\/a>.<\/p>\n<p class=\"wp-block-paragraph\">Model developers and security teams are invited to contribute, benchmark, and share results via the official GitHub repository. 
For questions or partnership opportunities, reach out to the team at msecaimrbenchmarking@microsoft[.]com.<\/p>\n<p class=\"wp-block-paragraph\"><strong>CTI-REALM helps teams evaluate whether an agent can reliably turn threat intelligence into detections before relying on it in security operations.<\/strong><\/p>\n<h3 class=\"wp-block-heading\" id=\"references\">References<\/h3>\n<ol class=\"wp-block-list\">\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/www.microsoft.com\/en-us\/security\/blog\/2025\/10\/14\/microsoft-raises-the-bar-a-smarter-way-to-measure-ai-for-cybersecurity\/?msockid=1e918fa42a0668b10b8499822b646944\">Microsoft raises the bar: A smarter way to measure AI for cybersecurity | Microsoft Security Blog<\/a><\/li>\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/arxiv.org\/abs\/2603.13517\">[2603.13517] CTI-REALM: Benchmark to Evaluate Agent Performance on Security Detection Rule Generation Capabilities<\/a><\/li>\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/github.com\/UKGovernmentBEIS\/inspect_evals\/pull\/1270\">CTI-REALM: Cyber Threat Intelligence Detection Rule Development Benchmark by arjun180-new \u00b7 Pull Request #1270 \u00b7 UKGovernmentBEIS\/inspect_evals<\/a><\/li>\n<\/ol>\n<p>READ MORE <a href=\"https:\/\/www.microsoft.com\/en-us\/security\/blog\/2026\/03\/20\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/\">HERE<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Excerpt: CTI-REALM is Microsoft\u2019s open-source benchmark for evaluating AI agents on real-world detection engineering\u2014turning cyber threat intelligence (CTI) into validated detections.<br \/>\nThe post CTI-REALM: A new benchmark for end-to-end detection rule generation with AI agents appeared first on Microsoft Security Blog. 
READ MORE HERE&#8230;<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"colormag_page_layout":"default_layout","footnotes":""},"categories":[276],"tags":[11130],"class_list":["post-60348","post","type-post","status-publish","format-standard","hentry","category-microsoft-secure","tag-microsoft-agent-365"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.8 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>CTI-REALM: A new benchmark for end-to-end detection rule generation with AI agents 2026 | ThreatsHub Cybersecurity News<\/title>\n<meta name=\"description\" content=\"ThreatsHub Cybersecurity News | ThreatsHub.org | Cloud Security &amp; Cyber Threats Analysis Hub. 100% Free OSINT Threat Intelligent and Cybersecurity News.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"CTI-REALM: A new benchmark for end-to-end detection rule generation with AI agents 2026 | ThreatsHub Cybersecurity News\" \/>\n<meta property=\"og:description\" content=\"ThreatsHub Cybersecurity News | ThreatsHub.org | Cloud Security &amp; Cyber Threats Analysis Hub. 
100% Free OSINT Threat Intelligent and Cybersecurity News.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/\" \/>\n<meta property=\"og:site_name\" content=\"ThreatsHub Cybersecurity News\" \/>\n<meta property=\"article:published_time\" content=\"2026-03-20T16:19:00+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.threatshub.org\/blog\/coredata\/uploads\/2025\/05\/Threatshub_Favicon1.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"432\" \/>\n\t<meta property=\"og:image:height\" content=\"435\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"TH Author\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@threatshub\" \/>\n<meta name=\"twitter:site\" content=\"@threatshub\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"TH Author\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/\"},\"author\":{\"name\":\"TH Author\",\"@id\":\"https:\/\/www.threatshub.org\/blog\/#\/schema\/person\/12e0a8671ff89a863584f193e7062476\"},\"headline\":\"CTI-REALM: A new benchmark for end-to-end detection rule generation with AI agents\",\"datePublished\":\"2026-03-20T16:19:00+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/\"},\"wordCount\":956,\"publisher\":{\"@id\":\"https:\/\/www.threatshub.org\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.microsoft.com\/en-us\/security\/blog\/wp-content\/uploads\/2026\/03\/image-1.gif\",\"keywords\":[\"Microsoft Agent 365\"],\"articleSection\":[\"Microsoft Secure\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/\",\"url\":\"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/\",\"name\":\"CTI-REALM: A new benchmark for end-to-end detection rule generation with AI agents 2026 | ThreatsHub Cybersecurity 
News\",\"isPartOf\":{\"@id\":\"https:\/\/www.threatshub.org\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.microsoft.com\/en-us\/security\/blog\/wp-content\/uploads\/2026\/03\/image-1.gif\",\"datePublished\":\"2026-03-20T16:19:00+00:00\",\"description\":\"ThreatsHub Cybersecurity News | ThreatsHub.org | Cloud Security & Cyber Threats Analysis Hub. 100% Free OSINT Threat Intelligent and Cybersecurity News.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/#primaryimage\",\"url\":\"https:\/\/www.microsoft.com\/en-us\/security\/blog\/wp-content\/uploads\/2026\/03\/image-1.gif\",\"contentUrl\":\"https:\/\/www.microsoft.com\/en-us\/security\/blog\/wp-content\/uploads\/2026\/03\/image-1.gif\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.threatshub.org\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Microsoft Agent 
365\",\"item\":\"https:\/\/www.threatshub.org\/blog\/tag\/microsoft-agent-365\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"CTI-REALM: A new benchmark for end-to-end detection rule generation with AI agents\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.threatshub.org\/blog\/#website\",\"url\":\"https:\/\/www.threatshub.org\/blog\/\",\"name\":\"ThreatsHub Cybersecurity News\",\"description\":\"%%focuskw%% Threat Intel \u2013 Threat Intel Services \u2013 CyberIntelligence \u2013 Cyber Threat Intelligence - Threat Intelligence Feeds - Threat Intelligence Reports - CyberSecurity Report \u2013 Cyber Security PDF \u2013 Cybersecurity Trends - Cloud Sandbox \u2013- Threat IntelligencePortal \u2013 Incident Response \u2013 Threat Hunting \u2013 IOC - Yara - Security Operations Center \u2013 SecurityOperation Center \u2013 Security SOC \u2013 SOC Services - Advanced Threat - Threat Detection - TargetedAttack \u2013 APT \u2013 Anti-APT \u2013 Advanced Protection \u2013 Cyber Security Services \u2013 Cybersecurity Services -Threat Intelligence 
Platform\",\"publisher\":{\"@id\":\"https:\/\/www.threatshub.org\/blog\/#organization\"},\"alternateName\":\"Threatshub.org\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.threatshub.org\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.threatshub.org\/blog\/#organization\",\"name\":\"ThreatsHub.org\",\"alternateName\":\"Threatshub.org\",\"url\":\"https:\/\/www.threatshub.org\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.threatshub.org\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.threatshub.org\/blog\/coredata\/uploads\/2025\/05\/Threatshub_Favicon1.jpg\",\"contentUrl\":\"https:\/\/www.threatshub.org\/blog\/coredata\/uploads\/2025\/05\/Threatshub_Favicon1.jpg\",\"width\":432,\"height\":435,\"caption\":\"ThreatsHub.org\"},\"image\":{\"@id\":\"https:\/\/www.threatshub.org\/blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/x.com\/threatshub\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.threatshub.org\/blog\/#\/schema\/person\/12e0a8671ff89a863584f193e7062476\",\"name\":\"TH Author\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.threatshub.org\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/066276f086d5155df79c850206a779ad368418a844da0182ce43f9cd5b506c3d?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/066276f086d5155df79c850206a779ad368418a844da0182ce43f9cd5b506c3d?s=96&d=mm&r=g\",\"caption\":\"TH Author\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"CTI-REALM: A new benchmark for end-to-end detection rule generation with AI agents 2026 | ThreatsHub Cybersecurity News","description":"ThreatsHub Cybersecurity News | ThreatsHub.org | Cloud Security & Cyber Threats Analysis Hub. 100% Free OSINT Threat Intelligent and Cybersecurity News.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/","og_locale":"en_US","og_type":"article","og_title":"CTI-REALM: A new benchmark for end-to-end detection rule generation with AI agents 2026 | ThreatsHub Cybersecurity News","og_description":"ThreatsHub Cybersecurity News | ThreatsHub.org | Cloud Security & Cyber Threats Analysis Hub. 100% Free OSINT Threat Intelligent and Cybersecurity News.","og_url":"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/","og_site_name":"ThreatsHub Cybersecurity News","article_published_time":"2026-03-20T16:19:00+00:00","og_image":[{"width":432,"height":435,"url":"https:\/\/www.threatshub.org\/blog\/coredata\/uploads\/2025\/05\/Threatshub_Favicon1.jpg","type":"image\/jpeg"}],"author":"TH Author","twitter_card":"summary_large_image","twitter_creator":"@threatshub","twitter_site":"@threatshub","twitter_misc":{"Written by":"TH Author","Est. 
reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/#article","isPartOf":{"@id":"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/"},"author":{"name":"TH Author","@id":"https:\/\/www.threatshub.org\/blog\/#\/schema\/person\/12e0a8671ff89a863584f193e7062476"},"headline":"CTI-REALM: A new benchmark for end-to-end detection rule generation with AI agents","datePublished":"2026-03-20T16:19:00+00:00","mainEntityOfPage":{"@id":"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/"},"wordCount":956,"publisher":{"@id":"https:\/\/www.threatshub.org\/blog\/#organization"},"image":{"@id":"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/#primaryimage"},"thumbnailUrl":"https:\/\/www.microsoft.com\/en-us\/security\/blog\/wp-content\/uploads\/2026\/03\/image-1.gif","keywords":["Microsoft Agent 365"],"articleSection":["Microsoft Secure"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/","url":"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/","name":"CTI-REALM: A new benchmark for end-to-end detection rule generation with AI agents 2026 | ThreatsHub Cybersecurity 
News","isPartOf":{"@id":"https:\/\/www.threatshub.org\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/#primaryimage"},"image":{"@id":"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/#primaryimage"},"thumbnailUrl":"https:\/\/www.microsoft.com\/en-us\/security\/blog\/wp-content\/uploads\/2026\/03\/image-1.gif","datePublished":"2026-03-20T16:19:00+00:00","description":"ThreatsHub Cybersecurity News | ThreatsHub.org | Cloud Security & Cyber Threats Analysis Hub. 100% Free OSINT Threat Intelligent and Cybersecurity News.","breadcrumb":{"@id":"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/#primaryimage","url":"https:\/\/www.microsoft.com\/en-us\/security\/blog\/wp-content\/uploads\/2026\/03\/image-1.gif","contentUrl":"https:\/\/www.microsoft.com\/en-us\/security\/blog\/wp-content\/uploads\/2026\/03\/image-1.gif"},{"@type":"BreadcrumbList","@id":"https:\/\/www.threatshub.org\/blog\/cti-realm-a-new-benchmark-for-end-to-end-detection-rule-generation-with-ai-agents\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.threatshub.org\/blog\/"},{"@type":"ListItem","position":2,"name":"Microsoft Agent 365","item":"https:\/\/www.threatshub.org\/blog\/tag\/microsoft-agent-365\/"},{"@type":"ListItem","position":3,"name":"CTI-REALM: A new benchmark for end-to-end detection rule generation with AI 
agents"}]},{"@type":"WebSite","@id":"https:\/\/www.threatshub.org\/blog\/#website","url":"https:\/\/www.threatshub.org\/blog\/","name":"ThreatsHub Cybersecurity News","description":"%%focuskw%% Threat Intel \u2013 Threat Intel Services \u2013 CyberIntelligence \u2013 Cyber Threat Intelligence - Threat Intelligence Feeds - Threat Intelligence Reports - CyberSecurity Report \u2013 Cyber Security PDF \u2013 Cybersecurity Trends - Cloud Sandbox \u2013- Threat IntelligencePortal \u2013 Incident Response \u2013 Threat Hunting \u2013 IOC - Yara - Security Operations Center \u2013 SecurityOperation Center \u2013 Security SOC \u2013 SOC Services - Advanced Threat - Threat Detection - TargetedAttack \u2013 APT \u2013 Anti-APT \u2013 Advanced Protection \u2013 Cyber Security Services \u2013 Cybersecurity Services -Threat Intelligence Platform","publisher":{"@id":"https:\/\/www.threatshub.org\/blog\/#organization"},"alternateName":"Threatshub.org","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.threatshub.org\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.threatshub.org\/blog\/#organization","name":"ThreatsHub.org","alternateName":"Threatshub.org","url":"https:\/\/www.threatshub.org\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.threatshub.org\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.threatshub.org\/blog\/coredata\/uploads\/2025\/05\/Threatshub_Favicon1.jpg","contentUrl":"https:\/\/www.threatshub.org\/blog\/coredata\/uploads\/2025\/05\/Threatshub_Favicon1.jpg","width":432,"height":435,"caption":"ThreatsHub.org"},"image":{"@id":"https:\/\/www.threatshub.org\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/threatshub"]},{"@type":"Person","@id":"https:\/\/www.threatshub.org\/blog\/#\/schema\/person\/12e0a8671
ff89a863584f193e7062476","name":"TH Author","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.threatshub.org\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/066276f086d5155df79c850206a779ad368418a844da0182ce43f9cd5b506c3d?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/066276f086d5155df79c850206a779ad368418a844da0182ce43f9cd5b506c3d?s=96&d=mm&r=g","caption":"TH Author"}}]}},"_links":{"self":[{"href":"https:\/\/www.threatshub.org\/blog\/wp-json\/wp\/v2\/posts\/60348","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.threatshub.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.threatshub.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.threatshub.org\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.threatshub.org\/blog\/wp-json\/wp\/v2\/comments?post=60348"}],"version-history":[{"count":0,"href":"https:\/\/www.threatshub.org\/blog\/wp-json\/wp\/v2\/posts\/60348\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.threatshub.org\/blog\/wp-json\/wp\/v2\/media?parent=60348"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.threatshub.org\/blog\/wp-json\/wp\/v2\/categories?post=60348"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.threatshub.org\/blog\/wp-json\/wp\/v2\/tags?post=60348"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}