NIST Report on AI Monitoring Challenges
Summary
NIST has released a new report, NIST AI 800-4, detailing challenges in monitoring deployed artificial intelligence systems. The report identifies six common categories of monitoring and highlights gaps and barriers to effective AI system oversight, based on practitioner workshops and literature reviews.
What changed
NIST has published a new report, NIST AI 800-4, titled "Challenges to the Monitoring of Deployed AI Systems." The report, released on March 9, 2026, stems from practitioner workshops and a literature review conducted in 2025 by the Center for AI Standards and Innovation (CAISI). It identifies six common categories for monitoring AI systems post-deployment: Functionality, Operational, Human Factors, Security, Compliance, and Large-Scale Impacts monitoring. The report details the challenges, gaps, and open questions associated with each category, aiming to inform future research and facilitate wider AI adoption.
This report serves as a foundational document for understanding the complexities of AI monitoring. While it is non-binding, it provides critical insights for organizations developing, deploying, or overseeing AI systems. Compliance officers should review the identified challenges, particularly in the "Compliance Monitoring" category, to ensure their AI systems adhere to relevant regulations and standards. The report's findings may influence future regulatory guidance or industry best practices, necessitating a proactive approach to AI system monitoring and risk management.
What to do next
- Review NIST AI 800-4 for challenges in AI system monitoring.
- Assess current AI monitoring practices against the report's categories and identified challenges.
- Incorporate findings into AI risk management frameworks and compliance strategies.
Source document (simplified)
New Report: Challenges to the Monitoring of Deployed AI Systems
March 9, 2026
As artificial intelligence (AI) systems are increasingly integrated into commercial and government applications, there is a growing demand to monitor these systems in real-world settings. While the concept of monitoring digital systems for quality assurance is not new, particularly in the cases of cybersecurity and software continuous monitoring, it is a vast and fragmented space in the AI sector. Given that AI systems have novel properties that introduce variability and manifest in unpredictable ways, post-deployment monitoring – from incident monitoring to field studies – is a crucial practice for confident, widespread AI adoption.
To address this pressing need, in 2025 the Center for AI Standards and Innovation (CAISI) held three practitioner workshops and conducted an in-depth literature review to map the landscape, focusing on current challenges to robust and effective post-deployment monitoring of AI systems.
Our findings are outlined in the new report, NIST AI 800-4: Challenges to the Monitoring of Deployed AI Systems, in which we identify monitoring categories and detail challenges (gaps, barriers, and open questions) to inform and spur future research in the field. The primary contribution of this report is the identification, organization, and documentation of monitoring challenges, and reporting of views expressed by experts in the field.
Six common categories of monitoring, developed via thematic coding, are listed in the table below. See Appendix B of the report for the full methodology, and Appendix C for the associated codebook.
| Monitoring Category | Guiding Question | Definition |
| --- | --- | --- |
| Functionality Monitoring | Does the system continue to work as intended? | Measuring system functions, capabilities, and features to ensure the system works as intended |
| Operational Monitoring | Does the system maintain consistent service across its infrastructure? | Measuring system infrastructure components, for example to ensure the system maintains consistent levels of service |
| Human Factors Monitoring | Is the system transparent to humans and high quality? | Measuring human-system interactions, for example to ensure the system produces high-quality outputs and is transparent |
| Security Monitoring | Is the system secure against attacks and misuse? | Measuring where the system is potentially vulnerable to adversarial attacks and misuse |
| Compliance Monitoring | Does the system adhere to relevant regulations and directives? | Measuring system components for adherence to relevant laws, regulations, standards, controls, and guidelines |
| Large-Scale Impacts Monitoring | Does the system promote human flourishing? | Measuring system properties that have wide downstream impacts, for example to ensure the system promotes human flourishing |
To manageably synthesize the many challenges reported by practitioners and subject matter experts, we organized the database of workshop quotes and literature excerpts in two ways: (1) by monitoring category, since some challenges apply more to one category than another (e.g., the overhead of collecting and gauging user feedback concerns human factors more than security), and (2) by challenges shared across categories (e.g., poor incident sharing mechanisms). Finally, we sorted open questions on AI system monitoring into "who", "what", "when", "why", and "how" to monitor.
The table below highlights a sampling of post-deployment monitoring challenges. See the report for the full list.
| Challenge Type | Highlighted Gaps, Barriers, and Open Questions |
| --- | --- |
| Category-Specific Challenges | Gaps: insufficient research on human-AI feedback loops; underexplored methods to detect deceptive behavior; defining metrics for beneficial impacts to humans. Barriers: detecting performance degradation and drift; fragmented logging across distributed infrastructure; navigating the complexity of the policy landscape |
| Cross-Cutting Challenges | Gaps: lack of trusted guidelines or standards for methods and tools; immature information sharing ecosystem. Barriers: scaling human-driven monitoring alongside rapid rollouts; balancing competitive pressures with necessary oversight; hiring and training qualified AI experts |
| Open Questions | How to reduce monitoring burden on the end user or customer? Should monitoring be based on risk level? Tailored to the use case? What is the right cadence for monitoring? What is the relationship between monitoring and auditing? How to balance and integrate automated monitoring and human-validated monitoring? |
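One barrier in the table, detecting performance degradation and drift, is commonly approached by comparing a live score distribution against a deployment-time baseline. A minimal sketch using the Population Stability Index; the metric choice and thresholds are a common industry convention, not a recommendation from the report:

```python
import math
from collections import Counter

def psi(baseline, live, bins=10):
    """Population Stability Index between a baseline and a live sample of
    model scores. Common rule of thumb: < 0.1 stable, 0.1-0.25 moderate
    shift, > 0.25 likely drift worth investigating."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # guard against a constant baseline

    def histogram(xs):
        # Bin each value on the baseline's range, clamping out-of-range
        # live values into the edge bins.
        idx = Counter(min(max(int((x - lo) / width), 0), bins - 1) for x in xs)
        # A small epsilon keeps log() finite when a bin is empty.
        return [(idx.get(i, 0) + 1e-6) / (len(xs) + bins * 1e-6)
                for i in range(bins)]

    return sum((l - b) * math.log(l / b)
               for b, l in zip(histogram(baseline), histogram(live)))
```

For example, `psi(baseline, baseline)` is 0 for any sample, while a live window concentrated far from the baseline's bulk pushes the index well above 0.25. Alerting on a statistic like this is cheap, but it only flags that the input or output distribution moved; diagnosing why still requires the human-driven review the report's barriers describe.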
The identified gaps, barriers, and open questions highlight impactful opportunities for further investigation and innovation. The monitoring categories can offer a common language for describing sub-fields within AI system monitoring, and the challenges identified highlight areas where additional solutions are needed.
We welcome your engagement as we evaluate how best to support stakeholders in post-deployment monitoring of AI systems. You can share comments via email to NISTAI800-4 [at] nist.gov.
Released March 9, 2026, Updated March 18, 2026