NIST Report on AI Monitoring Challenges
Summary
NIST has released a new report, NIST AI 800-4, detailing challenges in monitoring deployed artificial intelligence systems. The report identifies six common categories of monitoring and highlights gaps and barriers to effective AI system oversight, based on practitioner workshops and literature reviews.
What changed
NIST has published a new report, NIST AI 800-4, titled "Challenges to the Monitoring of Deployed AI Systems." The report, released on March 9, 2026, stems from practitioner workshops and a literature review conducted in 2025 by the Center for AI Standards and Innovation (CAISI). It identifies six common categories for monitoring AI systems post-deployment: Functionality, Operational, Human Factors, Security, Compliance, and Large-Scale Impacts monitoring. The report details the challenges, gaps, and open questions associated with each category, aiming to inform future research and facilitate wider AI adoption.
This report serves as a foundational document for understanding the complexities of AI monitoring. While it is non-binding, it provides critical insights for organizations developing, deploying, or overseeing AI systems. Compliance officers should review the identified challenges, particularly in the "Compliance Monitoring" category, to ensure their AI systems adhere to relevant regulations and standards. The report's findings may influence future regulatory guidance or industry best practices, necessitating a proactive approach to AI system monitoring and risk management.
What to do next
- Review NIST AI 800-4 for challenges in AI system monitoring.
- Assess current AI monitoring practices against the report's categories and identified challenges.
- Incorporate findings into AI risk management frameworks and compliance strategies.
Source document (simplified)
New Report: Challenges to the Monitoring of Deployed AI Systems
March 9, 2026
As artificial intelligence (AI) systems are increasingly integrated into commercial and government applications, there is a growing demand to monitor these systems in real-world settings. While the concept of monitoring digital systems for quality assurance is not new, particularly in the cases of cybersecurity and software continuous monitoring, it is a vast and fragmented space in the AI sector. Given that AI systems have novel properties that introduce variability and manifest in unpredictable ways, post-deployment monitoring – from incident monitoring to field studies – is a crucial practice for confident, widespread AI adoption.
To address this pressing need, in 2025 the Center for AI Standards and Innovation (CAISI) held three practitioner workshops and conducted an in-depth literature review to map the landscape, focusing on current challenges to robust and effective post-deployment monitoring of AI systems.
Our findings are outlined in the new report, NIST AI 800-4: Challenges to the Monitoring of Deployed AI Systems, in which we identify monitoring categories and detail challenges (gaps, barriers, and open questions) to inform and spur future research in the field. The primary contribution of this report is the identification, organization, and documentation of monitoring challenges, and reporting of views expressed by experts in the field.
Six common categories of monitoring, developed via thematic coding, are listed in the table below. See Appendix B of the report for the full methodology, and Appendix C for the associated codebook.
| Monitoring Category | Guiding Question | Definition |
| --- | --- | --- |
| Functionality Monitoring | Does the system continue to work as intended? | Measuring system functions, capabilities, and features to ensure the system works as intended |
| Operational Monitoring | Does the system maintain consistent service across its infrastructure? | Measuring system infrastructure components, for example to ensure the system maintains consistent levels of service |
| Human Factors Monitoring | Is the system transparent to humans and high quality? | Measuring human-system interactions, for example to ensure the system produces high-quality outputs and is transparent |
| Security Monitoring | Is the system secure against attacks and misuse? | Measuring where the system is potentially vulnerable to adversarial attacks and misuse |
| Compliance Monitoring | Does the system adhere to relevant regulations and directives? | Measuring system components for adherence to relevant laws, regulations, standards, controls, and guidelines |
| Large-Scale Impacts Monitoring | Does the system promote human flourishing? | Measuring system properties that have wide downstream impacts, for example to ensure the system promotes human flourishing |
To manageably synthesize the many challenges reported by practitioners and subject matter experts, we organized the database of workshop quotes and literature excerpts in two ways: (1) by monitoring category, since some challenges apply more to one category than another (e.g., the overhead of collecting and gauging user feedback concerns human factors more than security), and (2) by challenges shared across categories (e.g., poor incident sharing mechanisms). Finally, we sorted open questions on AI system monitoring into "who", "what", "when", "why", and "how" to monitor.
The table below highlights a sampling of post-deployment monitoring challenges. See the report for the full list.
| Challenge Type | Highlighted Gaps, Barriers, and Open Questions |
| --- | --- |
| Category-Specific Challenges | Gaps: insufficient research on human-AI feedback loops; underexplored methods to detect deceptive behavior; defining metrics for beneficial impacts to humans. Barriers: detecting performance degradation and drift; fragmented logging across distributed infrastructure; navigating the complexity of the policy landscape |
| Cross-Cutting Challenges | Gaps: lack of trusted guidelines or standards for methods and tools; immature information sharing ecosystem. Barriers: scaling human-driven monitoring alongside rapid rollouts; balancing competitive pressures with necessary oversight; hiring and training qualified AI experts |
| Open Questions | How to reduce monitoring burden on the end user or customer? Should monitoring be based on risk level? Tailored to the use case? What is the right cadence for monitoring? What is the relationship between monitoring and auditing? How to balance and integrate automated monitoring and human-validated monitoring? |
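One barrier in the table, detecting performance degradation and drift, is commonly approached by comparing a live score distribution against a deployment-time baseline. A minimal sketch using the Population Stability Index; the metric choice and thresholds are a common industry convention, not a recommendation from the report:

```python
import math
from collections import Counter

def psi(baseline, live, bins=10):
    """Population Stability Index between a baseline and a live sample of
    model scores. Common rule of thumb: < 0.1 stable, 0.1-0.25 moderate
    shift, > 0.25 likely drift worth investigating."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # guard against a constant baseline

    def histogram(xs):
        # Bin each value on the baseline's range, clamping out-of-range
        # live values into the edge bins.
        idx = Counter(min(max(int((x - lo) / width), 0), bins - 1) for x in xs)
        # A small epsilon keeps log() finite when a bin is empty.
        return [(idx.get(i, 0) + 1e-6) / (len(xs) + bins * 1e-6)
                for i in range(bins)]

    return sum((l - b) * math.log(l / b)
               for b, l in zip(histogram(baseline), histogram(live)))
```

For example, `psi(baseline, baseline)` is 0 for any sample, while a live window concentrated far from the baseline's bulk pushes the index well above 0.25. Alerting on a statistic like this is cheap, but it only flags that the input or output distribution moved; diagnosing why still requires the human-driven review the report's barriers describe.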
The identified gaps, barriers, and open questions highlight impactful opportunities for further investigation and innovation. The monitoring categories can offer a common language for describing sub-fields within AI system monitoring, and the challenges identified highlight areas where additional solutions are needed.
We welcome your engagement as we evaluate how best to support stakeholders in post-deployment monitoring of AI systems. You can share comments via email to NISTAI800-4 [at] nist.gov.
Released March 9, 2026, Updated March 18, 2026