Test & Evaluation Techniques for Meeting M-24-10 Mandates to Manage Generative AI Risk
Overview

The release of the National Institute of Standards and Technology (NIST) AI Risk Management Framework (AI RMF) gave organizations a structure for using testing to manage and mitigate AI risks. While testing is usually treated as a core part of model development, the NIST AI RMF emphasizes the importance of continuous testing and monitoring of AI after deployment as well:

The validity and reliability for deployed AI systems are often assessed by ongoing testing or monitoring that confirms a system is performing as intended. Measurement of validity, accuracy, robustness, and reliability contribute to trustworthiness and should take into consideration that certain types of failures can cause greater harm. (NIST AI RMF §3.1)

OMB’s memo M-24-10 goes into detail about the expectations around AI safety testing. Section 5c of the memo, “Minimum Practices for Safety-Impacting and Rights-Impacting Artificial Intelligence,” lays out the minimum practices for AI risk management. These are: