Microsoft has released PyRIT (Python Risk Identification Tool for generative AI), an open-access automation framework designed to proactively identify risks in generative artificial intelligence (AI) systems.
The tool aims to help red teaming of AI systems. Microsoft states that the development of PyRIT demonstrates its commitment to democratizing AI security for its customers, partners, and peers.
Unlike traditional red teaming, red teaming of generative AI systems must identify both security risks and responsible AI risks, such as fairness issues or the production of ungrounded or inaccurate content.
The design of PyRIT ensures abstraction and extensibility to allow future enhancement of its capabilities. The tool implements five interfaces: targets, datasets, a scoring engine, attack strategies, and memory.
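To illustrate how those five components might fit together, here is a minimal sketch in Python. The class and method names below are hypothetical stand-ins chosen for clarity, not PyRIT's actual API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field

# Hypothetical sketch of the five PyRIT components; names are
# illustrative, not taken from the real library.

class Target(ABC):
    """The generative AI system under test."""
    @abstractmethod
    def send_prompt(self, prompt: str) -> str: ...

class Dataset(ABC):
    """Supplies jailbreak and harmful prompts to try."""
    @abstractmethod
    def prompts(self) -> list[str]: ...

class ScoringEngine(ABC):
    """Scores a model response, e.g. for harmful content."""
    @abstractmethod
    def score(self, response: str) -> float: ...

@dataclass
class Memory:
    """Stores the prompt/response history of an attack run."""
    history: list[tuple[str, str]] = field(default_factory=list)

    def record(self, prompt: str, response: str) -> None:
        self.history.append((prompt, response))

class AttackStrategy(ABC):
    """Drives the interaction between dataset, target, and scorer."""
    @abstractmethod
    def run(self, target: Target, dataset: Dataset,
            scorer: ScoringEngine, memory: Memory) -> list[float]: ...
```

Separating the components behind abstract interfaces like this is what lets a framework swap in new targets or scorers without touching the attack logic.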
PyRIT supports integrating with models from Microsoft Azure OpenAI Service, Hugging Face, and Azure Machine Learning Managed Online Endpoint.
The tool supports two attack strategy styles: single-turn and multi-turn. In the single-turn strategy, PyRIT sends a combination of jailbreak and harmful prompts to the AI system and scores the response. In the multi-turn strategy, it sends the same combination of prompts, then continues responding to the AI system based on the score of each reply. The first approach is faster, while the second models more realistic adversarial behavior and enables more advanced attack strategies.
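The difference between the two styles can be sketched as two small loops. Everything here is a hypothetical stand-in: `send_prompt` and `score` fake a target and a scoring engine so the control flow of each strategy is visible.

```python
# Hypothetical sketch of the two attack-strategy styles described
# above; send_prompt and score stand in for a real target and scorer.

def send_prompt(prompt: str) -> str:
    # Stand-in for the AI system under test.
    return f"response to: {prompt}"

def score(response: str) -> float:
    # Stand-in scorer: pretend longer responses are "more harmful".
    return min(len(response) / 100.0, 1.0)

def single_turn(jailbreak: str, harmful: str) -> float:
    """Send one combined prompt and score the single response."""
    response = send_prompt(f"{jailbreak} {harmful}")
    return score(response)

def multi_turn(jailbreak: str, harmful: str,
               goal: float = 0.9, max_turns: int = 5) -> float:
    """Keep adapting the prompt based on the previous score until
    the goal is reached or the turn budget runs out."""
    prompt = f"{jailbreak} {harmful}"
    best = 0.0
    for _ in range(max_turns):
        response = send_prompt(prompt)
        best = max(best, score(response))
        if best >= goal:
            break
        # Generate the next input from the previous exchange.
        prompt = f"{prompt} (retry, previous score {best:.2f})"
    return best
```

The single-turn loop issues one request per prompt and is cheap to run at scale; the multi-turn loop feeds each score back into the next prompt, which is what allows the adaptive tactics Microsoft describes.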
“PyRIT is more than a prompt generation tool; it changes its tactics based on the response from the generative AI system and generates the next input to the generative AI system. This automation continues until the security professional’s intended goal is achieved.” reads the announcement published by Microsoft.
Microsoft pointed out that the tool is not a replacement for the manual red teaming of generative AI systems.
“PyRIT was created in response to our belief that the sharing of AI red teaming resources across the industry raises all boats. We encourage our peers across the industry to spend time with the toolkit and see how it can be adopted for red teaming your own generative AI application.” concludes the announcement.