As software systems become more complex, cybersecurity teams are managing growing volumes of vulnerabilities that require evaluation, prioritization and remediation.
For regulated medical-device environments, that process carries additional importance because teams must carefully balance cybersecurity, compliance and patient safety considerations.
Within our Advanced Surgery business, one growing challenge involved vulnerabilities published without official severity scores. In those situations, security engineers must manually assess technical documentation and determine how serious each vulnerability may be.
A backlog of 769 vulnerabilities represented roughly 32 working days of manual effort.
To help address the challenge, the R&D Product Security team in our Advanced Surgery business developed an AI-powered Vulnerability Assessment Copilot Agent that reduced the same workload to approximately 3.2 calendar days while maintaining human oversight and governance controls.
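The two figures use different units (working days versus calendar days), so a per-vulnerability view can make the comparison easier to read. The sketch below is illustrative arithmetic only. It assumes an 8-hour working day for the manual estimate and round-the-clock processing for the calendar-day figure, neither of which is stated in the source.

```python
# Figures taken from the article.
BACKLOG = 769                # vulnerabilities without official severity scores
MANUAL_WORKING_DAYS = 32     # estimated manual effort
AGENT_CALENDAR_DAYS = 3.2    # elapsed time with the agent

# Assumptions (not from the article).
HOURS_PER_WORKING_DAY = 8    # assumed length of a working day
HOURS_PER_CALENDAR_DAY = 24  # assumes the agent runs around the clock

manual_minutes_each = MANUAL_WORKING_DAYS * HOURS_PER_WORKING_DAY * 60 / BACKLOG
agent_minutes_each = AGENT_CALENDAR_DAYS * HOURS_PER_CALENDAR_DAY * 60 / BACKLOG

print(f"manual: {manual_minutes_each:.1f} min per vulnerability")
print(f"agent:  {agent_minutes_each:.1f} min per vulnerability")
```

Under those assumptions, the manual estimate works out to about 20 minutes of engineer time per vulnerability, and the agent run to about 6 minutes of elapsed time per vulnerability. The first is hands-on effort and the second is wall-clock time, so this is a rough comparison rather than a like-for-like ratio.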
AI Supporting Engineers, Not Replacing Them
The system was developed using Microsoft Copilot Studio and the Claude Opus 4.6 foundation model.
The AI assistant reviews vulnerability descriptions, searches trusted technical references, compares similar vulnerabilities and generates an initial assessment package for cybersecurity engineers.
For each vulnerability, the system provides:
- Recommended severity scoring
- Detailed rationale
- Supporting references
- Exploitability analysis
- Comparable vulnerabilities
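One way to picture the assessment package is as a single record with one field per item in the list above. This is a minimal sketch, not the team's actual schema: the class name, field names, the 0 to 10 score range and the completeness rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AssessmentPackage:
    """Hypothetical container for one AI-generated initial assessment."""

    vulnerability_id: str
    recommended_score: float          # assumed to be on a 0.0-10.0 scale
    rationale: str
    references: list[str] = field(default_factory=list)
    exploitability_analysis: str = ""
    comparable_vulnerabilities: list[str] = field(default_factory=list)

    def is_complete(self) -> bool:
        """Assumed rule: a usable package has an in-range score,
        a non-empty rationale and at least one supporting reference."""
        return (
            0.0 <= self.recommended_score <= 10.0
            and bool(self.rationale.strip())
            and bool(self.references)
        )


# Placeholder values only; not real vulnerability data.
pkg = AssessmentPackage(
    vulnerability_id="VULN-0001",
    recommended_score=7.5,
    rationale="Placeholder rationale text.",
    references=["placeholder reference"],
    exploitability_analysis="Placeholder analysis text.",
    comparable_vulnerabilities=["VULN-0000"],
)
print(pkg.is_complete())
```

Keeping the rationale and references alongside the score is what lets a reviewing engineer check the recommendation rather than simply accept it.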
The goal is not fully automated cybersecurity decision-making.
Instead, the platform helps reduce repetitive analysis work so experienced engineers can focus more time on higher-value activities such as risk assessment, remediation planning and customer assurance.
Building AI Into Operational Workflows
Beyond the AI model itself, the project focused heavily on workflow integration and governance.
The system supports automated batch processing, recovery after an interrupted run, duplicate prevention and validation testing that is repeated whenever prompts or models are updated.
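Batch processing with restart recovery and duplicate prevention can be sketched with a simple checkpoint file. This is a generic pattern, not the Copilot Studio implementation: `run_batch`, the `assess` callback and the JSON checkpoint format are hypothetical names chosen for illustration.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Iterable


def run_batch(
    vuln_ids: Iterable[str],
    assess: Callable[[str], dict],
    checkpoint: Path,
) -> dict[str, dict]:
    """Assess each ID once, saving progress after every item."""
    # Restart recovery: reload whatever a previous run already finished.
    results = json.loads(checkpoint.read_text()) if checkpoint.exists() else {}
    for vid in vuln_ids:
        # Duplicate prevention: skip IDs that already have a result.
        if vid in results:
            continue
        results[vid] = assess(vid)
        # Write to a temporary file, then swap it in, so an interruption
        # mid-write does not corrupt the existing checkpoint.
        tmp = checkpoint.with_suffix(".tmp")
        tmp.write_text(json.dumps(results))
        tmp.replace(checkpoint)
    return results


# Demo with a stand-in assessor that records which IDs it was asked about.
calls: list[str] = []


def fake_assess(vid: str) -> dict:
    calls.append(vid)
    return {"id": vid, "status": "assessed"}


with tempfile.TemporaryDirectory() as tmp_dir:
    ckpt = Path(tmp_dir) / "checkpoint.json"
    first = run_batch(["A", "B", "A"], fake_assess, ckpt)   # "A" repeated
    second = run_batch(["A", "B", "C"], fake_assess, ckpt)  # simulated restart

print(calls)
```

In the demo, the second call stands in for a restart: only the ID that was not already in the checkpoint is sent to the assessor again.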
The AI agent was also validated against vulnerabilities with official industry severity scores, achieving:
- 88% agreement on severity category
- 80% exact score agreement
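The two agreement figures can be computed along the following lines. This sketch assumes the official scores are CVSS v3.x base scores and uses that standard's qualitative bands (None, Low, Medium, High, Critical); the article does not name the scoring system. The sample pairs are invented placeholders, not the project's validation data.

```python
def severity_band(score: float) -> str:
    """Map a 0.0-10.0 score to the CVSS v3.x qualitative rating (assumed scale)."""
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


def agreement(pairs: list[tuple[float, float]]) -> tuple[float, float]:
    """Return (category agreement, exact score agreement) as fractions.

    Each pair is (agent_score, official_score).
    """
    if not pairs:
        raise ValueError("need at least one scored pair")
    same_band = sum(severity_band(a) == severity_band(o) for a, o in pairs)
    # Scores are compared at one decimal place, as they are published.
    same_score = sum(round(a, 1) == round(o, 1) for a, o in pairs)
    return same_band / len(pairs), same_score / len(pairs)


# Invented (agent, official) pairs for illustration only.
sample = [(9.8, 9.8), (7.5, 8.1), (5.3, 5.3), (6.5, 7.2), (3.1, 3.1)]
category_rate, exact_rate = agreement(sample)
print(f"category agreement: {category_rate:.0%}, exact agreement: {exact_rate:.0%}")
```

Category agreement is the more forgiving measure: a pair such as 7.5 versus 8.1 counts as a match on category (both High) but not on exact score.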
Human review remains mandatory for any output rated Critical or High and for any lower-confidence assessment.
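The review rule can be expressed as a small gate. This is a sketch under assumptions: the article does not say how confidence is measured or where the cutoff sits, so the 0.0 to 1.0 confidence value and the `min_confidence` default of 0.8 are hypothetical. The function flags only the cases the article calls mandatory and says nothing about how other outputs are handled.

```python
MANDATORY_REVIEW_SEVERITIES = {"Critical", "High"}


def needs_mandatory_review(
    severity: str,
    confidence: float,
    min_confidence: float = 0.8,  # hypothetical threshold, not from the article
) -> bool:
    """True when the output is Critical or High, or confidence is below the cutoff."""
    return severity in MANDATORY_REVIEW_SEVERITIES or confidence < min_confidence


print(needs_mandatory_review("High", 0.95))    # high severity, confident
print(needs_mandatory_review("Medium", 0.55))  # lower severity, low confidence
print(needs_mandatory_review("Low", 0.92))     # lower severity, confident
```

Combining the two conditions with "or" means a confident Critical rating still goes to an engineer, which matches the stated intent of keeping people in the decision for the most serious findings.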
Practical AI With Measurable Impact
As AI adoption expands across engineering and technology industries, successful implementation increasingly depends on practical application rather than experimentation alone.
This project demonstrates how AI can improve operational speed and scalability while maintaining engineering rigor, traceability and expert oversight.
It also highlights an important reality about AI adoption in highly technical industries: the strongest outcomes happen when advanced tools are paired with deep domain expertise and thoughtfully engineered workflows.