Sam Altman-led OpenAI has expanded its internal safety processes to address the threat of harmful AI amid increasing government scrutiny.
The company said it will establish a dedicated team to oversee technical work and an operational structure for safety decision-making.
“We are creating a cross-functional Safety Advisory Group to review all reports and send them concurrently to Leadership and the Board of Directors. While Leadership is the decision-maker, the Board of Directors holds the right to reverse decisions,” it said late on Monday.
OpenAI discussed its updated “Preparedness Framework”, saying it will invest in the design and execution of rigorous capability evaluations and forecasting to better detect emerging risks.
“We will run evaluations and continually update ‘scorecards’ for our models. We will evaluate all our frontier models, including at every 2x effective compute increase during training runs. We will push models to their limits,” said the ChatGPT maker.
The goal is to probe the specific edges of what is unsafe in order to effectively mitigate the revealed risks. To track the safety levels of its models, OpenAI will produce risk "scorecards" and detailed reports.
“We will also implement additional security measures tailored to models with high or critical (pre-mitigation) levels of risk,” said the company.
OpenAI said it will develop protocols for added safety and outside accountability.
“The Preparedness Team will conduct regular safety drills to stress-test against the pressures of our business and our own culture,” it added.
The company will also collaborate closely with external parties, as well as internal teams such as Safety Systems, to track real-world misuse.