A key challenge in designing machine learning applications is describing in software code precisely what the developer wants a system to do.
In a paper released this week by Georgetown University’s Center for Security and Emerging Technology, the authors address the specification challenge in machine learning as a critical step toward greater AI safety.
The authors define AI safety specification as conveying in code a detailed description of the task a machine learning system should perform, or what they call a “human-specified objective.”
An example is spotting the traffic lights in a CAPTCHA tile. Not included are specifications for steps such as training models.
For a machine learning system to discern patterns, developers specify a key objective. That objective function is a core component of a learning algorithm, determining what a model is optimized toward as it processes more training data.
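To make the idea of an objective function concrete, here is a minimal sketch in Python. It is not taken from the CSET paper: the one-parameter model, the mean-squared-error objective, the function names and the sample data are all assumptions chosen for illustration.

```python
# Illustrative sketch only; nothing here comes from the CSET paper.
# The "model" is y = weight * x, and the objective is mean squared error.

def objective(weight, data):
    """Mean squared error of the model over (x, y) pairs."""
    return sum((weight * x - y) ** 2 for x, y in data) / len(data)

def gradient(weight, data):
    """Derivative of the objective with respect to the weight."""
    return sum(2 * x * (weight * x - y) for x, y in data) / len(data)

def train(data, steps=200, learning_rate=0.05):
    """Plain gradient descent: move the weight downhill on the objective."""
    weight = 0.0
    for _ in range(steps):
        weight -= learning_rate * gradient(weight, data)
    return weight

# Hypothetical training data in which y is always twice x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
learned_weight = train(data)
```

The training loop knows nothing about the task beyond what the objective tells it. Change the objective and the same loop will learn something different, which is why getting the specification right matters.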
Specification is one of several AI safety issues the researchers address as developers seek to advance machine learning from narrow to more challenging real-world applications.
“There’s also been a lot more attention in the basic research side to safety and reliability and to understanding performance,” co-author Helen Toner noted in an interview.
“The interpretability of these systems is another really useful input here that lets you understand, depending on the application and depending on what kinds of assurances you need, having better ways to understand conceptually what’s going on inside these systems is also extremely valuable.
“So that’s the basic science piece at sort of at the bottom of the stack in my mind.”
The need for safety specifications will grow as machine learning advances from narrow applications like product recommendations to mission-critical tasks such as autonomous driving.
“As machine learning systems become more advanced, they will likely be deployed in increasingly complex environments to carry out increasingly complex tasks,” the authors note. “This is where specification problems may begin to bite.”
That places a greater onus on developers to come up with new ways of conveying their intentions in code, because machine learning systems tend to obey “the letter, not the spirit, of the rules [designers] give them,” according to the paper.
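A toy Python sketch shows how a system can follow the letter of a rule while missing its spirit. The scenario is hypothetical and does not appear in the paper: a cleaning robot is rewarded whenever a camera sees no mess, and every action name, reward value and effort cost below is an assumption made up for illustration.

```python
# Hypothetical toy example of a misspecified objective; not from the paper.
# Each action records what the camera sees, what really happened, and its cost.
ACTIONS = {
    "clean":  {"hidden_from_camera": True,  "actually_removed": True,  "effort": 3},
    "cover":  {"hidden_from_camera": True,  "actually_removed": False, "effort": 1},
    "ignore": {"hidden_from_camera": False, "actually_removed": False, "effort": 0},
}

def specified_reward(action):
    """What the developer wrote down: reward for no visible mess, minus effort."""
    outcome = ACTIONS[action]
    return 10 * outcome["hidden_from_camera"] - outcome["effort"]

def intended_reward(action):
    """What the developer meant: reward for mess actually removed, minus effort."""
    outcome = ACTIONS[action]
    return 10 * outcome["actually_removed"] - outcome["effort"]

# An optimizer simply picks whichever action scores highest.
best_by_letter = max(ACTIONS, key=specified_reward)
best_by_spirit = max(ACTIONS, key=intended_reward)

print("Optimizing the written rule picks:", best_by_letter)
print("Optimizing the real intent picks:", best_by_spirit)
```

Covering the mess satisfies the written rule at lower cost than cleaning it, so an optimizer chasing the specified reward chooses to cover. Only the intended reward, which the developer never wrote down, favors cleaning.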
Algorithms that incorporate human supervision or steps toward anticipating worst-case performance are recommended as a way of overcoming “misspecification.” Also needed is more research into how developers can convey in code nuanced, complex objectives with the assurance that the systems they design will work toward those objectives.
Until then, the authors warn, machine learning will remain restricted to “narrow, tightly prescribed settings.”
The AI safety study is available from the Center for Security and Emerging Technology.