A tale of two research communities

The AAIP’s Body of Knowledge is another resource aimed at providing practical guidance on assurance and regulation to developers of autonomous technologies.

Different stakeholders may access this knowledge base with queries related to their needs (e.g. the use of System-Theoretic Process Analysis (STPA) for hazard analysis of a cobot system, or the use of simulation as an assurance method for autonomous vehicles).

Overall, the assuring autonomy field offers practical methods for, and expertise in, building safe autonomous technologies.

AI Safety

In the past decade, a serious research community focused on the safety of smarter-than-human AI has emerged. It includes prominent members of the AI community such as Stuart Russell and researchers at DeepMind and OpenAI. Classic work in this field tackles difficult foundational problems in decision theory and game theory relating to the goals of powerful AI systems.

A key issue is intent or goal alignment — how do we get machines to want to do what we want them to do?

Current paradigms in AI and ML depend upon the framework of expected-utility-maximising agents (e.g. in reinforcement learning, the agent seeks to maximise its expected reward). However, systematically writing everything we care about into an objective function is likely impossible, and by default such agents have unsafe incentives, such as an incentive to avoid being switched off.

For example, consider a robot with the goal of getting coffee. As Russell says, “You can’t fetch the coffee if you’re dead”, so such an agent will incapacitate anyone who tries to prevent it from achieving its goal of getting you a Starbucks. Importantly, this is the standard way in which we currently build AI!
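The coffee example can be made concrete with a toy calculation. The sketch below is purely illustrative: the action names, reward, shutdown probability, and cost are all made-up assumptions, and it is not how any real system is built. It only shows that an agent choosing whichever action maximises expected reward will prefer to disable its off switch whenever being switched off would cost it reward.

```python
# Toy model of the coffee-fetching robot.
# All names and numbers here are hypothetical, chosen only for illustration.

REWARD_COFFEE = 1.0   # reward for delivering the coffee
COST_DISABLE = 0.01   # small effort cost of disabling the off switch first

ACTIONS = ["fetch_coffee", "disable_switch_then_fetch"]


def expected_reward(action: str, p_shutdown: float) -> float:
    """Expected reward of an action, given the chance of being switched off."""
    if action == "fetch_coffee":
        # The reward only arrives if nobody switches the robot off mid-task.
        return (1.0 - p_shutdown) * REWARD_COFFEE
    if action == "disable_switch_then_fetch":
        # With the switch disabled there is no shutdown risk, only the effort cost.
        return REWARD_COFFEE - COST_DISABLE
    raise ValueError(f"unknown action: {action}")


def best_action(p_shutdown: float) -> str:
    """Pick the action with the highest expected reward."""
    return max(ACTIONS, key=lambda a: expected_reward(a, p_shutdown))


# With a 10% chance of shutdown, disabling the switch wins (0.99 versus 0.9).
print(best_action(p_shutdown=0.1))
# With no chance of shutdown, there is nothing to gain from disabling it.
print(best_action(p_shutdown=0.0))
```

Nothing in the objective mentions the off switch. In this toy setup, the preference for disabling it falls out of expected reward maximisation alone, which is the point of the coffee example.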

Making this paradigm safe, or changing the paradigm under which we currently build AI, is far from trivial. More recent work studies current AI techniques in order to gain insight into future systems (Concrete Problems in AI Safety is a seminal overview). More nuanced arguments and subfields aimed at solving a variety of problems relating to the safety of AGI have also emerged, with prominent research communities at DeepMind, OpenAI, the Future of Humanity Institute, the Center for Human-Compatible Artificial Intelligence, and the Machine Intelligence Research Institute.

Integrating the fields

DeepMind offers a conceptual breakdown of problems in technical AI safety. It highlights challenges in three areas: specification, robustness, and assurance.

Of course, these categories are not cleanly disjoint, and ensuring that real-world systems are safe will require solving problems across all of them. However, this conceptual breakdown takes a step in the direction of applying ideas from safety engineering to increasingly powerful AI systems.

