Welcome to the AI Safety Concept Map! This is my first update to the site, so I thought I'd include a bit of background on why it exists. Here's the diagram at the time of writing:

[Image: AI Safety Map]

This website aims to provide a visual map of the conceptual shape of the field of existential AI safety. The site also doubles as my project for the March 2024 cohort of the AI Safety Fundamentals Alignment Course (AISF) run by BlueDot Impact.

The concept map primarily came out of my difficulty wrapping my head around all the moving parts of the AI existential safety problem. Going through AISF, I was exposed to a lot of curated introductory material about AI alignment, and one thing that stood out was how interrelated the different approaches and considerations were. However, when reading through and discussing the materials, it was difficult to keep track of how everything fit together into the big picture. Add to that the various online resources outlining a seemingly overwhelming number of ways things could go wrong (for example, AGI Ruin: A List of Lethalities), and I felt the need to sit down, categorize, and synthesize what I was learning.

The map is inspired by aisafety.world and is similar to visualizations such as Mapping the Conceptual Territory in AI Existential Safety and Alignment, while aiming to be a bit more concrete. It draws on principles from fault tree and bowtie diagrams to organize the concepts so that causal factors are visible. The map is also focused on existential risk, so it has less content about other important AI risks such as surveillance, bias, and unemployment.

There is a lot of future work to improve this map:

  • Expand the number of concepts and themes represented. The map is heavily based on the content provided in the AI Safety Fundamentals Alignment Course run by BlueDot Impact. The next steps here are to add topics from the relevant Wikipedia pages and to run the diagram past subject matter experts for feedback on gaps or inaccuracies.
  • Move towards a more dynamic visualization. The current diagram is static and has no links, which limits its ease of use. A first step would be to add links in a similar way to aisafety.world. To facilitate this, I have created an AI Safety Concept Map Sheet where you can leave suggestions! A further step is to build a dashboard that displays the relationships between the different items dynamically.

I hope you find the map clarifying and helpful!