Definition
AI safety is the research field focused on identifying, understanding, and mitigating harms from advanced AI systems, ranging from misuse and misalignment to loss of control. It overlaps with, but is distinct from, AI ethics (focused on present-day harms) and AI security (focused on AI systems themselves as targets of attack).
Episodes covering this
Worth reading next
Papers we haven't done a deep dive on yet, but would recommend on this topic.