Cold Takes on AI

Here’s a quick guide to the AI content on Cold Takes.

The Most Important Century

This is the core of my AI content - the case that AI could make this the most important century of all time for humanity.

It’s a long series, but this page has a short written summary with all the key charts, plus a link to audio for the whole series (4 hours) or shorter discussions on podcasts.

The central point of this series is that the long-run future could arrive much faster than people tend to imagine, because the right kind of AI could cause explosive acceleration in the rate of progress. This idea is explained in the summary.

Because of this point:

  • I think the wilder, wackier AI risks aren’t merely “long-run” risks we can deal with later, after the “short-run” ones. I think it’s important to be preparing for them today.
  • I don’t think we can count on society “adapting as we go” to challenges of AI.
  • I’m generally in favor of technological progress, because I think it has usually been good over the course of history (more here). But the rate of progress we could see with AI is unprecedented, so I think we need to be far more careful with AI than with other technologies.

The risk of misaligned AI

In some ways, developing advanced enough AI would be like introducing a new species to the world - one capable of developing new weaponry, persuading and manipulating humans, and (crucially) quickly copying itself.

If we’re careful or lucky, such AIs might reliably behave as intended and serve as tools or helpers for humans. If we’re not, the world could end up run by misaligned AI for ends of its own (as, today, it’s run by humans rather than other animals).

I think this is a very important risk to pay attention to, because it means we can’t just view AI through a lens like “Make sure my country builds it before another one does.” We need to worry that moving too quickly and incautiously could mean catastrophe for everyone.

Key pieces on this topic:

  • AI Could Defeat All Of Us Combined argues that advanced enough AI could bring down human civilization entirely, if (for whatever reason) it were pointed toward that goal. AI wouldn’t need superhuman intelligence (though that would help), just the ability to copy itself to the point where humans are outnumbered.
  • Why Would AI “Aim” to Defeat Humanity? walks through how the way AI is developed today - trial-and-error-based training - could result in AIs that deceive and manipulate humans, and have “aims” that include disempowering humanity.
  • How We Could Stumble Into AI Catastrophe tells stylized stories of how the world could end up rushing unsafe AI systems into deployment, despite the good intentions of AI developers and despite a reasonable number of warning signs. It could be very hard to be as cautious as we need to be, given the huge commercial and other incentives to race forward with AI development.
  • AI Safety Seems Hard to Measure argues that it could be very hard to know whether our AI systems have dangerous “aims” or not. Commercial incentives could lead to AI systems that appear safe as far as we can tell - and that might not be good enough.
  • High-Level Hopes for AI Alignment outlines a few ways that we might overcome the challenges of measurement and end up with AI systems that reliably behave as intended.
  • What does Bing Chat tell us about AI risk? is a short piece relating Bing’s strange behavior in early 2023 to the above concerns.

Other high-stakes risks of AI

I discuss a number of other ways AI could quickly create radical new challenges, in Transformative AI issues (not just misalignment): an overview.

This piece touches on many issues that seem very important but have gotten almost no attention, driving home that we’re not ready for the challenges rapid AI development could bring.

So what?

I’ve written a number of pieces about what we can do today to try to reduce key AI risks, and help the most important century go well:

  • Ideal Governance and Nonprofit Boards are Weird are both relevant for organizations that are trying to build powerful AI systems and wondering whether traditional approaches to governance are good enough.
  • The Track Record of Futurists Seems ... Fine. Some people think it’s futile to try to reason about future developments like transformative AI, because humans are too bad at predicting the future. Here I look at the track record of a few famous sci-fi writers, and conclude that their predictions aren’t great, but aren’t so bad that we should give up on futurism entirely. (There have been some good criticisms of this piece, which I hope to eventually write about.)
  • I have a series on what utopia could look like (and why it’s hard to talk about without sounding scary).
  • Rowing, Steering, Anchoring, Equity, Mutiny - I think different people have radically different pictures of what it means to "work toward a better world." I think this explains a number of the biggest chasms between people who think of themselves as well-meaning but don't see the other side that way. This piece explores five different orientations toward helping the world: prioritizing “progress” (rowing), trying to anticipate key future developments (steering), avoiding change altogether (anchoring), focusing on present-day needs (equity), and working to upend the global order (mutiny).
  • “Technological unemployment” AI vs. “most important century” AI: how far apart? This piece argues that we might not see sustained AI-driven unemployment before we see wackier things like a world run by misaligned AI.