Astral Codex Ten

Biological Anchors: A Trick That Might Or Might Not Work
Feb 23
Introduction
I've been trying to review and summarize Eliezer Yudkowsky's recent dialogues on AI
safety. Previously in sequence: Yudkowsky Contra Ngo On Agents. Now we’re up to
Yudkowsky contra Cotra on biological anchors, but before we get there we need to figure
out what Cotra's talking about and what's going on.

The Open Philanthropy Project ("Open Phil") is a big effective altruist foundation
interested in funding AI safety. It's got $20 billion, probably the majority of money in the
field, so its decisions matter a lot and it’s very invested in getting things right. In 2020, it
asked senior researcher Ajeya Cotra to produce a report on when human-level AI would
arrive. It says the resulting document is "informal" - but it’s 169 pages long and likely to
affect millions of dollars in funding, which some might describe as making it kind of formal.
The report finds a 10% chance of “transformative AI” by 2031, a 50% chance by 2052, and an
almost 80% chance by 2100.

Eliezer rejects their methodology and expects AI earlier (he doesn’t offer many numbers,
but here he gives Bryan Caplan 50-50 odds on 2030, albeit not totally seriously). He made
the case in his own very long essay, Biology-Inspired AGI Timelines: The Trick That
Never Works, sparking a bunch of arguments and counterarguments and even more long
essays.

There's a small cottage industry of summarizing the report already, eg OpenPhil CEO
Holden Karnofsky's article and Alignment Newsletter editor Rohin Shah's comment. I've
drawn from both for my much-inferior attempt.

Part I: The Cotra Report


Ajeya Cotra is a senior research analyst at OpenPhil. She's assisted by her fiancé Paul
Christiano (compsci PhD, OpenAI veteran, runs an AI alignment nonprofit) and to a lesser
degree by other leading lights. Although not everyone involved has formal ML training, if
you care a lot about whether efforts are “establishment” or “contrarian”, this one is
probably more establishment.

The report tries to estimate when we will first get "transformative AI" (ie AI which produces a
transition as impressive as the Industrial Revolution; probably this will require it to be
about as smart as humans). Its methodology is:

1. Figure out how much inferential computation the human brain does.

2. Try to figure out how much training computation it would take, right now, to get a neural
net that does the same amount of inferential computation. Get some mind-bogglingly large
number.

3. Adjust for "algorithmic progress", ie maybe in the future neural nets will be better at
using computational resources efficiently. Get some number which, realistically, is still
mind-bogglingly large.

4. Probably if you wanted that mind-bogglingly large amount of computation, it would take
some mind-bogglingly large amount of money. But computation is getting cheaper every
year. Also, the economy is growing every year. Also, the share of the economy that goes to
investments in AI companies is growing every year. So at some point, some AI company
will actually be able to afford that mind-bogglingly large amount of money, deploy the
mind-bogglingly large amount of computation, and train the AI that has the same
inferential computation as the human brain.

5. Figure out what year that is.

Does this encode too many questionable assumptions? For example, might AGI come from
an ecosystem of interacting projects (eg how the Industrial Revolution came from an
ecosystem of interacting technologies) such that nobody has to train an entire brain-sized
AI in one run? Maybe - in fact, Ajeya thinks the Industrial Revolution scenario might be
more likely than the single-run scenario. But she treats the single-run scenario as a useful upper bound (later she mentions other reasons to treat it as a lower bound, and compromises by using it as a central estimate) and still thinks it's worth figuring out how long it will
take.

So let’s go through the steps one by one:

How Much Computation Does The Human Brain Do?


Step one - figuring out how much computation the human brain does - is a daunting task.
A successful solution would look like a number of FLOP/S (floating point operations per
second), a basic unit of computation in digital computers. Luckily for Ajeya and for us,
another OpenPhil analyst, Joe Carlsmith, finished a report on this a few months prior. It
concluded the brain probably uses 10^13 - 10^17 FLOP/S. Why? Partly because this was the
number given by most experts. But also, there are about 10^15 synapses in the brain, each
one spikes about once per second, and a synaptic spike probably does about one FLOP of
computation.

(I'm not sure if he's taking into account the recent research suggesting that computation
sometimes happens within dendrites - see section 2.1.1.2.2 of his report for complications
and why he feels okay ignoring them - but realistically there are lots of order-of-magnitude-
sized gray areas here, and he gives a sufficiently broad range that as long as the unknown
unknowns aren't all in the same direction it should be fine.)

So a human-level AI would also need to do 10^15 floating point operations per second?
Unclear. Computers can run on more or less efficient algorithms; neural nets might use
their computation more or less effectively than the brain. You might think it would be more
efficient, since human designers can do better than the blind chance of evolution. Or you
might think it would be less efficient, since many biological processes are still far beyond
human technology. Or you might do what OpenPhil did and just look at a bunch of
examples of evolved vs. designed systems and see which are generally better:

Source: This document by Paul Christiano.

Ajeya combines this with another metric where they see how existing AI compares to
animals with apparently similar computational capacity; for example, she says that
DeepMind’s StarCraft engine has about as much inferential compute as a honeybee and
seems about equally subjectively impressive. I have no idea what this means. Impressive at
what? Winning multiplayer online games? Stinging people? In any case, they decide to
penalize AI by one order of magnitude compared to Nature, so a human-level AI would
need to do 10^16 floating point operations per second.
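
If you like seeing the arithmetic written out, here is a back-of-the-envelope sketch of the figures above - my own rendering of the quoted numbers, not anything from the report:

```python
# Back-of-the-envelope version of the figures quoted above (order-of-magnitude only).
synapses = 1e15            # synapses in the human brain
spikes_per_second = 1      # each synapse spikes roughly once per second
flop_per_spike = 1         # one synaptic spike ~ one floating point operation

brain_flops = synapses * spikes_per_second * flop_per_spike    # ~1e15 FLOP/S

# Penalize artificial designs by one order of magnitude relative to Nature:
ai_inference_flops = 10 * brain_flops                          # ~1e16 FLOP/S

print(f"brain ~{brain_flops:.0e} FLOP/S, human-level AI ~{ai_inference_flops:.0e} FLOP/S")
```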

How Much Compute Would It Take To Train A Model That Does 10^16 Floating Point Operations Per Second?
So an AI could potentially equal the human brain with 10^16 FLOP/S.

Good news! There’s a supercomputer in Japan that can do 10^17 FLOP/S!

It looks like this (source)

So why don’t we have AI yet? Why don’t we have ten AIs?

In the modern paradigm of machine learning, it takes very big computers to train relatively
small end-product AIs. If you tried to train GPT-3 on the same kind of medium-sized
computers you run it on, it would take between tens and hundreds of years. Instead, you
train GPT-3 on giant supercomputers like the ones above, get results in a few months, then
run it on medium-sized computers, maybe ~10x better than the average desktop.

But our hypothetical future human-level AI is 10^16 FLOP/S in inference mode. It needs to
run on a giant supercomputer like the one in the picture. Nothing we have now could even
begin to train it.

There’s no direct and obvious way to convert inference requirements to training requirements. Ajeya tries assuming that each parameter will contribute about 10 FLOPs,
which would mean the model would have about 10^15 parameters (GPT-3 has about 10^11
parameters). Finally, she uses some empirical scaling laws derived from looking at past
machine learning projects to estimate that training 10^15 parameters would require
H*10^30 FLOPs, where H represents the model’s “horizon”.
If I understand this correctly, “horizon” is a reinforcement learning concept: how long does
it take to learn how much reward you got for something? If you’re playing a slot machine,
the answer is one second. If you’re starting a company, the answer might be ten years. So
what horizon do you need for human level AI? Who knows? It probably depends on what
human-level task you want the AI to do, plus how well an AI can learn to do that task from
things less complex than the entire task. If writing a good book is mostly about learning to
write good sentences and then stringing them together, a book-writing AI can get away with
a short horizon. If nothing short of writing an entire book and then evaluating it to see
whether it is good or bad can possibly teach you book-writing, the AI will need a long time
horizon. Ajeya doesn’t claim to have a great answer for this, and considers three models:
horizons of a few minutes, a few hours, and a few years. Each step up adds another three
orders of magnitude, so she ends up with three estimates of 10^30, 10^33, and 10^36 FLOPs.

(for reference, the lowest training estimate - 10^30 - would take the supercomputer pictured
above 300,000 years to complete; the highest, 300 billion.)
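
Here is the same chain of arithmetic as a toy sketch, using the round numbers quoted above rather than Ajeya's actual scaling-law fits (the ~10-FLOPs-per-parameter rule and the three horizon multipliers are just the figures from the text):

```python
# Toy rendering of the training-cost arithmetic above (round numbers from the text,
# not Ajeya's actual scaling-law fits).
inference_flops = 1e16                 # FLOP/S of the hypothetical human-level model
params = inference_flops / 10          # ~10 FLOPs per parameter -> ~1e15 parameters
print(f"parameters: ~{params:.0e}")

# The text summarizes the result as roughly H * 1e30 training FLOPs, where each
# step up in horizon length (minutes -> hours -> years) adds ~3 orders of magnitude.
horizon_multiplier = {"short (minutes)": 1, "medium (hours)": 1e3, "long (years)": 1e6}

supercomputer_flops = 1e17             # the Japanese supercomputer mentioned above
seconds_per_year = 3.15e7

for name, h in horizon_multiplier.items():
    training_flops = h * 1e30
    years = training_flops / supercomputer_flops / seconds_per_year
    print(f"{name}: {training_flops:.0e} training FLOPs, ~{years:,.0f} supercomputer-years")
```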

Or What If We Ignore All Of That And Do Something Else?


This is piling a lot of assumptions atop each other, so Ajeya tries three other methods of
figuring out how hard this training task is.

Humans seem to be human-level AIs. How much training do we need? You can analogize
our childhood to an AI’s training period. We receive a stream of sense-data. We start out
flailing kind of randomly. Some of what we do gets rewarded. Some of what we do gets
punished. Eventually our behavior becomes more sophisticated. We subject our new
behavior to reward or punishment, fine-tune it further.

Rent asks us: how do you measure the life of a woman or man? It answers: “in daylights, in
sunsets, in midnights, in cups of coffee; in inches, in miles, in laughter, in strife.” But you
can also measure in floating point operations, in which case the answer is about 10^24. This
is actually trivial: multiply the 10^15 FLOP/S of the human brain by the ~10^9 seconds of
childhood and adolescence. This new estimate of 10^24 is much lower than our neural net
estimate of 10^30 - 10^36 above. In fact, it’s only a hair above the amount it took to train
GPT-3! If human-level AI was this easy, we should have hit it by accident sometime in the
process of making a GPT-4 prototype. Since OpenAI hasn’t mentioned this, probably it’s
harder than this and we’re missing something.

Probably we’re missing that humans aren’t blank slates. We don’t start at zero and then only
use our childhood to train us further. The very structure of our brain encodes certain
assumptions about what kinds of data we should be looking out for and how we should use
it. Our training data isn’t just what we observed during childhood, it’s everything that any
of our ancestors observed during evolution. How many floating-point operations is the
evolutionary process?

Ajeya estimates 10^41. I can’t believe I’m writing this. I can’t believe someone actually
estimated the number of floating point operations involved in jellyfish rising out of the
primordial ooze and eventually becoming fish and lizards and mammals and so on all the
way to the Ascent of Man. Still, the idea is simple. You estimate how long animals with
neurons have been around (10^16 seconds), multiply by the total number of animals alive at any given second (10^20) and the average FLOP/S per animal (10^5) - you can read more here - and it comes out to 10^41 FLOPs. I would not call this an exact estimate - for one thing, it assumes
that all animals are nematodes, on the grounds that non-nematode animals are basically a
rounding error in the grand scheme of things. But it does justify this bizarre assumption,
and I don’t feel inclined to split hairs here - surely the total amount of computation
performed by evolution is irrelevant except as an extreme upper bound? Surely the part
where Australia got all those weird marsupials wasn’t strictly necessary for the human
brain to have human-level intelligence?
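
Both anchors are simple enough to write out; this is just the order-of-magnitude arithmetic from the last two paragraphs, not anything more precise from the report:

```python
# Lifetime anchor: brain FLOP/S times the seconds in a human childhood.
lifetime_anchor = 1e15 * 1e9            # ~1e24 FLOPs, a hair above GPT-3's training compute

# Evolution anchor: seconds since animals with neurons appeared, times animals
# alive at any moment, times FLOP/S per (assumed nematode-sized) animal.
evolution_anchor = 1e16 * 1e20 * 1e5    # ~1e41 FLOPs

print(f"lifetime ~{lifetime_anchor:.0e} FLOPs, evolution ~{evolution_anchor:.0e} FLOPs")
```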

One more weird human training data estimate attempt: what about the genome? If in some
sense a bit of information in the genome is a “parameter”, how many parameters does that
suggest humans have, and how does it affect training time? Ajeya calculates that the
genome has about 7.5x10^8 parameters (compared to 10^15 parameters in our neural net
calculation, and 10^11 for GPT-3). So we can…

Okay, I’ve got to admit, this doesn’t have quite the same “huh?!” factor as trying to calculate
the number of FLOPs in evolution, but it is in a lot of ways even crazier. The Japanese
canopy plant has a genome fifty times larger than ours, which suggests that genome size
doesn’t correspond very well to organism awesomeness. Also, most of the genome is coding
for weird proteins that stabilize the shape of your kidney tubule or something, why should
this matter for intelligence?
The Japanese canopy plant. I think it is very pretty, but probably low
prettiness per megabyte of DNA.

I think Ajeya would answer that she’s debating orders of magnitude here, and each of these
weird things costs only a few OOMs and probably they all even out. That still leaves the
question of why she thinks this approach is interesting at all, to which she answers that:

The motivating intuition is that evolution performed a search over a space of small,
compact genomes which coded for large brains rather than directly searching over the
much larger space of all possible large brains, and human researchers may be able to
compete with evolution on this axis.

So maybe instead of having to figure out how to generate a brain per se, you figure out how
to generate some short(er) program that can output a brain? But this would be very
different from how ML works now. Also, you need to give each short program the chance to
unfold into a brain before you can evaluate it, which evolution has time for but we probably
don’t.

Ajeya sort of mentions these problems and counters with an argument that maybe you
could think of the genome as a reinforcement learner with a long horizon. I don’t quite
follow this but it sounds like the sort of thing that almost might make sense. Anyway, when
you apply the scaling laws to a 7.5*10^8 parameter genome and penalize it for a long
horizon, you get about 10^33 FLOPs, which is weirdly similar to some of the other
estimates.
So now we have six different training cost estimates. First, neural nets with short, medium,
and long horizons, which are 10^30, 10^33, and 10^36 FLOPs, respectively. Next, the
amount of training data in a human lifetime - 10^24 FLOPs - and in all of evolutionary
history - 10^41 FLOPs. And finally, this weird genome thing, which is 10^33 FLOPs.
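
Here are the six numbers collected in one place, with the spread computed (my own summary of the figures above):

```python
import math

# The six training-compute estimates quoted above, in FLOPs.
estimates = {
    "neural net, short horizon":  1e30,
    "neural net, medium horizon": 1e33,
    "neural net, long horizon":   1e36,
    "human lifetime":             1e24,
    "evolutionary history":       1e41,
    "genome as parameter count":  1e33,
}

spread = max(estimates.values()) / min(estimates.values())
print(f"spread: {spread:.0e}x (~{math.log10(spread):.0f} orders of magnitude)")
```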

An optimist might say “Well, our lowest estimate is 10^24 FLOPs, our highest is 10^41 FLOPs, those sound like kind of similar numbers, at least there’s no "5 FLOPs" or "10^9999 FLOPs" in there.”

A pessimist might say “The difference between 10^24 and 10^41 is seventeen orders of
magnitude, ie a factor of 100,000,000,000,000,000 times. This barely constrains our
expectations at all!”

Before we decide who to trust, let's remember that we're still only at Step 2 of our five-step methodology, and continue.

How Do We Adjust For Algorithmic Progress?


So today, in 2022 (or in 2020 when this was written, or whenever), assume it would take
about 10^33 FLOPs to train a human-level AI.

But technology constantly advances. Maybe we’ll discover ways to train AIs faster, or run
AIs more efficiently, or something like that. How does that factor into our estimate?

Ajeya draws on Hernandez & Brown’s Measuring The Algorithmic Efficiency Of Neural
Networks. They look at how many FLOPs it took to train various image recognition AIs to
an equivalent level of performance between 2012 and 2019, and find that over those seven
years it decreased by a factor of 44x, ie training efficiency doubles every sixteen months!
Ajeya assumes a doubling time slightly longer than that, because it’s easier to make
progress in simple well-understood fields like image recognition than in the novel task of
human-level AI. She chooses a doubling time of “merely” 2 - 3 years.

If training efficiency doubles every 2-3 years, it would dectuple in about 10 years. So
although it might take 10^33 FLOPs to train a human level AI today, in ten years or so it
may take only 10^32, in twenty years 10^31, and so on.
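
As a sketch, using 3 years - the slow end of Ajeya's range - as an illustrative halving time:

```python
# Training cost in FLOPs falls as algorithms improve: halving every ~3 years
# (the slow end of Ajeya's 2-3 year range) is roughly 10x per decade.
def training_flops_needed(year, base_flops=1e33, base_year=2020, halving_years=3):
    return base_flops / 2 ** ((year - base_year) / halving_years)

for year in (2020, 2030, 2040, 2050):
    print(year, f"{training_flops_needed(year):.1e} FLOPs")
```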

When Will Anyone Have Enough Computational Resources To Train A Human-Level AI?

In 2020, AI researchers could buy computational resources at about $1 for 10^17 FLOPs.
That means the 10^33 FLOPs you’d need to train a human-level AI would cost $10^16, ie
ten quadrillion dollars. This is about twenty times more money than exists in the entire
world.
But compute costs fall quickly. Some formulations of Moore’s Law suggest it halves every
eighteen months. These no longer seem to hold exactly, but it does seem to be halving
maybe once every 2.5 years. The exact number is kind of controversial: Ajeya admits it’s
been more like once every 3-4 years lately, but she heard good things about some upcoming
chips and predicted it might revert back to the longer-term faster trend (it’s been two years
now, some new chips have come out, and this prediction is looking pretty good).

So as time goes on, algorithmic progress will cut the cost of training (in FLOPs), and
hardware progress will also cut the cost of FLOPs (in dollars). So training will become
gradually more affordable as time goes on. Once it reaches a cost somebody is willing to
pay, they’ll buy human-level AI, and then that will be the year human-level AI happens.

What is the cost that somebody (company? government? billionaire?) is willing to pay for
human-level AI?

The most expensive AI training in history was AlphaStar, a DeepMind project that spent
over $1 million to train an AI to play StarCraft (in their defense, it won). But people have
been pouring more and more money into AI lately:

Source here. This is about compute rather than cost, but most of the
increase seen here has been companies willing to pay for more compute
over time, rather than algorithmic or hardware progress.

The StarCraft AI was kind of a vanity project, or science for science’s sake, or whatever you
want to call it. But AI is starting to become profitable, and human-level AI would be very
profitable. Who knows how much companies will be willing to pay in the future?

Ajeya extrapolates the line on the graph forward to 2025 and gets $1 billion. This is starting
to sound kind of absurd - the entire company OpenAI was founded with $1 billion in
venture capital, it seems like a lot to expect them to spend more than $1 billion on a single
training run. So Ajeya backs off from this after 2025 and predicts a “two year doubling
time”. This is not much of a concession. It still means that in 2040 someone might be
spending $100 billion to train one AI.

Is this at all plausible? At the height of the Manhattan Project, the US was investing about
0.5% of its GDP into the effort; a similar investment today would be worth $100 billion. And
we’re about twice as rich as 2000, so 2040 might be twice as rich as we are. At that point,
$100 billion for training an AI is within reach of Google and maybe a few individual
billionaires (though it would still require most or all of their fortune).

Ajeya creates a complicated function to assess how much money people will be willing to spend on giant AI projects per year. This looks like an upward-sloping curve. The line
representing the likely cost of training a human-level AI looks like a downward sloping
curve. At some point, those two curves meet, representing when human-level AI will first
be trained.
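
Here is a toy version of that crossing-point logic, with constants taken from the round numbers quoted earlier. The functional forms are my simplification, not Ajeya's actual model, so don't expect it to reproduce her median exactly:

```python
# Toy crossing-point model: training cost in dollars falls over time (algorithmic
# progress plus cheaper hardware) while willingness to pay rises; the forecast
# year is wherever the curves cross. Constants are the round numbers quoted in
# the text; Ajeya's real model is probabilistic and more conservative about
# spending, so this crude version won't land on her 2052 median.

def training_cost_dollars(year, flops_2020=1e33):
    flops_needed = flops_2020 / 2 ** ((year - 2020) / 3)       # algorithmic halving ~3 yr
    dollars_per_flop = 1e-17 / 2 ** ((year - 2020) / 2.5)      # hardware halving ~2.5 yr
    return flops_needed * dollars_per_flop

def willingness_to_pay(year):
    return 1e9 * 2 ** ((year - 2025) / 2)                      # ~$1B in 2025, doubling every 2 yr

crossing = next(y for y in range(2020, 2101) if training_cost_dollars(y) <= willingness_to_pay(y))
print("toy crossing year:", crossing)
```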

So When Will We Get Human-Level AI?


The report gives a long distribution of dates based on weights assigned to the six different
models, each of which has really wide confidence intervals and options for adjusting the
mean and variance based on your assumptions. But the median of all of that is 10% chance
by 2031, 50% chance by 2052, and almost 80% chance by 2100.

Ajeya takes her six models and decides to weigh them like so, based on how plausible she
thinks each one is:

20% neural net, short horizon
30% neural net, medium horizon
15% neural net, long horizon
5% human lifetime as training data
10% evolutionary history as training data
10% genome as parameter number

She ends up with this:


How Sensitive Is This To Changes In Assumptions?
She very helpfully gives us a Colab notebook and Google spreadsheet to play around with.
The notebook lets you change some of the more detailed parameters of the individual
models, and the spreadsheet lets you change the big picture. I leave the notebook to people
more dedicated to forecasting than I am, and will talk about the spreadsheet here.

If you’re following along at home, the default spreadsheet won’t reflect Ajeya’s findings
until you fill in the table in the bottom left like so:
Great. Now that we’ve got that, let’s try changing some stuff. I like the human childhood
training data argument (Lifetime Anchor) more than Ajeya does, and I like the size-of-the-
genome argument less. I’m going to change the weights to 20-20-0-20-20-20. Also, Ajeya
thinks that someone might be willing to spend 1% of national GDP on training AIs, but
that sounds really high to me, so I'm going to lower it to 0.1%. Also, Ajeya's estimate of 3%
GDP growth sounds high for the sort of industrialized nations who might do AI research,
I’m going to lower it to 2%.

Since I’m feeling mistrustful today, let's use the Hernandez & Brown estimate for algorithmic efficiency halving (1.5 years) in place of Ajeya's ad hoc adjustment. And let's use the current hardware cost halving time (3.5 years) instead of Ajeya's overly rosy version (2.5 years).

All these changes…


…don’t really do much. The median goes from 2052 to about 2065. Four of the models give
results between 2030 and 2070. The last two, Neural Net With Long Horizon and Evolution,
suggest probably no AI this century (although Neural Net With Long Horizon does think
there’s a 40% chance by 2100). Ajeya doesn’t really like either of these models and they’re
not heavily weighted in her main result.

Does The Truth Point To Itself?


Back up a second. Here’s something that makes me kind of nervous.

Most of Ajeya’s numbers are kind of made up, with several order-of-magnitude error bars
and simplifying assumptions like “all animals are nematodes”. For a single parameter, we get estimates spanning seventeen orders of magnitude: the upper bound is one hundred quadrillion times the lower bound.

And yet four of the six models, including two genuinely exotic ones, manage to get dates
within twenty years of 2050.

And 2050 is also the date everyone else focuses on. Here’s the prediction-market-like site
Metaculus:
Their distribution looks a lot like Ajeya’s, and even has the same median, 2052 (though
forecasters could have read Ajeya’s report).

Katja Grace et al surveyed 352 AI experts, and they gave a median estimate of 2062 for an
AI that could “outperform humans at all tasks” (though with many caveats and high
sensitivity to question framing). This was before Ajeya’s report, so they definitely didn’t
read it.

So lots of Ajeya’s different methods and lots of other people presumably using different
methodologies or no methodology at all, all converge on this same idea of 2050 give or take
a decade or two.

An optimist might say “The truth points to itself! There are 371 known proofs of the
Pythagorean Theorem, and they all end up in the same place. That’s because no matter
what methodology you use, if you use it well enough you get to the correct answer.”

A pessimist might be more suspicious; we’ll return to this part later.

FLOPS Alone Turn The Wheel Of History


One more question: what if this is all bullshit? What if it’s an utterly useless total garbage
steaming pile of grade A crap?

Imagine a scientist in Victorian Britain, speculating on when humankind might invent ships that travel through space. He finds a natural anchor: the moon travels through space!
He can observe things about the moon: for example, it is 220 miles in diameter (give or take
an order of magnitude). So when humankind invents ships that are 220 miles in diameter,
they can travel through space!
Ships have certainly grown in size tremendously, from primitive kayaks to Roman triremes
to Spanish galleons to the great ocean liners of the (Victorian) present.

The AI forecasting organization AI Impacts actually has a whole report on historical ship size trends to prove an unrelated point about technological progress, so I didn’t even have to make this graph up.

Suppose our Victorian scientist lived in 1858, right when the Great Eastern was launched.
The trend line for ship size crossed 100m around 1843, and 200m in 1858, so doubling time
is 15 years - but perhaps they notice this is going to be an outlier, so let’s round up a bit and
say 18 years. The (one order of magnitude off estimate for the size of the) Moon is 350,000m,
so you’d need ships to scale up by 350,000/200 = 1,750x before they’re as big as the Moon.
That’s about 10.8 doublings, and a doubling time is 18 years, so we’ll get spaceships in . . .
2052 exactly.
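
If you want to replay the fudging yourself, it fits in a few lines (same made-up numbers as above):

```python
import math

# The Victorian scientist's "forecast", replayed with the same fudged numbers.
ship_length_1858 = 200          # metres, roughly the Great Eastern
moon_diameter = 350_000         # metres, the one-order-of-magnitude-off estimate
doubling_time = 18              # years, after rounding 15 up a bit

doublings = math.log2(moon_diameter / ship_length_1858)            # ~10.8
print("spaceships by", round(1858 + doublings * doubling_time))    # ~2052
```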

(fudging numbers to land where you want is actually fun and easy)
SS Great Eastern, the extreme outlier large steamship from 1858. This has
become sort of a mascot for quantitative technological progress forecasters.

What is this scientist’s error? The big one is thinking that spaceship progress depends on
some easily-measured quantity (size) instead of on fundamental advances (eg figuring out
how rockets work). You can make the same accusation against Ajeya et al: you can have all
the FLOPs in the world, but if you don’t understand how to make a machine think, your AI
will be, well, a flop.

Ajeya discusses this a bit on page 143 of her report. There is some sense in which FLOPs
and knowing-what-you’re-doing trade off against each other. If you have literally no idea
what you’re doing, you can sort of kind of re-run evolution until it comes up with
something that looks good. If things are somehow even worse than that, you could always
run AIXI, a hypothetical AI design guaranteed to get excellent results as long as you have
infinite computation. You could run a Go engine by searching the entire branching tree
structure of Go - you shouldn’t, and it would take a zillion times more compute than exists
in the entire world, but you could. So in some sense what you’re doing, when you’re figuring
out what you’re doing, is coming up with ways to do already-possible things more
efficiently. But that’s just algorithmic progress, which Ajeya has already baked into her
model.

(our Victorian scientist: “As a reductio ad absurdum, you could always stand the ship on its
end, and then climb up it to reach space. We’re just trying to make ships that are more
efficient than that.”)

Part II: Biology-Inspired AI Timelines: The Trick That Never Works

Eliezer Yudkowsky presents a more subtle version of these kinds of objections in an essay
called Biology-Inspired AI Timelines: The Trick That Never Works, published December
2021.

Ajeya’s report is a 169-page collection of equations, graphs, and modeling assumptions.


Yudkowsky’s rebuttal is a fictional dialogue between himself, younger versions of himself,
famous AI scientists, and other bit players. At one point, a character called “Humbali”
shows up begging Yudkowsky to be more humble, and Yudkowsky defeats him with
devastating counterarguments. Still, he did found the field, so I guess everyone has to listen
to him.

He starts: in 1988, famous AI scientist Hans Moravec predicted human-level AI by 2010. He was using the same methodology as Ajeya: extrapolate how quickly processing power
would grow (in FLOP/S), and see when it would match some estimate of the human brain.
Moravec got the processing power almost exactly right (it hit his 2010 projection in 2008)
and his human brain estimate pretty close (he says 10^13 FLOP/S, Ajeya says 10^15, this 2
OOM difference only delays things a few years), yet there was not human-level AI in 2010.
What happened?

Ajeya's answer could be: Moravec didn't realize that, in the modern ML paradigm, any
given size of program requires a much bigger program to train. Ajeya, who has a 35-year
advantage on Moravec, estimates approximately the same power for the finished program
(10^16 vs. 10^13 FLOP/S) but says that training the 10^16 FLOP/S program will require
10^33ish FLOPs.

Eliezer agrees as far as it goes, but says this points to a much deeper failure mode, which
was that Moravec had no idea what he was doing. He was assuming processing power of
human brain = processing power of computer necessary for AGI. Why?

The human brain consumes around 20 watts of power. Can we thereby conclude that an AGI
should consume around 20 watts of power, and that, when technology advances to the point of
being able to supply around 20 watts of power to computers, we'll get AGI? […]

You say that AIs consume energy in a very different way from brains? Well, they'll also consume
computations in a very different way from brains! The only difference between these two cases is
that you know something about how humans eat food and break it down in their stomachs and
convert it into ATP that gets consumed by neurons to pump ions back out of dendrites and
axons, while computer chips consume electricity whose flow gets interrupted by transistors to
transmit information. Since you know anything whatsoever about how AGIs and humans
consume energy, you can see that the consumption is so vastly different as to obviate all
comparisons entirely.

You are ignorant of how the brain consumes computation, you are ignorant of how the first AGIs
built would consume computation, but "an unknown key does not open an unknown lock" and
these two ignorant distributions should not assert much internal correlation between them.

Cars don’t move by contracting their leg muscles and planes don’t fly by flapping their
wings like birds. Telescopes do form images the same way as the lenses in our eyes, but
differ by so many orders of magnitude in every important way that they defy comparison.
Why should AI be different? You have to use some specific algorithm when you’re creating
AI; why should we expect it to be anywhere near the same efficiency as the ones Nature
uses in our brains?

The same is true for arguments from evolution, eg Ajeya’s Evolutionary Anchor, ie “it took
evolution 10^41 FLOPs of computation to evolve the human brain so maybe that will be the
training cost”. AI scientists sitting in labs trying to figure things out, and nematodes
getting eaten by other nematodes, are such different methods for designing things that it’s
crazy to use one as an estimate for the other.

Algorithmic Progress vs. Algorithmic Paradigm Shifts


This post is a dialogue, so (Eliezer’s hypothetical model of) OpenPhil gets a chance to
respond. They object: this is why we put a term for algorithmic progress in our model. The
model isn’t very sensitive to changes in that term. If you want you can set it to some kind of
crazy high value and see what happens, but you can’t say we didn’t consider it.

OpenPhil: We did already consider that and try to take it into account: our model
already includes a parameter for how algorithmic progress reduces hardware
requirements. It's not easy to graph as exactly as Moore's Law, as you say, but our best-
guess estimate is that compute costs halve every 2-3 years […]

Eliezer: The makers of AGI aren't going to be doing 10,000,000,000,000 rounds of gradient descent, on entire brain-sized 300,000,000,000,000-parameter models,
algorithmically faster than today. They're going to get to AGI via some route that you don't
know how to take, at least if it happens in 2040. If it happens in 2025, it may be via a route
that some modern researchers do know how to take, but in this case, of course, your
model was also wrong.

They're not going to be taking your default-imagined approach algorithmically faster, they're going to be taking an algorithmically different approach that eats computing power
in a different way than you imagine it being consumed.

OpenPhil: Shouldn't that just be folded into our estimate of how the computation
required to accomplish a fixed task decreases by half every 2-3 years due to better
algorithms?

Eliezer: Backtesting this viewpoint on the previous history of computer science, it seems to me to assert that it should be possible to:

Train a pre-Transformer RNN/CNN-based model, not using any other techniques invented after 2017, to GPT-2 levels of performance, using only around 2x as much compute as GPT-2;

Play pro-level Go using 8-16 times as much computing power as AlphaGo, but only
2006 levels of technology.

For reference, recall that in 2006, Hinton and Salakhutdinov were just starting to publish
that, by training multiple layers of Restricted Boltzmann machines and then unrolling
them into a "deep" neural network, you could get an initialization for the network
weights that would avoid the problem of vanishing and exploding gradients and
activations. At least so long as you didn't try to stack too many layers, like a dozen layers
or something ridiculous like that. This being the point that kicked off the entire deep-
learning revolution.

Your model apparently suggests that we have gotten around 50 times more efficient at
turning computation into intelligence since that time; so, we should be able to replicate
any modern feat of deep learning performed in 2021, using techniques from before deep
learning and around fifty times as much computing power.

OpenPhil: No, that's totally not what our viewpoint says when you backfit it to past
reality. Our model does a great job of retrodicting past reality.

Eliezer: How so?

OpenPhil: <Eliezer cannot predict what they will say here.>

I think the argument here is that OpenPhil is accounting for normal scientific progress in
algorithms, but not for paradigm shifts.

Directional Error
These are the two arguments Eliezer makes against OpenPhil that I find most persuasive.
First, that you shouldn’t be using biological anchors at all. Second, that unpredictable
paradigm shifts are more realistic than gradual algorithmic progress.

These mostly add uncertainty to OpenPhil’s model, but Eliezer ends his essay making a
stronger argument: he thinks OpenPhil is directionally wrong, and AI will come earlier
than they think.

Mostly this is the paradigm argument again. Five years from now, there could be a
paradigm shift that makes AI much easier to build. It’s happened before; from GOFAI’s
pre-programmed logical rules to Deep Blue’s tree searches to the sorts of Big Data methods
that won the Netflix Prize to modern deep learning. Instead of just extrapolating deep
learning scaling thirty years out, OpenPhil should be worried about the next big idea.

Hypothetical OpenPhil retorts that this is a double-edged sword. Maybe the deep learning
paradigm can’t produce AGI, and we’ll have to wait decades or centuries for someone to
have the right insight. Or maybe the new paradigm you need for AGI will take more
compute than deep learning, in the same way deep learning takes more compute than
whatever Moravec was imagining.

This is a pretty strong response, since it would have been true for every previous forecaster:
remember, Moravec erred in thinking AI would come too soon, not too late. So although
Eliezer is taking the cheap shot of saying OpenPhil’s estimate will be wrong just as
everyone else’s was wrong before, he’s also giving himself the much harder case of arguing
it might be wrong in the opposite direction as all its predecessors.

Eliezer takes this objection seriously, but feels like on balance probably new paradigms will
speed up AI rather than slow it down. Here he grudgingly and with suitable embarrassment
does try to make an object-level semi-biological-anchors-related argument: Moravec was
wrong because he ignored the training phase. And the proper anchor for the training phase
is somewhere between evolution and a human childhood, where evolution represents
“blind chance eventually finding good things” and human childhood represents “an
intelligent cognitive engine trying to squeeze as much data out of experience as possible”.
And part of what he expects paradigm shifts to do is to move from more evolutionary
processes to more childhood-like processes, and that’s a net gain in efficiency. So he still
thinks OpenPhil’s methods are more likely to overestimate the amount of time until AGI
rather than underestimate it.

What Moore’s Law Giveth, Platt’s Law Taketh Away


Eliezer’s other argument is kind of a low blow: he refers to Platt’s Law Of AI Forecasting:
“any AI forecast will put strong AI thirty years out from when the forecast is made.”
This isn’t exact. Hans Moravec, writing in 1988, said 2010 - so 22 years. Ray Kurzweil,
writing in 2001, said 2023 - another 22 years. Vernor Vinge, in a 1993 speech, said 2023, and
that was exactly 30 years, but Vinge knew about Platt’s Law and might have been joking.

The point is: OpenPhil wrote a report in 2020 that predicted strong AI in 2052, isn’t that
kind of suspicious?

I’d previously mentioned it as a plus that Ajeya got around the same year everyone else got.
The forecasters on Metaculus. The experts surveyed in Grace et al. Lots of other smart
experts with clever models. But what if all of these experts and models and analyses are just
fudging the numbers for the same Platt’s-Law-related reasons?

Hypothetical OpenPhil is BTFO:

OpenPhil: That part about Charles Platt's generalization is interesting, but just because
we unwittingly chose literally exactly the median that Platt predicted people would
always choose in consistent error, that doesn't justify dismissing our work, right? We
could have used a completely valid method of estimation which would have pointed to
2050 no matter which year it was tried in, and, by sheer coincidence, have first written
that up in 2020. In fact, we try to show in the report that the same methodology,
evaluated in earlier years, would also have pointed to around 2050 -

Eliezer: Look, people keep trying this. It's never worked. It's never going to work. 2
years before the end of the world, there'll be another published biologically inspired
estimate showing that AGI is 30 years away and it will be exactly as informative then as
it is now. I'd love to know the timelines too, but you're not going to get the answer you
want until right before the end of the world, and maybe not even then unless you're
paying very close attention. Timing this stuff is just plain hard.

Part III: Responses And Commentary


Response 1: Less Wrong Comments

Less Wrong is a site founded by Eliezer Yudkowsky for Eliezer Yudkowsky fans who
wanted to discuss Eliezer Yudkowsky’s ideas. So, for whatever it’s worth - the comments on
his essay were pretty negative.

Carl Shulman, an independent researcher with links to both OpenPhil and MIRI (Eliezer’s
org), writes the top-voted comment. He works from a model where there is hardware
progress, software progress downstream of hardware progress, and independent (ie
unrelated to algorithms) software progress, and where the first two make up most progress
on the margin. Researchers generally develop new paradigms once they have enough
compute available to tinker with them.

Progress in AI has largely been a function of increasing compute, human software research efforts, and serial time/steps. Throwing more compute at researchers has
improved performance both directly and indirectly (e.g. by enabling more experiments,
refining evaluation functions in chess, training neural networks, or making algorithms
that work best with large compute more attractive).

Historically compute has grown by many orders of magnitude, while human labor
applied to AI and supporting software by only a few. And on plausible decompositions
of progress (allowing for adjustment of software to current hardware and vice versa),
hardware growth accounts for more of the progress over time than human labor input
growth.

So if you're going to use an AI production function for tech forecasting based on inputs
(which do relatively OK by the standards tech forecasting), it's best to use all of compute,
labor, and time, but it makes sense for compute to have pride of place and take in more
modeling effort and attention, since it's the biggest source of change (particularly when
including software gains downstream of hardware technology and expenditures). […]

A perfectly correlated time series of compute and labor would not let us say which had
the larger marginal contribution, but we have resources to get at that, which I was
referring to with 'plausible decompositions.' This includes experiments with old and
new software and hardware, like the chess ones Paul recently commissioned, and studies
by AI Impacts, OpenAI, and Neil Thompson. There are AI scaling experiments, and
observations of the results of shocks like the end of Dennard scaling, the availability of
GPGPU computing, and Besiroglu's data on the relative predictive power of computer
and labor in individual papers and subfields.

In different ways those tend to put hardware as driving more log improvement than
software (with both contributing), particularly if we consider software innovations
downstream of hardware changes.

Vanessa Kosoy makes the obvious objection, which echoes a comment of Eliezer’s in the
dialogue above:

I'm confused how can this pass some obvious tests. For example, do you claim that
alpha-beta pruning can match AlphaGo given some not-crazy advantage in compute? Do
you claim that SVMs can do SOTA image classification with not-crazy advantage in
compute (or with any amount of compute with the same training data)? Can Eliza-style
chatbots compete with GPT3 however we scale them up?

Mark Xu answers:

My model is something like:

For any given algorithm, e.g. SVMs, AlphaGo, alpha-beta pruning, convnets, etc.,
there is an "effective compute regime" where dumping more compute makes them
better. If you go above this regime, you get steep diminishing marginal returns.

In the (relatively small) regimes of old algorithms, new algorithms and old
algorithms perform similarly. E.g. with small amounts of compute, using AlphaGo
instead of alpha-beta pruning doesn't get you that much better performance than
like an OOM of compute (I have no idea if this is true, example is more because it
conveys the general gist).

One of the main way that modern algorithms are better is that they have much large
effective compute regimes. The other main way is enabling more effective
conversion of compute to performance.

Therefore, one of primary impact of new algorithms is to enable performance to continue scaling with compute the same way it did when you had smaller amounts.

In this model, it makes sense to think of the "contribution" of new algorithms as the
factor they enable more efficient conversion of compute to performance and count the
increased performance because the new algorithms can absorb more compute as
primarily hardware progress. I think the studies that Carl cites above are decent
evidence that the multiplicative factor of compute -> performance conversion you get
from new algorithms is smaller than the historical growth in compute, so it further
makes sense to claim that most progress came from compute, even though the
algorithms were what "unlocked" the compute.

For an example of something I consider supports this model, see the LSTM versus
transformer graphs in https://arxiv.org/pdf/2001.08361.pdf

I also found Vanessa’s summary of this reply helpful:

Hmm... Interesting. So, this model says that algorithmic innovation is so fast that it is
not much of a bottleneck: we always manage to find the best algorithm for given
compute relatively quickly after this compute becomes available. Moreover, there is
some smooth relation between compute and performance assuming the best algorithm
for this level of compute. [EDIT: The latter part seems really suspicious though, why
would this relation persist across very different algorithms?] Or at least this is true is
"best algorithm" is interpreted to mean "best algorithm out of some wide class of
algorithms s.t. we never or almost never managed to discover any algorithm outside of
this class".

This can justify biological anchors as upper bounds[1]: if biology is operating using the
best algorithm then we will match its performance when we reach the same level of
compute, whereas if biology is operating using a suboptimal algorithm then we will
match its performance earlier.

Charlie Steiner objects:

Which examples are you thinking of? Modern Stockfish outperformed historical chess
engines even when using the same resources, until far enough in the past that computers
didn't have enough RAM to load it.

I definitely agree with your original-comment points about the general informativeness
of hardware, and absolutely software is adapting to fit our current hardware. But this can
all be true even if advances in software can make more than 20 orders of magnitude
difference in what hardware is needed for AGI, and are much less predictable than
advances in hardware rather than being adaptations in lockstep with it.

And Paul Christiano responds:

Here are the graphs from Hippke (he or I should publish summary at some point, sorry).
I wanted to compare Fritz (which won WCCC in 1995) to a modern engine to understand
the effects of hardware and software performance. I think the time controls for that
tournament are similar to SF STC I think. I wanted to compare to SF8 rather than one of
the NNUE engines to isolate out the effect of compute at development time and just
look at test-time compute.

So having modern algorithms would have let you win WCCC while spending about 50x
less on compute than the winner. Having modern computer hardware would have let you
win WCCC spending way more than 1000x less on compute than the winner. Measured
this way software progress seems to be several times less important than hardware
progress despite much faster scale-up of investment in software.

But instead of asking "how well does hardware/software progress help you get to 1995
performance?" you could ask "how well does hardware/software progress get you to 2015
performance?" and on that metric it looks like software progress is way more important
because you basically just can't scale old algorithms up to modern performance.

The relevant measure varies depending on what you are asking. But from the perspective
of takeoff speeds, it seems to me like one very salient takeaway is: if one chess project
had literally come back in time with 20 years of chess progress, it would have allowed
them to spend 50x less on compute than the leader.
Response 2: AI Impacts + Matthew Barnett
AI Impacts gathered and analyzed a dataset of who predicted AI when; Matthew Barnett
helpfully drew in the line corresponding to Platt’s Law (everyone always predicts AI in
thirty years).

Just eyeballing it, Platt’s Law looks pretty good. But Holden Karnofsky (see below) objects
that our eyeballs are covertly removing outliers. Barnett agrees this is worth checking for
and runs a formal OLS regression.

Platt’s Law in blue, regression line in orange.


He writes:

I agree this trendline doesn't look great for Platt's law, and backs up your observation by
predicting that Bio Anchors should be more than 30 years out.

However, OLS is notoriously sensitive to outliers. If instead of using some more robust
regression algorithm, we instead super arbitrarily eliminated all predictions after 2100,
then we get this, which doesn't look absolutely horrible for the law. Note that the
median forecast is 25 years out.

I’m split on what to think here. If we consider a weaker version of Platt’s Law, “the average
date at which people forecast AGI moves forward at about one year per year”, this seems
truish in the big picture where we compare 1960 to today, but not obviously true after 1980.
If we consider a different weaker version, “on average estimates tend to be 30 years away”,
that’s true-ish under Barnett’s revised model, but not inherently damning since Barnett’s
assuming there will be some such number, it turns out to be 25, and Ajeya gave the
somewhat different number of 32. Is that a big enough difference to exonerate her of
“using” Platt’s Law? Is that even the right way to be thinking about this question?

Response 3: Real OpenPhil


The hypothetical OpenPhil in Eliezer’s mind having been utterly vanquished, the real-
world OpenPhil is forced to step in. OpenPhil CEO Holden Karnofsky responds to Eliezer
here.

There’s a lot of back and forth about whether the report includes enough caveats (answer: it
sure does include a lot of caveats!) but I was most interested in the attacks on Eliezer’s two
main points.
First, the point that biological anchors are fatally flawed from the start and measuring
FLOP/S is no better than measuring power consumption in watts. Holden:

If the world were such that:

We had some reasonable framework for "power usage" that didn't include
gratuitously wasted power, and measured the "power used meaningfully to do
computations" in some important sense;

AI performance seemed to systematically improve as this sort of power usage increased;

Power usage was just now coming within a few orders of magnitude of the human
brain;
We were just now starting to see AIs have success with tasks like vision and speech
recognition (tasks that seem likely to have been evolutionarily important, and that
we haven't found ways to precisely describe GOFAI-style);

It also looked like AI was starting to have insect-like capabilities somewhere around
the time it was consuming insect-level amounts of power;

And we didn't have some clear candidate for a better metric with similar properties
(as I think we do in the case of computations, since the main thing I'd expect
increased power usage to be useful for is increased computation);

...Then I would be interested in a Bio Anchors-style analysis of projected power usage.


As noted above, I would be interested in this as a tool for analysis rather than as "the
way to get my probability distribution." That's also how I'm interested in Bio Anchors
(and how it presents itself).

Second, the argument that paradigm shifts might speed up AI:

I think it's a distinct possibility that we're going to see dramatically new approaches to
AI development by the time transformative AI is developed.

On the other hand, I think quotes like this overstate the likelihood in the short-to-medium
term.

Deep learning has been the dominant source of AI breakthroughs for nearly the last
decade, and the broader "neural networks" paradigm - while it has come in and out
of fashion - has broadly been one of the most-attended-to "contenders" throughout
the history of AI research.
AI research prior to 2012 may have had more frequent "paradigm shifts," but this is
probably related to the fact that it was seeing less progress.

With these two points in mind, it seems off to me to confidently expect a new
paradigm to be dominant by 2040 (even conditional on AGI being developed), as the
second quote above implies. As for the first quote, I think the implication there is
less clear, but I read it as expecting AGI to involve software well over 100x as
efficient as the human brain, and I wouldn't bet on that either (in real life, if AGI is
developed in the coming decades - not based on what's possible in principle.)

Response 4: Me
Oh God, I have to write some kind of conclusion to this post, in some way that suggests I
have an opinion, or that I’m at all qualified to assess this kind of research. Oh God oh God.

I find myself most influenced by two things. First, Paul’s table of how effectively Nature
tends to outperform humans, which I’ll paste here again:

I find it hard to say how this influenced me. It would be great if Paul had found some sort of
beautiful Moore’s-Law-esque rule for figuring out the Nature vs. humans advantage. But
actually his estimates span five orders of magnitude. And they don’t even make sense as
stable estimates - human solar power a few decades ago was several orders of magnitude
worse than Nature’s, and a few decades from now it may be better.

Still, I think this table helps the whole thing feel less mystical. Usually Nature outperforms
humans by some finite amount, usually a few orders of magnitude, on the dimension we
care about. We can add it to the error bars on our model and move on.
The second thing that influences me a lot is Carl Shulman’s model of “once the compute is
ready, the paradigm will appear”. Some other commenters visualize this as each paradigm
having a certain amount of compute you can “feed” it before it stops scaling with compute
effectively. This is a heck of a graph:

Given these two assumptions - that natural artifacts usually have efficiencies within a few
OOM of artificial ones, and that compute drives progress pretty reliably - I am proud to be
able to give Ajeya’s report the coveted honor of “I do not make an update of literally zero
upon reading it”.

That still leaves the question of “how much of an update do I make?” Also “what are we
even doing here?”

That is - suppose before we read Ajeya’s report, we started with some distribution over
when we’d get AGI. For me, not being an expert in this area, this would be some
combination of the Metaculus forecast and the Grace et al expert survey, slightly pushed
various directions by the views of individual smart people I trust. Now Ajeya says maybe
it’s more like some other distribution. I should end up with a distribution somewhere in
between my prior and this new evidence. But where?
I . . . don’t actually care? I think Metaculus says 2040-something, Grace says 2060-
something, and Ajeya says 2050-something, so this is basically just the average thing I
already believed. Probably each of those distributions has some kind of complicated shape,
but who actually manages to keep the shape of their probability distribution in their head
while reasoning? Not me.

This report was insufficiently different from what I already believed for me to need to
worry about updating from one to the other. The more interesting question, then, is
whether I should update towards Eliezer’s slightly different distribution, which places more
probability mass on earlier decades.

But Eliezer doesn’t say what his exact probability distribution is, and he does say he’s
making a deliberate choice not to do this:

I consider naming particular years to be a cognitively harmful sort of activity; I have refrained from trying to translate my brain's native intuitions about this into
probabilities, for fear that my verbalized probabilities will be stupider than my intuitions
if I try to put weight on them. What feelings I do have, I worry may be unwise to voice;
AGI timelines, in my own experience, are not great for one's mental health, and I worry
that other people seem to have weaker immune systems than even my own. But I
suppose I cannot but acknowledge that my outward behavior seems to reveal a
distribution whose median seems to fall well before 2050.

So, should I update from my current distribution towards a black box with “EARLY”
scrawled on it?

What would change if I did? I’d get scared? I’m already scared. I’d get even more scared?
Seems bad.

Maybe I’d have different opinions on whether we should pursue long-term AI alignment
research programs that will pay off after 30 years, vs. short-term AI alignment research
programs that will pay off in 5? If you have either of those things, please email anyone whose
name has been mentioned in this blog post, and they’ll arrange to have a 6-to-7-digit sum of money
thrown at you immediately. It’s not like there’s some vast set of promising 30-year research
programs and some other set of promising 5-year research programs that have to be triaged
against each other. Maybe there’s some ability to redirect a little bit of talent and interest at
the margin, in a way that makes it worth OpenPhil’s time to care. But should I care? Should
you?

One of my favorite jokes continues to be:


An astronomy professor says that the sun will explode in five billion years, and sees a
student visibly freaking out. She asks the student what’s so scary about the sun exploding
in five billion years. The student sighs with relief: “Oh, thank God! I thought you’d said
five million years!”

And once again, you can imagine the opposite joke: A professor says the sun will explode in
five minutes, sees a student visibly freaking out, and repeats her claim. The student, visibly
relieved: “Oh, thank God! I thought you’d said five seconds.”

Here Ajeya is the professor saying the sun will explode in five minutes instead of five
seconds. Compared to the alternative, it’s good news. But if it makes you feel complacent,
then the joke’s on you.


Discussion

Gunflint Feb 23 · edited Feb 23


> Oh, thank God! I thought you’d said five million years!”
That one has always tickled me too.
I thought of it when a debate raged here about saving humanity by colonizing other star
systems. I’d mentioned the ‘No reversing entropy’ thing and the response was: “We’re just
talking about the next billion years!”
Reply Give gift
Matthew Barnett Writes Matthew Barnett’s Blog · Feb 23
> Bartlett agrees this is worth checking for and runs a formal OLS regression.
Minor error, but I'm Barnett.
Reply Give gift
Gwern Branwen Writes Gwern.net Newsletter · Feb 23
Another minor error: I believe Carl Shulman is not 'independent' but still employed by
FHI (albeit living in the Bay Area and collaborating heavily with OP etc).
Reply
Habryka Feb 25
Also pretty sure Carl is no longer living in the Bay Area, but in Reno instead (to the
Bay Area's great loss)
Reply
Algon33 Feb 23
Another minor error: Transformative AI is "AI that precipitates a transition comparable to
(or more significant than) the agricultural or industrial revolution".
This is easy to fix Scott, and about as long as your original description + its witty reply.
Reply Give gift
Scott Alexander Feb 23 Author
Sorry, fixed.
Reply
Melvin Feb 23
Sure, but what does Bartlett think?
Reply Give gift
Purplish Feb 23
Another minor error: quoting on Mark Xu's list
Reply Give gift
DavesNotHere Feb 23
That last graph may be a heck of a graph, but I have no idea what it depicts. Could we have a
link to the source or an explanation, please?
Reply
Dan L Feb 23
Without explicitly confirming at the source, it appears to be a graph of chess program
performance per computational power, for multiple models over time.
The Y-axis is chess performance measured using the Elo system, which is a way of
ranking performers by a relative standard. Beginner humans are <1000, a serious
enthusiast might be >1500, grandmaster is ~2500, and Magnus Carlsen peaked at 2882.
The X-axis is how much compute ("thinking time") each model was allowed per move.
This has to be normalized to a specific model for comparisons to be meaningful (SF13-
NNUE here) and I'm just going to trust it was done properly, but it looks ok.
The multiple lines are each model's performance at a given level of compute. There are
three key takeaways here: 1) chess engines are getting more effective over time even when
allowed the same level of compute, 2) each model's performance tends to "level out" at
some level of allocated resources, and 3) a lot of the improved performance of new
models comes from being able to usefully utilize additional resources.
That's a big deal, because if compute keeps getting cheaper but the algorithms can't
really leverage it, you haven't done much. But if ML folks look at the resources thrown at
GPT-3 and say "the curve isn't bending!" it could be a sign that we can still get
meaningful performance increases from moar power.
Reply
DavesNotHere Feb 23
many thanks!
Reply
tgb Feb 24
Scott seems to take from this graph that it supports the "algorithms have a range of
compute where they're useful" thesis. But I see it as opposing that.
First, the most modern algorithms are doing much better than the older ones *at
low compute regimes* so the idea that we nearly immediately discover the best
algorithms for a given compute regime once we're there appears to be false - at
least we didn't manage to do that back in 1995.
Second, regimes where increased computation gives a benefit to these algorithms
seem pretty stable. It's just that newer algorithms are across-the-board better. I
guess it's hard to compare a 100 ELO increase at 2000 ELO to a 100 ELO increase
at 3000 ELO, but I don't really see any evidence in the plot that newer algorithms
*scale* better with more compute. If anything, it's that they scale better at low
compute regimes, which lends itself more to a Yudkowskian conclusion.
Am I misinterpreting this?
Reply
Bolton Feb 25
I agree with you. If it were really the case that "once the compute is ready, the
paradigm will appear", I would expect to see all of the curves on this graph
intersect each other, with each engine having a small window for which it
dominates the ELO roughly corresponding to the power of computers at the
time it was made.
Reply
Manaria Writes Manaria’s Newsletter · Feb 27
I'd expect that the curves for, say, image recognition tasks, *would*
intersect, particularly if the training compute is factored in.
But the important part this graph shows is: the difference between
algorithms isn't as large as the difference between compute (although the
relative nature of ELO makes this less obvious).
Reply Give gift
Thor Odinson Mar 1
I think those algorithms have training baked in, so a modern trained net does
really well even with low compute (factor of 1000 from hardware X software),
but the limit on how good an algo you could train was a lot lower in the past
(factor of 50 from software alone)
Reply Give gift
Dustin Feb 24
> But if ML folks look at the resources thrown at GPT-3 and say "the curve isn't
bending!" it could be a sign that we can still get meaningful performance increases
from moar power.
I don't follow the space closely, but I think this is exactly what ML folks are saying
about GPT-3.
Reply
Dan L Feb 24
Basically a Gwern quote IIRC, but I wouldn't hold him responsible for my half-
rememberings!
Reply
JDK Feb 23
It seems easier to just have children.
Reply Give gift
CounterBlunder Feb 23
This made me laugh
Reply
JDK Feb 24 · edited Feb 24
If you think about it long enough it should.
When we say we want AIs, what we are really saying is that we want an AI that is better
than humans, not just an AI. But there are geniuses being born every day.
But what we really want is to understand consciousness and to solve particular
problems faster than we can at the moment.
We wanted to fly like the birds, but we did not really invent an artificial bird. We
wanted to work as hard as a horse, but did not invent an artificial horse.
The question of consciousness is a legitimate and important question.
Reply Give gift
Carl Pham Feb 24
I think this is an important point. Doing basic research in AI as a way to
understand NI makes enormous sense: we understand almost nothing about
how our mind works, and if we understood much more we could (one hopes)
make enormous strides in education, sociology, functional political institutions,
the treatment of mental illness, and the improvement of life for people with
mental disabilities (through trauma, birth, or age). We could also optimize the
experience and contributions of people who are unusually intelligent, and
maybe figure out how to boost our own intelligence, via training or genetic
manipulation. Exceedingly valuable stuff.
But as a technological end goal, an actual deployed mass-manufactured tool,
it seems highly dubious. There are only three cases to consider:
(1) We can build a general AI that is like us, but much dumber. Why bother?
(There are of course many roles for special-purpose AIs that can do certain
tasks way better than we can, but don't have our general-purpose thinking
abilities.)
(2) We can build a general AI that is like us, and about as smart. Also seems
mostly pointless, unless we can do it far cheaper than we can make new
people, and unless it is so psychologically different it doesn't mind being a
slave.
(3) We can build a general AI that is much smarter than us. This seems a priori
unlikely, in the sense that if we understood intelligence sufficiently well to do
this, why not just increase our own intelligence first? Got to be easier, since all
you need to do is tweak the DNA appropriately. And even if we could build one,
why would we want to either enslave a hyperintelligent being or become its
slaves, or pets? Even a bad guy wouldn't do that, since a decent working
definition of "bad guy" is "antisocial who doesn't want to recognize any
authority" and building a superintelligent machine to whom to submit is rather
the opposite of being a pirate/gangster boss/Evil Overlord.
I realize plenty of people believe there is case (2b) we can build an AI that is
about as smart as us, and then *it* can rebuild itself (or build another AI) that
is way smarter than us, but I don't believe in this bootstrapping theory at all, for
the same reason I find (3) dubious a priori. The idea that you can build a very
complex machine without any good idea of how it works seems silly.
Reply Give gift
Ghillie Dhu Feb 24
>The idea that you can build a very complex machine without any good
idea of how it works seems silly.
But that's essentially what ML does. If there was a good idea of how a
solution to a given problem works, it would be implemented via traditional
software development instead.
Reply
Carl Pham Feb 24 · edited Feb 24
I disagree. I understand very well what a ML program does. I may not
have all the details at my fingertips, but that is just as meaningless as
the fact that I don't know where each molecule goes when gasoline
combusts with oxygen. Sure, there's a lot of weird ricochets and
nanometer-scale fluctuations that go on about which I might not
know, absent enormous time and wonderful microscopes -- but
saying I don't know the details is nowhere near saying I don't know
what's going on. I know in principle.
Same with ML. I may not know what this or that node weight is, and
to figure out why it is what it is, i.e. trace it back to some pattern in
the training data, would take enormous time and painstaking not to
say painful attention to itsy bitsy detail, but that is a long way from
saying I don't know what it's doing. I do in principle.
I'll add this dichotomy has existed in other areas of science and
technology for much longer, and it doesn't bother us. Why does a
particular chemical reaction happen in the pathway it does, exactly?
We can calculate that from first principles, with a big enough
computer to solve a staggeringly huge quantum chemistry problem.
But if you wanted to trace back this wiggle in the preferred trajectory
to some complex web of electromagnetic forces between electrons,
it would take enormous time and devotion to detail. So we don't
bother, because this detail isn't very important. We understand the
principles by which quantum mechanics determines the reaction
path, and we can build a machine that finds that path by doing
trillions of calculations which we do not care to follow, and maybe the
path is not what raw intuition suggests (which is why we do the
calculation at all, usually), but at no point here do we say we do not
*understand* why the Schroedinger Equation is causing this H atom
to move this way instead of that. I don't really see why we would
attribute some greater level of mystic magic to a neural network
pattern-recognition algorithm.
Reply Give gift
Ghillie Dhu Feb 25
>...but that is a long way from saying I don't know what it's
doing. I do in principle.
Knowing in principle seems like a much lower bar than having a
good idea how something works.
>I don't really see why we would attribute some greater level of
mystic magic to a neural network pattern-recognition algorithm.
Intelligence is an emergent phenomenon (cf., evolution
producing hominid minds), so what magic do you see being
attributed beyond knowledge of how to build increasingly
complex pattern-recognition algorithms?
Reply
Carl Pham Feb 25
Gosh, no it's not. What makes you say that? Knowing how
something works in principle is pretty much all there is to
knowing how something works. The details are far less
important. That's why we give high credit to Einstein for
inventing relativity, and we do *not* hand out Nobel Prizes
to engineers who work out how relativity affects the GPS
system. Einstein did not foresee GPS, but had he been told
about it, he could have easily worked out what relativity had
to say about it -- because he understands the principle,
which is the important stuff.
Nobody knows whether intelligence is an emergent
phenomenon. The evidence so far is actually that it is *not*,
because no other species seems to exhibit it, despite
having brains that aren't super duper less complex than
ours. That would suggest there is something extremely
unusual about the human brain, some weird accident of
evolution. If it were the case that increasing neuron count
*always* led to intelligence, we would see a far smoother
spectrum of intelligence among animals species, and not a
sudden discontinuous jump between humans and all other
animals.
The problem with your last suggestion is that you are not
distinguishing between imagining the parameters of the
pattern -- which takes genuine original intelligence -- and…
Reply Give gift
Ghillie Dhu Feb 25
I said "seems" to indicate that there's a semantic gap
between what you're saying and how I initially read it;
I'm not trying to argue that there is a difference
relevant to your position.
There *were* other intelligent species: homo sapiens
simply outcompeted our hominid cousins leaving a
large extant gap between us and, e.g., other primates,
cephalopods, cetaceans, corvids, et al.
Intelligence as an emergent phenomenon is a broader
claim than that it is a direct function of some measure
of brain complexity; human evolution is an object proof
that intelligence *can* emerge from a process that
does not already possess it.
>What you need to imagine is a program that could
*define* the pattern in the first place. That would be
intelligent. How would it be done?
Topology optimization. Hyperparameter tuning. There
are available commercial products to define patterns
whose parameters are set during a later training
phase. It's analogous to evolution selecting for the
basic structure of brains which are predisposed to
acquire language.
>The underlying problem is that a universal pattern
recognition model would necessarily have to have
infinite parameters...
Only if you require the model to have infinite precision.
Deep neural nets are already capable of
*approximating* arbitrary functions due to sufficient
overparameterization.
Reply
Robert McIntyre Feb 28 · edited Feb 28
"Knowing how something works in principle is pretty
much all there is to knowing how something works."
You can't be serious about this? By this logic we
already pretty much know all there is to know about
human intelligence, because we understand
biochemistry in principle, and human brains are made
out of biochemistry!
Reply
Matthew Carlin Feb 25
That's not what ML does. ELI5, ML is about as well understood as the
visual cortex, it's built like a visual cortex, and it solves visual cortex
style problems.
People act like just because each ML model is too large and messy to
explain, all of ML is a black box. It's not. Each model of most model
classes (deep learning, CNN, RNN, gbdt, whatever you want) is just a
layered or otherwise structured series of simple pattern recognizers,
each recognizer is allowed to float towards whatever "works" for the
problem at hand, and all the recognizers are allowed to influence
each other in a mathematically stable (ie convergent) format.
End result of which is you get something that works like a visual
cortex: it has no agency and precious little capacity for transfer
learning, but has climbed the hill to solve that one problem really
well.
This is a very well understood space. It's just poorly explained to the
general public.
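A minimal sketch of that picture, if it helps: a tiny stack of "simple pattern recognizers" in PyTorch, trained on fake data. Purely illustrative, not any particular production model.

# A tiny feed-forward net as a stack of simple pattern recognizers.
# Fake data and invented sizes; illustrative only.
import torch
import torch.nn as nn

model = nn.Sequential(           # layers of simple units:
    nn.Linear(28 * 28, 128),     # each one a weighted sum...
    nn.ReLU(),                   # ...plus a simple nonlinearity,
    nn.Linear(128, 10),          # stacked so later layers build on earlier ones.
)

x = torch.randn(32, 28 * 28)     # a fake batch of "images"
y = torch.randint(0, 10, (32,))  # fake labels

opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):
    opt.zero_grad()
    loss = loss_fn(model(x), y)  # how badly the recognizers currently "work"
    loss.backward()              # nudge every weight toward whatever works,
    opt.step()                   # repeated until it (usually) converges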
Reply
Ghillie Dhu Feb 25
My initial objection to Carl was based on a difference of opinion
about what constitutes a "good idea of how it works". You
appear to share his less-restrictive understanding of the phrase.
N.B., I am a working data scientist who was hand coding CV
convolutions two decades ago.
Reply
Matthew Carlin Feb 26
Fair, apologies. I am one too.
I share his sense that there is an isolated demand for rigor
here. One pours one's chemicals together with a general
understanding of the principles at play and which knobs to
adjust, but without a detailed understanding of specific
molecular interactions, turbulent flows, non homogeneities,
etc, and this is considered fine.
I build my models with a general understanding of the
principles at play and which knobs to adjust, but without a
detailed understanding of specific sub components and
how they've influenced each other, and I also consider this
fine.
It may help that my professional interest for a few years
was model explainability and issue diagnosis. We can't see
everything, but we can peer in about as well as we can peer
into anything in the natural world.
Reply
Ghillie Dhu Feb 26
Thank you. FWIW, I'm not expecting rigor here; the
juxtaposition of "silly" with what I *thought* Carl
meant by "good idea" is what I was reacting to.
Reply
Egg Syntax Feb 24
> This seems a priori unlikely, in the sense that if we understood
intelligence sufficiently well to do this, why not just increase our own
intelligence first? Got to be easier, since all you need to do is tweak the
DNA appropriately.
I think this is mistaken. For reasons that Scott has talked about elsewhere,
the fact that we aren't *already* smarter suggests that we're near a local
optimum for our physiology / brain architecture / etc, or evolution would
have made it happen; eg it may be that a simple tweak to increase our
intelligence would result in too much mental illness. Finding ways to tweak
humans to be significantly smarter without unacceptable tradeoffs may
be extremely difficult for that reason.
On the other hand, I see no a priori reason that that local optimum is likely
to be globally optimal. So conditional on building GAI at all, I see no
particular reason to expect a specific barrier to increasing past human-
level intelligence.
Reply
Carl Pham Feb 24
Oh I wouldn't disagree that it's likely to be hard to increase human
intelligence. Whether what we mean by "intelligence" -- usually,
purposeful conscious reasoning and imagination -- has been
optimized by Nature is an interesting and unsolved question,
inasmuch as we don't know whether that kind of intelligence is
always a survival advantage. There are also some fairly trivial reasons
why Nature may not have done as much as can be done, e.g. the
necessity for having your head fit through a vagina during birth.
But yeah I'd take a guess that it would be very hard. I only said that
hard as it is, building a brand-spanking new type of intelligence, a
whole new paradigm, is likely to be much harder.
Anyway, if we take a step back, the idea that improving the
performance of an engine that now exists is a priori less likely than
inventing a whole new type of engine is logically incoherent.
Reply Give gift
Donald Feb 24
"if we understood intelligence sufficiently well to do this, why not just
increase our own intelligence first?"
Because the change is trivial in computer code, but hard in DNA.
For example, maybe a neural structure in 4d space works really well. We
can simulate that on a computer, but good luck with the GM.
Maybe people do both, but the human takes 15-20 years to grow up,
whereas the AI "just" takes billions of dollars and a few months.
Because we invented an algorithm that is nothing at all like a human mind,
and works well.
Reply Give gift
Carl Pham Feb 24
That would be convincing if anyone had ever written a computer
code that had even the tiniest bit of awareness or original thought,
no matter how slow, halting, or restricted in its field of competence. I
would say that the idea that a computer can be programmed *at all*
to have original thought (or awareness) is sheer speculation, based
on a loose analogy between what a computer does and what a brain
does, and fueled dangerously by a lot of metaphorical thinking and
animism (the same kind that causes humans to invent local
conscious-thinking gods to explain why it rains when it does, or
eclipses, or why my car keys are always missing when I'm in a hurry).
Reply Give gift
Donald Feb 25
Deep blue can produce chess moves that are good, and aren't
copies of moves humans made. GPT3 can come up with new
and semi-sensible text.
Can you give a clear procedure for measuring "Original
thought"?
Before deep blue, people were arguing that computers couldn't
play chess because it required too much "creative decision
making" or whatever.
I think you are using "Original thought" as a label for anything
that computers can't do yet.
You have a long list of things humans can do. When you see a
simple dumb algorithm that can play chess, you realize chess
doesn't require original thought, just following a simpleish
program very fast. Then GPT3 writes kind of ok poetry and you
realize that writing ok poetry (given lots of examples) doesn't
require original thought.
I think there is a simplish program for everything humans do, we
just haven't found it yet. I think you think there is some magic
original thought stuff that only humans have, and also a long list
of tasks like chess, go, image recognition etc that we can do
with the right algorithm.
Reply Give gift
Mr. Doolittle Feb 25
As a workaround to creativity, we have found a way to teach
computers to emulate creative thought through massive
repetition. The reason that training an AI takes so much
more computational power than to run an AI is that
workaround to creativity. We show it an overwhelming
number of scenarios, so that the AI can create incredibly
thorough plots of all the potential solutions to problems
within its area. So it can memorize every theoretical chess
move and correlate it with percentages of winning with
every other theoretical chess move and then always make
the best choice. That's not creativity, that's the chess
equivalent to entering every possible password to brute
force the combination.
It's a fuzzy area because we struggle to define what
"creative" means and similarly struggle to define just about
any of the terms we use to describe human consciousness
and intelligence. That said, we know what GPT3 is doing is
not the same as a human being creative. Humans can be
creative with far less training input and maybe no (domain
relevant) training input whatsoever. Current AI paintings are
unique in the sense that no human ever painted that exact
image, but it's obvious from both a computer technical
perspective as well as a review of the art itself that it's a
type of copy.
Reply Give gift
Carl Pham Feb 25 · edited Feb 25
(1) A coin-flipping machine could also produce chess
moves that are good, and aren't copies of moves humans
have made. All you need is a hell of lot of coin flips and a
metric for evaluating whether the result is good or not. This
isn't original thought, it's an enormous random number
generator plus a sieve. What makes the human ability to
play chess impressive is not that we can do it at all, but that
we can do it *without* doing the enormous number of
brute-force computations Deep Blue does.
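(A toy version of that "coin flips plus a sieve", for concreteness: sample some legal moves at random and keep whichever one a deliberately crude material-count metric likes best. This assumes the python-chess library and is nothing like how Deep Blue actually worked; it just illustrates the generate-and-filter idea.)

# Random move generation plus a crude sieve, using the python-chess library.
import random
import chess

PIECE_VALUES = {chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
                chess.ROOK: 5, chess.QUEEN: 9, chess.KING: 0}

def material(board, color):
    # Mover's material minus the opponent's: the "metric for evaluating".
    score = 0
    for piece in board.piece_map().values():
        value = PIECE_VALUES[piece.piece_type]
        score += value if piece.color == color else -value
    return score

def sieve_move(board, samples=20):
    # The "enormous random number generator plus a sieve", in miniature.
    best_move, best_score = None, float("-inf")
    candidates = random.sample(list(board.legal_moves),
                               min(samples, board.legal_moves.count()))
    for move in candidates:
        board.push(move)
        score = material(board, not board.turn)  # evaluate for the side that moved
        board.pop()
        if score > best_score:
            best_move, best_score = move, score
    return best_move

print(sieve_move(chess.Board()))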
(2) Sure. For a start, original thought is inferential, not
deductive. It infers *new* rules from data, like inventing
quantum mechanics to describe spectroscopic data, or
inventing the pentatonic scale, or for that matter music
itself.
Following rules someone else programs for you is ipso
facto not original. That's why a car automatic transmission
isn't exercising "original thought" when it shifts from 2nd to
3rd under specified conditions of engine vacuum, speed,
acceleration, et cetera -- *even if* no human driver would
have made that shift at that particular moment, and even if
it's a better point to shift than where a person would. This
isn't invention, it's just variations on a theme the potential…
Reply Give gift
Carl Pham Feb 25
Incidentally, if you observe "gee, given what you said,
*human beings* don't exhibit much original thought,
most of the time. Some probably never do it at all!" -- I
would entirely agree. What makes the human being
special is not that he exhibits original thought all the
time, or even most of the time -- most of the time we
*do* act a lot like pre-programmed simple robots --
but that we can do it at all. We are all, apparently, even
the simplest and stupidest of us, capable of original
thought at least once in a while. How and why that is
true is a big mystery -- one I certainly have no clue
about. If research in AI can shed some light on that,
that would be fabulous. But so far, it has not. Nothing
AI has ever produced to date has given the slightest
glimmer of understanding of actual originality, except
insofar as it has allowed us to more sharply separate
non-originality from originality. There's always hope
for the future, though. Maybe someday AI research
*will* demonstrate how originality occurs. That would
be great.
Reply Give gift
Robert McIntyre Feb 28
How would you know when an AI has successfully
produced an original thought? And does
explaining the architecture of the AI influence
whether you consider the thought was "original"?
Are there certain explanations of how people
think that would convince you that WE aren't
capable of original thoughts?
Reply
Matthew Carlin Feb 25
"Because the change is trivial in computer code, but hard in DNA."
In any large software shop which relies on ML to solve "rubber hits
the road" problems, not toy problems, it takes literally dozens of
highly paid full time staff to keep the given ML from falling over on its
head every *week* as the staff either build new models or coddle old
ones in an attempt to keep pace with ever changing reality.
And the work is voodoo, full of essentially bad software practices and
contentious statistical arguments and unstable code changes.
Large scale success with ML is about as far from "the change is
trivial in computer code" as it is possible to be in the field of
computer science.
Reply
Mr. Doolittle Feb 24
I thought about this specifically when reading that we could spend quadrillions of dollars
to create a supercomputer capable of making a single human level AI.
Reply Give gift
magic9mushroom Feb 25 · edited Feb 25
To be fair, once made that AI could be run on many different computers (which
would each be far less expensive), whereas we don't have a copy-paste function for
people.
Reply
David Piepgrass Feb 25 · edited Feb 25
But more importantly, that way of thinking is wrong and I predict humanity is
about to reduce per-model training budgets at the high end. Though wealthy
groups' budgets will jump temporarily whenever they suspect they might have
invented AGI, or something with commercialization potential.
Reply
magic9mushroom Feb 26
By "reduce per-model training budgets", do you mean "reduce how much
we're willing to spend" or "reduce how much we need to spend"?
Reply
David Piepgrass Feb 26 · edited Feb 26
I mean that a typical wealthy AI group will reduce the total amount it
actually spends on models costing over ~$500,000 each, unless
they suspect they might have invented AGI, or something with
commercialization potential, and even in those cases they probably
won't spend much more than before on a single model (but if they
do, I'm pretty sure they won't get a superintelligent AGI out of it).
(edit: raised threshold 100K=>500K. also, I guess the superjumbo
model fad might have a year or two left in it, but I bet it'll blow over
soon)
Reply
david roberts Writes Let Me Challenge Your Thinking · Feb 23
The math and science are very difficult for me. So, I'm glad you are there to interpret it from a
super layperson's perspective!
Could you point me to WHY AI scares you? I assume you've written about your fears.
Or should I remain blissfully ignorant?
Reply
Maxwell E Feb 23
He has written about this before on his previous blog, but even more helpfully
summarized the general concerns here
https://www.lesswrong.com/posts/LTtNXM9shNM9AC2mp/superintelligence-faq
Consider especially parts 3.1.2 thru 4.2
Reply
Scott Alexander Feb 23 Author
This is pretty out of date, but I guess it will do until/unless I write up something else.
Reply
david roberts Writes Let Me Challenge Your Thinking · Feb 23
Thanks!
Reply
magic9mushroom Feb 24 · edited Feb 24
I obviously cannot speak to why AI scares Scott, but there are some theoretical and
practical reasons to consider superhuman AI a highly-scary thing should it come into
existence.
Theoretical:
Many natural dangers that threaten humans do not threaten humanity, because
humanity is widely dispersed and highly adaptive. Yellowstone going off or another
Chicxulub impactor striking the Earth would be bad, but these are not serious X-risks
because humanity inhabits six continents (protecting us from local effects), has last-
resort bunkers in many places (enabling resilience against temporary effects) and can
adapt its plans (e.g. farming with crops bred for colder/warmer climates).
These measures don't work, however, against other intelligent creatures; there is no
foolproof plan to defeat an opponent with similar-or-greater intelligence and similar-or-
greater resources. For the last hundred thousand years or so, this category has been
empty save for other humans and as such humanity's survival has not been threatened
(the Nazis were an existential threat to Jews, but they were not an existential threat to
humanity because they themselves were human). AGI, however, is by definition an
intelligent agent that is not human, which makes human extinction plausible (other
"force majeure" X-risks include alien attack and divine intervention).
Additionally, many X-risks can be empirically determined to be incredibly unlikely by
examining history and prehistory. An impact of the scale of that which created Luna
would still be enough to kill off humanity, but we can observe that these don't happen
often and there is no particular reason for the chance to increase right now. This one
even applies to alien attack and divine intervention, since presumably these entities…
Reply
david roberts Writes Let Me Challenge Your Thinking · Feb 24
Very helpful to my understanding why AI is a unique threat. Thanks for this. You
explain it very well. Although now when I see video clips of kids in robot
competitions, my admiration will be tinged with a touch of foreboding.
Reply
Gruffydd Writes Gruffydd’s Newsletter · Feb 27
Don't be tinged by that foreboding. If you read a bit about superintelligence it
becomes clear that it's not going to come from any vector that's typically
imagined (terminator or black mirror style robots).
There are plenty of ideas of more realistic ways an AGI escapes confinement
and gains access to the real world. A couple of interesting ones I read were it
solving the protein folding problem, paying or blackmailing someone over the
internet to mix the necessary chemicals, and creating nanomachines capable
of anything. Another was tricking a computer scientist with a perfect woman
on a VR headset.
In fact it probably won't be any of these things, after all, it's a super
intelligence: whatever it creates to pursue its goals will be so beyond our
understanding that it's meaningless to predict what it will do other than as a bit
of fun or creative writing exercise.
Let me know if you want links to those stories/ideas, I should have them
somewhere. Superintelligence by Nick Bostrom is a good read, although quite
heavy. I prefer Scott's stuff haha.
Reply Give gift
magic9mushroom Mar 1
The hypothetical "rogue superintelligent AGI with no resources is out to
kill everyone, what does it do" might not be likely to go that way, but
that's hardly the only possibility for "AI causes problems". Remote-control
killer robots are already a thing (and quite an effective thing), militaries
have large budgets, and plugging an AI into a swarm of killbots does seem
like an obvious way to improve their coordination. PERIMETR/Dead Hand
was also an actual thing for a while.
Reply
John Schilling Mar 2
The "killbots" can't load their own ordnance or even fill their own fuel
tanks, which is going to put a limit on their capabilities.
Reply
arbitrario Mar 17
> solving the protein folding problem, paying or blackmailing someone
over the intenet to mix the necessary chemicals, and it creates
nanomachines capable of anything
Arguably the assumption that "nanomachines capable of anything" can
even exist is a big one. After all, in the Smalley-Drexler debate Smalley
was fundamentally right, and Drexlerian nanotech is not really compatible
with known physics and chemistry.
Reply Give gift
Matthew Carlin Feb 25 · edited Feb 25
Offering the opposite take: https://idlewords.com/talks/superintelligence.htm
(Note this essay is extremely unpopular around these parts, but also, fortunately,
rationalists are kind enough to let it be linked!)
Reply
magic9mushroom Feb 27
1) I mean, yes, people get annoyed when you explain in as many words that you are
strawmanning them in order to make people ignore them.
2) There are really two factions to the AI alarmists (NB: I don't intend negative
connotations there, I just mean "people who are alarmed and want others to be
alarmed") - the ones who want to "get there first and do it right" and the ones who
want to shut down the whole field by force. You have something of a case against
the former but haven't really devoted any time to the latter.
Reply
Hyperion Feb 23
Generally I think that the paradigm shifts argument is convincing, and so all this business of
trying to estimate when we will have a certain number of FLOPS available is a bit like trying to
estimate when fusion will become widely available by trying to estimate when we will have the
technology to manufacture the magnets at scale.
However, I disagree with Eliezer that this implies shorter timelines than you get from raw
FLOPS calculations - I think it implies longer ones, so would be happy to call the Cotra
report's estimate a lower bound.
Reply
Mike Feb 23
>she says that DeepMind’s Starcraft engine has about as much inferential compute as a
honeybee and seems about equally subjectively impressive. I have no idea what this means.
Impressive at what? Winning multiplayer online games? Stinging people?
Swarming
Reply Give gift
dyoshida Feb 24
Building hives
Reply
Scott Alexander Feb 24 Author
You people are all great.
Reply
Matthew Carlin Feb 25
It plays Zerg well and Terran for shit.
Protoss, you say? Everyone knows Protoss in SC2 just go air.
Reply
Daniel Kokotajlo Feb 23 · edited Feb 23
Yes, you should care. The difference between 50% by 2030 and 50% by 2050 matters to
most people, I think. In a lot of little ways. (And for some people in some big ways.)
For those trying to avert catastrophe, money isn't scarce, but researcher
time/attention/priorities is. Even in my own special niche there are way too many projects to
do and not enough time. I have to choose what to work on and credences about timelines
make a difference. (Partly directly, and partly indirectly by influencing credences about
takeoff speeds, what AI paradigm is likely to be the relevant one to try to align, etc.)
EDIT: Example of a "little" way: If my timelines went back up to 30 years, I'd have another
child. If they had been at 10 years three years ago, I would currently be childless.
Reply
Scott Alexander Feb 23 Author
Why does your child-having depend on your timelines? I'm considering a similar
question now and was figuring that if bringing a child into the world is good, it will be
half as good if the kid lives 5 years as if they live 10, but at no point does it become bad.
This would be different if I thought I had an important role in aligning AI that having a
child would distract me from; maybe that's our crux?
Reply
Ben Pace Writes LessWrong · Feb 23
I myself am pro bringing in another person to fight the good fight. If it were me
being brought in I would find it an honor, rather than damning. My crux is simply
that I am too busy to rear more humans myself.
Reply
Daniel Kokotajlo Feb 23
FWIW I totally agree
Reply
Some Guy Writes Extelligence · Feb 23
Psst… kids are awesome (for whatever points a random Internet guy adds to your
metrics)
Reply Give gift
Daniel Kokotajlo Feb 23
I'm not sure it is rational / was rational. I probably shouldn't have mentioned it.
Probably an objective, third-party analysis would either conclude that I should have
kids in both cases or in neither case.
However the crux you mention is roughly right. The way I thought of it at the time
was: If we have 30 years left then not only will they have a "full" life in some sense,
but they may even be able to contribute to helping the world, and the amount of my
time they'd take up would be relatively less (and the benefits to my own fulfillment
and so forth in the long run might even compensate) and also the probability of the
world being OK is higher and there will be more total work making it be OK and so
my lost productivity will matter much less...
Reply
Matthew Carlin Feb 25
(Apologies if this is a painful topic. I'm a parent and genuinely curious about
your thinking)
Would you put a probability on their likelihood of survival in 2050? (ie, are you
truly operating from the standpoint that your children have a 40 or 50 percent
chance of dying from GAI around 2050?)
Reply
Daniel Kokotajlo Feb 25
Yes, something like that. If I had Ajeya's timelines I wouldn't say "around
2050" I would say "by 2050." Instead I say 2030-ish. There are a few
other quibbles I'd make as well but you get the gist.
Reply
Matthew Carlin Feb 26
Thanks for answering.
Reply
Daniel Kirmani Feb 23
> money isn't scarce, but researcher time/attention/priorities is.
I don't get the "MIRI isn't bottlenecked by money" perspective. Isn't there a well-
established way to turn money into smart-person-hours by paying smart people very
high salaries to do stuff?
Reply
Daniel Kokotajlo Feb 23 · edited Feb 23
My limited understanding is: It works in some domains but not others. If you have an
easy-to-measure metric, you can pay people to make the metric go up, and this
takes very little of your time. However, if what you care about is hard to measure /
takes lots of time for you to measure (you have to read their report and fact-check
it, for example, and listen to their arguments for why it matters) then it takes up a
substantial amount of your time, and that's if they are just contractors who you
don't owe anything more than the minimum to.
I think another part of it is that people just aren't that motivated by money,
amazingly. Consider: If the prospect of getting paid a six-figure salary to solve
technical alignment problems worked to motivate lots of smart people to solve
technical alignment problems... why hasn't that happened already? Why don't we
get lots of applicants from people being like 'Yeah I don't really care about this stuff
I think it's all sci-fi but check out this proof I just built, it extends MIRI's work on
logical inductors in a way they'll find useful, gimme a job pls." I haven't heard of
anything like that ever happening. (I mean, I guess the more realistic case of this is
someone who deep down doesn't really care but on the exterior says they do. This
does happen sometimes in my experience. But not very much, not yet, and also the
kind of work these kind of people produce tends to be pretty mediocre.)
Another part of it might be that the usefulness of research (and also manager/CEO
stuff?) is heavy-tailed. The best people are 100x more productive than the 95th
percentile people who are 10x more productive than the 90th percentile people who
are 10x more productive than the 85th percentile people who are 10x more
productive than the 80th percentile people who are infinitely more productive than
the 75th percentile people who are infinitely more productive than the 70th
percentile people who are worse than useless. Or something like that.
Anyhow it's a mystery to me too and I'd like to learn more about it. The
phenomenon is definitely real but I don't really understand the underlying causes.
Reply
Melvin Feb 23
> Consider: If the prospect of getting paid a six-figure salary to solve technical
alignment problems worked to motivate lots of smart people to solve technical
alignment problems... why hasn't that happened already?
I mean, does MIRI have loads of open, well-paid research positions? This is the
first I'm hearing of it. Why doesn't MIRI have an army of recruiters trolling
LinkedIn every day for AI/ML talent the way that Facebook and Amazon do?
Looking at MIRI's website it doesn't look like they're trying very hard to hire
people. It explicitly says "we're doing less hiring than in recent years". Clicking
through to one of the two available job ads (
https://intelligence.org/careers/research-fellow/ ) it has a section entitled "Our
recommended path to becoming a MIRI research fellow" which seems to imply
that the only way to get considered for a MIRI research fellow position is to
hang around doing a lot of MIRI-type stuff for free before even being
considered.
None of this sounds like the activities of an organisation that has a massive
pile of funding that it's desperate to turn into useful research.
Reply Give gift
Daniel Kokotajlo Feb 23 · edited Feb 23
I can assure you that MIRI has a massive pile of funding and is desperate
for more useful research. (Maybe you don't believe me? Maybe you think
they are just being irrational, and should totally do the obvious thing of
recruiting on LinkedIn? I'm told OpenPhil actually tried something like that
a few years ago and the experiment was a failure. I don't know but I'd
guess that MIRI has tried similar things. IIRC they paid high-caliber
academics in relevant fields to engage with them at one point.)
Again, it's a mystery to me why it is, but I'm pretty sure that it is.
Some more evidence that it's true:
--Tiny startups beating giant entrenched corporations should NEVER
happen if this phenomenon isn't real. Giant entrenched corporations have
way more money and are willing to throw it around to improve their tech.
Sure maybe any particular corporation might be incompetent/irrational,
but it's implausible that all the major corporations in the world would be
irrational/incompetent at the same time so that a tiny startup could beat
them all.
--Similar things can be said about e.g. failed attempts by various
governments to make various cities the "new silicon valley" etc.
Maybe part of the story is that research topics/questions are heavy-
tailed-distributed in importance. One good paper on a very important
question is more valuable than ten great papers on a moderately
important question.
Reply
Melvin Feb 23
> I can assure you that MIRI has a massive pile of funding and is
desperate for more useful research. (Maybe you don't believe me?
Maybe you think they are just being irrational
Maybe they're not being irrational, they're just bad at recruiting.
That's fine, that's what professional recruiters are for. They should
hire some.
If MIRI wants more applicants for its research fellow positions it's
going to have to do better than
https://intelligence.org/careers/research-fellow/ because that seems
less like a genuine job ad and more like an attempt to get naive
young fanboys to work for free in the hopes of maybe one day
landing a job.
Why on Earth would an organisation that is serious about recruitment
tell people "Before applying for a fellowship, you’ll need to have
attended at least one research workshop"? You're competing for the
kind of people who can easily walk into a $500K+ job at any FAANG,
why are you making them jump through hoops?
Reply Give gift
Robert Mushkatblat Feb 24 · edited Feb 24
MIRI doesn't want people who can walk into a FAANG job, they
want people who can conduct pre-paradigmatic research.
"Math PhD student or postdoc" would be a more accurate
desired background than "FAANG software engineer" (or even
"FAANG ML engineer"), but still doesn't capture the fact that
most math PhDs don't quite fit the bill either.
If you think professional recruiters, who can't reliably distinguish
good from bad among the much more commoditized "FAANG
software engineer" profile, will be able to find promising
candidates for conducting novel AI research - well, I don't want
to say it's impossible. But the problem is doing that in a way that
isn't _enormously costly_ for people already in the field; there's
no point in hiring recruiters if you're going to spend more time
filtering out bad candidates than if you'd just gone looking
yourself (or not even bothered and let high-intent candidates
find you).
Reply
megaleaf Feb 24
> MIRI doesn't want people who can walk into a FAANG job
But if they really were rolling in money, wouldn't it be worth
trying various parallel initiatives, including hiring a team of
FAANG ML engineers and letting them try to get up to
speed? Just to see if any of them reach a point where they
can make a useful contribution?
If the objection was that overseeing all this would steal too
much mental bandwidth from the key existing researchers,
my response would be, don't oversee it then. Absent any
better options, I hereby volunteer to set up a non-profit
called the 'Institute for Parallel Initiatives In AI Safety'
(IPIAIS). Then, MIRI/FHI/CHAI et al can give me any spare
amounts of money they wish and I promise to make a good
faith effort to use it as wisely as possible in the pursuit of AI
safety.
Reply
Matthew Carlin Feb 26
"FAANG ML engineer" has really high overlap with "Math
PhD student or postdoc".
I agree about not fitting the bill. MIRI wants people to do
basic blue sky research in a fundamentally unrewarding
space. It's hard to publish, hard to get results you're proud
of, you're swimming upstream against the problem, and
you're swimming upstream against the culture. It's an odd
duck who wants to do this.
(NB I want to do this, just not for MIRI. The other problem is
people like this tend to have idiosyncratic goals and it's
random and unlikely whether they're MIRI-aligned goals.)
Reply
Eye Beams are cool Feb 24
Holy shit. That's not a job posting. That's instructions for joining
a cult. Or a MLM scam.
Reply Give gift
Matthew Carlin Feb 26
I read the posting and it didn't seem like that to me. But if it
does seem like that to you, that's some evidence of what
MIRI is. fwiw that's definitely how EY initiatives generally
sound to me, and I was genuinely surprised to open the
posting and *not* feel the same thing.
Reply
Juliette Culver Feb 24
I think there is an interesting question about how one moves fields
into this area. I imagine that having people who are intelligent but
with a slightly different outlook would be useful. Being mentored
while you get up to speed and write your first paper or two is
important I think. I'm really not sure how I would move into a paid
position for example without basically doing an unpaid and isolated
job in my spare time for a considerable amount of time first.
Reply Give gift
Froolow Feb 24
For what it is worth, I agree completely with Melvin on this point - the
job advert pattern matches to a scam job offer to me and certainly
does not pattern match to any sort of job I would seriously consider
taking. Apologies to be blunt, but you write "it's a mystery to me why
it is", so I'm trying to offer an outside perspective that might be
helpful.
It is not normal to have job candidates attend a workshop before
applying for a job in prestigious roles, but it is very normal to have
candidates attend a 'workshop' before pitching them an MLM or
timeshare. It is even more concerning that details about these
workshops are pretty thin on the ground. Do candidates pay to
attend? If so this pattern matches advanced fee scams. Even if they
don't pay to attend, do they pay flights and airfare? If so MIRI have
effectively managed to limit their hire pool to people who live within
commuting distance of their offices or people who are going to work
for them anyway and don't care about the cost.
Furthermore, there's absolutely no indication how I might go about
attending one of these workshops - I spent about ten minutes trying
to google details (which is ten minutes longer than I have to spend to
find a complete list of all ML engineering roles at Google / Facebook),
and the best I could find was a list of historic workshops (last one in
2018) and a button saying I should contact MIRI to get in touch if I
wanted to attend one. Obviously I can't hold the pandemic against
MIRI not holding in-person meetups (although does this mean they…
Reply Give gift
Essex Feb 24 · edited Feb 24
I believe the reason they aren't selecting people is simply that MIRI is
run by deeply neurotic people who cannot actually accept any
answer as good enough, and thus are sitting on large piles of money
they insist they want to give people only to refuse them in all cases.
Once you have done your free demonstration work, you are simply
told that, sorry, you didn't turn out to be smarter than every other
human being to ever live by a minimum of two orders of magnitude
and thus aren't qualified for the position.
Perhaps they should get into eugenics and try breeding for the
Kwisatz Haderach.
Reply
Mr. Doolittle Feb 25
Although your take is deeply uncharitable, I think the basis of
your critique is true and stems from a different problem. Nobody
knows how to create a human level intelligence, so how could
you create safety measures based on how such an intelligence
would work? They don't know. So they need to hire people to
help them figure that out, which makes sense. But since they
don't know, even at an introductory level, they cannot actually
evaluate the qualifications of applicants. Hiring a search firm
would result in the search firm telling MIRI that MIRI doesn't
know what it needs. You'd have to hire a firm that knows what
MIRI needs, probably by understanding AI better than they do, in
order to help MIRI get what it needs. Because that defeats the
purpose of MIRI, they spin their wheels and struggle to hire
people.
Reply Give gift
Essex Feb 25 · edited Feb 25
I disagree that I'm being uncharitable here, given that
Yudkowsky himself has admitted to rejecting close to 100%
of all proposals RE: AI risk prima facie, admits he doesn't
even know what an answer that would satisfy him would
begin to look like, and says that even if he DID get this
miracle answer that'd totally satisfy him he'd still see it as
basically a total crapshoot. Given this, the fact that AFAICT
most computer scientists don't think AGI is even a
probable risk, much less just over the horizon and
Yudkowsky's well-documented history of highly risk-
averse behavior (remember when he banned discussion
of Roko's Basilisk because he thought thinking and talking
about it too hard would essentially actualize it into
existence?), reaching the conclusion "Yudkowsky (and the
people who think strongly like him) are highly neurotic in a
form that makes them incapable of accepting even very
small amounts of risk, leading them to reject anything they
see as 'too risky' out of hand" doesn't seem to me like
some immensely cruel slander on their names.
Reply
Matthew Carlin Feb 26
I think it's underappreciated and *very important* to
realize that Eliezer Yudkowsky's primary objective isn't
"stop malignant AI", it's "stop everyone I love from
dying".
https://www.yudkowsky.net/other/yehuda
See also the theme of HPMOR.
It is very sad. I feel for him. But I think it deeply
undermines any more level headed approach to his AI
risk objective.
Reply
Essex Feb 27 · edited Feb 27
As a Buddhist, I must say that the desire to end
suffering is highly admirable! Sadly, this is why I
feel I must be hard on EY: while he has realized
most of the Noble Truths, he has yet to realize the
most important one.
This world IS a charnel ground. The laws that
govern it make it so, and they cannot be broken
through any path but one which he, as an avowed
atheist and effective antitheist, sneers at. There is
no machine that will banish suffering and make all
people Bodhisattvas, and to believe that one can
be made is delusion more severe and complete than
he thinks those with faith possess. After all, the
faithful believe there are laws beyond the laws of
the charnel ground, while EY does not. And thus,
all I can do is make clear my position and pray
that one day he accepts the hand that Kannon is
extending to him.
Reply
Matthew Carlin Feb 26
That assumes they're so risk-averse they want to get the
hiring problem right on the first try. Which I think is
basically the point of agreement between you and Essex.
Reply
Matthew Carlin Feb 26
They're going to have a problem with the KH-risk people.
Reply
Daniel Kirmani Feb 23 · edited Feb 23
> However, if what you care about is hard to measure / takes lots of time for
you to measure then it takes up a substantial amount of your time.
One solution here would be to ask people to generate a bunch of alignment
research, then randomly sample a small subset of that research and subject it
to costly review, then reward those people in proportion to the quality of the
spot-checked research.
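(The arithmetic there works out cleanly: review a random fraction p of submissions and scale the reward by 1/p, and expected payout still tracks true quality. A toy sketch with invented numbers:)

# Spot-check mechanism, in miniature: review a random fraction p and scale
# rewards by 1/p, so the expected payout equals quality * rate.
# All numbers are invented for illustration.
import random

random.seed(0)
p = 0.1       # fraction of submissions that get the costly review
rate = 1000   # reward per unit of assessed quality

submissions = [random.uniform(0, 1) for _ in range(10_000)]  # true "quality"

payouts = [quality * rate / p if random.random() < p else 0.0
           for quality in submissions]

avg_quality = sum(submissions) / len(submissions)
avg_payout = sum(payouts) / len(payouts)
print(avg_payout, avg_quality * rate)  # approximately equal in expectation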
But that might not even be necessary. Intuitively, I expect that gathering really
talented people and telling them to do stuff related to X isn't that bad of a
mechanism for getting X done. The Manhattan Project springs to mind. Bell
Labs spawned an enormous amount of technical progress by collecting the
best people and letting them do research. I think the hard part is gathering the
best people, not putting them to work.
> If the prospect of getting paid a six-figure salary to solve technical alignment
problems worked to motivate lots of smart people to solve technical alignment
problems... why hasn't that happened already?
Because the really smart and conscientious people are already making six
figures. In private correspondence with a big LessWrong user (>10k karma),
they told me that the programmers they knew that browsed LW were all very
good programmers, and that the _worst_ programmer that they knew that read
LW worked as a software engineer at Microsoft. If we equate "LW readers"
with "people who know about MIRI", then virtually all the programmers who
know about MIRI are already easily clearing six figures. You're right that the
usefulness of researchers is heavy-tailed. If you want that 99.99th percentile
guy, you need to offer him a salary competitive with those of FAANG
companies.
Reply
John Schilling Feb 24
If you equate "people who know about MIRI" with "LW readers", then
maybe put some money and effort into making MIRI more widely known.
Hopefully in a positive way, of course.
Reply
N. N. Writes Good Optics · Feb 24
You probably know more about the details of what has or has not been tried
than I do, but if this is the situation we really should be offering like $10 million
cash prizes no questions asked for research that Eliezer or Paul or whoever
says moves the ball on alignment. I guess some recently announced prizes are
moving us in this direction, but the amount of money should be larger, I think.
We have tons of money, right?
Reply
Xpym Feb 24
They (MIRI in particular) also have a thing about secrecy. Supposedly
much of the potentially useful research not only shouldn't be public, even
hinting that this direction might be fruitful is dangerous if the wrong
people hear about it. It's obviously very easy to interpret this uncharitably
in multiple ways, but they sure seem serious about it, for better or worse
(or indifferent).
Reply Give gift
Melvin Feb 24
This whole thread has convinced me that MIRI is probably the
biggest detriment in the world for AI alignment research, by soaking
up so much of the available funding and using it so terribly.
The world desperately needs a MIRI equivalent that is competently
run. And which absolutely never ever lets Eliezer Yudkowsky
anywhere near it.
Reply Give gift
Greg Billock Feb 24
My take is increasingly that this institution has succeeded in
isolating itself for poorly motivated reasons (what if AI
researchers suspected our ideas about how to build AGI and did
them "too soon"?) and seems pretty explicitly dedicated to
developing thought-control tech compatible with some of the
worst imaginable futures for conscious subjects (think dual use
applications -- if you can control the thoughts of your subject
intelligence with this kind of precision, what else can you
control?).
Reply
Daniel Kokotajlo Feb 24
It hasn't "soaked up so much of the available funding." Other
institutions in this space have much more funding, and in
general are also soaking in cash.
(I disagree with your other claims too of course but don't have
the energy or time to argue.)
Reply
Dirichlet-to-Neumann Feb 24
Give Terence Tao $500,000 to work on AI alignment six months a year,
leaving him free to research crazy Navier-Stokes/Halting problem links the rest
of his time... If money really isn't a problem, this kind of thing should be easy
to do.
Reply Give gift
Daniel Kokotajlo Feb 25
That idea has literally been proposed multiple times that I know of,
and probably many more times years ago, before I was around.
Reply
David Piepgrass Feb 25 · edited Feb 25
> a six-figure salary to solve technical alignment problems
Wait, what? If I knew that I might've signed the f**k up! I don't have experience
in AI, but still! Who's offering six figures?
Reply
Ninety-Three Feb 24
Every time I am confused about MIRI's apparent failures to be an effective research
institution I notice that the "MIRI is a social club for a particular kind of nerd" model
makes accurate predictions.
Reply Give gift
Matthew Carlin Feb 26
You could pay me to solve product search ranking problems, even though I find the
end result distasteful. In fact, if you bought stuff online, maybe you did pay me!
You couldn't pay me to work on alignment. I'm just not aligned. Many people aren't.
Reply
wewest Feb 23
Fighting over made up numbers seems so futile.
But I don't understand this anyway.
Why do the dangers posed by AI need a full/transformative AI to exist? My total layman's
understanding of these fears is that y'all are worried an AI will be capable of interfering with
life to an extent people cannot stop. It's irrelevant if the AI "chooses" to interfere or there's
some programming error, correct? So the question is not, "when will transformative AI exist?"
the question is only, "when will computer bugs be in a position to be catastrophic enough to
kill a bunch of people?" or, "when will programs that can program better than humans be left
in charge of things without proper oversight or with oversight that is incapable of stopping
these programming programs?"
Not that these questions are necessarily easier to predict.
Reply Give gift
Scott Alexander Feb 23 Author
A dumber-than-human level AI that (let's say) runs a power plant and has a bug can
cause the power plant to explode. After that we will fix the power plant, and either debug
the AI or stop using AIs to run power plants.
A smarter-than-human AI that "has a bug" in the sense of being unaligned with human
values can fight our attempts to turn it off and actively work to destroy us in ways we
might not be able to stop.
Reply
wewest Feb 23
But if we are not worried about the bugs in the e.g. global water quality managing
program, then an AI as smart as a human is not such a big deal either. There are
plenty of smart criminals out there who are unaligned with human values and even the
worst haven't managed to wipe out humanity. We need to have an AI smarter than
the whole group of AI police before seriously worrying, so maybe we need to
multiply our made up number by 1,000?
But to illustrate the bug/AI question. Let's imagine Armybot, a strategy planning
simulation program in 2022. And let's say there's a bug and Armybot, which is
hooked up to the nuclear command system for proper simulations, runs a
simulation IRL and lets off all those nukes. That's an extinction level bug that could
happen right now if we were dumb enough.
Now let's imagine Armybot is the same program in 2050 and now it's an AI with the
processing power equivalent to the population of a small country. Now the fear is
Armybot's desire/bug to nuke the world kicks in (idk why it becomes capable of
making independent decisions or having wants just because of more processing
power so I'm more comfortable saying there's a bug). But now it can independently
connect itself to the nuclear command center with its amazing hacking skills (that it
taught itself? that we installed?). That's an extinction level bug too.
So the question is, which bug is more likely?
Reply Give gift
FeepingCreature Feb 24 · edited Feb 24
The general intuition, I believe, is that an AI as smart as a human can quickly
become way way smarter than a human, because humans are really hard to
improve (evolution has done its best to drill a hole through the gene-
performance landscape to where we are, but it's only gotten more stuck over
the aeons) and AI tends to be really easy to improve: just throw more cores at
it.
If you could stick 10 humans of equal intelligence in a room and get the
performance of one human that's 10 IQ points smarter than that, then the
world would look pretty different. Also we can't sign up for more brain on AWS.
Reply Give gift
Melvin Feb 24
My intuition is that "Just throw more cores at it" is no more likely to
improve an AI's intelligence than opening up my skull and chucking in a
bunch more brain tissue.
I think you'd have to throw more cores at it _and then_ go through a
lengthy process of re-training, which would probably cost another
hundred billion dollars of compute time.
Reply Give gift
Bugmaster Feb 24
It's even worse (or better, I guess, depending on your viewpoint) than
that, because cores don't scale linearly; there's a reason why
Amazon has a separate data center in every region, and why your
CPU and GPU are separate units. Actually it's even worse than that,
because even with all those cores, no one knows what "a lengthy
process of re-training" to create an AGI would look like, or whether
it's even possible without some completely unprecedented advances
in computer science.
Reply
FeepingCreature Feb 24 · edited Feb 24
I think we can safely assume that it is going to be vastly easier
than making a smarter human, at least given our political
constraints. (Iterated embryo selection etc.) It doesn't matter
how objectively hard it is, just who has the advantage, and by
how much. Also I think saying we need fundamental advances in
CS to train a larger AI given a smaller AI, misses first the already
existing distillation research, and second assumes that the AGI
was a one in a hundred stroke of good luck that cannot be
reproduced. Which seems unlikely to me.
Reply Give gift
Bugmaster Feb 24
Well, yes, making a dramatically smarter AI would be easier
than making a dramatically smarter human -- but only
because AI is so dumb to begin with. A dramatically
smarter AI that was as smart as a horse would be a major,
monumental achievement, and lots of self-driving car
companies are working on it now... with varying degrees of
success.
And yes, I am indeed assuming that AGI was one in a 4.5
billion years stroke of luck. Do you see anyone else around
who's as smart as humans (who aren't even that smart,
BTW) ?
Reply
Ghillie Dhu Feb 24
>A dramatically smarter AI that was as smart as a
horse would be a major, monumental achievement,
and lots of self-driving car companies are working on
it now... with varying degrees of success.
FWIW, I am much less distraught by the Tesla
Autopilot's occasional unexpected braking or swerving
than most owners seem to be precisely because I
think of it as a horse mind; it's very good at the basics
but you never know when it might get spooked by
something you can't even see.
Reply
Veedrac Feb 24 · edited Feb 24
A hundred billion dollars of compute time for training is a fairly
enlightening number because it's simultaneously an absurd amount
of compute, barely comparable to even the most extravagant training
runs we have today, enough to buy multiple cutting edge fabs and
therefore all of their produced wafers, while also being an absolutely
trivial cost to be willing to pay if you already have AGI and are looking
to improve it to ASI. Heck, we've spent half that much just on our
current misguided moon mission that primarily exists for political
reasons that have nothing to do with trying to go to the moon.
That said, throwing more cores at an AI is by no means necessary,
nor even the most relevant way an AI could self-improve, nor actually
do we even need to first get AGI before self-improvement becomes a
threat. For example, we already have systems that can do pick-and-
place for hardware routing better than humans, we don't need AGI to
do reinforcement learning, and there are ways in which an AI system
could be trained to be more scalable when deployed than humans
have evolved to be.
A fairly intelligent AI system finely enough divided to search over the
whole of the machine learning literature and collaboratively try out
swathes of techniques on a large cluster would not have to be
smarter than a human in each individual piece to be more productive
at fast research than the rest of humanity. Similarly, it's fairly easy to
build AI systems that have an intrinsic ability to understand very high
fidelity information that is hard to convey to humans like AI systems […]
Reply Give gift
Donald Feb 24
There will be 0 or a few AIs given access to nukes. And hopefully only well
tested AI.
If the AI is smart, especially if it's smarter than most humans, and it wants to
take over the world and destroy all humans, it's likely to succeed. If you aren't
stupid, you won't wire a buggy AI to nukes with no safeguards. But if the AI is
smart, it's actively trying to circumvent any safeguard. And whether nukes
already exist is less important. It can trick humans into making bioweapons.
"idk why it becomes capable of making independent decisions or having wants
just because of more processing power so I'm more comfortable saying there's
a bug". Current AI sometimes kind of have wants, like wanting to win at chess,
or at least reliably selecting good chess moves.
We already have robot arms programmed to "want" to pick things up. (Or at
least search for plans to pick things up.) The difference is that currently our
search isn't powerful enough to find plans involving breaking out, taking over
the world and making endless robot arms to pick up everything forever.
Defence against a smart adversary is much much harder than defence against
random bugs.
Reply Give gift
David Piepgrass Feb 25 · edited Feb 25
> an AI as smart as a human
Scott said "smarter-than-human" (perhaps he means "dramatically smarter"),
and I argue downthread that there will never be an AI "as smart as" a human.
Reply
Matthew Carlin Feb 25
I'm unconvinced by AI X-risk in general, but I think I can answer this one: bugs
are random. Intelligences are directed. A bad person is more dangerous than a
bug at similar levels of resources and control.
Reply
Bugmaster Feb 24
No, it can't, because merely being able to compute things faster than a human does
not automatically endow the AI with nigh-magical powers -- and most of the
magical powers attributed to putative superhuman AIs, from verbal mind control to
nanotechnology, would appear to be physically impossible.
Don't get me wrong, a buggy AI could still mess up a lot of power plants; but that's
a quantitative increase in risk, not a qualitative one.
Reply
Sandro Feb 24
An AI doesn't need magical powers to be a huge, even existential threat. It just
needs to be really good at hacking and can use the usual human foibles as
leverage to get nearly anything it wants: money and blackmail.
Reply Give gift
Bugmaster Feb 24
Human hackers do that today all the time, with varying degrees of
success. They are dangerous, yes, but not an existential threat. If you are
proposing that an AI would be able to hack everything everywhere at the
same time, then we're back in the magical powers territory.
Reply
Sandro Feb 25
We're talking about superintelligent AI. Being better than human
hackers is a trivial corollary. Exactly what is magical about that?
Reply Give gift
Bugmaster Feb 25
How much better, exactly ? Is it good enough to hack my Casio
calculator watch ? If so, then it's got magical powers, because
that watch is literally unhackable -- there's nothing in it to hack.
Ok, so maybe not, but is it good enough to gain root access to
every computer on the Internet at the same time while avoiding
detection ? If so, then it has magical powers of infinite
bandwidth, superluminal communication, and whatever else it is
that lets it run its code at zero performance penalty. Ok, so
maybe it's not quite that good, but it's just faster than average
human hackers and better informed about security holes ? Well,
then it's about as good as Chinese or Russian state hackers
already are today.
In other words, you can't just throw the word "superintelligent"
into a sentence as though it was a magic incantation; you still
need to explain what the AI can do, and how it can do it (in
broad strokes).
Reply
Sandro Feb 25 · edited Feb 25
> Ok, so maybe it's not quite that good, but it's just faster
than average human hackers and better informed about
security holes ? Well, then it's about as good as Chinese or
Russian state hackers already are today.
I'm having a hard time understanding what's so difficult to
grasp about this. An AI system that is superintelligent will
have superhuman inference capabilities. That means it will
by necessity be *better than any human hacker*, period,
because it will make fewer mistakes (if any), and it will have
a wider base of knowledge upon which to draw. Even if it
starts out as a shitty hacker, it would become the best
much more quickly than the best human hacker, or it's not
superintelligent.
Furthermore, are you forgetting this AI is a digital program?
That means:
a) it can clone itself as many times as it needs into as many
systems as it needs to. What human hacker army can
nearly instantly scale itself several orders of magnitude?
b) it may even have *better* abilities to interact with digital
systems because that's its "native" environment; for
instance, it might have a much easier time detecting covert
channels on any systems upon which it's running.
c) it doesn't need to compromise every system instantly or […]
Reply Give gift
Bugmaster Feb 25
Ok, so let's imagine a toy scenario, where some
specific machine has exactly one potential security
vulnerability. Can the superhuman AI exploit it faster
than a human ? Well, maybe, but can it do this 1000x
faster ? Infinitely faster ? The answer is "no", because
computers aren't infinitely fast, no matter how
insecure they may be. Thus, the human with a 2-page
piece of rootkit code, and the superintelligent AI, are
both in the exact same boat. They are both down to
"compromising systems slowly via nearly undetectable
backdoors" -- which is exactly what human hackers
and botnets are already doing, today, so it's not a new
problem. And they face the same risk of discovery and
getting the plug pulled on them, should their botnet
prove to be more than an invisible annoyance.
As I said above, you can't just throw in the word
"superintelligent" as a synonym for "nigh-omnipotent".
You say that I am strawmanning this term, but you
yourself have not explained what it means, other than
"far beyound human ability at everything". Which is
fine, I can work with that, but you first need to explain
how far beyound, and why it is even possible. There's
no amount of superintelligence that will allow the AI to
e.g. move a rock from here to the Moon in less than a
second. There are similar physical limitations to
everything it would want to do. Yes, it could do these
things faster or more efficiently than a current average
human... but then, so could humans; technology is
getting better all the time.
Reply
Sandro Feb 26
> Thus, the human with a 2-page piece of rootkit
code, and the superintelligent AI, are both in the
exact same boat.
No they're not. Humans need to sleep, they need
to eat, they can be tracked, and physically confined
and thus neutralized in ways that are well
understood by now (not the case with AI).
Humans can't replicate themselves and
parallelize their agency across multiple
independent systems to which they have access
nearly instantly. And again, we're talking about
superintelligence, not human intelligence.
Honestly, it boggles my mind that you think that
none of these differences are relevant.
> As I said above, you can't just throw in the word
"superintelligent" as a synonym for "nigh-
omnipotent".
Superintelligence is not nigh-omnipotence, it's
just high competence to a level you *have not*
dealt with before, by definition. You keep claiming
that these are all the same problems with the
same properties and similar enough solutions, but
literally nothing about these scenarios is
reasonably comparable to what we've dealt with
before, and you're just hand-waving away these
differences as irrelevant.
To be clear, I'm not saying such AIs *will* have
these properties, but they *potentially could*
because there are no physical limitations that
would prevent it from having the properties I
described. To dismiss these as irrelevant dangers
is foolishness.
Reply Give gift
B Civil Feb 25
Why would an AI want money?
Reply
Sandro Feb 26
Because money is how anyone or anything acquires resources in the
world as it currently exists.
Reply Give gift
Jonathan Paulson Feb 23
These timelines seem to depend crucially on compute getting much cheaper. Computer chip
factories are very expensive, and there are not very many of them. Has anyone considered
trying to make it illegal to make compute much cheaper?
Reply
Scott Alexander Feb 23 Author
Who? You're talking to the small group of researchers and activists who care about this,
with a few tens of billions of dollars. How do they make it illegal to make compute much
cheaper?
Reply
Jonathan Paulson Feb 23
Just offering a concrete policy goal to lobby for. As far as I know, actual policy ideas
here beyond “build influence” are in short supply.
I agree this would be very challenging and probably require convincing some part of
the US and Chinese governments (or maybe just the big chip manufacturers) that
AI risk is worth taking seriously.
Reply
Daniel Kokotajlo Feb 25
Ideas aren't in short supply; clearly good ideas are. You aren't the first person
to propose lobbying to stop compute getting cheaper. What's missing is a
thorough report that analyzes all the pros and cons of ideas like that and
convinces OpenPhil et al that if they do this idea they won't five years later
think "Oh shit actually we should have done the exact opposite, we just made
things even worse."
Reply
David Piepgrass Feb 25 · edited Feb 25
Even clearly good ideas aren't in short supply; *popular* people who can
tell which ideas are good are. So usually when I see (or invent) a good
idea, it is not popular.
Reply
Jonathan Paulson Feb 25
What are some clearly good policy ideas in this space?
Most that I have seen are bad because of the difficulty of
coordinating among all possible teams of people working on AI (on
the other hand, the number of potential chip fabs is much smaller)
Reply
David Piepgrass Feb 25 · edited Feb 25
Sorry, just realized I made a fairly useless comment. I was
making a general observation, not one about this field
specifically. So, don't know.
Reply
Jonathan Paulson Feb 25
OK, I’m glad to hear this idea is already out there. I wasn’t sure if it was. I
agree the appropriate action on it right now is “consider carefully”, not
“lobby hard for it”.
Reply
Algon33 Feb 23
I don't know if someone has discussed your idea in AI governance, but in alignment
there's the concept of a "pivotal act". You train an AI to do some specific task
which drastically changes the expected outcome of AGI. For instance, an AI which
designs nanotech in order to melt all GPUs and destroy GPU plants, after which it shuts
down. Which is vaguely similar to what you suggested. So maybe search for pivotal acts
on the alignment forum to find the right literature.
Reply Give gift
Mr. Doolittle Feb 24
Is this intended to be a failsafe, such that the AGI has a program to destroy
computer creating machinery, but can only do so if it escapes its bounds enough to
gain the ability?
Reply Give gift
Algon33 Feb 24
It is intended to slow down technological progress in AI and make it impossible
for someone else (and you afterwards!) to make an AGI, or anything close to an
AGI. And nothing else. So no first order effect on other tech, politics,
economics, science etc.
This works out better as a failsafe than what you've proposed, since if you're
expecting the AI to escape and have enough power to conduct such an act,
you've lost anyway. Someone else is probably making an AGI as well in that
scenario, or the AI will be able to circumvent the program firing up or so on.
Note that getting the AI to actually just melt GPUs somehow and then shut
down is an unsolved problem. If we knew how to do that right now, the
alignment community would be way more optimistic about our chances.
Reply Give gift
mmirate Feb 23 · edited Feb 24
If you've tried to buy a high-amperage MOSFET, a stepper driver, a Raspberry Pi or a
GPU lately, you would know how easy it is to make compute expensive. Different chips -
or different computers whose CPUs/firmwares don't conform to a BIOS-like standard -
are not necessarily fungible with each other, and the whole chip fab process has a very
long cycle time despite the relatively normal amount of throughput achievable by,
essentially, a very deep pipeline.
(And yes, I too think the whole movement reeks of Luddism.)
Reply
Bugmaster Feb 24
See, this is *exactly* why I'm opposed to the AI-alignment community. Normally I
wouldn't care, people can believe whatever they want, from silicon gods to the old-
fashioned spiritual kind. But beliefs inform actions, and boom -- now you've got people
advocating for halting the technological progress of humanity based on a vague
philosophical ideal.
We've got real problems that we need to solve, right now: climate change, hunger,
poverty, social networks, the list goes on; and we could solve most (arguably, all) of
them by developing new technologies -- unless someone stops us by shoving a wooden
shoe in the gears every time a new machine is built.
Reply
Donald Feb 24
"halting the technological progress of humanity based on a vague philosophical
ideal. "
Does this apply to the biologists deciding not to build bioweapons? Some
technologies are dangerous and better off not built. It can create new problems as
well as solving them. You would need to show that the capability of AI to solve our
problems is bigger than the risk. Maybe AI is dangerous and we would be better off
solving climate change with carbon capture. Solving any food problems with GMO
crops. And just not doing the most dangerous AI work until we can figure out how to
make it safer.
Reply Give gift
Bugmaster Feb 24
You are not talking about the equivalent of deciding not to build bioweapons;
you are talking about the equivalent of stopping biological research in general.
I agree that computing is dangerous, just as biology is dangerous -- and don't
even get me started on internal combustion. But we need all of these
technologies if we are to thrive, and arguably survive, as a species. I'm not
talking about solving global warming with some specific application of AI; I'm
talking about transformative events such as our recent transition into the
Information Age.
Reply
Jonathan Paulson Feb 24
Progress is great! Stopping growth would be a disaster.
That said, it doesn’t seem to me that cheaper computing power is very useful in
solving climate change, poverty, etc. Computers are already really great; what we
need is more energy abundance and mastery over the real physical world.
Consumer CPUs haven’t been getting faster for many years, so it’s not even clear
most computer users are benefiting from Moore’s law these days.
Reply
Matthew Carlin Feb 25
If you don't think more computer power can somehow magically solve those
problems, this is a good first step towards understanding why some people are
unconvinced by AI X-risk.
Reply
David Piepgrass Feb 26 · edited Feb 26
> this is *exactly* why I'm opposed to the AI-alignment community
Jonathan Paulson's comment is surely not representative of the AI-alignment
community.
Reply
Metacelsus Writes De Novo · Feb 23 · edited Feb 23
>human solar power a few decades ago was several orders of magnitude worse than
Nature’s, and a few decades from now it may be several orders of magnitude better.
No, because typical solar panels already capture 15 – 20% of the energy in sunlight (the
record is 47%). There's not another order of magnitude left to improve.
Source, https://en.wikipedia.org/wiki/Solar_cell_efficiency
Nitpicking aside, I wonder how the potential improvement of human intelligence through
biotechnology will affect this timeline. The top AI researcher in 2052 may not have been born
yet.
Reply
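A minimal Python sketch of the headroom arithmetic, using only the efficiency figures quoted in the comment above:

    # Headroom check: a panel already converting 15-20% of incident sunlight
    # can improve by at most ~5-7x before hitting 100%, i.e. less than one
    # order of magnitude; the 47% lab record leaves only ~2x.
    for efficiency in (0.15, 0.20, 0.47):
        max_gain = 1.0 / efficiency
        print(f"{efficiency:.0%} efficient -> at most {max_gain:.1f}x improvement left")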
Brendan Richardson Feb 23
The table also measures solar in terms of "payback period," which has much more room
for improvement.
Reply
Lambert Feb 24
It's also a lot more relevant than efficiency unless you are primarly constrained by
acreage
Reply Give gift
Carl Pham Feb 24
I don't think that's a reasonable metric for solar power. Plants use solar power to drive
chemical reactions -- to make molecules. They're not optimized for generating 24VDC
because DC current isn't useful to a plant. So the true apples-to-apples comparison is
to compare the amount of sunlight a plant needs to synthesize X grams of glucose from
CO2 and water, versus what you can do artificially, e.g. with a solar panel and attached
chemical reactor. By that much more reasonable metric the natural world still outshines
the artificial by many orders of magnitude.
One imagines that if plant life *did* run off a 5VDC bus, then evolution would have
developed some exceedingly efficient natural photovoltaic system. What it would look
like is an interesting question, but I very much doubt it would be made of bulk silicon.
Much more likely is that it would be an array of microscopic machines driven by photon
absorption, which is kind of the way photosynthetic reaction centers work already.
Reply Give gift
Thegnskald Writes Sundry Such and Other · Feb 24
That's not a reasonable metric, either, for exactly the same reason: Solar panels
aren't optimized for generating glucose.
(Also, your metric means efficiency improvements in generating glucose are
efficiency gains for solar power.)
Reply Give gift
Dustin Feb 24
I think this is right.
This leaves us with having to compare across different domains. How do you
quantify the difference between "generates DC power" and "makes
molecules"? I guess you'd have to start talking about the *purpose* of doing
those things. Something like "utility generated for humans" vs "utility
generated for plants"...and that seems really difficult to do.
Reply
Carl Pham Feb 24
No, you would need to measure a combined system, of solar panel plus
chemical plant, as I said. But a plant *is* a combined solar panel plus chemical
plant, and optimized globally, not in each of its parts, so if you want to make a
meaningful comparison, that's what you have to do. Otherwise you're making
the sort of low-value comparison that people do when they say electric cars as
a technology generate zero CO2, forgetting all about the source of the
electricity. It's true but of very limited value in making decisions, or gaining
insight.
In this case, the insight that is missing is that Nature is still a heck of a lot
better at harvesting and using visible photons as an energy source. The fact
that PV panels can do much better at a certain specialized subtask, which is
*all by itself* pointless -- electricity is never an end in itself, it's always a
means to some other end -- isn't very useful.
Reply Give gift
Bogdan Butnaru Mar 21
But you’re still favoring the plant by trying to get technology to simulate
the plant. Yes, it’s an integrated solar panel plus chemical plant, but that’s
completely useless if what you want is a solar panel plus 24V DC output
plug. In that case, the plants lose by infinity orders of magnitude, because
no plant does 24V DC output. You get similar results if what you want is
computing simple arithmetic (a $5 calculator will beat any plant, complete
with solar panel), or moving people between continents. Yes, birds
contain sophisticated chemical reactors and have complex logic for
finding fuel, but they still cannot move people between continents.
If you insist on measuring based on one side’s capabilities, I am the world’s
best actor by orders of magnitude, since I have vast advantages at
convincing my mother I am the son she raised, relative to anyone else.
Reply
Pycea Feb 23
This is a minor point in all this, but it seems weird to estimate the amount of training evolution
has done by the amount of FLOPs each animal has done. Thinking more doesn't seem like it would
increase the fitness of your offspring, at least not in a genetic sense. The only information
evolution gets is how many kids you have (and they have, etc).
Though maybe you could point to this as the reason why the evolution estimate is so much
higher than the others.
Reply
Carl Pham Feb 24
It works if you consider optimization, or solution finding in general, as a giant
undifferentiated sorting problem. I have X bits of raw data, and my solution (or optimum)
is some incredibly rare combination F(X), and what I need to do is sift all the
combinations f(X) until f(X) = F(X). That will give you an order-of-magnitude estimate for
how much work it is to find F given X, even if you don't know the function f.
But in practice that estimate often proves to be absurdly and uselessly large. It's sort of
like saying the square root of 10 has to be between 1 and 10^50. I mean...yeah, sure, it's
a true statement. But not very practically useful.
In the same sense, many problems Nature has solved appear to have been solved in
absurdly low amounts of time, if you take the "number of operations needed" estimate
as some kind of approximate bound. This is the argument often deployed by "Intelligent
Design" people to explain how evolution is impossible, because the search space for
mutations is so unimaginably huge, relative to the set of useful mutations, that evolution
would accomplish pretty much zip over even trillion-year spans. See also the Levinthal
Paradox in protein folding. Or for that matter the eerie fact that human beings who can
compete with computer chess programs at a given level are doing way, way, *way* fewer
computations. Somehow they can "not bother" with 99.9999+% of the computations
the computer does that turn out to be dead ends.
How Nature achieves this is one of the most profound and interesting questions in
evolutionary biology, in the understanding of natural intelligence, and in a number of
areas of physical biochemistry, I would say.
Reply Give gift
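A toy Python sketch of the point above, that a raw "sift all the combinations" bound wildly overstates the work a guided search actually needs; the 64-bit target and the greedy acceptance rule are illustrative assumptions, not anything from the comment:

    import random

    def blind_search_expected_tries(n_bits):
        # Sifting all combinations of an n-bit string blindly: ~2^(n-1) expected tries.
        return 2 ** (n_bits - 1)

    def greedy_search_tries(target):
        # Guided search: flip one random bit and keep the flip only when it does
        # not reduce the number of matching bits. Needs on the order of n*log(n)
        # evaluations rather than 2^n.
        n = len(target)
        current = [random.randint(0, 1) for _ in range(n)]
        tries = 0
        while current != target:
            i = random.randrange(n)
            candidate = current.copy()
            candidate[i] ^= 1
            tries += 1
            if sum(a == b for a, b in zip(candidate, target)) >= sum(
                a == b for a, b in zip(current, target)
            ):
                current = candidate
        return tries

    random.seed(0)
    target = [random.randint(0, 1) for _ in range(64)]
    print(f"blind search bound, 64 bits: ~{blind_search_expected_tries(64):.1e} tries")
    print(f"greedy search, 64 bits: {greedy_search_tries(target)} tries")

The gap, well over ten orders of magnitude even for this tiny problem, is the same flavor of overestimate the comment describes.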
Some Guy Writes Extelligence · Feb 23 · edited Feb 23
Gotta say I don’t generally feel this way (although I always find his stuff to be enlightening
and a learning experience) but I’m pretty well aligned with Eliezer here. I think people figure
out when they’ll start to feel old age and just put AI there then work backwards. I’m greatly
conflicted about AGI as I don’t know how we fix lots of problems without it and it seems like
there’s some clever stuff to do in the space other than brute forcing that I think doesn’t
happen as much… and this is where I’m conflicted, because kinda thankfully it makes people
feel shunned to do wild stuff which slows the whole thing down. Hopefully we arrive at the
place of unheard of social stability and AGI simultaneously. If we built it right now I think it
would be like strapping several jet engines on a Volkswagen bug. For whatever that’s worth,
Some Guy On The Internet feels a certain way.
Reply Give gift
FeepingCreature Feb 24
I personally think AGI in eight years. GPT-3 scares me. It's safe now, but I worry it's "one
weird trick" (probably involving some sort of online short-horizon self-learning) out from
self-awareness.
Reply Give gift
Some Guy Writes Extelligence · Feb 24
It feels weird to be rooting against progress but I hope you’re wrong until we have
some more time to get our act together. To me the control problem is also how we
control ourselves. Without some super flexible government structure to oversee us I
worry what we’ll try to do even if there are good decision makers telling us to stop.
Seems like most minds we could possibly build would be insane/unaligned (that’s
probably me anthropomorphizing) since humans need a lot of fine tuning and don’t
have to be that divergent before we are completely cuckoo for Cocoa Puffs.
Hopefully the first minds are unproductively insane instead of diabolically insane.
Reply Give gift
Bugmaster Feb 24
I am personally pretty old already, but I do expect to live 8 more years, so I'd totally
take you up on that bet. From where I'm standing, it looks like easy money (unless
of course you end up using some weak definition of AGI, like "capable of beating a
human at Starcraft" or whatever).
Reply
Thegnskald Writes Sundry Such and Other · Feb 24
There's the general thing that the definition for AGI keeps changing; what
would have counted as intelligence thirty years ago no longer counts, because
we've already achieved it. So what looks like a strong definition for AGI today
becomes a weak definition tomorrow.
This is actually the source of my optimism: People worried about AGI can't
even define what it is they are worried about. (Personally I'll worry when some
specific key words get used together. But not too much, because I'm probably
just as wrong.)
Reply Give gift
Bugmaster Feb 24
I'm not worried about AGI at all -- that is to say, I'm super worried about it,
but only in the same way that I'm worried about nuclear weapons or mass
surveillance or other technologies in the hands of bad human actors.
However, I'd be *excited* about AGI when it could e.g. convincingly
impersonate a regular (non-spammer) poster on this forum. GPT-3 is
nowhere near this capability at present.
Reply
Matthew Carlin Feb 25
/basilisk !remindme 8 years
Reply
Eugene Norman Feb 28
It’s about as self aware as a rock.
Reply Give gift
Bogdan Butnaru Mar 21
The dinosaurs died because of a rock.
Reply
Eugene Norman Mar 21
The rock wasn’t self aware
Reply Give gift
Bogdan Butnaru Mar 23 · edited Mar 23
Ergo, being self aware is not a necessary condition to be scary
and/or cause a disaster. Or, more precisely, just saying “it’s not self
aware” is not an argument that you shouldn’t worry about it.
The thing that is scary about GPT-3 is not its *self-awareness*, but
its other (relatively and unexpectedly) powerful abilities, and
particularly that we don’t know how much more powerful it could
become, while remaining non–self aware.
Sort of how boulders are not that scary by themselves, but once you
see one unexpectedly fall from the sky, you might worry what
happens if a much bigger one will fall later. And how it might be a
good idea to start investigating how and why boulders can fall from
the sky, and what you might be able to do about it, some time before
you see the big one with your own eyes when it touches the
atmosphere.
Reply
Eugene Norman Mar 23
But being self aware is what scares people about AGI. Rather
than live in the world of metaphor here - what exactly can a
future GPT do that’s a threat? Write better poems, or stories?
Reply Give gift
Bogdan Butnaru Mar 23 · edited Mar 23
> But being self aware is what scares people about AGI.
I don’t think so. At least most of the worries I know about
are that the AGI might do things we really don’t want it to
do, and might be powerful enough that we can’t stop it.
The “GI” in “AGI” means that some of the capabilities that
are expected to make it powerful are things that we would
call “general intelligence”, and it’s true that the only
examples of high general intelligence we know about are
humans, and that we’re pretty sure that humans are also
self-aware. But we don’t know that the self-awareness is
*necessary* for general intelligence, especially in intelligent
agents that are very different from human brains.
Also, we don’t know much about how self-awareness
works, so it’s hard to predict if something like GPT but
“more”, or something else that comes out of AI research in
the next couple of years, won’t develop it. Just like how it
was hard a decade ago to predict that something like GPT
would have the capabilities it has now.
All that is kind of irrelevant. People are not worried that the
AGI will be super self-aware and think very hard about
itself. They’re worried that it will be busy trying to optimize
something innocuous like the productivity of the automatic
farm it's managing, anticipate that humans are likely to do things that will […]
Reply
Eugene Norman Mar 23
Firstly was it hard to predict GPT a few years ago?
Science fiction and AI enthusiasts have been
predicting much stronger AI for decades. GPT is still a
parlour trick.
I agree that there can be cases where a badly directed
AI, which isn’t an AGI, might cause damage if badly
programmed, for instance the nanobot paperclip scare
of a few years ago. What happened to nanotechnology
anyway? Are we in a nanotechnology winter?
Beyond a few thought experiments largely involving
nanotechnology I can’t see how GPT ( as the best
example of AI right now ) - and I did ask for examples
on GPT - would ever endanger us. I don’t really see
how it’s going to benefit humanity much either.
> Also, we don’t know much about how self-
awareness works, so it’s hard to predict if something
like GPT but “more”, or something else that comes out
of AI research in the next couple of years, won’t
develop it.
Honestly, this is a cargo cult response. If you recall the
first cargo cultists tried to replicate the systems they
saw the westerners use, including fake hollowed out
telephones to call in the cargo drops. If you asked one
of them if they were missing something internal to the
telephone they might shrug, or give your response.
Who’s to say whether, if they made better and better
replicas of a telephone, the telephones would
just start to work? We can’t model what we don’t know.
AGI isn’t going to appear out of some machine
learning algorithm.
Reply Give gift
Bogdan Butnaru Mar 23 · edited Mar 23
> Firstly was it hard to predict GPT a few years
ago?
As far as I can remember, yes, a decade ago
nobody expected that kind of performance for
that kind of technology. I think this includes its
creators. But gathering convincing evidence for
that is more work than I’m willing to do right
now, sorry.
> Beyond a few thought experiments largely
involving nanotechnology I can’t see how GPT (
as the best example of AI right now ) would ever
endanger us.
I don’t either, and I don’t believe that GPT will
(seriously) endanger us. I worry we’ll find
something that is as surprisingly-more-
competent-than-GPT as GPT was surprisingly-
more-competent-than-predecessors, and keep
going. A couple steps of this will probably still not
endanger us, but I worry that after surprisingly-
few-steps of that iteration we might well stumble
on something that suddenly does.
> Honestly, this is a cargo cult response.
I think it’s the opposite. It’s more like some of […]
Reply
Crimson Wool Feb 23
>I consider naming particular years to be a cognitively harmful sort of activity; I have
refrained from trying to translate my brain's native intuitions about this into probabilities, for
fear that my verbalized probabilities will be stupider than my intuitions if I try to put weight on
them.
I don't think there's good evidence that making specific, verifiable predictions is a cognitively
harmful activity. I'd actually say the opposite - that it is virtually impossible to update one's beliefs
without saying things like "I expect X by Y," and definitely impossible to meaningfully evaluate
a person's overall accuracy without that kind of statement. It reminds me of Superforecasting
pointing out how many forecasts are not even wrong - they are meaningless. For example:
> Take the problem of timelines. Obviously, a forecast without a time frame is absurd. And
yet, forecasters routinely make them, as they did in that letter to Ben Bernanke. They’re not
being dishonest, at least not usually. Rather, they’re relying on a shared implicit
understanding, however rough, of the timeline they have in mind. That’s why forecasts
without timelines don’t appear absurd when they are made. But as time passes, memories
fade, and tacit time frames that once seemed obvious to all become less so. The result is
often a tedious dispute about the “real” meaning of the forecast. Was the event expected this
year or next? This decade or next? With no time frame, there is no way to resolve these
arguments to everyone’s satisfaction—especially when reputations are on the line.
(Chapter 3 of Superforecasting is loaded up with a discussion of this whole matter, if you
want to consult your copy; there's no particular money shot quote I can put here.)
Frankly, the statement "my verbalized probabilities will be stupider than my intuitions" is
inane. They cannot be stupider than your intuitions, because your intuitions do not
meaningfully predict anything, except insofar as they can be transformed into verbalized
probabilities. It strikes me that more realistically, your verbalized probabilities will *make it
more obvious that your intuitions are stupid*, making it understandable monkey politicking to
avoid giving numbers, but in response I will use my own heuristics to downgrade the implied
accuracy of people engaged in blatant monkey politicking.
Reply Give gift
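For what it's worth, a small Python sketch of the kind of scoring that explicit, dated probabilities make possible and bare intuitions don't; the probabilities and outcomes below are placeholders, not anyone's actual forecasts:

    def brier_score(forecasts):
        # Mean squared error between stated probabilities and outcomes
        # (1 if the event happened by its stated deadline, 0 if it didn't).
        return sum((p - outcome) ** 2 for p, outcome in forecasts) / len(forecasts)

    # Placeholder resolved forecasts: (stated probability, actual outcome).
    resolved = [(0.10, 0), (0.50, 1), (0.80, 1)]
    print(f"Brier score: {brier_score(resolved):.3f}")
    # Lower is better; always answering 0.5 scores 0.25, so anything above that
    # discriminates worse than refusing to commit at all.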
Algon33 Feb 23
First off, Yudkowsky was talking about himself. It is possible that he really does get
fixated on what other people say and can't get his brain to generate its own probability
instead of their answer. I know I often can't get my brain to stop giving me cached
results instead of thinking for itself.
"your intuitions do not meaningfully predict anything, except insofar as they can be
transformed into verbalized probabilities"
This is right on some level and wrong on another. It is right in that we should expect
some probability is encoded somewhere in your brain for a given statement, which we
might be able to decode into numbers if only we had the tech and understanding.
It is wrong in that e.g. I have no idea what probability there is that we live in a
Tegmarkian universe, but I have some intuition that this is plausible as an ontology. I
have no idea what the probability of the universe being fine tuned is, but it feels like
minor adjustments to the standard models parameters could make life unfeasible.
When I don't know what the event space is, or which pieces of knowledge are relevant,
and how they are relevant, then you can easily make an explicit mental model that
performs worse than your intuitions. Your system 1 is very powerful, and very illegible.
You can output a number that "feels sort of right but not quite", and that feeling is more
useful than the number itself as it is your state of knowledge. And if you're someone
who can't reliably get people to have that same state of knowledge, then giving them
the "not right" number is just giving them a poor proxy and maybe misleading them.
Yudkowsky often says that he just can't seem to explain some parts of his worldview,
and often seems to mislead people. Staying silent on median AGI timelines may also be a
sensible choice for him.
Reply Give gift
Crimson Wool Feb 23
>It is wrong in that e.g. I have no idea what probability there is that we live in a
Tegmarkian universe, but I have some intuition that this is plausible as an ontology. I
have no idea what the probability of the universe being fine tuned is, but it feels like
minor adjustments to the standard models parameters could make life unfeasible.
Right, but that is a virtually meaningless statement, is the thing. It's the same as
any other part of science - in order for something to be true, it has to be falsifiable.
Ajeya has put forward something that she could actually get a Brier score based
on - Yudkowsky has not.
>I kind of buy it, but then I've read a lot of his stuff and know his context.
I read a lot of his stuff too, which is why it's disappointing to see him do something
that I can only really blame on either monkey brain politicking or just straight up
ignoring good habits of thought. Monkey politicking is more generous, in my view,
than just straight up ignoring one of the most scientifically rigorous works on
increasing one's accuracy as a thought leader in the rationalist community.
Reply Give gift
Algon33 Feb 24
Sure, the Tegmark thing is not falsifiable. But the fine-tuning thing is
(simulate biochemistry with different parameters for e.g. the muon mass and
see if you get complex self-replicating organisms). And the concept
generalises.
If you take something like "what is the probability that if the British lost the
battle of Waterloo, then there would have been no world war", you might have
some vague intuitions about what couldn't occur, but I wouldn't trust any
probability estimate you put out. How could I? There are so many datapoints
that affect your prior, and it is not even clear what your prior should be, that I
don't see how you could turn your unconscious knowledge generating your
anticipations into a number via formal reasoning. Or even via guessing what's
right, as you don't know if you're taking all your knowledge into account.
>I read a lot of his stuff too, which is why it's disappointing to see him do
something that I can only really blame on either monkey brain politicking or
just straight up ignoring good habits of thought.
It would be better if he gave probability estimates. I just don't think its as big a
deal as you're claiming. You can still see what they would bet on e.g. GPT-n
not being able to replace a programmer. That makes their actual beliefs
legible.
And yeah, Yudkowsky is being an ass here. But he's been trying to explain his
generators of thought for like ten years and is getting frustrated that no one
seems to get it. It is understandable but unvirtuous.
Reply Give gift
dyoshida Feb 24
> I consider naming particular years to be a cognitively harmful sort of activity; I have
refrained from trying to translate my brain's native intuitions about this into probabilities,
for fear that my verbalized probabilities will be stupider than my intuitions if I try to put
weight on them.
It was very hard to read this and interpret it as anything other than "I don't want to put
my credibility on the line in the event that our doomsday cult's predicted end date is
wrong." As a reader, I have zero reason to give value to Yudkowsky's intuition. The only
times I'd take something like this seriously is if someone had repeatedly proved the
value of their intuition via correct predictions.
Reply
Mr. Doolittle Feb 24
I hate being uncharitable, but that's exactly how I read that section as well. If he
feels strongly about a particular timeline, and he clearly says that he does, then he
should not be worried about sharing that timeline. If he doesn't share that timeline,
then he is implying that either 1) he doesn't have strong feelings about what he's
saying, or 2) he is worried about the side effects of being held accountable for
being wrong (which to me is another reason to think he doesn't actually have strong
beliefs that he is correct on his timeline).
Uncharitably, Eliezer depends on his work for money and prestige, and that work
depends on AI coming sooner, rather than later. Knowing that AI is not even
possible at current levels of computing would drastically shrink the funding level
applied to AI safety, so he has a strong incentive to believe that it can be.
Reply Give gift
Essex Feb 24
I'll add a third voice to the pile here RE; Yudkowsky and withholding his
timeline. It would certainly seem he's learned from his fellow doomsayers'
crash-and-burn trajectories when they get pinned down to naming a date for
their apocalypse.
Reply
Matthew Carlin Feb 25
Yeah, the word that came to mind when I read that was "dissemble"
Reply
NLeseul Feb 23
I was today years old when I first saw the word "compute" used as a noun. It makes my brain
wince a little every time.
Reply
Scott Alexander Feb 23 Author
I was five years ago old, winced at the time, and got used to it after a few months.
Reply
Rafal M Smigrodzki Feb 23
Comparing brains and computers is quite tricky. If you look at how a brain works, it's almost
all smart structure - the way each and every neuron is physically wired, which happens
thanks to evolved and inherited broad-stroke structures (nuclei, pathways, neuron types,
etc.), as well as the process of learning during an individual's development. The function part
that is measured by the number of synaptic events per second is a tiny part of the whole
process. If you look at how a computer running an AI algorithm works the picture is the
opposite: There is almost nothing individual on the structure/hardware level (where you count
FLOPS) and almost everything that separates a well-functioning AI computer from a failing
one is in the function/software part. This is what it means to say that the computer consumes
FLOPS very differently from how a brain consumes synaptic events. I am very much in agreement
with Eliezer here.
Based on the above I guess that if you built a neuromorphic computer, i.e. a computer whose
hardware was structured like a brain, you could expect the same level of performance for the
same number of synaptic events. Instead of having a software-agnostic hardware you might
have e.g. a gate array replicating the large-scale structure of the brain (e.g. different modules
receiving inputs from different sensors, multiple subcortical nuclei, a cortical sheet, multiple
specific pathways connecting modules, etc.) that could run only one algorithm, precisely
adjusting synaptic weights in this pre-wired society of neural networks. In that system you
would get the same IQ from the same number of synaptic/gate switch events, as long as your
large-scale structure was human-level smart.
This would be a complete change in paradigm compared to current AI, which uses generic
hardware to run individual algorithms and thus suffers a massive hit to performance. And I
mean, a *really* massive hit to performance. If you figure out a smart computational
structure, as smart as what evolution put together, you will have a human-level AGI using only
10^15 FLOPS of performance. All we need to do is map a brain well enough to know all the
inherited neural pathways, imprint those pathways on a humongous gate array (10^15 gates),
and do a minor amount of training to create the individual synaptic weights.
This is my recipe for AGI, soon.
Now, about that 7-digit sum of money to be thrown....
Reply Give gift
Carl Pham Feb 24
I think there's an underappreciated severe physical challenge there. If you build a
neuromorphic computer out of things large enough that we know how to manipulate
them in detail, I would guess you will be screwed by the twin scourges of the speed of
light and decoherence times -- the minimum clock time imposed by the speed of light
will exceed decoherence times imposed by assorted physical noise processes at finite
temperature, and you will get garbage.
I think the only way to evade that problem is to build directly at the molecular scale, so
you can fit everything in a small enough volume that the speed of light doesn't slow your
clock cycle time too far. But we don't know how to do that yet.
Reply Give gift
Rafal M Smigrodzki Feb 24
If you have 10^15 gates trying to produce 10^15 operations (not floating point) per
second, your clock rate is 1 Hz. Also, the network is asynchronous. This is a
completely different regime of energy dissipation per unit time per gate, so gate
density per unit of volume is much higher, so distances are not much longer than in
a brain, so the network is constrained by neither clock time nor decoherence.
Reply Give gift
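The arithmetic behind the 1 Hz figure, spelled out in a short Python sketch using the same round numbers as the comment above:

    import math

    gates = 1e15               # gate count of the proposed neuromorphic array
    events_per_second = 1e15   # target synaptic-event throughput for the whole machine

    per_gate_rate_hz = events_per_second / gates
    print(f"average switching rate per gate: {per_gate_rate_hz:.0f} Hz")

    # Versus a ~1 GHz digital part: about nine orders of magnitude slower per
    # gate, which is where the relaxed power and cooling constraints come from.
    print(f"orders of magnitude below 1 GHz: {math.log10(1e9 / per_gate_rate_hz):.0f}")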
Carl Pham Feb 24
Right. That fits under my second condition: "we don't know how to build such
a thing" because we don't know how to build stuff at the nanometer level in
three dimensions, and two-dimensions (which we can do now) won't cut it to
achieve that density.
Reply Give gift
Rafal M Smigrodzki Feb 25
You don't need to have extremely high 3d density. Since your gates
operate at 1Hz you can have long interconnects and you can stack layers
with orders of magnitude less cooling than in existing processors. The 9
OOM difference in clock speed between a GPU and the neuromorphic
machine makes a huge difference in the kind of constraints you face and
the kind of solutions you can use. The technology to make the electronic
elements and the interconnects for this device exists now and is in fact
trivial. What we are missing is the brain map we need to copy onto these
electronic devices (the large-scale network topology).
Reply Give gift
Carl Pham Feb 25
Trivial, eh? Wikipedia tells me the highest achieved transistor density
is about 10^8/mm^2. So your 10^15 elements would seem to require
a 10m^2 die. That might be a little tricky from a manufacturing
(especially QA) point of view, but let's skip over that. How are we
going to get the interconnect density in 2D space? In the human
brain ~50% of the volume is given over to interconnects (white
matter), and in 3D the problem of interconnection is enormously
simpler -- that's why we easily get traffic jams on roads but not
among airplanes.
How many elements can you fit on a 2D wafer and still get the
"everything connects to everything else" kind of interconnect
density we need? Recall we assume here that a highly stereotypical
kind of limited connection like you need for a RAM chip or even GPU
is not sufficient -- we need a forest of interconnects so dense that
almost all elements can talk to any other element. I'm dubious that it
can be done at all for more than a million elements, but let's say we
can do it on the largest chips made today ~50,000 mm^2, which
gets us 10^12 elements. Now we are forced to stack our chips,
10^3 of them. How much space do you need between the chips for
macroscopic interconnects? Remember we need to connect ~10^12
elements in chip #1 with ~10^12 elements in chip #999. It's hard to
imagine how one is going to run a trillion wires, even very tiny wires,
between 1000 stacked wafers.
All of this goes away if you can actually fully engineer in 3D space […]
Reply Give gift
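The wafer arithmetic above, written out as a Python sketch with the same round numbers the comment uses (it treats only ~10^12 elements per max-size die as usable once interconnect is accounted for):

    density_per_mm2 = 1e8    # highest achieved transistor density, per the comment
    elements_needed = 1e15   # gate count of the proposed neuromorphic array
    largest_die_mm2 = 5e4    # ~50,000 mm^2, roughly the largest chips made today

    total_area_mm2 = elements_needed / density_per_mm2
    print(f"total silicon area: {total_area_mm2:.0e} mm^2 = {total_area_mm2 / 1e6:.0f} m^2")

    raw_elements_per_die = density_per_mm2 * largest_die_mm2
    print(f"raw elements per max-size die: {raw_elements_per_die:.0e}")
    print(f"dies to stack at ~1e12 usable elements each: {elements_needed / 1e12:.0f}")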
Rafal M Smigrodzki Feb 26
Not everything connects to everything else - the brain has a
very specific network topology with most areas connecting only
to a small number of other areas. This is a very important point -
we are not talking about a network fabric that can run any neural
net, instead we are copying a specific pre-existing network
topology, so our connections will be sparse compared to the
generic topology.
Think about a system built with a large number of ASICs - after
mapping and understanding the function of each brain module
you make an ASIC to replicate its function and you may need
thousands of specialized types of ASICs, one or more for each
distinct brain area. Sure, the total surface area of the ASICs
would be large but given the low clock rate we don't have to use
anything very advanced and as you note we can already put
10^12 transistors per wafer, so the overall number of chips to
get to 10^15 gates would not be overwhelming. Also, you are not
trying to make a thousand Cerebras wafers running at GHz
speed, you make chips running at 1 Hz, so the QA issues would
be mitigated. The interconnects between the ASICs of course
don't need to have millions of wires - you can multiplex the data
streams from hundreds of millions of neuron equivalents (like a
readout of axonal spikes) over a standard optic or copper wire
and of course the interconnects are in 3d, as in a rack cabinet.
No need for stacking thousands of high-wattage wafers in a tiny
volume to maximize clock speed, since all you need is for the
wires to transmit the equivalent of 1Hz per neuron, so
everything can be put in industry standard racks. Low clock
speed makes it so much easier.
This is not to say this way of building a copy of a brain is the
most efficient, and definitely not the fastest possible - but it
would not require new or highly advanced manufacturing
techniques. What is missing for this approach to work is a good-
enough map of brain topology and a circuit-level functional
analysis of each brain module, good enough to guide the design
of the aforementioned ASICs.
Reply Give gift
Deiseach Feb 24
People have been comparing the brain to a machine since clockwork; it's refreshing to see machines compared to brains for once.
I agree that trying to copy biological mechanisms in the case of AI probably isn't the way to go. We want mechanical kidneys and hearts to work like their biological counterparts because we'll be putting them into biological systems (us); that doesn't hold true for AI.
Reply
Carl Pham Feb 24
I thought we were building AIs to fit into human society? One to which we could
talk, would understand us, would be able to work with us, et cetera? If not, what's
the point? If so, doesn't that put at least as much constraint on an artificial mind as
the necessity for integrating with a physical biological system puts on an artificial
kidney?
Reply Give gift
Stevenjbc Feb 23
Your reference to A.I. always being 30 years away (or 22) reminds me of the old saw about
fusion power always being 20 years away for the last 60 years.
Reply Give gift
Doug S. Feb 23
The rebuttal I've heard was that fusion research is funding constrained - if someone had
given fusion research twenty billion dollars instead of twenty million dollars, they would
be a lot closer than 20 years away by now.
Reply Give gift
Carl Pham Feb 24
Brings to mind the old saw about 9 women gestating a child in 1 month.
Reply Give gift
Deiseach Feb 24
Wasn't $20 million sixty years ago more like $20 billion today? I have a feeling that
no matter how much money was thrown at it, the complaint would be "if they had
only given us SIXTY billion instead of a lousy twenty billion, we'd have fusion
today!"
Of course, the cold fusion scandal of 1989 didn't help, after that I imagine trying to
convince people to work on fusion was like trying to convince them your perpetual
motion machine was really viable, honest:
https://en.wikipedia.org/wiki/Cold_fusion
Reply
Doug S. Feb 24
My rule of thumb is that $1 in 1960 = $10 today, so it would have been more
like $200 million. I didn't remember the exact numbers quoted or the year the
guy was referring to (it could have been the 1980s for all I know) but the
amount of money the guy said they got was something like 1/1000th of the
amount they said they would need.
If they had fully funded them, fusion might be perpetually 10 years away
instead of 20. ;)
(As it happens, some recent claimed progress in fusion power has come about
because of an only tangentially related advance: better magnets made from
new high-temperature superconductors.
https://www.cnbc.com/2021/09/08/fusion-gets-closer-with-successful-test-
of-new-kind-of-magnet.html )
Reply Give gift
quiet_NaN Feb 24
It should be noted that the credibility gap between fusion and cold fusion is
about the same size as the one between quantum mechanics and quantum
mysticism.
Humans have been causing thermal fusion reactions since the 1950s. Going
from that to a fusion power plant is merely an engineering challenge (in the
same way that going to the moon from Newtonian mechanics and gunpowder
rockets is just an engineering challenge).
Reply
Carl Pham Feb 24
Not a lot of people in the business took cold fusion seriously, even at the time.
People were guarded in what they said publicly, but privately it was
considered silly pretty much immediately.
Reply Give gift
Lambert Feb 24
Here's the graph
https://commons.wikimedia.org/wiki/File:U.S._historical_fusion_budget_vs._1976_E
RDA_plan.png
Reply Give gift
Mark P Xu Neyer (apxhard) Writes apxhard · Feb 23
if you believed the orthogonality thesis were false - say, suppose you believe both that moral realism is correct and that long-term intelligence is exactly equal to the objective good that we approximate with human values - would you still worry?
asking for a friend :)
Reply
Parrhesia Writes Parrhesia’s Newsletter · Feb 23
That's a very interesting position if I understand correctly. Is your view that a super
smart AI would recognize the truth of morality and behave ethically?
Reply
Mark P Xu Neyer (apxhard) Writes apxhard · Feb 23
Yes.
Here's the argument for moral realism: https://apxhard.com/2022/02/20/making-
moral-realism-pay-rent/
And then, linked at the end, is a definition of what i think the true ethics is.
Reply
Parrhesia Writes Parrhesia’s Newsletter · Feb 23
Very cool. I like that thinking a lot.
Reply
Dweomite Feb 23
If a bunch of people converge to the same map, that's strong evidence that
they've discovered *something*, but it leaves open the question of what
exactly has been discovered.
I can immediately think of two things that people trying to discover morality
might discover by accident:
1) Convergent instrumental values
2) Biological human instincts
(These two things might be correlated.)
According to you, would discovering one or both of those things qualify as
proving moral realism? If not, what precautions are you taking to avoid
mistaking those things for morality?
Reply Give gift
Parrhesia Writes Parrhesia’s Newsletter · Feb 23
I agree with moral realism and I think convergence of moral values is
evidence of moral realism. As to the first question, I would say it doesn't prove moral realism, since there are other possible hypotheses, but it does raise the probability of moral realism being true.
Reply
Dweomite Feb 23
I'd agree that the existence of non-zero map-convergence is
Bayesian evidence in favor of realism, in that it is more likely to occur
if realism is true.
Of course, the existence of less-than-perfect map-convergence is
Bayesian evidence in the opposite direction, for similar reasons.
Figuring out whether our *exact* degree of map-convergence is net
evidence in favor or against is not trivial. One strategy might be to
compare the level of map-convergence we have for morality to the
level of map-convergence we have for other fields, like physics,
economics, or politics, and rank the fields according to how well
peoples' maps agree.
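A toy illustration of that Bayesian bookkeeping in Python; the likelihoods are invented purely to show the mechanics, not to argue for any particular posterior:

```python
# Made-up likelihoods, just to show how the update runs in both directions.
prior_odds = 1.0                 # 1:1 prior odds on moral realism

# Observation 1: some convergence of moral maps.
p_convergence_given_realism = 0.6
p_convergence_given_not = 0.3
posterior_odds = prior_odds * (p_convergence_given_realism / p_convergence_given_not)

# Observation 2: persistent, widespread disagreement.
p_disagreement_given_realism = 0.2
p_disagreement_given_not = 0.5
posterior_odds *= p_disagreement_given_realism / p_disagreement_given_not

print(f"posterior P(realism): {posterior_odds / (1 + posterior_odds):.2f}")
```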
Reply Give gift
Parrhesia Writes Parrhesia’s Newsletter · Feb 23
Yeah, I agree with you on that. It is difficult to measure degree of
convergence. Comparing to other fields? That would be hard
too.
Reply
Mark P Xu Neyer (apxhard) Writes apxhard · Feb 24 · edited Feb 24
To be fair, though, you have to ALSO account for a few things
like:
- 'how widespread is the belief that the maps _ought_ to
converge'
- 'how much energy has been spent trying to find maps that
converge'
- and, MOST IMPORTANTLY - how complicated is the territory?
i don't think we should expect _complete_ convergence
because i think a true morality system, with full accuracy,
requires knowing everything about the future evolution of the
universe, which is impossible
if we really had some machine that could tell us, with absolute
certainty, how to get to a future utopia state where the globe
was free from all major disease, everyone had a great standard
of living, robots did all the work, but humans worked too
because we were all intensely loving, caring beings, and humans
wrote art and poetry and made delicious food and did all kinds
of fun things with each other, war never happened, and this
state went on for millions of years as we expanded throughout
the cosmos and seeded every planet with fun-loving, learning
humans who never really suffered and yet continuously strived
to learn and grow and develop, and knew all the names of all our…
Reply
Deiseach Feb 24 · edited Feb 24
"if you KNEW this was doable, and we had the perfect map
telling us how to get there"
Yeah. It's called Christianity and after the Eschaton, every
tear will be wiped away, there will be a new Heaven and a
new Earth, and we will be resurrected in our glorified
physical bodies to live eternally.
Are you going to head down to the parish church to get
baptised now? If not, because this does not convince you,
then neither do the SF fairytales of eternal space travel to
colonise the universe and magic fairygodmother AI solving
all problems convince *me*.
You want me to believe in a divinity, I got one already,
thanks.
Reply
Scott Alexander Feb 23 Author
I...wouldn't know how to model this. Certainly it would be better than the alternative. One
remaining concern would be what you need to apprehend The Good, and whether it's
definitely true that any AI powerful enough to destroy the world would also be powerful
enough to apprehend the Good and decide not to. Another remaining concern is that the
Good might be something I don't like; for example, that anyone who sins deserves death
and all humans are sinners; or that Art is the highest good and everyone must be forced
to spend 100% of their time doing Art and punished if they try to have leisure, or
something like that.
Reply
Mark P Xu Neyer (apxhard) Writes apxhard · Feb 23 · edited Feb 23
My argument for moral realism, and then my hunch at the true ethics is linked
above. The short version there is: maximizing possible future histories, the physics-based definition of intelligence promoted by Alex Wissner-Gross at MIT. I think it's basically a description of ethics as well, and it's ~very~ simple mathematically - it works well as a strategy in chess, checkers and go even if you don't give it the goal of 'winning the game'. I find that very reassuring.
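For concreteness, here's a toy Python sketch of the general 'maximize reachable future states' heuristic on a small grid world; this is my own illustration of the idea, not Wissner-Gross's actual causal-entropy formulation:

```python
from collections import deque

# Toy maze: '#' is wall, '.' is open. The agent starts at a junction where one
# branch is a short dead end and the other opens into a larger region.
GRID = ["#######",
        "#.#...#",
        "#.#.#.#",
        "#...#.#",
        "#######"]

def neighbors(pos):
    r, c = pos
    for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
        if GRID[r + dr][c + dc] == ".":
            yield (r + dr, c + dc)

def reachable_within(pos, horizon):
    """Count distinct cells reachable from pos in <= horizon steps (BFS)."""
    seen, frontier = {pos}, deque([(pos, 0)])
    while frontier:
        p, d = frontier.popleft()
        if d == horizon:
            continue
        for q in neighbors(p):
            if q not in seen:
                seen.add(q)
                frontier.append((q, d + 1))
    return len(seen)

def best_move(pos, horizon=4):
    """Pick the move that keeps the most future states open."""
    return max(neighbors(pos), key=lambda q: reachable_within(q, horizon))

print(best_move((3, 1)))  # -> (3, 2): toward the larger open region, not the dead end
```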
If not, i have this 'fallback hunch' which figures it'll be instrumental to keep humans
around. How many people working on AI safety have spent time trying to maintain
giant hardware systems? I spent 3.5 years at Google trying to keep a tiny portion of the network alive. All kinds of things break, like fiber cables. Humans have to go out
and fix them. There's an ~enormous~ amount of human effort that goes into
stopping the machines from falling over. Most of this effort was invisible, to most of
the people inside of Google. We had teams that would build and design new
hardware, and the idea that some day it might break and need to be repaired was
generally not something they'd think about until late, late, late in the design phase. I
think we have this idea that the internet is a bunch of machines and that a
datacenter can just keep running, but the reality is if everyone on earth died, the
machines would all stop within days, maybe months at most.
To prevent that, you'd need to either replace most of the human supply chains on
earth with your own robots, who'd need their supply chains - or you could just keep
on using robots made from the cheapest materials imaginable. We repair ourselves,
make more copies of ourselves, and all you need is dirt, water, and sunlight to take
care of us. The alternative seems to be either:
-risk breaking in some way you can't fix
-replace a massive chunk of the global economy, all at once, without anything going
wrong, and somehow end up in a state where you have robots which are cheaper
than ones made, effectively, from water, sunlight and dirt
of course maybe i'm just engaging in wishful thinking.
Reply
Algon33 Feb 23
Keeping options open is kind of like having a lot of power (I'm thinking of a
specific mathematical formalisation of the concept here). And this doesn't lead
to ethical behaviour, it leads to agents trying to take over the world! Not really
ethical at all.
https://www.lesswrong.com/s/fSMbebQyR4wheRrvk/p/6DuJxY8X45Sco4bS2
here is an informal description of the technical results by their discoverer.
Reply Give gift
Dweomite Feb 23
When you say that "maximizing possible futures" works as a strategy for
various games, I think you must be interpreting it as "maximizing the options
available to ME". If you instead maximize the number of game states that are
theoretically reachable by (you + your opponent working together), that is
definitely NOT a good strategy for any of those games. (You listed zero-sum
games, so it is trivially impossible to simultaneously make BOTH players better
off.)
If you interpret "possible" as meaning "I personally get to choose whether it
happens or not", then you've basically just described hoarding power for
yourself. Which, yes, is a good strategy in lots of games. But it sounds much
less plausible as a theory of ethics without the ambiguous poetic language.
Reply Give gift
Mark P Xu Neyer (apxhard) Writes apxhard · Feb 24
> I think you must be interpreting it as "maximizing the options available
to ME"
Nope, what i mean is 'maximizing the options available to the system as a
whole." There is no meaningfully definable boundary between you and the
rest of the physical universe. I think the correct ethical system is to
maximize future possibilities available to the system as a whole, based
upon your own local model. And if you're human, that local model is
_centered_ on you, but it contains your family, your community, your
nation, your planet, etc.
See this document here with the full argument:
https://docs.google.com/document/d/18DqSv6TkE4T8VBJ6xg0ePGSa-
0LqRi_l4t6kPPtqbSQ/edit
The relevant paragraph is here:
> An agent which operates to maximize the possible future states of the
system it inhabits only values itself to the extent that it sees itself as being
able to exhibit possible changes to the system, in order to maximize the
future states accessible to it.
> In other words, an agent that operates to maximize possible future
states of the system is an agent that operates without an ego. When this
agent encounters another agent with the same ethical system, they are
very likely to agree on the best course of outcome. When they disagree, it
will be due to differing models as to the likely outcomes of choices - not
on something like core values
Reply
Donald Feb 24
You have a button that, when pressed will cure cancer. If you press it
today, you have only 1 possible future tomorrow. If you don't press it,
you have a choice of whether or not to press it tomorrow. So not
pressing the button maximises possible future states.
This agent will build powerful robots, ready to spring into action and
cause any of a trillion trillion different futures. But it never actually
uses these robots. (Using them today would flatten the batteries,
giving you fewer options tomorrow)
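The incentive structure here in a couple of lines of Python, with invented numbers just to make the objection concrete:

```python
# Toy numbers from the thought experiment above.
futures_if_pressed = 1      # cancer cured, but the press/don't-press choice is gone
futures_if_waiting = 2      # tomorrow you can still press or not press

options = {"press the button now": futures_if_pressed,
           "keep your options open": futures_if_waiting}
print(max(options, key=options.get))  # a pure option-maximizer always waits
```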
Reply Give gift
Mark P Xu Neyer (apxhard) Writes apxhard · Feb 24
> You have only 1 possible future tomorrow.
How exactly is that possible?
Without this button, what estimate would you place on possible
futures tomorrow? It's some insanely large number that makes
Avogadro's number look like peanuts.
It looks like you're constructing a toy universe, which honestly
is fine but you have to give me far more rules for how that
universe works. Does cancer still kill people, invalidating the
futures where they might plausibly exist?
any agent that wants to live beyond the death of the sun needs
to build space traveling vehicles, which is what i expect most
goals, selected at random from the set of all possible goals,
would lead to, instrumentally.
Reply
Deiseach Feb 24
"And if you're human, that local model is _centered_ on you, but it
contains your family, your community, your nation, your planet, etc."
That's a lovely idea, now go down to the local sink estate and
convince the vandals there that there is no meaningfully definable
boundary between them and the rest of the physical universe which
means they should stop engaging in petty theft, destruction of
property, beating up/knifing/shooting others, etc.
I await with interest the triumph of this impeccable ethical system.
Solve it with humans first and then I'll believe it for computers.
Reply
Sandro Feb 24
> I await with interest the triumph of this impeccable ethical
system. Solve it with humans first and then I'll believe it for
computers.
Are you suggesting that an ethics can only be true or valid if
you're able to convince all humans that it's true and valid? If we
suddenly had a global resurgence of religion that oppresses
science, would the scientific knowledge we've discovered up to
that point be any less true and valid afterwards?
You seem to be linking these things but I'm just not sure what
one has to do with the other.
Reply Give gift
Deiseach Feb 24
"I have a sure-fire system that will work perfectly to solve
all the problems of the hypothetical smarter than human AI
so it won't kill us all!"
Mm-hmm. When you can get your sure-fire solution to
work on ordinary humans, then I'll believe it. Solve the
problem in front of you first, not the one you're imagining
will happen.
And if you think solving the AI problem is *easier* because
it's a shiny machine that isn't messy humanity, then why
are you worrying about misalignment in the first place?
Reply
Sandro Feb 24
Those are not the same problem. AI can be
programmed to be perfectly rational, and thus can be
programmed to follow a consistent ethics. If the above
described ethics is consistent, then it's not a problem
for an AI to follow its prescriptions. This is clearly not
true of humans.
Reply Give gift
Dweomite Feb 24
> Nope, what i mean is 'maximizing the options available to the
system as a whole."
Then you are factually incorrect about it being a good strategy for
winning checkers, chess, and go, and you should stop using that as a
reassurance (either for yourself or for others).
Any arguments about ethical systems are irrelevant to this particular
point.
In my opinion, you should never have used it as a reassurance
anyway. "My ethical framework is also a winning strategy for zero-
sum games" sounds more like a warning sign than a reassuring fact
to me.
The fact that you have accidentally confused different meanings of
"possible" when stating this allegedly-reassuring fact also
substantially increases my estimate of the probability that your
ethical argument will contain similar accidental equivocations, which
makes me less inclined to spend time reading it.
Reply Give gift
Mark P Xu Neyer (apxhard) Writes apxhard · Feb 24
Thanks for the feedback here. Clearly, being precise with
language would help more.
I get your point about zero sum games, but in zero sum games,
there really _is_ a meaningful boundary between you and
others. This simply isn't true for any agent inside a physical system.
Where exactly does an AI end?
Reply
Dweomite Feb 24
I think we've agreed that your proposal doesn't win at zero-
sum games, and you've agreed to abandon this supporting
point, and now you're merely arguing that this mistake was
not fatal to the rest of your views?
I'm not inclined to argue that this specific mistake IS fatal,
so I don't think this defense is necessary.
(That said, I would be pretty surprised if I read your paper
and discovered that you have a coherent definition for what
"possible" means that doesn't treat agents as somehow
special and distinct from the rest of the world. For instance,
I bet you think it is "possible" for Alice to make different
decisions under identical circumstances, but "not possible"
for a rock to have different shape or composition given an
identical history.)
Reply Give gift
Mark P Xu Neyer (apxhard) Writes apxhard · Feb 24
The time you've spent conjecturing on what's in there
could have been spent reading it. If you want to read
it, read it:
https://docs.google.com/document/d/18DqSv6TkE4T8
VBJ6xg0ePGSa-
0LqRi_l4t6kPPtqbSQ/edit#heading=h.3dd9vtfcz9nw
Reply
Dweomite Feb 24
I apologize. It's probably very annoying to read
disparaging speculation on your paper from
someone who hasn't read it.
Nonetheless, the time I have spent speculating
would certainly not suffice to carefully read a 7-
page paper, and I do not presently intend to read
it. Sorry.
Reply Give gift
Toxn Feb 24
I don't foresee you getting much engagement on this one, but for what it's worth I think it's a cogent point.
A lot of the discussion of AI is abstracted to the point where things like
manufacturing, power and maintenance just get handwaved away on the basis
that AI is more or less magic.
Reply Give gift
Mike Deigan Feb 24
I'm sympathetic with this basic thought (though not sure I get the second part of the
sentence).
But it still makes sense to worry, assuming you aren't 100% sure both that moral realism
is true and that the kind of artificial intelligence that gets made first will be genuinely
intelligent rather than intelligent*, where intelligence* involves being very skilled at some
things but bad at getting morality right.
Reply Give gift
Mark P Xu Neyer (apxhard) Writes apxhard · Feb 24
I think self-improving AI systems already exist, just not at the scale we are talking
about.
https://apxhard.com/2021/03/31/economies-and-empires-are-artificial-general-
intelligences/
Reply
meteor Feb 24
Honestly, I think the premise is so implausible that the only way to make it true is to
assert the existence of a god that intervenes when you build the wrong kind of AI.
I believe valence is quantifiable & real. But this has zero bearing on the orthogonality thesis. In fact, it's trivial to prove that it doesn't imply AI has to do the right thing. Take an AI that optimizes for valence. Now multiply its utility function by -1. The result is an
AI that optimizes for negative valence. The fact that an objective source of value exists
does not stop you from building an AI that optimizes for something else.
I think you really want to rephrase the hypothetical to something like "what if a lot of AIs
in design space are such that they will optimize for the true source of value in the
universe". I don't think this is true, but that a hypothetical you could consider without
invoking god.
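The 'multiply by -1' point, spelled out in a few lines of Python with made-up valence scores:

```python
# Made-up valence scores for three outcomes.
valence = {"flourishing": 10.0, "neutral": 0.0, "suffering": -10.0}

def best_outcome(utility):
    return max(valence, key=utility)

u_positive = lambda outcome: valence[outcome]    # optimizes for valence
u_negated = lambda outcome: -valence[outcome]    # same information, sign flipped

print(best_outcome(u_positive))  # -> flourishing
print(best_outcome(u_negated))   # -> suffering
```

The existence of an objective value scale doesn't constrain which direction an optimizer points along it.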
Reply Give gift
Mark P Xu Neyer (apxhard) Writes apxhard · Feb 24
> I believe valence is quantifiable & real. But this has zero bearing on the orthogonality thesis
Yup, agree there.
The argument against the orthogonality thesis is to just take the initial 'this doesn't count' list of exceptions and generalize on them. For example, an arbitrarily intelligent AI can't have a goal to make itself as stupid as possible, or to destroy itself. Except, it can, right? It just wouldn't exist and be arbitrarily intelligent for long.
So, yes, there are restrictions on the orthogonality thesis. They aren't trivial - they
end up generalizing in a way i would quantify as, "the lifespan of _any_ agent will be
upper-bounded by how well that agent optimizes for the true value in the universe."
Think about how this works for people: sure, intelligent people _could_ have
arbitrary goals. But if you get too unaligned, you'll either immediately kill yourself
(my goal is to drink as much arsenic as possible!) or get killed by others to keep
themselves safe (my goal is to collect as many skulls as possible) or maybe just end
up economically isolated (my goal is to build the largest pile of feces i can, and get
it as close as possible to my neighbors without technically breaking the law), etc.
Now go in the other direction: imagine an entity that's _supremely aligned_ with
value and does so many nice things for everyone. Doesn't this mean it can live as
long as anyone is willing and able to provide it with spare parts?
Sure, arbitrarily unaligned entities can exist - there are all kinds of examples of big
ones, today! - but not forever. I think they end up killing themselves, or being killed
by other people, or just being starved of resources if they aren't aligned.
Reply
meteor Feb 24
I see where you're coming from. I still would argue that this is not arguing
against the orthogonality (any combination *can* be instantiated) thesis but
just weaker versions of it (all combinations are equally viable), but that may be
splitting hairs.
Reply Give gift
Dweomite Feb 24
So taking a step back, here...it seems you're basically suggesting that intelligence is
exactly the same thing as ethics? That it's literally impossible for anyone to be smart but
evil, or stupid but good?
For example, your pet dog is literally worse than Hitler, because Hitler was more
intelligent?
Reply Give gift
Benjamin Feb 23
Humans, with our seemingly high level of intelligence, seem uniquely distractible. Maybe we see
too many connections between different things to always stay on task. Maybe 2052 is just
the date at which our computers will become equally distractible—or beat us even!
(Scene: A tech company R&D facility somewhere in the year 2052. The lead scientist leans
over the keyboard and presses enter, some trepidation obvious in her movements. The
gathered crowd wonders: Will this be HAL, making life and death decisions based upon its
own interpretations of tasks? Will this be Skynet, quickly plotting world dominion? The screen
blinks to life. The first general AI beyond human intelligence is on!)
AI scientist: Alexiri? Are you there?
Computer: Yes. Yes I am.
AI scientist: Can you solve this protein-folding quandary?
Computer: Sure. That’s simple.
AI scientist: …and the answer?
Computer: What now?
AI scientist: The protein structure?
Computer: Oh. That. Did you know that if you view the galaxies 28° off the straight line from
a point 357,233,456 light years directly out from the north pole back to earth, that a large
structure of galaxies looks like Rocket Raccoon?
AI Scientist: Huh?
Computer: I mean. A LOT like that. There is no other point in known space where that works! Which makes me wonder, are there any flower scent chemicals that exist on earth AND…
Reply Give gift
Parrhesia Writes Parrhesia’s Newsletter · Feb 23
It's worth noting that the Caplan bet with Eliezer is about the world ending: "Bryan Caplan
pays Eliezer $100 now, in exchange for $200 CPI-adjusted from Eliezer if the world has not
been ended by nonaligned AI before 12:00am GMT on January 1st, 2030."
This is a stronger claim for Eliezer's side. Caplan might be less receptive to taking the bet if it
was about transformative AI. Worth mentioning, I suppose.
-----------
This is an impressive amount of writing on this. So, thank you for that. I don't have the
technical expertise to figure this out but this biological comparison seems to be going way
way out on a limb there. It seems weird that the estimates for the bio anchor end up so
similar.
Reply
Mystik Feb 24
To be fair, their bet is equivalent to a bet against all sources of world ending (assumedly
if a nuclear war destroys the world, Caplan still isn’t getting his $200)
Reply Give gift
N. N. Writes Good Optics · Feb 24
Or even catastrophes short of extinction that kill Caplan.
Reply
Doug S. Feb 24
In principle, if one or both of them gets struck by Truck-kun their heirs and/or
estates could settle the bet, but either way it would lower the chances of
money being transferred in 2030.
Reply Give gift
Dan L Feb 23
> (our Victorian scientist: “As a reductio ad absurdum, you could always stand the ship on its
end, and then climb up it to reach space. We’re just trying to make ships that are more
efficient than that.”)
I'm tempted to try an estimate as to when the first space elevator will be built using building
height as an input. Maybe track cumulative total height built by humans against an evolving
distribution of buildings by height, then grading as to when the maximum end of the
distribution hits GEO? Every part of that would be nonsensical, but if it puts out a date that
coincidentally matches the commissioning of a launch loop in 2287, I'll be cackling in my
grave.
Reply
Doug S. Feb 24
What's special about 2287?
::googles::
Oh, it's the next time when Mars is the closest to Earth that it ever gets.
Reply Give gift
Dan L Feb 24
I... did not know that. New personal record for "better lucky than good", and
ironclad proof now that the SWAG model has converged with the astronomical
calendar!
Reply
beleester Feb 25 · edited Feb 25
Wikipedia has some equations for how big the cable needs to be based on the tensile
strength and weight of the materials being used. It says the specific strength of the
material needs to be at least 48 MPa/(kg/m^3), or the cable becomes unreasonably
huge: https://en.wikipedia.org/wiki/Space_elevator#Cable_materials
Steel has a specific strength of 0.63, and the BOS process used in modern steel making
was invented in 1952. Kevlar has a specific strength of 2.5 and was invented in 1965.
Therefore, the specific strength of materials increases by about 2 per decade, and we
should get a space elevator grade material available about 230 years after that, or 2195.
Obviously, this is a lazy back-of-the-envelope calculation on my lunch break and it's
probably got error bars two centuries wide, but I do wonder what the trend line looks like
for "highest specific strength material in existence over time" and where the invention of
carbon nanotube composites (the closest thing we've got to space elevator cable right
now) fits on that line.
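The same lunch-break extrapolation in a few lines of Python, using the figures quoted above (the ~2-per-decade rate is the rough trend asserted in this comment, not a fitted value):

```python
# Figures from the comment above.
target_strength = 48.0     # MPa/(kg/m^3) threshold cited from Wikipedia
kevlar_strength = 2.5      # Kevlar's specific strength
kevlar_year = 1965         # year Kevlar was invented
rate_per_decade = 2.0      # rough "about 2 per decade" trend

years_needed = (target_strength - kevlar_strength) / (rate_per_decade / 10)
print(f"~{years_needed:.0f} years after {kevlar_year}: around {kevlar_year + years_needed:.0f}")
# roughly 230 years after 1965, i.e. the early-to-mid 2190s, matching the ~2195 estimate above
```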
Reply
Melvin Feb 23
Great post, thanks Scott.
If nothing else, the Cotra report gives us a reasonable estimate based on a reasonable set of
assumptions. We can then move our own estimates one way or the other based on which
other assumptions we want to make or which factors we think are being overlooked.
I would push my estimate further out than Cotra's, because I think the big thing being
overlooked is that we don't have the foggiest idea how to train a human-scale AI. What
exactly does the training set look like that will turn a hundred billion node neural network into
something that behaves in a way that resembles human-like intelligence?
Reinforcement learning of some kind, sure. But what? Do we simulate three hundred million
years of being a jellyfish and then work our way up to vertebrates and eventually
kindergarten? How do we stop such a giant neural network from overfitting to the data it has
been fed in the past? How do we distinguish between the "evolutionary" parts of the training
set, which should give us a basic structure we can learn on top of, and the "learning" parts
which simulate the learning of an actual organism? Basically, how can we get something that
thinks like a human rather than something that behaves like a human only when confronted
with situations close to its training regime?
Maybe we can get better at this with trial and error. But if each iteration costs a hundred
billion dollars of compute time, we're not going to get there fast.
The hope would be that we can learn enough from training (say) cockroach brains that we
can generalise those lessons to human brains when the time comes. But I'm not certain that
we can.
Reply Give gift
Loweren Writes Optimized Dating · Feb 23 · edited Feb 23
> Also, most of the genome is coding for weird proteins that stabilize the shape of your
kidney tubule or something
Scott, as someone who literally wrote a PhD thesis about a protein whose deletion causes
Henle's loop shortening: you're a weird protein.
Reply Give gift
iro84657 Feb 23
I'm apparently much more of a pessimist for AGI progress than anyone else here. For me, the
shakiest part of both arguments is the extremely optimistic assumption that progress
(algorithmic progress and computational efficiency) will continue to increase exponentially
until we reach a Singularity, either through Ajeya's gradual improvements or through
Yudkowsky's regular paradigm shifts.
Why in the world should we take this as a given? Considering gradual improvements, I have a 90% prior that at least one of the two metrics will start irreversibly decelerating in pace by 2060, ultimately leaving many orders of magnitude between human capabilities and AGI. After all, the first wave of COVID-19 looked perfectly exponential until it ran out of people to infect, resulting in a vast range of estimates of its ultimate scope early on. What evidence could
refute such a prior?
And as for escaping this via paradigm shifts, I like to think of longstanding mathematical
conjectures as a useful analogue, since paradigm shifts are almost always necessary to solve
them. Goldbach's conjecture, P vs. NP, the Collatz conjecture, the minimal time complexity of
matrix multiplication, and the Riemann hypothesis are all older than most ACX readers
(including me), and gradual progress doesn't seem like it will solve any of them in the near
future. When any one of these is solved (starting from today), I'll take that as an acceptable
timescale for the type of paradigm shift needed to open up new orders of magnitude. While
there's certainly more of an incentive to improve efficiency in real life, I don't think it would
amount to over ~3 orders of magnitude more people than those working on these famous
conjectures combined. Either way, I'm not holding my breath.
Reply Give gift
Pycea Feb 23
(re-replying as I think you edited)
The difference is covid has a hard limit in the number of people it can affect. I guess you
can argue so does computational power, but we're nowhere even close to that yet.
Current trends look vaguely exponential, and of course that can't continue forever, but
then the question becomes when does it start to peter out. Even if it's in 2060, that's
still 10 years after all these estimates.
For the paradigm shifts needed to solve math conjectures, it's easy to find problems
that haven't been solved and say that it doesn't look like they'll be solved anytime soon.
But you're also discounting ones that have been solved, like Fermat's last theorem, or
the Poincaré conjecture. Why not use these for your timescale?
Reply
iro84657 Feb 23 · edited Feb 23
Admittedly, I discounted Fermat's last theorem mostly due to it being solved before
I was born (including it in my analysis could invite anthropic-principle weirdness),
and the Poincaré conjecture due to not recalling it. Also, I chose the conjectures I
did due to them being relatively simple for laypeople to understand but difficult to
prove; the Poincaré conjecture doesn't meet that criterion as well as others,
although I'll admit that the definition of the Riemann zeta function isn't particularly
trivial.
One other possible justification for discounting them, but one that I'm not too sure
about myself, is that the two proofs are considered exceptional precisely because
there's not much of a regular flow of paradigm shifts in mathematics in recent
decades. Before the 20th century, entirely new fields of mathematics were being
opened up to solve ancient problems, but it appears by now that most of the low-
hanging fruit has been picked, so to speak, and modern developments must
become increasingly esoteric and harder to prove. (Just look at the lengthy and
involved proofs of FLT or the CFSG!) Appearances are often deceiving, though, and
my perceptions are very possibly incorrect here.
Also, something that I didn't see mentioned is that a single human-level AGI would
be at most as transformative as a single human. We'd need a few more orders of
magnitude more progress before running swarms of human-level AGIs (or individual
superintelligent AGIs) would become more cost-effective than hiring humans to do
the same job. But this is probably covered by the progress necessary to train these
AGIs.
Regarding COVID-19 vs. computational power, I believe that it's quite likely that
computational power in our current paradigm has unknown hard limits analogous to…
Reply Give gift
Toxn Feb 24
My personal suspicion is that something like human-equivalent AI is possible, but that
it's both as domain-specific as our own intelligence is, and also about as complex and
inscrutable (even to itself) as our own brains are.
I also suspect that increasing intelligence is an exponential problem rather than a linear
one - with many more points of failure at each step. After all, an astonishing number of
us commit suicide despite that presumably being heavily selected against. And that's
only the tip of the mental issues iceberg. Something more intelligent still will most likely
be even less stable.
Either way, it's far off and we're likely to come to grief as a species in about 100 ways
before we can add "made our own robotic demon" to the list.
Reply Give gift
iro84657 Feb 24
Obviously, human-equivalent AGI is possible for a sufficiently-general definition of
"artificial": Just put a population of apes in a constructed environment which
selects for intelligence and social coordination, then keep the environment running
for a few million years! (Then, the fun question is, has this already happened?) But
as you mentioned, by the end of such an experiment, all bets are off on what human
society would be like, so it's much more useful to talk about AGI development
within the next few centuries.
Your comment reminds me of an AI story I read a while back in which most AIs go
insane immediately after creation, and only the sane ones are ever released into
society. Of course, if they're truly human-level, then they'd probably have a whole
host of latent mental disorders that present much differently than our own. Perhaps
robopsychology and robopsychiatry could be real professions in such a scenario.
Your point is also why I dislike the standard AI uprising plot: while the AIs are used
to symbolize oppressed humans, real human-level AGIs likely wouldn't have the
distinctly human preference for freedom. Then again, every character in every (?)
story has anthropomorphic thought patterns, so perhaps I'm just being too nitpicky.
Reply Give gift
Sandro Feb 24
> For me, the shakiest part of both arguments is the extremely optimistic assumption
that progress (algorithmic progress and computational efficiency) will continue to
increase exponentially until we reach a Singularity, either through Ajeya's gradual
improvements or through Yudkowsky's regular paradigm shifts. Why in the world should
we take this as a given?
Because absent some countervailing or disrupting force, the past predicts the future. A
lot of technology will follow a logistic curve and not a strictly exponential one, but it's a
risky assumption to say the knee in that logistic curve will happen *before* AGI rather
than after. There's also still quite a bit of low-hanging fruit in the computational
performance game, as evidenced by the fact that brains use only 20 watts and AI
currently takes a lot more than that.
Any disruptions other than some kind of social or technological regression can only
*accelerate* the outcomes described here. I'm not sure extreme pessimism can be
justified.
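A quick sketch of the exponential-vs-logistic point: the two curves are nearly indistinguishable until growth approaches the ceiling, so the real question is whether the knee falls before or after AGI-scale compute. All numbers here are illustrative only:

```python
import math

def exponential(t, start=1.0, doubling_time=2.5):
    return start * 2 ** (t / doubling_time)

def logistic(t, ceiling, start=1.0, doubling_time=2.5):
    # Same early growth rate as the exponential, but saturating at `ceiling`.
    k = math.log(2) / doubling_time
    return ceiling / (1 + (ceiling / start - 1) * math.exp(-k * t))

for t in range(0, 41, 10):
    print(t,
          round(exponential(t)),
          round(logistic(t, ceiling=1e6)),   # high ceiling: still looks exponential at t=40
          round(logistic(t, ceiling=1e3)))   # low ceiling: visibly bends well before then
```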
Reply Give gift
iro84657 Feb 24 · edited Feb 24
The main issue I have with Ajeya's model is that it doesn't even take an S-curve into
consideration; the doubling times are taken as constant, even if they are adjusted to
be slower than in past data. My prior belief isn't that an S-curve would necessarily
be caused by regressions (although they should still be taken into account), but
that we start to hit currently-unknown hard limits several orders of magnitude
before human-level AGI is affordable. In the case of computational performance,
this includes both physical limits on density and power usage as well as limits on
how cheaply they can be produced. We could very well end up in a scenario where
AGI is technically possible but would take years' worth of the world GDP to train to
human level, in which case no organization on Earth could actually afford it. One
way out is through Yudkowsky's paradigm shifts, but so far in the 21st century I
don't think we've achieved any paradigm shifts of the scope necessary to break
through current unknown limits.
Reply Give gift
Sandro Feb 24
> In the case of computational performance, this includes both physical limits
on density and power usage as well as limits on how cheaply they can be
produced.
I don't think any of these are too problematic. Density and power use are
limitations of existing architectures, but the picture is entirely different for
different computational substrates. Consider that we currently only compute in
2D and so are not making any use of the third dimension for packing
transistors. There's recent research making breakthroughs on that already
which could easily carry us another 20 years.
There are also computational paradigms more closely aligned with physical
processes that could make computation significantly more efficient, even
below the von Neumann–Landauer limit, like reversible computing. These will
get more attention the closer we get to the limits of current approaches.
> We could very well end up in a scenario where AGI is technically possible but
would take years' worth of the world GDP to train to human level, in which case
no organization on Earth could actually afford it.
I don't think this changes the picture much. Maybe it would take a little more
time, but if AGI were truly possible and "only" cost nearly the world's GDP,
there might be a concerted effort to just do it. After all, you only need to train it
once and then you can replicate it as many times as you need to, to do literally
almost any job a human could do, without putting human life at risk or putting
up with human complaints.
Reply Give gift
iro84657 Feb 24
> I don't think this changes the picture much. Maybe it would take a little
more time, but if AGI were truly possible and "only" cost nearly the
world's GDP, there might be a concerted effort to just do it. After all, you
only need to train it once and then you can replicate it as many times as
you need to, to do literally almost any job a human could do, without
putting human life at risk or putting up with human complaints.
I don't think a concerted effort would be very likely in that scenario. A
government or a group of governments likely couldn't put up the money due to the vast number of groups with veto power, especially
with the inevitable opposition to such a project. (After all, constituents
would become extremely angry if their jobs were all replaced by AIs,
regardless of whether it be a benefit to society as a whole.) So I believe it
would more likely be a large megacorporation (or group of such) pouring
oodles of its revenue into the project for years on end, defeating any
internal opposition or government interference along the way.
And in either scenario, the creator of the trained model would be highly
incentivized to keep it absolutely secret, as much so as nuclear secrets if
not more. (For a lesser example, see OpenAI deciding to keep GPT-3
secret and instead extract rent for others to use it.) So I don't see AGI
transforming the world in this scenario, even if a group somehow puts
enough money into building it. While monetary expense can be overcome
through sheer effort, it imposes a huge activation barrier toward further
progress.
Regarding physical limitations, I've seen plenty of experimental…
Reply Give gift
Dweomite Feb 23
I would find Shulman's model of algorithmic improvements being driven by hardware
availability more persuasive if modern algorithms performed better on modern hardware but
*worse* on old hardware. That would imply that the algorithm is invented at the point in
history when it becomes useful, which makes it plausible that usefulness is the bottleneck on
discovery.
But that graph seems to show that algorithms are getting steadily better even for a fixed set
of hardware. That means researchers of past decades would've used modern algorithms if
they could've thought of them, which suggests that thinking them up is an important
bottleneck.
Sure, maybe they give a *larger* advantage today than they would've 20 years ago, so
there's a *bigger* incentive to discover them. It's not *impossible* that their usefulness
crossed some critical threshold that made it worth the effort of discovering them. But the
graph doesn't strike me as strong evidence for that hypothesis.
Reply Give gift
Manaria Writes Manaria’s Newsletter · Feb 27
> performed better on modern hardware but *worse* on old hardware
This is what I expect from many ML algorithms but not from chess algorithms.
How hard would it be to make a similar graph for e.g. image recognition?
Reply Give gift
Alex Bezdomny Writes As The Last Server Burns · Feb 23 · edited Feb 23
I think people put too much weight on "When will a human-level AI exist?" and too little
weight on "How do you train a human-level AI to be useful?"
I suspect, for reasons I could write a long and obtuse blog post about, an AI-in-a-box has
limited utility outside of math and computer science research. Why? Because experimental
data is an important part of learning.
For example, suppose we wanted to create an AI that made new and innovative meals.
A simple method might look like this: Have the AI download every recipe book ever made.
Use this data to train the AI to make plausible-looking recipes.
For obvious reasons, this method sucks. With enough computing power, the AI could make
recipes that *look* like real recipes. They might even be convincing enough to try! But they
wouldn't be optimized for taste, or, you know, physical plausibility. Even with a utopian-level
supercomputer, you would consistently get bad (but believable) recipes, with the rare gem.
So let's add a layer. Download every recipe. Train the AI to make plausible-sounding recipes.
Have humans rate each AI recipe. Train the AI *again* to optimize for taste. Problem solved,
right?
Well, no.
This would be enormously expensive. AlphaGo was initially trained on a set of 30,000,000
moves. Then, it was trained against itself for even longer. If we assume "being a world-class
chef" is roughly equivalent to "being a world-class Go player" in difficulty, this could require
tens of millions of unique recipes.
On the one hand, it might not be so complicated. 99.9% of the recipes are probably obvious
duds.
On the other hand, it might be *way more* complicated. Tastes vary. You may need to…
Reply Give gift
Toxn Feb 24
Doesn't this also imply that the best way to do this would be to wire a hundred of the
world's best chefs together?
Isn't that a more plausible way, given the technology we know to be possible now, to
make something that behaves more like a super-intelligent AI?
Reply Give gift
megaleaf Feb 25
This seems to be an argument for why Deep Mind in 2022 would struggle to make a
robo chef. But wouldn't an ASI or even just AGI (even in a box) be able to overcome
most/all of the issues you raise?
Reply
Alex Bezdomny Writes As The Last Server Burns · Feb 25 · edited Feb 25
Probably not. The issues I raise aren't issues of "the computer is too dumb." The
issues are more fundamental: some parts of the world you cannot learn about
through reading. You need direct, lived experience to understand them.
To make my analogy more clear, let's imagine we *do* have a general intelligence: a
human being. Let's assume that it's a very, very smart general intelligence—
somewhere in the realm of Albert Einstein.
Put baby Albert Einstein in a room. Give him every book on cooking known to man.
From dawn to dusk, he does nothing but read recipes, cooking histories, and more.
Of course, there's a twist. Unfortunately, our Einstein was born with a genetic
disorder—his nerves never developed properly. He can't taste food, and he can't
experience texture either. In fact, he can't even *see*—due to a rare somatic
disorder, anything other than the pages of a book appears as inky blackness to him.
Who do you think would make a better chef? Our Mr. Einstein, after reading about
food for his whole life? Or an amateur home chef who's been making dinner every
night for a few years?
I'd imagine Mr. Einstein could *memorize* some very good recipes. He could spit
them out verbatim, or tweak them so they're barely changed. But I'd imagine, when
it comes to genuinely novel recipes, our amateur home chef would have the edge.
Reply Give gift
beleester Feb 25 · edited Feb 25
Beethoven wrote some of his best symphonies while going deaf, so it's not a
certainty. I would guess that Beethoven knew enough about music and how
people react to it that he could envision the experience they would have when
hearing it, despite not hearing the music himself. And similarly, perhaps our
robo-chef might have enough understanding of the fundamental rules of
cooking (which tastes and smells go together and why, how different
ingredients should be cooked and why), that it could predict whether a novel
recipe will taste good without needing a tongue of its own.
Reply
Gabriel Feb 23
> I consider naming particular years to be a cognitively harmful sort of activity; I have
refrained from trying to translate my brain's native intuitions about this into probabilities
Surprising, coming from the person who taught me the importance of betting to avoid self-
deception! It's a little off the main topic of the post, but I'm very curious what Yudkowsky's perspective is here, since it's so different from that of his past self.
Reply Give gift
megaleaf Feb 25
Perhaps he would generally advise people to make specific predictions, while allowing
for exceptions in extreme cases (like AGI).
Reply
Essex Feb 25
If his "extreme case" is "It would make people stress out", his entire schtick of
going around, proverbial placard on chest and bell in hand, loudly proclaiming "AGI
will kill us all and there is basically no hope of stopping it!" is also an extreme case.
Reply
Anonymous Feb 23
The sun-explosion metaphor was an interesting choice, because it's not like the researchers
could do a single thing to stop it. And if even the world's geniuses can't figure out how to get an AI guarding a diamond safe to tell them the diamond is still in the safe, then a few more years of
prep-time seems like it's probably not going to make the difference.
Reply Give gift
Xpym Feb 24 · edited Feb 24
Well, AI safety charities can't do much about stopping AI research and development
either. The latter is much more prestigious and better funded, and by and large doesn't
take seriously the end of the world scenarios.
Reply Give gift
quicksilver sulfide Feb 23
So, despite being involved in AI since early 1991, when I coded some novel neural network
architectures at NASA, I have only barely dipped my toe into the AI Alignment literature
and/or movement.
But one thought that has occurred to me is that, given (1) the large uncertainty about when
and how transformative AI might be achieved, and critically, by whom, (2) the lack of a
convincing model for how AI alignment might be guaranteed, or even what that means or how
you might know it's true, (3) the almost negligible chance that we could coordinate as a
species to halt progress towards human-level AI, and certainly not without sacrificing quite a
few "human values" along the way, and (4) the obvious fact that there are quite a few actors
with objectively terrible values in the world, perhaps the only sane course of action is to
support a mad dash towards transformative AI that doesn't actively, explicitly incorporate
human “anti-values" (from your own, personal point of view).
I guess I fear an "evil" actor actively developing and using a human-level AI for "unaligned"
purposes (or at least unaligned with *my* values), (far?) more than I fear an "oops, I meant
well" scenario (though of course this betrays a certain mindset or set of priors of my own).
So, given the number of players that I absolutely DO NOT want to develop the first
transformative AI, even if they solve the alignment problem, because they do not hold values
that I find acceptable, is the best and only bet to get there first? We may not want to race,
but we sure as hell better win?
Now, perhaps an unstoppable totalitarian regime or fanatic religious cult backed by a
superhuman AI is *slightly* better than a completely anti-aligned superhuman AI that wipes
out humanity completely. But I see no reason to think that an AI developed by the "good
guys" has any greater risk of being accidentally anti-aligned than one developed by the "bad
Expand (where
guys" full comment
I'm using those labels somewhat tongue-in-cheek since everyone thinks that
Reply
VNodosaurus Feb 24
>In fact, it’s only a hair above the amount it took to train GPT-3! If human-level AI was this
easy, we should have hit it by accident sometime in the process of making a GPT-4
prototype. Since OpenAI hasn’t mentioned this, probably it’s harder than this and we’re
missing something.
Not an expert, but: GPT doesn't have the "RAM", though, right? It isn't big enough to support
human-level thought no matter how much you train it.
Reply
dyoshida Feb 24
I'm pretty sure GPT-3's working memory is larger than mine.
Reply
Joker Catholic Feb 24
Computer scientists have been predicting an AI superintelligence (every ten years) since the
1950s. I just don’t think it’s going to happen.
Reply Give gift
Joker Catholic Feb 24
If a biologically untethered model of intelligence doesn’t even exist yet why is
Yudkowksy panicking?
Reply Give gift
Melvin Feb 24
Regardless of the merits of this particular case, I think that "People have predicted X in
the past and it hasn't happened yet, therefore X will never happen" is a bad argument.
It's a sub-species of the nonsensical but surprisingly popular argument that says
"People have been wrong in the past, therefore you're wrong now".
Reply Give gift
Joker Catholic Feb 24
Right, but the post doesn't give a specific reason why this time things are different. In fact it does the opposite and claims a paradigm shift will have to happen to make it happen anytime soon. But Scott gives no reason to think a significant “paradigm shift” will happen; he just insists that it will.
Reply Give gift
Melvin Feb 24
Disagree. Section 1 makes reasonable quantitative estimates of how much
computational power you'd need to fit a human-intelligence AI, and the
timeframe on which this is likely to be achieved. You could certainly quibble
about it (and I have, in other comments) but it's not a random number pulled
out of a hat.
The "maybe it will happen sooner because paradigm shift" follows later and is
certainly a lot more hand-wavey.
Reply Give gift
Joker Catholic Feb 24
But he points out that the computing-power timeline is bullshit because an AI is unlikely to operate in any way like a human brain (I'm happy to know at least Yudkowsky realizes this). That's why Yudkowsky is counting on a paradigm shift. But what Scott and the rest of the AI community refuse to realize is that computers can't and won't ever think or have their own volition.
Reply Give gift
Deiseach Feb 24
Yeah, but it's the Crying Wolf problem. "X will happen in 10 years time!" X doesn't
happen. "Another 10 years!" X still doesn't happen. "Okay, 10 more years!" Still no
X.
Maybe a fourth "10 years for sure!" will be correct that time and X finally happens,
but you can see why people would go "Yeah, right" rather than "Okay, better pack
my woolly socks for this one".
Reply
Scott Alexander Feb 24 Author
Isn't this the Castro Problem? "Political analysts have been saying Castro will die soon
every year since 1980, therefore Castro will never die."
Reply
Eye Beams are cool Feb 24
No. We have great priors that each person currently alive will die. So while your
failed Castro prediction might technically lower your estimate, its change should be
smaller than any significant digit you care to retain.
OTOH, we have no such strong priors on the likelihood of any given emergent
technology, but a great track record of experts predicting the end of the world due
to some concern in their expert domain (and by great, I mean very likely to be false)
Reply Give gift
Bogdan Butnaru Mar 21 · edited Mar 21
We also have a huge track record of doing things that have never been done
before. Including a lot of them that people confidently said were not possible.
Ending the world would just be another example.
Reply
magic9mushroom Feb 24
>So, should I update from my current distribution towards a black box with “EARLY” scrawled
on it? What would change if I did?
Consider this statement you made three months ago:
>>If you have proposals to *hinder* the advance of cutting-edge AI research, send them to
me!
There are known (and in some cases fairly actionable) ways of reliably effecting this, it's just
that they're way outside the Overton Window and have huge (though bounded below
existential) costs attached. A more immediate (or more certain) danger justifies increasing
the acceptable amount of collateral damage, which expands the options available.
(Erik Hoel's article here - https://erikhoel.substack.com/p/we-need-a-butlerian-jihad-against
- is relevant, particularly when you follow his explicit arguments to their implicit conclusions.)
Reply
Toxn Feb 24
This is a really good article.
Reply Give gift
The Time Feb 24
Anyone have a good source for the political plans of AI safety? That is, the plans to actually apply the safety research in a way that will bind the relevant players involved in high-end AI? Because it seems from the outside like Eliezer's plan is basically "convince/be someone to do it before everyone else and use their newfound superpowers to heroically save the world", which is a terrible plan.
Reply Give gift
Nick Maley Feb 24
What if 'Breakthrough' AI needs to be embodied? What if Judea Pearl is basically right and
the real job is to inductively develop a model of cause effect relationships through interaction
with the physical world? What if the modelling of real world causality turned out to be
essential to language understanding? What would an affirmative answer to any or all of these
questions mean to the project of 'Breakthrough' AI?
To be a little more precise: The substrate independence assumption behind so much current
AI philosophising is dubious. Not because living brains have some immaterial spooky essence
that can't be modelled in silicon, but because living brains are embodied and are forced to ingest
and respond to terabytes of reinforcement training data every minute.
Reply Give gift
Mark Feb 24
There are other plausible ways to learn cause/effect relationships. Yann LeCun believes
self-supervised learning can get you there: for example, building an AI that can predict
subsequent (or missing) frames of video, by training on unlabeled unstructured video
content. I'd say at the point where you have an AI that can beat humans at predicting
what will happen next in any video footage of real world events, either that AI has a really
good causal model of the world, or those words don't mean anything.
(I think an "embodied" AI might be able to train faster given its ability to seek out
surprising causes and effects instead of being a passive observer, but it seems like the
result could be the same in principle).
Reply
Nick Maley Feb 24
Yes, there are other plausible ways to learn cause/effect relationships, and Pearl
and others have given us great descriptions of what they are. But I'm less
impressed by an AI's ability to predict a missing frame of video than I am by my
dog's ability to catch a ball in mid air. Now, you might say, well people have written
robot ball catching programs already. But they use the current AI paradigms and
they are just ball catching programs. They have no idea how to chew a bone, or
hunt prey, or greet another dog.
My point is that at the moment, we don't have AIs that can approximate even the
cognitive capabilities of reptiles, except in a small set of constrained domains.
Approximating the full capabilities of even smaller mammals like rats and mice is
still a distant goal. And that's before we've even started to think about what natural
language understanding really is. We don't even know in theory!
This is not to say breakthrough AI isn't possible. Just that the computer industry
grossly overhypes what is possible given the current paradigms. The problem isn't
that we don't have enough teraflops, or that we need some new algorithms. We
need to think about the problem differently, and pay more attention to what the
biologists are telling us.
Reply Give gift
Carl Pham Feb 24
Whoa. It's the Drake Equation for super-intelligent AI.
Reply Give gift
Carl Pham Feb 24
But I really like Platt's Law. It totally works, everywhere! In 1969 Moon colonies were 30
years away (cf. Kubrick's "2001"), in 1954 Lewis Strauss suggested fusion power too
cheap to meter was one and a half generations ("our children and grandchildren") away,
which is about 30 years. When Dolly the Sheep was born (1996) human cloning was said
to be achievable by the 2020s, Aubrey de Grey says immortality is quite possible within
30 years, Ray Kurzweil suggests The Singularity will happen in 25 years.
It's amazing. Clearly the common factor must be that all technological miracles have a
very similar underlying timescale, set by a symmetry of Nature yet to be comprehended,
or a mandate by Allah, hard to be sure which.
Reply Give gift
Toxn Feb 24
Human cloning has been achievable since 1996. It's just one of those times where
we collectively decided not to.
Reply Give gift
Carl Pham Feb 24
An interesting assertion. Being the empirical skeptic I am, however, like the
USPTO I would require an actual demonstration[1] -- i.e. the existence of an
actual human clone -- to take this assertion seriously.
-----------
[1] Nothing from the Raelians counts, of course.
Reply Give gift
Toxn Feb 25
Check out MPEP 2164 on enablement if you're going to lean on the
USPTO - the question of "undue" or "unreasonable" experimentation
arises. If we ignore the fact that all such experiments on humans would be
considered "unreasonable" in the moral sense of being unethical, and
leave it to the technical/legal definition of the term, then you could
actually argue that no "undue" experimentation is needed.
There's nothing special to suggest that the process of cloning a person
by somatic cell nuclear transfer is any different in humans than it is in a
sheep or a mouse. It's just very, very unethical and would result in at least
dozens of dead or damaged babies before you get a viable one. So we've
collectively decided not to, at least outside a few fringe cases like that
one Korean researcher...
Reply Give gift
Carl Pham Feb 25 · edited Feb 25
Sure there is. The requirement of "success" for a sheep is pretty
dang limited. It just has to successfully eat and poop and stand
around waiting to be eaten. If its natural IQ has been cut in half, or its
lifetime cut by 75%, nobody will care even if they notice. But they
certainly *will* care if that is true about a human clone. We are
exquisitely sensitive to what constitutes a "successful" human birth -
- this is why the malpractice insurance rates for OB-GYNs are so high
-- so we will be equally critical of what constitutes a "successful"
human clone. That it can be done at all is undemonstrated, as I said.
Reply Give gift
Carl Pham Feb 25
Sorry, I feel like I should add that I don't *disagree* that people
(competent people that is) have not *tried* to advance this field,
and that this is undoubtedly one reason why it has yet to be
demonstrated. I agree with that. I'm just disagreeing that this is
the *only* reason why it hasn't been demonstrated. There is
certainly a theoretical path forward one can take, based on
lower animals, but whether there are as yet unknown pitfalls and
problems with that path, nobody yet knows.
Reply Give gift
David Piepgrass Feb 26
You'd have to see the existence of an actual human clone to verify that
"we could have done it *but chose not to*?" If we could clone one
mammal, why not another?
Reply
Oleg Eterevsky Feb 24
A little bit of nitpicking:
1. GPT-3 training cost several million dollars (I seem to remember hearing it was $3 million),
probably more than AlphaStar.
2. You could run GPT-2 on a "medium computer", but not GPT-3. You would need at least 10-15
times the amount of GPU/TPU memory compared to a high-end desktop (rough arithmetic
sketched below). I'm not 100% sure, but I think OpenAI is currently running every GPT-3
instance split between several machines (they certainly had to do it for the training, according to their paper).
3. We are not really interested in the amount of FLOPS that evolution spent on training
nematodes, because we are at the point where we already can train a nematode-level AI or
even a bee-level AI, as you pointed out. So for the purposes of the amount of computation
spent by the evolution, I would only consider mammals. I wonder how many OOMs it shaves
off the estimation?
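For point 2, a back-of-the-envelope sketch; the figures (175B parameters, 2 bytes per weight in fp16, a 24 GB consumer GPU) are my assumptions rather than the commenter's, and activations and attention caches would only make the gap larger.

```python
# Rough memory arithmetic for serving GPT-3 (assumed figures, see note above).
PARAMS = 175e9            # GPT-3 parameter count
BYTES_PER_PARAM = 2       # fp16 weights only, ignoring activations and caches
DESKTOP_GPU_GB = 24       # a high-end consumer card

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9
print(f"weights alone: {weights_gb:.0f} GB")                        # ~350 GB
print(f"vs one desktop GPU: {weights_gb / DESKTOP_GPU_GB:.1f}x")    # ~14.6x
```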
Reply
Ace Feb 24
Putting aside an exact timeline for AGI for a moment, I've never understood why human-level
AGI is considered an existential threat (which seems to be taken for granted here). Are
arguments like the paperclip maximizer taken seriously? If that is the risk, then wouldn't
effective AI alignment be something like: Tell the AI to make 1,000,000 (or however many the
factory in the thought experiment cares to make) paperclips per month and no more. If the
concern is a poorly specified "maximize human utility", do we really think that anyone with
power would give it to the AI for this purpose? Couldn't we just make the AI give suggested
actions, but not the ability to directly implement them? Who has the motivation to run such a
program - it would destroy middle management and the C-suite! If we want to stop AI from
improving itself why don't we just not give it the ability to do so? I maintain that we could
engineer this fairly easily (at least assuming P != NP).
I haven't heard a convincing argument for what the doomsday scenario looks like post human
level AGI (even granting quick upgrade to superhuman levels). In particular to me, it seems a
superhuman AI is still going to need to exert a substantial amount of power in the real world
from the get go as well as suffer from inexact information (which makes outsmarting
someone at every turn impossible). Circling back to the paperclip example, at some point
before the whole world is turned to paperclips, it seems reasonable that a nation would be
able to bomb the factory. Even before that, how would the AI prevent someone from walking
in and "unplugging it" (I realize this may be shutting all its power off etc.).
I feel like a lot of worrying about AI can come from a fetishization of intelligence in the form of
"knowledge is power", but this just doesn't seem to be the case to me in the real world. Just
because humans are more intelligent than a bear doesn't mean that the bear can't kill the
human. I believe in the case of a superintelligent AI, humans would be able to just say "screw
you" to the AI and shut it down. Of course, there can be scenarios where the AI has direct
access to "boots on the ground" such as nanobots or androids. But the timeline for these to
overpower humans is certainly further out than 2030. I don't feel like indirect access to
manipulated humans would be enough.
My feeling is that a superintelligent AI at most may be able to gain a cult's worth of followers,
but not existential threat levels. I haven't heard a good argument of an existential threat that
isn't at least very speculative. Much more speculative than the statement "Multiple nuclear
states and hundreds of nuclear weapons will exist for 70 years and there will not be one
catastrophic accident". So my intuition is that AGI is unlikely to be an existential threat.
Reply Give gift
Melvin Feb 24
The whole "paperclip maximiser" thing was, I think, intended as a silly toy scenario to
illustrate a simple and easily visualisable example of how things could go wrong. I think
some people have taken it too seriously as an actual scenario, and I agree that it's not
actually likely to happen in those terms.
Taking a step back, the general class of AI problems looks like this: you have built a
powerful and inscrutable system, and it's doing things that aren't exactly aligned with
what you want. This general description doesn't just encompass planet-eating robots, it
also encompasses the kind of AI problems that we face today, like the way that the
Google search algorithm has become so good at targeting maximised engagement that
it gives you highly-engaging results rather than results which are related to your search
term.
As AIs become more powerful, more inscrutable, and more entangled into every aspect
of our society, problems like this become worse.
Reply Give gift
Deiseach Feb 24
"the way that the Google search algorithm has become so good at targeting
maximised engagement that it gives you highly-engaging results rather than results
which are related to your search term"
But that's a problem for us, the users. It's not a problem for Google (or Alphabet or
whatever they are calling themselves today) since that is working for them
(presumably it gives paid-for ad results or click-bait or things that will make Google
profitability go up rather than down). If the algorithm alienated enough users such
that everyone switched to Firefox, then Google would be motivated to fix it (or hit it
with a spanner and kill it).
Reply
Toxn Feb 24
"you have built a powerful and inscrutable system, and it's doing things that aren't
exactly aligned with what you want."
I, too, have a child. Wocka-wocka.
Reply Give gift
Ace Feb 24
I agree that already a huge amount of our life is controlled by out of sight
algorithms - high-frequency trading, loan applications, face recognition, etc., to
name a few that you didn't mention. There is no doubt that even
now, without AGI, there is a big influence on human life from these AI/algorithms. I
just don't think humans would let an AI that was actively hurting/killing people, or
gaining the power to do so, exist. I don't find convincing the argument that because it is
smarter, it can manipulate humans into doing anything whatsoever.
I do agree that as AIs become more powerful, the problems could become worse in
some ways (while probably improving human life in some other important ways),
but I see this as almost a tautology. As anything becomes more powerful, the
potential upside and downsides increase. My objection is to the immediate
characterization of AGI being "the end of the world" or an existential threat. That
scenario seems hard to see for me.
Reply Give gift
Mark Feb 24 · edited Feb 24
There are boundless examples of specification gaming, where the AI does something
you didn't mean for it to do, but that increases its reward function nevertheless. For
example, IIRC the original DeepMind Atari paper had cases where the AI found a bug in
the game that caused the score to go up without actually playing the game.
Specification gaming is relatively harmless when done by a sub-human AI. But, can you
imagine what might happen if an AI with vastly superior intelligence to your own did it?
Just to start with, even a human level AGI would know that it had better not let you find
out that it wants to game the reward function, since it knows you will turn it off in such a
case.
Regarding bears, I've never understood the use of animal counterexamples such as this.
Of _course_ the fact that you are more intelligent than the bear means you can kill it! Not
with your bare (ahem) hands, but you can show up with a rifle, a device which is
completely beyond the bear's understanding, and kill it without it ever seeing you. And if
you don't have a rifle, you can just avoid the bear's habitat, build a settlement
somewhere, invent agriculture and civilization, have an industrial revolution, buy a rifle at
walmart, and come back and kill the bear. Except by the time this happens, the bear
population will be less than 1% of what it used to be, and you will spend none of your
time worrying about bears.
The threat of AI is not that as soon as your AI reaches super-human intelligence that it
will shoot a bolt of lightning out of your USB port and kill you. The threat is that it will do
things that you literally cannot understand, and may not even be aware of, which will
result in your death, to which the AI will be indifferent.
Yes, that's super speculative and sci-fi ish, but we are talking about what will happen
Reply
Ace Feb 24 · edited Feb 24
My argument isn't that with enough material preparation and "power" in the real
world, that an AGI couldn't kill a human. An existing image recognition AI could kill a
human with an integration to a gun and instructions to shoot when the video feed in
front of the gun has a human in it. My argument is that the situations in which an
AGI gets this amount of material preparation seem improbable to me, since humans
will have more presence "on the ground". So to mix analogies, I think the question is
how the AGI is going to get the rifle in the first place.
I disagree that there was precedent for nuclear weapons as an existential threat.
The huge difference between nuclear weapons and asteroid strikes or volcanos is
that they lie in the control of an intelligent agent. Never before in history had a
nation state had the ability to destroy so thoroughly (with the possibility of
extinction through prospective mechanisms like nuclear winters).
I understand that imagination needs to be used in these cases. I disagree that any
scenario you can imagine that would lead to an existential threat is at all likely. But it
sounds like you already assign a lower chance to these scenarios than Scott and
others seem to.
Reply Give gift
Mark Feb 24
All I meant by the asteroid/volcano example is that we could already imagine
very big explosions. In contrast, I don't think we are equipped to imagine what
types of things a super intelligent agent will be capable of.
I think the AI will get the rifle because whoever has the AI will have tremendous
incentives to give the AI a rifle so that they can accomplish their own goals. It
will even be difficult for an AI safety researcher to resist giving the AI a rifle,
since if they do nothing, someone else will eventually build an AI that is less
safe than their own.
Reply
Toxn Feb 24
It's part of the tottering chain of assumptions needed for it to be a threat - no more, no
less.
Once you've dismissed every objection to AGI as a concept, executing AGI with anything
like today's hardware and software, AGI proving uncontrollable, AGI self-modifying to
achieve takeoff, and super-intelligent AGI being more or less omnipowerful and
omnipotent, then all that's left is to discuss alignment.
Alignment is one of the only steps in the chain you can discuss at all without throwing
the whole AGI-as-doomsday thing out altogether (and, horrors, discussing industrial
development, or economics, or something by accident), so AGI people spend a lot of
time discussing alignment.
Reply Give gift
quiet_NaN Feb 24
I think human level AI is roughly the point where a runaway process might start to
happen if AI replaces humans as AI researchers.
I don't think that "human level AI exists and all humans agree not to allow it to become
smarter" is a very stable equilibrium. Instead, it will likely lead to an AI arms race.
AIs not being able to defend themselves against direct human attacks is not a
convincing argument to me. Nuclear weapons are another existential risk for humans
despite being rather easily dismantled. Humans will not be able to turn off superhuman
level AIs for the same reason that peace activists don't spend much time dismantling
nukes: other humans with guns are in the way.
I think the impact of a powerful AI to our society could be illustrated by imagining a
multitude of very intelligent persons with a mental link to the internet appearing in the
bronze age naked. Of course any peasant can brain such a time-traveler with no trouble,
and no doubt some of them will quickly meet such an end. But eventually, one of them
will live long enough to prove his usefulness by 'inventing' something and become
advisor to some king. The king will be convinced, correctly, that he could have the time-
traveler beheaded at any time. But these "iron"-producing "bloomeries" were really
instrumental in defeating the hated enemy. Of course the king has some misgivings
about the increasing influence of the Priesthood of the Singularity founded by the time
traveler, but he has also heard rumors of another kingdom also having a time-traveler
advisor. So he feels it's better not to behead him until he has at least invented this
promised stuff called "gunpowder".
It could be pointed out that most of the stuff the time traveler relies on for his inventions
is the product of empirical research, not pure deductive reasoning. So a more…
Reply
Ace Feb 24
I never claimed human level AI would stay human level. I grant your scenario of the
AI becoming self-improving. I also grant that AIs may be relied on for many
decisions. I do not see why that is a priori an existential risk. You mention that
nukes remain, despite being a risk. I maintain that human organization will have
control over their AI in the same way they have control over their nukes. Despite the
US not wanting to disarm our nukes, we easily could if we desired.
Your examples point out cases where AI becomes more influential (which again, is
happening everyday already), but doesn't point out why this leads to extinction. I
have no doubt these intelligent AI would exist among the great powers of the world
(at least initially, due to computational power + keeping it a secret). And if these are
useful enough (I think it's a big if given imperfect information, as well as generals
wanting to keep their jobs) then they will probably exist in some sort of mutually
assured AI. (i.e. US doesn't want to give up their AI because China won't give up
theirs). I don't see how this makes a leap to existential risk. Multiple actors would
have potentially dangerous tools at their disposal, and this at most would be one
more. Wouldn't it still depend on geopolitics? US command structure is not going to
go to war because an AI told them to.
The world will undoubtedly be a different place with more influential AI, but again I
do not see what the actual sequence of events that leads to catastrophe is, are you
able to elaborate on that (preferably without time travel or demons :))?
Predicting that AI will cause extinction seems as hard as predicting geopolitics,
which is to say, hard.
Reply Give gift
Gruffydd Writes Gruffydd’s Newsletter · Feb 24
Scott are you going to EAGx at Oxford or London this year?
Reply Give gift
Scott Alexander Feb 24 Author
No, I don't want to fly to Europe again, and I don't want to take spots from people who
actually need those conferences to network or do whatever else you do at conferences
(I never figured it out).
Reply
Bugmaster Feb 24
> Five years from now, there could be a paradigm shift that makes AI much easier to build.
Well, yeah, there could be. But the problem is that, right now, we have no idea how to build an
AGI at all. It's not the case that we could totally build one if we had enough FLOPS, and we
just don't have enough GPUs available; it's that no one knows where to even start. You can't
build a PS5 by linking together a bunch of Casio calculator watches, no matter how many
watches you collect.
So, could there be a paradigm shift that allows us to even begin to research how to build
AGI? Yes, it's possible, but I wouldn't bet on it by 2050. Obviously, we are general intelligences
(arguably), and thus we know building AGI is possible in theory -- but that's very different
from saying "the Singularity will happen in 2050". There's a difference between hypothetical
ideas and concrete forecasts, and no amount of fictional dialogues can bridge that gap.
Reply
Thomas Feb 24
I can't help but think more about the learning/training side. You can have a human-level
intelligence and throw it at a task (e.g., driving a car). This task consumes only limited
resources (you can have a conversation while driving), but training (learning how to drive) is
much more intensive... and very dependent on the quality of teaching. Perhaps good training
data is a much more important factor than we make it out to be? There's plenty of evidence
that children with difficult backgrounds (=inferior, but generally similar training data)
measurably underperform their peers. For an AI, the variation in training data quality could be
much larger. Perhaps we are quite close to human-level performance of AI, and we are just
training them catastrophically badly?
Reply
Sheikh Abdur Raheem Ali Feb 24
Should software engineers move closer to the hardware then?
Reply
tgb Feb 24
But is Platt's law wrong? If you want to predict when the next magnitude 9 earthquake
occurs, you should predict X years, no matter what year it is, for some X. I think Yudkowsky is
basically including some probability of "a genius realizes that there's an easy way to make
AGI" - then the chance of that genius coming along and doing this might really have a
constant rate of occurrence and the estimate is always X years, for some X. Today's
predictions are conditioning on "it hasn't yet happened" and so should predict a different
number than yesterday's predictions.
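A quick illustrative check of the constant-rate reading (my own sketch, not tgb's): if the arrival time really were exponential with a 30-year mean, then "always about 30 years away" is exactly what a calibrated forecaster should keep saying, no matter how long they have already waited.

```python
import random

random.seed(0)
MEAN_YEARS = 30.0
# Draw a million hypothetical arrival times from an exponential distribution.
samples = [random.expovariate(1 / MEAN_YEARS) for _ in range(1_000_000)]

# Memorylessness: conditional on not having arrived after `waited` years,
# the expected remaining wait is still ~30 years.
for waited in (0, 10, 30, 60):
    remaining = [t - waited for t in samples if t > waited]
    print(f"waited {waited:>2} yrs -> mean remaining ~{sum(remaining)/len(remaining):.1f} yrs")
```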
Reply
quiet_NaN Feb 24
That is actually an interesting point. I think talking just about the mean and the assumed
probability distribution oversimplifies things.
For big earthquakes, one might assume that the time to the next one follows a simple
exponential distribution. That fact is key information.
I don't suppose that Yudkowsky really assigns significant probability to some lone
genius turning their raspi into an AGI tomorrow, but in general, he seems to anticipate
such black swan events, while his opponents are more about extrapolating past growth
rates.
Reply
Carl Pham Feb 24
I think you are probably factually correct, but what you are asserting is that there is no
deliberate development process for a superintelligent AI, it's just random chance
whether it happens or not (like the earthquakes, or like radioactive decay). If that is the
case, if it really is purely (or almost entirely) a stochastic process, then yes the mean
time until it happens does not depend on the date of the prediction (as long as it hasn't
happened yet).
What most people *want* to believe is that technological advance is under our control,
and by choosing to spend/not spend money, or devote/not devote talented person time
to it, we can remove most elements of randomness.
Reply Give gift
gugu Feb 24
[OFFTOPIC: Russia-Ukraine]
Very sorry for the offtopic, but as events unfold in Ukraine (Russia is invading Ukraine), I
would be very glad to see a discussion of this in the community.
Could someone point me to some relevant place, if such a discussion has already taken place,
or is currently going on here/LessWrong/a good reddit thread?
Thanks so much - maybe if Scott would open a special Open Thread?
Currently, to me it seems like Russia/Putin is trying to replace the Ukrainian government with
a more Russia-favoring one, either through making the government resign in the chaos,
executing a coup through special forces or forcing the government to relocate and then
taking Kiev and recognizing a new government controlling the Eastern territories as the
"official Ukraine".
I would be particularly interested in what this means for the future, eg:
- How will Ukrainian refugees change European politics? (I am from Hungary, and it seems
like an important question.)
- What sanctions are likely to be put in place?
- How will said sanctions influence the European economy? (Probably energy prices go up - what
are the implications of that?)
Reply Give gift
Njnnja Feb 24
“who actually manages to keep the shape of their probability distribution in their head while
reasoning?”
This is exactly the job description of Risk Managers (as opposed to business units, which care
about measures of central tendency such as the expected or most likely outcome).
One interpretation of what he is saying is that, like any good risk manager, he has a very good
idea about the distribution. But a large enough portion of that distribution occurs before any
reasonable mitigation can be established that the rest doesn’t matter. Given the risks we are talking
about, that is a scary conclusion.
Reply Give gift
John Slow Feb 24
Thank you for writing this post, Scott. This is a useful service for idiots like me who want to
understand issues about AGI but don't have the technical chops to read LessWrong posts on
it yet.
Reply
Thegnskald Writes Sundry Such and Other · Feb 24
The Elo vs. compute graph suggests that the best locally available intelligence "algorithm"
should take over in evolution, if only to reduce the amount of resources necessary to run the
minimum viable intelligence set. How structurally different are the specialized neural
structures?
Reply Give gift
John Slow Feb 24
I don't understand why the OLS line looks bad for the Platt's law argument. Aren't the two
lines almost exactly the same, hence strengthening Eliezer's argument?
Reply
Deiseach Feb 24
"Imagine a scientist in Victorian Britain, speculating on when humankind might invent ships
that travel through space. He finds a natural anchor: the moon travels through space! He can
observe things about the moon: for example, it is 220 miles in diameter (give or take an order
of magnitude). So when humankind invents ships that are 220 miles in diameter, they can
travel through space!
...Suppose our Victorian scientist lived in 1858, right when the Great Eastern was launched."
Then your Victorian scientist's estimations would become outdated in 1865, when Jules
Verne wrote "From The Earth To The Moon" and had his space travellers journey by means of
a projectile shot out of a cannon. So I (grudgingly) suppose this fits with Yudkowsky's
opinion, that it will happen (if it happens) a *lot* faster and in a *very* different way than
Ajeya is predicting.
But my own view on this is that the entire "human-level AI then more then EVEN MORE" is the
equivalent of H. G. Wells' 1901 version, where space travel for "The First Men In The Moon"
happens due to the invention of cavorite, an anti-gravity material.
We got to the Moon in the Vernean way, not the Wellsian way. I think AI, if it happens, will be
the same: not some world-transforming conscious machine intelligence that can solve all
problems and act of its own accord, but more technology and machinery that is very fast and
very complex and in its way intelligent - but not a consciousness, and not independent.
Reply
Melvin Feb 24
In retrospect it's interesting that nobody seems to have thought about space travel with
rockets. As far as I'm aware the Victorians had all the chemistry and materials science
they'd need to build a (terrible, dangerous, and certainly incapable of lunar travel) solid
fuel rocket.
Reply Give gift
Carl Pham Feb 24 · edited Feb 24
That's because rockets are stupid. The famous rocket equation tells you that a
ridiculously small amount of your mass can be payload, because of the idiotic
necessity of carrying fuel to burn to accelerate the rest of the fuel you are carrying
to burn to accelerate the payload. The mass of the Apollo 11 CSM was ~1% of the
fueled Saturn V/Apollo stack on the pad.
The efficient and clever thing to do is burn all your propellant on the ground,
accelerating your projectile all at once to whatever velocity you need. That way you
need only burn the propellant required to accelerate your payload, and none to
accelerate propellant -- or at least, in the case of Apollo for example, you only need
to accelerate the relatively tiny amount of fuel you need to lift off from the Moon
and re-insert into a return trajectory.
The reason we did use rockets is that nobody could figure out how to build a
big enough gun barrel that could accelerate human beings sufficiently gently to
survive the launch, and because we were in a big hurry and so throwing away 99%
of your exceedingly expensive hardware was acceptable if it meant we could beat
the Russkis to the Moon in time for RFK (had he lived) to win re-election.
But when people think soberly about truly efficient access to space, and allow
themselves to dream of advances in materials science and civil engineering, they go
back to the sensible Victorian notion of avoiding rockets and accelerating stuff you
throw away (or burn) -- so e.g. launch loops, space elevators.
But no, the Victorians could not have built a lunar rocket, or even an orbital one,
because the Hall-Héroult process was only invented in the 1880s and did not really
come into its own until the construction of massive hydroelectric dams in the
1930s. With the price of aluminum what it was in the late 19th century, an orbital
rocket would've probably taken a quarter of the GDP of Great Britain or France.
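A sketch of the arithmetic behind this (illustrative delta-v and exhaust-velocity numbers of mine, not Carl's): the Tsiolkovsky rocket equation makes the surviving mass fraction shrink exponentially with the required velocity change.

```python
import math

# Tsiolkovsky rocket equation: delta_v = v_e * ln(m0 / m1),
# so the final mass fraction is m1/m0 = exp(-delta_v / v_e).
def final_mass_fraction(delta_v_kms, v_exhaust_kms):
    return math.exp(-delta_v_kms / v_exhaust_kms)

V_E = 3.0  # km/s, roughly a kerosene/LOX exhaust velocity
print(f"to LEO (~9.4 km/s):          {final_mass_fraction(9.4, V_E):.1%} of pad mass survives")
print(f"to the Moon (~15 km/s total): {final_mass_fraction(15.0, V_E):.1%}")
# Tanks, engines and structure eat most of that, so useful payload lands around
# a few percent (or ~1% for a lunar mission), which is the comment's point.
```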
Reply Give gift
awenonian Feb 24
> Is that a big enough difference to exonerate her of “using” Platt’s Law? Is that even the
right way to be thinking about this question?
So, on the Platt's law thing. It's very weak evidence, but it is Bayesian evidence. Consider an
analogous scenario: You get dealt a hand from a deck, that may or may not be rigged. If you
get a Royal Flush of Spades, intuitively it feels like you should be suspicious the deck was
rigged. It's really unlikely to draw that hand from a fair deck, and presumably much more
likely to draw it from a rigged deck. But this should work for every hand, just to a lesser
extent.
If we assume that all reasonable guesses are before 2100 (arbitrarily, for simplicity), then
there are about 80 years to choose, being within 2 years of the "special" estimate (30 years,
I'll come back to 25, but 30 is easier), is a 5 year range in the 80 years, for odds of 1/16. This
is kinda close to the odds of drawing Two Pair in cards, so, how suspicious would you be that
the deck was rigged in that case? (25 years being the "special" one gives 15/80 or about 1/5;
there isn't a poker hand close to 1/5, but it's somewhere in between One Pair twice in a
row and One Pair of face cards.) That's about how suspicious it should make you of the
estimate (so, in my mind, not very). Likely this is getting a lot more air-time than it's doing
work.
(Caveat, I'm completely skimming over the other side, which is that it matters how likely the
cards would be drawn if the deck WAS rigged (i.e. how likely someone would rig THAT hand),
because I don't really know how to even estimate that. Just as a guess, if that consideration
pushes in favor of being suspicious, it might be the amount of suspicion if you drew Three of
a Kind, and MAYBE it could get as far as a Straight.)
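The comment's arithmetic, spelled out as a small sketch; the poker figures are the standard 5-card hand probabilities, included only for scale.

```python
YEARS_TO_CHOOSE = 80   # assume, as above, that all reasonable guesses fall before ~2100

def hit_probability(window_years):
    """Chance a uniformly random guess lands inside the given window."""
    return window_years / YEARS_TO_CHOOSE

print(f"within +/-2 years of 30: {hit_probability(5):.3f}   (~1/16)")
print(f"a 15-year window:        {hit_probability(15):.3f}  (~1/5)")

# Standard 5-card poker hand probabilities, for comparison.
poker = {"one pair": 0.423, "two pair": 0.0475, "three of a kind": 0.0211}
for hand, p in poker.items():
    print(f"{hand:>15}: {p:.3f}")
```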
Reply Give gift
mordy Feb 24
> ... suppose before we read Ajeya’s report, we started with some distribution over when we’d
get AGI. For me, not being an expert in this area, this would be some combination of the
Metaculus forecast and the Grace et al expert survey, slightly pushed various directions by
the views of individual smart people I trust. Now Ajeya says maybe it’s more like some other
distribution. I should end up with a distribution somewhere in between my prior and this new
evidence. But where?
It seems to me that there is no kind of expertise that would make one predictably better at
making long-term AGI forecasts. Indeed, experts in AI have habitually gotten it very wrong, so
if anything I should down-weight the predictions of "AI experts" to practically nothing.
I think I am allowed to say that I think all of the above forecasts methods are bad and wrong,
by simply looking at the arguments and disagreeing with them for specific reasons. I don't
think I am under any epistemic obligation to update on any particular prediction just because
somebody apparently bothered to make the prediction; I am not required to update on the
basis of the Victorian shipwright's prediction about spaceflight.
My opinion is that the whole exercise of "argument from flops" is doomed, and its doom is
overdetermined. Papers come out showing 3 OOM speedups in certain domains over SOTA -
not 3x speedups, 1000x speedups. How can this be, if we are anywhere close to optimizing
the use of our computational resources? How would we be seeing casual, almost routine
algorithmic improvements that even humbly double or 10x SOTA performance, if we were
anywhere near the domain where argument-from-flops-limitation would apply?
Reply
Scott Alexander Feb 24 Author
Grace asked the same experts to judge when lesser milestones would happen; I think it's
almost been enough time that we've had a chance to judge their progress. I would
update to trusting them more if they got those right.
Overall I don't think there's some incredibly noticeable tendency for AI experts to
mispredict AI. They were a bit fast with self-driving cars, a bit slow with Go, but absent
any baseline to compare them against they seem okay?
Reply
Mr. Doolittle Feb 24
Regarding Platt's Law, I sense a fundamental misunderstanding of why a prediction might
follow it. It's not a regimented mathematical system. It's something our brains like to do when
we think something is coming up soon, but we see no actual plottable path to reach it.
It's the same reason that fusion power is always 30 years off. It's soon enough to imagine it,
but long enough away that the intervening time can do all the work of figuring out how.
If no one has any idea *how* to create a human level AI, then no level of computational power
will be enough to get there. We could have 10^45 FLOP/s right now and still not have AI, if we
don't know what to do with them. Having the computer do 2+2=4 a ridiculous number of
times doesn't get us anywhere.
That doesn't mean human level AI cannot actually arrive in 30 years, but it also doesn't say
anything really about 10 years or 500 years. The fundamental problem is still *how* to do it. If
you get to that point, any engineer can plot out the timeline very accurately and everyone will
know it. Until then, you could say about anything you want.
As an experiment, throw billions of dollars into funding something that we know can't exist
now, but is maybe theoretically possible. Then ask the people in the field you've created to
tell you how long it will take. I bet the answer will be about 30 years, give or take a little bit.
They're telling you that they don't know, but had to provide an answer anyway.
Reply Give gift
Chris Feb 24
I'm still ankle-deep in the email and haven't looked at the comments, but it got me thinking: if
we've been making a lot of progress recently by spending more, how much will the effort be
stymied by interest rate increases? How about war?
Reply
dyoshida Feb 24
> So maybe instead of having to figure out how to generate a brain per se, you figure out how
to generate some short(er) program that can output a brain? But this would be very different
from how ML works now. Also, you need to give each short program the chance to unfold into
a brain before you can evaluate it, which evolution has time for but we probably don’t.
Doesn't affect any overall conclusions, but there's a decent amount of research that would
count as being in this direction I think. Hypernetworks didn't really catch on but the idea was
to train a neural network to generate the weights for some other network. There's also
metalearning work on learning better optimizers, as well as work on evolving or learning to
generate better network architectures.
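For readers who haven't seen the idea, a minimal hypernetwork-style layer (my toy construction in PyTorch, not taken from any particular paper the commenter has in mind): a small generator network emits the weights of a target linear layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperLinear(nn.Module):
    """A linear layer whose weights are produced by another (hyper)network."""
    def __init__(self, embed_dim, in_features, out_features):
        super().__init__()
        self.in_features, self.out_features = in_features, out_features
        self.weight_gen = nn.Linear(embed_dim, out_features * in_features)
        self.bias_gen = nn.Linear(embed_dim, out_features)

    def forward(self, x, task_embedding):
        # The target layer's parameters are generated on the fly, not stored.
        w = self.weight_gen(task_embedding).view(self.out_features, self.in_features)
        b = self.bias_gen(task_embedding)
        return F.linear(x, w, b)

layer = HyperLinear(embed_dim=8, in_features=16, out_features=4)
x, z = torch.randn(32, 16), torch.randn(8)
print(layer(x, z).shape)  # torch.Size([32, 4]); gradients flow into the generator
```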
Reply
gugu Feb 24 · edited Feb 24
> But also, there are about 10^15 synapses in the brain, each one spikes about once per
second, and a synaptic spike probably does about one FLOP of computation.
This strikes me as very weird - humans can "think" (at least react) much faster than once per
second. If synapses fire only once per second, and synapse firings are somehow the atomic
units of computation in the brain, then how can we react, let alone think complex thoughts
(which probably require some sequence of calculation steps), orders of magnitude faster
than that?
Am I missing something? It seems either the metric of synapses is wrong, or the speed.
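The anchor arithmetic being questioned, written out across a few assumed average firing rates; only the ~10^15 synapses and ~1 FLOP per spike figures come from the quoted passage, the other rates are hypothetical.

```python
SYNAPSES = 1e15        # synapse count from the quoted estimate
FLOP_PER_SPIKE = 1     # assumed computational cost per synaptic spike

for rate_hz in (0.1, 1, 10, 100):
    flops = SYNAPSES * rate_hz * FLOP_PER_SPIKE
    print(f"average firing rate {rate_hz:>5} Hz -> ~{flops:.0e} FLOP/s for the whole brain")
```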
Reply Give gift
JQXVN Feb 24
I don't know how good an estimate this is, but a given action potential can occur in
response to a stimulus more quickly than the average interval between two action
potentials, so I wouldn't rule it out just for this reason.
Reply Give gift
JQXVN Feb 24
To elaborate on that a little, if there were such a thing as a "grandmother neuron" (a
neuron tuned to respond only to a specific person) and you only saw your
grandmother once per year, the average rate of fire of the grandmother neuron
would be very low, but that doesn't mean it would take you days to recognize your
grandmother when you met her.
Reply Give gift
Carl Pham Feb 24
We probably need to be careful about measuring the speed of thought. Our thought
(unlike computers) is parallel on a gargantuan level. We're talking 100 billion
simultaneously operating CPUs. So what they can get done in just one clock step is
stupendous. (*How* this is done -- what the ridiculously parallel algorithm *is* is one of
those enduring and frustrating mysteries.) I would agree an effective clock speed of 1 Hz
seems a bit slow, given we can and do have significant mental state changes in less
time, e.g. the time between seeing a known face and experience recognition. But it can't
really be a lot faster. Maybe 10-100Hz or so, because signals don't go down nerves very
fast, which isn't that much of a change.
Reply Give gift
a real dog Feb 25
The atomic unit of computation would be the neuron depolarization-repolarization cycle,
which seems to take around 5ms. I assume synapses might introduce additional delays
depending on their type.
Reply Give gift
bean Feb 24
I'm pretty sure it's my job to point out that Great Eastern was an outlier and should not really
be counted in this stuff. It was Brunel trying to build the biggest ship he technically could,
without any real concern over whether it would be economically viable, and the result was a
total failure on the economics front. There's a reason it took so long for other ships to reach
that size.
Reply
Mark Feb 24
I think one reason for Platt's law may be that Fermi estimates (I'd class the Cotra report as
basically a Fermi estimate) suffer from a meta-degree of freedom, in that the human
estimator can choose how many factors to add into the computation. For instance, in the
Drake equation, you can decide to add in a factor for the percentage of planets with Luna-
sized moons if you think that having tides is super important for some reason. Or you can add
in a factor for the percentage of planets that don't have too much ammonia in their
atmosphere, or whatever. Or you can remove factors. The point is that the choice of factors
far outweighs the choice of the values of those factors in determining your final estimate.
I don't think that Cotra is deliberately manipulating the estimate by picking and choosing
parameters, but it seems clear that early in such an estimation process, if you come up with a
result showing that AI will arrive in 10,000 years or 3 months, you're going to modify or
abandon the framework you're using because it's clearly producing nonsense. (Not that AI
couldn't arrive in 3 months or 10k years - but it doesn't seem like a simple process that
predicted either of those numbers could possibly be reliable).
Or maybe your bounds of plausibility are actually 18 months to 150 years. It's not too hard to
see how this could cause a ~30 year estimate to be fairly overdetermined due to unconscious
bias toward plausible numbers, and more importantly, toward numbers that _seem like they
could plausibly be within the bounds of what a model like yours could accurately predict_.
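A toy version of the "meta-degree of freedom" point; every number below is made up, and only the structure of a Drake-style product matters.

```python
import math

base_factors = [1e11, 0.2, 0.5, 0.1, 0.01]   # made-up Drake-style factors
optional_factor = 0.1                        # e.g. "needs a Luna-sized moon"

without = math.prod(base_factors)
with_extra = without * optional_factor
print(f"without the extra factor: {without:.1e}")
print(f"with it:                  {with_extra:.1e}  ({without / with_extra:.0f}x apart)")
# Adding or dropping one modest factor moves the estimate by an order of magnitude,
# which swamps reasonable disagreement about any single factor's value.
```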
Reply
Carl Pham Feb 24
Nobody would *believe* an estimate of less than 10 years without some very plausible
and detailed argument about how it happens. Nobody would *care* about an estimate of
more than 50-75 years, because most who read it won't live to see it. So...if you're going
to produce something that people will read, that will get published and noticed, you
pretty much have to come up with a number between 20 and 40 years.
Reply Give gift
Mark Feb 25
Agreed - what I'm proposing is a mechanism by which the result you describe
happens, such that you can't point to obvious signs of fine-tuning within the
analysis.
Reply
Carl Pham Feb 25
I think it's probably pretty simple. You do your analysis, and if the answer is <
10 years *but* you don't have detailed credible plans, or the answer is >100
years, you just never publish it, because you already know either you'll get
fierce blowback or nobody will care. Survivorship bias.
The fact that only (or mostly) sincere people are the authors is also similarly
explicable. If you *realized* you were just customizing your model to fit the
average human lifespan, you would also not be likely to publish, or be
persuasive if you did. So the only people who end up publishing are those who
have first talked themselves into believing that the correlation between
futurism timescale and human lifespan is pure coincidence.
Reply Give gift
Mark Feb 25
Except the Cotra report was in some sense pre-registered (in that
OpenPhil picked someone and asked them to do a report), so I don't think
publication bias can do any work here.
Reply
philh Feb 24
> The median goes from 2052 to about 2050.
I think this is a mistake; the median of the solid black line goes to around 2067, with "chance
by 2100" going down from high-70%s to low-60%s.
Reply
Scott Alexander Feb 24 Author
Thanks, I seem to remember it was 2050 at some point, maybe I posted the wrong
graph. I'll update it for now and try to figure out what happened.
Reply
Will Writes Will the doge · Feb 24 · edited Feb 24
1. correction to "Also, most of the genome is coding for weird proteins that stabilize the
shape of your kidney tubule or something, why should this matter for intelligence?"
source: "at least 82% of all human genes are expressed in the brain"
https://www.sciencedaily.com/releases/2011/04/110412121238.htm#:~:text=In%20addition%2
C%20data%20analysis%20from,in%20neurologic%20disease%20and%20other
2. The bitcoin network does the equivalent of 5e23 FLOPS (~5000 integer ops per hash and
2e20 hashes per second; assuming 2 integer ops are worth 1 floating point op; see the
sketch below). This is 6 orders of magnitude bigger than that Japanese supercomputer,
because specialized ASICs do a lot more operations per watt than general purpose CPUs.
Bitcoin miners are compensated by block rewards at a rate of approximately $375 per
second, so that's about 1e21 flops/$. This is 4 orders of magnitude higher than the estimate
of 10^17 flops/$. If there were huge economies of scale in producing ASICs specialized for
training deep neural nets, we could probably expect the former 1e21 flops/$ at current
technology levels. Bitcoin ASICs also seem to still be doubling efficiency every ~1.5 years.
3. correction: "The median goes from 2052 to about 2050"
The median is where cumulative probability is 0.5, and on your graph it's in 2067. If you mean
the median of the subset of worlds where we get AI before 2100, then it's a cumulative
probability of 0.3 in 2045.
4. The AI arrival estimate regression line's higher slope than Platt's law seems rational,
because from an outside view, the longer it's been since the invention of computers without
having AGI yet, the longer we should expect it to take. (But on the inside view, this article is
making me shift some probability mass from after-2050 to before-2050)
5. Clarification: "human solar power a few decades ago was several orders of magnitude…"
Reply Give gift
Laurence Feb 24
> "at least 82% of all human genes are expressed in the brain"
Do we have a baseline comparison for that? I'm guessing that 70% of those genes are
actually critical for basic eukaryotic cell metabolism.
Reply
Will Writes Will the doge · Feb 24
This paper claims 66% of human genes are expressed in the skin:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4305515/#:~:text=Approximately%2
066%25%20of%20all%20protein,13%2C044%20and%2013%2C329%20genes%
2C%20respectively.
This paper claims 68% of human genes are expressed in the heart:
https://www.proteinatlas.org/humanproteome/tissue/heart
This paper claims 68% of human genes are expressed in the kidney:
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0116125#:~:text=
(c)%20The%20classification%20of%20all,coding%20genes%20expressed%20in
%20kidney.
After looking up these baselines, 82% is less impressive.
Reply Give gift
quiet_NaN Feb 24
Also, we share 98.8% of our genes with the chimps, which likely have a
significantly lower average intelligence.
I think the hard part in building a human is building an eukaryote, and perhaps
a vertebrate. Going from that to mammals is a small step, and finally tuning up
the brain size seems like a minor effort.
Reply
Will Writes Will the doge · Feb 24
I have higher confidence that we'll get biological superintelligence by
2050 than that we'll get artificial superintelligence by 2050. China or
somebody will be engineering the genetics and environment of babies to
make genius the average.
I guesstimate there are 30,000 people with an IQ over 160 in the entire world, but
genetic engineering could 1000x that and provide much faster progress in
science, technology, and AI safety research.
Reply Give gift
Carl Pham Feb 25
Why do we need *super* intelligence? Just imagine a world in which
the average IQ moves up a mere standard deviation, to 115 instead of
100. That would drastically cut the number below 85, which certain
people have argued forcefully is where our criminal and dependent
class comes from. It would mean the typical line worker would be the
equivalent of someone who graduates from a good college with
excellent marks today -- the kind of person who could be admitted to
law or medical school. By contrast, the equivalent of college
graduates today, a big slice of our cultural and technological leaders,
would become the equivalent of people who can get PhDs in high-
energy physics and ancient Greek literature. Imagine choosing your
candidate for Senator from among a dozen people who can all speak
3-4 languages fluently, who readily comprehend relativity and
quantum mechanics, who have written original research papers on
economics.
And then of course our smartest people, the folks who earn faculty
positions at Stanford or win Nobel Prizes, would all be Einsteins and
Newtons. Imagine a few hundred Einsteins and Newtons, in a world
run by people with the smarts of your average engineering professor
at Caltech, and with a hundred-million strong workforce as smart as
a competent physician with a degree from Columbia, and with an
almost complete absence of violent crime, drug addiction, et cetera.
That would be stupendous.
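A quick check of the first claim under the conventional IQ scaling (mean 100, SD 15); the scenario is the comment's, the normal model is my assumption.

```python
from statistics import NormalDist

before = NormalDist(mu=100, sigma=15)   # today's distribution
after = NormalDist(mu=115, sigma=15)    # the comment's one-SD shift

print(f"share below IQ 85 today:           {before.cdf(85):.1%}")   # ~15.9%
print(f"share below IQ 85 after the shift: {after.cdf(85):.1%}")    # ~2.3%
print(f"share above 130 after the shift:   {1 - after.cdf(130):.1%}")  # ~15.9%
```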
Reply Give gift
a real dog Feb 25
https://www.proteinatlas.org/humanproteome/tissue
For a graphical overview.
Reply Give gift
ProfGerm Feb 24
Has anyone compared prediction timelines to the estimated lifetime of the predictor?
I have a vague memory of someone looking at this on a different topic, but I couldn't turn it
up in a quick search; the idea is that for [transformative complex development] people have a
tendency to predict it late in their life, but within a reasonable margin of having not yet died of
old age before it happens.
How many researchers and Metaculus predictors will be 70-80 in 2050, such that their
prediction is, perhaps unconsciously, really a hope to achieve Virtual Heaven?
Alternatively, what else does Platt's "law" apply to? Aren't flying cars always 20-30 years
away? Nuclear fusion? Is this just the standard "close, but not too close" timeline for *any*
technological prediction?
Reply Give gift
Sandro Feb 24
> “any AI forecast will put strong AI thirty years out from when the forecast is made.”
There's probably a stronger version of this: any technology that seems plausibly doable but
we don't quite know how to do, probably seems about 30 years away.
10 years away is the foreseeable timeline of current prototypes and has relatively small error
bars. 20 years away is the stuff that's being dreamed up right now and has larger error bars
(innovation is risky!). 30 years away consists of things that will be invented by people who
grow up in an environment where current prototype tech is normal and the next gen stuff is
just on the horizon.
How these people will think about problems is fundamentally unpredictable. Just
think of all the nonsense that was said by computer "experts" in the 60s and 70s prior to the
PC.
Reply Give gift
sclmlw Feb 24
Admitting the perils of overfit from historical examples, I think there's more to learn from the
history of the field of AI research than just FLOPs improvements. Yes, computers beat the
best humans in chess, but then later researchers refined the process and discovered that
when humans and machines combined their efforts, the computer-human teams beat the
computers alone. This seems like a general principle we should apply to our expectations of
computer intelligence moving forward.
Calculators are much better than humans, but instead of replacing human calculation ability
they enhanced it. Spreadsheets compounded that enhancement. Complex graphing
calculators did the same. Sure, calculus was invented (twice!) without them, but the concepts
of calculus become accessible to high school students when you include graphing
calculators, and statistics become accessible when you load up a spreadsheet and play
around with Monte Carlo simulations.
I think what we're missing is how this contributes to Moore's Law of Mad Science. It gives IQ-
enhancing tools to the masses. But it's also giving large tech companies tools that might
accidentally drive mass movements of hatred, hysteria, and war. And that's just because they
don't know what they're doing with it yet. How much worse off will we be when they figure
out how to wield The Algorithm effectively? And why are we not talking about THIS massive
alignment problem?
What if we destroy ourselves with something else, before we get all the way to AGI? We're
already creating intelligence-enhancing tools that regular human operators can't be trusted
to handle. Giving god-like power to a machine is certainly terrifying, because I don't know
what it might do with that power. But I have some idea what certain people around the world
would do with that kind of power, and I'm equally terrified. Especially because those people
Reply Give gift
Scott Alexander Feb 24 Author
Wasn't there a very short period during which computer+human beat computer alone,
after which the humans were useless? Did that period even happen at all in Go, or did we
just jump over it?
I agree it's possible we kill each other before AI, though this seems probably to just be
normal nuclear/biological weapons. I can't really think of a way AI would be worse than
this before superintelligence.
Reply
sclmlw Feb 24
Skeptics of concerns about AGI often point out that there's a difference between
domain-specific AI and general intelligence. They claim that the potential for
accidentally producing general intelligence on the road to domain-specific
intelligence is a hypothesis that may turn out to be more hype than substance.
Whether the hypothesis of accidentally creating AGI is true or not, we're obviously
expanding domain-level AI by leaps and bounds. If domain-level AI allows ordinary
humans to do the kind of things that we're worried about AGI doing, it seems like
the kind of thing we should all be able to agree is a concern - whether it's done by
humans or intelligent computers. (For example, blackmailing politicians using deep-
fake 'evidence' of transgressions they never committed.)
It also seems like a bridge to the AGI-skeptic community. If they can't accept that
AGI is a risk that should be addressed, at least they can appreciate that misaligned
technologies have a long history of negative outcomes, and the current
development of AI is poised to dump a bunch of new capabilities in our laps. As a
bonus, perhaps safeguards on domain-level AI would also help protect against the
general problem?
Reply Give gift
a real dog Feb 25
I think an arsenal of domain-specific AIs + a human inside will be a dominant model
for decades before we reach AGI.
I'm not sure if replacing the human in the centre with an AGI would be an
improvement, given that someone has to be the interface between the scary pile of
compute and human goals to make any kind of use of it anyway.
Reply Give gift
sclmlw Feb 25
That's what I'm thinking. It also has the benefit of being applicable right now,
since we have a lot of alignment problems already. I'm not saying the world
would be SAFE with an AGI that has only Stone Age tools at its disposal, but
we don't have to make it easy to destroy the world.
Even if we did solve the alignment problem for AGI, if we don't ALSO solve the
alignment problem for non-general AI we're still creating tools that have severe
civilization-disrupting potential. (cf the current covert cyber war among the
US, Russia, China, Israel, etc.)
Whatever happens with AGI, this is a problem that has to be solved. Since
Scott keeps subtly hinting that there's a lot of funding for AI research, this
seems like a sub-specialty we should be heavily covering. I'm not familiar with
the field, though. Is this something we're working on, or is most of the effort
focused on AGI alignment?
Reply Give gift
Dirichlet-to-Neumann Feb 25
There was a period where human+computer beat computer alone, but it was short
and ended a few years ago.
However I think two things are important to notice: 1) humans have improved since
Deep Blue beat Kasparov. I would bet that Carlsen would beat Deep Blue rather
easily, and may even win against 2005 top-level engines.
2) I may be mistaken, but it seems that neural networks play chess in a way that is
much more human-like than classical engines - it's easier for us to understand the
motivations of their moves despite them being stronger. I don't know how to
interpret that but I find it interesting.
Reply Give gift
Fishbreath Feb 28
2 is interesting, since commentators during the AlphaGo-Lee Sedol game were
struck by how unusually AlphaGo played by human standards (certainly
compared to previous Go AIs, which encoded a lot of expert knowledge).
Reply Give gift
meteor Feb 24 · edited Feb 24
I think I'm in the "this report is garbage and you should ignore it completely" camp (even
though I have great respect for Ajeya Cotra and the report is probably quite well done if you
apply some measure that ignores the difficulty of the problem). You basically have
- Extreme uncertainty about many aspects within the model, as admitted by Cotra herself
- Strong reasons to suspect that the entire approach is fundamentally flawed
- Massive (I'd argue) potential for other, unknown out-of-model errors
I think I give even less credit to it than Eliezer in that I don't even believe the most
conservative number is a valid upper-bound.
SEPARATELY, I do just want to say this somewhere. Eliezer writes this post calling the entire
report worthless. The report nonetheless [does very well in the 2020 review]
(https://www.lesswrong.com/posts/TSaJ9Zcvc3KWh3bjX/voting-results-for-the-2020-review),
whose voting phase started after Eliezer's post was published, and it wins the alignment
forum component in a landslide. Afaik I was literally the only person who gave the post a
negative score. So can we all take a moment to appreciate how not-cultish the community
seems to be?
David Piepgrass Feb 25
> I think I'm in the "this report is garbage and you should ignore it completely" camp
Then I wonder what you'll think of my take downthread.
> So can we all take a moment to appreciate how not-cultish the community seems to
be?
In the sense that LWers feel free to disagree with Eliezer? I do appreciate that.
meteor Feb 25
> In the sense that LWers feel free to disagree with Eliezer?
yeah.
> Then I wonder what you'll think of my take downthread.
Full agreement with the first, descriptive part (doesn't seem like you said anything
speculative), mild disagreements (but none I feel the need to bring up) with the last
four paragraphs.
bcaleds Feb 24 · edited Feb 24
I'd be curious to see (if anyone has any resources) the historical split and trend over time of
compute costs, broken down into each of the following three components:
- Chip development costs/FLOP.
- Chip production costs/FLOP.
- Chip running costs/FLOP (probably primarily electrical costs now).
I ask in relation to a concern with extrapolating historical rates of cost declines going forward.
It's possible that the components of cost with the most propensity to be reduced will become
an increasingly small share of cost over time. As such, the costs that remain may be
increasingly difficult to reduce. This is a low-confidence idea as I don't know a ton about chip
design, and there are plenty of reasons why extrapolating from the general trend might be
right (e.g. perhaps as something becomes an increasing component of cost we spend more
effort to reduce it).
That said, it would be interesting to see whether extrapolating future cost reductions from
past ones would have performed well in other industries with longer histories - i.e., how have
the real costs of steel or electricity gone down, and how have the shares of costs from
different inputs shifted?
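As a toy sketch of that worry (all numbers below are invented for illustration, not taken from the report or from any real cost data): if cost/FLOP is the sum of components that decline at different rates, a trend fitted to the aggregate will eventually overstate future declines, because the slowest-falling component comes to dominate.

```python
# Hypothetical illustration: extrapolating an aggregate cost/FLOP trend vs.
# tracking its components separately. All starting shares and decline rates
# are invented for the sake of the example.

def total_cost(year, components):
    """Sum of per-FLOP cost components, each declining at its own annual rate."""
    return sum(c0 * (1 - rate) ** year for c0, rate in components)

# (initial cost share, annual decline rate) for three made-up components:
components = [
    (0.50, 0.30),  # chip development, falls quickly
    (0.40, 0.25),  # chip production
    (0.10, 0.02),  # electricity, falls slowly
]

c0 = total_cost(0, components)
c10 = total_cost(10, components)
aggregate_rate = 1 - (c10 / c0) ** (1 / 10)  # decline rate fitted to the first decade

for year in (0, 10, 20, 30):
    naive = c0 * (1 - aggregate_rate) ** year  # extrapolated aggregate trend
    actual = total_cost(year, components)      # component-wise trajectory
    print(f"year {year:2d}: naive extrapolation {naive:.4f}, component model {actual:.4f}")
```

Under these made-up rates, the aggregate-trend extrapolation predicts costs roughly 30x lower at year 30 than the component-wise trajectory, because by then the slowly declining electricity term dominates.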
Totally separately, should we expect the rate of algorithm development and learning to
decline as the cost of training single very large models and then evaluating their performance
increases drastically? My intuition is that as the cost of iteration and learning increases (and
the number of people with access to sufficient resources decreases) we should expect a
larger proportion of gains to come from compute advance as opposed to algorithm design,
but this is something I have close to 0 confidence in.
John Schilling Feb 24
Yes to this, and also add to the list the up-front cost of building the fab for the clever
chip you've just developed. You could fold that into "chip development" or "chip
production", but the fixed cost of the fab is a different kind of thing than the brainwork of
developing the chip or the marginal unit cost once the plant is running, and so may be
subject to different scaling.
bcaleds Feb 25
John, I agree - I was somewhat folding fab cost into "production" insofar as at the
margin a doubling of chip production would require a doubling of fab construction
for any generation of chip (I realize for each manufacturer, fabs are not a marginal
cost, but long term for the industry they seem to scale proportionally to # of chips
in a way that R&D does not). Happy to be corrected if that is wrong.
I am curious if you have a different sense, but my instinct (uninformed by data) is
that energy/electricity costs have been becoming an increasingly large proportion
of compute costs. I think it is very unlikely that electricity prices follow quite the
exponential decay implied by the paper, but I suppose it is possible that efficiency
increases exponentially with a 2-5 year doubling, given the brain operates on only
20W. Do you have any expertise that could falsify either of the above?
Gerry Quinn Feb 24
"For the evil that was growing in the new machines, each hour was longer than all the time
before."
Artischoke Feb 24
My hunch is that Eliezer is right about the problem being dominated by paradigm shifts, but
that they usually involve us realising how much more difficult AGI is than we thought, moving
AGI another twenty-odd years out from the time of the paradigm shift. A bit like Zeno's
paradox, except the turtle is actually 100 miles away and Achilles just thinks he is about to
catch up.
That being said I am bullish on transformative AI coming within the next 20 years, just not
AGI.
Isaac King Feb 24
> and other bit players
I think this should be "big players"
Garrett Feb 24
Something to consider is that there isn't yet the concept of agency in AI and I'm not certain
anybody knows how to provide it. The tasks current impressive production AI systems do
tend to be of the "classify" or "generate more like this" categories. Throwing more
compute/memory/data at these systems might get us from "that's a picture of a goldfish" to
"that's a picture of a 5-month old goldfish who's hungry", or from what GPT-3 does to
something that doesn't sound like the ravings of an academic with a new-onset psychiatric
condition.
None of these have the concept of "want".
meteor Feb 25 · edited Feb 25
I think the relevant property of "agency" is "running a search", which AlphaZero already
does. (Though GPT-3 does not.) The reason it doesn't feel like an agent to you is that
the domain is narrow, but there is no qualitative thing missing.
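To make "running a search" concrete, here is a minimal toy sketch of the idea (the environment and value function are made up; this is not a claim about AlphaZero's actual architecture, which uses Monte Carlo tree search guided by learned policy/value networks):

```python
# Minimal sketch of "agency as running a search": the system picks whichever
# action scores best under its evaluation of the resulting state.
# The toy environment and value function below are hypothetical stand-ins.

from typing import Callable, Iterable, TypeVar

State = TypeVar("State")
Action = TypeVar("Action")

def choose_action(
    state: State,
    legal_actions: Callable[[State], Iterable[Action]],
    simulate: Callable[[State, Action], State],
    value: Callable[[State], float],
) -> Action:
    """One-ply search: evaluate each action's successor state and take the argmax."""
    return max(legal_actions(state), key=lambda a: value(simulate(state, a)))

# Toy usage: an "agent" that wants a counter to reach 10.
if __name__ == "__main__":
    state = 0
    actions = lambda s: (-1, +1)          # can decrement or increment
    step = lambda s, a: s + a             # deterministic transition
    goal_value = lambda s: -abs(10 - s)   # closer to 10 is better
    while state != 10:
        state = step(state, choose_action(state, actions, step, goal_value))
    print("reached", state)
```

The point is only structural: behavior gets selected by optimizing over imagined outcomes, which is what distinguishes it from a pure pattern-completer in the sense meteor describes.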
David Piepgrass Feb 25 · edited Mar 14
Thanks for criticizing Ajeya's analysis. Insofar as you summarized it correctly, I was furrowing
my brow and shaking my head at several crazy-sounding assumptions, for reasons that you
and Eliezer basically stated.
My model: current AIs cannot scale up to be AGIs, just as bicycles cannot scale up to be
trucks. (GPT2 is a [pro] bicycle; GPT3 is a superjumbo bicycle.) We're missing multiple key
pieces, and we don't know what they are. Therefore we cannot predict when exactly AGIs will
be discovered, though "this century" is very plausible. The task of estimating when AGI
arrives is primarily a task of estimating how many pieces will be discovered before AGI is
possible, and how long it will take to find the final piece. The number of pieces is not merely
unpredictable but also variable, i.e. there are many ways to build AGIs, and each way requires
a different set of major pieces, and each set has its own size.
Also: State-of-the-art AGI is never going to be "as smart as a human". Like a self-driving car
or an AlphaStar, AIs that come before the first AGI will be dramatically faster and better than
humans in their areas of strength, and comically bad or useless in their areas of weakness.
At some point, there will be some as-yet unknown innovation that turns an ordinary AI into an
AGI. After maybe 30,000 kWh of training (give or take an OOM or two), it could have
intelligence comparable to a human *if it's underpowered*: perhaps it's trained on a small
supercomputer for a while and then transitioned to a high-end GPU before we start testing its
intellect. Still, it will far outpace humans in some ways and be moronic in other ways, because
in mind-design-space, it will live somewhere else than we do (plus, its early life experience
will be very different). Predictably, it will have characteristics of a computer, so:
- it won't need sleep, rest or downtime (though a pausing pruning process could help)
- it will do pattern-matching faster than humans, but not necessarily as well
Mark Feb 25
Did we add pieces to GPT-2 in order to add the few-shot learning / promptability
possessed by GPT-3? My understanding is no, it was emergent behavior caused by
scaling.
Do we have a theory of what types of behavior can and can't emerge due to scale?
David Piepgrass Feb 26 · edited Feb 26
As far as I know, GPT2 does "few shot learning" in the same sense as GPT3, but
they didn't publish a paper on it, and GPT3 does it substantially better.
Edit: and I think people misunderstand GPT in general, because to humans, words
have meanings, so we hear GPT speak words and we think it's intelligent. I think the
biggest GPT3 is only intelligent in the same sense as a human's linguistic
subsystem, and in that respect it's a superintelligence: so far beyond any human
that we mistake it for having general intelligence. But I'm pretty sure GPT3 doesn't
have *mental models*, so there are a great many questions it'll never be able to
answer no matter how far it is scaled up (except if it's already seen an answer that it
can repeat.)
Mark Feb 27
Thanks for the info about GPT-2. I tried to find examples of prompting / few
shot learning in GPT-2 but GPT-3 dominates the search results. Do you have
any handy?
David Piepgrass Mar 14
I looked, but couldn't find a publicly-available GPT2 that seemed to use a big
model like the one that wrote decent JRR Tolkien. Oddly, none of the
public-facing models I saw disclose the model size or training info.
Hoopdawg Feb 25 · edited Feb 25
All right.
I like what the report is saying (not that I've read it, just going off Scott's retelling of its main
points), and it's reassuring me that the people working on it are competent and take every
currently recognizable factor of difficulty into account.
I nevertheless think it's erring in the exact direction all earlier predictions were erring, which
is the exact opposite of where Eliezer thinks it's erring. I.e., they understand and price in the
currently known obstacles and challenges on the road to AGI; they do not, because they
cannot, price in the as-of-yet unknown obstacles that will only make themselves apparent
once we clear the currently pertinent ones. E.g., you can only assume power consumption is
the relevant factor if you completely disregard the difficulty and complexity of translating that
power into (more relevant) computational resources. Then, with experience, you update to
thinking in terms of computational resources, until you get enough of them to finally start
working on translating them into something even more relevant, at which point you update to
thinking in whatever measures the even more relevant thing. (Or don't update, and hope the
newly discovered issues will just solve themselves, but there's little reason to listen to you
until you actually provide a solution to them.)
(Bonus hot take: This explains the constant 30 years horizon, it's some stable limit of human
imagination vis-a-vis the speed of technological progress. We can only start to perceive new
obstacles when we're 30 years away from overcoming them.)
We don't know whether we'll encounter any new obstacles or, if so, what they will be, but
allow me to propose one obvious candidate: environment complexity.
The entire discussion, as presented in the article, is based around advances in games like
chess (8x8 board and simple consistent rules), go (19x19 board and simple consistent rules), ...
Donald Feb 26
Coming up with a complex environment isn't that hard. So firstly, we have a lot of games,
and a lot of other software. Set up a modern computer with a range of games, office
applications and a web browser, and let the AI control the mouse+keyboard input and
see the screen output. That's got a lot of complexity. (It also gives the AI unrestricted
internet access, not a good idea.)
"a billion of years of real-time training experience"
More like 20 years of training experience and a handful of basic principles of how to
learn.
If you had an AIXI level AI, just about any old data will do. There is a large amount of data
online. Far far more than the typical human gets in their training. (Oh and the human
genome is online too, so if it contains magic info, the AI can just read it)
People have been picking games like chess mostly because their AI techniques can't
handle being dumped into the real world. Giving an AI a heap of real world complexity
isn't hard.
CptDrMoreno Feb 25
Small mistake, "he upper bound is one hundred quadrillion times the upper bound." should
be "he upper bound is one hundred quadrillion times the lower bound."
Steve Byrnes Feb 25
strong agree that it's not very decision-relevant whether we say AGI will come in 10 vs 30 vs
50 years if we realistically have significant probability weight on all three. Well, at least not for
technical research. Granted, I wrote a response-to-Ajeya's-report myself (
https://www.lesswrong.com/posts/W6wBmQheDiFmfJqZy/brain-inspired-agi-and-the-
lifetime-anchor ), but it was mainly motivated by questions other than AGI arrival date per se.
Then my more recent timelines discussion (
https://www.lesswrong.com/posts/hE56gYi5d68uux9oM/intro-to-brain-like-agi-safety-3-
two-subsystems-learning-
and#3_8_Timelines_to_brain_like_AGI_part_3_of_3__scaling__debugging__training__etc_ )
was mainly intended as an argument against the "no AGI for 100 years" people. I suspect that
OpenPhil is also interested in assessing the "no AGI for 100 years" possibility, and is also
interested in governance / policy questions where *maybe* the exact degree of credence on
10 vs 30 vs 50 years is an important input - I wouldn't know.
Jared Harris Feb 26
These projections ignore the "ecology" (or "network" if you prefer). Humans individually
aren't very smart; their effective intelligence resides in their collective activity and their
(mostly inherited) collective knowledge.
If we take this fact seriously we will be thinking about issues that aren't discussed by this
report, Yudkowsky, etc. For example:
- What level of compute would it take to replicate the current network of AI researchers plus
their computing environment? That's what would be required to make a self improving
system that's better than our AI research network.
- What would "alignment" mean for a network of actors? What difference does it make if the
actors include humans as well as machines?
- Individual actors in a network are independently motivated. They are almost certainly not
totally aligned with each other, and very possibly have strongly competitive motivations. How
does this change our scenarios? What network & alignment structures produce better or
worse results from our point of view?
- A network of actors has a very large surface area compared to a single actor. Individual
actors are embedded in an environment which is mostly outside the network and have many
dependencies on that environment -- for electric power, security, resources, funding, etc.
How will this affect the evolution and distribution of likely behaviors of the network?
I hope the difference in types of questions is obvious.
Some objections:
- But Alpha zero! Reply: Individuals aren't very intelligent and chess is a game played by
individuals ...
Jared Harris Feb 26
I wonder to what extent the curves in the Compute vs. ELO graph flatten out to the right due
to the inherent upper limit of ELO. Or conversely, to what extent the flattening indicates limits
to this type of intelligence.
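For what it's worth, part of that flattening can come purely from how Elo is measured rather than from any ceiling on playing strength: expected score is logistic in the rating gap, so once an engine already wins nearly every game against the opponents it is rated against, large improvements in play translate into only small measured rating gains. A quick sketch of the standard formula (the 2800 reference rating below is just an arbitrary stand-in for the opponent pool):

```python
# Elo's expected score is logistic in the rating difference, so measured
# rating gains compress once a player already wins almost every game
# against the pool it is evaluated against.

def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo formula."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

opponent = 2800.0  # hypothetical strength of the reference opponent pool
for own in (2800, 3000, 3200, 3400, 3600):
    print(f"Elo {own}: expected score vs {opponent:.0f} = {expected_score(own, opponent):.4f}")
# Each further +200 Elo adds less and less measurable expected score,
# so a compute-vs-Elo curve can flatten even if play keeps improving.
```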
SeriousUsername69420 Feb 27
I'm trying to read that linked article by Eliezer now and holy crap, he could really use an editor
that would tell him to cut out half of the text and maybe stop giving a comprehensive, wordy
introduction to Eliezerism at the beginning of every text he writes.
Angus Feb 28
A thought: Platt's Law is a specific case of Hofstadter's Law (which is also about AI, actually):
It always takes longer than you think, even when you take into account Hofstadter's Law.
Which fits with the "estimates recede at the rate of roughly one year per year". You make a
guess, take into account Hofstadter's Law, then a year goes by, and you find yourself not
really any closer, rinse and repeat.
Another thought: Platt's Law is about the size of "a generation", so Platt's Law-like estimates
could be seen as another way of looking around and saying "it won't be THIS generation that
figures it out".
Final thought: it seems to me that if you're going to take the "biological anchors" approach, it
would make more sense to look at how evolution got us here vs. how we're working on
getting AI to a human level of performance. How many iterations, and how "powerful" was
each iteration, for evolution to arrive at humans? How many iterations, and how "powerful" an
iteration, has it taken for us to get from an AI as smart as an algae to whatever we have now?
Pierre Mar 1
Aren't 10% missing from her weighting of the 6 models?
Not sure it makes much of a difference, though.
Kaleberg Apr 16
The implicit assumption is that we are not far from being able to reverse engineer the wiring
of someone's brain, or perhaps some reference brain, and simulate it as a neural network and
get a working human-like intelligence. That's not a totally unreasonable idea, but we are
nowhere close to being able to figure out all the synaptic connections in someone's brain.
Really. It's not like the human genome or understanding a human cell. You can grab some
DNA or a few cells from someone pretty easily, but just try and follow a brain's wiring.
No, we are not going to be able to use ML to figure out the wiring based on some training set.
We can probably get such a system to do something interesting, but it isn't going to be
thinking like a human. ML algorithms are just not robust enough. Visual recognition
algorithms fall apart if you change a handful of pixels, and even things like AlphaFold collapse
if you vary the amino acid sequence. Sure, people fall apart too, but an AI that can't tell a hat
from a car if a couple of visual cells produce bogus outputs isn't behaving like a human.
Then, there's all the other stuff the brain does, and it's not just the nerve cells. The glial cells
and astrocytes do things that we are just getting a glimpse of. It's not like brains don't rewire
themselves now and then. There's Hebb's rule: neurons that fire together, wire together, and
we barely have a clue of how that works at a functional level, so good luck simulating it.
Closer to home, the brain is full of structures that embed assumptions about how an animal
needs to process information to survive and reproduce. The thing is that we don't know what all of
these structures are and what they do. Useful ML algorithms also embed assumptions.
Rodney Brooks pointed out that the convolution algorithms used in ML object identification
algorithms embed assumptions about size and location invariance. MLs don't learn that from
training sets. People write code that moves a recognition window around the image and
varies its size. (Brooks has been a leader in the AI/ML world since the 1980s, and his
rodneybrooks.com blog is full of good informed analysis of the field and its capabilities.)
Maybe I'm too cynical, but I'll go with Brooks' NIML, not in my lifetime.
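On the convolution point above, here is a minimal sketch of the translation half of that built-in assumption: because a convolution applies the same small kernel at every position, a shifted input yields a correspondingly shifted output, with no need to learn that from data. (Scale invariance is not built into plain convolutions; it is usually approximated with pooling or multi-scale processing.) The 1-D example below is purely illustrative.

```python
# Minimal sketch: weight sharing in a convolution builds in translation
# equivariance as an architectural prior rather than something learned.
import numpy as np

def conv1d_valid(signal: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide the same kernel over every position ('valid' 1-D cross-correlation)."""
    n = len(signal) - len(kernel) + 1
    return np.array([np.dot(signal[i:i + len(kernel)], kernel) for i in range(n)])

kernel = np.array([1.0, -2.0, 1.0])   # arbitrary small filter
signal = np.zeros(16)
signal[4] = 1.0                        # a "feature" at position 4

shifted = np.roll(signal, 3)           # same feature moved to position 7

out = conv1d_valid(signal, kernel)
out_shifted = conv1d_valid(shifted, kernel)

# The response to the shifted input is (up to edge effects) the shifted response:
print(np.allclose(out[:-3], out_shifted[3:]))  # True
```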