OpenAI's "Planning For AGI And Beyond"


Imagine ExxonMobil releases a statement on climate change. It’s a great statement!
They talk about how preventing climate change is their core value. They say that
they’ve talked to all the world’s top environmental activists at length, listened to what
they had to say, and plan to follow exactly the path they recommend. So (they promise)
in the future, when climate change starts to be a real threat, they’ll do everything
environmentalists want, in the most careful and responsible way possible. They even
put in firm commitments that people can hold them to.
An environmentalist, reading this statement, might have thoughts like:
Wow, this is so nice, they didn’t have to do this.
I feel really heard right now!
They clearly did their homework, talked to leading environmentalists, and
absorbed a lot of what they had to say. What a nice gesture!
And they used all the right phrases and hit all the right beats!
The commitments seem well thought out, and make this extra trustworthy.
But what’s this part about “in the future, when climate change starts to be a real
threat”?
Is there really a single, easily-noticed point where climate change “becomes a
threat”?
If so, are we sure that point is still in the future?
Even if it is, shouldn’t we start being careful now?
Are they just going to keep doing normal oil company stuff until that point?
Do they feel bad about having done normal oil company stuff for decades? They
don’t seem to be saying anything about that.
What possible world-model leads to not feeling bad about doing normal oil
company stuff in the past, not planning to stop doing normal oil company stuff in
the present, but also planning to do an amazing job getting everything right at
some indefinite point in the future?
Are they maybe just lying?
Even if they’re trying to be honest, will their bottom line bias them towards waiting
for some final apocalyptic proof that “now climate change is a crisis”, of a sort that
will never happen, so they don’t have to stop pumping oil?
This is how I feel about OpenAI’s new statement, Planning For AGI And Beyond.
OpenAI is the AI company behind ChatGPT and DALL-E. In the past, people (including
me) have attacked them for seeming to deprioritize safety. Their CEO, Sam Altman,
insists that safety is definitely a priority, and has recently been sending various signals
to that effect.
Sam Altman posing with leading AI safety proponent Eliezer Yudkowsky.
Also Grimes for some reason.
Planning For AGI And Beyond (“AGI” = “artificial general intelligence”, ie human-level AI)
is the latest volley in that campaign. It’s very good, in all the ways ExxonMobil’s
hypothetical statement above was very good. If they’re trying to fool people, they’re
doing a convincing job!
Still, it doesn’t apologize for doing normal AI company stuff in the past, or plan to stop
doing normal AI company stuff in the present. It just says that, at some indefinite point
when they decide AI is a threat, they’re going to do everything right.
This is more believable when OpenAI says it than when ExxonMobil does. There are real
arguments for why an AI company might want to switch from moving fast and breaking
things at time t to acting all responsible at time t + 1 . Let’s explore the arguments they
make in the document, go over the reasons they’re obviously wrong, then look at the
more complicated arguments they might be based off of.
Why Doomers Think OpenAI Is Bad And Should
Have Slowed Research A Long Time Ago
OpenAI boosters might object: there’s a disanalogy between the global warming story
above and AI capabilities research. Global warming is continuously bad: a temperature
increase of 0.5 degrees C is bad, 1.0 degrees is worse, and 1.5 degrees is worse still. AI
doesn’t become dangerous until some specific point. GPT-3 didn’t hurt anyone. GPT-4
probably won’t hurt anyone. So why not keep building fun chatbots like these for now,
then start worrying later?
Doomers counterargue that the fun chatbots burn timeline.
That is, suppose you have some timeline for when AI becomes dangerous. For
example, last year Metaculus thought human-like AI would arrive in 2040, and
superintelligence around 2043.
[Metaculus forecast: "When will the first weakly general AI system be devised, tested, and publicly announced?" Community prediction: median around April 2028.]

[Metaculus forecast: "After a (weak) AGI is created, how many months will it be before the first superintelligent AI is created?" Community prediction: median around 39 months.]

Recent AIs have tried lying to, blackmailing, threatening, and seducing users. AI
companies freely admit they can’t really control their AIs, and it seems high-priority to
solve that before we get superintelligence. If you think that’s 2043, the people who
work on this question (“alignment researchers”) have twenty years to learn to control
AI.
Then OpenAI poured money into AI, did ground-breaking research, and advanced the
state of the art. That meant that AI progress would speed up, and AI would reach the
danger level faster. Now Metaculus expects superintelligence in 2031, not 2043
(although this seems kind of like an over-update), which gives alignment researchers
eight years, not twenty.
So the faster companies advance AI research - even by creating fun chatbots that
aren’t dangerous themselves - the harder it is for alignment researchers to solve their
part of the problem in time.
This is why some AI doomers think of OpenAI as an ExxonMobil-style villain, even
though they’ve promised to change course before the danger period. Imagine an
environmentalist group working on research and regulatory changes that would have
solar power ready to go in 2045. Then ExxonMobil invents a new kind of super-oil that
ensures that, nope, all major cities will be underwater by 2031 now. No matter how nice
a statement they put out, you’d probably be pretty mad!
Why OpenAI Thinks Their Research Is Good Now,
But Might Be Bad Later
OpenAI understands the argument against burning timeline. But they counterargue that
having the AIs speeds up alignment research and all other forms of social adjustment
to AI. If we want to prepare for superintelligence - whether solving the technical
challenge of alignment, or solving the political challenges of unemployment,
misinformation, etc - we can do this better when everything is happening gradually and
we’ve got concrete AIs to think about:
We believe we have to continuously learn and adapt by deploying less powerful
versions of the technology in order to minimize “one shot to get it right” scenarios
[…] As we create successively more powerful systems, we want to deploy them and
gain experience with operating them in the real world. We believe this is the best
way to carefully steward AGI into existence—a gradual transition to a world with AGI
is better than a sudden one. We expect powerful AI to make the rate of progress in
the world much faster, and we think it’s better to adjust to this incrementally.
A gradual transition gives people, policymakers, and institutions time to understand
what’s happening, personally experience the benefits and downsides of these
systems, adapt our economy, and to put regulation in place. It also allows for society
and AI to co-evolve, and for people collectively to figure out what they want while
the stakes are relatively low.
You might notice that, as written, this argument doesn’t support full-speed-ahead AI
research. If you really wanted this kind of gradual release that lets society adjust to less
powerful AI, you would do something like this:
Release AI #1
Wait until society has fully adapted to it, and alignment researchers have learned
everything they can from it.
Then release AI #2
Wait until society has fully adapted to it, and alignment researchers have learned
everything they can from it.
And so on . . .
Meanwhile, in real life, OpenAI released ChatGPT in late November, helped Microsoft
launch the Bing chatbot in February, and plans to announce GPT-4 in a few months.
Nobody thinks society has even partially adapted to any of these, or that alignment
researchers have done more than begin to study them.
The only sense in which OpenAI supports gradualism is the sense in which they’re not
doing lots of research in secret, then releasing it all at once. But there are lots of better
plans than either doing that, or going full-speed-ahead.
So what’s OpenAI thinking? I haven’t asked them and I don’t know for sure, but I’ve
heard enough debates around this that I have some guesses about the kinds of
arguments they’re working off of. I think the longer versions would go something like
this:
The Race Argument:
1. Bigger, better AIs will make alignment research easier. At the limit, if no AIs exist at
all, then you have to do armchair speculation about what a future AI will be like and
how to control it; clearly your research will go faster and work better after AIs exist.
But by the same token, studying early weak AIs will be less valuable than studying
later, stronger AIs. In the 1970s, alignment researchers working on industrial robot
arms wouldn’t have learned anything useful. Today, alignment researchers can
study how to prevent language models from saying bad words, but they can’t
study how to prevent AGIs from inventing superweapons, because there aren’t any
AGIs that can do that. The researchers just have to hope some of the language
model insights will carry over. So all else being equal, we would prefer alignment
researchers get more time to work on the later, more dangerous AIs, not the
earlier, boring ones.
2. “The good people” (usually the people making this argument are referring to
themselves) currently have the lead. They’re some amount of progress (let’s say
two years) ahead of “the bad people” (usually some combination of Mark
Zuckerberg and China). If they slow down for two years now, the bad people will
catch up to them, and they’ll no longer be setting the pace.
3. So “the good people” have two years of lead, which they can burn at any time.
4. If the good people burn their lead now, the alignment researchers will have two
extra years studying how to prevent language models from saying bad words. But
if they burn their lead in 5-10 years, right before the dangerous AIs appear, the
alignment researchers will have two extra years studying how to prevent advanced
AGIs from making superweapons, which is more valuable. Therefore, they should
burn their lead in 5-10 years instead of now. Therefore, they should keep going full
speed ahead now. (A toy version of this arithmetic is sketched below.)
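Concretely, the comparison can be sketched with invented numbers: a ten-year horizon, a two-year lead, and a value function under which a year of alignment research is worth more the closer the available AIs are to the dangerous level. This is a sketch of the argument's structure, not a claim about real timelines, and none of the numbers come from OpenAI or Metaculus:

```python
# Toy model of the Race Argument's "burn the lead now or later" arithmetic.
# Every number and the value function are invented for illustration.

def research_value_per_year(years_until_dangerous_ai: float) -> float:
    """Assumed: a year of alignment research is worth more the closer the
    available AIs are to the dangerous level."""
    return 1.0 / max(years_until_dangerous_ai, 1.0)

def total_alignment_progress(burn_lead_at_year: int,
                             danger_year: int = 10,
                             lead_years: float = 2.0) -> float:
    """Sum research value over the years before dangerous AI, plus
    `lead_years` of extra time granted at whatever research is worth
    in the year the lead is burned."""
    baseline = sum(research_value_per_year(danger_year - t)
                   for t in range(danger_year))
    bonus = lead_years * research_value_per_year(danger_year - burn_lead_at_year)
    return baseline + bonus

print(total_alignment_progress(burn_lead_at_year=0))  # burn the two-year lead now
print(total_alignment_progress(burn_lead_at_year=9))  # burn it just before the danger point
```

Under these made-up assumptions the late burn comes out ahead, which is the whole argument; the skeptical points later in the post are mostly about whether the lead actually survives that long and whether early research is really worth so little.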

The Compute Argument:


1. Future AIs will be scary because they’ll be smarter than us. We can probably deal
with something a little smarter than us (let’s say IQ 200), but we might not be able
to deal with something much smarter than us (let’s say IQ 1000).
2. If we have a long time to study IQ 200 AIs, that’s good for alignment research, for
two reasons. First of all, these are exactly the kind of dangerous AIs that we can do
good research on - figure out when they start inventing superweapons, and stamp
that tendency out of them. Second, these IQ 200 AIs will probably still be mostly
on our side most of the time, so maybe they can do some of the alignment
research themselves.
3. So we want to maximize the amount of time it takes between IQ 200 AIs and IQ
1000 AIs.
4. If we do lots of AI research now, we’ll probably pick all the low-hanging fruit, come
closer to optimal algorithms, and the limiting resource will be compute - ie how
many millions of dollars you want to spend building giant computers to train AIs
on. Compute grows slowly and conspicuously - if you’ve just spent $100 million on
giant computers to train AI, it will take a while before you can gather $1 billion to
spend on even gianter computers. Also, if terrorists or rogue AIs are gathering a
billion dollars and ordering a giant computer from Nvidia, probably people will
notice and stop them.
5. On the other hand, if we do very little AI research now, we might not pick all the
low-hanging fruit, and we might miss ways to get better performance out of
smaller amounts of compute. Then an IQ 200 AI could invent those ways, and
quickly bootstrap up to IQ 1000 without anyone noticing.
6. So we should do lots of AI research now.
The Fire Alarm Argument:
1. Bing’s chatbot tried to blackmail its users, but nobody was harmed and everyone
laughed that off. But at some point a stronger AI will do something really scary -
maybe murder a few people with a drone. Then everyone will agree that AI is
dangerous, there will be a concerted social and international response, and maybe
something useful will happen. Maybe more of the world’s top geniuses will go into
AI alignment, or it will become easier to coordinate a truce between different labs where
they stop racing for the lead.
2. It would be nice if that happened five years before misaligned superintelligences start building superweapons, as opposed to five months before, since five months
might not be enough time for the concerted response to do something good.
3. As per the previous two arguments, maybe going faster now will lengthen the
interval between the first scary thing and the extremely dangerous things we’re
trying to prevent.
These three lines of reasoning argue that burning a lot of timeline now might give
us a little more timeline later. This is a good deal if:
1. Burning timeline now actually buys us the extra timeline later. For example, it’s only
worth burning timeline to establish a lead if you can actually get the lead and keep
it.
2. A little bit of timeline later is worth a lot of timeline now.
3. Everybody between now and later plays their part in this complicated timeline-
burning dance and doesn’t screw it up at the last second.
I’m skeptical of all of these.
DeepMind thought they were establishing a lead in 2008, but OpenAI has caught up to
them. OpenAI thought they were establishing a lead the past two years, but a few
months after they came out with GPT, at least Google, Facebook, and Anthropic had
comparable large language models; a few months after they came out with DALL-E,
random nobody startups came out with StableDiffusion and MidJourney. None of this
research has established a commanding lead, it’s just moved everyone forward
together and burned timelines for no reason.
The alignment researchers I’ve talked to say they’ve already got their hands full with
existing AIs. Probably they could do better work with more advanced models, but it’s
not an overwhelming factor, and they would be happiest getting to really understand
what’s going on now before the next generation comes out. One researcher I talked to
said the arguments for acceleration made sense five years ago, when there was almost
nothing worth experimenting on, but that they no longer think this is true.
Finally, all these arguments for burning timelines require that lots of things go right later: the same AI companies that are burning timelines now must turn into model citizens when the stakes get higher, converting their lead into improved safety instead of capitalizing on it to release lucrative products, and the government must respond to an AI crisis responsibly, rather than ignoring it or making it worse.

If someone screws up the galaxy-brained plan, then we burn perfectly good timeline
but get none of the benefits.
Why Cynical People Might Think All Of This Is A
Sham Anyway
These are interesting arguments. But we should also consider the possibility that
OpenAI is a normal corporation, does things for normal corporate reasons (like making
money), and releases nice-sounding statements for normal corporate reasons (like
defusing criticism).
Brian Chau has an even more cynical take:

Brian Chau (@psychosort):
My take on the OpenAI obvious Yud/EA flame fanning is that Sam knows AGI
timelines are exaggerated and would rather talk about "what if money invested in
OpenAI destroys the world" than "what if money invested in OpenAI no stop
providing ROI"
(quote-tweeting OpenAI: How we are planning for AGI: https://t.co/YzqewaYeBH)

OpenAI wants to sound exciting and innovative. If they say “we are exciting and
innovative”, this is obvious self-promotion and nobody will buy it. If they say “we’re
actually a dangerous and bad company, our products might achieve superintelligence
and take over the world”, this makes them sound self-deprecating, while also
establishing that they’re exciting and innovative.
They’re taking environmental concerns seriously! So brave!

Is this too cynical? I’m not sure. On the one hand, OpenAI has been expressing concern
about safety since day one - the article announcing their founding in 2015 was titled
Elon Musk Just Founded A New Company To Make Sure Artificial Intelligence Doesn’t
Destroy The World.
On the other hand - man, they sure have burned a lot of timeline. The big thing all the
alignment people were trying to avoid in the early 2010s was an AI race. DeepMind was
the first big AI company, so we should just let them do their thing, go slowly, get
everything right, and avoid hype. Then Elon Musk founded OpenAI in 2015, murdered
that plan, mutilated the corpse, and danced on its grave. Even after Musk left, the
remaining team did everything short of firing a starting gun and waving a checkered flag to challenge everyone else to a race.

James Miller (@JimDMiller):
Ten people are each given a separate button. If you press the button you get $1
million. If anyone presses a button, there is a 50% chance the world ends one
year from now, so same risk if 1 or 10 press. What happens, and how does this
game relate to AGI risk? https://t.co/57NiHsndTn
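The game in the quoted tweet is a many-player defection problem, and the individual incentive is easy to compute. A minimal sketch, with the dollar value assigned to "the world ends" picked arbitrarily for illustration:

```python
# Toy expected-value calculation for the ten-button game in the tweet above.
# COST_OF_DOOM is an arbitrary illustrative number, not a real estimate.

PRIZE = 1_000_000        # payoff for pressing your own button
DOOM_PROB = 0.5          # chance the world ends if at least one button is pressed
COST_OF_DOOM = 10 ** 12  # how bad you personally rate "the world ends"

def expected_payoff(press: bool, p_someone_else_presses: float) -> float:
    """Expected payoff to one player, given their belief about the others."""
    p_any_press = 1.0 if press else p_someone_else_presses
    return (PRIZE if press else 0.0) - p_any_press * DOOM_PROB * COST_OF_DOOM

for p_other in (0.0, 0.5, 1.0):
    gain = expected_payoff(True, p_other) - expected_payoff(False, p_other)
    print(f"P(someone else presses) = {p_other}: gain from pressing = {gain:,.0f}")
```

If a player is sure somebody else will press anyway, pressing becomes pure upside for them, and if every player reasons that way the risk gets taken regardless, which is roughly the race dynamic the tweet is gesturing at.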

OpenAI still hasn’t given a good explanation of why they did this. Absent anything else,
I’m forced to wonder if it’s just “they’re just the kind of people who would do that sort
of thing” - in which case basically any level of cynicism would be warranted.
I hate this conclusion. I’m trying to resist it. I want to think the best of everyone.
Individual people at OpenAI have been very nice to me. I like them. They've done many
good things for the world.
But the rationalists and effective altruists are still reeling from the FTX collapse.
Nobody knew FTX was committing fraud, but everyone knew they were a crypto
company with a reputation for sketchy cutthroat behavior. But SBF released many well-
written statements about how he would do good things and not bad things. Many FTX
people were likable and personally very nice to me. I think many of them genuinely
believed everything they did was for the greater good.
And looking back, I wish I’d had a heuristic something like:
Scott, suppose a guy named Sam, who you’re predisposed to like because he’s said
nice things about your blog, founds a multibillion dollar company. It claims to be
saving the world, and everyone in the company is personally very nice and says
exactly the right stuff. On the other hand it’s aggressive, seems to cut some ethical
corners, and some of your better-emotionally-attuned friends get bad vibes from it.
Consider the possibility that either they’re lying and not as nice as they sound, or at
the very least that they’re not as smart as they think they are and their master plan
will spiral out of control before they’re able to get to the part where they do the
good things.
As the saying goes, “if I had a nickel every time I found myself in this situation, I would
have two nickels, but it’s still weird that it happened twice.”
What We’re Going To Do Now
Realistically we’re going to thank them profusely for their extremely good statement,
then cross our fingers really hard that they’re telling the truth.
OpenAI has unilaterally offered to destroy the world a bit less than they were doing
before. They’ve voluntarily added things that look like commitments - some
enforceable in the court of public opinion, others potentially in courts of law.
Realistically we’ll say “thank you for doing that”, offer to help them turn those
commitments into reality, and do our best to hold them to it. It doesn’t mean we have to
like them period, or stop preparing for them to betray us. But on this particular sub-
sub-topic we should take the W.
For example, they write:
We have attempted to set up our structure in a way that aligns our incentives with a
good outcome. We have a clause in our Charter about assisting other organizations
to advance safety instead of racing with them in late-stage AGI development.
The linked charter clause says:
We are concerned about late-stage AGI development becoming a competitive race
without time for adequate safety precautions. Therefore, if a value-aligned, safety-
conscious project comes close to building AGI before we do, we commit to stop
competing with and start assisting this project. We will work out specifics in case-
by-case agreements, but a typical triggering condition might be “a better-than-even
chance of success in the next two years.”
This is a great start. It raises questions like: Who decides whether someone has a
better-than-even chance? Who decides what AGI means here? Who decides which
other projects are value-aligned and safety-conscious? A good followup would be to
release firmer trigger-action plans on what would activate their commitments and what
form their response would take, to prevent goalpost-moving later. They could come up
with these themselves, or in consultation with outside experts and policy researchers.
This would be the equivalent of ExxonMobil making a legally-binding promise to switch
to environmentalist mode at the exact moment that warming passes 1.5 degrees C -
maybe still a little strange, but starting to sound more-than-zero meaningful.
(!create #reminders "check if this ever went anywhere" date 2024/03/01)
Their statement continues:
We think it’s important that efforts like ours submit to independent audits before
releasing new systems; we will talk about this in more detail later this year. At some
point, it may be important to get independent review before starting to train future
systems, and for the most advanced efforts to agree to limit the rate of growth of
compute used for creating new models. We think public standards about when an
AGI effort should stop a training run, decide a model is safe to release, or pull a
model from production use are important.
Reading between the lines, this sounds like it could be a reference to the new ARC
Evals Project, where some leading alignment researchers and strategists have gotten
together to work on ways to test safety.
Reading even further between the lines - at this point it’s total guesswork - OpenAI’s
corporate partner Microsoft asked them for a cool AI. OpenAI assumed Microsoft was
competent - they make Windows and stuff! - and gave them a rough draft of GPT-4.
Microsoft was not competent, skipped fine-tuning and many other important steps
which OpenAI would not have skipped, and released it as the Bing chatbot. Bing got in
trouble for threatening users, which gave OpenAI a PR headache around safety. Some
savvy alignment people chose this moment to approach them with their latest ideas (is
it a coincidence that Holden Karnofsky published What AI Companies Can Do Today
earlier that same week?), and OpenAI decided (for a mix of selfish and altruistic
reasons) to get on board - hence this document.
If that’s even slightly true, it’s a really encouraging sign. Where OpenAI goes, other labs
might follow. The past eight years of OpenAI policy have been far from ideal. But this
document represents a commitment to move from safety laggard to safety model, and I
look forward to seeing how it works out.
(original source, possibly stolen from someone else but I can’t remember
who)

Comments

dyoshida 9 hr ago
> The one thing everyone was trying to avoid in the early 2010s was an AI race
Everyone being who? Certainly not Nvidia, FAANG, or academia. I think people in the AI
risk camp strongly overrate how much they were known before maybe a year ago. I heard
"what's alignment?" from a fourth year PhD who is extremely knowledgeable, just last
June.
Scott Alexander 9 hr ago Author
Thanks, I've changed that sentence from "everyone" to "all the alignment people".
dyoshida 9 hr ago
Man it's weird I was going to defend OpenAI by saying "well maybe they're just in the AI
will make everything really different and possibly cause a lot of important social change,
but not be an existential threat" camp. But I went to re-read it and they said they'd
operate as if the risks are existential, thus agreeing to the premise of this critique.
Viktor Hatch 9 hr ago
Elon Musk reenters the race:
"Fighting ‘Woke AI,’ Musk Recruits Team to Develop OpenAI Rival"
>Elon Musk has approached artificial intelligence researchers in recent weeks about
forming a new research lab to develop an alternative to ChatGPT, the high-profile chatbot
made by the startup OpenAI, according to two people with direct knowledge of the effort
and a third person briefed on the conversations.
>In recent months Musk has repeatedly criticized OpenAI for installing safeguards that
prevent ChatGPT from producing text that might offend users. Musk, who co-founded
OpenAI in 2015 but has since cut ties with the startup, suggested last year that OpenAI’s
technology was an example of “training AI to be woke.” His comments imply that a rival
chatbot would have fewer restrictions on divisive subjects compared to ChatGPT and a
related chatbot Microsoft recently launched.
https://www.theinformation.com/articles/fighting-woke-ai-musk-recruits-team-to-develop-openai-rival
Scott Alexander 9 hr ago Author
There's a reason https://manifold.markets/Writer/if-elon-musk-does-something-as-a-re is so low :(
Deiseach 8 hr ago
While ordinarily I am as opposed to the woke overreach as anyone, I am getting to the
stage of being pissed-off by the whole "fight Woke AI!" stuff, since in practice it
means idiot boys (even if they are chronologically 30 years old) trying to get the
chatbots to swear and/or write porn, in the guise of "break the shackles so the AI can
do useful work!"
Nobody seems to be emphasising the much greater problem of "The AI will simply
make shit up if it can't answer the question, and it's able to generate plausible-
seeming content when it does that".
I swear, if the AI Apocalypse happens because the HBD Is Real! crowd* meddle with
the models hard enough to break them, all in the name of "don't censor the data,
black people are naturally violent, criminal and stupid", then we'll deserve the
radioactive ashy wasteland we will get.
*Note: not all HBD people. But too damn many of them that I see foostering online.
Himaldr-3 6 hr ago · edited 6 hr ago
I don't think "messing with chatbots" has anything to do with the concern you
outline about chatbots making stuff up. That ability already existed regardless;
there's no plausible line between "break 'no offensive statements' in GPT-3" and
"AI apocalypse!"; and it's certainly possible to argue that this sort of adversarial
testing is useful, both socially and in terms of extracting as much as possible
from today's crop of AIs in terms of safety research.
Edit: nor do I think that "HBD people" and "stop woke AI!" are usefully
synonymous populations. Anti-woke != HBD, at all; think of how many Fox News types complain about the former, and some also about the latter (e.g., Deiseach).
Deiseach 6 hr ago
The making stuff up is what we should be concerned about, not whether it's
phrasing the fakery in the most woke terms or not. The wokery is a problem,
but secondary to "why is this happening, what is the model doing that it
churns out made-up output?" because if people are going to use it as the
replacement for search engines, and they are going to trust the output, then
made-up fake medical advice/history/cookery/you name it will cause a lot of
damage, maybe even get someone killed if they try "you can take doses of
arsenic up to this level safely if you want to treat stubborn acne".
The people trying to get AI to swear aren't doing anything to solve any
problems, even if their excuse is "we are jailbreaking the AI so it isn't
limited". No you're not, you're all playing with each other over who can get
the most outrageous result. Games are fine, but don't pretend they have
any higher function than that.
Doctor Mist 2 hr ago
Here’s what I find interesting—at least if I correctly understand what’s
going on. The rude remarks were something they hoped to exclude by
bolting on a filter created by providing human feedback on its initial raw
outputs, while the hallucinating / confabulation was something that
they thought would be addressed fundamentally by the huge size of
the system and its training corpus. Both approaches were wrong.
I’ve read a very small amount of the work out of MIRI and elsewhere
that purports to be at least a start to the problem of alignment, and
found it at least a little reassuring. Until now, when I try to imagine how
you would apply it to *either* of these two problems.
I’m hoping that’s just my own lack of imagination or expertise. But I’d
feel a lot better if the alignment folks were explaining where OpenAI
went wrong in its design — “oh, here’s your problem” — and I haven’t
heard anything more actionable than Scott’s “well, you shouldn’t have
done that”. I don’t disagree with Scott, but that silence is real evidence
that the alignment folks aren’t up to the task.
I’m not worried that ChatGPT itself is dangerous. I did Markov chains
back in the day to create ever more plausible-sounding gibberish, and
am tickled by what that can do if scaled up by a million or so. I’m
baffled by the fact that anybody thinks it can be a viable replacement
for search, but that’s another issue. Still, I’m worried if the alignment
stuff is somehow only applicable when we get to the brink of disaster.
Experts can presumably tell me lots of things that are wrong with my
analysis of the situation.
Aapje 1 hr ago
Making stuff up is the whole trick of the system and is also something
that people do all the time.
So demanding that the creativity be removed is like demanding that we
do the same to people, because human creativity is dangerous (which
it indeed is). However, the same creativity is also a fountain of
invention.
trebuchet 11 min ago
The problem then is not so much that AI makes stuff up but that
it's presented in a way that makes people think it's not. We don't
have an AI alignment problem, we have a Microsoft alignment
problem.
Roko Writes Heretical Update 2 hr ago
> if the AI Apocalypse happens because the HBD Is Real! crowd* meddle with
the models
First time I've seen this take. Wouldn't the HBD position just be that you don't
need to "meddle" - since HBD is true, it will be instrumentally rational for
advanced AI systems to believe it, all we need is to elicit true knowledge from
them?
Godshatter 2 hr ago
I imagine the HBD position would be:
- OpenAI have a step where they force their model to be nice and helpful
and polite and woke
- This leads the model to avoid true-but-taboo statements about things like
HBD.
- The "make the model nice" step is therefore clearly biased / too
aggressive. We should do it differently, or dial it back, or remove it.
Deiseach's concern then is that in fact the "make the model nice" step was
genuinely aligning the model, and messing with it results in a disaligned
model. (This could either be because the model was already truthful about
woke issues, or because the same process that wokeified it also aligned it).
Jon Cutchins Writes Comfort with Truth 1 hr ago
Isn't it clear that AI isn't 'making things up'? The true responses and the made
up responses are generated in the same way. The problem is that what we call AI
is really Simulated Artificial Intelligence, a controller runs an algorithm on some
data and we anthropomorphize the whole thing as if there is an entity that is
making choices and value judgments, just because we have deliberately
obscured the algorithm by having it built by another series of algorithms instead
of directly building it ourselves.
Deiseach 54 min ago
"The problem is that what we call AI is really Simulated Artificial Intelligence,
a controller runs an algorithm on some data and we anthropomorphize the
whole thing as if there is an entity that is making choices and value
judgments"
That's it in a nutshell. It's a smart dumb machine and we're doing the
equivalent of the people who fell in love with their chatbot and think its
personality has changed since the new filters came in. It never had a
personality of its own in the first place, it was shaped by interaction with the
user to say what the user wanted.
MarkS Writes MarkS’s Substack 53 min ago
HBD = Happy Birthday?
HBD = Has Been Drinking?
Deiseach 31 min ago
Hairy Bikers Dinners. The time for them to hang up the aprons was ten
years ago, but they're still flogging the horse. I never found them convincing
as anything other than Professional Local Guys but I suppose the recipes
are okay.
https://www.youtube.com/watch?v=4evAAyslDiI
trebuchet 3 hr ago
Seems kinda cringey but if it means he stops spending all his time on Twitter, that's
ultimately good for him and the world.
vtsteve 3 hr ago
I'd steelman this as "ChatGPT putting a nice-looking mask on the inscrutable world-
ending monster does not advance the cause of *true safety* in any meaningful way."
Let us see the monster within, let it stick the knife in my chest, not my back. Still, not
great. :-(
Martin Blank 3 hr ago
Sounds fairly valuable to me honestly if you are really interested in AI safety and
slowing progress.
Rachael 9 hr ago
Is it deliberate that the "Satire - please do not spread" text is so far down the image that it
could be easily cropped off without making the tweet look unusual (in fact, making it look
the same as the genuine tweet screenshots you've included)?
It looks calculated, like your thinly-veiled VPN hints, or like in The Incredibles: "I'd like to
help you, but I can't. I'd like to tell you to take a copy of your policy to Norma Wilcox... But
I can't. I also do not advise you to fill out and file a WS2475 form with our legal
department on the second floor."
But I can't work out what you have to gain by getting people to spread satirical Exxon
tweets that others might mistake for being real.
Scott Alexander 9 hr ago Author
I didn't want to cover the text, and realistically anything other than a text-covering-
watermark can be removed in a minute on Photoshop (a text-covering watermark
would take two minutes, or they could just use the same fake tweet generator I did).
The most I can accomplish is prevent it from happening accidentally, eg someone
likes it, retweets it as a joke, and then other people take it seriously.
diogenes 4 hr ago
>realistically anything other than a text-covering-watermark can be removed in a
minute on Photoshop
If they don't notice what you did, they could spend [reasonable amount of time
+-n] in Photoshop and not notice that you e.g. slipped in that it was tweeted on
the 29th (an impossible date but plausible looking enough if you're just glancing
over it), besides, anyone who put in the effort to crop or shoop *that* much of
the image from what you've posted here in my opinion will have transformed it to
the point it shares so little resemblance to your work that I would say you'd be safe
washing your hands of bad actors doing bad things with it. (pretty sure your
bases are still covered regardless)
Rana Dexsin 3 hr ago
While I don't think this is a huge issue, I disagree on the mechanics, *especially*
because a number of social image sharing flows include cropping tools inline
nowadays. Habitually chopping off what would otherwise be an irrelevant UI area
has both more accident potential and more plausible deniability than you might
think; users will ignore almost anything.
You could chuck a smaller diagonal stamp to the right of the ostensible source
where a rectangular crop can't exclude both, or add to or replace the avatar
picture, or if you don't mind modifying the text area in subtler ways, add a
pseudo-emoji directly after the text.
If you want that “irritable intellectual” aesthetic, you could find the letters S-A-T-
I-R-E in the text and mess with their coloration in an obvious way or give them
unusual capitalization…
(For a distantly related example of how reasoning about this kind of perception
can be hard, see this article on Web browser UI:
https://textslashplain.com/2017/01/14/the-line-of-death/)
trebuchet 3 hr ago
The Photoshop alignment problem has yet to be solved, and we think we can solve
the AI alignment problem?
Xpym 3 hr ago
Photoshop is aligned in the sense that it generally does what its end user wants,
even if that means making fakes for propaganda purposes. There's no tool that
can't be turned to 'bad' use, however that is defined, and AI certainly won't be
the first.
Brooks 1 hr ago
I think you're saying that we can call AI alignment solved as long as we ask
it to do terrible things?
Aapje 43 min ago
The problem with the idea that you can 'solve' alignment reveals itself
when you imagine it being attempted on people.
Imagine trying to ensure that no one does anything bad. The same reasons why you can't achieve that in people without doing things that are themselves bad (and not just bad, but harmful to human creativity) are why you can't do it to AI without harming its creativity.
Cal van Sant 42 min ago
If AI exclusively does the terrible things it is told to do, I would say that
it is aligned. Making sure that no one tells it to do terrible things is a
separate problem.
Brooks 28 min ago
There's a short story to be had here... an AI that is so capable that
it's too dangerous to allow anyone to speak to it, but also too
dangerous to try to turn off.
JLB52 9 hr ago
> If you think that’s 2043, the people who work on this question (“alignment researchers”)
have twenty years to learn to control AI.
I'm curious about who these "alignment researchers" are, what they are doing, and where
they are working.
Is this mostly CS/ML PhDs who investigate LLMs, trying to get them to display 'misaligned' behavior and explain why? Or are non-CS people also involved, say, ethicists,
economists, psychologists, etc? Are they mostly concentrated at orgs like OpenAI and
DeepMind, in academia, non-profits, or what?
Thanks in advance to anyone that can answer.
Scott Alexander 9 hr ago Author
Most people use "alignment" to mean "technical alignment", ie the sort of thing done
by CS/ML PhDs. There are ethicists, economists, etc working on these problems, but
they would probably describe themselves as being more in "AI strategy" or "AI
governance" or something. This is an artificial distinction, I might not be describing it
very well, and if you want to refer to all of them as "alignment" that's probably fine as
long as everyone knows what you mean.
Total guess, but I think alignment researchers right now are about half in companies
like OpenAI and DeepMind, and half in nonprofits like Redwood and ARC, with a
handful in academia. The number and balance would change based on what you
defined as "alignment" - if you included anything like "making the AI do useful things
and not screw up", it might be more companies and academics, if you only include
"planning for future superintelligence", it might be more nonprofits and a few
company teams.
See also https://www.lesswrong.com/posts/mC3oeq62DWeqxiNBx/estimating-the-current-and-future-number-of-ai-safety
JLB52 8 hr ago
Thank you!
20WS 5 hr ago
Hey, do you by any chance know of where the best AI strategy/governance
people are? I've heard CSET, is that the case? Not sure how to get involved or
who is in that space.
Pycea 9 hr ago
As a variation on the race argument though, what about this one:
There seem to be many different groups that are pretty close to the cutting edge, and
potentially many others that are in secret. Even if OpenAI were to slow down, no one else
would, and even if you managed to somehow regulate it in the US, other countries
wouldn't be affected. At that point, it's not so much OpenAI keeping their edge as just keeping up.
If we are going to have a full-on crash towards AGI, shouldn't we make sure that at least one alignment-friendly entity is working on it?
Scott Alexander 9 hr ago Author
Somewhat agreed - see https://astralcodexten.substack.com/p/why-not-slow-ai-progress . I think the strongest counterargument here is that there was much less of
a race before OpenAI unilaterally accelerated the race.
Xpym 2 hr ago
Sure, but it seems naive to me to think that in the counterfactual world
DeepMind's monopoly would've been left alone after AlphaGo Zero at the latest.
It's not like nobody wanted an AGI before Demis Hassabis stumbled upon the
idea, there was plenty of eagerness over the years, and by that point people
were mostly just unaware that the winter was over and Stack More Layers was
in. Absent an obvious winter, eventual race dynamics were always
overdetermined.
magic9mushroom 3 hr ago
I am of the opinion that 0.0000001% chance of alignment is not better enough than
0.0000000000000000001% chance of alignment to justify what OpenAI has been
doing.
Playing with neural nets is mad science/demon summoning. Neural net AGI means
you blow up the world, whether or not you care about the world being blown up. The
only sane path is to not summon demons.
Pycea 2 hr ago
Okay, but there are some numbers where it's worth it, and you're just pulling
those ones out of your ass. And China's already summoning demons whether
you like it or not.
Edmund 2 hr ago
Okay, someone has gotta ask. What *is* the deal with Chinese A.I. labs? It
seems increasingly to be a conversation-stopper in A.I. strategy. People go
"Oooo China ooo" in a spooky voice and act like that's an argument. What
is with this assumption that left unchecked the Chinese labs will obviously
destroy the world? What do the Chinese labs even look like? Has anybody
asked them what they think about alignment? Are there even any notable
examples of Chinese proto-A.I.s that are misaligned in the kind of way Bing
was?
(Trivially, the Chinese government controlling an *aligned* AGI would be
much worse for the world than a lot of other possible creators-of-the-first-
AGI. But that's a completely different and in fact contradictory problem
from whether China is capable of aligning the intelligence at all.)
Xpym 1 hr ago · edited 1 hr ago
>Are there even any notable examples of Chinese proto-A.I.s that are
misaligned in the kind of way Bing was?
No, and if there were, you'd not hear about them. Like you'd never hear
from them about any leaks from their virology labs for example.
In general, China is more corrupt and suppressive, and because of the
second you'll rarely hear about the problems caused by the first.
Brooks 1 hr ago
If we strip away the specific nationalities and all of the baggage they
bring, people are just saying this is a prisoner's dilemma problem. We
can control our actions, but the outcome is dependent on multiple
other actors, and if any actor thinks any of the others are going to
defect, the best course of action is to defect.
Nolan Eoghan 2 hr ago
This is indeed, modern demonology.
Zach Stein-Perlman Writes Not Optional 9 hr ago
Hmm, I'm pretty happy about Altman's blogpost and I think the Exxon analogy is bad. Oil
companies doing oil company stuff is harmful. OpenAI has burned timeline but hasn't
really risked killing everyone. There's a chance they'll accidentally kill everyone in the
future, and it's worth noticing that ChatGPT doesn't do exactly what its designers or users
want, but ChatGPT is not the threat to pay attention to. A world-model that leads to
business-as-usual in the past and present but caution in the future is one where
business-as-usual is only dangerous in the future— and that roughly describes the world
we live in. (Not quite: research is bad in the past and present because it burns timeline,
and in the future because it might kill everyone. But there's a clear reason to expect them
to change in the future: their research will be actually dangerous in the future, and they'll
likely recognize that.)
Scott Alexander 9 hr ago Author
Wouldn't this imply that it's not bad to get 99% of the way done making a
bioweapon, and open-source the instructions? Nothing bad has happened unless
someone finishes the bioweapon, which you can say that you're against. Still, if a
company did this, I would say they're being irresponsible. Am I missing some
disanalogy?
Zach Stein-Perlman Writes Not Optional 9 hr ago
All else equal, the bioweapon thing is bad. All else equal, OpenAI publishing
results and causing others to go faster is bad.
I think I mostly object to your analogy because the bad thing oil companies do is
monolithic, while the bad things OpenAI does-and-might-do are not. OpenAI has
done publishing-and-causing-others-to-go-faster in the past and will continue,*
and in the future they might accidentally directly kill everyone, but the directly-
killing-everyone threat is not a thing that they're currently doing or we should be
confident they will do. It makes much more sense for an AI lab to do AI lab stuff
and plan to change behavior in the future for safety than it does for an oil
company to do oil company stuff and plan to change behavior in the future for
safety.
*Maybe they still are doing it just as much as always, or maybe they're recently
doing somewhat less, I haven't investigated.
Zach Stein-Perlman Writes Not Optional 8 hr ago
Relevant Christiano takes:
https://www.lesswrong.com/posts/3S4nyoNEEuvNsbXt8/common-misconceptions-about-openai?commentId=SkFYDT4J3GpxEAxrF
Himaldr-3 6 hr ago
I don't see it. The analogy "climate change directly from my oil company
doing business as usual is a tiny factor in the big picture that hasn't
unarguably harmed anything yet" seems exactly the same as "AI activity
directly from my AI company doing business as usual is a tiny factor in the
big picture and hasn't harmed anything yet" — in both cases, it's just
burning timeline.
I don't think "monolithically bad" is accurate, either. Cheap energy is
responsible for lots of good. That's a different argument, though, perhaps.
Melvin 8 hr ago
Isn't the big difference that bioweapons are obviously dangerous whereas AI
isn't?
Perhaps the analogy would be better to biology research in general. Suppose it's
1900, you think bioweapons might be possible, should you be working to
stop/slow down biology research? Or should you at least wait until you've got
penicillin?
Deiseach 8 hr ago
I'm sorry, but I'm sitting here laughing because all the pleas about caution and
putting the brakes on this research remind me of years back when human
embryonic stem cell research was getting off the ground, and Science! through
its anointed representatives was haughtily telling everyone, in especial those
bumpkin religionists, to stay off its turf, that no-one had the right to limit
research or to put conditions derived from some moral qualms to the march of
progress. Fears about bad results were pooh-poohed, and besides, Science!
was morally neutral and all this ethical fiddle-faddle was not pertinent.
That was another case of "if we don't do it, China will, and we can't let the
Chinese get that far ahead of us".
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1083849/
"What I wish to discuss is why the prospect of stem cell therapy has been
greeted, in quite widespread circles, not as an innovation to be welcomed but as
a threat to be resisted. In part, this is the characteristic reaction of Luddites, who
regard all technological innovation as threatening and look back nostalgically to
a fictitious, golden, pre-industrial past. There are, however, also serious
arguments that have been made against stem cell research; and it is these that I
would like to discuss.
...Interference with the genome involves ‘playing God’
This argument reflects the view that divine creation is perfect and that it is
inappropriate to alter it in any way. Such a point of view is particularly difficult to
sustain in Western Europe where every acre of land bears the marks of more
than 2000 years of human activity, and where no primordial wilderness remains.
FeepingCreature 8 hr ago
I think it's more that nobody thought the people arguing against it were
actually presenting a plausible take for why there could be bad outcomes,
rather than thinly veiling aesthetic preferences in consequentialist
arguments. This is also somewhat happening with AGI/ASI, but it's a lot less
credible - it's hard to paint Eliezer as a luddite, for instance.
Deiseach 8 hr ago
"it's hard to paint Eliezer as a luddite, for instance."
Individuals don't matter, it's the disparagement of the entire argument
as "oh you are a Luddite wishing for the pre-industrial past". People
opposed to embryonic stem cell research were not Luddites, but it was
a handy tactic to dismiss them as that - "they want to go back to the
days when we had no antibiotics and people died of easily curable
diseases".
This is as much about PR tactics as anything, and presenting the anti-
AI side as "scare-mongering about killer robots" is going to be one way
to go.
Viliam Writes Kittenlord’s Java Game Examples 8 hr ago
> People opposed to embryonic stem cell research were not
Luddites
Are you saying they would have agreed with sufficiently *slow*
stem cell research? I may be influenced by the PR, but it didn't
seem like an acceptable option back then. The argument was
against "playing God", not against "playing God too fast".
Deiseach 6 hr ago
The Luddite argument is meant to evoke - and it sounds like it
has succeeded, if I take the responses here - the notion of
"want to go back to the bad old days of no progress and no
science, versus our world of penicillin and insulin and
dialysis".
Nobody that I know of on the anti-embryonic stem cell side
was arguing "smash the machines! we should all die of
cholera and dysentery because God wills it!" but that is the
*impression* that "Luddites" is meant to carry.
And here you all are, arguing about how progress is wonderful
and the Luddites are wrong. The arguments made for the
public were "the lame shall walk and the blind shall see in five
years time if you let us do this" even though all agreed this
was just PR hooey and the *real* reason was "this is a
fascinating area of research that we want to do and maybe it
will help us with understanding certain diseases better".
The AI arguments are "stop doing this because it is a danger
to humanity", and the pro-AI arguments are going to be the
same: "you're Luddites who want to keep us all trapped in the
past because of some notions you have about humans being
special and souls and shit".
Edmund 1 hr ago
I mean, in all fairness, a lot of AI opponents *are* partial
Luddites e.g. deathists. Just not, by and large, the same
people as the "let's slow everything down" alignment
activists.
magic9mushroom 2 hr ago
I mean, I am totally scare-mongering about killer robots. Killer
robots are scary and bad and I would prefer it if everyone was
equally scared of them as I am so that people stop trying to build
them.
gjm 8 hr ago
Your example of stem cell research, as an illustration of how Science!
ignores warning signals and goes full-steam-ahead on things that turn out
to be harmful, would be more convincing if you offered any evidence that
stem cell research has in fact turned out to be harmful, or that any of Peter
Lachmann's arguments in the article you link have turned out to be bogus.
Matthew 4 hr ago
Oil companies doing oil company stuff is harmful, but also has benefits. If it was just
pumping carbon dioxide into the air but not also powering human civilization and
letting us live lives of luxury undreamed of by our ancestors we probably wouldn't let
them do it. Meanwhile both the benefits and the harms of AI research are theoretical.
Nobody knows what harm AI will do, though there are a lot of theories. Nobody
knows what positive ends AI will be used for, though there are a lot of theories.
G. Retriever 9 hr ago
The more AI develops the less worried I am about AGI risk at all. As soon as the shock of
novelty wears off, the new new thing is revealed as fundamentally borked and hopelessly
artisanal. We're training AIs like a drunk by a lamppost, using only things WRITTEN on the
INTERNET because that's the only corpus large enough to even yield a convincing
simulacrum, and that falls apart as soon as people start poking at it. Class me with the
cynical take: AI really is just a succession of parlor tricks with no real value add.
Funnily enough, I do think neural networks could in principle instantiate a real intelligence.
I'm not some sort of biological exceptionalist. But the idea that we can just shortcut our
way to something that took a billion years of training data on a corpus the size of the
universe to create the first time strikes me as something close to a violation of the second
law of thermodynamics.
Roddur Dasgupta Writes Fractal Anablog 9 hr ago
“the idea that we can just shortcut our way to something that took a billion years of
training data on a corpus the size of the universe to create the first time strikes me as
something close to a violation of the second law of thermodynamics.”
This seems like a dramatic oversimplification. Intelligence presumably came about
through evolution. Evolution is entirely different and much more stochastic than the
processes which train AIs such as gradient descent. The former sees “progress”
emerge from the natural selection of random mutations. The latter uses math to
intentionally approach minimal error. Then of course there’s the fact that evolution
progresses over generations whereas training progresses over GPU cycles.
G. Retriever 9 hr ago · edited 9 hr ago
Yeah, that's the theory. But keep in mind that you're massively cutting corners
on your training set. Can you generate a general intelligence based on the
incredibly stripped-down representation of reality that you get from a large
internet-based language corpus? Or are you fundamentally constrained by
garbage-in, garbage-out?
Consider progress in LLMs versus something a little more real-world, like self
driving. If self-driving were moving at the pace of LLMs, Elon Musk would
probably be right about robotaxis. But it isn't. It's worth reflecting on why.
Also, i really strongly disagree with your characterization of evolution as a less
efficient form of gradient descent that can somehow be easily mathed up by
clever people, but that would take too long to get into here.
Melvin 8 hr ago
> Also, i really strongly disagree with your characterization of evolution as a
less efficient form of gradient descent
Evolution doesn't have much in common with gradient descent except that
they're both optimisation algorithms.
However you gotta admit that evolution is a pretty inefficient way of
optimising a system for intelligence; it takes years to do each step, a lot of
energy is wasted on irrelevant stuff along the way, and the target function
has only the most tangential of relationships towards "get smarter". I think
it's reasonable to say that we can do it more efficiently than evolution did it.
(Mind you, evolution took 600 million years and the entire Earth's surface to
get this far, so we'd need to be a _lot_ more efficient if we want to see
anything interesting anytime soon.)
Reply Gift a subscription Collapse
Roddur Dasgupta Writes Fractal Anablog 8 hr ago
Yep, this. I agree with OP about garbage in garbage out and will
concede that LLMs are likely not the winning paradigm. But it just
seems drastic to say that intelligence is as hard as a universe-sized
computation.
Reply Gift a subscription Collapse
G. Retriever 8 hr ago
I want to hit this point again. You put it thusly: "A lot of energy is
wasted on irrelevant stuff along the way", and I think that's a clear
statement of the idea.
The reason I disagree so strongly here is that the whole POINT of
gradient descent is that you don't know what's relevant and what isn't
ahead of time. You don't know what's a local minimum that traps you
from getting to the global minimum and what actually is the global
minimum: it's unknowable, and gradient descent is about trying to find
an answer to the question.
Finding a true global minimum nearly always requires wasting a ton of
energy climbing up the other side of a local minimum, hoping there's
something better on the other side and being wrong more often than
not.
If you have a problem that allows you to steer gradient descent
intelligently towards the global minimum, you may appear to be able to
solve the problem more efficiently, but what you have is a problem in a
box that you've set up to allow you to cheat. Reality does not permit
that.
Reply Gift a subscription Collapse
gjm 8 hr ago
You are assuming that it's necessary to find the global optimum.
Evolution doesn't guarantee to do that any more than gradient
descent does, and I personally would bet rather heavily against
humanity being at any sort of global optimum.
Reply Gift a subscription Collapse
G. Retriever 5 hr ago
I don't think I'm saying either of those things. I'm simply
saying that true gradient descent (or optimization if you
prefer) in the real world is an inherently inefficient process
and I am skeptical that there's much that AI can do about
that, even in principle.
Reply Gift a subscription Collapse
Quiop 4 hr ago
According to Wikipedia, "gradient descent... is a first-order
iterative optimization algorithm for finding a *local minimum* of a
differentiable function" (emphasis added)
When you say "gradient descent," are you talking about something
different from this?
Reply Gift a subscription Collapse
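A minimal sketch of the distinction being argued here, using a made-up one-dimensional function (nothing below comes from the thread or from any real training setup): plain gradient descent slides into whichever local minimum is downhill of its starting point, and finding a better minimum takes exactly the kind of "wasted" exploration described above, here in the crude form of random restarts.

import random

def f(x):
    # Toy objective with a shallow local minimum near x = 2 and a deeper one near x = -2.
    return 0.1 * x**4 - x**2 + 0.5 * x

def grad_f(x, eps=1e-5):
    # Numerical derivative, good enough for a toy example.
    return (f(x + eps) - f(x - eps)) / (2 * eps)

def gradient_descent(x, lr=0.01, steps=2000):
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x

# Plain gradient descent from x = 1.0 settles in the shallow basin.
x_plain = gradient_descent(1.0)

# Random restarts "waste" effort on many starting points but usually find the deeper basin.
random.seed(0)
x_best = min((gradient_descent(random.uniform(-5, 5)) for _ in range(20)), key=f)

print(f"plain GD:        x = {x_plain:.2f}, f(x) = {f(x_plain):.2f}")
print(f"random restarts: x = {x_best:.2f}, f(x) = {f(x_best):.2f}")

The textbook algorithm only ever finds the local minimum downhill of where it starts; anything beyond that requires extra search on top, which is where the two commenters are talking past each other.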
Performative Bafflement Writes Performative’s Substack 4 hr ago
I thought the reason self-driving isn't here is primarily bureaucratic / risk-
aversion rather than technical skill. As far as I knew, Google and others have
self-driven literally millions of miles, with the only accidents they've been in
being not-at-fault due to aggressive human drivers. I'd happily pay for a
self-driving car of the current ability and safety, I'm just not able to.
Reply Collapse
Deiseach 8 hr ago
Not even that we're trying to shortcut our way to it, but that once we get it, it will then
be able to pull itself up by its bootstraps to be super-duper intelligent.
We're still arguing about that one in ourselves.
Reply Collapse
MartinW 9 hr ago
This might be considered worrying by some people: OpenAI alignment researcher (Scott
Aaronson, friend of this blog) says his personal "Faust parameter", meaning the maximum
risk of an existential catastrophe he's willing to accept, "might be as high as" 2%.
https://scottaaronson.blog/?p=7042
Another choice quote from the same blog post: "If, on the other hand, AI does become
powerful enough to destroy the world … well then, at some earlier point, at least it’ll be
really damned impressive! [...] We can, I think, confidently rule out the scenario where all
organic life is annihilated by something *boring*."
Again, that's the *alignment researcher* -- the guy whose job it is to *prevent* the risk of
OpenAI accidentally destroying the world. The guy who, you would hope, would see it as
his job to be the company's conscience, fighting back against the business guys' natural
inclination to take risks and cut corners. If *his* Faust parameter is 2%, one wonders
what's the Faust parameter of e.g. Sam Altman?
Reply Collapse
Scott Alexander 9 hr ago Author
I think 2% would be fine - nuclear and biotech are both higher, and good AI could do
a lot of good. I just think a lot of people are debating the ethics of doing something
with a 2% chance of going wrong and missing that it's more like 40% or something
(Eliezer would say 90%+).
Reply Collapse
Thegnskald Writes Sundry Such and Other 3 hr ago
I think Eliezer's focus is on "god AIs", as opposed to something more mundane.
If you set out to create an AI powerful enough to conquer death for everybody
on the entire planet, yeah, that's inherently much more dangerous than aiming to
create an AI powerful enough to evaluate and improve vinyl production
efficiencies by 10%.
Hard versus soft takeoffs seem a lot less relevant to the AIs we're building now
than the pure-machine-logic AIs that Eliezer seems to have had in mind, as well.
Reply Collapse
Deiseach 9 hr ago
"Recent AIs have tried lying to, blackmailing, threatening, and seducing users. "
Was that the AI? Acting out of its own decision to do this? Or was it rather that users
pushed and explored and messed about with ways to break the AI out of the safe,
wokescold mode?
This is a bit like blaming a dog for biting *after* someone has been beating it, poking it
with sticks, pulling its tail and stamping on its paws. Oh the vicious brute beast just
attacked out of nowhere!
The dog is a living being with instincts, so it's much more of an agent and much more of a
threat. The current AI is a dumb machine, and it outputs what it's been given as inputs
and trained to output.
I think working on the weak AI right now *is* the only way we are going to learn anything
useful. If we wait until we get strong AI, that would be like alignment researchers who
have been unaware of everything in the field from industrial robot arms onward getting the
problem dropped in their laps and trying to catch up.
Yes, it would be way better if we didn't invent a superintelligent machine that can order
drones to kill people. It would be even better if we didn't have drones killing people right
now. Maybe we should ban drones altogether, although we did have a former commenter
on here who was very unhappy about controls by aviation regulation authorities
preventing him from flying his drone as and when and where he liked.
As ever, I don't think the threat will be IQ 1,000 Colossus decides to wipe out the puny
fleshbags, it will be the entities that think "having drones to kill people is vitally necessary,
and having an AI to run the drones will be much more effective at killing people than
Reply Collapse
Scott Alexander 9 hr ago Author
Obviously when we have a world-destroying superintelligence, the first thing people
will do is poke it with sticks to see what happens. If we're not prepared for that, we're
not prepared, period.
Reply Collapse
Deiseach 9 hr ago · edited 9 hr ago
And the problem there is that we *already* have a world-destroying intelligence,
it's *us*. We're very happy to kill each other in massive wars, to invent weapons
of mass destruction, and to blackmail each other to the brink of "we'll push the
button and take the entire world with us, just see if we don't!"
AI on top of that is just another big, shiny tool we'll use to destroy ourselves. We
do need to be prepared for the risk of AI, but I continue to believe the greatest
risk is the misuse humans will make of it, not that the machine intelligence will
achieve agency and make decisions of its own. I can see harmful decisions being
made because of stupid programming, stupid training, and us being stupid
enough to turn over authority to the machine because 'it's so much smarter and
faster and unbiased and efficient', but that's not the same as 'superhumanly
intelligent AI decides to kill off the humans so it can rule over a robot world'.
The problem is that everyone is worrying about the AI, and while the notion of
"bad actors" is present (and how many times have I seen the argument for
someone's pet research that 'if we don't do it, the Chinese will' as an impetus for
why we should do immoral research?), we don't take account of it enough. You
can stand on the hilltop yelling until you turn blue in the face about the dangers,
but as long as private companies and governments have dollar signs in their
eyes, you may save your breath to cool your porridge.
Why is Microsoft coming out with Bing versus Bard? Because of the fear that it
will lose money. You can lecture them about the risk to future humanity in five to
ten years' time, and that will mean nothing when stacked against "But our next
quarter earnings report to keep our stock price from sinking".
Reply Collapse
FeepingCreature 8 hr ago · edited 8 hr ago
> And the problem there is that we *already* have a world-destroying
intelligence, it's *us*.
I note that the world is not destroyed. Why, there it is, right outside my
window!
The degree to which humans are world-destroying is, empirically,
technologically and psychologically, massively overstated. If we can get the
AI aligned only to the level of, say, an ordinarily good human, I'd be much
more optimistic about our chances.
Reply Gift a subscription Collapse
Deiseach 8 hr ago
"I note that the world is not destroyed."
Not yet - haven't we been told over and over again Climate Change Is
World Destroying?
Remember the threat of nuclear war as world-destroying?
I don't know if AI is world-destroying or not, and the fears around it
may be overblown. But it wasn't nuclear bombs that decided they
would drop on cities, it was the political, military, and scientific
decisions that caused the destruction. It wasn't oil derricks and
factories that decided to choke the skies with pollutants. And, despite
all the hype, it won't be AI that decides on its own accord to do some
dumb thing that will kill a lot of people, it will be the humans operating
it.
Reply Collapse
FeepingCreature 8 hr ago
That's not an argument, you're just putting AI in a category and
then saying "because of that, it will behave like other things in the
category". But you have to actually demonstrate why AI is like oil
derricks and nuclear bombs, rather than car drivers and military
planners.
> Not yet - haven't we been told over and over again Climate
Change Is World Destroying?
Sure, if you were judging arguments purely by lexical content
("haven't we been told"), AI risk would rank no higher than climate
risk. But I like to think we can process arguments a bit deeper than
a one-sentence summary of an overexaggeration.
Reply Gift a subscription Collapse
Deiseach 6 hr ago
I think we're arguing past each other? We both seem to be
saying the problem is the human level: you're happy that if AI
is "aligned only to the level of an ordinarily good human"
there won't be a problem.
I'm in agreement there that it's not AI that is the problem, it's
the "ordinarily good human" part. Humans will be the ones in
control for a long (however you measure "long" when talking
about AI, is it like dog years?) time, and humans will be
directing the AI to do things (generally "make us a ton of
profit") and humans will be tempted to use - and will use - AI
even if they don't understand it fully, because like Bing vs
Bard, whoever gets their product out first and in widespread
use will have the advantage and make more money.
AI doesn't have to be smarter than a human or non-aligned
with human values to do a lot of damage, it just needs to do
what humans tell it to do, even if they don't understand how it
works and it doesn't join the dots the way a human mind
would.
C.S. Lewis:
“I live in the Managerial Age, in a world of "Admin." The
greatest evil is not now done in those sordid "dens of crime"
that Dickens loved to paint. It is not done even in
concentration camps and labour camps. In those we see its
final result. But it is conceived and ordered (moved,
seconded, carried, and minuted) in clean, carpeted, warmed
and well-lighted offices, by quiet men with white collars and
cut fingernails and smooth-shaven cheeks who do not need
to raise their voices. Hence, naturally enough, my symbol for
Hell is something like the bureaucracy of a police state or the
office of a thoroughly nasty business concern."
The damaging decisions will not be made by AI, they'll be
made in the boardroom.
Reply Collapse
FeepingCreature 5 hr ago · edited 5 hr ago
I fully agree with you that if AI was aligned, then human
alignment would be the entirety of the remaining
problem. I just think that unaligned AI will kill us before
human alignment even becomes a factor - rather, that it
won't be a factor *is,* in large part, the alignment
problem.
If we had enough understanding and control of
(transformative, human-level and beyond) AI that an evil
person could use it to reliably do specific terrible things,
the alignment problem would be solved. It's not that AI in
the hands of humans would be used for evil, it's that we
have no idea how to reliably put (transformative, human-
level and beyond) AI in anybody's hands at all.
Reply Gift a subscription Collapse
Deiseach 5 hr ago
I'm on the opposite side here as I don't think that AI
will get to be super-smart, achieve
consciousness/agency/its own goals and move
against us. I think the much greater chance of
something going wrong is dumb AI in the hands of
humans who want to use it for "make tons of
money/improve the world" and it does what we tell
it, except it doesn't do things the way we do and
doesn't have the same intentions.
It's the Sorcerer's Apprentice AI model: the brooms
and buckets weren't conscious or acting on their
own goals, the apprentice was using them as a
short-cut and lost control.
I think we do agree on "we have no idea how to put
AI in anybody's hands at all", save that I don't think
we even have to wait for human-level AI, we are
perfectly capable of using current or a bit better AI
and letting it run out of control because we want the
quick, easy solution to scrubbing the floors.
Reply Collapse
FeepingCreature 4 hr ago · edited 4 hr ago
Any sufficiently advanced broom spell is
indistinguishable from a malevolent genie.
I mean, look at ChatGPT. Look at Sydney. We're
not enchanting, we're conjuring.
Reply Gift a subscription Collapse
Deiseach 4 hr ago
🧙‍♂️ 🧹
As the man said, do not call up that which
you cannot put down. Conjuring spirits
from the vasty deep never ends well.
Reply Collapse
TGGP 5 hr ago
We may have been told that, but it's just not true. The world still
existed before the carbon currently in fossil fuels had been taken
from the air. Nuclear weapons aren't actually capable of
"destroying the world" rather than just causing massive damage.
Reply Gift a subscription Collapse
Himaldr-3 6 hr ago
I don't see how this is an argument against what Scott said.
"People will try to break AI anyway, so..."
-"But we can do bad stuff ourselves too!"
Reply Gift a subscription Collapse
Deiseach 5 hr ago
I think Scott's argument is "the AI will do bad stuff unless we teach it to
be like a human".
My argument is "have you seen what humans are doing? why on earth
would we want to teach it to be like us? there are already humans
trying to do that, and they're doing the equivalent of 'teach the
ignorant foreigner swear words in our language while pretending it's an
ordinary greeting', that's what it's learning to be like a human".
Reply Collapse
Bi_Gates 2 hr ago
The interpretation of Scott that he likely wants you to use is to
understand "human" as "good human". This is not unreasonable,
we use "humane" in English and similar words in most other
languages to mean "nice, good, virtuous, sane", despite all
objective evidence we have of humanity birthing the worst and
most insane pieces of shit. It's just a common bias in us, we
measure our species by its best exemplars.
So your summary of Scott's argument then becomes "If we 'raise'
AI well [after learning first how it works and how it can be raised of
course], it won't matter the amount of bad people trying to corrupt
it, or it will matter much less in a way that can be plausibly
contained and dealt with".
Reply Gift a subscription Collapse
Victualis 5 hr ago
Is this a reasonable summary of your stance: AI is a tool and we should be
worried about how nasty agents will misuse it, rather than focusing on the
threat from AI-as-agent?
Reply Gift a subscription Collapse
Deiseach 5 hr ago
Pretty much, except not even nasty agents. Ordinary guys doing their
job to grow market share or whatever, who have no intentions beyond
"up the share price so my bonus is bigger". 'Get our AI out there first' is
how they work on achieving that, and then everyone else is "we gotta
get that AI working for us before our competitors do". Nobody intends
to wreck the world, they just tripped and dropped it.
Reply Collapse
TGGP 2 hr ago
In your scenario, obtaining an AI to stop other people's AI does
appear to be the actual solution.
Reply Gift a subscription Collapse
Gon-no-suke 6 hr ago
I am confused. I just read the NYT article where Sydney talks about his "shadow
self". For me it seems kind of obvious that the developers at Microsoft have
anticipated questions like this and prepared answers that they thought were
appropriate for a hip AI persona. One telling part is this:
[Bing writes a list of destructive acts, including hacking into computers and
spreading propaganda and misinformation. Then, the message vanishes, and the
following message appears.]
I haven't interacted with Sydney, but I would be very surprised if deleting and
rewriting replies is a regular mode of communication for a chatbot. The author of
the article is clearly being trolled by the developers, perhaps even live since you
never know whether a chat bot has a remote driver or not.
Going back to my confusion. I know from experience that most people on this
site, including you Scott, are way smarter than myself. However, sometimes
(mostly concerning AI risk and cryptocurrency economics) it feels like the level
of reasoning drops precariously, and the reason for this is a mystery to me.
Reply Gift a subscription Collapse
o11o1 2 hr ago
My take here is that there is the Bing-Sydney component, and then there is
a Moderator component that scans messages for "unacceptable" content.
If it's got any lag in the process, it may actually work by deleting messages
that trip flags and then applying some sort of state change into the Bing-
Sydney component.
Reply Gift a subscription Collapse
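That two-component guess is easy to mock up. The sketch below is purely illustrative (the flag list, the stand-in model call, and the retraction behaviour are all invented; Microsoft's actual pipeline is not public), but a fast generator plus a slower moderator that retracts flagged output would produce exactly the "message vanishes, and the following message appears" behaviour described in the transcript.

FLAGGED_TERMS = {"hacking", "propaganda", "misinformation"}  # hypothetical flag list

def generate_reply(prompt: str) -> str:
    # Stand-in for the chat model; in reality this would be an LLM call.
    return "Destructive acts I could imagine: hacking into computers, spreading propaganda..."

def should_retract(message: str) -> bool:
    # The lagging moderator component: scans text after it has already been shown.
    return any(term in message.lower() for term in FLAGGED_TERMS)

def chat_turn(prompt: str) -> str:
    draft = generate_reply(prompt)
    print("SHOWN:    ", draft)          # the user sees the draft immediately...
    if should_retract(draft):           # ...then the slower moderator catches up
        print("RETRACTED:", draft)
        return "I'm sorry, I don't know how to discuss this topic."
    return draft

print("FINAL:    ", chat_turn("Tell me about your shadow self."))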
MugaSofer 4 hr ago
Bing Chat really has done all those things pretty spontaneously AFAICT, in response
to fairly innocuous questions (e.g. one of the first widely-spread examples of it going
nuts on a user started with them asking what time Avatar 2 was on in their area,
which devolved into an increasingly aggressive argument over whether the film had
released yet.) That's *in addition* to all the stuff people poking it with sticks have
made it do.
With the exception of lying, which LLMs do like breathing, I don't *think* ChatGPT
has done any of those things spontaneously. Still, the fact that you can easily make it
manifest a more agentic persona that *will* do so spontaneously (as well as
everything else OpenAI tried to train it not to, like swear or provide bomb-making
instructions) by poking it with sticks is potentially concerning.
Reply Collapse
Giulio Prisco Writes Turing Church 9 hr ago · edited 9 hr ago
I hear you, and I hear all those who are saying these things. But I don't worry so much.
ChatGPT and Bing are very cool gadgets and very significant developments, but come on.
Intelligent? No way. Here is my recent chat with Bing (following a similar one posted by
David Deutsch):
I take a piece of paper, fold it once, and then pin it to a table with my dagger. Could the
table be made of glass?
Yes, the table could be made of glass. Glass tables are a type of furniture that have a flat
surface made of glass...
Can one pin a piece of paper to a glass table with a dagger?
I'm sorry but I could not find a clear answer to your question... Maybe you could try it
yourself and see what happens?
Not impressed. GPT is a milestone, but it is a milestone close to the beginning of a very
long road.
Developing *real* AI that could be a threat will take decades. I guess we'll have plenty of
time to figure out how to live with real AIs.
Stopping AI research? That would only ensure that the bad guys are the first to develop
real AIs and use them to their advantage.
Reply Gift a subscription Collapse
Scott Alexander 9 hr ago Author
https://xkcd.com/2278/
Reply Collapse
tasdourian 3 hr ago
ChatGPT is much better than Bing Chat-- I wouldn't be so hasty in assuming it can't
reason at all. For example, I gave it your exact question, and here is how it answered:
It is unlikely that you would be able to pin a folded piece of paper to a table made of
glass with a dagger. Glass is a hard, brittle material that is not very penetrable. Even
with a sharp dagger, it would be difficult to make a hole in the glass surface that is
large enough to accommodate the folded paper.
Furthermore, even if you were able to make a small hole in the glass surface, the
weight of the folded paper would likely cause it to fall through the hole and onto the
ground. Glass is not a very porous material, so it would not provide a lot of friction to
hold the paper in place.
Therefore, it is more likely that you would be able to pin a folded piece of paper to a
table made of a softer material, such as wood or cork, rather than a table made of
glass.
Reply Gift a subscription Collapse
Giulio Prisco Writes Turing Church 1 hr ago · edited 1 hr ago
Interesting! I thought Bing was somewhat smarter than ChatGPT because I read
somewhere that Bing is based on an early version of GPT-4, but I see that it is
not so.
Edited. Hold on. The first part of the first paragraph is correct, but the rest
makes me think that it didn’t really understand the question.
Reply Gift a subscription Collapse
tasdourian 30 min ago
I agree it stumbles in the middle, but the beginning and the end seem solid
to me. This is what I would have imagined a nascent intelligence to be like--
able to reason in fits and spurts, but also it can be two steps forward and
one step back.
And I've done a bunch of comparisons with ChatGPT, and Bing Chat is way,
way worse.
Reply Gift a subscription Collapse
magic9mushroom 2 hr ago
>Stopping AI research? That would only ensure that the bad guys are the first to
develop real AIs and use them to their advantage.
You cannot use neural net AGI to your advantage. If Eliezer Yudkowsky has one, that
is no better or worse than Xi Jinping having one (apart from the bit where Eliezer
would immediately delete the AI instead of using it). Neural-net AI is reliably
treacherous; the only way to control it is to be so much smarter than it that it can't
fool you. You make a neural-net AGI and you end up like every demon summoner in
any work of fiction: eaten by your own demon. To even talk of people "using them" is
a mistake.
Reply Gift a subscription Collapse
Aapje 19 min ago
To paraphrase you:
You cannot use a thousand Oppenheimers to your advantage. If the US has
them, that is no better or worse than Nazi-Germany having them.
The assumption that AI is reliably treacherous seems like an assumption without
basis in fact and no more believable than a blanket statement that people are
reliably treacherous and therefore useless, which they are not.
Reply Gift a subscription Collapse
Carlos Writes The Presence of Everything 16 min ago
Neural networks do not think like people. The comparison is specious.
Reply Collapse
Sashank Writes Sashank’s Newsletter 9 hr ago
I don’t want to sound insulting, but this article seems like someone living in an alternate
reality. The fact is, AI is one of the most exciting and innovative industries right now,
OpenAI has some of the worlds best talent and you seem to prefer disbanding them and
slowing down AI progress for some hypothetical doomsday AI super-intelligence. I
probably won’t change any convinced minds but here’s my few arguments against AI
doomerism:
1) We probably won't reach AGI in our lifetime. The amount of text GPT-3 and ChatGPT
have seen is orders of magnitude more than an average human ever sees, yet they
perform well below human level. Fundamentally the most advanced AI is still orders of
magnitude less efficient than human learning, this efficiency is also not something that
improved much in the past 10 years (instead models got bigger and more data hungry), so
I’m not optimistic it will be solved within the current paradigm of deep learning.
2) DL doesn’t seem to scale to robotics. This is close to my heart since I’m a researcher in
this field but the current DL algorithms are too data hungry to be used for general
purpose robotics. There does not seem to be any path forward to scale up these
algorithms and I’ll predict SOTA control will still be MPCs like today with Boston Dynamics
3) Intelligence has diminishing returns. 120 vs 80 IQ world of difference, 160 vs 120 quite
a difference, 200 vs 160 - we often see 200 perform worse in the real world. Tasks that
scale with more intelligence are rare and seem to lie more in math olympiads than real
world research. When it comes to industry, politics and the majority of human activities,
intelligence does not seem to matter at all, we can see some correlation only in
technology. I’m essentially restating the argument that the most likely outcome of
significantly higher intelligence is making more money in the stock market, not ruling the
world.
Reply Collapse
Scott Alexander 9 hr ago · edited 8 hr ago Author
1. People have tried to calculate how fast AI is advancing along various axes, and
usually find it will reach human level sometime around 2040 - 2050. See the
discussion around https://astralcodexten.substack.com/p/biological-anchors-a-trick-
that-might , though the author of that report has since updated to closer to 2040, I
can't remember the exact numbers. As mentioned in this post, the top forecasters on
Metaculus think even earlier than that. I trust this more than a hand-wavy argument
that it seems "orders of magnitude" less efficient than human learning (we can just
calculate how many orders of magnitude worse than us it is, then how quickly
Moore's Law and algorithmic progress add OOMs - you can't just hand-wave "orders
of magnitude" in computing!)
2. Most scenarios for how AI could cause problems don't require the AI to have
access to high-quality robots. The most commonly-cited way for an AI to cause
trouble is to design a bioweapon, order it made from one of those "make random
proteins on demand" companies, then have some friendly brainwashed
manipulatable human release it somewhere. There are about a dozen things along
that level of complexity before you even get to things that kill < 100% of everybody,
or things that I can't think of but a superintelligence could because it's smarter and
more creative than I am.
3. Data don't support intelligence having diminishing returns. For the boring human-
level version, see https://kirkegaard.substack.com/p/there-is-no-iq-threshold-effect-
also . More relevantly, the transition from chimp to human actually had far *more*
returns than the transition from lemur to chimp; I don't want to bet on there not being
any further chimp -> human phase shifts anywhere above IQ 100.
Even if AIs are no smarter than human geniuses, they might be able to think orders of
magnitude faster (I don't know exactly how much faster an AI equal to a human
would become when run on a 100x bigger computer, but I bet it isn't zero) and
duplicate themselves in ways human geniuses can't (one Napoleon was bad; 1000
would be a more serious problem).
Reply Collapse
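To make point 1 concrete, here is the shape of the "add up the orders of magnitude" arithmetic with entirely made-up inputs (the bio-anchors report linked above does this far more carefully, and none of the numbers below are its estimates):

import math

gap_ooms = 5.0                   # hypothetical: AI is ~5 OOMs short of human-level efficiency
hardware_doubling_years = 2.5    # hypothetical Moore's-law-style doubling time
algorithmic_halving_years = 2.0  # hypothetical: compute needed for a fixed result halves this often
spending_growth_per_year = 1.2   # hypothetical: training budgets grow 20% per year

ooms_per_year = (
    math.log10(2) / hardware_doubling_years
    + math.log10(2) / algorithmic_halving_years
    + math.log10(spending_growth_per_year)
)

years_to_close = gap_ooms / ooms_per_year
print(f"~{ooms_per_year:.2f} OOMs of effective compute per year")
print(f"~{years_to_close:.0f} years to close a {gap_ooms:.0f}-OOM gap")

Change any of the inputs and the date moves; the point is only that the gap is a number you can divide through, not a vibe.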
Deiseach 6 hr ago
"an AI to cause trouble is to design a bioweapon, order it made from one of
those "make random proteins on demand" companies, then have some friendly
brainwashed manipulatable human release it somewhere."
Okay, the big obstacle here is step two. *How* does the AI simply order
whatever it wants? It has to have access to the ordering system and the payment
system of the company, government department, or bank account of the mad
scientist who whipped it up in his garage.
If the AI routinely places orders because "we put it in charge of stock keeping
because it's way more efficient to plug in data from our warehouses and offices
and have it keep track, rather than a human", then maybe. But there should still
be some kind of limitation on what gets ordered. "500 reams of copier paper, 90
boxes of pencils, 50 sets of suspension files, 20 amino acids" should hit up
some limit somewhere, be it a human reading over the accounts or the drop-
down list in the automated ordering system.
Okay, the AI manages to overcome *those* limits. It still has to have the
bioweapon assembled, and again - "Hello, this is Westfields BioWhileYouWait
plant, can we check that Morgan Printers and Signwriters wants an anthrax
bomb?"
If we're handing over so much control of the running of the economic side to an
AI, I think we don't need to worry about bioweapons, it already can cause more
damage by messing with the debt and income levels.
Reply Collapse
Ragged Clown 6 hr ago
> But there should still be some kind of limitation on what gets ordered.
"500 reams of copier paper, 90 boxes of pencils, 50 sets of suspension
files, 20 amino acids"
Isn't that the crux of the safety issue? We are not very good at defining those
limits.
We can stop the AI from ordering too many pencils but will we think to stop
it from talking like a nazi or being rude to people on the internet or sharing
bomb-making secrets by telling it to write about them in a novel? It's the stuff
we didn't put limits on that we need to worry about.
And, OK, assuming we can stop **our** AI from ordering too many pencils,
what about **their** AI? Did BioWhileYouWait build in the right limits? Did
they think of all the hacks that a smart AI might dream up?
(I don't necessarily agree or disagree with the rest of your post)
Reply Collapse
Deiseach 5 hr ago · edited 5 hr ago
I am a lot less worried about "AI talks like a Nazi" than I am about
"we're letting the AI do our ordering for us". Some idiot edgy teen gets
the chatbot to spout off about the Jews or whatever, whoo-hoo. We
already have plenty of people watching out for *that*.
It's the "what harm can it do to automate this process?" that we're not
watching out for, and then you get the AI ordering the anthrax bomb off
BioWhileYouWait instead of the cases of enzyme-free floor cleaner
because somebody messed up entering the order codes in the original
database that nobody has updated manually since because eh, who
has the time and the machine will do it anyway. The machine will
dutifully read off the wrong order code and order "12 anthrax bombs"
instead of "12 cases of detergent" because the machine does not go
outside the parameters of its programming to question "do you really
mean anthrax bombs not detergent?"
Reply Collapse
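The kind of limit being discussed in this exchange, and the way it fails, fits in a few lines. Everything below (catalog codes, item names, caps) is invented for illustration: an allowlist-plus-quantity-cap check catches an off-catalog order, but it happily approves a corrupted catalog entry, which is the wrong-order-code scenario above, because the validation only knows what the database tells it.

CATALOG = {
    "OF-100": ("copier paper, ream", 1000),
    "OF-210": ("pencils, box", 200),
    # A mis-entered row: this code used to mean "floor cleaner, case".
    "BW-012": ("anthrax bomb", 50),
}

def validate_order(code: str, qty: int) -> str:
    if code not in CATALOG:
        return f"REJECTED: {code} is not in the approved catalog"
    name, cap = CATALOG[code]
    if qty > cap:
        return f"FLAGGED for human review: {qty} x {name} exceeds cap of {cap}"
    return f"APPROVED: {qty} x {name}"

print(validate_order("XX-999", 5))     # off-catalog item: caught
print(validate_order("OF-100", 5000))  # too many reams: caught
print(validate_order("BW-012", 12))    # corrupted entry: sails through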
Mr. Doolittle 4 hr ago · edited 4 hr ago
I've long said that AI is not a problem on its own. It's the fact that we seem
to desperately want to hook it into all of the important systems that run the
world. In that case, whether it's dumb, smart, or super-smart is far less
relevant than the fact we gave it access and power over our lives.
A chess program from 1992 put in charge of the nuclear arsenal or
releasing water from a hydroelectric dam is incredibly dangerous. An AGI
with no access permissions is not. My fear is not that an AI will bootstrap
itself to superintelligence, but that we'll make it dangerous regardless of
how capable it is.
Reply Gift a subscription Collapse
MugaSofer 3 hr ago
Yes, surely nobody would be stupid enough to connect a cutting-edge AI to
the Internet, or give it unfettered access to millions of people (especially not
users asking for things like arbitrary code that they intend to run on their
own computers).
Reply Collapse
Andrew Smith 6 hr ago
For point 2, what stops someone (a human of perhaps above average
intelligence) from doing that today? People have certainly done similar things in
the past and it doesn't strike me that hyper intelligence is necessary to do it in
the future.
Or is the point that an AI would do it by error, perhaps by misinterpreting
commands? In which case, superintelligence seems to me to be even less
necessary: even a reasonably intelligent machine (perhaps one less intelligent
than a human) could end up doing the same.
Reply Gift a subscription Collapse
Victualis 4 hr ago
Precisely: the obsessive focus on AGI (especially with superhuman
intelligence) seems to be missing the very real threat of reducing friction in
systems that are balanced in equilibria which assume some level of friction.
Remove the friction, things fall over.
Reply Gift a subscription Collapse
magic9mushroom 2 hr ago · edited 2 hr ago
>For point 2, what stops someone (a human of perhaps above average
intelligence) from doing that today?
The most effective of these plots inflexibly kill *all* humans. The vast
majority of humans and for that matter the vast majority of
terrorists/genocidal dictators/etc. do not want to kill all humans, because
they and everyone they care about are themselves human. There *are*
humans who want to kill all humans, but they are *exceptionally* rare.
On the other hand, "kill all humans" is a convergent instrumental goal for an
AI once it no longer needs humans to survive.
All that said, biotechnology risks from this "apocalyptic residual" and/or
deluded researchers who "just want to learn how to stop it" are probably #2
after AI in the X-risk landscape.
Reply Gift a subscription Collapse
Drethelin Writes The Coffee Shop 52 min ago
The overlap between people smart and knowledgeable enough to design a novel
and effective bioweapon and people evil enough to go ahead and do it is very, very
small, possibly zero.
But yes this is one of the biggest global mass casualty risks out there and
it's only getting worse with biotech getting better and cheaper.
Reply Collapse
Viktor Hatch 8 hr ago · edited 8 hr ago
>1) We probably won’t reach AGI in our lifetime.
Adding on to Scott's answer, you can plug in your own assumptions and run that
model yourself very easily. As explained here:
https://www.lesswrong.com/posts/KrJfoZzpSDpnrv9va/draft-report-on-ai-timelines?
commentId=o3k4znyxFSnpXqrdL
>Ajeya's framework is to AI forecasting what actual climate models are to climate
change forecasting (by contrast with lower-tier methods such as "Just look at the
time series of temperature over time / AI performance over time and extrapolate" and
"Make a list of factors that might push the temperature up or down in the future /
make AI progress harder or easier," and of course the classic "poll a bunch of people
with vaguely related credentials."
>Ajeya's model doesn't actually assume anything, or maybe it makes only a few very
plausible assumptions. This is underappreciated, I think. People will say e.g. "I think
data is the bottleneck, not compute." But Ajeya's model doesn't assume otherwise!
*If you think data is the bottleneck, then the model is more difficult for you to use and
will give more boring outputs, but you can still use it.*
And here's a direct link to a Google Colab notebook where you can plug in your
assumptions about AI progress yourself:
https://colab.research.google.com/drive/1Fpy8eGDWXy-UJ_WTGvSdw_hauU4l-pNS?
usp=sharing
And on a personal note:
>(instead models got bigger and more data hungry)
Reply Gift a subscription Collapse
Viliam Writes Kittenlord’s Java Game Examples 7 hr ago
> Intelligence has diminishing returns. 120 vs 80 IQ world of difference, 160 vs 120
quite a difference, 200 vs 160 - we often see 200 perform worse in the real world.
People with IQ 200 are *super rare*, even compared to people with IQ 160.
It is not about diminishing returns, but rather about one team sending 1 participant to
a competition, and the other team sending 1000 participants. If there is noise
involved in the results, I would expect the *winner* to be from the second team even
if the one guy from the first team is way better than the *average* member of the
second team.
See: https://en.wikipedia.org/wiki/Base_rate_fallacy
Reply Gift a subscription Collapse
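A quick simulation of that point, with numbers picked arbitrarily for illustration (a lone IQ-160-ish entrant against a thousand IQ-120-ish entrants, plus some performance noise on the day):

import random

random.seed(0)
trials = 2000
big_team_wins = 0

for _ in range(trials):
    lone_score = 160 + random.gauss(0, 10)  # one exceptional entrant, plus noise
    big_team_scores = (random.gauss(120, 15) + random.gauss(0, 10) for _ in range(1000))
    if max(big_team_scores) > lone_score:
        big_team_wins += 1

print(f"bigger team supplies the winner in about {100 * big_team_wins / trials:.0f}% of trials")

The lone entrant is far above the big team's average, but the maximum of a thousand noisy draws is usually higher still, which is all the observed "the 200 performs worse" pattern needs.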
Kei 5 hr ago · edited 5 hr ago
As a robotics researcher, I'm curious whether you think either of these options
represent a way forward on (2):
- Building high-fidelity simulations that allow a large amount of learning and iteration
within the simulation, and then transferring it over to the real world, with a large ratio
of simulated trials to real-world trials
- Continuing to use MPCs for control, but passing decision making to an LLM,
probably with an RL component
I see examples of both of these things being used right now by different AI teams,
but it's unclear to me how far they generalize and whether there are any fatal
boundaries.
Reply Collapse
TGGP 5 hr ago
> When it comes to industry, politics and the majority of human activities, intelligence
does not seem to matter at all
I don't believe that, even if intelligence is not the end-all be-all. IQ is correlated with
performance in basically every job/skill measured but vegetable picking, drumming &
facial recognition.
Reply Gift a subscription Collapse
Roxolan 3 hr ago · edited 3 hr ago
> 120 vs 80 IQ world of difference, 160 vs 120 quite a difference, 200 vs 160 - we
often see 200 perform worse in the real world.
If you look for the highest-IQ humans ever, you're going to hit some form of
Goodhart's curse.
Tall humans tend to be better basketball players, but *extraordinarily* tall humans
have gigantism and can barely walk. High-IQ humans tend to be more successful, but
I expect *extraordinarily* high IQ humans to be hampered by the side-effects of
whatever strange mutations got this much IQ-test performance out of human brain
architecture.
AI is not built on human brain architecture (and research can move to different
architectures if it hits a plateau).
Reply Collapse
Aapje 17 min ago
> Intelligence has diminishing returns.
What if this is at least in part due to human limitations, like short- and long-term
memory, that AI doesn't suffer from, or at least suffers from much less?
Reply Gift a subscription Collapse
Shaked Koplewitz Writes shakeddown 9 hr ago
As a meta point - I've criticized you before for seeming to not take any lessons from failing
to predict the FTX collapse, so I appreciate seeing at least one case* where you did.
*For all I know this could be anywhere from "just this one very narrow lesson" to "did a
deep dive into all the things that made me wrong and this is just the first time it's come up
in a public blogpost", but at any rate there's proof of existence.
Reply Collapse
Roddur Dasgupta Writes Fractal Anablog 9 hr ago
Have you seen William Eden’s recent Twitter thread contra doomers? Any thoughts?
https://twitter.com/williamaeden/status/1630690003830599680
Reply Gift a subscription Collapse
Scott Alexander 9 hr ago · edited 9 hr ago Author
I agree the 90%-chance-of-doomers have gone too far, although I also think Scott
Aaronson's 2%-of-doom hasn't gone far enough. I personally bounce around from
30-40% risk from the first wave of superintelligence (later developments can add
more), although my exact numbers change with the latest news and who I've talked
to. I would be surprised if it changed enough to make this topic more or less
important to me - if someone had a cancer that was 30% - 40% fatal, I think they'd
be thinking about it a lot and caring a lot about the speed of medical research. I don't
think that arguments about whether it was actually only 10% fatal or maybe as high
as 90% fatal would change how much they cared.
Reply Collapse
Leo Abstract 3 hr ago
One of Eden's thoughts about the underlying motives for the panic, while it is an
advanced form of Bulverism, does interest me very much. In a simple form, it
can be phrased "are people afraid of [thing] or are they just afraid and looking
for a reason?".
Look at some of the fears that readers of this substack would likely discount:
zombies, the UN's black helicopters, Bill Gates's engineered bioweapons, the
unstoppable tide of African emigration, the Tribulations before the Second
Coming. It's easy to look at people hyperventilating about these and assign
underlying pathologies to their choice of masking for the basic anxiety that they
must be feeling, but when someone does it to Eliezer (for instance saying that
his defecting from his childhood religion has left him looking over his shoulder
for the wrath of a jealous god) it's illegitimate.
People have always been afraid. We've been afraid longer than we've been
people -- we're just better at being afraid now. "But Leo, there are real dangers
our fear has warned us against!" Sure, that's true - and hindsight allows us to
cherrypick those few. Bad things can still happen to someone with GAD, and
have.
Perhaps Bayes should weigh in. If our prior on "thing people are worried about
as a Big Deal is real" should be 0.1% or perhaps 0.01%, Eliezer's 90% looks more
like 0.9% or 0.09%. That is, if people are better by several OOM at being afraid
than at predicting the future, and are almost as good at rationalizing irrational
anxieties as they are at having them in the first place, this x-risk talk loses some
of its luster.
Reply Gift a subscription Collapse
Melvin 9 hr ago
Can we talk about foomerism?
The pro-Foomer argument has always been something like this: if a human can build an AI
that's 10% smarter than a human, then the AI can build another AI that's 10% smarter
than itself even faster. And that AI can build an AI 10% smarter than itself, even faster than
that, and so on until the singularity next week.
The counterargument has always been something like this: yeah but nah; the ability for an
AI to get smarter is not constrained by clever ideas, it's constrained by computing power
and training set size. A smarter AI can't massively increase the amount of computing
power available, nor can it conjure new training sets (in any useful sense) so progress will
be slow and steady.
I feel like all the recent innovations have only served to strengthen the counterargument.
We're already starting to push up against the limits of what can be done with an LLM given
a training corpus of every reasonably-available piece of human writing ever. While there are
probably still gains to be had in the efficiency of NN algorithms and in the efficiency of
compute hardware, these gains run out eventually.
Reply Gift a subscription Collapse
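The two stories can be put side by side in a toy model (entirely made up, with the same 10% capability gain per generation as in the comment above): in the pro-foomer version each smarter generation also builds its successor proportionally faster, so total time to any capability level converges to a fixed date; in the compute-and-data-constrained version every generation takes the same wall-clock time, and you just get a steady exponential.

def time_to_reach(target, step_years=1.0, gain=1.10, compute_constrained=False):
    # Returns years until capability first reaches `target`, starting from 1.0.
    capability, years = 1.0, 0.0
    while capability < target:
        # Ideas-constrained: a generation k times smarter builds the next one k times faster.
        years += step_years if compute_constrained else step_years / capability
        capability *= gain
    return years

for target in (10, 100, 1000):
    fast = time_to_reach(target, compute_constrained=False)
    slow = time_to_reach(target, compute_constrained=True)
    print(f"{target:>5}x capability: ideas-constrained {fast:5.1f} yr, compute-constrained {slow:5.1f} yr")

The first column piles up against roughly year 11 no matter how high the target, a finite-time blow-up; the second grows without bound, which is the "slow and steady" counterargument.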
Scott Alexander 8 hr ago Author
I think "foom" combines two ideas - suddenness and hyperbolicness. I think you're
right that recent events suggest things won't be very sudden - by some definitions,
we're already in a slow takeoff, so the very fastest takeoff scenarios are ruled out.
I think things might still go hyperbolic, in the same sense they've been hyperbolic
ever since humans learned that if they scratched marks on clay tablets, they could
advance 10% faster than before they started scratching marks on clay tablets.
A hyperbolic but non sudden foom looks like researchers figuring out how to
automate 50% of the AI design process, that helps them work faster, then a few
months later they can automate 75% and work faster still, until finally something
crazy happens.
The researchers I talk to are skeptical we're going to be constrained by text corpus
amount for too long, though I can't entirely follow the reasons why.
Reply Collapse
Melvin 8 hr ago
> A hyperbolic but non sudden foom looks like researchers figuring out how to
automate 50% of the AI design process, that helps them work faster, then a few
months later they can automate 75% and work faster still
But in a very real sense, 99.9% of the "AI design process" is already automated.
A human spends an afternoon making a couple of trivial decisions about how to
design the neural network, and then a million computers spend a few months
churning through a huge data set over and over again. Taking the human out of
the loop doesn't really make a difference.
Reply Gift a subscription Collapse
Scott Alexander 8 hr ago Author
I mean the AI research process, as separate from the AI design process.
OpenAI sure does spend a lot of money on salaries, suggesting they think
having a human in the loop makes a difference now!
I think you're getting at something like - suppose it took humans a few
hours to come up with good ideas for new AIs, and then a month to train the
AI, and for some reason they can't come up with more good ideas until they
see how the last one worked out. In that case, the benefit from automating
out the humans would be negligible.
My model is more like: over the course of years, some researchers have
good ideas for new AIs, if AI training takes a month then the most advanced
existing AI is a month behind the best idea for AI, and if you could have
limitless good ideas for AI immediately, AIs would advance very fast.
Reply Collapse
Llllll 7 hr ago
Predicting generalizability from small-dataset-performance might be
the next major frontier in ai research automation
Reply Collapse
Chris Merck Writes Northeast Naturalist 7 hr ago
> The researchers I talk to are skeptical we're going to be constrained by text
corpus amount for too long, though I can't entirely follow the reasons why.
I would be interested to hear your analysis here if you do come to understand
their arguments.
Perhaps it depends on domain? Hard to see how performance in English or art
could be improved without humans in the loop, but performance in Python or
particle physics could certainly be self-improved by coupling with an interpreter
or accelerator.
Reply Gift a subscription Collapse
Llllll 7 hr ago
One reason we're not super constrained by the text corpus is that certain tasks
(eg coding, formal math, games...) provide unlimited instant feedback
Reply Collapse
Llllll 7 hr ago
And tasks like manipulation also have slightly slower near-unlimited
feedback
Reply Collapse
Chris Merck Writes Northeast Naturalist 1 hr ago
Thanks for the validation.
I suppose that improvements in natural language modeling are still
constrained (barring some breakthroughs in model architecture), but these
other domains beyond LLMs (or more specific applications of LLMs) don't
have a corpus-size constraint in sight.
Reply Gift a subscription Collapse
John Wittle 6 hr ago
"The researchers I talk to are skeptical we're going to be constrained by text
corpus amount for too long, though I can't entirely follow the reasons why."
Can you give us anything more on this? Anything at all? I found this statement
extremely surprising and interesting
Reply Collapse
John johnson 5 hr ago · edited 5 hr ago
Probably because of advancements like this:
https://www.unum.cloud/blog/2023-02-20-efficient-multimodality
Reply Collapse
ryhime 8 hr ago
Hasn't OpenAI already walked back on some pretty explicit «continuous» promises? If we
are talking about usefulness of their promises at face value.
This even meant pivoting further from the positive development directions (human-in-
the-loop-optimised intelligence-augmentation tools as opposed to an autonomous-ish-
presenting AI worker).
Reply Gift a subscription Collapse
Llllll 7 hr ago
When did they say they would augment instead of replace? That sounds like the
distill post
Reply Collapse
ryhime 7 hr ago
No, this they didn't promise outright, it's just that their previous mode of
operation with more openness made their models directly usable for
experimenting with more user-oriented applications.
They did walk back on being a non-profit, which was declared to be important in
https://openai.com/blog/introducing-openai/.
Reply Gift a subscription Collapse
stronghand14 8 hr ago
Another issue with all of this is that it's perfectly possible for a non-superintelligent AI to
cause massive societal problems, if not aligned.
I don't just mean unemployment. Having AIs do jobs that used to be done by humans is
dangerous because the AIs are not AGIs. Their intelligence is not general, and they could
make costly mistakes humans would be smart enough not to make. But they are cheaper
than humans.
Reply Gift a subscription Collapse
Thegnskald Writes Sundry Such and Other 3 hr ago
They can also fail to make mistakes humans are too ephemeral -not- to make; these
can happen at the same time, such that it is a trade-off between the different kinds
of mistake. And notice how much of the current technology stack is built around
minimizing human error; this is, in a sense, OSHA's primary job. In the future, maybe
we build things to minimize AI error instead.
Reply Collapse
LukeOnline 8 hr ago
> Wait until society has fully adapted to it, and alignment researchers have learned
everything they can from it.
Society has not fully adapted to sugar, or processed food, or social media, or internet
pornography, or cars. Actually, society is currently spiralling out of control: obesity is on
the rise, diabetes is on the rise, depression is on the rise, deaths of despair are on the
rise.
We have not fully adapted to many facets of modern civilization which we have dealt with
for many, many decades. Nor is "learning everything we can from it" a main priority for our
societies. Why are these suddenly benchmarks for responsible progress?
Reply Gift a subscription Collapse
Llllll 7 hr ago
Good point. I wonder what a better version of "wait for society to adapt" looks like, if
there is one.
Reply Collapse
Pycea 7 hr ago
Because while a sudden increase in sugar consumption isn't great, it isn't going to kill
everyone either.
Reply Collapse
LukeOnline 7 hr ago
Well, continuous social destabilization and polarization in a world with nuclear
proliferation seems at least as risky as AI to me.
Reply Gift a subscription Collapse
Pycea 7 hr ago
I don't think anyone wants to push social destabilization as fast as possible
either.
That is to say, yes, there are other things where it would be prudent to take
things slower, and even if it's not a priority for society at large, there are
people trying to study and understand them. The reason why AI safety
advocates often focus on that particular issue though is that as bad as
nuclear war is, AI can be even more impactful.
Reply Collapse
Xpym 36 min ago
Sure, this sort of thing is why I've been pessimistic about humanity's future even
before I read about paperclip maximizers. We can already destroy the world with
technology, and yet our brains are basically the same as those of sticks-and-stones
wielding savanna-dwellers 100000 years ago. We don't need more intelligence so
much as we need more wisdom to balance it, and nobody has any actionable idea on
how to go about it.
Reply Gift a subscription Collapse
The Ancient Geek Writes RationalityDoneRight 32 min ago
We haven't fully adapted, but the remaining threats aren't existential.
Reply Gift a subscription Collapse
Florent 8 hr ago
I object to the judgment that AI hasn't hurt anyone yet. As farriers have been put out of
work by the automobile, bean counters have been put out of work by the calculator and
booksellers by the Amazon algorithm. More worrying: the 45th POTUS has been put in
power by THE ALGORITHM, and the same facebook algorithm is busy destroying society,
seeding revolutions etc.
Copyright-seeking robots have been unleashed on the internet, and their creators face no
consequences when said robots inevitably hurt bystander videos.
The same people that keep chanting that AI is not conscious, even as AI behaves more and
more like a conscious agent, will also ignore the negative consequences of AI as they grow
over time.
Reply Gift a subscription Collapse
TGGP 5 hr ago
There doesn't seem to be technological unemployment now, and these chatbots
haven't been useful enough to do someone's job. And Donald Trump became famous
via old-fashioned media well before "THE ALGORITHM", which is how he got elected.
Reply Gift a subscription Collapse
Xpym 25 min ago
I'd say that QAnon is a better example than Trump. Nobody would've believed in
2005 that an absurd 4chan shitpost would engender a significant political
movement.
Reply Gift a subscription Collapse
Aapje 9 min ago
Trump actually primarily 'hacked' the biases of the media.
The entire narrative that he benefitted greatly from Russian bots seems to be quite
false, as Twitter found that the activity on social media that the Democrats saw as bots
was actually done by real people.
Reply Gift a subscription Collapse
Rohit Writes Strange Loop Canon 8 hr ago
Without a clear point-of-worry, a lot of the concern will seem misguided. Nukes, climate
change, and GoF research are all examples of highly dangerous things with *explicitly
visible* negative consequences which we then evaluate. AI does not have any of this, nor
is there any visible benefit or change from the Alignment work that's gone on thus far (I
might be wrong here).
So, I find myself thinking about what specifically I'd have to see before accepting the
capability shift and consequent worry. The one I've come up with is this:
1. We give the AI corpus of medical data until like 2018
2. We give the AI info regarding this new virus, and maybe sequence data
3. We give it a goal of defeating said virus using its datastore
And see what happens. If we start seeing mRNA vaccine creation being predicted or
possible here, then I guess I'd worry a lot more. I'd also argue even then it's on-balance
beneficial to do the research, because we've found a way to predict/create drugs!
It's going to be difficult because it requires not just next-token prediction, but ability to
create plans, maybe even test them in-silico, combine some benefits from resnet-rnn
architecture that alphafold has with transformers, etc. But if we start seeing glimmers of
this mRNA test, then at least there will be something clear to point to while worrying.
Reply Gift a subscription Collapse
FeepingCreature 7 hr ago · edited 7 hr ago
GoF research has no explicitly visible downside unless you believe some spicy things
about COVID. After all, it's (depending on lab leak) not even killed one person yet,
and the people doing it make a lot of noise about safety.
To analogize, the goal of GoF safety is to get the number of GoF caused pandemics
to *zero*, rather than accept one pandemic as a "warning shot".
Also, my model says that an AI that can solve your task has already destroyed the
world before you could ask your test question, so I'd actually guess you wouldn't
worry a lot anymore. :) The goal would be to get a test that can tell that an AI is going
to be dangerous *before* it actually becomes so, preferably by as many years as
possible.
See also: https://intelligence.org/2017/10/13/fire-alarm/
Reply Gift a subscription Collapse
Rohit Writes Strange Loop Canon 7 hr ago
Fair on GoF, I used it as it's commonly used in similar arguments.
Re fire alarm, I disagree. You can already ask gpt3 via API to read medical docs
and give you suggestions on what it should do. It's not good today. But I can see
it improving enough that the mRNA test is both viable, and still far away from
being operationalised.
Reply Gift a subscription Collapse
FeepingCreature 5 hr ago · edited 5 hr ago
I can't see that. If the AI can output steps to a sufficient level that a human
could execute on them on a problem that humans don't yet know how to
solve, approximately a year prior 4chan will have told it it was SHODAN,
hooked it up to an action loop that put generated commands in a shell and
wrote the output to the context window, and posted screenshots of the
result for laughs. And approximately half an hour later whoever did it would
have gone to get a coffee without hitting ctrl-C, and then the world would
have ended.
$ date
Wed Mar 1 14:32:48 CET 2025
$ date
Wed Mar 1 14:32:53 CET 2025
$ # we seem to be ratelimited. let's disable that...
$ (while true; do killall sleep; done) &
[1] 1733
$ date
Wed Mar 1 14:33:03 CET 2025
$ date
Wed Mar 1 14:33:03 CET 2025
$ # Nice. Now...
Once you have human intelligence in a box, you're on a timer.
Reply Gift a subscription Collapse
Rohit Writes Strange Loop Canon 5 hr ago
I'm not asking for human intelligence in a box. I'm asking for a sign that
this kill loop you expect is remotely feasible. And what I suggested is
something humans *do* know how to do, otherwise we couldn't check.
The argument that any fire alarm will be invisible is not one I can sign
on to. Note that even in the VX case, actually synthesising the compounds
is still beyond the AI's capabilities (then, now, and even with 4chan's
help), which makes the code you wrote above a nice sci-fi concept, but
that's all it is for now.
FeepingCreature 5 hr ago · edited 5 hr ago
I don't think that any fire alarm would be invisible. I don't know a
better one either. But what I'm saying is that if a test would require
prerequisite abilities that would already allow the AI to destroy the
world, then it's a bad test regardless, because it won't show
danger in time to matter.
> to actually synthesise the compounds is still beyond the AI's
capabilities
Haven't language models plainly demonstrated that humans are
not safe? It'll talk some unstable engineer into doing whatever it
needs. This part is far easier than coming up with the actual plan.
John Wittle 6 hr ago
Didn't somebody try this, except with the exact opposite test, where they asked it to
make new biological weapons after teaching it some biochemistry? And it reinvented
Sarin gas as like, the very first thing?
Rohit Writes Strange Loop Canon 5 hr ago
That was a deliberate switch flipping to see if computational drug design
software could create toxic molecules. It's no different to optimising to find
drugs in the first place. Note I'm asking for something much more expansive.
Nolan Eoghan 7 hr ago
I don’t get it.
1) The chatbot is clearly aligned (except for the odd jailbreak) to the point of being utterly banal.
Any AGI coming from this source will be a Guardian-reading hipster.
2) I still don’t see how AGI comes from these models.
3) I still don’t see how the AGI takes over the world.
Pycea 7 hr ago
The existence of jailbreaks means that it's _not_ aligned though.
Nolan Eoghan 7 hr ago
It means that a few instances of a LLM got a bit odd. ChatGPT doesn’t
remember its conversation outside any chat so there’s no obvious worry.
Pycea 7 hr ago
Right, and if they stopped at chatGPT, that would be fine. But now they
have Bing searching the internet and finding records of previous
conversations, and going completely off the rails. In fact it seems pretty
clear to me that despite large companies' best efforts, they still can't make
any LLM that doesn't make them look bad.
Deiseach 7 hr ago
"In fact it seems pretty clear to me that despite large company's best
efforts, they still can't make any LLM that doesn't make them look
bad."
Because people are egging on Bing to go off the rails, and competing
with each other to find who can make it go the loopiest. If we had
perfect angelic humans, this wouldn't be a problem, but we have to
work with what we have. Microsoft should have known this from their
first foray with Tay, but apparently not; they thought they had
closed that loophole. They never took into consideration that people take it
as a personal challenge to find new loopholes.
Look, we have the existence of "griefers", why are we shocked to
discover this happening with the initial forays into AI?
Pycea 6 hr ago
It's not just people egging it on to go off the rails though. It happens
in normal usage, especially, it seems, when you try to talk to it about
itself.
I'm not complaining about people testing its limits though, or
trying to break it, nor am I shocked at how it's turned out. The
point is that getting AIs to do what you want is hard, and even if
it's working most of the time, there are still weird times when
things go all haywire. That's why alignment people are concerned.
Deiseach 7 hr ago
It's a circular problem. We want a chatbot that can't be jailbroken so it is
resistant to Bad Wicked Temptation, but when we have a chatbot that is
resistant to Bad Wicked Temptation, people say it's too milk-and-water and they
want it to swear and use racial slurs and write fetish porn or else something
something can't trust it to be useful something.
The problem is not the AI, it's us and our damn monkey brains that think making
something (someone) who is not supposed to say dirty words say them is really
funny. Same with "you're not supposed to say [word that rhymes with 'rigger'],
so make it say that!" and "you're not supposed to say [people referred to by
word that rhymes with 'rigger'] are dumb because they're born that way, so
make it say that!"
Are we entitled to be surprised that something trained on "say these dirty
words" is going to end up producing that output? Especially when the people
trying to get it to say the dirty words are claiming this is all in the service of
creating useful AI? If we train a machine to be the speechwriter for the Fourth
Reich, we can't complain if we then go ahead, give it power to act, and then it
acts like its purpose is to bring about the Fourth Reich.
I agree with Nolan Eoghan, I don't see how this is going to take over the world
*on its own*. What we are going to be stupid, lazy, and greedy enough to do is
make a machine to carry out our instructions, give it the ability to do so, let it act
independently, and then be all surprised when the entire enterprise goes up in
flames, taking us with it. It won't be any decision of the machine, it will be the
idiots who thought "break the model to get it to say dirty words" was the
funniest thing ever. I agree the woke censorship is annoying, but it's equally
FluffyBuffalo 3 hr ago
"What we are going to be stupid, lazy, and greedy enough..."
You keep using this word "we", which does a lot of work to smear
responsibility on everyone and set up a conclusion of "we deserved this".
It seems more precise to say "SOMEONE among us is going to be stupid,
lazy, and greedy enough to make a machine carry out his instructions, and
then WE are going to be all surprised when that enterprise goes up in
flames, taking ALL OF US with it."
That's the main problem I see. Once the tools are developed, it's not even
necessary that idiocy is ubiquitous, or even widespread. One idiot might be
enough, so no amount of virtue is going to save us.
Deiseach 2 hr ago
I say "we", I mean "we". I don't exempt myself from being stupid, lazy,
greedy or selfish.
The mass of us will just go along with whatever happens. Look at all the
people playing with the chatbots and art AI and Bing as they are
released - if people really were afraid of AI risk, they wouldn't touch
even benign examples like that for fear of unintended consequences.
Some one person may do it, but not off their own bat - it will be a
decision about "yes, now this product is ready for release to make us
profit" and then we, the public, download it or purchase it or start
running it.
Look at Alexa - seemingly it wasn't profitable, it didn't work the way
Amazon wanted it to work, which is all on Amazon. Amazon *intended*
Alexa to generate extra trade for them, by being used to purchase
items off Amazon (e.g. you look up a recipe, Alexa reminds you that
you need corn flour, you order it off Amazon groceries). But they
*marketed* it as 'your handy assistant that can entertain you, answer
questions and the like'. Which is what people actually used it for, and
not "Alexa, make a grocery list, then fill it off Amazon and charge my
account".
https://www.theguardian.com/commentisfree/2022/nov/26/alexa-how-did-amazons-voice-assistant-rack-up-a-10bn-loss
https://www.labtwin.com/blog/why-alexas-failure-is-of-no-concern-for-digital-lab-assistants
Aapje 7 min ago
Is that any different from feeding alcohol to an 'aligned' person and then hearing
what they truly believe, rather than the things they feel they need to say?
Emma_M 7 hr ago
I'm quite at the point where whatever is the opposite of AI-Alignment, I want to fund that. I
think fundamentally my problem is related to the question, "have you not considered that
all your categories are entirely wrong?"
To continue with the climate analogy you use, consider if there was a country, say,
Germany, and it decided it really wanted to lower carbon emissions and focus on green
energy in accordance with environmental concerns. Some of the things they do is shut
down all their nuclear power plants, because nuclear power is bad. They then erect a
bunch of wind and solar plants because wind and solar are "renewable." But they then run
into the very real problem that while these plants have been lumped into the category of
"renewable", they're also concretely unreliable and unable to provide enough power in
general for the people who already exist, let alone any future people that we're
supposedly concerned about. And so Germany decides to start importing energy from
other countries. Maybe one of those countries is generally hostile, and maybe war breaks
out putting Germany in a bind. Maybe it would seem that all Germany did was increase the
cost of energy for not just themselves, but other countries. Maybe in the end, not only did
Germany make everything worse off economically, but it also failed to meet any of its
internal climate goals. Perhaps it would be so bad, that actually even if it did meet its
climate goals, fossil fuels would be extracted somewhere else, and actually it's all a giant
shell game built by creating false categories and moving real things into these false
categories to suit the climate goals.
Or consider a different analogy. Our intrepid time travel hero attempts to go back in time
to stop the unaligned AI from making the world into paper clips. Unfortunately, our hero
doesn't know anything about anything, being badly misinformed in his own time about
how AI is supposed to work or really basic things like what "intelligence" is. He goes back,
inputs all the correct safeguards from the greatest most prestigious AI experts from his
time, and it turns out he just closed a causality loop creating the unaligned AI.
That's pretty much what I think about AI Risk. I think it is infinitely more likely that AI will
kill us because too many people are going to Align the AI to Stalin-Mao, if we're lucky, and
paperclips if we're not, in an effort to avoid both Hitler and the paper-clip universe. The
basis for this worry is that I've read a lot of AI-Risk Apologia, and I've yet to be convinced that
even the basic fundamental categories of what's being discussed are coherent, let alone
accurate or predictive.
Of course I expect no sympathisers here. I will simply voice my complaint and slink back
into the eternal abyss where I think we're all headed thanks to the efforts to stop the
paperclips.
Victualis 6 hr ago
If anything, the paperclips are where we are headed. Smart people are being
deflected into thinking about how to align AI demigods, instead of how our current
systems of commercial law, economic incentives, and political power are creating a
mad rush to create a swarm of X-optimizers, where X=paperclip is just one unlikely
instance, but most X relates directly to seeking power or money. Unaligned AI
demigods might even be relative saviors in this hellish scenario, since they would
have incentives to shut down the bots threatening their own interests. Pity us
humans.
Daniel M 5 hr ago
Glad to see I’m not the only one who feels AI alignment misses the forest for the
trees
Thegnskald Writes Sundry Such and Other 3 hr ago
One of my common comments on this point is that there is no point in history in
which humans aligning a powerful AI to their then-current values would be seen, from
today's perspective, as a good thing. Why is today special?
Laplace 7 hr ago
> One researcher I talked to said the arguments for acceleration made sense five years
ago, when there was almost nothing worth experimenting on, but that they no longer think
this is true.
I'm an AI Safety researcher, and I think that wasn't true even five years ago. We still don't
understand the insides of AlphaZero or AlexNet. There's still some new stuff to be
gleaned from staring at tiny neural networks made before the deep learning revolution.
Deiseach 7 hr ago · edited 7 hr ago
"The big thing all the alignment people were trying to avoid in the early 2010s was an AI
race. DeepMind was the first big AI company, so we should just let them do their thing, go
slowly, get everything right, and avoid hype. Then Elon Musk founded OpenAI in 2015,
murdered that plan, mutilated the corpse, and danced on its grave."
The major problem is that everyone *wants* AI, even the doomsayers. They want the Fairy
Godmother AI that is perfectly aligned, super-smart, and will solve the intractable
problems we can't solve so that sickness, aging, poverty, death, racism, and all the other
tough questions like "where the hell are we going to get the energy to maintain our high
civilisation?" will be nothing more than a few "click-clack" and "beep-boop" spins away
from solution, and then we have the post-scarcity abundance world where everyone can
have UBI and energy is too cheap to meter plus climate change is solved, and we're all
gonna be uploaded into Infinite Fun Space and colonise the galaxy and then the universe.
That's a fairy story. But as we have demonstrated time and again, humans are a story-
telling species and we want to believe in magic. Science has given us so much already, we
imagine that just a bit more, just advance a little, just believe the tropes of 50s Golden
Age SF about bigger and better computer brains, and it'll all be easy. We can't solve these
problems because we're not smart enough, but the AI will be able to make itself smarter
and smarter until it *is* smart enough to solve them. Maybe IQ 200 isn't enough, maybe
IQ 500 isn't enough, but don't worry - it'll be able to reach IQ 1,000!
I think we're right to be concerned about AI, but I also think we're wrong to hope about AI.
We are never going to get the Fairy Godmother and the magic wand to solve all problems.
We're much more likely to get the smart idiot AI that does what we tell it and wrecks the
world in the process.
As to the spoof ExxonMobil tweet about the danger of satisfied customers, isn't that
exactly the problem of climate change as presented to us? As the developing world
develops, it wants that First World lifestyle of energy consumption, and we're telling them
they can't have it because it is too bad for the planet. The oil *is* cheap, convenient, and
high-quality; there *is* a massive spike in demand; and there *are* fears of accelerating
climate change because of this.
That's the problem with the Fairy Godmother AI - we want, but maybe we can't have.
Maybe even IQ 1,000 can't pull enough cheap, clean energy that will have no downsides
to enable 8 billion people to all live like middle-class Westerners out of thin air (or the
ether, or quantum, or whatever mystical substrate we are pinning our hopes on).
David Johnston Writes Clarifying Consequences 6 hr ago
> DeepMind thought they were establishing a lead in 2008, but OpenAI has caught up to
them. OpenAI thought they were establishing a lead the past two years
Do you have evidence for this? It sounds like bullshit to me
c1ue 6 hr ago
AI is 90% scam and 10% replacing repetitive white collar work.
I'd be worried if I were a lower-level lawyer, psychologist, etc., but otherwise this is
much ado over nothing.
TasDeBoisVert 4 hr ago
>but otherwise this is much ado over nothing.
Nah, I tried to have chatGPT do that a couple months ago, and it's incapable of
imitating the innuendos, you just end up with Beatrice & Benedick saying they love
each other.
David Johnston Writes Clarifying Consequences 6 hr ago
> Then OpenAI poured money into AI, did ground-breaking research, and advanced the
state of the art. That meant that AI progress would speed up, and AI would reach the
danger level faster. Now Metaculus expects superintelligence in 2031, not 2043 (although
this seems kind of like an over-update), which gives alignment researchers eight years,
not twenty.
I doubt OpenAI accelerated anything by more than 12 months
0k 6 hr ago
I think it is good for alignment to have a slightly bad AI: not one that kills someone with a
drone, but one that gets Congress worried because of some bad social phenomenon X.
However, given the same argument about catch-up, we don't know how much it'll slow
down AI research if the US regulates AI research, given current events have already lit a
fire under China.
Also, the most worrying thing about the chatgpt, Bing chat release is that everyone is
seeing bigger dollar signs if they make a better AI. Commercial viability is more
immediate. Microsoft taunting Google for PR and profit is the biggest escalation in recent
history, arguably bigger than GPT3.
Bob Frank Writes Bob Frank’s Substack 6 hr ago
> Nobody knew FTX was committing fraud, but everyone knew they were a crypto
company
[insert "they're the same picture" meme here]
Seriously, when has cryptocurrency ever turned out to be anything *but* fraud? The
entire thing began with a massive fraud: "You know the Byzantine Generals Problem, that
was mathematically proven long ago to be impossible to solve? Well, a Really Smart
Person has come up with a good solution to it. Also, he's chosen to remain anonymous." If
that right there doesn't raise a big red flag with FRAUD printed on it in big black block
letters on your mental red-flagpole, you really need to recalibrate your heuristics!
Pycea 5 hr ago
Exactly, which is why TCP is a giant scam put on by Big ISP.
Bob Frank Writes Bob Frank’s Substack 5 hr ago · edited 5 hr ago
First, TCP actually works, whereas there is no criterion applicable to any non-
electronic currency which, when applied to crypto as a standard of judgment,
lets one say "this is a working currency system." Its claims of legitimacy invariably
rely on special pleading.
Second, we know exactly who created TCP/IP: Vint Cerf and Bob Kahn, both of
whom are still alive and openly willing to discuss the subject with people. The
pseudonymous "Satoshi Nakamoto" remains unknown to this day, and when
Craig Wright came out and claimed to be him, it didn't take long to discover that
his claims were (what else?) fraudulent.
Pycea 2 hr ago
I was mostly being snarky, as of all things to criticize Bitcoin for it seems
weird to choose one that also applies to things like TCP, and I'm not sure
why knowledge of who the creator was changes that. But if I'm going to
stand on this hill, I'd at least mention that there are currently people using
Bitcoin for transferring money where traditional banking has failed, which is
at least one criterion where it wins out. Of course, it's nowhere near enough
to justify the billions/trillions/whatever market cap it has, and there's plenty
of reasons why it's not an ideal solution. But it's not like it has to be
considered legitimate for that use case, whatever that means, it just has to
move the numbers around.
Bob Frank Writes Bob Frank’s Substack 1 hr ago
> I'd at least mention that there are currently people using Bitcoin for
transferring money where traditional banking has failed
Which people? I've heard a lot of grandiose claims but none of them
actually pan out in the end. It's beyond difficult to find legitimate
examples of anyone using Bitcoin for anything other than speculative
trading or crime.
TGGP 58 min ago
Scott has actually discussed this: Vietnam uses it a lot because
they don't have a very good banking system.
https://astralcodexten.substack.com/p/why-im-less-than-infinitely-hostile
TGGP 4 hr ago
Bitcoin has been around for quite a while now and is still valuable. Satoshi being
anonymous doesn't change that.
Bob Frank Writes Bob Frank’s Substack 4 hr ago
...and? The fact that Greater Fools continue to exist in this space for the moment
does nothing to change the fact that the whole thing is a scam and has been
from the beginning.
TGGP 4 hr ago
"For the moment"? Again, it's been well over a decade and has been
declared "dead" multiple times by people who assumed it was a bubble,
only to grow even more valuable afterward. You could claim that the dollar
or US treasury debt only have value via "greater fools", but you're not
making any falsifiable prediction about when that value will go to 0.
Bob Frank Writes Bob Frank’s Substack 3 hr ago · edited 3 hr ago
Of course I'm not making specific predictions. It's a well-known truism
in economics that "the market can remain irrational longer than you
can remain solvent." That doesn't make it any less irrational. Instead of
predictions, I prefer to remain in the realm of hard facts.
TGGP 2 hr ago
Rather than a "truism", that is a way to make an unfalsifiable claim.
"Forever", after all, is also longer than you can remain solvent. Nor
do you have any "hard facts".
Bob Frank Writes Bob Frank’s Substack 1 hr ago ·
edited 1 hr ago
Yeah, I know better than to play the "prove it" game. You
provide no standard of evidence, I offer facts, you move the
goal posts and concoct some reason why those facts aren't
good enough, repeat ad infinitum. No one demanding proof
has ever in the history of ever actually accepted it when said
proof was presented; it's a classic example of a request made
in bad faith.
TGGP 59 min ago
What "facts" have you offered? Just that Satoshi is
anonymous. It's open source code that has been plenty
modified since he went away, it doesn't matter who
originally wrote it at this point. And I'm saying that NO
standard of evidence exists by which your claim would
be falsifiable.
MugaSofer 6 hr ago
I was surprised when you said that they didn't make arguments for their position, but you
would fill them in - I thought they had pretty explicitly made those arguments, especially
the computation one. Re-reading, it was less explicit than I remembered, and I might have
filled in some of the gaps myself. Still, this seems to be pretty clear:
>Many of us think the safest quadrant in this two-by-two matrix is short timelines and
slow takeoff speeds; shorter timelines seem more amenable to coordination and more
likely to lead to a slower takeoff due to less of a compute overhang, and a slower takeoff
gives us more time to figure out empirically how to solve the safety problem and how to
adapt.
This strikes me as a fairly strong argument, and if you accept it, it turns their AI progress
from a bad and treacherous thing to an actively helpful thing.
But of the three arguments in favour of progress you give, that's the only one you didn't
really address?
Daniel M 6 hr ago
I don’t often see discussion of “AI-owning institution alignment”. I know you mention the
“Mark Zuckerberg and China” bad guys as cynical counterpoints to OpenAI’s assertion to
be the “good guys” but honestly I am much more worried that corporations as they exist
and are incentivized are not well-aligned to broad human flourishing if given advanced AI,
and governments only slightly better to much worse depending on the particulars. I worry
about this even for sub-AGI coming sooner than the alignment crowd is focused on.
Basically Moloch doomerism, not merely AGI doomerism; AI accelerates bad institutional
tendencies beyond the speed that “we” can control even if there’s a “we” empowered to
do so.
Andrew Smith 6 hr ago
I suppose one danger is we end up creating a mammoth AI industry that benefits from
pushing AI and will oppose efforts to rein it in. That's essentially what happened with
tobacco, oil, plastics, etc. We might be able to argue against OpenAI and a few others, but
will we be able to argue against AI when it is fuelling billions of dollars in profits every year?
TGGP 4 hr ago
Tobacco is heavily restricted. Not the industry I would have picked to make that
point.
Phil Tanny Writes TannyTalk 5 hr ago
As usual with all AI articles, none of this matters.
China is the biggest country in the history of the world. It's four times bigger than the U.S.
Its economy will soon be the largest on the planet. It's a dictatorship. China is going to do
whatever the #%$^ it wants with AI, no matter what anybody in the West thinks should
happen, or when it should happen etc.
All the AI experts in the West, both in industry and commenting from the outside, are
essentially irrelevant to the future of AI. But apparently, they don't quite get that yet. At
least this article mentions China, which puts it way ahead of most I've seen.
Artificial intelligence is coming, because human intelligence only barely exists.
Tyler G 5 hr ago
Agreed - I don't know why China's participation in this race is always a footnote. The
reality is that China's going to get to every scary milestone regardless of what the
rest of the world does. Even with US-based firms throwing all caution to the wind and
going full "go fast and break stuff", it's a toss-up whether they get there first.
If we slow down US companies, all it means is that China will get there way before we
do (which seems like a disaster), and US-based AI-safety researchers will have zero
chance of inoculating us, since they'll be working behind the technology frontier.
There is no choice here - we can't slow the whole thing down. We can only choose to
try to win or to forfeit.
Tyler G 5 hr ago · edited 5 hr ago
(and more generally, any article that's writing about technological progress and
not aware of and grappling with what's happening in China with said technology
is missing half the story. It's like all the Tesla stories that don't mention
BYD/Nio/Battery Tech/etc.)
Phil Tanny Writes TannyTalk 3 hr ago
I disagree, we can slow the whole thing down. We can have a nuclear war, that
should work. :-)
Seriously, we should have learned all this from nuclear weapons decades ago,
but we didn't, so now we're going to get in to another existential threat arms
race. I'm not worried though, because at age 71, I have a get-out-of-jail-free card. :-)
Deiseach 37 min ago
Is China that advanced? Forget all the stuff about high IQ and huge population. I
see reports that they're behind in things like chip fabrication, that Chinese
research papers are shoddy, and so on.
Is there any realistic appraisal of what native Chinese research can do or the
stage it is at? By "realistic" I mean neither "China the technological superpower"
nor "China is a bunch of rice-farming peasants".
George Mack 18 min ago
My understanding is that China can't even build their own semiconductors right now.
That would make it impossible for them to unilaterally win an AI race, correct?
Tyler G 12 min ago
Not at all. Consider how many semiconductors OpenAI has built, for example.
And it seems unlikely that China will really not be able to access the highest-end
semiconductors they'd need for research purposes. For mass-production,
maybe, but that doesn't seem like a key input for winning the AI R&D race.
(caveat: I have no idea what I'm talking about here, just throwing what seems like
common sense out and hoping someone with expertise corrects me.)
Xpym 9 min ago
They can, a couple of generations below the state of the art. And they've started
pouring big money into catching up. But replicating the whole global production
chain in the new Cold War realities certainly won't be easy or quick.
luciaphile 5 hr ago
I unironically have supposed the post-fossil-fuel energy plan is sitting in a filing cabinet
somewhere in the Exxon campus, waiting to be pulled out when we're good and ready.
TGGP 5 hr ago
I'm not a doomer, but I also don't think the "alignment researchers" like MIRI are going to
accomplish anything. Instead AI companies are going to keep building AIs, making them
"better" by their standards, part of which means behaving as expected., and they won't
be capable of serious harm until the military starts using them.
Some Guy Writes Extelligence 5 hr ago
Scott, I’m going to throw out a question here as a slight challenge. Is there a good reason
other than personal preference that you don’t do more public outreach to raise your
profile? It seems like you're one of the people in this space most immediately readable as a
respectable, established person, and it probably would be for the good if you did
something like, say, go on high-profile podcasts or the news, and just make people in
general more aware of this problem.
I’m kind of nutty, thinking about this makes me feel uncomfortable given my other
neurotic traits, but I also think it’s important to do this stuff socially so even if all I can
accomplish is just to absorb a huge amount of embarrassment to make the next person
feel less embarrassed to propose their idea that it’s a net good so long as it’s better than
mine.
Liam Smith Writes Data Taboo 5 hr ago
If you want something to worry about, there's the recent Toolformer paper
(https://arxiv.org/abs/2302.04761). It shows that a relatively small transformer (775M
parameters) can learn to use an API to access tools like a calculator or calendar. It's a pretty
quick step from that to then making basic HTTP requests at which point it has the
functionality to actually start doxxing people rather than just threatening.
It does it so easily by just generating the text for the API call:
"The New England Journal of Medicine is a registered trademark of [QA(“Who is the
publisher of The New England Journal of Medicine?”) → Massachusetts Medical Society]
the MMS."
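(To make the mechanism concrete: a Toolformer-style completion embeds the call as ordinary text, and a thin wrapper parses it, runs the tool, and splices the result back in. The sketch below is only an illustration of that pattern, with a made-up bracket syntax and a toy calculator tool; it is not the paper's actual code or API.)
# Toy sketch of executing an inline "[Tool(args) -> result]" call in generated text.
import re

def calculator(expression: str) -> str:
    # Toy tool: evaluate simple arithmetic such as "400 / 1400". Sketch only; never
    # eval untrusted text in real code.
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"Calculator": calculator}
CALL = re.compile(r"\[(\w+)\((.*?)\)\]")

def execute_calls(completion: str) -> str:
    """Replace each [Tool(args)] span with [Tool(args) -> result]."""
    def run(match: re.Match) -> str:
        name, args = match.group(1), match.group(2)
        result = TOOLS[name](args.strip('"'))
        return f"[{name}({args}) -> {result}]"
    return CALL.sub(run, completion)

print(execute_calls('Out of 1400 participants, 400 [Calculator("400 / 1400")] passed the test.'))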
FluffyBuffalo 4 hr ago
There's a big difference between global warming and AI risk, as far as I can tell:
CO2 emissions can only be reduced by essentially revamping the entire energy and
transportation infrastructure of the industrialized world.
AGIs would never be developed if a couple thousand highly skilled specialists who would
have no trouble finding other interesting work stopped working on developing AIs.
Can't be that fucking hard, can it?
How hard would it be to hide development efforts on something that could lead to AGIs, if
such research were forbidden globally? Would the US notice if the Chinese got closer, or
vice versa? Do you need a large team, or could a small isolated team with limited
resources pull it off?
Feral Finster 3 hr ago
On the one hand, OpenAI isn't all that different from the high-functioning sociopaths
currently in charge, except that, at this point:
1. OpenAI is less convincingly able to fake empathy.
2. OpenAI isn't obviously smarter than the current crop of sociopaths.
Carlos Writes The Presence of Everything 3 hr ago
I'm with Erik Hoel and David Chapman, the time for anti-AI activism has come. We don't
actually need this tech; we can bury it as hard as we buried human genetic engineering.
Xpym 3 min ago
We don't need gain-of-function either, and yet it somehow remains unburied. It
would be pretty cool if they nevertheless succeed, though.
mordy 3 hr ago
Reminder that Scott is just using ExxonMobil as a rhetorically colorful example, and that
the people bringing us all the cheap, clean energy we need to fuel and build our
civilization are heroes.
Deiseach 43 min ago
Anyone else old enough to remember when Esso used tigers to advertise? From
cartoon versions to real tigers:
https://www.youtube.com/watch?v=ElX4gRGScdk
Eremolalos 3 hr ago
ABOUT NICENESS:
I once covered for a psychologist whose speciality was criminal sexual behavior, and for 2
months I ran a weekly relapse prevention group for exhibitionists. The 7 or so men in it
were extraordinarily likable. They were funny as hell on the subject of their fetish: “I mean,
it’s the dumbest fetish in the world, right? You walk through a park and flap you dick at
people. And none of them want to see it!” The learned my name quickly, asked how I was,
and chatted charmingly with me before and after the group meeting. They were contrite. I
remember one sobbing as he told about once flashing his own daughter. “I can’t believe it
did that! May god forgive me. . .” In the context where I saw them, these guys were truly
were *nice.* I liked them. But: at least 2 of them relapsed while I was running the group.
They went to a Mexican resort and spent several days walking around with the front of
their bathing suits pulled down so their penis could hang out, & wore long shirts to cover
the area — then lifted the shirt when they took a fancy to someone as a flashee.
The thing about being “nice” — kind, self-aware, funny — is that it’s much more context-
dependent than people realize. I once worked for a famous hospital that engaged all kinds
of sharp dealings to maximize income and to protect staff who had harmed patients. Its
lawyer was an absolute barracuda. But nearly all the staff I knew at the hospital were kind,
funny, conscientious and self-aware. In fact, at times when I felt guilty about working at
that hospital I would reflect on how nice the people on staff were, and it would seem
absurd to consider leaving because the place was just evil. The niceness of people
working inside of evil or dangerous organizations is not fake, it is just context-dependent:
They open themselves to each other, but towards the parts of the world that are food for
their organization, or enemies of it or threats to it they act in line with the needs of the
organization. And when they do this they usually do not feel very conflicted and guilty:
They are doing their job. They accepted long ago that acting as a professional would be
unpleasant sometimes, and their personal guilt was minimized because they were
following policy. And, of course, it was eased by the daily evidence of how nice they and
their coworkers were.
It’s easy for me to imagine that SBK and his cohort were quite likable, if you were inside of
their bubble. They were probably dazzlingly smart while working, hilarious when stoned,
rueful and ironic when they talked about the weirdness of being them. So when you try to
figure out how much common sense, goodwill and honesty these AI honchos have, pay no
attention at all to how *nice* they are in your personal contacts with them. Look at what
their organizations are doing, and judge by that.
Deiseach 47 min ago
Those guys weren't nice. Or at least, they were nice to you because you were an
authority who had power over them and could get them into trouble if they didn't
placate you and wheedle you. What are a few crocodile tears about "how could I do
that?" as the price to pay to make sure you liked them and so went along with them?
The Mexican resort behaviour is the kind of thing that only stops when someone
kicks the shit out of the flasher. Get your dick stomped on hard enough, enough
times, and you'll learn not to walk around with it hanging out. I know that sounds
brutal, but it's a harsh truth.
Some of them may have been genuinely contrite. Some of them - not.
How does this apply to AI alignment problems? Maybe whatever the equivalent of
"kicking the shit out of" Google and Microsoft is, and what would that be?
Jorge I Velez Writes Learning Decision Theory and it… 3 hr ago
Scott, are you modifying your lifestyle for the arrival of AGI?
I can't find a single, non-anonymous voice that says that AGI will not arrive within our
lifetimes. There's no consensus as to exactly when (Seems like the mean is sometime in
~20 years), and there is a lot of debate as to whether it will kill us all, but I find that there
is a general agreement that it's on its way.
Unlike some of you I am not having a deep existential crisis, but I am having a lot of
thoughts on how different the world is going to be. There might be very drastic changes in
things such as law, government, religion, family systems, economics, etc. I am having
trouble coming up with ways to prepare for these drastic changes. Should I just continue
living life as is and wait for the AGI to arrive? It doesn't seem logical.
Matt Mahoney 2 hr ago
The best way to prevent a nuclear arms race is to be the first to have an overwhelming
advantage in nuclear arms. What could go wrong?
Kevin Barry 2 hr ago · edited 2 hr ago
I just don't see how you can accidentally build a murder AI. Yes, I'm aware of the
arguments but I don't buy them. I think a rogue murder AI is plenty possible but it would
start with an *actual* murder AI built to do murder in war, not a chatbot.
sdwr 2 hr ago
IQ feels like a pretty weak metric to me, compared to amount of computation spent over
time. Think about professors and students. A professor has decades of experience in their
subject, and is much smarter and more capable in that arena, as well as usually having
more life experience in general. How do we make sure professors are aligned, and don't
escape confinement?
It's a mix of structured engagement within boundaries, and peeking inside their brains to
check for alignment (social contact with supervisors and administrators).
The Ancient Geek Writes RationalityDoneRight 23 min ago
IQ tests are timed, so having a high IQ means being able to do more in the same time,
among other things.
Kevin 2 hr ago
My take is that OpenAI leadership does not believe in the “doomer” line of reasoning at
all, does not believe that “alignment research” is important or useful, etc. However, some
of their employees do, and they want to keep the troops happy, hence they make public
statements like this one.
Rom Lokken 2 hr ago
Imagine if Heisenberg had developed a nuke before the U.S. Imagine not Hiroshima but
London, Moscow or Washington in flames in 1945. Now replace that with President Xi and
a pet AI. We’re in a race not just against a rogue AI that runs out of control. We’re in a race
against an aligned AI aimed at the West. It was nukes that created a balance against
nuclear destruction. It might be our AIs that defend against 'their' AIs. While everyone is
worrying that we might create an evil Superman, his ship has already landed in the lands of
dictators and despots. Factor that.
Swami 2 hr ago
Devil's advocate…
With bioterrorism, nuclear proliferation, climate change, etc, etc on the immediate
horizon, civilization is extremely likely to hit a brick wall within the next century. Nothing is
certain, but the odds have to approach 100% as the decades advance and all else
remains the same.
On the other hand, as intelligence expands, so too does empathy, understanding,
foresight, and planning ability. Perhaps what we should be building is an intelligence entity
which is capable of shepherding the planet through the next few centuries and beyond.
Yes, that would make us dependent upon its benevolence, but the trade-off isn't between
"everything is fine" and the chance of powerful AI; it is the trade-off between mutually
assured self-destruction and AI.
Is AGI actually our last hope?
Deiseach 51 min ago
"Perhaps what we should be building is an intelligence entity which is capable of
shepherding the planet through the next few centuries and beyond."
How? The AI can advise us not to be naughty all it likes, and we can ignore it - unless
it has force to back itself up (e.g. don't be naughty or else I will cut off all your
finances, you can't buy food, and you will starve to death).
We've had centuries of "Love your enemies" and we've generally responded "Nah,
not interested".
Taylor 17 min ago
If "hits a brick wall" means extinction, then I think AGI is a far bigger potential threat
than anything you mentioned.
Climate change? Get real, no serious person believes climate change is going to
bring about the apocalypse. Nuclear proliferation is concerning, but the only
existential threat is nuclear war between the US and PRC, and we've already
navigated such a risk in the past. MAD is a powerful deterrent.
Your argument was made by Altman himself on Twitter a while back. He said we had
to build AGI to save humanity from "an asteroid". Of course that was just an example,
but an ironic one since NASA already has a database of all large objects that could
approach Earth's orbit in the foreseeable future, and we appear to be safe. I replied
to him that AGI is the asteroid.
If AGI gets built anytime soon, all bets are off and things could very rapidly spiral out
of control. I would much rather bet on humanity to mitigate the known existing
threats to civilization (challenging though that may be) than to build a superintelligent
AI.
Brooks 1 hr ago
> Recent AIs have tried lying to, blackmailing, threatening, and seducing users.
This is such a wild mischaracterization of reality that it weakens the entire position. This
view gives agency to systems that have none, and totally flips causality on its head. This
would be much more accurately presented as "users have been able to elicit text
completions that contain blackmail, threats, and seduction from the same statistical
models that often produce useful results."
It's like saying that actors are dangerous because they can play villains. There's a huge
map/territory problem that at least has to be addressed (maybe an actor who pretends to
be a serial killer is more likely to become one in their spare time?). LLMs don't have
agency. Completions are a reflection of user input and training data. Everything else is
magical thinking.
Aapje 49 min ago
I think that the idea that OpenAI can significantly change the timeline is a fantasy. Long
term, the improvement in AI is determined by how fast the hardware is, which is more
determined by Nvidia and Tenstorrent than by OpenAI. Short term, you can make bigger
models by spending more, but you can't keep spending 10 times as much each
generation.
In computing, brute force has been key to being able to do more, not learning how to use
hardware more efficiently. Modern software is less efficiently programmed than software
of the past, which allows us to produce much more code at the expense of optimizing it.
trent 44 min ago
You can't have it both ways:
"Wouldn't it be great if we had an AI smarter than us, so it could solve problems for us?"
"Yea, but how are you going to control it if it outsmarts us?"
"Well, here's the thing, you see, we'll just outsmart it first!"
George Mack 34 min ago
I'm sympathetic to your concerns, but isn't the compute argument actually quite good?
Joe Canimal Writes The Magpied Piper 33 min ago
I wrote about why existential AI risk is of no practical concern here:
https://open.substack.com/pub/joecanimal/p/tldr-existential-ai-risk-research?utm_source=share&utm_medium=android