POLICY FORUM
ARTIFICIAL INTELLIGENCE
By Michael K. Cohen1,2, Noam Kolt3,4, Yoshua Bengio5,6, Gillian K. Hadfield2,3,4,7, Stuart Russell1,2

Technical experts and policy-makers

under control is also reflected in President Biden’s 2023 executive order that introduces reporting requirements for AI that could “eva[de] human control or oversight through means of deception or obfusca-

So long as an agent’s rewards can be controlled, it can be incentivized to achieve complex goals by conditioning the rewards appropriately. But a sufficiently capable RL agent could take control of its
guarantees appear highly unlikely for any AI systems built similarly to the most powerful systems today (13).

MONITORING AND REPORTING
Just as nuclear regulation extends to controlling uranium, AI regulation must extend to controlling the resources needed to produce dangerously capable LTPAs. We define production resources (PRs) as any information that makes the production of a dangerously capable LTPA cheaper than a threshold determined by regulators according to their risk tolerance. Unlike uranium, a PR is not a physical resource; it could include any AI model trained beyond a certain compute threshold (14). Fortunately, regulators could detect such PRs by fol-

could also control the transfer of large pre-trained models or other relevant resources. Further, regulators could make it unlawful for other actors to use AI systems that fail to comply with these requirements (15).

Taken together, controls on the development, use, and dissemination of production resources will substantially reduce the likelihood of these resources being used to build dangerously capable LTPAs.

ENFORCEMENT MECHANISMS
To ensure compliance with these reporting requirements and usage controls, regulators may need to be authorized to (i) issue legal orders that compel organizations to report production resources and mandate the cessation of prohibited activities;

LTPAs fills an important gap, further institutional mechanisms will likely be needed to mitigate the risks posed by advanced artificial agents.

REFERENCES AND NOTES
1. G. Hinton et al., “Statement on AI risk” (Center for AI Safety, May 2023); https://www.safe.ai/statement-on-ai-risk.
2. United Kingdom Department for Science, Innovation, and Technology, United Kingdom Foreign, Commonwealth, and Development Office, United Kingdom Prime Minister’s Office, “The Bletchley Declaration by countries attending the AI Safety Summit, 1-2 November 2023” (Gov.uk, November 2023); https://www.gov.uk/government/publications/ai-safety-summit-2023-the-bletchley-declaration/the-bletchley-declaration-by-countries-attending-the-ai-safety-summit-1-2-november-2023.
3. J. Biden, “Executive order on the safe, secure, and trustworthy development and use of artificial intelligence” (The White House, October