Introduction to the AI Index Report 2024

Welcome to the seventh edition of the AI Index report. The 2024 Index is our most comprehensive to date and arrives at an important moment when AI's influence on society has never been more pronounced. This year, we have broadened our scope to more extensively cover essential trends such as technical advancements in AI, public perceptions of the technology, and the geopolitical dynamics surrounding its development. Featuring more original data than ever before, this edition introduces new estimates on AI training costs, detailed analyses of the responsible AI landscape, and an entirely new chapter dedicated to AI's impact on science and medicine.
The AI Index report tracks, collates, distills, and visualizes data related to artificial intelligence (AI). Our mission is to provide unbiased, rigorously vetted, broadly sourced data in order for policymakers, researchers, executives, journalists, and the general public to develop a more thorough and nuanced understanding of the complex field of AI.
The AI Index is recognized globally as one of the most credible and authoritative sources for data and insights on artificial intelligence. Previous editions have been cited in major newspapers, including The New York Times, Bloomberg, and The Guardian, have amassed hundreds of academic citations, and have been referenced by high-level policymakers in the United States, the United Kingdom, and the European Union, among other places. This year's edition surpasses all previous ones in size, scale, and scope, reflecting the growing significance that AI is coming to hold in all of our lives.

Message From the Co-directors

A decade ago, the best AI systems in the world were unable to classify objects in images at a human level. They struggled with language comprehension and could not solve math problems. Today, AI systems routinely exceed human performance on standard benchmarks.
Progress accelerated in 2023. New state-of-the-art systems like GPT-4, Gemini, and Claude 3 are impressively multimodal: They can generate fluent text in dozens of languages, process audio, and even explain memes. As AI has improved, it has increasingly forced its way into our lives. Companies are racing to build AI-based products, and AI is increasingly being used by the general public. But current AI technology still has significant problems. It cannot reliably deal with facts, perform complex reasoning, or explain its conclusions.
AI faces two interrelated futures. First, technology continues to improve and is increasingly used, having major consequences for productivity and employment. It can be put to both good and bad uses. In the second future, the adoption of AI is constrained by the limitations of the technology. Regardless of which future unfolds, governments are increasingly concerned. They are stepping in to encourage the upside, such as funding university R&D and incentivizing private investment. Governments are also aiming to manage the potential downsides, such as impacts on employment, privacy concerns, misinformation, and intellectual property rights.
As AI rapidly evolves, the AI Index aims to help the AI community, policymakers, business leaders, journalists, and the general public navigate this complex landscape. It provides ongoing, objective snapshots tracking several key areas: technical progress in AI capabilities, the community and investments driving AI development and deployment, public opinion on current and potential future impacts, and policy measures taken to stimulate AI innovation while managing its risks and challenges. By comprehensively monitoring the AI ecosystem, the Index serves as an important resource for understanding this transformative technological force.
On the technical front, this year's AI Index reports that the number of new large language models released worldwide in 2023 doubled over the previous year. Two-thirds were open-source, but the highest-performing models came from industry players with closed systems. Gemini Ultra became the first LLM to reach human-level performance on the Massive Multitask Language Understanding (MMLU) benchmark; performance on the benchmark has improved by 15 percentage points since last year. Additionally, GPT-4 achieved an impressive 0.96 mean win rate score on the comprehensive Holistic Evaluation of Language Models (HELM) benchmark, which includes MMLU among other evaluations.

Although global private investment in AI decreased for the second consecutive year, investment in generative AI skyrocketed. More Fortune 500 earnings calls mentioned AI than ever before, and new studies show that AI tangibly boosts worker productivity. On the policymaking front, global mentions of AI in legislative proceedings have never been higher. U.S. regulators passed more AI-related regulations in 2023 than ever before. Still, many expressed concerns about AI's ability to generate deepfakes and impact elections. The public became more aware of AI, and studies suggest that they responded with nervousness.

Ray Perrault and Jack Clark

Co-directors, AI Index

Top 10 Takeaways

  1. AI beats humans on some tasks, but not on all. AI has surpassed human performance on several benchmarks, including some in image classification, visual reasoning, and English understanding. Yet it trails behind on more complex tasks like competition-level mathematics, visual commonsense reasoning, and planning.
  2. Industry continues to dominate frontier AI research. In 2023, industry produced 51 notable machine learning models, while academia contributed only 15. There were also 21 notable models resulting from industry-academia collaborations in 2023, a new high.
  3. Frontier models get way more expensive. According to AI Index estimates, the training costs of state-of-the-art AI models have reached unprecedented levels. For example, OpenAI's GPT-4 used an estimated million worth of compute to train, while Google's Gemini Ultra cost million for compute.
  4. The United States leads China, the EU, and the U.K. as the leading source of top AI models. In 2023, 61 notable AI models originated from U.S.-based institutions, far outpacing the European Union's 21 and China's 15.

  5. Robust and standardized evaluations for LLM responsibility are seriously lacking. New research from the AI Index reveals a significant lack of standardization in responsible AI reporting. Leading developers, including OpenAI, Google, and Anthropic, primarily test their models against different responsible AI benchmarks. This practice complicates efforts to systematically compare the risks and limitations of top AI models.
  6. Generative AI investment skyrockets. Despite a decline in overall AI private investment last year, funding for generative AI surged, nearly octupling from 2022 to reach billion. Major players in the generative AI space, including OpenAI, Anthropic, Hugging Face, and Inflection, reported substantial fundraising rounds.
  7. The data is in: AI makes workers more productive and leads to higher quality work. In 2023, several studies assessed AI's impact on labor, suggesting that AI enables workers to complete tasks more quickly and to improve the quality of their output. These studies also demonstrated AI's potential to bridge the skill gap between low- and high-skilled workers. Still, other studies caution that using AI without proper oversight can lead to diminished performance.

  8. Scientific progress accelerates even further, thanks to AI. In 2022, AI began to advance scientific discovery. 2023, however, saw the launch of even more significant science-related AI applications, from AlphaDev, which makes algorithmic sorting more efficient, to GNoME, which facilitates the process of materials discovery.
  9. The number of AI regulations in the United States sharply increases. The number of AI-related regulations in the U.S. has risen significantly in the past year and over the last five years. In 2023, there were 25 AI-related regulations, up from just one in 2016. Last year alone, the total number of AI-related regulations grew by .
  10. People across the globe are more cognizant of AI's potential impact, and more nervous. A survey from Ipsos shows that, over the last year, the proportion of those who think AI will dramatically affect their lives in the next three to five years has increased from to . Moreover, express nervousness toward AI products and services, marking a 13 percentage point rise from 2022. In America, Pew data suggests that of Americans report feeling more concerned than excited about AI, rising from 37% in 2022.

Steering Committee

Co-directors

Jack Clark, Anthropic, OECD
Raymond Perrault, SRI International

Members

Erik Brynjolfsson, Stanford University
John Etchemendy, Stanford University
Katrina Ligett, Hebrew University
Terah Lyons, JPMorgan Chase & Co.
James Manyika, Google, University of Oxford
Juan Carlos Niebles, Stanford University, Salesforce
Vanessa Parli, Stanford University
Russell Wald, Stanford University

Staff and Researchers

Research Manager and Editor in Chief

Nestor Maslej, Stanford University

Research Associate

Loredana Fattorini, Stanford University

Affiliated Researchers

Elif Kiesow Cortez, Stanford Law School Research Fellow
Anka Reuel, Stanford University
Robi Rahman, Data Scientist
Alexandra Rome, Freelance Researcher
Lapo Santarlasci, IMT School for Advanced Studies Lucca

Graduate Researchers

James da Costa, Stanford University
Simba Jonga, Stanford University

Undergraduate Researchers

Emily Capstick, Stanford University
Summer Flowers, Stanford University
Armin Hamrah, Claremont McKenna College
Amelia Hardy, Stanford University
Mena Hassan, Stanford University
Ethan Duncan He-Li Hellman, Stanford University
Julia Betts Lotufo, Stanford University
Sukrut Oak, Stanford University
Andrew Shi, Stanford University
Jason Shin, Stanford University
Emma Williamson, Stanford University
Alfred Yu, Stanford University

How to Cite This Report

Nestor Maslej, Loredana Fattorini, Raymond Perrault, Vanessa Parli, Anka Reuel, Erik Brynjolfsson, John Etchemendy, Katrina Ligett, Terah Lyons, James Manyika, Juan Carlos Niebles, Yoav Shoham, Russell Wald, and Jack Clark, "The AI Index 2024 Annual Report," AI Index Steering Committee, Institute for Human-Centered AI, Stanford University, Stanford, CA, April 2024.
The AI Index 2024 Annual Report by Stanford University is licensed under Attribution-NoDerivatives 4.0 International.

Public Data and Tools

The AI Index 2024 Report is supplemented by raw data and an interactive tool. We invite each reader to use the data and the tool in a way most relevant to their work and interests.
  • Raw data and charts: The public data and high-resolution images of all the charts in the report are available on Google Drive.
  • Global AI Vibrancy Tool: Compare the AI ecosystems of over 30 countries. The Global AI Vibrancy Tool will be updated in the summer of 2024.

AI Index and Stanford HAI

The AI Index is an independent initiative at the Stanford Institute for Human-Centered Artificial Intelligence (HAI).
The AI Index was conceived within the One Hundred Year Study on Artificial Intelligence (AI100).
The AI Index welcomes feedback and new ideas for next year. Contact us at AI-Index-Report@stanford.edu.
The AI Index acknowledges that while authored by a team of human researchers, its writing process was aided by AI tools. Specifically, the authors used ChatGPT and Claude to help tighten and copy edit initial drafts. The workflow involved authors writing the original copy, then utilizing AI tools as part of the editing process.

Supporting Partners

Analytics and Research Partners

Accenture, CSET, Epoch, GitHub, Govini, Lightcast, LinkedIn

Contributors

The AI Index wants to acknowledge the following individuals by chapter and section for their contributions of data, analysis, advice, and expert commentary included in the AI Index 2024 Report:

Introduction

Loredana Fattorini, Nestor Maslej, Vanessa Parli, Ray Perrault

Chapter 1: Research and Development

Catherine Aiken, Terry Auricchio, Tamay Besiroglu, Rishi Bommasani, Andrew Brown, Peter Cihon, James da Costa, Ben Cottier, James Cussens, James Dunham, Meredith Ellison, Loredana Fattorini, Enrico Gerding, Anson Ho, Percy Liang, Nestor Maslej, Greg Mori, Tristan Naumann, Vanessa Parli, Pavlos Peppas, Ray Perrault, Robi Rahman, Vesna Sablijakovic-Fritz, Jim Schmiedeler, Jaime Sevilla, Autumn Toney, Kevin Xu, Meg Young, Milena Zeithamlova

Chapter 2: Technical Performance

Rishi Bommasani, Emma Brunskill, Erik Brynjolfsson, Emily Capstick, Jack Clark, Loredana Fattorini, Tobi Gertsenberg, Noah Goodman, Nicholas Haber, Sanmi Koyejo, Percy Liang, Katrina Ligett, Sasha Luccioni, Nestor Maslej, Juan Carlos Niebles, Sukrut Oak, Vanessa Parli, Ray Perrault, Andrew Shi, Yoav Shoham, Emma Williamson

Chapter 3: Responsible AI

Jack Clark, Loredana Fattorini, Amelia Hardy, Katrina Ligett, Nestor Maslej, Vanessa Parli, Ray Perrault, Anka Reuel, Andrew Shi

Chapter 4: Economy

Susanne Bieller, Erik Brynjolfsson, Mar Carpanelli, James da Costa, Natalia Dorogi, Heather English, Murat Erer, Loredana Fattorini, Akash Kaura, James Manyika, Nestor Maslej, Cal McKeever, Julia Nitschke, Layla O’Kane, Vanessa Parli, Ray Perrault, Brittany Presten, Carl Shan, Bill Valle, Casey Weston, Emma Williamson

Chapter 5: Science and Medicine

Russ Altman, Loredana Fattorini, Remi Lam, Curtis Langlotz, James Manyika, Nestor Maslej, Vanessa Parli, Ray Perrault, Emma Williamson

Chapter 6: Education

Betsy Bizot, John Etchemendy, Loredana Fattorini, Kirsten Feddersen, Matt Hazenbush, Nestor Maslej, Vanessa Parli, Ray Perrault, Svetlana Tikhonenko, Laurens Vehmeijer, Hannah Weissman, Stuart Zweben

Chapter 7: Policy and Governance

Alison Boyer, Elif Kiesow Cortez, Rebecca DeCrescenzo, David Freeman Engstrom, Loredana Fattorini, Philip de Guzman, Mena Hassan, Ethan Duncan He-Li Hellman, Daniel Ho, Simba Jonga, Rohini Kosoglu, Mark Lemley, Julia Betts Lotufo, Nestor Maslej, Caroline Meinhardt, Julian Nyarko, Jeff Park, Vanessa Parli, Ray Perrault, Alexandra Rome, Lapo Santarlasci, Sarah Smedley, Russell Wald, Emma Williamson, Daniel Zhang

Chapter 8: Diversity

Betsy Bizot, Loredana Fattorini, Kirsten Feddersen, Matt Hazenbush, Nestor Maslej, Vanessa Parli, Ray Perrault, Svetlana Tikhonenko, Laurens Vehmeijer, Caroline Weis, Hannah Weissman, Stuart Zweben

Chapter 9: Public Opinion

Maggie Arai, Heather English, Loredana Fattorini, Armin Hamrah, Peter Loewen, Nestor Maslej, Vanessa Parli, Ray Perrault, Marco Monteiro Silva, Lee Slinger, Bill Valle, Russell Wald

The AI Index thanks the following organizations and individuals who provided data for inclusion in this year's report:

Organizations

Center for Research on Foundation Models

Rishi Bommasani, Percy Liang

Center for Security and Emerging Technology, Georgetown University

Catherine Aiken, James Dunham, Autumn Toney

Code.org

Hannah Weissman

Computing Research Association

Betsy Bizot, Stuart Zweben

Epoch

Ben Cottier, Robi Rahman

GitHub

Peter Cihon, Kevin Xu

Govini

Alison Boyer, Rebecca DeCrescenzo, Philip de Guzman, Jeff Park

Informatics Europe

Svetlana Tikhonenko

International Federation of Robotics

Susanne Bieller

Lightcast

Cal McKeever, Julia Nitschke, Layla O’Kane

LinkedIn

Murat Erer, Akash Kaura, Casey Weston

McKinsey & Company

Natalia Dorogi, Brittany Presten

Munk School of Global Affairs and Public Policy

Peter Loewen, Lee Slinger

Quid

Heather English, Bill Valle

Schwartz Reisman Institute for Technology and Society

Maggie Arai, Marco Monteiro Silva

Studyportals

Kirsten Feddersen, Laurens Vehmeijer

Women in Machine Learning

Caroline Weis

The AI Index also thanks Jeanina Casusi, Nancy King, Carolyn Lehman, Shana Lynch, Jonathan Mindes, and Michi Turner for their help in preparing this report; Joe Hinman and Nabarun Mukherjee for their help in maintaining the AI Index website; and Annie Benisch, Marc Gough, Panos Madamopoulos-Moraris, Kaci Peel, Drew Spence, Madeline Wright, and Daniel Zhang for their work in helping promote the report.

Table of Contents

Report Highlights
Chapter 1 Research and Development
Chapter 2 Technical Performance
Chapter 3 Responsible AI
Chapter 4 Economy
Chapter 5 Science and Medicine
Chapter 6 Education
Chapter 7 Policy and Governance
Chapter 8 Diversity
Chapter 9 Public Opinion
Appendix

Report Highlights

Chapter 1: Research and Development

  1. Industry continues to dominate frontier AI research. In 2023, industry produced 51 notable machine learning models, while academia contributed only 15. There were also 21 notable models resulting from industry-academia collaborations in 2023, a new high.
  2. More foundation models and more open foundation models. In 2023, a total of 149 foundation models were released, more than double the amount released in 2022. Of these newly released models, were open-source, compared to only in 2022 and in 2021.
  3. Frontier models get way more expensive. According to AI Index estimates, the training costs of state-of-the-art AI models have reached unprecedented levels. For example, OpenAI's GPT-4 used an estimated million worth of compute to train, while Google's Gemini Ultra cost million for compute.
  4. The United States leads China, the EU, and the U.K. as the leading source of top AI models. In 2023, 61 notable AI models originated from U.S.-based institutions, far outpacing the European Union's 21 and China's 15.
  5. The number of AI patents skyrockets. From 2021 to 2022, AI patent grants worldwide increased sharply by . Since 2010, the number of granted AI patents has increased more than 31 times.
  6. China dominates AI patents. In 2022, China led global AI patent origins with , significantly outpacing the United States, which accounted for of AI patent origins. Since 2010, the U.S. share of AI patents has decreased from .
  7. Open-source AI research explodes. Since 2011, the number of AI-related projects on GitHub has seen a consistent increase, growing from 845 in 2011 to approximately 1.8 million in 2023. Notably, there was a sharp 59.3% rise in the total number of GitHub AI projects in 2023 alone. The total number of stars for AI-related projects on GitHub also significantly increased in 2023, more than tripling from 4.0 million in 2022 to 12.2 million.
  8. The number of AI publications continues to rise. Between 2010 and 2022, the total number of AI publications nearly tripled, rising from approximately 88,000 in 2010 to more than 240,000 in 2022. The increase over the last year was a modest .
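
The growth claims in the Chapter 1 highlights can be sanity-checked with simple arithmetic. The following is a minimal sketch in Python, using only figures quoted in the text. The helper names `growth_multiple` and `implied_previous` are illustrative assumptions, not part of any AI Index tooling. The 2022 GitHub project count is not stated in the text, so the value computed here is only what the 59.3% rise implies given the rounded 1.8 million total.

```python
# Figures quoted in the Chapter 1 highlights; counts marked "approx" are rounded in the text.
github_projects = {2011: 845, 2023: 1_800_000}   # approx 2023 count
github_stars = {2022: 4.0e6, 2023: 12.2e6}
publications = {2010: 88_000, 2022: 240_000}     # approx counts


def growth_multiple(start: float, end: float) -> float:
    """Return how many times larger `end` is than `start` (hypothetical helper)."""
    if start <= 0:
        raise ValueError("start must be positive")
    return end / start


def implied_previous(current: float, pct_increase: float) -> float:
    """Back out the prior value from a current value and a percentage rise."""
    return current / (1 + pct_increase / 100)


stars_multiple = growth_multiple(github_stars[2022], github_stars[2023])
pubs_multiple = growth_multiple(publications[2010], publications[2022])
projects_2022 = implied_previous(github_projects[2023], 59.3)

print(f"GitHub stars grew {stars_multiple:.2f}x from 2022 to 2023")
print(f"AI publications grew {pubs_multiple:.2f}x from 2010 to 2022")
print(f"Implied 2022 GitHub AI projects: about {projects_2022:,.0f}")
```

The star multiple comes out slightly above 3, consistent with "more than tripling," and the publication multiple falls a little short of 3, consistent with "nearly tripled."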

Chapter 2: Technical Performance

  1. AI beats humans on some tasks, but not on all. AI has surpassed human performance on several benchmarks, including some in image classification, visual reasoning, and English understanding. Yet it trails behind on more complex tasks like competition-level mathematics, visual commonsense reasoning, and planning.
  2. Here comes multimodal AI. Traditionally, AI systems have been limited in scope, with language models excelling in text comprehension but faltering in image processing, and vice versa. However, recent advancements have led to the development of strong multimodal models, such as Google's Gemini and OpenAI's GPT-4. These models demonstrate flexibility and are capable of handling images and text and, in some instances, can even process audio.
  3. Harder benchmarks emerge. AI models have reached performance saturation on established benchmarks such as ImageNet, SQuAD, and SuperGLUE, prompting researchers to develop more challenging ones. In 2023, several challenging new benchmarks emerged, including SWE-bench for coding, HEIM for image generation, MMMU for general reasoning, MoCa for moral reasoning, AgentBench for agent-based behavior, and HaluEval for hallucinations.
  4. Better AI means better data which means ... even better AI. New AI models such as SegmentAnything and Skoltech are being used to generate specialized data for tasks like image segmentation and 3D reconstruction. Data is vital for technical improvements. The use of AI to create more data enhances current capabilities and paves the way for future algorithmic improvements, especially on harder tasks.
  5. Human evaluation is in. With generative models producing high-quality text, images, and more, benchmarking has slowly started shifting toward incorporating human evaluations like the Chatbot Arena Leaderboard rather than computerized rankings like ImageNet or SQuAD. Public sentiment about AI is becoming an increasingly important consideration in tracking AI progress.
  6. Thanks to LLMs, robots have become more flexible. The fusion of language modeling with robotics has given rise to more flexible robotic systems like PaLM-E and RT-2. Beyond their improved robotic capabilities, these models can ask questions, which marks a significant step toward robots that can interact more effectively with the real world.

  7. More technical research in agentic AI. Creating AI agents, systems capable of autonomous operation in specific environments, has long challenged computer scientists. However, emerging research suggests that the performance of autonomous AI agents is improving. Current agents can now master complex games like Minecraft and effectively tackle real-world tasks, such as online shopping and research assistance.
  8. Closed LLMs significantly outperform open ones. On 10 select AI benchmarks, closed models outperformed open ones, with a median performance advantage of . Differences in the performance of closed and open models carry important implications for AI policy debates.

Chapter 3: Responsible AI

  1. Robust and standardized evaluations for LLM responsibility are seriously lacking. New research from the AI Index reveals a significant lack of standardization in responsible AI reporting. Leading developers, including OpenAI, Google, and Anthropic, primarily test their models against different responsible AI benchmarks. This practice complicates efforts to systematically compare the risks and limitations of top AI models.
  2. Political deepfakes are easy to generate and difficult to detect. Political deepfakes are already affecting elections across the world, with recent research suggesting that existing AI deepfake methods perform with varying levels of accuracy. In addition, new projects like CounterCloud demonstrate how easily AI can create and disseminate fake content.
  3. Researchers discover more complex vulnerabilities in LLMs. Previously, most efforts to red team AI models focused on testing adversarial prompts that intuitively made sense to humans. This year, researchers found less obvious strategies to get LLMs to exhibit harmful behavior, like asking the models to infinitely repeat random words.
  4. Risks from AI are becoming a concern for businesses across the globe. A global survey on responsible AI highlights that companies' top AI-related concerns include privacy, data security, and reliability. The survey shows that organizations are beginning to take steps to mitigate these risks. Globally, however, most companies have so far only mitigated a small portion of these risks.
  5. LLMs can output copyrighted material. Multiple researchers have shown that the generative outputs of popular LLMs may contain copyrighted material, such as excerpts from The New York Times or scenes from movies. Whether such output constitutes copyright violations is becoming a central legal question.

  6. AI developers score low on transparency, with consequences for research. The newly introduced Foundation Model Transparency Index shows that AI developers lack transparency, especially regarding the disclosure of training data and methodologies. This lack of openness hinders efforts to further understand the robustness and safety of AI systems.


  7. Extreme AI risks are difficult to analyze. Over the past year, a substantial debate has emerged among AI scholars and practitioners regarding the focus on immediate model risks, like algorithmic discrimination, versus potential long-term existential threats. It has become challenging to distinguish which claims are scientifically founded and should inform policymaking. This difficulty is compounded by the tangible nature of already-present short-term risks in contrast with the theoretical nature of existential threats.
  8. The number of AI incidents continues to rise. According to the AI Incident Database, which tracks incidents related to the misuse of AI, 123 incidents were reported in 2023, a 32.3% increase from 2022. Since 2013, AI incidents have grown by over twentyfold. A notable example includes AI-generated, sexually explicit deepfakes of Taylor Swift that were widely shared online.
  9. ChatGPT is politically biased. Researchers find a significant bias in ChatGPT toward Democrats in the United States and the Labour Party in the U.K. This finding raises concerns about the tool's potential to influence users' political views, particularly in a year marked by major global elections.
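
The incident figures above describe growth in a count, which is a relative (percent) change rather than a difference between two shares. The following is a minimal sketch of that distinction, not the AI Index's own methodology. The helper names are hypothetical, and the 2022 count of 93 is an assumption back-computed from the 123 incidents and the 32.3 growth figure quoted above (read here as a relative increase); it is not a number stated in the text.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0


def percentage_point_change(old_share: float, new_share: float) -> float:
    """Absolute difference between two shares that are already in percent."""
    return new_share - old_share


# 123 incidents in 2023 comes from the text. The 2022 value is an assumed,
# back-computed figure (123 / 1.323 is roughly 93), not a quoted one.
incidents_2022_assumed = 93
incidents_2023 = 123
growth = percent_change(incidents_2022_assumed, incidents_2023)
print(f"Incident count grew by {growth:.1f}%")

# Hypothetical shares, used only to show what a percentage-point change is.
points = percentage_point_change(40.0, 45.0)
print(f"A share moving from 40% to 45% rises by {points:.1f} percentage points")
```

Percentage points are the natural unit when comparing two shares, while a change in a raw count such as incidents is usually reported as a percent change.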

Chapter 4: Economy

  1. Generative AI investment skyrockets. Despite a decline in overall AI private investment last year, funding for generative AI surged, nearly octupling from 2022 to reach billion. Major players in the generative AI space, including OpenAI, Anthropic, Hugging Face, and Inflection, reported substantial fundraising rounds.
  2. Already a leader, the United States pulls even further ahead in AI private investment. In 2023, the United States saw AI investments reach billion, nearly 8.7 times more than China, the next highest investor. While private AI investment in China and the European Union, including the United Kingdom, declined by and , respectively, since 2022, the United States experienced a notable increase of in the same time frame.
  3. Fewer AI jobs in the United States and across the globe. In 2022, AI-related positions made up of all job postings in America, a figure that decreased to in 2023. This decline in AI job listings is attributed to fewer postings from leading AI firms and a reduced proportion of tech roles within these companies.
  4. AI decreases costs and increases revenues. A new McKinsey survey reveals that of surveyed organizations report cost reductions from implementing AI (including generative AI), and report revenue increases. Compared to the previous year, there was a 10 percentage point increase in respondents reporting decreased costs, suggesting AI is driving significant business efficiency gains.

  5. Total AI private investment declines again, while the number of newly funded AI companies increases. Global private AI investment has fallen for the second year in a row, though less than the sharp decrease from 2021 to 2022. The count of newly funded AI companies spiked to 1,812, up from the previous year.
  6. AI organizational adoption ticks up. A 2023 McKinsey report reveals that of organizations now use AI (including generative AI) in at least one business unit or function, up from in 2022 and in 2017.
  7. China dominates industrial robotics. Since surpassing Japan in 2013 as the leading installer of industrial robots, China has significantly widened the gap with the nearest competitor nation. In 2013, China's installations accounted for of the global total, a share that rose to by 2022.


  8. Greater diversity in robot installations. In 2017, collaborative robots represented a mere of all new industrial robot installations, a figure that climbed to by 2022. Similarly, 2022 saw a rise in service robot installations across all application categories, except for medical robotics. This trend indicates not just an overall increase in robot installations but also a growing emphasis on deploying robots for human-facing roles.
  9. The data is in: AI makes workers more productive and leads to higher quality work. In 2023, several studies assessed AI's impact on labor, suggesting that AI enables workers to complete tasks more quickly and to improve the quality of their output. These studies also demonstrated AI's potential to bridge the skill gap between low- and high-skilled workers. Still, other studies caution that using AI without proper oversight can lead to diminished performance.
  10. Fortune 500 companies start talking a lot about AI, especially generative AI. In 2023, AI was mentioned in 394 earnings calls (nearly of all Fortune 500 companies), a notable increase from 266 mentions in 2022. Since 2018, mentions of AI in Fortune 500 earnings calls have nearly doubled. The most frequently cited theme, appearing in of all earnings calls, was generative AI.

Chapter 5: Science and Medicine

  1. Scientific progress accelerates even further, thanks to AI. In 2022, AI began to advance scientific discovery. 2023, however, saw the launch of even more significant science-related AI applications, from AlphaDev, which makes algorithmic sorting more efficient, to GNoME, which facilitates the process of materials discovery.
  2. AI helps medicine take significant strides forward. In 2023, several significant medical systems were launched, including EVEscape, which enhances pandemic prediction, and AlphaMissense, which assists in AI-driven mutation classification. AI is increasingly being utilized to propel medical advancements.
  3. Highly knowledgeable medical AI has arrived. Over the past few years, AI systems have shown remarkable improvement on the MedQA benchmark, a key test for assessing AI's clinical knowledge. The standout model of 2023, GPT-4 Medprompt, reached an accuracy rate of , marking a 22.6 percentage point increase from the highest score in 2022. Since the benchmark's introduction in 2019, AI performance on MedQA has nearly tripled.
  4. The FDA approves more and more AI-related medical devices. In 2022, the FDA approved 139 AI-related medical devices, a 12.1% increase from 2021. Since 2012, the number of FDA-approved AI-related medical devices has increased by more than 45-fold. AI is increasingly being used for real-world medical purposes.

Chapter 6: Education

  1. The number of American and Canadian CS bachelor's graduates continues to rise, new CS master's graduates stay relatively flat, and PhD graduates modestly grow. While the number of new American and Canadian bachelor's graduates has consistently risen for more than a decade, the number of students opting for graduate education in CS has flattened. Since 2018, the number of CS master's and PhD graduates has slightly declined.
  2. The migration of AI PhDs to industry continues at an accelerating pace. In 2011, roughly equal percentages of new AI PhDs took jobs in industry (40.9%) and academia (41.6%). However, by 2022, a significantly larger proportion (70.7%) joined industry after graduation compared to those entering academia (20.0%). Over the past year alone, the share of industry-bound AI PhDs has risen by 5.3 percentage points, indicating an intensifying brain drain from universities into industry.
  3. Less transition of academic talent from industry to academia. In of new AI faculty in the United States and Canada were from industry. By 2021, this figure had declined to 11%, and in 2022, it further dropped to . This trend indicates a progressively lower migration of high-level AI talent from industry into academia.
  4. CS education in the United States and Canada becomes less international. Proportionally fewer international CS bachelor's, master's, and PhDs graduated in 2022 than in 2021. The drop in international students in the master's category was especially pronounced.

  5. More American high school students take CS courses, but access problems remain. In 2022, 201,000 AP CS exams were administered. Since 2007, the number of students taking these exams has increased more than tenfold. However, recent evidence indicates that students in larger high schools and those in suburban areas are more likely to have access to CS courses.
  6. AI-related degree programs are on the rise internationally. The number of English-language, AI-related postsecondary degree programs has tripled since 2017, showing a steady annual increase over the past five years. Universities worldwide are offering more AI-focused degree programs.


  7. The United Kingdom and Germany lead in European informatics, CS, CE, and IT graduate production. The United Kingdom and Germany lead Europe in producing the highest number of new informatics, CS, CE, and information bachelor's, master's, and PhD graduates. On a per capita basis, Finland leads in the production of both bachelor's and PhD graduates, while Ireland leads in the production of master's graduates.

Chapter 7: Policy and Governance

  1. The number of AI regulations in the United States sharply increases. The number of AI-related regulations has risen significantly in the past year and over the last five years. In 2023, there were -related regulations, up from just one in 2016. Last year alone, the total number of AI-related regulations grew by .
  2. The United States and the European Union advance landmark AI policy action. In 2023, policymakers on both sides of the Atlantic put forth substantial proposals for advancing AI regulation. The European Union reached a deal on the terms of the AI Act, a landmark piece of legislation enacted in 2024. Meanwhile, President Biden signed an Executive Order on AI, the most notable AI policy initiative in the United States that year.
  3. AI captures U.S. policymaker attention. The year 2023 witnessed a remarkable increase in AI-related legislation at the federal level, with 181 bills proposed, more than double the 88 proposed in 2022.
  4. Policymakers across the globe cannot stop talking about AI. Mentions of AI in legislative proceedings across the globe have nearly doubled, rising from 1,247 in 2022 to 2,175 in 2023. AI was mentioned in the legislative proceedings of 49 countries in 2023. Moreover, at least one country from every continent discussed AI in 2023, underscoring the truly global reach of AI policy discourse.
  5. More regulatory agencies turn their attention toward AI. The number of U.S. regulatory agencies issuing AI regulations increased to 21 in 2023 from 17 in 2022, indicating a growing concern over AI regulation among a broader array of American regulatory bodies. Some of the new regulatory agencies that enacted AI-related regulations for the first time in 2023 include the Department of Transportation, the Department of Energy, and the Occupational Safety and Health Administration.
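
Several of the policy highlights above are stated as multiples ("more than double", "nearly doubled"). As a rough illustration, the ratios behind those phrases can be computed from the counts quoted in the list. This is only a sketch: the function name and dictionary layout are hypothetical, and nothing here reflects how the AI Index itself processes its data.

```python
def fold_change(old: float, new: float) -> float:
    """How many times larger `new` is than `old`."""
    return new / old


# (2022 value, 2023 value) pairs taken from the highlights above.
policy_counts = {
    "federal AI bills proposed": (88, 181),
    "AI mentions in legislative proceedings": (1_247, 2_175),
    "U.S. agencies issuing AI regulations": (17, 21),
}

for label, (value_2022, value_2023) in policy_counts.items():
    ratio = fold_change(value_2022, value_2023)
    increase_pct = (ratio - 1.0) * 100.0
    print(f"{label}: {ratio:.2f}x ({increase_pct:.1f}% increase)")
```

Reporting both the multiple and the percent increase makes it easier to see how close a phrase like "nearly doubled" is to an exact factor of two.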

Chapter 8: Diversity

  1. U.S. and Canadian bachelor's, master's, and PhD CS students continue to grow more ethnically diverse. While white students continue to be the most represented ethnicity among new resident graduates at all three levels, the representation from other ethnic groups, such as Asian, Hispanic, and Black or African American students, continues to grow. For instance, since 2011, the proportion of Asian CS bachelor's degree graduates has increased by 19.8 percentage points, and the proportion of Hispanic CS bachelor's degree graduates has grown by 5.2 percentage points.
  2. Substantial gender gaps persist in European informatics, CS, CE, and IT graduates at all educational levels. Every surveyed European country reported more male than female graduates in bachelor's, master's, and PhD programs for informatics, CS, CE, and IT. While the gender gaps have narrowed in most countries over the last decade, the rate of this narrowing has been slow.
  3. U.S. K-12 CS education is growing more diverse, reflecting changes in both gender and ethnic representation. The proportion of AP CS exams taken by female students rose from in 2007 to 30.5% in 2022. Similarly, the participation of Asian, Hispanic/Latino/Latina, and Black/African American students in AP CS has consistently increased year over year.

Chapter 9: Public Opinion

  1. People across the globe are more cognizant of AI's potential impact, and more nervous. A survey from Ipsos shows that, over the last year, the proportion of those who think AI will dramatically affect their lives in the next three to five years has increased from to . Moreover, express nervousness toward AI products and services, marking a 13 percentage point rise from 2022. In America, Pew data suggests that of Americans report feeling more concerned than excited about AI, rising from in 2022.
  2. AI sentiment in Western nations continues to be low, but is slowly improving. In 2022, several developed Western nations, including Germany, the Netherlands, Australia, Belgium, Canada, and the United States, were among the least positive about AI products and services. Since then, each of these countries has seen a rise in the proportion of respondents acknowledging the benefits of AI, with the Netherlands experiencing the most significant shift.
  3. The public is pessimistic about AI's economic impact. In an Ipsos survey, only of respondents feel AI will improve their job. Only anticipate AI will boost the economy, and believe it will enhance the job market.

  4. Demographic differences emerge regarding AI optimism. Significant demographic differences exist in perceptions of AI's potential to enhance livelihoods, with younger generations generally more optimistic. For instance, of Gen Z respondents believe AI will improve entertainment options, versus only of baby boomers. Additionally, individuals with higher incomes and education levels are more optimistic about AI's positive impacts on entertainment, health, and the economy than their lower-income and less-educated counterparts.
  5. ChatGPT is widely known and widely used. An international survey from the University of Toronto suggests that of respondents are aware of ChatGPT. Of those aware, around half report using ChatGPT at least once weekly.
Artificial Intelligence Index Report 2024

CHAPTER 1:
Research and Development

Preview

Overview
Chapter Highlights
1.1 Publications
  Overview
  Total Number of AI Publications
  By Type of Publication
  By Field of Study
  By Sector
  AI Journal Publications
  AI Conference Publications
1.2 Patents
  AI Patents
  Overview
  By Filing Status and Region
1.3 Frontier AI Research
  General Machine Learning Models
  Overview
  Sector Analysis
  National Affiliation
  Parameter Trends
  Compute Trends
  Highlight: Will Models Run Out of Data?
  Foundation Models
  Model Release
  Organizational Affiliation
  National Affiliation
  Training Cost
1.4 AI Conferences
  Conference Attendance
1.5 Open-Source AI Software
  Projects
  Stars

This chapter studies trends in AI research and development. It begins by examining trends in AI publications and patents, and then examines trends in notable AI systems and foundation models. It concludes by analyzing AI conference attendance and open-source AI software projects.

Chapter Highlights

  1. Industry continues to dominate frontier AI research. In 2023, industry produced 51 notable machine learning models, while academia contributed only 15. There were also 21 notable models resulting from industry-academia collaborations in 2023, a new high.
  2. More foundation models and more open foundation models. In 2023, a total of 149 foundation models were released, more than double the amount released in 2022. Of these newly released models, were open-source, compared to only in 2022 and in 2021.
  3. Frontier models get way more expensive. According to AI Index estimates, the training costs of state-of-the-art AI models have reached unprecedented levels. For example, OpenAI's GPT-4 used an estimated million worth of compute to train, while Google's Gemini Ultra cost million for compute.
  4. The United States leads China, the EU, and the U.K. as the leading source of top AI models. In 2023, 61 notable AI models originated from U.S.-based institutions, far outpacing the European Union's 21 and China's 15.
  5. The number of AI patents skyrockets. From 2021 to 2022, AI patent grants worldwide increased sharply by . Since 2010, the number of granted AI patents has increased more than 31 times.
  6. China dominates AI patents. In 2022, China led global AI patent origins with , significantly outpacing the United States, which accounted for of AI patent origins. Since 2010, the U.S. share of AI patents has decreased from .
  7. Open-source AI research explodes. Since 2011, the number of AI-related projects on GitHub has seen a consistent increase, growing from 845 in 2011 to approximately 1.8 million in 2023. Notably, there was a sharp 59.3% rise in the total number of GitHub AI projects in 2023 alone. The total number of stars for AI-related projects on GitHub also significantly increased in 2023, more than tripling from 4.0 million in 2022 to 12.2 million.
  8. The number of AI publications continues to rise. Between 2010 and 2022, the total number of AI publications nearly tripled, rising from approximately 88,000 in 2010 to more than 240,000 in 2022. The increase over the last year was a modest .
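
The GitHub and publication highlights above combine an end value with a growth rate or a multiple. The sketch below shows how a prior-year value is implied by a percent rise, and how the "tripling" claims correspond to simple ratios. The function names are hypothetical, and because the inputs are rounded figures from the text (for example "approximately 1.8 million"), the outputs should be read as rough estimates rather than reported data.

```python
def implied_previous(current: float, pct_increase: float) -> float:
    """Value before a rise of `pct_increase` percent that produced `current`."""
    return current / (1.0 + pct_increase / 100.0)


def fold_change(old: float, new: float) -> float:
    """How many times larger `new` is than `old`."""
    return new / old


# Approximately 1.8 million GitHub AI projects in 2023 after a 59.3% rise.
projects_2022_estimate = implied_previous(1_800_000, 59.3)
print(f"Implied 2022 GitHub AI projects: about {projects_2022_estimate:,.0f}")

# Stars in millions (2022 to 2023) and publication counts (2010 to 2022).
print(f"GitHub stars: {fold_change(4.0, 12.2):.2f}x")
print(f"AI publications: {fold_change(88_000, 240_000):.2f}x")
```

A ratio slightly above 3 matches "more than tripling", while a ratio somewhat below 3 matches "nearly tripled".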

1.1 Publications

Overview

The figures below present the global count of English- and Chinese-language AI publications from 2010 to 2022, categorized by type of affiliation and cross-sector collaborations. Additionally, this section details publication data for journal articles and conference papers.

Total Number of AI Publications

Figure 1.1.1 displays the global count of AI publications. Between 2010 and 2022, the total number of AI publications nearly tripled, rising from approximately 88,000 in 2010 to more than 240,000 in 2022. The increase over the last year was a modest .
Number of AI publications in the world, 2010-22
1 The data on publications presented this year is sourced from CSET. Both the methodology and data sources used by CSET to classify AI publications have changed since their data was last featured in the AI Index (2023). As a result, the numbers reported in this year's section differ slightly from those reported in last year's edition. Moreover, the AI-related publication data is fully available only up to 2022 due to a significant lag in updating publication data. Readers are advised to approach publication figures with appropriate caution.

By Type of Publication

Figure 1.1.2 illustrates the distribution of AI publication types globally over time. In 2022, there were roughly journal articles compared to roughly 42,000 conference submissions. Since 2015, AI journal and conference publications have increased at comparable rates. In 2022, there were 2.6 times as many conference publications and 2.4 times as many journal publications as there were in 2015.
Number of AI publications by type, 2010-22
Source: Center for Security and Emerging Technology, 2023 | Chart: 2024 AI Index report

By Field of Study

Figure 1.1.3 examines the total number of AI publications by field of study since 2010. Machine learning publications have seen the most rapid growth over the past decade, increasing nearly sevenfold since 2015. Following machine learning, the most published AI fields in 2022 were computer vision (21,309 publications), pattern recognition (19,841), and process management .
Number of AI publications by field of study (excluding Other AI), 2010-22
Source: Center for Security and Emerging Technology, 2023 | Chart: 2024 AI Index report
Figure 1.1.3

By Sector

This section presents the distribution of AI publications by sector (education, government, industry, nonprofit, and other), first globally and then specifically within the United States, China, and the European Union plus the United Kingdom. In 2022, the academic sector contributed the majority of AI publications (81.1%), maintaining its position as the leading global source of AI research over the past decade across all regions (Figure 1.1.4 and Figure 1.1.5). Industry participation is most significant in the United States, followed by the European Union plus the United Kingdom, and China (Figure 1.1.5).
Figure 1.1.4

AI publications (% of total) by sector and geographic area, 2022

Figure 1.1.5

AI Journal Publications

Figure 1.1.6 illustrates the total number of AI journal publications from 2010 to 2022. The number of AI journal publications experienced modest growth from 2010 to 2015 but has since grown to approximately 2.4 times its 2015 level. Between 2021 and 2022, AI journal publications saw a 4.5% increase.
Number of AI journal publications, 2010-22
Source: Center for Security and Emerging Technology, 2023 | Chart: 2024 AI Index report

AI Conference Publications

Figure 1.1.7 visualizes the total number of AI conference publications since 2010. The number of AI conference publications has seen a notable rise in the past two years, climbing from 22,727 in 2020 to 31,629 in 2021, and reaching 41,174 in 2022. Over the last year alone, there was a 30.2% increase in AI conference publications. Since 2010, the number of AI conference publications has more than doubled.
Number of AI conference publications, 2010-22
Source: Center for Security and Emerging Technology, 2023 | Chart: 2024 AI Index report
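
The year-over-year growth implied by the three conference publication counts quoted above can be checked directly. The following is a small sketch under the assumption that growth is measured as the simple percent change between consecutive years; the function and variable names are hypothetical and are not taken from the AI Index or CSET.

```python
# Conference publication counts quoted in the paragraph above.
conference_publications = {2020: 22_727, 2021: 31_629, 2022: 41_174}


def year_over_year_growth(series: dict[int, int]) -> dict[int, float]:
    """Percent change of each year relative to the preceding year in the series."""
    years = sorted(series)
    return {
        year: (series[year] - series[previous]) / series[previous] * 100.0
        for previous, year in zip(years, years[1:])
    }


for year, pct in year_over_year_growth(conference_publications).items():
    print(f"{year}: {pct:+.1f}% versus the previous year")
```

Computing the rate from the raw counts is a quick way to sanity-check a reported growth figure against the numbers it is supposed to summarize.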
1.2 Patents

This section examines trends over time in global AI patents, which can reveal important insights into the evolution of innovation, research, and development within AI. Additionally, analyzing AI patents can reveal how these advancements are distributed globally. Similar to the publications data, there is a noticeable delay in AI patent data availability, with 2022 being the most recent year for which data is accessible. The data in this section comes from CSET.

AI Patents

Overview

Figure 1.2.1 examines the global growth in granted AI patents from 2010 to 2022. Over the last decade, there has been a significant rise in the number of AI patents, with a particularly sharp increase in recent years. For instance, between 2010 and 2014, the total growth in granted AI patents was . However, from 2021 to 2022 alone, the number of AI patents increased by .
Number of AI patents granted, 2010-22
Source: Center for Security and Emerging Technology, 2023 | Chart: 2024 AI Index report

By Filing Status and Region
按申请状态和地区

The following section disaggregates AI patents by their filing status (whether they were granted or not granted), as well as the region of their publication.
以下部分按其申请状态(已授予或未授予)以及发布地区对 专利进行了细分。
Figure 1.2.2 compares global AI patents by application status. In 2022, the number of ungranted AI patents was more than double the number granted. Over time, the landscape of AI patent approvals has shifted markedly. Until 2015, a larger proportion of filed AI patents were granted. However, since then, the majority of AI patent filings have not been granted, with the gap widening significantly. For instance, in of all filed AI patents were not granted. By 2022, this figure had risen to .
图 1.2.2 比较了全球 Al 专利的申请状态。2022 年,未获批准的 Al 专利 数量超过了获批准的 数量的两倍以上。随着时间的推移,Al 专利批准的情况发生了显著变化。直到 2015 年,提交的 Al 专利中获批准的比例更高。然而,自那时起,大多数 Al 专利申请未获批准,差距显著扩大。例如,在 提交的所有 Al 专利中,未获批准的比例。到 2022 年,这一数字已上升至

AI patents by application status,
申请状态的 Al 专利,

The gap between granted and not granted AI patents is evident across all major patent-originating geographic areas, including China, the European Union and United Kingdom, and the United States (Figure 1.2.3). In recent years, all three geographic areas have experienced an increase in both the total number of AI patent filings and the number of patents granted.
授予和未授予的 AI 专利之间的差距在所有主要专利产生地区都很明显,包括中国、欧盟和英国以及美国(图 1.2.3)。近年来,这三个地理区域在 AI 专利申请总数和授予专利数方面都出现了增加。

AI patents by application status by geographic area, 2010-22
2010 年至 2022 年各地区按申请状态划分的 Al 专利

Source: Center for Security and Emerging Technology, 2023 | Chart: 2024 AI Index report
源自:安全与新兴技术中心,2023 年 | 图表:2024 年 Al 指数报告
Figure 1.2.4 showcases the regional breakdown of granted AI patents. As of 2022, the bulk of the world's granted AI patents originated from East Asia and the Pacific, with North America being the next largest contributor at . Up until 2011, North America led in the number of global AI patents. However, since then, there has been a significant shift toward an increasing proportion of AI patents originating from East Asia and the Pacific.
图 1.2.4 展示了授予的人工智能专利的区域分布。截至 2022 年,世界大部分授予的人工智能专利 源自东亚和太平洋地区,北美洲是下一个最大的贡献者,达到 。直到 2011 年,北美曾在全球人工智能专利数量上处于领先地位。然而,自那时起,人工智能专利的比例出现了显著转变,越来越多的专利来源于东亚和太平洋地区。

Granted AI patents (% of world total) by region, 2010-22
按地区划分,2010-22 年各地区授予的 Al 专利(占世界总量的比例)

Disaggregated by geographic area, the majority of the world's granted AI patents are from China (61.1%) and the United States (20.9%) (Figure 1.2.5). The share of AI patents originating from the United States has declined from in 2010.
按地理区域细分,世界授予的大多数 Al 专利来自中国(61.1%)和美国(20.9%)(图 1.2.5)。来自美国的 Al 专利份额已从 2010 年的 下降。
Granted AI patents (% of world total) by geographic area, 2010-22
按地理区域划分,2010-22 年各地区授予的 Al 专利(占世界总量的比例)
Figure 1.2.6 and Figure 1.2.7 document which countries lead in AI patents per capita. In 2022, the country with the most granted AI patents per 100,000 inhabitants was South Korea (10.3), followed by Luxembourg (8.8) and the United States (4.2) (Figure 1.2.6). Figure 1.2.7 highlights the change in granted AI patents per capita from 2012 to 2022. Singapore, South Korea, and China experienced the greatest increase in AI patenting per capita during that time period.
图 1.2.6 和图 1.2.7 记录了哪些国家在人均 AI 专利方面处于领先地位。2022 年,每 10 万居民中拥有最多 AI 专利的国家是韩国(10.3),其次是卢森堡(8.8)和美国(4.2)(图 1.2.6)。图 1.2.7 突出显示了 2012 年至 2022 年间每人均获得的 AI 专利数量的变化。新加坡、韩国和中国在那段时间内经历了 AI 专利每人均增长最多。
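The per capita measure used here is a simple normalization: granted patents divided by population, scaled to 100,000 inhabitants. The sketch below shows the arithmetic. The function name and the input numbers are hypothetical and chosen only for illustration; they are not figures from the report.

```python
def patents_per_100k(granted_patents: int, population: int) -> float:
    """Granted patents per 100,000 inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return granted_patents / population * 100_000


# Hypothetical inputs for illustration only (not data from the report).
example_patents = 5_000
example_population = 50_000_000

rate = patents_per_100k(example_patents, example_population)
print(f"{rate:.1f} granted AI patents per 100,000 inhabitants")
```

Normalizing by population is what allows a small country to rank above much larger ones on this measure.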
Granted AI patents per 100,000 inhabitants by country, 2022
2022 年各国每 10 万居民获得的 Al 专利数量
Source: Center for Security and Emerging Technology, 2023 | Chart: 2024 AI Index report
源自:安全与新兴技术中心,2023 年 | 图表:2024 年 Al 指数报告
Percentage change of granted AI patents per 100,000 inhabitants by country, 2012 vs. 2022
每 10 万居民获得的 Al 专利授权数量的百分比变化,按国家划分,2012 年对比 2022 年
This section explores the frontier of AI research. While many new AI models are introduced annually, only a small sample represents the most advanced research. Admittedly, what constitutes advanced or frontier research is somewhat subjective. Frontier research could reflect a model posting a new state-of-the-art result on a benchmark, introducing a meaningful new architecture, or exercising some impressive new capabilities.
本节探讨了 Al 研究的前沿。虽然每年都会推出许多新的 Al 模型,但只有少数代表了最先进的研究。诚然,什么构成先进或前沿研究在某种程度上是主观的。前沿研究可能反映了一个模型在基准测试中发布了新的最先进结果,引入了一个有意义的新架构,或者展示了一些令人印象深刻的新能力。
The AI Index studies trends in two types of frontier AI models: "notable models" and foundation models. Epoch, an AI Index data provider, uses the term "notable machine learning models" to designate noteworthy models handpicked as being particularly influential within the AI/machine learning ecosystem. In contrast, foundation models are exceptionally large AI models trained on massive datasets, capable of performing a multitude of downstream tasks. Examples of foundation models include GPT-4, Claude 3, and Gemini. While many foundation models may qualify as notable models, not all notable models are foundation models.
Al 指数研究两种类型的前沿 Al 模型的趋势:"显著模型"和基础模型。 Epoch,一个 Al 指数数据提供商,使用术语"显著机器学习模型"来指定被手工挑选为在 AI/机器学习生态系统中特别有影响力的显著模型。相比之下,基础模型是在大规模数据集上训练的异常庞大的 Al 模型,能够执行多种下游任务。基础模型的示例包括 GPT-4、Claude 3 和 Gemini。虽然许多基础模型可能符合显著模型的条件,但并非所有显著模型都是基础模型。
Within this section, the AI Index explores trends in notable models and foundation models from various perspectives, including originating organization, country of origin, parameter count, and compute usage. The analysis concludes with an examination of machine learning training costs.
在本节中,Al 指数从各种角度探讨显著模型和基础模型的趋势,包括起源组织、起源国家、参数数量和计算使用情况。分析最后对机器学习训练成本进行了审查。

1.3 Frontier AI Research
1.3 前沿 Al 研究

General Machine Learning Models
通用机器学习模型

Overview 概述

Epoch AI is a group of researchers dedicated to studying and predicting the evolution of advanced AI. They maintain a database of AI and machine learning models released since the 1950s, selecting entries based on criteria such as state-of-the-art advancements, historical significance, or high citation rates. Analyzing these models provides a comprehensive overview of the machine learning landscape's evolution, both in recent years and over the past few decades. Some models may be missing from the dataset; however, the dataset can reveal trends in relative terms.
Epoch AI 是一群致力于研究和预测先进 演变的研究人员。他们维护着自 1950 年代以来发布的 和机器学习模型的数据库,根据最新技术进展、历史意义或高引用率等标准选择条目。分析这些模型可以全面了解机器学习领域的演变,无论是近年来还是过去几十年。 数据集中可能缺少一些模型,但数据集可以揭示相对趋势。
3 "AI system" refers to a computer program or product based on AI, such as ChatGPT. "AI model" refers to a collection of parameters whose values are learned during training, such as GPT-4.
4 New and historic models are continually added to the Epoch database, so the total year-by-year counts of models included in this year's AI Index might not exactly match those published in last year's report.
"AI 系统"指的是基于人工智能的计算机程序或产品,例如 ChatGPT。"AI 模型"指的是在训练过程中学习到的一组参数值,例如 GPT-4。新的和历史模型不断被添加到 Epoch 数据库中,因此本年度 AI 指数中包含的模型总数可能与去年报告中发布的数量不完全匹配。

Sector Analysis 领域分析

Until 2014, academia led in the release of machine learning models. Since then, industry has taken the lead. In 2023, there were 51 notable machine learning models produced by industry compared to just 15 from academia (Figure 1.3.1). Significantly, 21 notable models resulted from industry/academic collaborations in 2023, a new high.
直到 2014 年,学术界主导了机器学习模型的发布。从那时起,工业界开始领先。2023 年,工业界发布了 51 个显著的机器学习模型,而学术界仅发布了 15 个(图 1.3.1)。值得注意的是,2023 年有 21 个显著模型是由工业界和学术界合作产生的,创下新高。

Creating cutting-edge AI models now demands a substantial amount of data, computing power, and financial resources that are not available in academia. This shift toward increased industrial dominance in leading AI models was first highlighted in last year's AI Index report. Although this year the gap has slightly narrowed, the trend largely persists.
创建尖端 模型现在需要大量的数据、计算能力和财务资源,这些资源在学术界是不可用的。去年的 AI 指数报告首次突显了领先 AI 模型中增加工业主导地位的趋势。尽管今年差距略有缩小,但这一趋势基本上仍然存在。
Number of notable machine learning models by sector, 2003-23
各行业显著机器学习模型数量,2003-23

National Affiliation 国家隶属

To illustrate the evolving geopolitical landscape of AI, the AI Index research team analyzed the country of origin of notable models.
为了说明 不断演变的地缘政治格局,Al 指数研究团队分析了显著模型的原产国。
Figure 1.3.2 displays the total number of notable machine learning models attributed to the location of researchers' affiliated institutions.
图 1.3.2 显示了与研究人员所属机构位置相关的显著机器学习模型总数。
In 2023, the United States led with 61 notable machine learning models, followed by China with 15 and France with 8. For the first time since 2019, the European Union and the United Kingdom together have surpassed China in the number of notable AI models produced (Figure 1.3.3). Since 2003, the United States has produced more models than other major geographic regions such as the United Kingdom, China, and Canada (Figure 1.3.4).
2023 年,美国以 61 个显著机器学习模型领先,其次是中国 15 个,法国 8 个。自 2019 年以来,欧盟和英国联合超过中国在生产显著 Al 模型数量上首次超过(图 1.3.3)。自 2003 年以来,美国生产的模型数量超过其他主要地理区域,如英国、中国和加拿大(图 1.3.4)。
Figure 1.3.2 图 1.3.2

Number of notable machine learning models by select geographic area, 2003-23
2003 年至 2023 年各地区知名机器学习模型数量

Number of notable machine learning models by geographic area, 2003-23 (sum)
2003 年至 2023 年各地区知名机器学习模型数量总和
Source: Epoch, 2023 | Chart: 2024 AI Index report
来源:Epoch,2023 | 图表:2024 年 Al 指数报告
Figure 1.3.4 图 1.3.4
Parameters in machine learning models are numerical values learned during training that determine how a model interprets input data and makes predictions. Models trained on more data will usually have more parameters than those trained on less data. Likewise, models with more parameters typically outperform those with fewer parameters.
机器学习模型中的参数是在训练过程中学习到的数值,它们决定了模型如何解释输入数据并进行预测。使用更多数据训练的模型通常会比使用较少数据训练的模型具有更多的参数。同样,具有更多参数的模型通常会胜过具有较少参数的模型。
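To make the notion of a parameter count concrete, the sketch below counts the learned values in a small fully connected network: one weight per input-output connection plus one bias per output unit. The layer sizes are an arbitrary illustrative choice and are not taken from the Epoch dataset.

```python
def count_dense_parameters(layer_sizes: list[int]) -> int:
    """Count weights and biases in a fully connected network.

    layer_sizes lists the width of each layer, input first.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # one weight per connection
        total += n_out         # one bias per output unit
    return total


# Hypothetical architecture: 784 inputs, one hidden layer of 128, 10 outputs.
example_layers = [784, 128, 10]
num_parameters = count_dense_parameters(example_layers)
print(f"{num_parameters:,} parameters")
```

Frontier models follow the same principle at a vastly larger scale, with counts in the billions rather than the hundred thousand or so of this toy network.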
Figure 1.3.5 shows the parameter count of machine learning models in the Epoch dataset, categorized by the sector from which the models originate. Parameter counts have risen sharply since the early 2010s, reflecting the growing complexity of tasks AI models are designed for, the greater availability of data, improvements in hardware, and proven efficacy of larger models. High-parameter models are particularly notable in the industry sector, underscoring the capacity of companies like OpenAI, Anthropic, and Google to bear the computational costs of training on vast volumes of data.
图 1.3.5 展示了 Epoch 数据集中机器学习模型的参数数量,按照模型来源的领域进行分类。自 2010 年代初以来,参数数量急剧上升,反映了 Al 模型设计任务的复杂性增加,数据的更大可用性,硬件的改进以及更大模型的有效性。高参数模型在工业领域尤为显著,突显了像 OpenAl、Anthropic 和 Google 这样的公司承担训练大量数据计算成本的能力。
Number of parameters of notable machine learning models by sector, 2003-23
2003-23 年间各领域知名机器学习模型的参数数量
Source: Epoch, 2023 | Chart: 2024 AI Index report
来源:Epoch,2023 | 图表:2024 年 Al 指数报告
The term "compute" in AI models denotes the computational resources required to train and operate a machine learning model. Generally, the complexity of the model and the size of the training dataset directly influence the amount of compute needed. The more complex a model is, and the larger the underlying training data, the greater the amount of compute required for training.
在 Al 模型中,“计算”一词表示训练和操作机器学习模型所需的计算资源。通常,模型的复杂性和训练数据集的大小直接影响所需的计算量。模型越复杂,底层训练数据越大,训练所需的计算量就越大。
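The relationship between model size, data size, and training compute can be sketched with a common rule of thumb for dense transformer models: training compute is roughly 6 FLOP per parameter per training token. This approximation is an assumption brought in for illustration. It is not stated in the report and is not necessarily how Epoch derives its estimates, and the model sizes below are hypothetical.

```python
def approx_training_flop(num_parameters: float, num_training_tokens: float) -> float:
    """Rule-of-thumb training compute for a dense transformer (assumption):
    about 6 FLOP per parameter per training token."""
    return 6.0 * num_parameters * num_training_tokens


# Hypothetical models, for illustration only.
small = approx_training_flop(num_parameters=1e9, num_training_tokens=20e9)
large = approx_training_flop(num_parameters=10e9, num_training_tokens=200e9)

print(f"small model: {small:.2e} FLOP")
print(f"large model: {large:.2e} FLOP ({large / small:.0f}x more)")
```

Under this approximation, scaling both parameters and data by 10x multiplies training compute by 100x, which is one way to see why compute demand grows so quickly.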
Figure 1.3.6 visualizes the training compute required for notable machine learning models in the last 20 years. Recently, the compute usage of notable AI models has increased exponentially. This trend has been especially pronounced in the last five years. This rapid rise in compute demand has critical implications. For instance, models requiring more computation often have larger environmental footprints, and companies typically have more access to computational resources than academic institutions.
图 1.3.6 展示了过去 20 年中知名机器学习模型所需的训练计算量。最近,知名人工智能模型的计算使用量呈指数增长。这一趋势在过去五年中尤为明显。计算需求的快速增长具有重要的影响。例如,需要更多计算的模型通常具有更大的环境足迹,而公司通常比学术机构更容易获得计算资源。
Training compute of notable machine learning models by sector, 2003-23
2003-23 年间各领域知名机器学习模型的训练计算量
Source: Epoch, 2023 | Chart: 2024 AI Index report
来源:Epoch,2023 | 图表:2024 年 Al 指数报告
6 FLOP stands for "floating-point operation." A floating-point operation is a single arithmetic operation involving floating-point numbers, such as addition, subtraction, multiplication, or division. The number of FLOPs a processor or computer can perform per second is an indicator of its computational power. The higher the FLOP rate, the more powerful the computer is. An AI model with a higher FLOP rate reflects its requirement for more computational resources during training.
6 FLOP 代表“浮点运算”。浮点运算是涉及浮点数的单个算术运算,如加法、减法、乘法或除法。处理器或计算机每秒可以执行的 FLOP 数量是其计算能力的指标。 FLOP 速率越高,计算机的性能就越强大。具有更高 FLOP 速率的 AI 模型反映了其在训练过程中对更多计算资源的需求。
Figure 1.3.7 highlights the training compute of notable machine learning models since 2012. For example, AlexNet, one of the papers that popularized the now standard practice of using GPUs to improve AI models, required an estimated 470 petaFLOPs for training.
图 1.3.7 突出了自 2012 年以来值得注意的机器学习模型的训练计算。例如,AlexNet 是一篇推广现在标准做法的论文,即使用 GPU 来改进 Al 模型,其训练估计需要 470 petaFLOPs。

The original Transformer, released in 2017, required around 7,400 petaFLOPs. Google's Gemini Ultra, one of the current state-of-the-art foundation models, required 50 billion petaFLOPs.
最初的 Transformer 于 2017 年发布,大约需要 7,400 petaFLOPs。 Google 的 Gemini Ultra 是当前最先进的基础模型之一,需要 500 亿 petaFLOPs。
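To put these figures on a common scale, the sketch below converts the quoted petaFLOP values to raw floating-point operations (1 petaFLOP = 10^15 FLOP) and computes the ratios between the three models. Only the three compute values come from the text; the variable names are illustrative.

```python
PETA = 10**15  # 1 petaFLOP = 10^15 floating-point operations

# Estimated training compute in petaFLOP, as quoted in the text above.
training_compute_petaflop = {
    "AlexNet": 470,
    "Transformer": 7_400,
    "Gemini Ultra": 50_000_000_000,
}

for model, petaflop in training_compute_petaflop.items():
    print(f"{model}: {petaflop * PETA:.2e} FLOP")

transformer_vs_alexnet = (
    training_compute_petaflop["Transformer"] / training_compute_petaflop["AlexNet"]
)
gemini_vs_transformer = (
    training_compute_petaflop["Gemini Ultra"] / training_compute_petaflop["Transformer"]
)

print(f"Transformer vs. AlexNet: {transformer_vs_alexnet:.1f}x")
print(f"Gemini Ultra vs. Transformer: {gemini_vs_transformer:,.0f}x")
```

The second ratio, in the millions, illustrates the exponential growth in training compute described earlier in this section.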
Training compute of notable machine learning models by domain, 2012-23
2012-23 年领域内知名机器学习模型的训练计算
Source: Epoch, 2023 | Chart: 2024 AI Index report
来源:Epoch,2023 | 图表:2024 年 Al 指数报告
Highlight: 突出显示:

Will Models Run Out of Data?
模型会用尽数据吗?

As illustrated above, a significant proportion of recent algorithmic progress, including progress behind powerful LLMs, has been achieved by training models on increasingly larger amounts of data. As noted recently by Anthropic cofounder and AI Index Steering Committee member Jack Clark, foundation models have been trained on meaningful percentages of all the data that has ever existed on the internet.
如上所述,最近算法进展的显著部分,包括强大的LLMs背后的进展,是通过在越来越大量的数据上训练模型实现的。正如 Anthropic 联合创始人兼 AI 指数指导委员会成员杰克·克拉克最近指出的,基础模型已经在互联网上存在的所有数据中训练了有意义的百分比。
The growing data dependency of AI models has led to concerns that future generations of computer scientists will run out of data to further scale and improve their systems. Research from Epoch suggests that these concerns are somewhat warranted. Epoch researchers have generated historical and compute-based projections for when AI researchers might expect to run out of data. The historical projections are based on observed growth rates in the sizes of data used to train foundation models. The compute projections adjust the historical growth rate based on projections of compute availability.
Al 模型日益依赖数据,这引发了人们对未来计算机科学家会用尽数据以进一步扩展和改进系统的担忧。Epoch 的研究表明,这些担忧在一定程度上是有根据的。Epoch 的研究人员已经为 Al 研究人员何时可能用尽数据生成了历史和基于计算的预测。历史预测基于用于训练基础模型的数据大小的增长率。计算预测根据计算可用性的预测调整历史增长率。
For instance, the researchers estimate that computer scientists could deplete the stock of high-quality language data by 2024, exhaust low-quality language data within two decades, and use up image data by the late 2030s to mid-2040s (Figure 1.3.8).
例如,研究人员估计计算机科学家可能在 2024 年之前耗尽高质量语言数据,两十年内用尽低质量语言数据,并在 2030 年代末至 2040 年代中期用尽图像数据(图 1.3.8)。
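The logic of a growth-rate projection can be sketched as follows: if training dataset sizes grow by a constant factor each year, the time until they reach a fixed stock of data follows from a logarithm. This is a deliberately simplified illustration and not Epoch's actual model, which also accounts for growth in the data stock and for compute constraints. All numbers below are hypothetical.

```python
import math


def years_until_exhaustion(
    current_dataset_size: float,
    annual_growth_factor: float,
    available_stock: float,
) -> float:
    """Years until dataset size reaches the stock, assuming constant
    multiplicative growth and a fixed stock (simplifying assumptions)."""
    if annual_growth_factor <= 1.0:
        raise ValueError("annual_growth_factor must be greater than 1")
    if current_dataset_size >= available_stock:
        return 0.0
    return math.log(available_stock / current_dataset_size) / math.log(
        annual_growth_factor
    )


# Hypothetical values: datasets double yearly and the stock is 16x today's size.
years = years_until_exhaustion(
    current_dataset_size=1e12,
    annual_growth_factor=2.0,
    available_stock=1.6e13,
)
print(f"Stock reached after about {years:.1f} years")
```

Because growth is exponential, even a stock many times larger than current datasets is reached within a few doublings.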
Theoretically, the challenge of limited data availability can be addressed by using synthetic data, which is data generated by AI models themselves. For example, it is possible to use text produced by one LLM to train another LLM. The use of synthetic data for training AI systems is particularly attractive, not only as a solution for potential data depletion but also because generative AI systems could, in principle, generate data in instances where naturally occurring data is sparse, for example, data for rare diseases or underrepresented populations. Until recently, the feasibility and effectiveness of using synthetic data for training generative AI systems were not well understood. However, research this year has suggested that there are limitations associated with training models on synthetic data.
从理论上讲,有限数据可用性的挑战可以通过使用合成数据来解决,这是由 AI 模型自动生成的数据。例如,可以使用一个LLM生成的文本来训练另一个LLM。对于训练 AI 系统来说,使用合成数据尤为吸引人,不仅可以解决潜在的数据枯竭问题,而且生成式 AI 系统原则上可以在自然数据稀缺的情况下生成数据,例如罕见疾病或代表性不足的人群的数据。直到最近,使用合成数据训练生成式 AI 系统的可行性和有效性并不为人所了解。然而,今年的研究表明,使用合成数据训练模型存在一些限制。

Projections of ML data exhaustion by stock type: median and dates
ML 数据枯竭的预测按库存类型:
Source: Epoch, 2023 | Table: 2024 AI Index report
来源:Epoch,2023 | 表:2024 年 AI 指数报告
Stock type Historical projection 历史预测 Compute projection 计算预测
Low-quality language stock 语言库
High-quality language stock 语言库
Image stock
Figure 1.3.8 图 1.3.8
For instance, a team of British and Canadian researchers discovered that models predominantly trained on synthetic data experience model collapse, a phenomenon where, over time, they lose the ability to remember true underlying data distributions and start producing a narrow range of outputs. Figure 1.3.9 demonstrates the process of model collapse in a variational autoencoder (VAE) model, a widely used generative architecture. With each subsequent generation trained on additional synthetic data, the model produces an increasingly limited set of outputs. As illustrated in Figure 1.3.10, in statistical terms, as the number of synthetic generations increases, the tails of the distributions vanish, and the generation density shifts toward the mean. This pattern means that over time, the generations of models trained predominantly on synthetic data become less varied and are not as widely distributed.
例如,一组英国和加拿大研究人员发现,主要在合成数据上训练的模型会经历模型崩溃,这是一种现象,随着时间的推移,它们失去了记住真实基础数据分布的能力,并开始产生一种狭窄范围的输出。图 1.3.9 展示了变分自动编码器(VAE)模型中模型坍缩的过程,这是一种广泛使用的生成架构。随着每一代在额外的合成数据上训练,模型产生的输出集变得越来越有限。正如图 1.3.10 所示,在统计学术语中,随着合成代数的增加,分布的尾部消失,生成密度向均值偏移。 这种模式意味着随着时间的推移,主要在合成数据上训练的模型的生成变得不那么多样化,分布也不那么广泛。
The authors demonstrate that this phenomenon occurs across various model types, including Gaussian Mixture Models and LLMs. This research underscores the continued importance of human-generated data for training capable LLMs that can produce a diverse array of content.
作者证明了这种现象发生在各种模型类型中,包括高斯混合模型和 LLMs。这项研究强调了人工生成数据对训练能够产生多样内容的 LLMs 的持续重要性。
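The statistical intuition behind model collapse can be reproduced in a toy setting. In the sketch below, each "generation" fits a Gaussian to a small sample drawn from the previous generation's fitted Gaussian, so every model is trained only on synthetic data. This is a simplified illustration under stated assumptions (a one-dimensional Gaussian, 10 samples per generation, a fixed random seed) and not a reproduction of the VAE experiment in the cited paper.

```python
import random
import statistics


def simulate_collapse(
    generations: int = 500, samples_per_generation: int = 10, seed: int = 0
) -> list[float]:
    """Repeatedly refit a Gaussian to samples from the previous fit.

    Returns the standard deviation at each generation, starting from
    the original distribution (mean 0, standard deviation 1).
    """
    rng = random.Random(seed)
    mean, std = 0.0, 1.0
    history = [std]
    for _ in range(generations):
        # "Train" the next generation purely on synthetic samples.
        samples = [rng.gauss(mean, std) for _ in range(samples_per_generation)]
        mean = statistics.fmean(samples)
        std = statistics.stdev(samples)
        history.append(std)
    return history


stds = simulate_collapse()
for generation in (0, 5, 10, 20, 100, 500):
    print(f"generation {generation:>3}: std = {stds[generation]:.3g}")
```

Because each refit is based on a finite sample, estimation error compounds across generations and the fitted spread tends to shrink toward zero, mirroring the vanishing tails described above.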

A demonstration of model collapse in a VAE
VAE 中模型坍缩的演示

Source: Shumailov et al., 2023
来源:Shumailov 等,2023 年
(a) Original model (a) 原始模型
(b) Generation 5 (b)第 5 代
(c) Generation 10 (c) 第 10 代
(d) Generation 20 (d) 第 20 代

In a similar study published in 2023 on the use of synthetic data in generative imaging models, researchers found that generative image models trained solely on synthetic data cycles, or with insufficient real human data, experience a significant drop in output quality. The authors label this phenomenon Model Autophagy Disorder (MAD), in reference to mad cow disease.
在一项类似的研究中,研究人员发现,仅在合成数据上训练的生成图像模型,或者在真实人类数据不足的情况下训练的生成图像模型,会出现显著的输出质量下降。作者将这一现象称为模型自噬障碍(MAD),这是指疯牛病。
The study examines two types of training processes: fully synthetic, where models are trained exclusively on synthetic data, and synthetic augmentation, where models are trained on a mix of synthetic and real data. In both scenarios, as the number of training generations increases, the quality of the generated images declines. Figure 1.3.11 highlights the degraded image generations of models that are augmented with synthetic data; for example, the faces generated in steps 7 and 9 increasingly display strange-looking hash marks. From a statistical perspective, images generated with both synthetic data and synthetic augmentation loops have higher FID scores (indicating less similarity to real images), lower precision scores (signifying reduced realism or quality), and lower recall scores (suggesting decreased diversity) (Figure 1.3.12). While synthetic augmentation loops, which incorporate some real data, show less degradation than fully synthetic loops, both methods exhibit diminishing returns with further training.
这项研究探讨了两种训练过程:完全合成,其中模型仅在合成数据上进行训练,以及合成增强,其中模型在合成数据和真实数据的混合上进行训练。在这两种情景中,随着训练代数的增加,生成的图像质量下降。图 1.3.11 突出显示了使用合成数据增强的模型的降级图像生成;例如,在第 7 步和第 9 步生成的人脸逐渐显示出奇怪的哈希标记。从统计角度来看,使用合成数据和合成增强循环生成的图像具有更高的 FID 分数(表示与真实图像的相似性较低),更低的精度分数(表示现实性或质量降低),以及更低的召回分数(表明多样性减少)(图 1.3.12)。虽然合成增强循环,其中包含一些真实数据,显示出比完全合成循环更少的退化,但两种方法在进一步训练中都表现出递减的回报。
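To give a feel for what an FID-style score measures, the sketch below computes the Fréchet distance between two one-dimensional Gaussians fitted to "real" and "generated" values. The actual FID is computed on multidimensional image feature embeddings with full covariance matrices; this univariate version is a simplification chosen for illustration, and the data values are made up.

```python
import statistics


def frechet_distance_1d(real: list[float], generated: list[float]) -> float:
    """Squared Fréchet distance between Gaussians fitted to two 1-D samples:
    (mean difference)^2 + (standard deviation difference)^2."""
    mean_real, mean_gen = statistics.fmean(real), statistics.fmean(generated)
    std_real, std_gen = statistics.pstdev(real), statistics.pstdev(generated)
    return (mean_real - mean_gen) ** 2 + (std_real - std_gen) ** 2


# Made-up values for illustration only.
real_values = [0.0, 1.0, 2.0, 3.0, 4.0]
diverse_generation = [0.0, 1.0, 2.0, 3.0, 4.0]    # matches the real data
collapsed_generation = [2.0, 2.0, 2.0, 2.0, 2.0]  # same mean, no diversity

score_diverse = frechet_distance_1d(real_values, diverse_generation)
score_collapsed = frechet_distance_1d(real_values, collapsed_generation)
print(f"diverse generation:   {score_diverse:.3f}")
print(f"collapsed generation: {score_collapsed:.3f}")
```

A generated set that loses diversity scores worse (higher) even when its average matches the real data, which is consistent with the higher FID scores reported for the synthetic training loops.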
An example of MAD in image-generation models
图像生成模型中 MAD 的示例
Source: Alemohammad et al., 2023
来源:Alemohammad 等人,2023 年
Figure 1.3.11 图 1.3.11
Assessing FFHQ syntheses: FID, precision, and recall in synthetic and mixed-data training loops
Source: Alemohammad et al., 2023 | Chart: 2024 AI Index report
评估 FFHQ 合成:FID,精度和召回在合成和混合数据训练循环中的表现 Source: Alemohammad 等人,2023 | 图表:2024 Al 指数报告

Figure 1.3.12 图 1.3.12

Foundation Models 基础模型

Foundation models represent a rapidly evolving and popular category of AI models. Trained on vast datasets, they are versatile and suitable for numerous downstream applications. Foundation models such as GPT-4, Claude 3, and Llama 2 showcase remarkable abilities and are increasingly being deployed in real-world scenarios.
基础模型代表了一类快速发展且受欢迎的 AI 模型。经过大量数据集的训练,它们具有多功能性,适用于许多下游应用。诸如 GPT-4、Claude 3 和 Llama 2 等基础模型展示了卓越的能力,并越来越多地被部署在现实场景中。
Introduced in 2023, the Ecosystem Graphs is a new community resource from Stanford that tracks the foundation model ecosystem, including datasets, models, and applications. This section uses data from the Ecosystem Graphs to study trends in foundation models over time.
生态系统图是斯坦福的一个新社区资源,于 2023 年推出,用于跟踪基础模型生态系统,包括数据集、模型和应用程序。本节使用生态系统图的数据来研究基础模型随时间的发展趋势。

Model Release 模型发布

Foundation models can be accessed in different ways. No access models, like Google's PaLM-E, are only accessible to their developers. Limited access models, like OpenAI's GPT-4, offer limited access to the models, often through a public API. Open models, like Meta's Llama 2, fully release model weights, which means the models can be modified and freely used.
基础模型可以通过不同的方式访问。没有访问权限的模型,如谷歌的 PaLM-E,只能由其开发人员访问。有限访问模型,如 OpenAI 的 GPT-4,通过公共 API 通常提供对模型的有限访问。开放模型,如 Meta 的 Llama 2,完全释放模型权重,这意味着可以修改并自由使用模型。
Figure 1.3.13 visualizes the total number of foundation models by access type since 2019. In recent years, the number of foundation models has risen sharply, more than doubling since 2022 and growing by a factor of nearly 38 since 2019. Of the 149 foundation models released in 2023, 98 were open, 23 limited, and 28 no access.
图 1.3.13 展示了自 2019 年以来按访问类型分类的基础模型总数。近年来,基础模型的数量急剧增加,自 2022 年以来增长了一倍以上,自 2019 年以来增长了近 38 倍。2023 年发布的 149 个基础模型中,98 个是开放的,23 个是有限的,28 个没有访问权限。
Foundation models by access type, 2019-23
2019 年至 2023 年按访问类型分类的基础模型
Source: Bommasani et al., 2023 | Chart: 2024 AI Index report
来源:Bommasani 等人,2023 年 | 图表:2024 年 Al 指数报告
8 The Ecosystem Graphs make efforts to survey the global AI ecosystem, but it is possible that they underreport models from certain nations like South Korea and China.
8 生态系统图表努力调查全球 AI 生态系统,但可能会低估某些国家(如韩国和中国)的模型。
In 2023, the majority of foundation models were released as open access (65.8%), with 18.8% having no access and 15.4% limited access (Figure 1.3.14). Since 2021, there has been a significant increase in the proportion of models released with open access.
2023 年,大多数基础模型以开放获取的形式发布(65.8%), 没有访问权限, 有限访问权限(图 1.3.14)。自 2021 年以来,发布开放获取模型的比例显著增加。
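The access-type shares follow directly from the 2023 counts given earlier in this section (98 open, 23 limited, and 28 no access, out of 149 models). A minimal sketch of the calculation, with illustrative variable names:

```python
# 2023 foundation model counts by access type, as quoted in the text.
models_by_access = {"open": 98, "limited": 23, "no access": 28}

total_models = sum(models_by_access.values())
shares = {
    access: round(count / total_models * 100, 1)
    for access, count in models_by_access.items()
}

print(f"total: {total_models}")
for access, share in shares.items():
    print(f"{access}: {share}%")
```

The three shares sum to 100% and the open share reproduces the 65.8% cited in the text.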
Foundation models (% of total) by access type, 2019-23
基础模型(总数的百分比)按访问类型分类,2019-23

Organizational Affiliation
组织隶属

Figure 1.3.15 plots the sector from which foundation models have originated since 2019. In 2023, the majority of foundation models ( ) originated from industry. Only of foundation models in 2023 originated from academia. Since 2019, an ever larger number of foundation models are coming from industry.
图 1.3.15 显示自 2019 年以来基础模型的来源部门。到 2023 年,大多数基础模型( )来自工业界。2023 年,只有 的基础模型来自学术界。自 2019 年以来,越来越多的基础模型来自工业界。
Number of foundation models by sector, 2019-23
按部门划分的基础模型数量,2019-23
Figure 1.3.15 图 1.3.15
Figure 1.3.16 highlights the source of various foundation models that were released in 2023. Google introduced the most models (18), followed by Meta (11), and Microsoft (9). The academic institution that released the most foundation models in 2023 was UC Berkeley (3).
图 1.3.16 突出了 2023 年发布的各种基础模型的来源。谷歌推出了最多的模型(18 个),其次是 Meta(11 个)和微软(9 个)。2023 年发布最多基础模型的学术机构是加州大学伯克利分校(3 个)。
Number of foundation models by organization, 2023
按组织划分的基础模型数量,2023 年
Source: Bommasani et al., 2023 | Chart: 2024 AI Index report
来源:Bommasani 等人,2023 年 | 图表:2024 年 Al 指数报告
Since 2019, Google has led in releasing the most foundation models, with a total of 40, followed by OpenAI with 20 (Figure 1.3.17). Tsinghua University stands out as the top non-Western institution, with seven foundation model releases, while Stanford University is the leading American academic institution, with five releases.
自 2019 年以来,谷歌在发布最多基础模型方面处于领先地位,总共发布了 40 个,其次是 OpenAI 发布了 20 个(图 1.3.17)。清华大学是排名最高的非西方机构,发布了七个基础模型,而斯坦福大学是美国领先的学术机构,发布了五个模型。

Number of foundation models by organization, 2019-23 (sum)
组织发布的基础模型数量,2019-23 年(总和)

Source: Bommasani et al., 2023 | Chart: 2024 AI Index report
来源:Bommasani 等人,2023 年 | 图表:2024 年 Al 指数报告

National Affiliation 国家隶属

Given that foundation models are fairly representative of frontier AI research, it is important, from a geopolitical perspective, to understand their national affiliations. Figures 1.3.18, 1.3.19, and 1.3.20 visualize the national affiliations of various foundation models. As with the notable model analysis presented earlier in the chapter, a model is deemed affiliated with a country if a researcher contributing to that model is affiliated with an institution headquartered in that country.
鉴于基础模型相当代表前沿 研究,从地缘政治的角度来看,了解它们的国家隶属关系至关重要。图 1.3.18、1.3.19 和 1.3.20 展示了各种基础模型的国家隶属关系。与本章前面提出的著名模型分析一样,如果参与该模型的研究人员隶属于总部设在该国的机构,则认为该模型与该国有关联。
In 2023, most of the world's foundation models originated from the United States (109), followed by China (20), and the United Kingdom (Figure 1.3.18). Since 2019, the United States has consistently led in originating the majority of foundation models (Figure 1.3.19).
2023 年,世界大部分基础模型源自美国(109),其次是中国(20)和英国(图 1.3.18)。自 2019 年以来,美国一直在发源大多数基础模型方面处于领先地位(图 1.3.19)。

Number of foundation models by geographic area, 2023
2023 年各地区基础模型数量
Source: Bommasani et al., 2023 | Chart: 2024 AI Index report
来源:Bommasani 等人,2023 年 | 图表:2024 年 Al 指数报告
Figure 1.3.18 图 1.3.18
Number of foundation models by select geographic area, 2019-23
2019-23 年各地区基础模型数量
Figure 1.3.20 depicts the cumulative count of foundation models released and attributed to respective countries since 2019. The country with the greatest number of foundation models released since 2019 is the United States (182), followed by China (30), and the United Kingdom (21).
图 1.3.20 显示自 2019 年以来发布并归因于各个国家的基础模型的累积数量。自 2019 年以来发布基础模型数量最多的国家是美国(182),其次是中国(30)和英国(21)。
Number of foundation models by geographic area, 2019-23 (sum)
地理区域基础模型数量,2019-23 年(总和)
Source: Bommasani et al., 2023 | Chart: 2024 AI Index report
来源:Bommasani 等人,2023 年 | 图表:2024 年 AI 指数报告
Figure 1.3.20 图 1.3.20

Training Cost 训练成本

A prominent topic in discussions about foundation models is their speculated costs. While AI companies seldom reveal the expenses involved in training their models, it is widely believed that these costs run into millions of dollars and are rising. For instance, OpenAI's CEO, Sam Altman, mentioned that the training cost for GPT-4 was over million. This escalation in training expenses has effectively excluded universities, traditionally centers of AI research, from developing their own leading-edge foundation models. In response, policy initiatives, such as President Biden's Executive Order on AI, have sought to level the playing field between industry and academia by creating a National AI Research Resource, which would grant nonindustry actors the compute and data needed to do higher-level AI research.
在关于基础模型的讨论中,一个突出的话题是它们被猜测的成本。虽然人工智能公司很少透露训练模型所涉及的费用,但普遍认为这些成本高达数百万美元且不断上升。例如,OpenAI 的 CEO Sam Altman 提到,GPT-4 的训练成本超过了 百万美元。训练费用的上升有效地排除了传统上是 研究中心的大学开发自己的领先基础模型。作为回应,政策倡议,如拜登总统关于人工智能的行政命令,试图通过创建一个国家人工智能研究资源来拉平行业和学术界的竞争格局,该资源将为非产业参与者提供进行更高级别人工智能研究所需的计算和数据。
Understanding the cost of training AI models is important, yet detailed information on these costs remains scarce. The AI Index was among the first to offer estimates on the training costs of foundation models in last year's publication. This year, the AI Index has collaborated with Epoch AI, an AI research institute, to substantially enhance and solidify the robustness of its training cost estimates. To estimate the cost of cutting-edge models, the Epoch team analyzed training duration, as well as the type, quantity, and utilization rate of the training hardware, using information from publications, press releases, or technical reports related to the models.
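One way the inputs listed above (training duration, hardware type, quantity, and utilization rate) could combine into a cost figure is sketched here. This is an illustrative back-of-the-envelope calculation and not Epoch's actual methodology: it assumes hardware is rented at a flat hourly price per chip, and every number below is hypothetical.

```python
def estimate_training_cost(
    total_training_flop: float,
    chip_peak_flop_per_second: float,
    utilization_rate: float,
    num_chips: int,
    price_per_chip_hour_usd: float,
) -> dict[str, float]:
    """Rough training cost under a flat rental-price assumption."""
    if not 0.0 < utilization_rate <= 1.0:
        raise ValueError("utilization_rate must be in (0, 1]")
    # Effective throughput of the whole cluster, after utilization losses.
    effective_flop_per_second = (
        chip_peak_flop_per_second * utilization_rate * num_chips
    )
    training_hours = total_training_flop / effective_flop_per_second / 3600
    chip_hours = training_hours * num_chips
    return {
        "training_hours": training_hours,
        "chip_hours": chip_hours,
        "cost_usd": chip_hours * price_per_chip_hour_usd,
    }


# Hypothetical training run, for illustration only.
estimate = estimate_training_cost(
    total_training_flop=1e24,
    chip_peak_flop_per_second=1e15,
    utilization_rate=0.4,
    num_chips=1_000,
    price_per_chip_hour_usd=2.0,
)
print(f"training time: {estimate['training_hours']:.0f} hours")
print(f"chip-hours:    {estimate['chip_hours']:,.0f}")
print(f"cost:          ${estimate['cost_usd']:,.0f}")
```

In this simplified model, total cost depends on chip-hours, so adding chips shortens the run without changing the bill, while a lower utilization rate raises it proportionally.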