How generative AI works

Image generated by Adobe Firefly

What is generative AI?

Generative Artificial Intelligence, or Generative AI, is a class of computer algorithms able to create digital content – including text, images, video, music and computer code. These algorithms work by deriving patterns from large sets of training data and encoding them into predictive mathematical models, a process commonly referred to as ‘learning’. Generative AI models do not keep a copy of the data they were trained on, but rather generate novel content entirely from the patterns they encode. People can then use interfaces like ChatGPT or Midjourney to input prompts – typically instructions in plain language – to make generative AI models produce new content.

As practical, high-quality generative AI emerges, it can become a helpful tool for everyday work, with potential applications in areas as diverse as art, writing, and software development.

Flowchart showing generative AI as a brain, taking in a prompt and producing outputs.

The core of a generative AI system is a trained deep-learning model that understands and generates text, images, or other media in a human-like fashion based on a given user input, i.e. a prompt. The model is trained on massive amounts of data so that it learns the patterns in that data. For example, it would learn that certain words tend to follow others, or that certain phrases are more common in certain contexts. The model uses the prompt to produce a completion, which is then presented back to the user.
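The idea that a model learns which words tend to follow others, and then uses a prompt to produce a completion, can be illustrated with a toy word-pair counter. This is only a minimal sketch: real generative AI uses deep neural networks trained on far more data, and the corpus and the function names `train` and `complete` below are made up for illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; a real model is trained on vastly more data.
CORPUS = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "the cat sat on the sofa",
]


def train(sentences):
    """Count how often each word is followed by each other word."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows


def complete(follows, prompt, max_new_words=3):
    """Extend the prompt by repeatedly appending the most frequent next word."""
    words = prompt.split()
    if not words:
        return prompt
    for _ in range(max_new_words):
        candidates = follows.get(words[-1])
        if not candidates:
            break  # the last word was never followed by anything in training
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


model = train(CORPUS)
print(complete(model, "the cat"))  # prints "the cat sat on the"
```

Note that the counter stores only word-pair frequencies, not the training sentences themselves, and that a word the corpus never continued leaves the prompt unchanged.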


Prompt and completion in a ChatGPT interface, linked by a model (GPT-3.5, GPT-4).

The video below provides a simple explanation of the mechanism of generative AI.



The quality of the generated output depends on several factors, including the amount and quality of the training data, the prompt's complexity, and the model's size. Larger models usually generate better output but require more computing power and resources. Notable examples of generative AI systems include ChatGPT and Bard, which focus on language generation, and Midjourney and DALL-E, which focus on image generation.

Some everyday applications of generative AI

Predictive text

This technology makes typing on a device easier by suggesting words the user may wish to insert in a text field. In the example below, predictive text suggests "you" as the next word after "Good morning, how are".


Screenshot of a screen keyboard showing autocomplete suggestions
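The suggestion behaviour in the screenshot can be approximated by ranking the words that most often followed the last typed word. The sketch below is an assumption-laden illustration, not how any particular keyboard works: the message history and the names `build_table` and `suggest` are hypothetical, and real keyboards use much richer models and context.

```python
from collections import Counter, defaultdict

# Hypothetical message history standing in for training data.
MESSAGES = [
    "good morning how are you",
    "good morning how are you today",
    "how are things",
    "how are you doing",
    "good morning everyone",
]


def build_table(messages):
    """Map each word to a Counter of the words that followed it."""
    table = defaultdict(Counter)
    for message in messages:
        words = message.lower().split()
        for previous, nxt in zip(words, words[1:]):
            table[previous][nxt] += 1
    return table


def suggest(table, typed_text, k=3):
    """Return up to k suggestions for the next word, most frequent first."""
    words = typed_text.lower().replace(",", " ").split()
    if not words:
        return []
    followers = table.get(words[-1], Counter())
    return [word for word, _ in followers.most_common(k)]


table = build_table(MESSAGES)
print(suggest(table, "Good morning, how are"))  # prints ['you', 'things']
```

Here "you" ranks first simply because it followed "are" more often than any other word in the sample history.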


Image style transfer

This technology generates a new image by combining the content of one image with the style of another. The example below is an image generated with Bing Image Creator that combines the content of the painting "Mona Lisa" with the style of "Starry Night".

Mona Lisa in the style of Starry Night


Copyright © The University of Sydney. Unless otherwise indicated, 3rd party material has been reproduced and communicated to you by or on behalf of the University of Sydney in accordance with section 113P of the Copyright Act 1968 (Act). The material in this communication may be subject to copyright under the Act. Any further reproduction or communication of this material by you may be the subject of copyright protection under the Act. Do not remove this notice.

