From Transformer to LLM: Architecture, Training and Usage | Binxu Wang

Transformer Tutorial Series

Attention

In this session, we walked through the architecture, training, and applications of transformers (slides). The lecture slides covered:

The basic principles of NLP
Basics of the attention mechanism and the transformer
Training language models (the language modelling objective)
Usage of pretrained models (finetuning vs. prompting)
Applications of transformers beyond language (vision, audio, music, image generation, games and control)

Jupyter Notebook Tutorial Series

We prepared this series of Jupyter notebooks to give you hands-on experience with transformers, from their architecture to their training and usage.

Fundamentals of Transformer and Language Modelling
- Understanding Attention & Transformer from Scratch
  In this tutorial, you will manually implement the attention mechanism and a GPT model from scratch to gain a deeper understanding of their structure.
- Language modelling and pretrained transformers
  In this notebook, you will look into the architectures of pretrained transformers (GPT / BERT), then train a GPT-2 model to "speak" a simplified English constructed with a context-free generative grammar, and observe how it learns syntactic rules and word meanings.
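
As a rough illustration of what "implementing attention from scratch" involves, here is a minimal sketch of single-head scaled dot-product attention with a causal mask, written in plain Python. It is not the notebook's code (the notebook's implementation is not reproduced on this page); the function names `softmax` and `causal_attention` and the toy input `x` are assumptions made for this example, and learned query/key/value projections are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(queries, keys, values):
    """Single-head scaled dot-product attention with a causal mask.

    queries, keys, values: lists of T vectors (lists of floats).
    Position t may only attend to positions 0..t, as in a GPT-style decoder.
    Returns (outputs, weights), where weights[t][s] is how much
    position t attends to position s.
    """
    d = len(keys[0])
    n_positions = len(queries)
    outputs, all_weights = [], []
    for t, q in enumerate(queries):
        # Scores against visible positions only: this is the causal mask.
        scores = [
            sum(qi * ki for qi, ki in zip(q, keys[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of visible value vectors.
        out = [
            sum(w * values[s][j] for s, w in enumerate(weights))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        # Pad masked (future) positions with zero weight for readability.
        all_weights.append(weights + [0.0] * (n_positions - t - 1))
    return outputs, all_weights


# Toy input (hypothetical): three 2-d token vectors used as Q, K and V.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = causal_attention(x, x, x)
for t, row in enumerate(weights):
    print(t, [round(w, 3) for w in row])
```

Each row of `weights` sums to one and is zero for future positions, which is the property that lets a GPT-style model be trained on next-token prediction without seeing the answer.
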
Beyond Language
In the following notebooks, we demonstrate the flexibility of the transformer model on tasks beyond language:
- Learning to do arithmetic by sequence modelling.
  In this notebook, you will train a GPT-2 on an arithmetic dataset and let it learn to do arithmetic (partially) by next-token prediction.
- Image generation by sequence modelling.
  In this notebook, you will train a GPT-2-like transformer for generative modelling of MNIST images by predicting the sequence of patches in an image.
- Audio signal classification (~ 20 min)
  In this notebook, you will train a transformer on the Spoken MNIST dataset to classify audio sequences.
- Image classification (~ 30 min)
  In this notebook, you will train a transformer on images, formatted as sequences of patches, and predict the identity of each image.
- Music generation by sequence modelling. (Difficult, training takes hours)
  In this notebook, you will train a transformer to predict the next note in a music dataset consisting of piano rolls. The trained model can then be used to generate classical piano music.
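
The notebooks above share one idea: turn the data into a token sequence and train on next-token prediction. The sketch below shows that framing for the arithmetic task only. The text format (`"a+b=c;"`), the character-level vocabulary `VOCAB`, and the helper names are assumptions made for illustration; the actual dataset format used in the notebook is not given on this page.

```python
import random

# Hypothetical character-level vocabulary for strings such as "12+34=46;".
VOCAB = "0123456789+=;"
stoi = {ch: i for i, ch in enumerate(VOCAB)}
itos = {i: ch for ch, i in stoi.items()}


def make_example(rng, max_operand=99):
    """Generate one addition problem with its answer as a single string."""
    a = rng.randint(0, max_operand)
    b = rng.randint(0, max_operand)
    return f"{a}+{b}={a + b};"


def encode(text):
    """Map each character to its integer token id."""
    return [stoi[ch] for ch in text]


def decode(ids):
    """Map token ids back to a string."""
    return "".join(itos[i] for i in ids)


def next_token_pairs(ids):
    """Inputs are all tokens but the last; targets are the same
    sequence shifted left by one, so targets[i] follows inputs[i]."""
    return ids[:-1], ids[1:]


rng = random.Random(0)
text = make_example(rng)
ids = encode(text)
inputs, targets = next_token_pairs(ids)
print(text)
print(inputs)
print(targets)
```

A model trained on such pairs is asked, at every position, to predict the next character; producing the digits after `=` correctly is what "learning arithmetic by next-token prediction" amounts to in this setup. Image patches, audio frames, or piano-roll notes can take the place of characters in the same scheme.
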
Using Large Language Models
Finally, we will get a glimpse of LLMs by using the OpenAI API to build something useful.
- OpenAI API and Chat with PDF:
  In this notebook, you will use the OpenAI API and langchain to build a bot that can chat with a given document, e.g. a scientific paper (replicating the functionality of Chat with PDF).
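
A "chat with a document" bot is commonly built as retrieval followed by prompting: split the document into chunks, find the chunks most relevant to the question, and pass them to the LLM as context. The sketch below shows only that retrieval and prompt-building logic in plain Python. It deliberately makes no OpenAI or langchain calls: a simple word-overlap cosine score stands in for learned embeddings, and all function names, the toy `document`, and the prompt wording are assumptions for illustration rather than the notebook's code.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_words=40, overlap=10):
    """Split text into overlapping word windows (overlap < chunk_words)."""
    words = text.split()
    step = chunk_words - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_words]) for i in range(0, stop, step)]


def bag_of_words(text):
    """Lowercased word counts; a crude stand-in for an embedding vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, top_k=2):
    """Return the top_k chunks most similar to the question."""
    q = bag_of_words(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, bag_of_words(c)),
                    reverse=True)
    return ranked[:top_k]


def build_prompt(question, context_chunks):
    """Assemble the text that would be sent to a language model."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


# Toy document (hypothetical) standing in for the text extracted from a PDF.
document = (
    "Transformers process a sequence with self-attention layers. "
    "Each attention layer mixes information across positions. "
    "The model in this toy document was trained on piano music. "
    "Training used next note prediction on piano rolls."
)
chunks = chunk_text(document, chunk_words=12, overlap=4)
question = "What music was the model trained on?"
top = retrieve(question, chunks, top_k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In a real pipeline, the word-count vectors would be replaced by embeddings from an embedding model, and `prompt` would be sent to a chat model; the control flow stays the same.
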
Official GitHub repo

Related material

Class:

Machine Learning from Scratch

Attach Files:

mlfs_tutorial_nlp_transformer_ssl_updated.pdf

Binxu Wang

Kempner Research Fellow, Ph.D. in Neuroscience
