ML4DS Lab: week 1: Simple linear regression lab
Aims
- Do a simple linear regression on the Olympic 100 m data in Python
- Practise numpy, matplotlib, and sklearn
Tasks
- Download the data ('olympic100m.txt') from the Moodle page
- Plot Olympic year against winning time
- Plot the loss function in 1D and 2D
- Fit a model with sklearn functions
- Fit a model using the provided expressions to compute $w_0$ and $w_1$
- Create a new plot that includes the data and the function defined by $w_0$ and $w_1$
- Make a prediction at 2012
Task 1: Download and import the Olympic data
array([[1896.  , 12.  ],
       [1900.  , 11.  ],
       [1904.  , 11.  ],
       [1906.  , 11.2 ],
       [1908.  , 10.8 ],
       [1912.  , 10.8 ],
       [1920.  , 10.8 ],
       [1924.  , 10.6 ],
       [1928.  , 10.8 ],
       [1932.  , 10.3 ],
       [1936.  , 10.3 ],
       [1948.  , 10.3 ],
       [1952.  , 10.4 ],
       [1956.  , 10.5 ],
       [1960.  , 10.2 ],
       [1964.  , 10.  ],
       [1968.  ,  9.95],
       [1972.  , 10.14],
       [1976.  , 10.06],
       [1980.  , 10.25],
       [1984.  ,  9.99],
       [1988.  ,  9.92],
       [1992.  ,  9.96],
       [1996.  ,  9.84],
       [2000.  ,  9.87],
       [2004.  ,  9.85],
       [2008.  ,  9.69]])
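The cell that produced this array is not shown. A minimal sketch of one way to load such data with `np.loadtxt` follows. The comma delimiter and the in-memory `sample` stand-in are assumptions, since the layout of `olympic100m.txt` is not visible here; only the first four rows of the array above are reproduced.

```python
import io

import numpy as np

# Hypothetical stand-in for the first lines of 'olympic100m.txt'.
# The comma delimiter is an assumption; adjust it to match the real file.
sample = io.StringIO("1896,12.0\n1900,11.0\n1904,11.0\n1906,11.2\n")

# With the real file, pass its path instead of `sample`, e.g.
# np.loadtxt("olympic100m.txt", delimiter=",")
data = np.loadtxt(sample, delimiter=",")

x = data[:, 0]  # Olympic year
t = data[:, 1]  # winning time in seconds
print(data.shape, x.shape)
```

Splitting the two columns into `x` and `t` up front keeps the later plotting and fitting steps short.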
Task 2: Plot the data
(27,)
Text(0, 0.5, 'Time (seconds)')
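The plotting cell itself is not shown. The sketch below is one way to produce a scatter of winning time against Olympic year with matplotlib. The data values are copied from the array printed in Task 1. The x-axis label and marker style are assumptions; only the y-axis label appears in the output above.

```python
import matplotlib.pyplot as plt
import numpy as np

# Values copied from the array printed in Task 1.
x = np.array([1896, 1900, 1904, 1906, 1908, 1912, 1920, 1924, 1928,
              1932, 1936, 1948, 1952, 1956, 1960, 1964, 1968, 1972,
              1976, 1980, 1984, 1988, 1992, 1996, 2000, 2004, 2008],
             dtype=float)
t = np.array([12.0, 11.0, 11.0, 11.2, 10.8, 10.8, 10.8, 10.6, 10.8,
              10.3, 10.3, 10.3, 10.4, 10.5, 10.2, 10.0, 9.95, 10.14,
              10.06, 10.25, 9.99, 9.92, 9.96, 9.84, 9.87, 9.85, 9.69])
print(x.shape)  # (27,)

fig, ax = plt.subplots()
ax.plot(x, t, "o")              # one marker per Olympic Games
ax.set_xlabel("Olympic year")   # assumed label
ax.set_ylabel("Time (seconds)")
```

In a notebook the figure is displayed automatically at the end of the cell; in a script, add `plt.show()`.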
Task 3: Fit a straight line using `LinearRegression` in `sklearn`. Plot the model with the data and the prediction at 2012.
(27, 1) (27, 1) [array([36.4164559]), array([[-0.01333089]])]
Text(0, 0.5, 'Time (seconds)')
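The fitting cell is not shown. A minimal sketch that is consistent with the printed shapes and parameters is given below. It assumes the inputs and targets were reshaped into `(27, 1)` column vectors before calling `fit`; the names `X`, `T`, and `pred_2012` are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Values copied from the array printed in Task 1.
x = np.array([1896, 1900, 1904, 1906, 1908, 1912, 1920, 1924, 1928,
              1932, 1936, 1948, 1952, 1956, 1960, 1964, 1968, 1972,
              1976, 1980, 1984, 1988, 1992, 1996, 2000, 2004, 2008],
             dtype=float)
t = np.array([12.0, 11.0, 11.0, 11.2, 10.8, 10.8, 10.8, 10.6, 10.8,
              10.3, 10.3, 10.3, 10.4, 10.5, 10.2, 10.0, 9.95, 10.14,
              10.06, 10.25, 9.99, 9.92, 9.96, 9.84, 9.87, 9.85, 9.69])

# sklearn expects a 2D feature matrix of shape (n_samples, n_features).
X = x.reshape(-1, 1)
T = t.reshape(-1, 1)
print(X.shape, T.shape)

model = LinearRegression().fit(X, T)
print([model.intercept_, model.coef_])  # intercept w0 and slope w1

# Predicted winning time for the 2012 Games.
pred_2012 = model.predict(np.array([[2012.0]]))
print(pred_2012)
```

Because the target is passed as a column vector, `intercept_` is a length-1 array and `coef_` is a 1×1 array, which matches the nesting seen in the output above.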
Task 5: Fit the model with the least squares solution. Plot the model with the data and the prediction at 2012.
Let's fit a model with an analytical solution to the problem of finding the parameters with the minimum average loss.
Recall that the average loss is

$$L(w_0, w_1) = \frac{1}{N} \sum_{n=1}^{N} (t_n - w_0 - w_1 x_n)^2$$
The procedure for finding the analytical expression of the optimal parameters is as follows:
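- Solving $\frac{\partial L}{\partial w_0} = 0$ and $\frac{\partial L}{\partial w_1} = 0$ gives the parameters at which the average loss is minimised:

$$w_1 = \frac{\overline{xt} - \bar{x}\,\bar{t}}{\overline{x^2} - \bar{x}^2} \quad \text{and} \quad w_0 = \bar{t} - w_1 \bar{x}, \quad \text{where } \bar{z} = \frac{1}{N} \sum_{n=1}^{N} z_n.$$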
You are encouraged to derive these yourself.
1952.3703703703704 10.389629629629631 3812975.5555555555 20268.06814814815
36.41645590250286 -0.013330885710960602
Text(0, 0.5, 'Time (seconds)')
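The cell behind these numbers is not shown. The sketch below uses the standard closed-form least squares solution for a straight line, $w_1 = (\overline{xt} - \bar{x}\bar{t}) / (\overline{x^2} - \bar{x}^2)$ and $w_0 = \bar{t} - w_1 \bar{x}$, on the assumption that these are the expressions the lab provides. The four printed averages correspond to $\bar{x}$, $\bar{t}$, $\overline{x^2}$, and $\overline{xt}$; the variable names are illustrative.

```python
import numpy as np

# Values copied from the array printed in Task 1.
x = np.array([1896, 1900, 1904, 1906, 1908, 1912, 1920, 1924, 1928,
              1932, 1936, 1948, 1952, 1956, 1960, 1964, 1968, 1972,
              1976, 1980, 1984, 1988, 1992, 1996, 2000, 2004, 2008],
             dtype=float)
t = np.array([12.0, 11.0, 11.0, 11.2, 10.8, 10.8, 10.8, 10.6, 10.8,
              10.3, 10.3, 10.3, 10.4, 10.5, 10.2, 10.0, 9.95, 10.14,
              10.06, 10.25, 9.99, 9.92, 9.96, 9.84, 9.87, 9.85, 9.69])

# Sample averages that appear in the closed-form expressions.
x_bar = x.mean()         # mean of x
t_bar = t.mean()         # mean of t
xx_bar = (x * x).mean()  # mean of x squared
xt_bar = (x * t).mean()  # mean of x times t
print(x_bar, t_bar, xx_bar, xt_bar)

# Closed-form least squares solution for the slope and intercept.
w1 = (xt_bar - x_bar * t_bar) / (xx_bar - x_bar ** 2)
w0 = t_bar - w1 * x_bar
print(w0, w1)

# Predicted winning time for the 2012 Games.
pred_2012 = w0 + w1 * 2012
print(pred_2012)
```

The resulting `w0` and `w1` should agree with the `LinearRegression` fit from Task 3 up to floating-point error, which is a useful check on both approaches.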