This baseline notebook is designed to offer a starting point for the competitors. Please note that the approach we've taken is not *THE* solution; it's simply ONE possible approach. Our aim is to assist participants in exploring different ways to preprocess and model the data. Please feel free to fork the notebook and save the model/data for your own exploration.
This notebook was prepared by Virginie Batista and Angèle Syty from the Institut d'Astrophysique de Paris, and Orphée Faucoz from Centre National d’Etudes Spatiales (CNES), with support from Gordon Yip and Tara Tahseen from University College London.
# READ THIS BEFORE YOU PROCEED
This training procedure uses the light dataset produced from this [notebook (Version 5)](https://www.kaggle.com/code/gordonyip/update-calibrating-and-binning-astronomical-data). We applied all the calibration steps EXCEPT Linearity Correction with Chunksize = 1. The binned dataset is available to download [here](https://www.kaggle.com/datasets/gordonyip/binned-dataset-v3/data). *If you want to carry out all the corrections, you will have to do so yourself.*
**This notebook will only provide the model checkpoints; you are welcome to use these checkpoints with your own script and submit to the leaderboard.**
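If you just want to reuse the saved checkpoints for inference in your own script, here is a minimal sketch. It assumes the two files produced by this notebook (`output/model_1dcnn.keras` and `output/model_2dcnn.keras`) and a test set preprocessed exactly like the validation data below; the file names `test_wc.npy` and `test_obs_norm.npy` are placeholders for your own preprocessing outputs.
import numpy as np
from tensorflow.keras.models import load_model
model_wc = load_model('output/model_1dcnn.keras')  # 1D CNN: mean transit depth from the white curve
model_2d = load_model('output/model_2dcnn.keras')  # 2D CNN: wavelength-dependent fluctuations
test_wc = np.load('test_wc.npy')              # placeholder: (N_planets, 187) normalized white curves
test_obs_norm = np.load('test_obs_norm.npy')  # placeholder: (N_planets, 40, 283) normalized in-transit images
mean_depth_norm = model_wc.predict(test_wc).flatten()  # still in the normalized target space
fluctuations_norm = model_2d.predict(test_obs_norm)    # still in the normalized target space
# Undo the target normalizations (see unstandardizing and targets_norm_back below)
# and add the two parts together, as done for the validation set at the end of this notebook.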
The challenge's primary objective is to process these exposures to produce a single, clean spectrum for each exoplanet, summarizing the rp/rs values across all wavelengths.
The exposures are subject to noise, and the images and spectra are not perfect. The jitter noise has a complex signature that the ML model should learn to recognize in order to produce better spectra.
Different techniques are possible; it is up to the participants' imagination to produce a novel (and hopefully better) solution to this task.
Here is an outline of our baseline approach:
We first fit a 1D CNN to predict the mean value of the transmission spectrum, taking as input the transit white curve (the total flux of each image as a function of time).
For the second part of the baseline, to retrieve the atmospheric features, we make the data lighter by summing the fluxes along the y-axis for each wavelength, resulting in 2D images of dimension (N_times, N_wavelengths). We also cut the signal to remove the out-of-transit portion, in order to enhance the transit-depth variations between wavelengths. For the same reason, we subtract the mean flux, corresponding to the average transit depth, so that only the wavelength variations around this mean remain. We use a 2D CNN to fit the atmospheric features. A sketch of these shape transformations is given below.
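As a quick illustration of the shapes involved (this is not the pipeline code itself, which follows below), assuming one calibrated cube of shape (N_times, N_y, N_wavelengths); the spatial size used here (32) is only a placeholder:
import numpy as np
cube = np.random.rand(187, 32, 283)          # dummy stand-in for one planet's calibrated cube
white_curve = cube.sum(axis=(1, 2))          # (187,)     -> input of the 1D CNN (after normalization)
image_2d = cube.sum(axis=1)                  # (187, 283) -> time x wavelength image for the 2D CNN
image_in = image_2d[75:115]                  # keep the in-transit window used later in this notebook
image_centered = image_in - image_in.mean()  # subtract the mean (average transit depth)
print(white_curve.shape, image_centered.shape)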
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
img = mpimg.imread('/kaggle/input/baseline-img/2nd_baseline.png')
plt.figure(figsize=(10, 15))
plt.imshow(img)
plt.axis('off')
plt.show()
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.models import load_model
import tensorflow as tf
import random
import os
from tensorflow.keras.losses import MeanAbsoluteError
from matplotlib.ticker import ScalarFormatter
import pandas as pd
data_folder = '/kaggle/input/binned-dataset-v3/' # path to the folder containing the data
auxiliary_folder = '/kaggle/input/ariel-data-challenge-2024/' # path to the folder containing the train targets and wavelengths information
data_train = np.load(f'{data_folder}/data_train.npy')
data_train_FGS = np.load(f'{data_folder}/data_train_FGS.npy')
We create a directory to save the outputs of this notebook, and define the hyperparameters of the model
output_dir = './output'
SEED = 42
do_the_mcdropout_wc = True
do_the_mcdropout = True
if not os.path.exists(output_dir):
os.makedirs(output_dir)
print(f"Directory {output_dir} created.")
else:
print(f"Directory {output_dir} already exists.")
train_solution = np.loadtxt(f'{auxiliary_folder}/train_labels.csv', delimiter = ',', skiprows = 1)
targets = train_solution[:,1:]
targets_mean = targets[:,1:].mean(axis = 1) # used for the 1D-CNN to extract the mean value, only AIRS wavelengths as the FGS point is not used in the white curve
N = targets.shape[0]
We create the dataset by appending the FGS frame, collapsed into a single column, at the end of the AIRS data cube.
The images are normalized using the star spectrum extracted from the images themselves.
signal_AIRS_diff_transposed_binned, signal_FGS_diff_transposed_binned = data_train, data_train_FGS
FGS_column = signal_FGS_diff_transposed_binned.sum(axis = 2)
dataset = np.concatenate([signal_AIRS_diff_transposed_binned, FGS_column[:,:, np.newaxis,:]], axis = 2)
We sum the pixels along the y-axis to transform the data into 2D images.
dataset = dataset.sum(axis=3)
We divide the images by the star flux, assuming that the first and last 50 instants belong to the out-of-transit phase.
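In formula form, the next cell estimates the out-of-transit star signal per wavelength as the sum of the first-50 and last-50 frame means, and divides the whole time series by it (so the normalized out-of-transit level sits near 0.5 rather than 1):

$$s(\lambda) = \frac{1}{50}\sum_{t=1}^{50} F(t,\lambda) + \frac{1}{50}\sum_{t=N_t-49}^{N_t} F(t,\lambda), \qquad \tilde{F}(t,\lambda) = \frac{F(t,\lambda)}{s(\lambda)}$$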
def create_dataset_norm(dataset1, dataset2) :
dataset_norm1 = np.zeros(dataset1.shape)
dataset_norm2 = np.zeros(dataset1.shape)
dataset_min = dataset1.min()
dataset_max = dataset1.max()
dataset_norm1 = (dataset1 - dataset_min) / (dataset_max - dataset_min)
dataset_norm2 = (dataset2 - dataset_min) / (dataset_max - dataset_min)
return dataset_norm1, dataset_norm2
def norm_star_spectrum (signal) :
img_star = signal[:,:50].mean(axis = 1) + signal[:,-50:].mean(axis = 1)
return signal/img_star[:,np.newaxis,:]
dataset_norm = norm_star_spectrum(dataset)
dataset_norm = np.transpose(dataset_norm,(0,2,1))
## Split the targets and observations between valid and train
We start by computing a "white curve", which is the sum of the signal over the whole image as a function of time. We then split the data and normalize the train/valid sets.
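Concretely, the white curve computed below is the wavelength-summed AIRS flux of each (y-collapsed) frame, normalized by the planet's mean flux over time and wavelength:

$$W(t) = \frac{\sum_{\lambda} F(t,\lambda)}{\frac{1}{N_t N_\lambda}\sum_{t',\lambda} F(t',\lambda)}$$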
cut_inf, cut_sup = 39, 321 # we have previously cut the data along the wavelengths to remove the edges, this is to match with the targets range in the make data file
l = cut_sup - cut_inf + 1
wls = np.arange(l)
def split(data, N):
    # randomly pick N planets for the training split
    list_planets = random.sample(range(0, data.shape[0]), N)
    list_index_1 = np.zeros(data.shape[0], dtype=bool)
    for planet in list_planets:
        list_index_1[planet] = True
    data_1 = data[list_index_1]
    data_2 = data[~list_index_1]
    return data_1, data_2, list_index_1
random.seed(SEED)  # make the train/valid split reproducible
N_train = 8*N//10
# Validation and train data split
train_obs, valid_obs, list_index_train = split(dataset_norm, N_train)
train_targets, valid_targets = targets[list_index_train], targets[~list_index_train]
signal_AIRS_diff_transposed_binned = signal_AIRS_diff_transposed_binned.sum(axis=3)
wc_mean = signal_AIRS_diff_transposed_binned.mean(axis=1).mean(axis=1)
white_curve = signal_AIRS_diff_transposed_binned.sum(axis=2)/ wc_mean[:, np.newaxis]
def normalise_wlc(train, valid) :
wlc_train_min = train.min()
wlc_train_max = train.max()
train_norm = (train - wlc_train_min) / (wlc_train_max - wlc_train_min)
valid_norm = (valid - wlc_train_min) / (wlc_train_max - wlc_train_min)
return train_norm, valid_norm
def normalize (train, valid) :
max_train = train.max()
min_train = train.min()
train_norm = (train - min_train) / (max_train - min_train)
valid_norm = (valid - min_train) / (max_train - min_train)
return train_norm, valid_norm, min_train, max_train
# Split the light curves and targets
train_wc, valid_wc = white_curve[list_index_train], white_curve[~list_index_train]
train_targets_wc, valid_targets_wc = targets_mean[list_index_train], targets_mean[~list_index_train]
# Normalize the wlc
train_wc, valid_wc = normalise_wlc(train_wc, valid_wc)
# Normalize the targets
train_targets_wc_norm, valid_targets_wc_norm, min_train_valid_wc, max_train_valid_wc = normalize(train_targets_wc, valid_targets_wc)
plt.figure()
for i in range (200) :
plt.plot(train_wc[-i], '-', alpha = 0.5)
plt.title('Light-curves from the train set')
plt.xlabel('Time')
plt.ylabel('Normalized flux')
plt.show()
## Train 1D CNN
The model to estimate the mean of the target spectrum using the white light-curve is a 1D-CNN with Dropout layers to make an MC Dropout prediction.
from keras.layers import Input, Conv1D, MaxPooling1D, Flatten, Dense, Dropout, BatchNormalization, Concatenate,AveragePooling1D
from keras.models import Model, load_model
from tensorflow.keras.optimizers import Adam, SGD
from tensorflow.keras.callbacks import LearningRateScheduler, ModelCheckpoint
input_wc = Input((187,1))
x = Conv1D(32, 3, activation='relu')(input_wc)
x = MaxPooling1D()(x)
x = BatchNormalization() (x)
x = Conv1D(64, 3, activation='relu')(x)
x = MaxPooling1D()(x)
x = Conv1D(128, 3, activation='relu')(x)
x = MaxPooling1D()(x)
x = Conv1D(256, 3, activation='relu')(x)
x = MaxPooling1D()(x)
x = Flatten()(x)
x = Dense(500, activation='relu')(x)
x = Dropout(0.2)(x, training = True)  # training=True keeps dropout active at inference, enabling MC Dropout
x = Dense(100, activation='relu')(x)
x = Dropout(0.1)(x, training = True)
output_wc = Dense(1, activation='linear')(x)
model_wc = Model(inputs=input_wc, outputs=output_wc)
model_wc.summary()
def scheduler(epoch, lr):
decay_rate = 0.2
decay_step = 200
if epoch % decay_step == 0 and epoch:
return lr * decay_rate
return lr
optimizer = SGD(0.001)
model_wc.compile(optimizer=optimizer, loss='mse', metrics=[MeanAbsoluteError()])
callback = LearningRateScheduler(scheduler)
checkpoint_filepath = 'output/model_1dcnn.keras'
model_ckt = ModelCheckpoint(
checkpoint_filepath,
monitor="val_loss",
verbose=0,
save_best_only=True,
save_weights_only=False,
mode="min",
save_freq="epoch",
)
print('Running ...')
history = model_wc.fit(
x = train_wc,
y = train_targets_wc_norm,
validation_data = (valid_wc, valid_targets_wc_norm),
batch_size=16,
epochs= 1200,
shuffle=True,
verbose=0,
callbacks=[model_ckt]  # note: the LearningRateScheduler defined above is not passed here
)
print('Done.')
model_wc = load_model(checkpoint_filepath)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.legend()
plt.show()
Then, we perform MC Dropout to obtain the mean prediction and its associated uncertainty. We choose to compute 1000 instances.
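With the Dropout layers kept active at prediction time, the point estimate and its uncertainty are simply the sample mean and standard deviation over the $T = 1000$ stochastic forward passes, where $f_{\hat{\theta}_i}$ denotes the network with the $i$-th random dropout mask:

$$\hat{\mu}(x) = \frac{1}{T}\sum_{i=1}^{T} f_{\hat{\theta}_i}(x), \qquad \hat{\sigma}(x) = \sqrt{\frac{1}{T}\sum_{i=1}^{T}\left(f_{\hat{\theta}_i}(x) - \hat{\mu}(x)\right)^2}$$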
nb_dropout_wc = 1000
def unstandardizing (data, min_train_valid, max_train_valid) :
return data * (max_train_valid - min_train_valid) + min_train_valid
def MC_dropout_WC (model, data, nb_dropout) :
predictions = np.zeros((nb_dropout, data.shape[0]))
for i in range(nb_dropout) :
predictions[i,:] = model.predict(data, verbose = 0).flatten()
return predictions
if do_the_mcdropout_wc :
print('Running ...')
prediction_valid_wc = MC_dropout_WC(model_wc, valid_wc, nb_dropout_wc)
spectre_valid_wc_all = unstandardizing(prediction_valid_wc, min_train_valid_wc, max_train_valid_wc)
spectre_valid_wc, spectre_valid_std_wc = spectre_valid_wc_all.mean(axis = 0), spectre_valid_wc_all.std(axis = 0)
print('Done.')
else :
spectre_valid_wc = model_wc.predict(valid_wc).flatten()
spectre_valid_wc = unstandardizing(spectre_valid_wc, min_train_valid_wc, max_train_valid_wc)
spectre_valid_std_wc = 0.1*np.abs(spectre_valid_wc)
residuals = spectre_valid_wc - valid_targets_wc
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 6), sharex=True,
gridspec_kw={'height_ratios': [3, 1]})
ax1.errorbar(x = np.arange(len(spectre_valid_wc)), y = spectre_valid_wc, yerr =spectre_valid_std_wc, fmt = '.', color = 'k', ecolor = 'gray', label='Prediction', alpha=0.8)
ax1.fill_between(np.arange(len(spectre_valid_wc)), spectre_valid_wc - spectre_valid_std_wc, spectre_valid_wc + spectre_valid_std_wc, color = 'grey', alpha = 0.5)
ax1.vlines(np.arange(len(spectre_valid_wc)),ymin=0, ymax=spectre_valid_wc, colors='r', linestyle='dashed',alpha = 0.1)
ax1.plot(valid_targets_wc, 'r.', label='Target', alpha=0.8)
ax1.set_xlabel('Concatenated targets')
ax1.set_ylabel('$(R_p/R_s)^2$')
ax1.set_title('Prediction vs target, mean value of the spectrum, on validation dataset')
ax1.legend()
ax2.plot(residuals, 'b.', label='Residuals', alpha=0.8)
ax2.set_xlabel('Concatenated targets')
ax2.set_ylabel('Residuals')
ax2.axhline(0, color='black', linestyle='--', linewidth=1)
ax2.legend()
plt.tight_layout()
plt.show()
residuals = valid_targets_wc - spectre_valid_wc
print('RMSE : ', np.sqrt((residuals**2).mean())*1e6, 'ppm')
# np.save(f'{output_dir}/pred_valid_wc.npy', spectre_valid_wc)
# np.save(f'{output_dir}/targ_valid_wc.npy', valid_targets_wc)
# np.save(f'{output_dir}/std_valid_wc.npy', spectre_valid_std_wc)
# 2D CNN for atmospheric features
<a id="fluctu"></a>
We now remove the mean value (transit depth) of the spectra to keep the atmospheric features only
def suppress_mean(targets, mean) :
res = targets - np.repeat(mean.reshape((mean.shape[0], 1)), repeats = targets.shape[1], axis = 1)
return res
train_targets, valid_targets = targets[list_index_train], targets[~list_index_train]
train_targets_shift = suppress_mean(train_targets, targets_mean[list_index_train])
valid_targets_shift = suppress_mean(valid_targets, targets_mean[~list_index_train])
We normalize the targets so that they range between -1 and 1, centered on zero
##### normalization of the targets ###
def targets_normalization (data1, data2) :
    data_min = data1.min()
    data_max = data1.max()
    data_abs_max = np.max(np.abs([data_min, data_max]))  # largest absolute value, so the result lies in [-1, 1]
    data1 = data1/data_abs_max
    data2 = data2/data_abs_max
    return data1, data2, data_abs_max
def targets_norm_back (data, data_abs_max) :
return data * data_abs_max
train_targets_norm, valid_targets_norm, targets_abs_max = targets_normalization(train_targets_shift, valid_targets_shift)
plt.figure(figsize=(15,5))
for i in range (240) :
plt.plot(wls, train_targets_norm[i], 'g-', alpha = 0.5)
plt.plot([], [], 'g-', alpha=0.5, label='Train targets')
for i in range (60) :
plt.plot(wls, valid_targets_norm[i], 'r-', alpha = 0.7)
plt.plot([], [], 'r-', alpha=0.5, label='Valid targets (true mean)')
plt.legend()
plt.ylabel(f'$(R_p/R_s)^2$')
plt.title('All targets after subtracting the mean value and normalization')
plt.show()
###### Transpose #####
train_obs = train_obs.transpose(0, 2, 1)
valid_obs = valid_obs.transpose(0, 2, 1)
print(train_obs.shape)
We cut the light curve to keep only the in-transit part. We assume an arbitrary transit duration of 40 instants, with the transit occurring between instants 75 and 115.
##### Keep only the in-transit signal #####
def suppress_out_transit (data, ingress, egress) :
data_in = data[:, ingress:egress,:]
return data_in
ingress, egress = 75,115
train_obs_in = suppress_out_transit(train_obs, ingress, egress)
valid_obs_in = suppress_out_transit(valid_obs, ingress, egress)
We remove the mean value of the in-transit signal to obtain relative data, like the targets.
###### Subtract the mean #####
def substract_data_mean(data):
data_mean = np.zeros(data.shape)
for i in range(data.shape[0]):
data_mean[i] = data[i] - data[i].mean()
return data_mean
train_obs_2d_mean = substract_data_mean(train_obs_in)
valid_obs_2d_mean = substract_data_mean(valid_obs_in)
We use the same normalization as for the targets, i.e. between -1 and 1 centered on zero
##### Normalization dataset #####
def data_norm(data1, data2):
    data_min = data1.min()
    data_max = data1.max()
    data_abs_max = np.max(np.abs([data_min, data_max]))  # largest absolute value, so the result lies in [-1, 1]
    data1 = data1/data_abs_max
    data2 = data2/data_abs_max
    return data1, data2, data_abs_max
def data_normback(data, data_abs_max) :
return data * data_abs_max
train_obs_norm, valid_obs_norm, data_abs_max = data_norm(train_obs_2d_mean, valid_obs_2d_mean)
plt.figure(figsize=(15,5))
for i in range (train_obs.shape[0]) :
plt.plot(wls, train_obs_norm[i,10], 'g-', alpha = 0.5)
plt.plot([], [], 'g-', alpha=0.5, label='Train data')
for i in range (valid_obs.shape[0]) :
    plt.plot(wls, valid_obs_norm[i,10], 'r-', alpha = 0.7)
plt.plot([], [], 'r-', alpha=0.5, label='Valid data')
plt.legend()
plt.ylabel('Normalized flux (mean-subtracted)')
plt.title('Train and Valid data after subtracting the mean value and normalization')
plt.show()
from tensorflow import keras
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense, Concatenate, Reshape, Dropout, BatchNormalization, AveragePooling2D
from keras.models import Model
import tensorflow as tf
import numpy as np
## CNN 2 global normalization data
input_obs = Input((40,283,1))
x = Conv2D(32, (3, 1), activation='relu', padding='same')(input_obs)
x = MaxPooling2D((2, 1))(x)
x = BatchNormalization() (x)
x = Conv2D(64, (3, 1), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 1))(x)
x = Conv2D(128, (3, 1), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 1))(x)
x = Conv2D(256, (3, 1), activation='relu', padding='same')(x)
x = Conv2D(32, (1, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((1, 2))(x)
x = BatchNormalization() (x)
x = Conv2D(64, (1, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((1, 2))(x)
x = Conv2D(128, (1, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((1, 2))(x)
x = Conv2D(256, (1, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((1, 2))(x)
x = Flatten()(x)
# DNN
x = Dense(700, activation='relu')(x)
x = Dropout(0.2)(x, training = True)
output = Dense(283, activation='linear')(x)
model = Model(inputs=[input_obs], outputs=output)
checkpoint_filepath = 'output/model_2dcnn.keras'
model_ckt2 = ModelCheckpoint(
checkpoint_filepath,
monitor="val_loss",
verbose=0,
save_best_only=True,
save_weights_only=False,
mode="min",
save_freq="epoch",
)
model.compile(optimizer=Adam(0.001), loss='mse', metrics=[MeanAbsoluteError()])
model.summary()
history = model.fit(
x = train_obs_norm,
y = train_targets_norm,
validation_data = (valid_obs_norm, valid_targets_norm),
batch_size=32,
epochs= 200,
shuffle=True,
verbose=0,
callbacks=[model_ckt2]
)
We obtain uncertainties on the predictions by performing MC Dropout, here with only 5 forward passes to keep the runtime short.
nb_dropout = 5
def NN_uncertainity(model, x_test, targets_abs_max, T=5):
predictions = []
for _ in range(T):
pred_norm = model.predict([x_test],verbose=0)
pred = targets_norm_back(pred_norm, targets_abs_max)
predictions += [pred]
mean, std = np.mean(np.array(predictions), axis=0), np.std(np.array(predictions), axis=0)
return mean, std
if do_the_mcdropout :
spectre_valid_shift, spectre_valid_shift_std = NN_uncertainity(model, [valid_obs_norm], targets_abs_max, T = nb_dropout)
else :
pred_valid_norm = model.predict([valid_obs_norm])
pred_valid = targets_norm_back(pred_valid_norm, targets_abs_max)
spectre_valid_shift = pred_valid
spectre_valid_shift_std = spectre_valid_shift*0.1
residuals = valid_targets_shift - spectre_valid_shift
print('RMSE : ', np.sqrt((residuals**2).mean())*1e6, 'ppm')
# np.save(f'{output_dir}/pred_valid_shift.npy', spectre_valid_shift)
# np.save(f'{output_dir}/targ_valid_shift.npy', valid_targets_shift)
# np.save(f'{output_dir}/std_valid_shift.npy', spectre_valid_shift_std)
plt.figure()
for i in range (50) :
plt.plot(spectre_valid_shift[-i]+0.0001*i, '-', alpha = 0.5)
plt.title('Spectra predictions for the validation set')
plt.xlabel('Wavelength index')
plt.ylabel('$(R_p/R_s)^2$ (offset for clarity)')
plt.show()
list_valid_planets = [0, 12, 35, 60, 70]
wavelength = np.loadtxt('/kaggle/input/ariel-data-challenge-2024/wavelengths.csv', skiprows=1, delimiter = ',')
uncertainty = spectre_valid_shift_std
for i in (list_valid_planets):
plt.figure()
plt.title('Result for the sample {} of the validation set'.format(i))
plt.plot(wavelength, spectre_valid_shift[i], '.k', label = 'Prediction')
plt.plot(wavelength, valid_targets_shift[i], color = 'tomato', label = 'Target')
plt.fill_between(wavelength, spectre_valid_shift[i] - spectre_valid_shift_std[i], spectre_valid_shift[i] + spectre_valid_shift_std[i], color='silver', alpha = 0.8, label = 'Uncertainty')
plt.legend()
plt.ylabel(f'$(R_p/R_s)^2$')
plt.xlabel(f'Wavelength ($\mu$m)')
plt.show()
######## ADD THE FLUCTUATIONS TO THE MEAN ########
def add_the_mean (shift, mean) :
return shift + mean[:,np.newaxis]
predictions_valid = add_the_mean(spectre_valid_shift,spectre_valid_wc)
predictions_std_valid = np.sqrt(spectre_valid_std_wc[:,np.newaxis]**2 + spectre_valid_shift_std**2)
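The per-wavelength uncertainty above combines the two model uncertainties in quadrature, assuming the white-curve (mean) error and the 2D CNN (fluctuation) error are independent:

$$\sigma_{\mathrm{total}}(\lambda) = \sqrt{\sigma_{\mathrm{wc}}^2 + \sigma_{\mathrm{shift}}^2(\lambda)}$$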
uncertainty = predictions_std_valid
def plot_one_sample_valid(ax, p):
ax.set_title(f'Result for sample {p} ')
line1, = ax.plot(wavelength, predictions_valid[p], '.k', label='Prediction')
line2, = ax.plot(wavelength, valid_targets[p], color='tomato', label='Target')
ax.fill_between(wavelength, predictions_valid[p, :] - uncertainty[p], predictions_valid[p, :] + uncertainty[p], color='silver', alpha=0.8, label='Uncertainty')
ax.set_ylabel(f'$(R_p/R_s)^2$')
ax.set_xlabel(f'Wavelength ($\mu$m)')
return line1, line2
num_samples = 16
rows, cols = 4, 4
fig, axs = plt.subplots(rows, cols, figsize=(15, 10))
samples = [1, 2, 7, 15, 20, 25, 30, 35, 40, 45, 50, 55, 6, 5, 8, 9]
lines = []
for i, ax in enumerate(axs.flat):
lines.extend(plot_one_sample_valid(ax, samples[i]))
fig.legend(lines[:2], ['Prediction', 'Target'], loc='upper center', ncol=3, bbox_to_anchor=(0.5, -0.05))
fig.suptitle('Validation dataset')
plt.tight_layout()
plt.show()
######## PLOTS THE RESULT ########
predictions = predictions_valid
targets_plot = valid_targets
std = predictions_std_valid
predictions_concatenated_plot = np.concatenate(predictions, axis=0)
wls_concatenated = np.arange(predictions_concatenated_plot.shape[0])
targets_concatenated_plot = np.concatenate(targets_plot, axis=0)
spectre_valid_std_concatenated = np.concatenate(std, axis=0)
residuals = targets_concatenated_plot - predictions_concatenated_plot
uncertainty = spectre_valid_std_concatenated
fig, axs = plt.subplots(2, 1, figsize=(9, 8), gridspec_kw={'height_ratios': [3, 1]})
axs[0].plot(wls_concatenated, predictions_concatenated_plot, '-', color='k', label="Prediction")
axs[0].plot(wls_concatenated, targets_concatenated_plot, '-', color='tomato', label="Target")
axs[0].fill_between(np.arange(len(wls_concatenated)),
predictions_concatenated_plot - uncertainty,
predictions_concatenated_plot + uncertainty,
color='silver', alpha=1, label='Uncertainty')
axs[0].set_xlabel('Concatenated wavelengths for all planets')
axs[0].set_ylabel(f'$(R_p/R_s)^2$')
axs[0].set_title('Prediction vs target, validation dataset')
axs[0].legend()
axs[1].plot(wls_concatenated, residuals, '-', color='cornflowerblue', label="Residual")
axs[1].fill_between(np.arange(len(wls_concatenated)),
residuals - uncertainty,
residuals + uncertainty,
color='lightblue', alpha=0.9, label='Uncertainty')
axs[1].set_xlabel('Concatenated wavelengths for all planets')
axs[1].set_ylabel('Residual')
axs[1].set_title('Residuals with Uncertainty')
axs[1].legend()
plt.tight_layout()
plt.show()
print('RMSE : ', np.sqrt((residuals**2).mean())*1e6, 'ppm')
# np.save(f'{output_dir}/pred_valid.npy', predictions_valid)
# np.save(f'{output_dir}/std_valid.npy', predictions_std_valid)