Sleep Stage Classification from Polysomnography (PSG) Data
Sleep stage identification, also known as sleep scoring or sleep stage classification, is important for a better understanding of the sleep EEG. In practice, the construction of a hypnogram, i.e. the sequence of sleep stages across the night, is commonly used as a preliminary examination for diagnosing sleep disorders such as insomnia or sleep apnea. The examination typically proceeds as follows. First, polysomnography (PSG) is used to record electroencephalography (EEG) signals from different positions on the subject's head, together with electrooculography (EOG), electromyography (EMG), and other signals. Then, a sleep expert inspects the different time series recorded over the night and assigns a sleep stage to each 30 s segment, following a reference nomenclature such as the American Academy of Sleep Medicine (AASM) rules or the Rechtschaffen and Kales (R&K) rules.
According to the AASM rules, five sleep stages are defined: wake (W), rapid eye movement (REM), and three non-REM stages (N1, N2, and N3, the last also called slow-wave sleep or deep sleep). They are characterized by distinct temporal and spectral patterns, and they occupy different proportions of the night; for example, a transitional stage such as N1 occurs less frequently than REM or N2. Under the AASM rules, transitions between stages are also recorded and may modulate the scorer's final decision: in practice, some transitions are terminated and others reinforced depending on the occurrence of specific events, such as arousals, K-complexes, or spindles around the N1-N2 transition. Although expert inspection yields very valuable information, sleep scoring is a tedious and time-consuming task, and it is also subject to the scorer's subjectivity and inter-rater variability.
Automatic sleep scoring methods have attracted the interest of many researchers. From a statistical machine learning perspective, the problem is an imbalanced multi-class prediction task. State-of-the-art automatic methods fall into two categories, depending on whether the features used for classification are extracted using expert knowledge or learned from the raw signals. Methods in the first category rely on prior knowledge about the signals and events. Methods in the second category learn suitable feature representations, either from transformed data or directly from the raw data with convolutional neural networks. Recently, an approach using adversarial deep neural networks for sleep stage classification of EEG signals has also been proposed.
One of the main statistical machine learning challenges is the imbalanced nature of the classification task, which has important practical implications. In general, a stage such as N1 is much rarer than N2. When a predictive algorithm is trained on highly imbalanced classes, it tends not to predict the rarest ones. One way to address this is to reweight the model's loss function. With the online training schemes used for neural networks, an alternative is balanced sampling: feeding the network mini-batches that contain, as far as possible, an equal number of data points from each class. Another challenge concerns how transitional stages and transition rules are handled. Since transitions can influence the scorer's final decision, a predictive model can exploit this to improve its performance, for example by providing the final classifier with features from the neighboring epochs as well. This is known as temporal sleep stage classification.
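The loss-reweighting idea can be sketched with scikit-learn's built-in class weighting. This is a minimal illustration on simulated labels (the label distribution and feature matrix below are invented for the example, not taken from the Sleep Physionet data):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.utils.class_weight import compute_class_weight

# Simulated imbalanced labels: stage "2" (think N1) is rare, stage "3" (N2) common
rng = np.random.RandomState(42)
y = rng.choice([1, 2, 3, 4, 5], size=1000, p=[0.15, 0.05, 0.45, 0.2, 0.15])
X = rng.randn(1000, 5)

# Option 1: let scikit-learn derive inverse-frequency weights automatically,
# so errors on rare classes cost more during training
clf = RandomForestClassifier(n_estimators=50, class_weight='balanced',
                             random_state=42)
clf.fit(X, y)

# Option 2: compute the same weights explicitly to inspect them;
# the rarest class receives the largest weight
weights = compute_class_weight('balanced', classes=np.unique(y), y=y)
```

With `class_weight='balanced'`, each class contributes equally to the fit regardless of how many epochs it has, which is one way to keep the classifier from ignoring rare stages such as N1.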
This article addresses temporal sleep stage classification on multivariate time series, using two subjects from the Sleep Physionet dataset, named Alice and Bob for the purposes of this demonstration. It shows how to predict Bob's sleep stages from a model trained on Alice's data. The problem is cast as a supervised multi-class classification task: predicting, for each 30 s chunk of data, which of 5 possible stages it belongs to. The analysis is carried out with the Python package MNE.
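The neighboring-epoch idea behind temporal sleep stage classification can be sketched with a small helper. Note that `add_temporal_context` is a hypothetical function written for illustration (it is not part of MNE or scikit-learn); it simply concatenates each epoch's feature vector with those of its neighbors:

```python
import numpy as np

def add_temporal_context(X, n_past=1, n_future=1):
    """Concatenate each epoch's features with those of its neighbors.

    X : array of shape (n_epochs, n_features). Edge epochs are padded by
    repeating the first/last row. Hypothetical helper for illustration.
    """
    padded = np.vstack([X[:1]] * n_past + [X] + [X[-1:]] * n_future)
    # one shifted view of the data per context position (past, current, future)
    windows = [padded[i:i + len(X)] for i in range(n_past + n_future + 1)]
    return np.hstack(windows)

X = np.arange(12, dtype=float).reshape(4, 3)   # 4 epochs, 3 features each
Xc = add_temporal_context(X, n_past=1, n_future=1)
# each row now holds the previous, current, and next epoch features: 9 columns
```

A final classifier fit on `Xc` instead of `X` would then see the spectral context around each 30 s epoch, which is one way a model could account for transition rules.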
import numpy as np
import matplotlib.pyplot as plt
import mne
from mne.datasets.sleep_physionet.age import fetch_data
from mne.time_frequency import psd_welch
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer
Loading the data
MNE-Python provides mne.datasets.sleep_physionet.age.fetch_data(), which conveniently downloads data from the Sleep Physionet dataset. Given a list of subjects and recordings, the fetcher downloads the data and provides, for each subject, a pair of files:
- PSG.edf, containing the polysomnography record.
- Hypnogram.edf, containing the annotations recorded by an expert.
The two are combined into a single mne.io.Raw object, from which events are then extracted based on the annotation descriptions to obtain epochs.
Read the PSG data and hypnogram to create a Raw object
ALICE, BOB = 0, 1
[alice_files, bob_files] = fetch_data(subjects=[ALICE, BOB], recording=[1])
raw_train = mne.io.read_raw_edf(alice_files[0], stim_channel='Event marker',
                                misc=['Temp rectal'])
annot_train = mne.read_annotations(alice_files[1])
raw_train.set_annotations(annot_train, emit_warning=False)
# plot some data
# scalings were chosen manually to allow for simultaneous visualization of
# different channel types in this specific dataset
raw_train.plot(start=60, duration=60,
               scalings=dict(eeg=1e-4, resp=1e3, eog=1e-4, emg=1e-7,
                             misc=1e-1))

OUT:
Using default location ~/mne_data for PHYSIONET_SLEEP...
Extracting EDF parameters from /home/circleci/mne_data/physionet-sleep-data/SC4001E0-PSG.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Extract 30 s events from the annotations
The Sleep Physionet dataset is annotated with 8 labels: Wake (W); Stage 1, Stage 2, Stage 3, and Stage 4, corresponding to increasing depth from light to deep sleep; REM sleep (R), where REM stands for rapid eye movement; movement (M); and unscored segments (?). Here only five stages are used: Wake (W), Stage 1, Stage 2, Stage 3/4, and REM sleep (R). To do so, the event_id parameter of mne.events_from_annotations() is used to select the events of interest and associate an event identifier with each of them.
Moreover, the recordings contain long Wake (W) regions before and after each night. To limit the impact of class imbalance, each recording is trimmed to keep only the span from 30 min before the first sleep stage to 30 min after the last one.
annotation_desc_2_event_id = {'Sleep stage W': 1,
                              'Sleep stage 1': 2,
                              'Sleep stage 2': 3,
                              'Sleep stage 3': 4,
                              'Sleep stage 4': 4,
                              'Sleep stage R': 5}
# keep last 30-min wake events before sleep and first 30-min wake events after
# sleep and redefine annotations on raw data
annot_train.crop(annot_train[1]['onset'] - 30 * 60,
                 annot_train[-2]['onset'] + 30 * 60)
raw_train.set_annotations(annot_train, emit_warning=False)
events_train, _ = mne.events_from_annotations(
    raw_train, event_id=annotation_desc_2_event_id, chunk_duration=30.)
# create a new event_id that unifies stages 3 and 4
event_id = {'Sleep stage W': 1,
            'Sleep stage 1': 2,
            'Sleep stage 2': 3,
            'Sleep stage 3/4': 4,
            'Sleep stage R': 5}
# plot events
fig = mne.viz.plot_events(events_train, event_id=event_id,
                          sfreq=raw_train.info['sfreq'],
                          first_samp=events_train[0, 0])
# keep the color-code for further plotting
stage_colors = plt.rcParams['axes.prop_cycle'].by_key()['color']

OUT:
Used Annotations descriptions: ['Sleep stage 1', 'Sleep stage 2', 'Sleep stage 3', 'Sleep stage 4', 'Sleep stage R', 'Sleep stage W']
Create epochs from the events
tmax = 30. - 1. / raw_train.info['sfreq']  # tmax is included
epochs_train = mne.Epochs(raw=raw_train, events=events_train,
                          event_id=event_id, tmin=0., tmax=tmax, baseline=None)
print(epochs_train)
OUT:
Not setting metadata
841 matching events found
No baseline correction applied
0 projection items activated
<Epochs |  841 events (good & bad), 0 - 29.99 sec, baseline off, ~12 kB, data not loaded,
'Sleep stage W': 188
'Sleep stage 1': 58
'Sleep stage 2': 250
'Sleep stage 3/4': 220
'Sleep stage R': 125>
Apply the same steps to load Bob's data (the test data)
raw_test = mne.io.read_raw_edf(bob_files[0], stim_channel='Event marker',
                               misc=['Temp rectal'])
annot_test = mne.read_annotations(bob_files[1])
annot_test.crop(annot_test[1]['onset'] - 30 * 60,
                annot_test[-2]['onset'] + 30 * 60)
raw_test.set_annotations(annot_test, emit_warning=False)
events_test, _ = mne.events_from_annotations(
    raw_test, event_id=annotation_desc_2_event_id, chunk_duration=30.)
epochs_test = mne.Epochs(raw=raw_test, events=events_test, event_id=event_id,
                         tmin=0., tmax=tmax, baseline=None)
print(epochs_test)
OUT:
Extracting EDF parameters from /home/circleci/mne_data/physionet-sleep-data/SC4011E0-PSG.edf...
EDF file detected
Setting channel info structure...
Creating raw.info structure...
Used Annotations descriptions: ['Sleep stage 1', 'Sleep stage 2', 'Sleep stage 3', 'Sleep stage 4', 'Sleep stage R', 'Sleep stage W']
Not setting metadata
1103 matching events found
No baseline correction applied
0 projection items activated
<Epochs |  1103 events (good & bad), 0 - 29.99 sec, baseline off, ~12 kB, data not loaded,
'Sleep stage W': 157
'Sleep stage 1': 109
'Sleep stage 2': 562
'Sleep stage 3/4': 105
'Sleep stage R': 170>
Feature engineering
Observing the power spectral density (PSD) plots of the epochs for the different sleep stages shows that the stages have distinct spectral signatures, and that these signatures are similar between Alice's and Bob's data. In the rest of this section, EEG features are built from the relative power in specific frequency bands, so as to capture these between-stage differences in the data.
# visualize Alice vs. Bob PSD by sleep stage.
fig, (ax1, ax2) = plt.subplots(ncols=2)
# iterate over the subjects
stages = sorted(event_id.keys())
for ax, title, epochs in zip([ax1, ax2],
                             ['Alice', 'Bob'],
                             [epochs_train, epochs_test]):
    for stage, color in zip(stages, stage_colors):
        epochs[stage].plot_psd(area_mode=None, color=color, ax=ax,
                               fmin=0.1, fmax=20., show=False,
                               average=True, spatial_colors=False)
    ax.set(title=title, xlabel='Frequency (Hz)')
ax2.set(ylabel='μV^2/Hz (dB)')
ax2.legend(ax2.lines[2::3], stages)
plt.show()

OUT:
Loading data for 58 events and 3000 original time points ...
0 bad epochs dropped
    Using multitaper spectrum estimation with 7 DPSS windows
Loading data for 250 events and 3000 original time points ...
0 bad epochs dropped
    Using multitaper spectrum estimation with 7 DPSS windows
Loading data for 220 events and 3000 original time points ...
0 bad epochs dropped
    Using multitaper spectrum estimation with 7 DPSS windows
Loading data for 125 events and 3000 original time points ...
0 bad epochs dropped
    Using multitaper spectrum estimation with 7 DPSS windows
Loading data for 188 events and 3000 original time points ...
0 bad epochs dropped
    Using multitaper spectrum estimation with 7 DPSS windows
Loading data for 109 events and 3000 original time points ...
0 bad epochs dropped
    Using multitaper spectrum estimation with 7 DPSS windows
Loading data for 562 events and 3000 original time points ...
0 bad epochs dropped
    Using multitaper spectrum estimation with 7 DPSS windows
Loading data for 105 events and 3000 original time points ...
0 bad epochs dropped
    Using multitaper spectrum estimation with 7 DPSS windows
Loading data for 170 events and 3000 original time points ...
0 bad epochs dropped
    Using multitaper spectrum estimation with 7 DPSS windows
Loading data for 157 events and 3000 original time points ...
0 bad epochs dropped
    Using multitaper spectrum estimation with 7 DPSS windows
Designing a scikit-learn transformer from a Python function
The function below extracts EEG features based on the relative power in specific frequency bands, so that sleep stages can be predicted from the EEG signals.
def eeg_power_band(epochs):
    """EEG relative power band feature extraction.

    This function takes an ``mne.Epochs`` object and creates EEG features based
    on relative power in specific frequency bands that are compatible with
    scikit-learn.

    Parameters
    ----------
    epochs : Epochs
        The data.

    Returns
    -------
    X : numpy array of shape [n_samples, 5]
        Transformed data.
    """
    # specific frequency bands
    FREQ_BANDS = {"delta": [0.5, 4.5],
                  "theta": [4.5, 8.5],
                  "alpha": [8.5, 11.5],
                  "sigma": [11.5, 15.5],
                  "beta": [15.5, 30]}

    psds, freqs = psd_welch(epochs, picks='eeg', fmin=0.5, fmax=30.)
    # Normalize the PSDs
    psds /= np.sum(psds, axis=-1, keepdims=True)

    X = []
    for fmin, fmax in FREQ_BANDS.values():
        psds_band = psds[:, :, (freqs >= fmin) & (freqs < fmax)].mean(axis=-1)
        X.append(psds_band.reshape(len(psds), -1))

    return np.concatenate(X, axis=1)
Multi-class classification workflow with scikit-learn
To predict Bob's sleep stages from Alice's data with as little boilerplate code as possible, two key scikit-learn features are used: Pipeline and FunctionTransformer. A scikit-learn Pipeline chains estimators as a sequence of transforms followed by a final estimator, while FunctionTransformer wraps a Python function into an estimator-compatible object. This way, a scikit-learn estimator can be created that takes mne.Epochs as input.
pipe = make_pipeline(FunctionTransformer(eeg_power_band, validate=False),
                     RandomForestClassifier(n_estimators=100, random_state=42))
# Train
y_train = epochs_train.events[:, 2]
pipe.fit(epochs_train, y_train)
# Test
y_pred = pipe.predict(epochs_test)
# Assess the results
y_test = epochs_test.events[:, 2]
acc = accuracy_score(y_test, y_pred)
print("Accuracy score: {}".format(acc))
OUT:
Loading data for 841 events and 3000 original time points ...
0 bad epochs dropped
Effective window size : 2.560 (s)
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    1.2s remaining:    0.0s
[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    1.2s finished
Loading data for 1103 events and 3000 original time points ...
0 bad epochs dropped
Effective window size : 2.560 (s)
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    1.6s remaining:    0.0s
[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    1.6s finished
Accuracy score: 0.641885766092475
As the output shows, Bob's sleep stages can be predicted from a model trained on Alice's data, with an accuracy of about 64.2%.
Further analysis of the data
Let's look at the confusion matrix and the classification report.
print(confusion_matrix(y_test, y_pred))
OUT:
[[156   0   1   0   0]
 [ 80   4   7   2  16]
 [ 85  17 369  33  58]
 [  0   0   5 100   0]
 [ 54   3  34   0  79]]
print(classification_report(y_test, y_pred, target_names=event_id.keys()))
OUT:
                 precision    recall  f1-score   support

  Sleep stage W       0.42      0.99      0.59       157
  Sleep stage 1       0.17      0.04      0.06       109
  Sleep stage 2       0.89      0.66      0.75       562
Sleep stage 3/4       0.74      0.95      0.83       105
  Sleep stage R       0.52      0.46      0.49       170

       accuracy                           0.64      1103
      macro avg       0.55      0.62      0.54      1103
   weighted avg       0.68      0.64      0.63      1103
From this classification report we can read, for each sleep stage, the precision, recall, F1-score, and support (the number of test samples). For example, the REM stage (R) is predicted with a precision of 52%, a recall of 46%, and an F1-score of 0.49, over 170 test epochs out of 1103 in total.
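Because the classes are imbalanced, overall accuracy can flatter a model that favors the frequent stages. Metrics such as balanced accuracy or Cohen's kappa are often more informative here. A minimal sketch on hypothetical label arrays (not the Physionet results above):

```python
import numpy as np
from sklearn.metrics import (accuracy_score, balanced_accuracy_score,
                             cohen_kappa_score)

# Hypothetical predictions on an imbalanced label set: a classifier that
# always predicts the majority class scores high accuracy but has no skill
y_true = np.array([1] * 90 + [2] * 10)
y_pred = np.ones(100, dtype=int)

acc = accuracy_score(y_true, y_pred)               # 0.9, looks good
bal_acc = balanced_accuracy_score(y_true, y_pred)  # 0.5, chance per class
kappa = cohen_kappa_score(y_true, y_pred)          # 0.0, no skill beyond chance
```

Balanced accuracy averages the per-class recall, and Cohen's kappa corrects agreement for chance, so both expose the degenerate majority-class strategy that plain accuracy hides.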
References:
Stanislas Chambon, Mathieu N. Galtier, Pierrick J. Arnal, Gilles Wainrib, and Alexandre Gramfort. A deep learning architecture for temporal sleep stage classification using multivariate and multimodal time series. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 26(4):758–769, 2018. doi:10.1109/TNSRE.2018.2813138.
B. Kemp, A. H. Zwinderman, B. Tuk, H. A. C. Kamphuisen, and J. J. L. Oberyé. Analysis of a sleep-dependent neuronal feedback loop: the slow-wave microcontinuity of the EEG. IEEE Transactions on Biomedical Engineering, 47(9):1185–1194, 2000. doi:10.1109/10.867928.
Ary L. Goldberger, Luis A. N. Amaral, Leon Glass, Jeffrey M. Hausdorff, Plamen Ch. Ivanov, Roger G. Mark, Joseph E. Mietus, George B. Moody, Chung-Kang Peng, and H. Eugene Stanley. PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation, 2000. doi:10.1161/01.CIR.101.23.e215.
https://mne.tools/stable/auto_tutorials/clinical/60_sleep.html#id1