
27 Hands-On Examples of Python Data Science Libraries (with Code)

2023-08-30 18:14 | Author: 程序員老K


Originally published by Python數據科學 (Python Data Science), 2023-07-21 18:40, Beijing


To give readers a first look at the Python libraries commonly used in artificial intelligence, and to help them pick the ones that fit their needs, this article briefly surveys the most common libraries.

1、Numpy

NumPy (Numerical Python) is an extension library for Python that supports large multidimensional arrays and matrices and provides a broad collection of mathematical functions that operate on them. Its core is written in C, and an array stores its values directly rather than storing pointers to Python objects, so array operations run far faster than equivalent pure-Python code. The example below compares pure Python with NumPy when computing the sine of every element in a list:

import numpy as np
import math
import time

# Pure Python: apply math.sin element by element, 10 passes
start = time.time()
for i in range(10):
    list_1 = list(range(1, 10000))
    for j in range(len(list_1)):
        list_1[j] = math.sin(list_1[j])
print("Pure Python took {}s".format(time.time() - start))

# NumPy: one vectorized np.sin call per pass
start = time.time()
for i in range(10):
    list_1 = np.array(np.arange(1, 10000))
    list_1 = np.sin(list_1)
print("NumPy took {}s".format(time.time() - start))

The results below show that the NumPy version is roughly ten times faster than the pure-Python code in this run:

Pure Python took 0.017444372177124023s
NumPy took 0.001619577407836914s
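Single wall-clock measurements like the ones above are noisy. A more repeatable comparison can be sketched with the standard-library `timeit` module. The helper names `pure_python_sin` and `best_time` are illustrative and not from the original article, and the NumPy branch only runs if NumPy is installed.

```python
import math
import timeit

try:
    import numpy as np  # optional: only needed for the vectorized variant
except ImportError:
    np = None

VALUES = list(range(1, 10000))


def pure_python_sin(values):
    """Apply math.sin element by element in the interpreter."""
    return [math.sin(v) for v in values]


def best_time(func, repeat=5, number=10):
    """Return the fastest of `repeat` runs, each calling `func` `number` times."""
    return min(timeit.repeat(func, repeat=repeat, number=number))


python_seconds = best_time(lambda: pure_python_sin(VALUES))
print(f"pure Python: {python_seconds:.6f}s for 10 passes")

if np is not None:
    array = np.arange(1, 10000)
    numpy_seconds = best_time(lambda: np.sin(array))
    print(f"NumPy:       {numpy_seconds:.6f}s for 10 passes")
```

Taking the minimum over several repeats reduces the influence of other processes on the measurement, which matters when the timed work lasts only a few milliseconds.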

2、OpenCV

OpenCV is a cross-platform computer vision library that runs on Linux, Windows, and macOS. It is lightweight and efficient, built from C functions and C++ classes, and it also offers a Python interface. It implements many general-purpose algorithms for image processing and computer vision. The code below applies a few simple filters: an averaging filter, a Gaussian blur, and a bilateral filter:

import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt

img = cv.imread('h89817032p0.png')

# 5x5 averaging kernel: each output pixel is the mean of its neighborhood
kernel = np.ones((5, 5), np.float32) / 25
dst = cv.filter2D(img, -1, kernel)
blur_1 = cv.GaussianBlur(img, (5, 5), 0)
blur_2 = cv.bilateralFilter(img, 9, 75, 75)

# OpenCV loads images as BGR; [:, :, ::-1] reverses the channels to RGB for matplotlib
plt.figure(figsize=(10, 10))
plt.subplot(221), plt.imshow(img[:, :, ::-1]), plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(222), plt.imshow(dst[:, :, ::-1]), plt.title('Averaging')
plt.xticks([]), plt.yticks([])
plt.subplot(223), plt.imshow(blur_1[:, :, ::-1]), plt.title('Gaussian')
plt.xticks([]), plt.yticks([])
plt.subplot(224), plt.imshow(blur_2[:, :, ::-1]), plt.title('Bilateral')
plt.xticks([]), plt.yticks([])
plt.show()
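The averaging kernel in the code above replaces each pixel with the mean of its neighborhood. The pure-Python sketch below shows the same idea on a tiny grayscale grid with a 3x3 window. `box_filter` is a hypothetical helper, not an OpenCV function, and it handles borders by clamping coordinates to the image edge, which is a simplification chosen for this sketch.

```python
def box_filter(image, radius=1):
    """Mean-filter a 2D list of numbers using a (2*radius+1)^2 window.

    Border pixels are handled by clamping coordinates to the image edge.
    """
    height, width = len(image), len(image[0])
    window = (2 * radius + 1) ** 2
    result = [[0.0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            total = 0.0
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    r = min(max(row + dr, 0), height - 1)
                    c = min(max(col + dc, 0), width - 1)
                    total += image[r][c]
            result[row][col] = total / window
    return result


# A single bright pixel in the middle of a dark 5x5 image.
image = [[0] * 5 for _ in range(5)]
image[2][2] = 9
smoothed = box_filter(image)
for line in smoothed:
    print(line)
```

The bright pixel is spread evenly over its 3x3 neighborhood, which is why averaging filters blur edges as well as noise.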

3、Scikit-image

scikit-image is an image processing library built on SciPy that treats images as NumPy arrays. For example, it can change the size of an image with functions such as rescale, resize, and downscale_local_mean.

from matplotlib import pyplot as plt
from skimage import data, color, io
from skimage.transform import rescale, resize, downscale_local_mean

image = color.rgb2gray(io.imread('h89817032p0.png'))

image_rescaled = rescale(image, 0.25, anti_aliasing=False)
image_resized = resize(image, (image.shape[0] // 4, image.shape[1] // 4),
                       anti_aliasing=True)
image_downscaled = downscale_local_mean(image, (4, 3))

plt.figure(figsize=(20, 20))
plt.subplot(221), plt.imshow(image, cmap='gray'), plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(222), plt.imshow(image_rescaled, cmap='gray'), plt.title('Rescaled')
plt.xticks([]), plt.yticks([])
plt.subplot(223), plt.imshow(image_resized, cmap='gray'), plt.title('Resized')
plt.xticks([]), plt.yticks([])
plt.subplot(224), plt.imshow(image_downscaled, cmap='gray'), plt.title('Downscaled')
plt.xticks([]), plt.yticks([])
plt.show()
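Of the three functions, downscale_local_mean is the simplest to reason about: it shrinks an image by averaging non-overlapping blocks. A minimal pure-Python sketch of that idea follows. `block_mean` is an illustrative name, and this simplified version requires the image dimensions to be divisible by the block size.

```python
def block_mean(image, factors):
    """Downscale a 2D list by averaging non-overlapping blocks.

    `factors` is (rows_per_block, cols_per_block); the image dimensions
    must be divisible by these factors in this simplified version.
    """
    block_rows, block_cols = factors
    height, width = len(image), len(image[0])
    if height % block_rows or width % block_cols:
        raise ValueError("image size must be divisible by the block size")
    result = []
    for top in range(0, height, block_rows):
        out_row = []
        for left in range(0, width, block_cols):
            block = [
                image[r][c]
                for r in range(top, top + block_rows)
                for c in range(left, left + block_cols)
            ]
            out_row.append(sum(block) / len(block))
        result.append(out_row)
    return result


image = [
    [1, 3, 5, 7],
    [1, 3, 5, 7],
    [2, 2, 8, 8],
    [4, 4, 8, 8],
]
print(block_mean(image, (2, 2)))  # [[2.0, 6.0], [3.0, 8.0]]
```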

4、PIL

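The Python Imaging Library (PIL) became the de facto standard image processing library for Python because it is powerful yet has a simple, easy-to-use API. However, PIL only supports Python up to 2.7 and has long been unmaintained, so a group of volunteers created a compatible fork called Pillow, which supports current Python 3.x releases and adds many new features. We can therefore skip PIL and install Pillow directly.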

5、Pillow

Generating a letter captcha image with Pillow:

from PIL import Image, ImageDraw, ImageFont, ImageFilter
import random

# Random uppercase letter (A-Z):
def rndChar():
    return chr(random.randint(65, 90))

# Random color 1 (lighter, used for the background pixels):
def rndColor():
    return (random.randint(64, 255), random.randint(64, 255), random.randint(64, 255))

# Random color 2 (darker, used for the text):
def rndColor2():
    return (random.randint(32, 127), random.randint(32, 127), random.randint(32, 127))

# 360 x 360 canvas:
width = 60 * 6
height = 60 * 6
image = Image.new('RGB', (width, height), (255, 255, 255))
# Create the Font object (this font path is system-specific):
font = ImageFont.truetype('/usr/share/fonts/wps-office/simhei.ttf', 60)
# Create the Draw object:
draw = ImageDraw.Draw(image)
# Fill every pixel with a random color:
for x in range(width):
    for y in range(height):
        draw.point((x, y), fill=rndColor())
# Draw the six letters:
for t in range(6):
    draw.text((60 * t + 10, 150), rndChar(), font=font, fill=rndColor2())
# Blur:
image = image.filter(ImageFilter.BLUR)
image.save('code.jpg', 'jpeg')
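The script above draws random letters but does not keep them, so nothing could later check a user's answer. One way to sketch that missing step, assuming the code is generated first and then drawn, is shown below. `make_code` and `check_answer` are hypothetical helpers that do not appear in the original.

```python
import random
import string


def make_code(length=6, rng=random):
    """Return `length` random uppercase ASCII letters (A-Z)."""
    return "".join(rng.choice(string.ascii_uppercase) for _ in range(length))


def check_answer(expected, answer):
    """Compare the user's answer with the expected code, ignoring case and spaces."""
    return answer.strip().upper() == expected


code = make_code()
print(code)                              # six letters, different on every run
print(check_answer(code, code.lower()))  # True
```

With this in place, the drawing loop would pass `code[t]` to `draw.text` instead of calling the random-letter function again for each position.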

6、SimpleCV

SimpleCV is an open-source framework for building computer vision applications. It gives access to high-performance computer vision libraries such as OpenCV without first having to learn about bit depth, file formats, color spaces, buffer management, eigenvalues, or matrices. Its Python 3 support is very poor, however. Running the following code under Python 3.7:

from SimpleCV import Image, Color, Display

# load an image from imgur
img = Image('http://i.imgur.com/lfAeZ4n.png')
# use a keypoint detector to find areas of interest
feats = img.findKeypoints()
# draw the list of keypoints
feats.draw(color=Color.RED)
# show the resulting image.
img.show()
# apply the stuff we found to the image.
output = img.applyLayers()
# save the results.
output.save('juniperfeats.png')

raises the following error, so using SimpleCV with Python 3 is not recommended:

SyntaxError: Missing parentheses in call to 'print'. Did you mean print('unit test')?

7、Mahotas

Mahotas is a fast computer vision library built on top of NumPy. It currently offers more than 100 image processing and computer vision functions, and the number keeps growing. Loading an image with Mahotas and operating on its pixels:

import numpy as np
import mahotas
import mahotas.demos
from matplotlib import pyplot as plt

# Load a bundled demo image as grayscale and keep its lower-right part
f = mahotas.demos.load('lena', as_grey=True)
f = f[128:, 128:]
plt.gray()
# Show the data:
print("Fraction of zeros in original image: {0}".format(np.mean(f == 0)))
plt.imshow(f)
plt.show()

8、Ilastik

Ilastik provides machine-learning-based analysis of biological images. Using machine learning algorithms, it makes it easy to segment, classify, track, and count cells or other experimental data. Most operations are interactive and require no machine learning expertise.

9、Scikit-Learn

Scikit-learn is a free machine learning library for the Python programming language. It features a range of classification, regression, and clustering algorithms, including support vector machines, random forests, gradient boosting, k-means, and DBSCAN. Implementing the KMeans algorithm with Scikit-learn:

import time
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import MiniBatchKMeans, KMeans
from sklearn.metrics.pairwise import pairwise_distances_argmin
from sklearn.datasets import make_blobs

# Generate sample data
np.random.seed(0)
batch_size = 45
centers = [[1, 1], [-1, -1], [1, -1]]
n_clusters = len(centers)
X, labels_true = make_blobs(n_samples=3000, centers=centers, cluster_std=0.7)

# Compute clustering with KMeans
k_means = KMeans(init='k-means++', n_clusters=3, n_init=10)
t0 = time.time()
k_means.fit(X)
t_batch = time.time() - t0

# Compute clustering with MiniBatchKMeans
mbk = MiniBatchKMeans(init='k-means++', n_clusters=3, batch_size=batch_size,
                      n_init=10, max_no_improvement=10, verbose=0)
t0 = time.time()
mbk.fit(X)
t_mini_batch = time.time() - t0

# Plot result
fig = plt.figure(figsize=(8, 3))
fig.subplots_adjust(left=0.02, right=0.98, bottom=0.05, top=0.9)
colors = ['#4EACC5', '#FF9C34', '#4E9A06']

# We want to have the same colors for the same cluster from the
# MiniBatchKMeans and the KMeans algorithm. Let's pair the cluster centers per
# closest one.
k_means_cluster_centers = k_means.cluster_centers_
order = pairwise_distances_argmin(k_means.cluster_centers_,
                                  mbk.cluster_centers_)
mbk_means_cluster_centers = mbk.cluster_centers_[order]
k_means_labels = pairwise_distances_argmin(X, k_means_cluster_centers)
mbk_means_labels = pairwise_distances_argmin(X, mbk_means_cluster_centers)

# KMeans
for k, col in zip(range(n_clusters), colors):
    my_members = k_means_labels == k
    cluster_center = k_means_cluster_centers[k]
    plt.plot(X[my_members, 0], X[my_members, 1], 'w',
             markerfacecolor=col, marker='.')
    plt.plot(cluster_center[0], cluster_center[1], 'o', markerfacecolor=col,
             markeredgecolor='k', markersize=6)
plt.title('KMeans')
plt.xticks(())
plt.yticks(())
plt.show()
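The fit call in the example above hides the iteration itself. As a rough illustration of what a k-means fit does, here is a pure-Python sketch of Lloyd's algorithm, the standard assign-then-update loop. It is not scikit-learn's implementation: the function name `kmeans` is illustrative, and it uses plain random initialization instead of the k-means++ initialization selected in the example.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Cluster 2D points with Lloyd's algorithm; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # plain random init, not k-means++
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        labels = [
            min(range(k), key=lambda j: (p[0] - centers[j][0]) ** 2
                                        + (p[1] - centers[j][1]) ** 2)
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old center if a cluster went empty
                centers[j] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centers, labels


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = kmeans(points, k=2)
print(sorted(centers))
print(labels)
```

On these two well-separated groups the centers settle on the group means after a couple of iterations. Real data usually needs several restarts, which is the role of `n_init=10` in the scikit-learn call.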

10、SciPy

The SciPy library provides many user-friendly and efficient numerical routines, such as numerical integration, interpolation, optimization, and linear algebra. It also defines many special functions of mathematical physics, including elliptic functions, Bessel functions, gamma functions, beta functions, hypergeometric functions, and parabolic cylinder functions. The example below uses Bessel functions from scipy.special to plot a vibrating drumhead:

from scipy import special
import matplotlib.pyplot as plt
import numpy as np

def drumhead_height(n, k, distance, angle, t):
    # k-th zero of the Bessel function of order n
    kth_zero = special.jn_zeros(n, k)[-1]
    return np.cos(t) * np.cos(n * angle) * special.jn(n, distance * kth_zero)

theta = np.r_[0:2 * np.pi:50j]
radius = np.r_[0:1:50j]
x = np.array([r * np.cos(theta) for r in radius])
y = np.array([r * np.sin(theta) for r in radius])
z = np.array([drumhead_height(1, 1, r, theta, 0.5) for r in radius])

fig = plt.figure()
ax = fig.add_axes(rect=(0, 0.05, 0.95, 0.95), projection='3d')
ax.plot_surface(x, y, z, rstride=1, cstride=1, cmap='RdBu_r', vmin=-0.5, vmax=0.5)
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_xticks(np.arange(-1, 1.1, 0.5))
ax.set_yticks(np.arange(-1, 1.1, 0.5))
ax.set_zlabel('Z')
plt.show()

11、NLTK

NLTK is a library for building Python programs that work with natural language. It provides easy-to-use interfaces to more than 50 corpora and lexical resources (such as WordNet), along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, and wrappers for industrial-strength natural language processing (NLP) libraries. NLTK has been called "a wonderful tool for teaching, and working in, computational linguistics using Python".

import nltk
from nltk.corpus import treebank

# Resources that must be downloaded on first use
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('maxent_ne_chunker')
nltk.download('words')
nltk.download('treebank')

sentence = """At eight o'clock on Thursday morning Arthur didn't feel very good."""
# Tokenize and tag parts of speech
tokens = nltk.word_tokenize(sentence)
tagged = nltk.pos_tag(tokens)
# Identify named entities
entities = nltk.chunk.ne_chunk(tagged)
# Display a parse tree
t = treebank.parsed_sents('wsj_0001.mrg')[0]
t.draw()

12、spaCy

spaCy is a free, open-source library for advanced NLP in Python. It can be used to build applications that process large volumes of text, to build information extraction or natural language understanding systems, or to preprocess text for deep learning.

import spacy

texts = [
    "Net income was $9.4 million compared to the prior year of $2.7 million.",
    "Revenue exceeded twelve billion dollars, with a loss of $1b.",
]
nlp = spacy.load("en_core_web_sm")
# Disable the pipeline components that named entity recognition does not need
for doc in nlp.pipe(texts, disable=["tok2vec", "tagger", "parser", "attribute_ruler", "lemmatizer"]):
    # Do something with the doc here
    print([(ent.text, ent.label_) for ent in doc.ents])

nlp.pipe yields Doc objects, so we can iterate over them and access the named entity predictions:

[('$9.4 million', 'MONEY'), ('the prior year', 'DATE'), ('$2.7 million', 'MONEY')]
[('twelve billion dollars', 'MONEY'), ('1b', 'MONEY')]

13、LibROSA

librosa is a Python library for music and audio analysis. It provides the functions needed to build music information retrieval systems.

# Beat tracking example
import librosa
import numpy as np

# 1. Get the file path to an included audio example
filename = librosa.example('nutcracker')
# 2. Load the audio as a waveform `y`
#    Store the sampling rate as `sr`
y, sr = librosa.load(filename)
# 3. Run the default beat tracker
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
# Newer librosa versions may return tempo as a one-element array,
# so convert it to a plain float before formatting
tempo = float(np.atleast_1d(tempo)[0])
print('Estimated tempo: {:.2f} beats per minute'.format(tempo))
# 4. Convert the frame indices of beat events into timestamps
beat_times = librosa.frames_to_time(beat_frames, sr=sr)

14、Pandas

Pandas is a fast, powerful, flexible, and easy-to-use open-source tool for data analysis and manipulation. It can import data from many formats, such as CSV, JSON, SQL, and Microsoft Excel, and it supports operations such as merging, reshaping, and selecting, as well as data cleaning and data wrangling. Pandas is widely used in academia, finance, statistics, and other data analysis fields.

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# A random time series, accumulated into a random walk
ts = pd.Series(np.random.randn(1000), index=pd.date_range("1/1/2000", periods=1000))
ts = ts.cumsum()

# Four random-walk columns sharing the same date index
df = pd.DataFrame(np.random.randn(1000, 4), index=ts.index, columns=list("ABCD"))
df = df.cumsum()
df.plot()
plt.show()
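The cumsum() call is what turns independent random steps into the wandering curves in the plot. The same running-total operation can be sketched with the standard library alone. The fixed seed and the variable names here are choices made for this illustration, not part of the original example.

```python
import itertools
import random

rng = random.Random(42)  # fixed seed so the walk is reproducible

# 1000 independent standard-normal steps, analogous to np.random.randn(1000).
steps = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# Running total of the steps, the same operation as Series.cumsum().
walk = list(itertools.accumulate(steps))

print(len(walk))                          # 1000
print(walk[0] == steps[0])                # True: first value is the first step
print(abs(walk[-1] - sum(steps)) < 1e-9)  # True: last value is the total
```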

15、Matplotlib

Matplotlib is a plotting library for Python. It provides a full set of command-style APIs similar to MATLAB's and can produce publication-quality figures. Matplotlib makes plotting very simple and strikes a good balance between ease of use and performance. Plotting multiple curves with Matplotlib:

# plot_multi_curve.py
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0.1, 2 * np.pi, 100)
y_1 = x
y_2 = np.square(x)
y_3 = np.log(x)
y_4 = np.sin(x)
plt.plot(x, y_1)
plt.plot(x, y_2)
plt.plot(x, y_3)
plt.plot(x, y_4)
plt.show()

16、Seaborn

Seaborn is a Python data visualization library that wraps Matplotlib in a higher-level API, which makes plotting easier. Seaborn should be seen as a complement to Matplotlib, not a replacement.

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme(style="ticks")
df = sns.load_dataset("penguins")
sns.pairplot(df, hue="species")
plt.show()

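17、Orange

Orange is open-source data mining and machine learning software that provides a range of components for data exploration, visualization, preprocessing, and modeling. It has a clean, intuitive interactive user interface that suits beginners doing exploratory data analysis and visualization, while advanced users can also use it as a Python module for data manipulation and component development. Orange can be installed with pip:

$ pip install orange3

After installation, run the orange-canvas command to launch the Orange graphical interface:

$ orange-canvas

Once it has started, the Orange graphical interface appears and is ready for use.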

18、PyBrain

PyBrain is a modular machine learning library for Python. Its goal is to offer flexible, easy-to-use, yet powerful algorithms for machine learning tasks, along with a variety of predefined environments for testing and comparing those algorithms. PyBrain is short for Python-Based Reinforcement Learning, Artificial Intelligence and Neural Network Library. We will use a simple example to show how PyBrain works by building a multilayer perceptron (MLP). First, we create a new feedforward network object:

from pybrain.structure import FeedForwardNetwork
n = FeedForwardNetwork()

Next, we construct the input, hidden, and output layers:

from pybrain.structure import LinearLayer, SigmoidLayer
inLayer = LinearLayer(2)
hiddenLayer = SigmoidLayer(3)
outLayer = LinearLayer(1)

To use these layers, we must add them to the network:

n.addInputModule(inLayer)
n.addModule(hiddenLayer)
n.addOutputModule(outLayer)

Multiple input and output modules can be added. For forward computation and backward error propagation, the network must know which layers are inputs and which are outputs. We also need to state explicitly how the layers are connected. For this we use the most common connection type, the full connection, implemented by the FullConnection class:

from pybrain.structure import FullConnection
in_to_hidden = FullConnection(inLayer, hiddenLayer)
hidden_to_out = FullConnection(hiddenLayer, outLayer)

As with the layers, we must add the connections to the network explicitly:

n.addConnection(in_to_hidden)
n.addConnection(hidden_to_out)

All elements are now in place. Finally, we call the .sortModules() method to make the MLP usable:

n.sortModules()

This call performs some internal initialization that is required before the network can be used.
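To make the structure concrete, the sketch below computes a forward pass through a network of the same shape in pure Python: two linear inputs, three sigmoid hidden units, one linear output, and full connections without bias terms. It mirrors the layout built above but is not PyBrain's implementation. The function names and the random weight range are assumptions made for illustration.

```python
import math
import random


def sigmoid(value):
    """Logistic activation used by the hidden layer."""
    return 1.0 / (1.0 + math.exp(-value))


def forward(inputs, w_in_hidden, w_hidden_out):
    """Forward pass: linear input -> sigmoid hidden -> linear output."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, inputs)))
        for row in w_in_hidden
    ]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_hidden_out]


rng = random.Random(0)
# One weight per (source, target) pair, as in a full connection without bias.
w_in_hidden = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(3)]
w_hidden_out = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(1)]

print(forward([1.0, 2.0], w_in_hidden, w_hidden_out))
```

In PyBrain itself, the corresponding step after sortModules() would be a call such as `n.activate([1, 2])`, which returns the network output for that input.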

19、Milk

MILK (Machine Learning Toolkit) is a machine learning toolkit for Python. It focuses on supervised classification, with classifiers such as SVMs, k-NN, random forests, and decision trees, and it also performs feature selection. These classifiers can be combined in many ways to form different classification systems. For unsupervised learning, MILK supports k-means clustering and affinity propagation. Training a classifier with MILK:

import numpy as np
import milk

# 100 samples with 10 features; the second half is shifted and labeled 1
features = np.random.rand(100, 10)
labels = np.zeros(100)
features[50:] += .5
labels[50:] = 1
learner = milk.defaultclassifier()
model = learner.train(features, labels)

# Now you can use the model on new examples:
example = np.random.rand(10)
print(model.apply(example))
example2 = np.random.rand(10)
example2 += .5
print(model.apply(example2))

20、TensorFlow

TensorFlow is an end-to-end open-source machine learning platform with a comprehensive, flexible ecosystem. It is usually divided into TensorFlow 1.x and TensorFlow 2.x. The main difference is that TF 1.x uses static graphs, whereas TF 2.x uses eager mode (dynamic graphs). This section uses TensorFlow 2.x to build a convolutional neural network (CNN):

import tensorflow as tf
from tensorflow.keras import datasets, layers, models

# Load the data
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()

# Preprocess: scale pixel values to [0, 1]
train_images, test_images = train_images / 255.0, test_images / 255.0

# Build the model
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))

# Compile and train the model
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
history = model.fit(train_images, train_labels, epochs=10,
                    validation_data=(test_images, test_labels))

21、PyTorch

PyTorch is the successor to Torch. It shares the same underlying backend as the Torch framework, but much of it was rewritten in Python, which makes it more flexible, adds support for dynamic graphs, and provides a Python interface.

# Imports
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor, Lambda, Compose
import matplotlib.pyplot as plt

# Choose the device
device = "cuda" if torch.cuda.is_available() else "cpu"
print("Using {} device".format(device))

# Define model
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()
        # No ReLU after the last layer: CrossEntropyLoss expects raw logits
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork().to(device)

# Loss function and optimizer
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# Training loop for one epoch
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)
        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)
        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if batch % 100 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")

22、Theano

Theano is a Python library, built on top of NumPy, for defining, optimizing, and efficiently evaluating mathematical expressions that involve multidimensional arrays. Computing a Jacobian matrix in Theano:

import theano
import theano.tensor as T

x = T.dvector('x')
y = x ** 2
# Scan over the entries of y and take the gradient of each one with respect to x
J, updates = theano.scan(lambda i, y, x: T.grad(y[i], x), sequences=T.arange(y.shape[0]), non_sequences=[y, x])
f = theano.function([x], J, updates=updates)
# The Jacobian of x**2 at [4, 4] is diag(8, 8)
f([4, 4])

23、Keras

Keras is a high-level neural network API written in Python that can run on top of TensorFlow, CNTK, or Theano. It was developed with a focus on fast experimentation, so that an idea can be turned into a result with the least possible delay.

from keras.models import Sequential
from keras.layers import Dense

# Build the model
model = Sequential()
model.add(Dense(units=64, activation='relu', input_dim=100))
model.add(Dense(units=10, activation='softmax'))

# Compile and train the model
model.compile(loss='categorical_crossentropy',
              optimizer='sgd',
              metrics=['accuracy'])
# x_train and y_train are assumed to be prepared beforehand:
# x_train with 100 features per sample, y_train one-hot encoded over 10 classes
model.fit(x_train, y_train, epochs=5, batch_size=32)

24、Caffe

The official Caffe2 website states that Caffe2 is now part of PyTorch. The existing APIs will continue to work, but users are encouraged to use the PyTorch APIs instead.

25、MXNet

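MXNet is a deep learning framework designed for both efficiency and flexibility. It lets you mix symbolic and imperative programming to maximize efficiency and productivity. Building a handwritten digit recognition model with MXNet: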

import mxnet as mx
from mxnet import gluon
from mxnet.gluon import nn
from mxnet import autograd as ag
import mxnet.ndarray as F

# Load the data
mnist = mx.test_utils.get_mnist()
batch_size = 100
train_data = mx.io.NDArrayIter(mnist['train_data'], mnist['train_label'], batch_size, shuffle=True)
val_data = mx.io.NDArrayIter(mnist['test_data'], mnist['test_label'], batch_size)

# CNN model
class Net(gluon.Block):
    def __init__(self, **kwargs):
        super(Net, self).__init__(**kwargs)
        self.conv1 = nn.Conv2D(20, kernel_size=(5,5))
        self.pool1 = nn.MaxPool2D(pool_size=(2,2), strides = (2,2))
        self.conv2 = nn.Conv2D(50, kernel_size=(5,5))
        self.pool2 = nn.MaxPool2D(pool_size=(2,2), strides = (2,2))
        self.fc1 = nn.Dense(500)
        self.fc2 = nn.Dense(10)

    def forward(self, x):
        x = self.pool1(F.tanh(self.conv1(x)))
        x = self.pool2(F.tanh(self.conv2(x)))
        # 0 means copy over size from corresponding dimension.
        # -1 means infer size from the rest of dimensions.
        x = x.reshape((0, -1))
        x = F.tanh(self.fc1(x))
        x = F.tanh(self.fc2(x))
        return x

net = Net()

# Initialization and optimizer
# Use the GPU if one is available, otherwise the CPU
ctx = [mx.gpu() if mx.test_utils.list_gpus() else mx.cpu()]
net.initialize(mx.init.Xavier(magnitude=2.24), ctx=ctx)
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.03})

# Train the model
# Use Accuracy as the evaluation metric.
metric = mx.metric.Accuracy()
softmax_cross_entropy_loss = gluon.loss.SoftmaxCrossEntropyLoss()
# Number of training epochs (assumed value; the original snippet leaves `epoch` undefined)
epoch = 10
for i in range(epoch):
    # Reset the train data iterator.
    train_data.reset()
    for batch in train_data:
        data = gluon.utils.split_and_load(batch.data[0], ctx_list=ctx, batch_axis=0)
        label = gluon.utils.split_and_load(batch.label[0], ctx_list=ctx, batch_axis=0)
        outputs = []
        # Inside training scope
        with ag.record():
            for x, y in zip(data, label):
                z = net(x)
                # Computes softmax cross entropy loss.
                loss = softmax_cross_entropy_loss(z, y)
                # Backpropagate the error for one iteration.
                loss.backward()
                outputs.append(z)
        metric.update(label, outputs)
        trainer.step(batch.data[0].shape[0])
    # Gets the evaluation result.
    name, acc = metric.get()
    # Reset evaluation result to initial state.
    metric.reset()
    print('training acc at epoch %d: %s=%f' % (i, name, acc))

26、PaddlePaddle

PaddlePaddle builds on Baidu's years of deep learning research and business applications. It combines a core deep learning training and inference framework, a library of base models, end-to-end development kits, and a rich set of tool components. It is described as China's first independently developed, fully featured, open-source, industrial-grade deep learning platform. Implementing LeNet-5 with PaddlePaddle:

# Import the required packages
import paddle
import numpy as np
from paddle.nn import Conv2D, MaxPool2D, Linear

## Build the network
import paddle.nn.functional as F

# Define the LeNet network structure
class LeNet(paddle.nn.Layer):
    def __init__(self, num_classes=1):
        super(LeNet, self).__init__()
        # Create the convolution and pooling layers
        # First convolution layer
        self.conv1 = Conv2D(in_channels=1, out_channels=6, kernel_size=5)
        self.max_pool1 = MaxPool2D(kernel_size=2, stride=2)
        # Shape logic: pooling does not change the channel count, which is 6 here
        # Second convolution layer
        self.conv2 = Conv2D(in_channels=6, out_channels=16, kernel_size=5)
        self.max_pool2 = MaxPool2D(kernel_size=2, stride=2)
        # Third convolution layer
        self.conv3 = Conv2D(in_channels=16, out_channels=120, kernel_size=4)
        # Shape logic: the input is flattened from [B,C,H,W] to [B,C*H*W]
        # For a [28,28] input, C*H*W is 120 after three convolutions and two poolings
        self.fc1 = Linear(in_features=120, out_features=64)
        # Fully connected layers: the first outputs 64 units,
        # the second outputs one unit per class label
        self.fc2 = Linear(in_features=64, out_features=num_classes)

    # Forward computation of the network
    def forward(self, x):
        # Each of the first two convolutions is followed by a Sigmoid
        # activation and a 2x2 pooling
        x = self.conv1(x)
        x = F.sigmoid(x)
        x = self.max_pool1(x)
        x = self.conv2(x)
        x = F.sigmoid(x)
        x = self.max_pool2(x)
        x = self.conv3(x)
        # Shape logic: flatten [B,C,H,W] to [B,C*H*W]
        x = paddle.reshape(x, [x.shape[0], -1])
        x = self.fc1(x)
        x = F.sigmoid(x)
        x = self.fc2(x)
        return x

27、CNTK

CNTK (Cognitive Toolkit) is a deep learning toolkit that describes a neural network as a series of computational steps in a directed graph. In this graph, leaf nodes represent input values or network parameters, and the other nodes represent matrix operations on their inputs. CNTK makes it easy to implement and combine popular model types such as CNNs. CNTK describes a neural network with the network description language (NDL). In short, you describe the input features, the input labels, some parameters, the computations that relate the parameters to the inputs, and the target nodes.

NDLNetworkBuilder=[

    run=ndlLR

    ndlLR=[
      # sample and label dimensions
      SDim=$dimension$
      LDim=1

      features=Input(SDim, 1)
      labels=Input(LDim, 1)

      # parameters to learn
      B0 = Parameter(4)
      W0 = Parameter(4, SDim)

      B = Parameter(LDim)
      W = Parameter(LDim, 4)

      # operations
      t0 = Times(W0, features)
      z0 = Plus(t0, B0)
      s0 = Sigmoid(z0)

      t = Times(W, s0)
      z = Plus(t, B)
      s = Sigmoid(z)

      LR = Logistic(labels, s)
      EP = SquareError(labels, s)

      # root nodes
      FeatureNodes=(features)
      LabelNodes=(labels)
      CriteriaNodes=(LR)
      EvalNodes=(EP)
      OutputNodes=(s,t,z,s0,W0)
    ]
]
