Q&A on Slides and Materials for "Mathematics in Deep Learning"
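Many people have asked whether slides and a textbook are available for the course, so here is a single consolidated answer from the instructor, Qi: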
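I did not strictly follow any one book in my lectures, but the books below are ones I like and have read carefully (for one or two of them I have read only part, and I will finish them when I have time). They helped a great deal with what I teach, and they are worth reading if you have the time: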
1. Deisenroth, Marc Peter, A. Aldo Faisal, and Cheng Soon Ong. Mathematics for Machine Learning. Cambridge University Press, 2020.
2. Nesterov, Yurii. "Introductory lectures on convex programming volume I: Basic course." Lecture notes 3, no. 4 (1998): 5.
3. Nocedal, Jorge, and Stephen J. Wright. Numerical Optimization. Springer, 1999.
4. Wright, John, and Yi Ma. High-Dimensional Data Analysis with Low-Dimensional Models: Principles, Computation, and Applications. Cambridge University Press, 2022.
5. Beck, Amir. First-Order Methods in Optimization. Society for Industrial and Applied Mathematics, 2017.
There are also a number of papers that I refer to often:
1. ResNet
readpaper.com/paper/2949650786
2. Transformer
readpaper.com/paper/2963403868
3. Xavier Initialization
readpaper.com/paper/1533861849
4. LayerNorm
readpaper.com/paper/3037932933
5. BatchNorm
readpaper.com/paper/2949117887
6. LipsFormer
readpaper.com/paper/717255664598069248
7. Understanding Optimization of Deep Learning via Jacobian Matrix and Lipschitz Constant
readpaper.com/paper/4815122133717876737
8. Adam
readpaper.com/paper/1522301498
9. AdamW
readpaper.com/paper/2768282280
10. Large-model papers such as LLaMA, OPT, and PaLM
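Items 6 and 7 are two papers we wrote ourselves. I drew on them heavily in the lectures, but I also added a lot of foundational material that does not appear in the sources above.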
For many of the specific concepts I relied on Wikipedia, for example matrix calculus, SVD, Lipschitz continuity, convex functions, and Taylor expansion.
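Reading through those papers will help a great deal, the books are excellent for building a solid foundation, and Wikipedia is genuinely useful too.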