Qianfeng Education Python data analysis tutorial (200 episodes): essential introductory videos for Python data analysts

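Learning Python means memorizing some syntax and then applying it flexibly. Even if some earlier material has not been covered yet, individual modules can often be studied on their own. I followed the videos on pandas time series. Some parts are still unclear to me, so for now I have only written up the notes on what I more or less understood and left the unclear parts out.
pandas time series
# Import packages
import numpy as np
import pandas as pd
pd.Timestamp('2030-2-3')  # a point in time (timestamp)
pd.Period('2030-2-3',freq='D')  # a span of time (period)
# Generate a batch of timestamps
index=pd.date_range('2030.02.13',periods=4,freq='D')
# index=pd.period_range('2030.02.13',periods=4,freq='D')
index  # freq: Y = year, M = month, D = day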
DatetimeIndex(['2030-02-13', '2030-02-14', '2030-02-15', '2030-02-16'], dtype='datetime64[ns]', freq='D')
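To make the difference between a timestamp and a period more concrete, here is a minimal sketch. The variable names are my own and are not from the videos. It shows that a Period covers a whole span with a start and an end, while a Timestamp is one instant inside it.

```python
import pandas as pd

# A Timestamp is a single instant; a Period covers a whole span of time.
ts_point = pd.Timestamp('2030-02-03')
day_period = pd.Period('2030-02-03', freq='D')

# A Period knows where it begins and where it ends.
period_start = day_period.start_time
period_end = day_period.end_time

# The instant falls inside the period for the same day.
inside = period_start <= ts_point <= period_end

# period_range is the Period counterpart of date_range.
pidx = pd.period_range('2030-02-13', periods=4, freq='D')
print(type(pidx).__name__, len(pidx), inside)
```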
# Timestamps as an index
pd.Series(np.random.randint(0,10,size=4),index=index)
2030-02-13 7
2030-02-14 3
2030-02-15 2
2030-02-16 6
Freq: D, dtype: int32
# Conversion
# (on pandas 2.0 and later, mixed formats like these need format='mixed')
pd.to_datetime(['2030-3-14','2030.03.14','14/03/2030','2030/3/14'])
DatetimeIndex(['2030-03-14', '2030-03-14', '2030-03-14', '2030-03-14'], dtype='datetime64[ns]', freq=None)
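When the layout of the strings is known, it may be safer to state it than to rely on inference. The following is a small sketch using the `format` and `errors` parameters of `pd.to_datetime`. The sample strings and variable names are made up for illustration.

```python
import pandas as pd

# Spell out the layout instead of relying on inference.
day_first = pd.to_datetime('14/03/2030', format='%d/%m/%Y')

# errors='coerce' turns unparseable entries into NaT instead of raising.
cleaned = pd.to_datetime(['2030-03-14', 'not a date'], errors='coerce')

print(day_first)
print(cleaned.isna().tolist())
```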
dt=pd.to_datetime([20300314],unit='s')  # seconds since 1970-01-01
dt
DatetimeIndex(['1970-08-23 22:58:34'], dtype='datetime64[ns]', freq=None)
# Time offsets with DateOffset
dt+pd.DateOffset(hours=8)
DatetimeIndex(['1970-08-24 06:58:34'], dtype='datetime64[ns]', freq=None)
dt+pd.DateOffset(days=-8)
DatetimeIndex(['1970-08-15 22:58:34'], dtype='datetime64[ns]', freq=None)
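One thing that helped me see why DateOffset exists alongside a plain time difference: it follows the calendar. The sketch below is my own example, not from the videos. It compares adding one calendar month with adding a fixed 30 days.

```python
import pandas as pd

start = pd.Timestamp('2030-01-31')

# DateOffset is calendar-aware: one month after Jan 31 is clamped to
# the last day of February.
plus_month = start + pd.DateOffset(months=1)

# Timedelta is a fixed duration: exactly 30 * 24 hours later.
plus_30_days = start + pd.Timedelta(days=30)

# For units such as hours, the two give the same result.
same = (start + pd.DateOffset(hours=8)) == (start + pd.Timedelta(hours=8))

print(plus_month.date(), plus_30_days.date(), same)
```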
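# Indexing and slicing by timestamp
# freq='D' overrides the business-day default of bdate_range, so weekends are included
index=pd.bdate_range('2030-3-14',periods=100,freq='D')
index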
DatetimeIndex(['2030-03-14', '2030-03-15', '2030-03-16', '2030-03-17',
'2030-03-18', '2030-03-19', '2030-03-20', '2030-03-21',
'2030-03-22', '2030-03-23', '2030-03-24', '2030-03-25',
'2030-03-26', '2030-03-27', '2030-03-28', '2030-03-29',
'2030-03-30', '2030-03-31', '2030-04-01', '2030-04-02',
'2030-04-03', '2030-04-04', '2030-04-05', '2030-04-06',
'2030-04-07', '2030-04-08', '2030-04-09', '2030-04-10',
'2030-04-11', '2030-04-12', '2030-04-13', '2030-04-14',
'2030-04-15', '2030-04-16', '2030-04-17', '2030-04-18',
'2030-04-19', '2030-04-20', '2030-04-21', '2030-04-22',
'2030-04-23', '2030-04-24', '2030-04-25', '2030-04-26',
'2030-04-27', '2030-04-28', '2030-04-29', '2030-04-30',
'2030-05-01', '2030-05-02', '2030-05-03', '2030-05-04',
'2030-05-05', '2030-05-06', '2030-05-07', '2030-05-08',
'2030-05-09', '2030-05-10', '2030-05-11', '2030-05-12',
'2030-05-13', '2030-05-14', '2030-05-15', '2030-05-16',
'2030-05-17', '2030-05-18', '2030-05-19', '2030-05-20',
'2030-05-21', '2030-05-22', '2030-05-23', '2030-05-24',
'2030-05-25', '2030-05-26', '2030-05-27', '2030-05-28',
'2030-05-29', '2030-05-30', '2030-05-31', '2030-06-01',
'2030-06-02', '2030-06-03', '2030-06-04', '2030-06-05',
'2030-06-06', '2030-06-07', '2030-06-08', '2030-06-09',
'2030-06-10', '2030-06-11', '2030-06-12', '2030-06-13',
'2030-06-14', '2030-06-15', '2030-06-16', '2030-06-17',
'2030-06-18', '2030-06-19', '2030-06-20', '2030-06-21'],
dtype='datetime64[ns]', freq='D')
ts=pd.Series(range(len(index)),index=index)
ts
2030-03-14 0
2030-03-15 1
2030-03-16 2
2030-03-17 3
2030-03-18 4
..
2030-06-17 95
2030-06-18 96
2030-06-19 97
2030-06-20 98
2030-06-21 99
Freq: D, Length: 100, dtype: int64
# Slicing (both endpoints are included)
ts['2030-03-15':'2030-03-22']
2030-03-15 1
2030-03-16 2
2030-03-17 3
2030-03-18 4
2030-03-19 5
2030-03-20 6
2030-03-21 7
2030-03-22 8
Freq: D, dtype: int64
ts[pd.date_range('2030-3-24',periods=10,freq='D')]
2030-03-24 10
2030-03-25 11
2030-03-26 12
2030-03-27 13
2030-03-28 14
2030-03-29 15
2030-03-30 16
2030-03-31 17
2030-04-01 18
2030-04-02 19
Freq: D, dtype: int64
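Besides full dates, a DatetimeIndex also accepts partial date strings. This is a minimal sketch that rebuilds the same 100-day series with `pd.date_range` (an assumption on my side, to keep the example self-contained) and selects a whole month with `.loc`.

```python
import pandas as pd

index = pd.date_range('2030-03-14', periods=100, freq='D')
ts = pd.Series(range(len(index)), index=index)

# A year-month string selects every row in that month.
april = ts.loc['2030-04']

# String slices on a DatetimeIndex include both endpoints.
mar_to_apr = ts.loc['2030-03-30':'2030-04-02']

print(len(april), april.iloc[0], april.iloc[-1])
print(mar_to_apr.tolist())
```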
# Attributes
ts.index
ts.index.year  # year
ts.index.month  # month
ts.index.day  # day
ts.index.dayofweek  # day of the week (Monday = 0, Sunday = 6)
Int64Index([3, 4, 5, 6, 0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3,
4, 5, 6, 0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4,
5, 6, 0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4, 5,
6, 0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4, 5, 6,
0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4],
dtype='int64')
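These attributes can also be used to filter rows. As a small sketch with the same 100-day series (rebuilt here with `pd.date_range`, and with variable names of my own), a boolean mask on `dayofweek` keeps only Mondays or only weekdays.

```python
import pandas as pd

index = pd.date_range('2030-03-14', periods=100, freq='D')
ts = pd.Series(range(len(index)), index=index)

# dayofweek counts from Monday = 0 to Sunday = 6.
mondays = ts[ts.index.dayofweek == 0]

# Keep Monday to Friday only.
weekdays_only = ts[ts.index.dayofweek < 5]

print(len(mondays), mondays.iloc[0], len(weekdays_only))
```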
# Resampling: aggregate the data at a chosen time granularity
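ts.resample('2D').sum()  # sum in 2-day bins
ts.resample('2W').sum()  # sum in 2-week bins
ts.resample('3M').sum()  # sum in 3-month bins (pandas 2.2+ spells month-end as 'ME')
ts.resample('H').sum()  # sum in 1-hour bins ('h' in pandas 2.2+)
ts.resample('T').sum()  # sum in 1-minute bins ('min' in pandas 2.2+)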
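To check what a resampled sum actually produces, here is a minimal sketch on a six-day series, small enough to verify by hand. The series is my own toy data, not from the videos.

```python
import pandas as pd

index = pd.date_range('2030-03-14', periods=6, freq='D')
small = pd.Series(range(6), index=index)  # values 0..5, one per day

# Each 2-day bin adds up the values that fall inside it:
# (0 + 1), (2 + 3), (4 + 5)
two_day = small.resample('2D').sum()

print(two_day.tolist())
print(two_day.index[0].date(), two_day.index[-1].date())
```

Each bin is labeled with its first day, so the result index runs 2030-03-14, 2030-03-16, 2030-03-18.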
# DataFrame resampling
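# The note lists only six prices, so the last two are left as NaN to give every column eight rows
d={'price':[10,11,2,44,55,66,np.nan,np.nan],'score':[40,30,20,50,60,70,80,10],'week':pd.date_range('2030-3-1',periods=8,freq='W')}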
df=pd.DataFrame(d)
df
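My notes stop before the actual resampling step, so what follows is only a sketch of how it could look, not necessarily what the video does. It assumes the `on` parameter of `DataFrame.resample` to point at the `week` column, uses only the `score` and `week` values from above, and picks the month-start alias `'MS'` as the target frequency.

```python
import pandas as pd

df = pd.DataFrame({
    'score': [40, 30, 20, 50, 60, 70, 80, 10],
    # freq='W' yields Sundays, starting with the first one on or after 2030-03-01
    'week': pd.date_range('2030-3-1', periods=8, freq='W'),
})

# Resample on the datetime column rather than on the index,
# summing the scores that fall in each calendar month.
monthly = df.resample('MS', on='week')['score'].sum()

print(monthly)
```

Five of the eight Sundays fall in March and three in April, so the monthly sums are 200 and 160.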