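Using regular expressions to match Baidu's epidemic data map
Stuck at home and bored over the May Day holiday, UP took another stab at analyzing Baidu's epidemic data, and found that the data this time is completely different from what UP analyzed last time. Because Baidu keeps changing its data structure, UP doesn't really recommend building a scraper on top of Baidu's data. Analyzing the main data payload, https://voice.baidu.com/act/newpneumonia/newpneumonia/?from=osari_pc_3, UP found that the structure now has three layers.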
Layer 1: domestic data
The domestic data starts right after "caseList":[ and is structured like this:
//Province-level data
{"confirmed":"1","died":"0","crued":"1","relativeTime":"1588176000","confirmedRelative":"0","diedRelative":"0","curedRelative":"0","curConfirm":"0","curConfirmRelative":"0","icuDisable":"1","area":"\u897f\u85cf","subList":[{"city":"\u62c9\u8428","confirmed":"1","died":"0","crued":"1","confirmedRelative":"0","curConfirm":"0","cityCode":"100"}]},
From this record we can read both the province's totals and the data for its cities.
The city data starts right after "subList":[ and each entry has this structure:
{"city":"\u62c9\u8428","confirmed":"1","died":"0","crued":"1","confirmedRelative":"0","curConfirm":"0","cityCode":"100"}
So we can first split the source into provinces and then parse the cities inside each one. Displayed in a tree view control, UP's result looks like this:

Some entries come out garbled because the data is not uniform.
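The two-step idea above (split out the provinces, then parse the cities inside each one) can be sketched roughly like this. This is only a minimal sketch, not UP's original source: the names `PROVINCE_RE`, `CITY_RE`, `unescape` and `parse_provinces` are made up for illustration, and the patterns assume the field order seen in the single sample record above (including Baidu's own `crued` spelling), so they will need adjusting whenever Baidu changes the structure again.

```python
import json
import re

# One province record, copied from the sample above. The raw string keeps
# the \uXXXX escapes as literal text, the way they appear in the page source.
SAMPLE = (
    r'"caseList":[{"confirmed":"1","died":"0","crued":"1","relativeTime":"1588176000",'
    r'"confirmedRelative":"0","diedRelative":"0","curedRelative":"0","curConfirm":"0",'
    r'"curConfirmRelative":"0","icuDisable":"1","area":"\u897f\u85cf","subList":['
    r'{"city":"\u62c9\u8428","confirmed":"1","died":"0","crued":"1",'
    r'"confirmedRelative":"0","curConfirm":"0","cityCode":"100"}]}],'
)

# Assumption: every province record starts with confirmed/died/crued and ends
# with "area" followed by "subList", as in the sample.
PROVINCE_RE = re.compile(
    r'\{"confirmed":"(?P<confirmed>\d*)","died":"(?P<died>\d*)","crued":"(?P<crued>\d*)"'
    r'[^\[\]]*?"area":"(?P<area>[^"]*)","subList":\[(?P<cities>[^\]]*)\]\}'
)
CITY_RE = re.compile(
    r'\{"city":"(?P<city>[^"]*)","confirmed":"(?P<confirmed>\d*)",'
    r'"died":"(?P<died>\d*)","crued":"(?P<crued>\d*)"'
)


def unescape(text):
    # Turn literal \uXXXX escapes into the characters they stand for.
    return json.loads('"' + text + '"')


def parse_provinces(source):
    # Only look at the domestic layer, so that the similar-looking country
    # records after "caseOutsideList" are not matched by accident.
    start = source.find('"caseList":[')
    if start == -1:
        return []
    end = source.find('"caseOutsideList":[', start)
    body = source[start:end if end != -1 else len(source)]

    provinces = []
    for prov in PROVINCE_RE.finditer(body):
        cities = [
            {
                "city": unescape(city["city"]),
                "confirmed": city["confirmed"],
                "died": city["died"],
                "crued": city["crued"],
            }
            for city in CITY_RE.finditer(prov["cities"])
        ]
        provinces.append(
            {
                "area": unescape(prov["area"]),
                "confirmed": prov["confirmed"],
                "died": prov["died"],
                "crued": prov["crued"],
                "cities": cities,
            }
        )
    return provinces


for province in parse_provinces(SAMPLE):
    print(province["area"], province["confirmed"], [c["city"] for c in province["cities"]])
```

The values are kept as strings here because the source delivers them that way; convert them with `int()` only after checking for empty strings.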
Layer 2: countries
This layer holds the data for each country and for its provinces;
{"confirmed":"1","died":"","crued":"","relativeTime":"1588089600","confirmedRelative":"","curConfirm":"1","icuDisable":"1","area":"\u79d1\u6469\u7f57","subList":[]},
It is much like the domestic structure above: "subList":[] is again where the sub-region data goes;
This layer starts at "caseOutsideList":[
The end of this layer is padded with a pile of miscellaneous data. Apart from this: {"confirmed":"84387","died":"4643","cured":"78893","asymptomatic":"981","asymptomaticRelative":"25","unconfirmed":"9","relativeTime":"1588176000","confirmedRelative":"12","unconfirmedRelative":"3","curedRelative":"60","diedRelative":"0","icu":"38","icuRelative":"-3","overseasInput":"1670","unOverseasInputCumulative":"82715","overseasInputRelative":"6","unOverseasInputNewAdd":"6","curConfirm":"851","curConfirmRelative":"-48","icuDisable":"1"},"summaryDataOut":{"confirmed":"3242610","died":"230232","curConfirm":"2033124","cured":"979254","confirmedRelative":"64731","curedRelative":"42292","diedRelative":"5828","curConfirmRelative":"16611","relativeTime":"1588176000"},
there is a lot of other miscellaneous padding as well.
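Since each layer is really just a JSON array sitting behind a known marker, one way to skip over the padding is to find the marker and let a JSON decoder read exactly one value from there, ignoring whatever follows. Note that this swaps regex matching for Python's standard `json` module; it is a sketch under the assumption that the text after "caseOutsideList": is valid JSON, and `extract_array` and `to_int` are hypothetical helper names. The sample is trimmed down from the fragments quoted above.

```python
import json

# Trimmed sample: one country record followed by part of the trailing data.
SAMPLE = (
    r'"caseOutsideList":[{"confirmed":"1","died":"","crued":"","relativeTime":"1588089600",'
    r'"confirmedRelative":"","curConfirm":"1","icuDisable":"1",'
    r'"area":"\u79d1\u6469\u7f57","subList":[]}],'
    r'"summaryDataOut":{"confirmed":"3242610","died":"230232"},'
)


def extract_array(source, key):
    # Find '"key":' and decode the single JSON value that follows it.
    marker = '"' + key + '":'
    pos = source.find(marker)
    if pos == -1:
        return []
    rest = source[pos + len(marker):].lstrip()
    # raw_decode stops at the end of the first complete JSON value, so the
    # padding after the array never has to be parsed.
    value, _end = json.JSONDecoder().raw_decode(rest)
    return value


def to_int(text):
    # Many fields arrive as empty strings; treat those as zero.
    return int(text) if text else 0


countries = extract_array(SAMPLE, "caseOutsideList")
for country in countries:
    print(country["area"], to_int(country["confirmed"]), to_int(country["died"]))
```

The decoder also converts the \uXXXX escapes, so no separate unescape step is needed with this approach.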
Layer 3: continent blocks
The continent blocks start where the layer 2 padding ends, at "globalList":[
Each block opens with something like {"area":"\u4e9a\u6d32","subList":[
The country records inside it have this structure: {"confirmed":"15","died":"","crued":"","relativeTime":1588089600,"confirmedRelative":"","curConfirm":"15","country":"\u5854\u5409\u514b\u65af\u5766"}
Right after the closing ], comes the block's totals: "died":"14128","crued":"190055","confirmed":"442739","curConfirm":"238556","confirmedRelative":"11282"},
Then the next block begins with another {"area":"\u6b27\u6d32","subList":[
The blocks are: Asia, Europe, Africa, North America, South America, Oceania, and Other.
There is also one unfamiliar entry whose name, once decoded, means "Hot" (热门), which left UP thoroughly confused.
So we can treat the start of this Hot entry as the end of the data: {"area":"\u70ed\u95e8","subList":[
Alternatively, "foreignTrendList": can be used as the end of the Hot entry.
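Putting the layer 3 markers together, a rough regex sketch might look like the following. It is not UP's code: `parse_continents`, `CONTINENT_RE` and `COUNTRY_RE` are illustrative names, the order of the totals fields is assumed from the one fragment quoted above, and the sample is stitched together from those fragments with an empty placeholder for the Hot entry, since its real contents are not shown here.

```python
import json
import re

# Sample stitched from the fragments above. The Hot entry's contents are a
# placeholder, and the real response contains many more records.
SAMPLE = (
    r'"globalList":[{"area":"\u4e9a\u6d32","subList":['
    r'{"confirmed":"15","died":"","crued":"","relativeTime":1588089600,'
    r'"confirmedRelative":"","curConfirm":"15","country":"\u5854\u5409\u514b\u65af\u5766"}'
    r'],"died":"14128","crued":"190055","confirmed":"442739","curConfirm":"238556",'
    r'"confirmedRelative":"11282"},'
    r'{"area":"\u70ed\u95e8","subList":[]}],"foreignTrendList":'
)

HOT_MARKER = r'{"area":"\u70ed\u95e8"'  # literal text of the Hot entry's opening

CONTINENT_RE = re.compile(
    r'\{"area":"(?P<area>[^"]*)","subList":\[(?P<countries>[^\]]*)\],'
    r'"died":"(?P<died>\d*)","crued":"(?P<crued>\d*)",'
    r'"confirmed":"(?P<confirmed>\d*)","curConfirm":"(?P<curConfirm>\d*)"'
)
COUNTRY_RE = re.compile(
    r'\{"confirmed":"(?P<confirmed>\d*)","died":"(?P<died>\d*)","crued":"(?P<crued>\d*)"'
    r'[^{}]*?"country":"(?P<country>[^"]*)"\}'
)


def unescape(text):
    # Turn literal \uXXXX escapes into the characters they stand for.
    return json.loads('"' + text + '"')


def to_int(text):
    # Empty strings are common in this layer; treat them as zero.
    return int(text) if text else 0


def parse_continents(source):
    start = source.find('"globalList":[')
    if start == -1:
        return []
    # Stop at the Hot entry if present, otherwise at "foreignTrendList":.
    end = source.find(HOT_MARKER, start)
    if end == -1:
        end = source.find('"foreignTrendList":', start)
    body = source[start:end if end != -1 else len(source)]

    continents = []
    for block in CONTINENT_RE.finditer(body):
        countries = [
            {
                "country": unescape(c["country"]),
                "confirmed": to_int(c["confirmed"]),
                "died": to_int(c["died"]),
                "crued": to_int(c["crued"]),
            }
            for c in COUNTRY_RE.finditer(block["countries"])
        ]
        continents.append(
            {
                "area": unescape(block["area"]),
                "confirmed": to_int(block["confirmed"]),
                "died": to_int(block["died"]),
                "crued": to_int(block["crued"]),
                "curConfirm": to_int(block["curConfirm"]),
                "countries": countries,
            }
        )
    return continents


for continent in parse_continents(SAMPLE):
    print(continent["area"], continent["confirmed"], len(continent["countries"]))
```

Cutting the text off before the Hot entry keeps it from being counted as an extra continent.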
With these three layers analyzed, the data can be matched with regular expressions or any other method. The holiday wasn't long enough, so UP only built the domestic part.


Data structure txt download: https://lanzous.com/ic6bg0f
Domestic data update source code: https://lanzous.com/ic6bj7a
UP finally got two days off, so UP is off to get some rest~!