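No Code Required: Web Scraper, a Browser Extension That Scrapes Data Automatically
Web Scraper is a browser extension for scraping data. If you can't write code, you can use it to pull data from websites. I covered it in an earlier article, "No Code Required: Scraping the Zhihu Hot List/Topics/Answers/Columns and Douban Movies with the Chrome Extension Web Scraper". As another example, to scrape every video posted by 木魚水心 on Bilibili, you can import my sitemap below directly.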
{"_id":"bilibili_videos","startUrl":["https://space.bilibili.com/927587/video?tid=0&pn=[1-42:1]&keyword=&order=pubdate"],"selectors":[{"id":"row","parentSelectors":["_root"],"type":"SelectorElement","selector":"li.small-item","multiple":true},{"id":"視頻標題","parentSelectors":["row"],"type":"SelectorText","selector":"a.title","multiple":false,"regex":""},{"id":"視頻鏈接","parentSelectors":["row"],"type":"SelectorElementAttribute","selector":"a.cover","multiple":false,"extractAttribute":"href"},{"id":"視頻封面","parentSelectors":["row"],"type":"SelectorElementAttribute","selector":"a.cover div.b-img picture img","multiple":false,"extractAttribute":"src"},{"id":"視頻播放量","parentSelectors":["row"],"type":"SelectorText","selector":".play span","multiple":false,"regex":""},{"id":"視頻長度","parentSelectors":["row"],"type":"SelectorText","selector":"a.cover span.length","multiple":false,"regex":""},{"id":"發布時間","parentSelectors":["row"],"type":"SelectorText","selector":"span.time","multiple":false,"regex":""}]}
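The `pn=[1-42:1]` part of the start URL is Web Scraper's range syntax: pages 1 through 42, in steps of 1. To see which pages a sitemap will visit before running it, the range can be expanded by hand. The Python below is only a rough imitation of that expansion, not the extension's own code; the function name `expand_start_url` is made up for this sketch, and it handles a single range per URL.

```python
import re

# Matches [start-end] or [start-end:step], e.g. [1-42:1] or [001-100].
RANGE = re.compile(r"\[(\d+)-(\d+)(?::(\d+))?\]")


def expand_start_url(url):
    """Return the list of page URLs described by one range in `url`.

    Hypothetical helper: handles only the first range it finds.
    """
    match = RANGE.search(url)
    if match is None:
        return [url]
    start, end = int(match.group(1)), int(match.group(2))
    step = int(match.group(3) or 1)
    # A leading zero in the start value (e.g. 001) means zero-padded numbers.
    width = len(match.group(1)) if match.group(1).startswith("0") else 0
    prefix, suffix = url[:match.start()], url[match.end():]
    return [prefix + str(n).zfill(width) + suffix
            for n in range(start, end + 1, step)]


urls = expand_start_url(
    "https://space.bilibili.com/927587/video?tid=0&pn=[1-42:1]&keyword=&order=pubdate"
)
print(len(urls))
print(urls[0])
print(urls[-1])
```

If the uploader has published more pages of videos since the sitemap was written, raising the 42 in the start URL should be all that is needed.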

The exported Excel file contains each video's title, link, cover image, play count, length, and publish time. He published more than 1,200 videos between 2013 and 2023.
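If you later do want to crunch the exported numbers, a few lines of Python are enough. The sketch below assumes the data was exported as CSV rather than Excel, that the play count column keeps the sitemap id `視頻播放量`, and that large counts are displayed in a form like `1.5万` (units of ten thousand). The sample rows are made up for illustration and are not real export data.

```python
import csv
import io


def parse_play_count(text):
    """Turn a displayed count such as '800' or '1.5万' into an integer.

    Assumption: '万' (or '萬') is the only unit suffix that appears.
    """
    text = text.strip()
    if text.endswith(("万", "萬")):
        return round(float(text[:-1]) * 10_000)
    return int(text)


def total_plays(csv_text, column="視頻播放量"):
    """Sum the play count column over all rows of a CSV export."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum(parse_play_count(row[column]) for row in reader)


# Made-up sample in the shape of a CSV export (values are not real data).
sample = "web-scraper-order,視頻播放量\n1-1,1.5万\n1-2,800\n"
print(total_plays(sample))
```

For a real export, read the file with `open(path, encoding="utf-8-sig").read()` and pass the text to `total_plays`.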

Zhihu answers and articles work the same way. In theory, any data you can see on a web page can be scraped.
{"_id":"zhihu_zhuanlan","startUrl":["https://www.zhihu.com/people/zhi-shi-ku-21-42/posts?page=[1-4]"],"selectors":[{"id":"row","type":"SelectorElement","parentSelectors":["_root"],"selector":"div.List-item","multiple":true,"delay":0},{"id":"知乎標題","type":"SelectorText","parentSelectors":["row"],"selector":"h2.ContentItem-title","multiple":false,"regex":"","delay":0},{"id":"知乎鏈接","type":"SelectorElementAttribute","parentSelectors":["row"],"selector":"h2.ContentItem-title span a","multiple":false,"extractAttribute":"href","delay":0}]}
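Both sitemaps share one structure: a `row` element selector picks each list item, and the selectors whose parent is `row` become the columns of the export. Before importing a sitemap someone else wrote, it can help to check what it will produce. The sketch below reads a sitemap's JSON and lists its data columns, plus any selector whose parent does not exist. The function names and the small example sitemap are invented for illustration; only the field names (`selectors`, `id`, `type`, `parentSelectors`) come from the sitemaps above.

```python
import json


def output_columns(sitemap_text):
    """Ids of selectors that extract data (everything except element wrappers)."""
    sitemap = json.loads(sitemap_text)
    return [s["id"] for s in sitemap["selectors"]
            if s["type"] != "SelectorElement"]


def dangling_parents(sitemap_text):
    """Ids of selectors that name a parent which is not defined in the sitemap."""
    sitemap = json.loads(sitemap_text)
    known = {"_root"} | {s["id"] for s in sitemap["selectors"]}
    return [s["id"] for s in sitemap["selectors"]
            if any(parent not in known for parent in s["parentSelectors"])]


# Hypothetical sitemap with the same shape as the ones in this article.
example = json.dumps({
    "_id": "demo",
    "startUrl": ["https://example.com/posts?page=[1-2]"],
    "selectors": [
        {"id": "row", "type": "SelectorElement",
         "parentSelectors": ["_root"], "selector": "div.item", "multiple": True},
        {"id": "title", "type": "SelectorText",
         "parentSelectors": ["row"], "selector": "h2", "multiple": False},
        {"id": "link", "type": "SelectorElementAttribute",
         "parentSelectors": ["row"], "selector": "h2 a", "multiple": False,
         "extractAttribute": "href"},
    ],
})

print(output_columns(example))
print(dangling_parents(example))
```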
