Eurobeat Lyrics Search Tool


This search is based on the site https://www.eurobeat-prime.com/lyrics.php, which holds the vast majority of eurobeat lyrics.
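The Python script crawls the site and finds the songs whose lyrics contain the word you are looking for, reporting the title and artist of each. The results are saved to a CSV file: the first column is the id, the second is the song title. To use an id, append the number to "https://www.eurobeat-prime.com/lyrics.php?lyrics=".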
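As a rough sketch of how the saved ids can be used, the snippet below turns a line of the CSV back into a lyrics URL. The names LYRICS_URL, row_to_url and read_results are made up for illustration and are not part of the script; the layout (id, then title) follows the description above. A title may itself contain commas, so each line is split only at the first one.

```python
# Hypothetical helpers, not part of the original script.
LYRICS_URL = "https://www.eurobeat-prime.com/lyrics.php?lyrics="

def row_to_url(line):
    """Split one 'id,title' line at the first comma and build the page URL."""
    idx, title = line.rstrip("\n").split(",", 1)
    return title, LYRICS_URL + idx

def read_results(path="eblrc.csv"):
    """Read the file written by the script and return (title, url) pairs."""
    with open(path, encoding="utf-8") as f:
        return [row_to_url(line) for line in f if line.strip()]

# Example with a made-up row (placeholder title containing a comma)
print(row_to_url("2,Some Title, Part 2\n"))
```

Calling read_results() after a run gives one (title, url) pair for every match that was written to eblrc.csv.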
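To change the search term, edit the line a = re.search(r"\bShock\b|\bshock\b|\bSHOCK\b", lrc) and replace the word between each pair of \b markers, once for each of the three capitalizations.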
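If editing three alternatives by hand is tedious, one possible shortcut is to build the pattern from a single word. This is a sketch, not what the script does: make_pattern is a hypothetical helper, and re.IGNORECASE matches every casing (for example "sHoCk"), which is slightly broader than the three spellings listed in the original line.

```python
import re

def make_pattern(word):
    # re.escape keeps characters such as "." or "'" in the word literal
    return re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)

pattern = make_pattern("shock")
# Whole word in any casing is found
print(bool(pattern.search("What a SHOCK tonight")))
# Not found: \b requires a word boundary right after "shock"
print(bool(pattern.search("shockwave")))
```

Inside findEurobeat the call would then read a = pattern.search(lrc).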
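The 4967 in the line for i in range(1,4967) depends on the id of the newest lyrics currently on the site. To find it, look at the first row of "latest entries" on the lyrics page https://www.eurobeat-prime.com/lyrics.php. Note that range stops one short of its second argument, so that argument should be the newest id plus one.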
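The upper bound could also be derived instead of typed in by hand. The sketch below assumes that entries on the lyrics page are linked as lyrics.php?lyrics=<id>, which matches the URL form given above but has not been checked against the page's actual HTML. The function newest_id and the sample markup are invented for the example.

```python
import re

def newest_id(html):
    """Return the largest id found in links of the form lyrics.php?lyrics=<id>."""
    ids = [int(m) for m in re.findall(r"lyrics\.php\?lyrics=(\d+)", html)]
    return max(ids) if ids else None

# Made-up markup standing in for the "latest entries" list
sample = '<a href="lyrics.php?lyrics=4966">A</a> <a href="lyrics.php?lyrics=4965">B</a>'
last = newest_id(sample)
# range() excludes its end value, so add 1 to include the newest id
ids_to_scan = range(1, last + 1)
print(last, len(ids_to_scan))
```

In a real run the html argument would be resp.text from a request to the lyrics page, and newest_id returns None if no such link is found.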
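requests and bs4 have to be installed separately; pip install requests beautifulsoup4 is enough. If that is unfamiliar, a quick search will turn up instructions.
The code is as follows: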
import re,requests,bs4
from concurrent.futures import ThreadPoolExecutor
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36'}
# url = "https://www.eurobeat-prime.com/lyrics.php?artist=z"
# url = "https://www.eurobeat-prime.com/lyrics.php?lyrics=2"
def findEurobeat(url,idx):
    # Fetch one lyrics page and record it if the lyrics contain the search word
    resp = requests.get(url=url,headers=headers)
    # print(resp.text)
    soup = bs4.BeautifulSoup(resp.text,'html.parser')
    title = soup.select_one('tr:nth-child(3) > td > div > b').text  #body > table > tbody > tr > td > table.mtable2 > tbody > tr > td.mtopm > table > tbody > tr:nth-child(3) > td
    lrcs = soup.select('tr:nth-child(3) > td > div')  #body > table > tbody > tr > td > table.mtable2 > tbody > tr > td.mtopm > table > tbody > tr:nth-child(3) > td > div
    lrc = ""
    lrcl = lrcs[1].text.split('\n')
    # Skip the first two lines of the block and join the rest as the lyrics text
    for i,j in enumerate(lrcl):
        if i > 1:
            # print(i, j)
            lrc += j + "\n"
    # a = re.search(r"\bEurobeat\b|\beurobeat\b|\bEUROBEAT\b",lrc)
    a = re.search(r"\bShock\b|\bshock\b|\bSHOCK\b", lrc)
    if a:
        # print(lrc)
        print(idx+","+title)
        # Append "id,title" to the result file
        with open("eblrc.csv","a+",encoding='utf-8') as f:
            f.write(idx+","+title+"\n")
        # print(title)
with ThreadPoolExecutor(max_workers=64) as pool:  # multithreading
    for i in range(1,4967):  # 4967: see the note on the newest id above
        idx = str(i)
        url = "https://www.eurobeat-prime.com/lyrics.php?lyrics="+idx
        # findEurobeat(url,idx)
        pool.submit(findEurobeat,url,idx)
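
One thing to be aware of with pool.submit: an exception raised inside a worker (for instance when a page does not have the expected layout) is stored in the returned Future and is never printed unless result() is called. The sketch below uses a stand-in worker instead of real requests to show how results could be collected in the main thread, so that failures become visible and only one thread writes the file. The functions check_page and scan, and the fake failure for id 3, are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def check_page(idx):
    # Stand-in for findEurobeat: returns "id,title" on a match, None otherwise
    if idx == 3:
        raise ValueError("page layout not as expected")
    return f"{idx},Title {idx}" if idx % 2 == 0 else None

def scan(ids, workers=8):
    hits, failed = [], []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map each future back to the id it was submitted for
        futures = {pool.submit(check_page, i): i for i in ids}
        for fut in as_completed(futures):
            try:
                row = fut.result()  # re-raises any exception from the worker
            except Exception:
                failed.append(futures[fut])
            else:
                if row is not None:
                    hits.append(row)
    # Sorted only to make the output order deterministic
    return sorted(hits), sorted(failed)

hits, failed = scan(range(1, 7))
print(hits, failed)
```

In the real script the rows in hits would then be written to eblrc.csv from the main thread, and the ids in failed could be retried or inspected by hand.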