
Document Download Tutorial

2022-12-26 18:43 Author: SciTechSports

A downloader for several document sites

https://github.com/rty813/doc_downloader

The simple method

  1. Download docDownloader.zip (https://github.com/rty813/doc_downloader/releases/) and unzip it.

  2. Run docDownloader.exe.

  3. Enter the document's URL to start the download. The downloaded document is saved in the output subfolder.

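The advanced method

  1. Download all files of doc_downloader-master (using the GitZip for github Chrome extension) and unzip them.

  2. Install Python or Anaconda. Taking Anaconda as an example: open the Start menu, find Anaconda3 (64-bit), and run Anaconda Powershell Prompt (anaconda3) as administrator to open a terminal. Then change to the unzipped folder. Here the archive was downloaded and unzipped to D:\Download\doc_downloader-master, so enter the following in the terminal:

    D: (press Enter)

    cd D:\Download\doc_downloader-master\doc_downloader-master (press Enter)

  3. In the terminal, enter pip install -r requirements.txt (press Enter) to install the required packages. Note: if an error occurs when you run the tool, first check whether the chromedriver version is compatible with your Chrome version. If it is not, simply replace chromedriver.exe in the folder with a compatible version. See the [chromedriver download page](https://chromedriver.chromium.org/downloads).

  4. In the terminal, enter python docDownloader.py (press Enter), then enter the document's URL to start the download. The downloaded document is saved in the output subfolder.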
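
The chromedriver compatibility check in step 3 can be sketched in Python. This is a minimal sketch, not part of doc_downloader: the helper names are hypothetical, and it assumes the usual rule of thumb that chromedriver and Chrome are compatible when their major version numbers match. You supply the Chrome version yourself (for example, copied from Chrome's "About" page).

```python
import re
import subprocess
import sys


def major_version(version_text):
    """Return the leading number of the first dotted version in the text, or None."""
    match = re.search(r"(\d+)\.\d+\.\d+(?:\.\d+)?", version_text)
    return int(match.group(1)) if match else None


def is_compatible(chrome_version_text, driver_version_text):
    """Assumption: matching major versions means the pair is compatible."""
    chrome_major = major_version(chrome_version_text)
    driver_major = major_version(driver_version_text)
    return chrome_major is not None and chrome_major == driver_major


def driver_version(driver_path="chromedriver.exe"):
    """Ask chromedriver for its version string via its --version flag."""
    result = subprocess.run(
        [driver_path, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Usage (hypothetical script name): python check_driver.py 108.0.5359.125
    if len(sys.argv) == 2:
        chrome_text = sys.argv[1]
        driver_text = driver_version()
        verdict = "compatible" if is_compatible(chrome_text, driver_text) else "mismatch"
        print(f"Chrome {chrome_text} / {driver_text}: {verdict}")
```

Run it from the folder that contains chromedriver.exe. If it reports a mismatch, replace chromedriver.exe as described in step 3.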

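The PDFs downloaded with the methods above store each page as an image. To be able to copy text, you need to run OCR (optical character recognition) on the PDF.

Installing OCRmyPDF on Windows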

https://ocrmypdf.readthedocs.io/en/latest/installation.html#native-windows

You must install the following for Windows:

  • Python 3.8 (64-bit) or later

  • Tesseract 4.1.1 or later

  • Ghostscript 9.50 or later

Using the Chocolatey (https://chocolatey.org/) package manager, install the following when running in an Administrator command prompt:

  • choco install python3

  • choco install --pre tesseract

  • choco install ghostscript

  • choco install pngquant (optional)

The commands above will install Python 3.x (latest version), Tesseract, Ghostscript and pngquant. Chocolatey may also need to install the Windows Visual C++ Runtime DLLs or other Windows patches, and may require a reboot.

You may then use pip to install ocrmypdf. (This can be performed by a user or Administrator.):

  • pip install ocrmypdf

Chocolatey automatically selects appropriate versions of these applications. If you are installing them manually, please install 64-bit versions of all applications for 64-bit Windows, or 32-bit versions of all applications for 32-bit Windows. Mixing the “bitness” of these programs will lead to errors.

OCRmyPDF will check the Windows Registry and standard locations in your Program Files for third party software it needs (specifically, Tesseract and Ghostscript). To override the versions OCRmyPDF selects, you can modify the PATH environment variable. Follow these directions to change the PATH.
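
Before running OCR, it can help to confirm that the prerequisites listed above are visible from the terminal. The following is a rough sketch under stated assumptions: the executable names (tesseract, gswin64c, gswin32c, gs, pngquant) are the commonly used ones and may differ on your system, and the check only looks at PATH, whereas OCRmyPDF itself also searches the Windows Registry and Program Files. A "not found" result here therefore does not necessarily mean OCRmyPDF will fail.

```python
import shutil
import sys

# Assumed executable names; Ghostscript's console binary on Windows is
# typically gswin64c (64-bit) or gswin32c (32-bit), and gs elsewhere.
REQUIRED_TOOLS = {
    "Tesseract": ["tesseract"],
    "Ghostscript": ["gswin64c", "gswin32c", "gs"],
}
OPTIONAL_TOOLS = {
    "pngquant": ["pngquant"],
}


def find_tool(candidates, which=shutil.which):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None


def report(tools, which=shutil.which):
    """Map each tool label to the path found on PATH (None if missing)."""
    return {label: find_tool(names, which) for label, names in tools.items()}


def python_is_new_enough(version_info=sys.version_info):
    """The requirements above ask for Python 3.8 or later."""
    return tuple(version_info[:2]) >= (3, 8)


if __name__ == "__main__":
    print("Python 3.8+:", "yes" if python_is_new_enough() else "no")
    for label, path in report(REQUIRED_TOOLS).items():
        print(f"{label}: {path or 'not found on PATH'}")
    for label, path in report(OPTIONAL_TOOLS).items():
        print(f"{label} (optional): {path or 'not found on PATH'}")
```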

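Open the Anaconda terminal and enter

cd D:\Download\docDownloader\docDownloader\output (press Enter)

Name the PDF to be OCRed pic.pdf and the output file text.pdf. For a Chinese-language document, enter

ocrmypdf --force-ocr -l chi_sim pic.pdf text.pdf

OCR then starts, and the output text.pdf is written to the same folder.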
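
If the output folder holds several downloaded PDFs, the same command can be repeated for each file. The sketch below is one possible way to do that, not something the original tutorial provides: the function names and the "_text" output suffix are hypothetical choices, and it assumes ocrmypdf is on PATH and that the Tesseract language data for chi_sim is installed.

```python
import subprocess
from pathlib import Path


def build_ocr_command(input_pdf, output_pdf, language="chi_sim"):
    """Build the same ocrmypdf command line used in the tutorial."""
    return ["ocrmypdf", "--force-ocr", "-l", language, str(input_pdf), str(output_pdf)]


def output_name(input_pdf):
    """Hypothetical naming scheme: pic.pdf -> pic_text.pdf in the same folder."""
    path = Path(input_pdf)
    return path.with_name(path.stem + "_text.pdf")


def ocr_folder(folder, language="chi_sim"):
    """Run ocrmypdf on every PDF in the folder, skipping earlier OCR outputs."""
    for pdf in sorted(Path(folder).glob("*.pdf")):
        if pdf.stem.endswith("_text"):
            continue  # already an output of this script
        command = build_ocr_command(pdf, output_name(pdf), language)
        subprocess.run(command, check=True)  # raises if ocrmypdf fails


if __name__ == "__main__":
    # Process the PDFs in the current directory, e.g. the output subfolder.
    ocr_folder(".")
```

Run it from the output subfolder after the cd command above. Each input keeps its name and gains a searchable copy next to it.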





