languagewire/html-dumper
Latest stable version: 1.0.1
Composer install command:
composer require languagewire/html-dumper
Package description
A library which downloads pages as static HTML files and assets and dumps them on disk
README
HtmlDumper is a PHP library which downloads a copy of an HTML page and its assets into a target directory.
- Downloads HTML source code and transforms all URIs into relative paths, creating an updated index.html file.
- Parses HTML and fetches relevant resources: stylesheets, scripts, images, and videos.
- Also works with assets located within CSS files.
- Removes anchor links to external pages.
- Does not crawl pages beyond the initial URL.
// Load Composer's autoloader (assumes this script sits next to vendor/).
require __DIR__ . '/vendor/autoload.php';

$url = "https://example.com";
$targetDirectory = "/tmp/htmldump";

$downloader = new \LanguageWire\HtmlDumper\Service\PageDownloader();

if ($downloader->download($url, $targetDirectory)) {
    echo "Successfully downloaded $url in $targetDirectory";
}
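Because HtmlDumper does not crawl beyond the initial URL, dumping several pages means calling the downloader once per URL. The following is a minimal sketch of one way to do that, not part of the library: `targetDirectoryFor()` and `dumpPages()` are hypothetical helpers introduced here for illustration, and the directory naming scheme is an assumption. The sketch also assumes that `download()` returns a truthy value on success, as in the example above. The README does not say whether the library creates the target directory, so the sketch creates it up front.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: derive a separate target directory for each URL,
 * e.g. "<base>/example.com/docs_intro.html".
 */
function targetDirectoryFor(string $baseDirectory, string $url): string
{
    $host = parse_url($url, PHP_URL_HOST) ?: 'unknown-host';
    $path = trim((string) parse_url($url, PHP_URL_PATH), '/');
    // Flatten the URL path into a single, filesystem-safe directory name.
    $slug = $path === '' ? 'index' : preg_replace('/[^A-Za-z0-9._-]+/', '_', $path);

    return rtrim($baseDirectory, '/') . '/' . $host . '/' . $slug;
}

/**
 * Hypothetical helper: call $download once per URL and record the outcome.
 * $download is any callable taking (string $url, string $targetDirectory).
 *
 * @return array<string, bool> map of URL => whether the download reported success
 */
function dumpPages(array $urls, string $baseDirectory, callable $download): array
{
    $results = [];
    foreach ($urls as $url) {
        $target = targetDirectoryFor($baseDirectory, $url);
        // Assumption: it is not documented whether the library creates the
        // target directory, so create it here if it is missing.
        if (!is_dir($target) && !mkdir($target, 0777, true) && !is_dir($target)) {
            $results[$url] = false;
            continue;
        }
        $results[$url] = (bool) $download($url, $target);
    }

    return $results;
}
```

With the library installed, the downloader from the example above could be passed in as `dumpPages($urls, "/tmp/htmldump", [$downloader, 'download'])`. Taking the download step as a callable keeps the helpers usable without network access, for example in unit tests.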
Requirements
- PHP 7.2+
- PHP DOM Extension
- Composer
Installation
The recommended way to install HtmlDumper is through Composer.
composer require languagewire/html-dumper
Development
The build/ folder contains a Dockerfile which sets up all dependencies needed for local development and runs the unit tests and linters.
Customize build/.env like this:
cd build
cp .env.template .env
nano .env
And then run ./build.sh within the build/ folder:
cd build
./build.sh
License
HtmlDumper is made available under the MIT License (MIT). Please see the LICENSE file for more information.
Statistics
- Total downloads: 8.99k
- Monthly downloads: 0
- Daily downloads: 0
- Favorites: 1
- Clicks: 1
- Dependents: 0
- Suggesters: 0
Other information
- License: MIT
- Last updated: 2022-10-17