lobotomised/laravel-autocrawler
Latest stable version: 1.3.0
Composer install command:
composer require lobotomised/laravel-autocrawler
Package description
A tool that crawls your own Laravel installation and checks the HTTP status codes it returns
README
Using this package, you can check whether your application has broken links.
php artisan crawl
200 OK - http://myapp.test/
200 OK - http://myapp.test/login found on http://myapp.test/
200 OK - http://myapp.test/register found on http://myapp.test/
301 301 Moved Permanently - http://myapp.test/homepage found on http://myapp.test/register
404 Not Found - http://myapp.test/brokenlink found on http://myapp.test/register
200 OK - http://myapp.test/features found on http://myapp.test/
Crawl finished
Results:
Status 200: 4 founds
Status 301: 1 found
Status 404: 1 found
Installation
This package can be installed via Composer:
composer require --dev lobotomised/laravel-autocrawler
When crawling your site, the crawler automatically detects the URL your application is using. If it scans http://localhost instead, check that the APP_URL variable is properly configured in your .env file:
APP_URL="http://myapp.test"
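To confirm which value is actually set before crawling, you can read APP_URL back from the .env file. The following is a minimal sketch for a POSIX shell; the helper name app_url_from_env is hypothetical and not part of the package, and it assumes APP_URL is defined on a single line.

```shell
# Hypothetical helper (not part of the package): print the APP_URL value
# found in a dotenv file, with surrounding double quotes removed.
app_url_from_env() {
  env_file="${1:-.env}"
  # Print the value of the first APP_URL= line, then strip optional quotes.
  sed -n 's/^APP_URL=//p' "$env_file" | head -n 1 | sed 's/^"\(.*\)"$/\1/'
}
```

Calling app_url_from_env with no argument reads .env in the current directory. If it prints nothing or http://localhost, the crawler will probably not target the host you expect.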
Usage
Crawl a specific URL
By default, the crawler crawls the URL of your current Laravel installation. You can force a different URL with the --url option:
php artisan crawl --url=http://myapp.test/my-page
Concurrent connections
The crawler runs with 10 concurrent connections to speed up the crawling process. You can change that by passing the --concurrency option:
php artisan crawl --concurrency=5
Timeout
The request timeout is 30 seconds by default. Use the --timeout option to change this value:
php artisan crawl --timeout=10
Ignore robots.txt
By default, the crawler respects the rules in robots.txt. These rules can be ignored with the --ignore-robots option:
php artisan crawl --ignore-robots
External links
When the crawler finds an external link, it checks that link too. This can be disabled with the --ignore-external-links option:
php artisan crawl --ignore-external-links
Log non-2xx or non-3xx status codes
By default, the crawler only prints its results to your console. You can log every response whose status code is neither 2xx nor 3xx to a file with the --output option. The results are stored in storage/autocrawler/output.txt:
php artisan crawl --output
The output.txt file will look like this:
403 Forbidden - http://myapp.test/dashboard found on http://myapp.test/home
404 Not Found - http://myapp.test/brokenlink found on http://myapp.test/register
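Once the log exists, a short script can summarize it per status code. The sketch below assumes every line starts with the numeric status code, as in the sample above; summarize_crawl_log is a hypothetical helper, not something the package provides.

```shell
# Hypothetical helper (not part of the package): count logged URLs per
# HTTP status code, assuming each line begins with the numeric status.
summarize_crawl_log() {
  log_file="${1:-storage/autocrawler/output.txt}"
  # Field 1 is the status code (e.g. 404); blank lines are skipped via NF.
  awk 'NF { count[$1]++ } END { for (code in count) print code ": " count[code] }' "$log_file" | sort
}
```

Run without arguments, it reads storage/autocrawler/output.txt and prints one "code: count" line per status, sorted by code.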
Fail when non-2xx or non-3xx responses are found
By default, the command's exit code is 0. With the --fail-on-error option, the command exits with code 1 to indicate that it has failed:
php artisan crawl --fail-on-error
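The exit code is what lets scripts react to a failed crawl. As a rough sketch, the wrapper below runs whatever command it is given and reports on its exit status; check_crawl is a hypothetical name introduced for illustration and is not part of the package.

```shell
# Hypothetical wrapper (not part of the package): run the given command,
# print a short verdict, and propagate a non-zero exit code to the caller.
check_crawl() {
  if "$@"; then
    echo "crawl passed"
  else
    status=$?  # exit status of the command that was just run
    echo "crawl failed (exit code $status)"
    return "$status"
  fi
}

# Example call from a deploy or pre-push script:
# check_crawl php artisan crawl --fail-on-error
```

Because the wrapper returns the same non-zero status, a calling script or CI step still stops when the crawl fails.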
Launch the robot interactively
You can also configure the crawler interactively with the --interactive option:
php artisan crawl --interactive
Working with GitHub Actions
To execute the crawler, you first need to start a web server. You can choose to install Apache or nginx. Here is an example using the PHP built-in web server.
If the crawl finds any non-2xx or non-3xx response, the action fails and the results are stored as an artifact of the Action.
steps:
  - uses: actions/checkout@v3
  - name: Prepare The Environment
    run: cp .env.example .env
  - name: Install Composer Dependencies
    run: composer install
  - name: Generate Application Key
    run: php artisan key:generate
  - name: Install npm Dependencies
    run: npm ci
  - name: Compile assets
    run: npm run build
  - name: Start php built-in webserver
    run: (php artisan serve &) || /bin/true
  - name: Crawl website
    run: php artisan crawl --url=http://localhost:8000/ --fail-on-error --output
  - name: Upload artifacts
    if: failure()
    uses: actions/upload-artifact@master
    with:
      name: Autocrawler
      path: ./storage/autocrawler
Documentation
All options and further information are available with the command:
php artisan crawl --help
Alternatives
This package is heavily inspired by spatie/http-status-check. The difference is that spatie/http-status-check is installed globally, whereas this package is installed as a project dependency.
Testing
First, start the included Node HTTP server in a separate terminal:
make start
Then to run the tests:
make test
Changelog
Please see CHANGELOG for more information on what has changed recently.
License
The MIT License (MIT). Please see License File for more information.
Statistics
- Total downloads: 39.89k
- Monthly downloads: 0
- Daily downloads: 0
- Favorites: 2
- Clicks: 1
- Dependent projects: 0
- Suggesters: 0
Other information
- License: MIT
- Last updated: 2022-08-16