chrisullyott/php-url-extractor
Latest stable version: v0.2.3
Composer install command:
composer require chrisullyott/php-url-extractor
Package description
Extract URLs from HTML content.
README documentation
Extract URLs from HTML content, applying optional filters.
Installation
With Composer:
$ composer require chrisullyott/php-url-extractor
Usage
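// Load the HTML to scan.
$html = file_get_contents('about-us.html');
$extractor = new UrlExtractor($html);
// Keep only URLs local to this site and build absolute URLs for them.
$extractor->setHomeUrl('http://www.site.com');
// Keep only URLs that have a file extension.
$extractor->setFilesOnly(true);
$urls = $extractor->getUrls();
print_r($urls);
Array
(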
[0] => stdClass Object
(
[attribute] => href
[value] => /_assets/img/icons/favicon-96.png
[url] => https://www.site.com/_assets/img/icons/favicon-96.png
)
...
Options
setAttributeFilter (array)
The #getUrls method creates a DOMDocument and checks given element attributes, such as src and href, for URLs you might be interested in. Use #setAttributeFilter to override the default set of attributes with your own.
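The attribute scan described above can be sketched with PHP's built-in DOM extension. This is only an illustration of the general technique, not the library's actual implementation; the helper name collectAttributeValues and the sample markup are hypothetical.

```php
<?php
// Hypothetical helper (not part of php-url-extractor): walk every element
// and collect the values of the requested attributes, in document order.
function collectAttributeValues(string $html, array $attributes): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; keep libxml warnings out of the output.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $found = [];
    foreach ($doc->getElementsByTagName('*') as $element) {
        foreach ($attributes as $attribute) {
            if ($element->hasAttribute($attribute)) {
                $found[] = [
                    'attribute' => $attribute,
                    'value'     => $element->getAttribute($attribute),
                ];
            }
        }
    }
    return $found;
}

// Sample markup, made up for this sketch.
$html = '<html><head><link href="/style.css"></head>'
      . '<body><img src="/logo.png"><a href="/about">About</a></body></html>';

$values = collectAttributeValues($html, ['src', 'href']);
```

Passing a different attribute list, for example ['data-src'], mirrors what overriding the default set with #setAttributeFilter is described as doing.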
setHomeUrl (string)
Providing a home URL filters results to those local to the domain. Any relative URL beginning with one slash / and not two slashes is considered local as well. Setting this also builds the url property (an absolute URL) for the objects returned by the #getUrls method.
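The locality rule above can be sketched as follows. The functions isLocalUrl and toAbsoluteUrl are hypothetical names used for illustration, and the host comparison is an assumption about what "local to the domain" means; the library may decide this differently.

```php
<?php
// Hypothetical helper: true for values starting with exactly one slash.
function isRootRelative(string $value): bool
{
    return $value !== '' && $value[0] === '/' && substr($value, 0, 2) !== '//';
}

// Hypothetical helper: a URL is treated as local when it is root-relative
// or when its host equals the home URL's host (assumed comparison rule).
function isLocalUrl(string $value, string $homeUrl): bool
{
    if (isRootRelative($value)) {
        return true;
    }
    $homeHost  = parse_url($homeUrl, PHP_URL_HOST);
    $valueHost = parse_url($value, PHP_URL_HOST);

    return is_string($homeHost)
        && is_string($valueHost)
        && strcasecmp($homeHost, $valueHost) === 0;
}

// Hypothetical helper: prefix root-relative values with the home URL.
function toAbsoluteUrl(string $value, string $homeUrl): string
{
    return isRootRelative($value) ? rtrim($homeUrl, '/') . $value : $value;
}

$home = 'http://www.site.com';
$local    = isLocalUrl('/_assets/img/icons/favicon-96.png', $home);
$external = isLocalUrl('//cdn.example.com/lib.js', $home);
$absolute = toAbsoluteUrl('/_assets/img/icons/favicon-96.png', $home);
```

The two-slash check matters because a value such as //cdn.example.com/lib.js is protocol-relative and points at another host, even though it begins with a slash.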
setAlternateDomains (array)
Used with #setHomeUrl. If set, the returned URLs also include those whose domain matches an entry in the array. Entries may be plain strings, like media.site.com, and/or regular expressions, like /.*\.site\.com/.
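One way to match a host against a mixed list of strings and regular expressions is sketched below. The helper hostMatchesAny is hypothetical, and the rule for telling a regular expression from a plain string (an entry wrapped in slashes) is an assumption, not something the README specifies.

```php
<?php
// Hypothetical helper: check a host against plain strings and regex entries.
function hostMatchesAny(string $host, array $patterns): bool
{
    foreach ($patterns as $pattern) {
        // Assumption: entries delimited by slashes are regular expressions.
        $isRegex = strlen($pattern) > 2
            && $pattern[0] === '/'
            && substr($pattern, -1) === '/';

        if ($isRegex) {
            if (preg_match($pattern, $host) === 1) {
                return true;
            }
        } elseif (strcasecmp($pattern, $host) === 0) {
            return true;
        }
    }
    return false;
}

$alternates = ['media.site.com', '/.*\.site\.com/'];

$exact    = hostMatchesAny('media.site.com', $alternates);
$wildcard = hostMatchesAny('cdn.site.com', $alternates);
$other    = hostMatchesAny('www.other.org', $alternates);
```

Note that an unanchored pattern such as /.*\.site\.com/ also matches a host like x.site.com.attacker.net; anchoring it as /^.*\.site\.com$/ is stricter.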
setFilesOnly (boolean)
If true, only URLs that have a file extension are returned.
setIgnoredExtensions (array)
Used with #setFilesOnly. Excludes URLs whose file extension is found in the array.
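The combined effect of a files-only filter and an ignored-extension list can be sketched like this. The helpers urlExtension and keepFileUrl are hypothetical, and the case-insensitive comparison is an assumption rather than documented behavior.

```php
<?php
// Hypothetical helper: lower-cased extension of the URL's path, or '' if none.
function urlExtension(string $url): string
{
    // Use only the path so query strings and fragments are not misread.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return '';
    }
    return strtolower(pathinfo($path, PATHINFO_EXTENSION));
}

// Hypothetical helper: keep URLs that have an extension not in the ignore list.
function keepFileUrl(string $url, array $ignoredExtensions): bool
{
    $extension = urlExtension($url);

    return $extension !== '' && !in_array($extension, $ignoredExtensions, true);
}

$ignored = ['html', 'php'];

$icon    = keepFileUrl('/_assets/img/icons/favicon-96.png', $ignored);
$page    = keepFileUrl('/about-us', $ignored);
$script  = keepFileUrl('/index.php?x=1', $ignored);
```

Here the icon URL is kept, the extensionless page is dropped by the files-only rule, and the .php URL is dropped by the ignore list.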
Statistics
- Total downloads: 2.48k
- Monthly downloads: 0
- Daily downloads: 0
- Favorites: 2
- Clicks: 0
- Dependent projects: 0
- Recommendations: 0
Other information
- License: MIT
- Last updated: 2018-01-18