jerodev/diggy
Latest stable version: 2.1
Composer install command:
composer require jerodev/diggy
Package description
A fluent PHP web scraper
README
Diggy is a simple wrapper around the PHP DOM extension that allows finding elements using simple query selectors and fail-proof chaining.
Requirements
- PHP 8.1
Getting started
Diggy includes a simple webclient that uses Guzzle under the hood to download a page and return a NodeCollection object. However, you can use any webclient you prefer and pass a DOMNode or DOMNodeList object to the NodeCollection constructor.
$client = new \Jerodev\Diggy\WebClient();
$page = $client->get('https://www.deviaene.eu/');

$socials = $page->first('#social')->querySelector('a span')->texts();
var_dump($socials);

// [
//     'GitHub',
//     'Twitter',
//     'Email',
//     'LinkedIn',
// ]
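If you bring your own HTTP client, the piece you need to produce yourself is the DOMNode or DOMNodeList. The sketch below uses only PHP's built-in DOM extension on an inline HTML string (made up for this example) to build a DOMNodeList. The hand-off to the NodeCollection constructor is only described in a comment, because the exact class name and namespace are not stated in this README.

```php
<?php

// Inline markup standing in for a page downloaded with any HTTP client.
$html = '<ul id="social">'
    . '<li><a href="#"><span>GitHub</span></a></li>'
    . '<li><a href="#"><span>Email</span></a></li>'
    . '</ul>';

// Collect libxml warnings internally instead of emitting PHP warnings,
// which is useful for imperfect real-world markup.
libxml_use_internal_errors(true);

$document = new DOMDocument();
$document->loadHTML($html);
libxml_clear_errors();

// DOMXPath::query() returns a DOMNodeList.
$xpath = new DOMXPath($document);
$nodeList = $xpath->query('//ul[@id="social"]//a/span');

// Assumption: $nodeList (or a single DOMNode such as $document) is the kind
// of value the NodeCollection constructor mentioned above accepts.
$labels = [];
foreach ($nodeList as $node) {
    $labels[] = $node->textContent;
}
```

Here `$labels` ends up holding the text of each matched `span`, which is roughly what the `texts()` call in the example above returns for a real page.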
Available functions
These are the available functions on a NodeCollection object. All functions that do not return a native value can be chained without having to worry if there are nodes in the collection or not.
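To illustrate what "chaining without worrying about empty collections" can look like, here is a minimal sketch of the general pattern. `SafeNodes` and its methods are hypothetical and written for this example only. This is not Diggy's implementation, and the choice to return `null` from `text()` on an empty collection is an assumption of the sketch.

```php
<?php

// Hypothetical illustration class; NOT part of Diggy.
final class SafeNodes
{
    /** @param DOMNode[] $nodes */
    public function __construct(private array $nodes = [])
    {
    }

    // Always returns a collection (possibly empty), so the next call is safe.
    public function first(): self
    {
        return new self(array_slice($this->nodes, 0, 1));
    }

    // Collects direct child elements with the given tag name.
    public function children(string $tag): self
    {
        $found = [];
        foreach ($this->nodes as $node) {
            foreach ($node->childNodes as $child) {
                if ($child instanceof DOMElement && $child->tagName === $tag) {
                    $found[] = $child;
                }
            }
        }

        return new self($found);
    }

    // Native return value: null when the collection is empty (sketch assumption).
    public function text(): ?string
    {
        return $this->nodes === [] ? null : $this->nodes[0]->textContent;
    }
}

$document = new DOMDocument();
$document->loadXML('<nav><a>Home</a><a>About</a></nav>');
$root = new SafeNodes([$document->documentElement]);

// A chain that finds something.
$hit = $root->children('a')->first()->text();

// A chain whose first step matches nothing; later calls still work.
$miss = $root->children('section')->first()->children('a')->text();
```

Because every non-terminal method returns another collection object, a missing element never produces a "call to a member function on null" error in the middle of a chain.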
attribute(string $name)
Returns the value of the attribute of the first element in the collection if available.
$nodes->attribute('href');
count()
Returns the number of elements in the current node collection.
$nodes->count();
each(string $selector, closure $closure, ?int $max = null)
Loops over all DOM elements in the current collection and executes a closure for each element. The return value of this function is an array of values returned from the closure.
$nodes->each('a', static function (NodeFilter $node) {
    return $node->attribute('href');
});
exists(?string $selector = null)
Indicates if an element exists in the collection. If a selector is given, the current nodes will first be filtered.
$nodes->exists('a.active');
filter(closure $closure)
Filters the current node collection based on a given closure.
$nodes->filter(static function (NodeFilter $node) {
    return $node->text() === 'foo';
});
first(?string $selector = null)
Returns the first element of the node collection. If a selector is given, the current nodes will first be filtered.
$nodes->first('a.active');
is(string $nodeName)
Indicates if the first element in the current collection has a specified tag name.
$nodes->is('div');
last(?string $selector = null)
Returns the last element of the node collection. If a selector is given, the current nodes will first be filtered.
$nodes->last('a.active');
nodeName()
Returns the tag name of the first element in the current node collection.
$nodes->nodeName();
nth(int $index, ?string $selector = null)
Returns the nth element of the node collection, starting at 0. If a selector is given, the current nodes will first be filtered.
$nodes->nth(1, 'a.active');
querySelector(string $selector)
Finds all elements in the current node collection matching this CSS query selector.
$nodes->querySelector('a.active');
text(?string $selector = null)
Returns the inner text of the first element in the node collection. If a selector is given, the current nodes will first be filtered.
$nodes->text('p.description');
texts()
Returns an array containing the inner text of every root element in the collection.
$nodes->querySelector('nav > a')->texts();
whereHas(closure $closure)
Filters the current node collection, keeping only nodes that contain child nodes matching the filter described by the closure.
$nodes->whereHas(static function (NodeFilter $node) {
    return $node->first('a[href]');
});
whereHasAttribute(string $key, ?string $value = null)
Filters the current node collection by the existence of a specific attribute. If a value is given the collection is also filtered by the value of this attribute.
$nodes->whereHasAttribute('href');
whereHasText(?string $value = null, bool $trim = true, bool $exact = false)
Filters the current node collection by the existence of inner text. Setting a value will also filter the nodes by the actual inner text based on $trim and $exact.
| option | function |
|---|---|
| $trim | Indicates the inner text value should be trimmed before it is matched against $value. |
| $exact | Indicates the inner text value should match $value exactly. |
$nodes->whereHasText('foo');
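The interaction between `$value`, `$trim` and `$exact` can be sketched as a plain predicate on a string. `matchesText` is a hypothetical helper written for this example and is not taken from Diggy. In particular, treating a non-exact match as a "contains" check is an assumption, since the README does not say how non-exact matching works.

```php
<?php

// Hypothetical helper; NOT Diggy's implementation.
function matchesText(string $text, ?string $value = null, bool $trim = true, bool $exact = false): bool
{
    if ($trim) {
        $text = trim($text);
    }

    // Without a value, only the existence of inner text is checked.
    if ($value === null) {
        return $text !== '';
    }

    // Assumption: a non-exact match means "the text contains the value".
    return $exact ? $text === $value : str_contains($text, $value);
}

$contains = matchesText('  foo bar ', 'foo');               // substring match
$exactTrimmed = matchesText('  foo ', 'foo', true, true);   // trimmed, then exact
$exactRaw = matchesText('  foo ', 'foo', false, true);      // whitespace kept
$whitespaceOnly = matchesText('   ');                       // empty after trimming
```

Under these assumptions, trimming matters most together with `$exact`, because surrounding whitespace alone is enough to break an exact comparison.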
xPath(string $selector)
Finds all elements in the current node collection matching this XPath query selector.
$nodes->xPath('//nav/a[@href]');
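For readers less familiar with XPath, the expression `//nav/a[@href]` selects `a` elements that are direct children of a `nav` element and carry an `href` attribute. The sketch below checks this with PHP's built-in DOMXPath on a small made-up document. It does not use Diggy, and how Diggy scopes the expression to the current collection is not covered here.

```php
<?php

$document = new DOMDocument();
$document->loadXML(
    '<body>'
    . '<nav>'
    . '<a href="/">Home</a>'                         // direct child with href: selected
    . '<a>Disabled</a>'                              // no href: skipped
    . '<span><a href="/deep">Deep</a></span>'        // not a direct child of nav: skipped
    . '</nav>'
    . '<a href="/footer">Footer</a>'                 // outside nav: skipped
    . '</body>'
);

$xpath = new DOMXPath($document);

$matches = [];
foreach ($xpath->query('//nav/a[@href]') as $anchor) {
    $matches[] = $anchor->getAttribute('href');
}
```

Only the first link is collected, which shows how the `/` step and the `[@href]` predicate each narrow the result.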
Statistics
- Total downloads: 3.21k
- Monthly downloads: 0
- Daily downloads: 0
- Favorites: 3
- Clicks: 1
- Dependent projects: 0
- Recommendations: 0
Other information
- License: MIT
- Last updated: 2026-01-04