承接 ava239/nokogiri 相关项目开发

从需求分析到上线部署,全程专人跟进,保证项目质量与交付效率

邮箱:yvsm@zunyunkeji.com | QQ:316430983 | 微信:yvsm316

ava239/nokogiri

最新稳定版本:2.0.2

Composer 安装命令:

composer require ava239/nokogiri

包简介

fork of HTML Parser at http://github.com/olamedia/nokogiri

README 文档

README

PHP Composer

Attention: New version can break compatibility, in that case use previous version under the v1.0 branch or tag which supports even php 5.4+

\nokogiri class is left for compatibility

In English На русском

HTML parser

This library is a fast HTML parser, which can work with invalid code (errors are ignored).
Under the hood is used LibXML.
As the input you can use HTML string in UTF-8 encoding or DOMDocument.
For the querying elements CSS selectors are used, which are transformed to XPath expressions internally.

Usage

Loading HTML

HTML errors are ignored

  • From HTML string $saw = new \nokogiri($html); $saw = \nokogiri::fromHtml($html);
  • From DOM elements $saw = new \nokogiri($dom); $saw = \nokogiri::fromDom($dom);

get($cssSelector)

$cssSelector elements have the following format: tagName[attribute=value]#elementId.className:pseudoSelector(expression)

$saw->get('div > a[rel=bookmark]')->toArray();

toArray()

Returns underlying DOM structure as an array.
Values are attributes, text content under #text key and child elements under numeric keys

toXml()

Returns HTML string

getDom() toDom()

Returns DOMDocument. Given true as the first argument - can also return DOMNodeList or DOMElement

Iteration over found elements

foreach ($saw->get('#sidebar a.topic') as $link){
    var_dump($link['#text']);
}

Implemented selectors

  • tag
  • .class
  • #id
  • [attr]
  • [attr=value]
  • :root
  • :empty
  • :first-child
  • :last-child
  • :first-of-type
  • :last-of-type
  • :only-of-type
  • :nth-child(a)
  • :nth-child(an+b)
  • :nth-child(even/odd)

Requirements

  • DOM
  • libxml >=2.9.0
  • PHP >= 7.3

License

MIT

What's new

2.0.0

  • Minimal PHP version 7.3
  • Minimal LibXML version 2.9.0
  • Complete refactoring
  • Partially changed behaviour, can break compatibility
  • HTML loading behaviour changed
  • Test coverage
  • Fixed work of nth-child and other selectors
  • Incorrect selectors now throw exceptions
  • New selectors added

1.0.0

  • First version, 2011
  • Minimal PHP version 5.4

统计信息

  • 总下载量: 7
  • 月度下载量: 0
  • 日度下载量: 0
  • 收藏数: 0
  • 点击次数: 0
  • 依赖项目数: 0
  • 推荐数: 0

GitHub 信息

  • Stars: 0
  • Watchers: 0
  • Forks: 63
  • 开发语言: PHP

其他信息

  • 授权协议: MIT
  • 更新时间: 2024-01-18