tomaj/meta-scraper

Latest stable version: 4.3.0

Composer install command:

composer require tomaj/meta-scraper

Package description

Page meta scraper library

README documentation

Page meta scraper parses meta information from a page.

Installation

Via Composer:

composer require tomaj/meta-scraper

How to use

Example:

use Tomaj\Scraper\Scraper;
use Tomaj\Scraper\Parser\OgParser;

$scraper = new Scraper();
$parsers = [new OgParser()];
$meta = $scraper->parse(file_get_contents('http://www.google.com/'), $parsers);

var_dump($meta);

Alternatively, you can use the parseUrl method, which fetches the page internally using the Guzzle library:

use Tomaj\Scraper\Scraper;
use Tomaj\Scraper\Parser\OgParser;

$scraper = new Scraper();
$parsers = [new OgParser()];
$meta = $scraper->parseUrl('http://www.google.com/', $parsers);

var_dump($meta);

Parsers

The package includes three parsers, and you can create your own by implementing the Tomaj\Scraper\Parser\ParserInterface interface.

The included parsers:

  • Tomaj\Scraper\Parser\OgParser - based on og (Open Graph) meta attributes in HTML (built on regular expressions)
  • Tomaj\Scraper\Parser\OgDomParser - also based on og (Open Graph) meta attributes in HTML (built on the PHP DOM extension)
  • Tomaj\Scraper\Parser\SchemaParser - based on the schema JSON structure
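
To make the DOM-based approach concrete, here is a minimal standalone sketch of reading Open Graph properties with the PHP DOM extension. It is not the code used by OgDomParser; the function name extractOpenGraph and the sample HTML are assumptions made for illustration only.

```php
<?php
// Illustrative sketch only: a hypothetical helper, not part of tomaj/meta-scraper.
function extractOpenGraph(string $html): array
{
    $dom = new DOMDocument();
    // Real-world HTML is rarely valid, so collect libxml warnings instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $result = [];
    foreach ($dom->getElementsByTagName('meta') as $meta) {
        $property = $meta->getAttribute('property');
        // Only <meta property="og:..."> tags are of interest here.
        if (strpos($property, 'og:') === 0) {
            $key = substr($property, 3);
            // Keep the first occurrence of each property.
            if (!isset($result[$key])) {
                $result[$key] = $meta->getAttribute('content');
            }
        }
    }
    return $result;
}

$html = '<html><head>'
    . '<meta property="og:title" content="Example title">'
    . '<meta property="og:description" content="Example description">'
    . '<meta name="viewport" content="width=device-width">'
    . '</head><body></body></html>';

$og = extractOpenGraph($html);
var_dump($og);
```

A regular-expression parser would look for the same property and content attribute pairs but match them as text, which avoids the DOM dependency at the cost of being more sensitive to unusual markup.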

You can combine these parsers. Data that the first parser does not find is filled in with data from the second parser.

use Tomaj\Scraper\Scraper;
use Tomaj\Scraper\Parser\SchemaParser;
use Tomaj\Scraper\Parser\OgParser;

$scraper = new Scraper();
$parsers = [new SchemaParser(), new OgParser()];
$meta = $scraper->parseUrl('http://www.google.com/', $parsers);

var_dump($meta);
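
The fallback behaviour described above can be pictured with plain arrays. This is only a sketch of the idea, not the library's merge logic: the function mergeWithFallback, the field names, and the sample values are all hypothetical.

```php
<?php
// Illustrative sketch only: plain arrays stand in for parser results.
function mergeWithFallback(array ...$results): array
{
    $merged = [];
    foreach ($results as $result) {
        foreach ($result as $field => $value) {
            // Empty values are never stored, so a later result can still fill the field.
            $isEmpty = $value === null || $value === '' || $value === [];
            // Earlier results take priority: a field is set only once.
            if (!$isEmpty && !isset($merged[$field])) {
                $merged[$field] = $value;
            }
        }
    }
    return $merged;
}

// Hypothetical output of a first parser: it found a title but nothing else.
$fromFirst = ['title' => 'Schema title', 'description' => null, 'author' => ''];
// Hypothetical output of a second parser.
$fromSecond = [
    'title' => 'OG title',
    'description' => 'OG description',
    'image' => 'https://example.com/a.png',
];

$meta = mergeWithFallback($fromFirst, $fromSecond);
var_dump($meta);
```

In this sketch the title comes from the first result, while the description and image come from the second, which mirrors the effect of listing SchemaParser before OgParser.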

Statistics

  • Total downloads: 81.76k
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 6
  • Clicks: 1
  • Dependents: 0
  • Suggesters: 0

GitHub information

  • Stars: 6
  • Watchers: 1
  • Forks: 4
  • Language: PHP

Other information

  • License: MIT
  • Last updated: 2015-07-23