承接 fabpot/goutte 相关项目开发

从需求分析到上线部署,全程专人跟进,保证项目质量与交付效率

邮箱:yvsm@zunyunkeji.com | QQ:316430983 | 微信:yvsm316

fabpot/goutte

最新稳定版本:v4.0.3

Composer 安装命令:

composer require fabpot/goutte

包简介

A simple PHP Web Scraper

关键字:

README 文档

README

Goutte is a screen scraping and web crawling library for PHP.

Goutte provides a nice API to crawl websites and extract data from the HTML/XML responses.

WARNING: This library is deprecated. As of v4, Goutte became a simple proxy to the HttpBrowser class from the Symfony BrowserKit component. To migrate, replace Goutte\Client by Symfony\Component\BrowserKit\HttpBrowser in your code.

Requirements

Goutte depends on PHP 7.1+.

Installation

Add fabpot/goutte as a require dependency in your composer.json file:

composer require fabpot/goutte

Usage

Create a Goutte Client instance (which extends Symfony\Component\BrowserKit\HttpBrowser):

use Goutte\Client;

$client = new Client();

Make requests with the request() method:

// Go to the symfony.com website
$crawler = $client->request('GET', 'https://www.symfony.com/blog/');

The method returns a Crawler object (Symfony\Component\DomCrawler\Crawler).

To use your own HTTP settings, you may create and pass an HttpClient instance to Goutte. For example, to add a 60 second request timeout:

use Goutte\Client;
use Symfony\Component\HttpClient\HttpClient;

$client = new Client(HttpClient::create(['timeout' => 60]));

Click on links:

// Click on the "Security Advisories" link
$link = $crawler->selectLink('Security Advisories')->link();
$crawler = $client->click($link);

Extract data:

// Get the latest post in this category and display the titles
$crawler->filter('h2 > a')->each(function ($node) {
    print $node->text()."\n";
});

Submit forms:

$crawler = $client->request('GET', 'https://github.com/');
$crawler = $client->click($crawler->selectLink('Sign in')->link());
$form = $crawler->selectButton('Sign in')->form();
$crawler = $client->submit($form, ['login' => 'fabpot', 'password' => 'xxxxxx']);
$crawler->filter('.flash-error')->each(function ($node) {
    print $node->text()."\n";
});

More Information

Read the documentation of the BrowserKit, DomCrawler, and HttpClient Symfony Components for more information about what you can do with Goutte.

Pronunciation

Goutte is pronounced goot i.e. it rhymes with boot and not out.

Technical Information

Goutte is a thin wrapper around the following Symfony Components: BrowserKit, CssSelector, DomCrawler, and HttpClient.

License

Goutte is licensed under the MIT license.

统计信息

  • 总下载量: 54.29M
  • 月度下载量: 0
  • 日度下载量: 0
  • 收藏数: 9342
  • 点击次数: 1
  • 依赖项目数: 374
  • 推荐数: 4

GitHub 信息

  • Stars: 9212
  • Watchers: 349
  • Forks: 1028
  • 开发语言: PHP

其他信息

  • 授权协议: MIT
  • 更新时间: 2026-01-04