bitandblack/sitemap

Latest stable version: 2.3.0

Composer install command:

composer require bitandblack/sitemap

Package description

Creates a sitemap.xml by parsing the whole website.

README documentation

Bit&Black Sitemap

Creates a sitemap.xml by parsing the whole website including all language versions and all images.

If multiple language versions are found, multiple XML files will be written.

Installation

This library is made for use with Composer. Add it to your project by running $ composer require bitandblack/sitemap.

Usage

Auto-generation of a sitemap for a whole website

Set up the sitemap generation like this:

<?php

use BitAndBlack\Sitemap\Config\YamlConfig;
use BitAndBlack\Sitemap\SitemapCrawler;
use BitAndBlack\Sitemap\Writer\FileWriter;

$config = new YamlConfig('/path/to/config.yaml');
$writer = new FileWriter('/path/to/xml/files');

$sitemapCrawler = new SitemapCrawler(
    $config,
    $writer
);

$sitemapCrawler->createSitemap('https://crawl.me');

The YamlConfig stores information that is needed when the process has to run in multiple steps. It therefore needs a path where the config file can be stored.

The FileWriter stores the XML files, so it needs a folder to write them to.

createSitemap() starts the crawling. When the time limit is reached, the process stops and stores its status in the config file. If you call createSitemap() again, it continues where it left off. This is helpful for large websites that may take a long time to crawl.
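
The stop-and-resume behaviour described above can be pictured with a small standalone sketch. It is not the library's implementation and does not use its API: the function name crawlWithBudget(), the JSON state file, and the $visit callback are all hypothetical and only illustrate how a run might pause at a time limit and pick up again on the next call.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not part of bitandblack/sitemap: works through a queue
 * of URLs until a time budget is used up and persists the remaining work.
 *
 * @param string   $stateFile Path of the JSON file holding the crawl state.
 * @param string[] $seedUrls  URLs to start with when no state file exists yet.
 * @param float    $budget    Time budget in seconds for this run.
 * @param callable $visit     Callback invoked once per URL.
 * @return bool True when the queue is empty, false when the run was paused.
 */
function crawlWithBudget(string $stateFile, array $seedUrls, float $budget, callable $visit): bool
{
    // Resume from the stored state if a previous run was interrupted.
    $state = is_file($stateFile)
        ? json_decode((string) file_get_contents($stateFile), true)
        : ['pending' => $seedUrls, 'done' => []];

    $start = microtime(true);

    while ($state['pending'] !== []) {
        if (microtime(true) - $start >= $budget) {
            // Out of time: store the progress so a later call can continue.
            file_put_contents($stateFile, json_encode($state));

            return false;
        }

        $url = array_shift($state['pending']);
        $visit($url);
        $state['done'][] = $url;
    }

    // Finished: the state file is no longer needed.
    if (is_file($stateFile)) {
        unlink($stateFile);
    }

    return true;
}
```

Under these assumptions, a scheduled job could call the function repeatedly with the same state file until it returns true, in the same way that createSitemap() is meant to be called again after a paused run.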

Options

Page limit

Set a page limit that stops the crawler when the defined page count has been reached:

<?php

$sitemapCrawler->setCrawlingLimit(500);

Crawling a single page

You can crawl a single page by using the PageCrawler class. The result is an object containing the page's headers, its body, and some information about its languages, links and media.

<?php

use BitAndBlack\Sitemap\PageCrawler;

$pageCrawler = new PageCrawler('https://www.bitandblack.com/de.html');
$page = $pageCrawler->getPage();
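
To give an idea of what "information about the languages, links and media" can mean in practice, here is a minimal sketch that pulls such data out of an HTML string with PHP's DOM extension. It is an illustration only: extractPageInfo() and the shape of the returned array are assumptions, not the structure of the object returned by getPage().

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not part of bitandblack/sitemap: collects language
 * alternates, links and images from an HTML string.
 *
 * @param string $html Non-empty HTML markup.
 * @return array{languages: array<string, string>, links: string[], images: string[]}
 */
function extractPageInfo(string $html): array
{
    // Real-world HTML is rarely valid, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $document = new DOMDocument();
    $document->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($document);
    $info = ['languages' => [], 'links' => [], 'images' => []];

    // Language versions announced via <link rel="alternate" hreflang="...">.
    foreach ($xpath->query('//link[@rel="alternate"][@hreflang][@href]') as $node) {
        $info['languages'][$node->getAttribute('hreflang')] = $node->getAttribute('href');
    }

    // Outgoing links.
    foreach ($xpath->query('//a[@href]') as $node) {
        $info['links'][] = $node->getAttribute('href');
    }

    // Images referenced by the page.
    foreach ($xpath->query('//img[@src]') as $node) {
        $info['images'][] = $node->getAttribute('src');
    }

    return $info;
}
```

The sketch returns the URLs exactly as they appear in the markup. A crawler would additionally have to resolve relative URLs against the page address before following them.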

Manual generation of the sitemap.xml

You can also create the sitemap.xml on your own:

<?php

use BitAndBlack\Sitemap\Collection;
use BitAndBlack\Sitemap\Config\YamlConfig;
use BitAndBlack\Sitemap\Page;
use BitAndBlack\Sitemap\SitemapXML;

$collection = new Collection(
    new YamlConfig()
);

/**
 * The page doesn't need to exist.
 */
$page = new Page('https://example.org');

$sitemapXML = new SitemapXML($collection, [$page]);

file_put_contents(
    'sitemap.xml',
    $sitemapXML->getSitemap()->saveXML()
);
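
For orientation, the following sketch shows the kind of document a sitemap.xml contains, built directly with PHP's DOMDocument according to the sitemaps.org protocol. It does not use the library and is not how SitemapXML works internally; buildSitemap() is a hypothetical name chosen for this example.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not part of bitandblack/sitemap: builds a minimal
 * sitemap document following the sitemaps.org protocol.
 *
 * @param string[] $urls Absolute URLs to list in the sitemap.
 */
function buildSitemap(array $urls): DOMDocument
{
    $namespace = 'http://www.sitemaps.org/schemas/sitemap/0.9';

    $document = new DOMDocument('1.0', 'UTF-8');
    $document->formatOutput = true;

    // <urlset> is the root element and carries the sitemap namespace.
    $urlset = $document->createElementNS($namespace, 'urlset');
    $document->appendChild($urlset);

    foreach ($urls as $url) {
        // Each page becomes one <url> element with a <loc> child.
        $urlNode = $document->createElementNS($namespace, 'url');
        $locNode = $document->createElementNS($namespace, 'loc');

        // A text node makes sure characters such as "&" are escaped.
        $locNode->appendChild($document->createTextNode($url));

        $urlNode->appendChild($locNode);
        $urlset->appendChild($urlNode);
    }

    return $document;
}
```

Calling buildSitemap(['https://example.org'])->saveXML() yields a string that can be written with file_put_contents(), just like the return value of getSitemap()->saveXML() in the example above.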

Available Crawlers

By default, the Bit&Black Sitemap library uses the Symfony Http Client for requests.

You can use a different crawler, depending on your needs. Currently supported are:

The AutoPageCrawler will detect the available crawler on its own.

However, you can set up the PageCrawler with a specific crawler, for example:

<?php

use BitAndBlack\Sitemap\PageCrawler;
use BitAndBlack\Sitemap\PageCrawler\ReactCrawler;

$pageCrawler = new PageCrawler();
$pageCrawler->setPageCrawler(new ReactCrawler());

Help

If you have any questions, feel free to contact us at hello@bitandblack.com.

Further information about Bit&Black can be found at www.bitandblack.com.

Statistics

  • Total downloads: 3.97k
  • Monthly downloads: 0
  • Daily downloads: 0
  • Favorites: 0
  • Clicks: 0
  • Dependent projects: 2
  • Recommendations: 0

GitHub information

  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Language: PHP

Other information

  • License: MIT
  • Last updated: 2022-02-21