How to start with Crawler & ZyteCrawler
Utilizing Different Crawlers with EasyAI
When integrating web crawling into your application with EasyAI, you can choose between the basic Crawler and the more advanced ZyteCrawler, which uses the Zyte API for enhanced capabilities. Below are examples of how to work with both:
Using the Crawler
use EasyAI\Tools\Crawler;
$crawler = new Crawler();
To fetch the full HTML content of a webpage:
// one URL
$html = $crawler->getHtml('https://hosono.ai/en');
// multiple URLs
$urls = [
    'https://hosono.ai/en',
    'https://hosono.ai/de'
];
$html = $crawler->getHtmls($urls);
Or, to retrieve only the textual content:
// one URL
$text = $crawler->getText('https://hosono.ai/en');
// multiple URLs
$urls = [
    'https://hosono.ai/en',
    'https://hosono.ai/de'
];
$text = $crawler->getTexts($urls);
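When crawling several pages, it is often handy to keep each result next to the URL it came from. This page does not say what shape getTexts returns, so the sketch below sidesteps the question by calling getText once per URL. Note that collectTexts is a hypothetical helper, not part of EasyAI, and the anonymous class is only a stand-in so the snippet runs without network access; in real use you would pass new Crawler() instead.

```php
<?php

/**
 * Hypothetical helper (not part of EasyAI): fetch the text of each URL
 * and return an array keyed by URL. Relies only on getText().
 */
function collectTexts(object $crawler, array $urls): array
{
    $results = [];
    foreach ($urls as $url) {
        $results[$url] = $crawler->getText($url);
    }
    return $results;
}

// Stand-in crawler so the sketch runs offline; replace with new Crawler().
$crawler = new class {
    public function getText(string $url): string
    {
        return 'text of ' . $url;
    }
};

$texts = collectTexts($crawler, [
    'https://hosono.ai/en',
    'https://hosono.ai/de',
]);

// Each result stays paired with its source URL.
foreach ($texts as $url => $text) {
    echo $url, ' => ', strlen($text), " characters\n";
}
```

Keying by URL costs one call per page, so for large batches the built-in getTexts may still be the better fit.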
Using the ZyteCrawler
The ZyteCrawler works similarly to the Crawler but offers access to the Zyte API's advanced features. Depending on your specific requirements, you can choose the most suitable crawler for your project.
use EasyAI\Tools\ZyteCrawler;
$zyteCrawler = new ZyteCrawler();
To fetch the full HTML content of a webpage:
// one URL
$html = $zyteCrawler->getHtml('https://hosono.ai/en');
// multiple URLs
$urls = [
    'https://hosono.ai/en',
    'https://hosono.ai/de'
];
$html = $zyteCrawler->getHtmls($urls);
Or, to retrieve only the textual content:
// one URL
$text = $zyteCrawler->getText('https://hosono.ai/en');
// multiple URLs
$urls = [
    'https://hosono.ai/en',
    'https://hosono.ai/de'
];
$text = $zyteCrawler->getTexts($urls);
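One way to act on the choice between the two crawlers is to try the basic one first and fall back to the Zyte-backed one only when it yields nothing. The following is a minimal sketch of that idea, not an EasyAI feature: getTextWithFallback is a hypothetical helper, and it assumes that a failed fetch shows up either as an exception or as an empty result, which this page does not confirm. The two anonymous classes stand in for new Crawler() and new ZyteCrawler() so the example runs offline.

```php
<?php

/**
 * Hypothetical helper (not part of EasyAI): try $primary first and use
 * $fallback if $primary throws or returns an empty result.
 */
function getTextWithFallback(object $primary, object $fallback, string $url): string
{
    try {
        $text = $primary->getText($url);
        if (is_string($text) && trim($text) !== '') {
            return $text;
        }
    } catch (Throwable $e) {
        // Assumption: failures may surface as exceptions; fall through.
    }
    return (string) $fallback->getText($url);
}

// Stand-ins; in real use these would be new Crawler() and new ZyteCrawler().
$basic = new class {
    public function getText(string $url): string
    {
        return ''; // simulate a page the basic crawler could not read
    }
};
$zyte = new class {
    public function getText(string $url): string
    {
        return 'text via fallback for ' . $url;
    }
};

echo getTextWithFallback($basic, $zyte, 'https://hosono.ai/en'), "\n";
```

Because both classes expose the same method names in the examples above, the helper only needs an object with a getText method and does not depend on either concrete class.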