Crawling with "npm crawler"
Problem
For example, I want to crawl the descriptions of Node.js modules from npmjs.org,
but this code doesn't work. How can I do this with jQuery rather than with the jsdom module?
var Crawler = require("crawler").Crawler;
var crawler = new Crawler({
    "maxConnections": 10
});
crawler.queue([{
    "uri": "https://npmjs.org/package/crawler",
    "callback": function(error, result) {
        console.log("description:", window.$("p.description").text());
    }
}]);
Problem courtesy of: khex
Solution
Your code exits too early. Add a setTimeout on the last line to give your code enough time to complete,
then call process.exit() from your callback function.
The Crawler callback takes three parameters, the third being jQuery, so you probably want something like this:
"callback": function(error, result, $) {
    console.log("description:", $("p.description").text());
}
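Putting both suggestions together, a minimal sketch of the whole script might look like the following. It assumes the older crawler API used in the question, where require("crawler").Crawler exists and the callback receives (error, result, $). The names printDescription and startCrawl, the 30-second timeout, and the "run" command-line argument are illustrative choices, not part of the library.

```javascript
// Sketch only: assumes the older "crawler" package API from the question,
// where the callback is invoked as callback(error, result, $).

// Hypothetical helper: reads the description using the jQuery object
// that the crawler passes as the third callback argument.
function printDescription(error, result, $) {
    if (error) {
        console.error("crawl failed:", error);
        return null;
    }
    var description = $("p.description").text();
    console.log("description:", description);
    return description;
}

// Hypothetical wrapper: queues the page and keeps the process alive.
function startCrawl() {
    var Crawler = require("crawler").Crawler;
    var crawler = new Crawler({
        "maxConnections": 10
    });
    crawler.queue([{
        "uri": "https://npmjs.org/package/crawler",
        "callback": function(error, result, $) {
            printDescription(error, result, $);
            process.exit(error ? 1 : 0); // done, so end the process explicitly
        }
    }]);
    // Timer on the last line, as the answer suggests: gives the request
    // time to complete (30 seconds is an arbitrary value) and gives up after that.
    setTimeout(function() {
        console.error("timed out waiting for the crawler");
        process.exit(1);
    }, 30000);
}

// Start only when asked to, e.g. `node crawl.js run` (file name is illustrative).
if (process.argv[2] === "run") {
    startCrawl();
}
```

Keeping the selector logic in its own function means it can be checked with a stand-in for $ without making a network request.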
Solution courtesy of: Pascal Belloncle