How can I scrape sites that require authentication using node.js?

Problem

I've come across many tutorials explaining how to scrape public websites that don't require authentication/login, using node.js.

Can somebody explain how to scrape sites that require login using node.js?

Problem courtesy of: ekanna

Solution

Use Mikeal's Request library with cookie support enabled, like this:

var request = require('request').defaults({jar: true});

First create an account on the site manually. Then send the username and password as form parameters in a POST request to the site's login URL. The server responds with a cookie, which Request stores and sends with later requests, so you can then access the pages that require you to be logged in.
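The two steps above could look roughly like the sketch below. The function name scrapeProtectedPage, the URLs, and the form field names are placeholders I made up for illustration; the real ones depend on the login form of the site you are scraping. The sketch assumes the client passed in was created with jar: true, and that the site uses a plain form login.

```javascript
// Sketch only: log in with a form POST, then fetch a protected page.
// scrapeProtectedPage, the URLs and the field names are hypothetical.
//
// `client` is assumed to be a Request instance with cookies enabled, e.g.
//   var client = require('request').defaults({ jar: true });
function scrapeProtectedPage(client, options, callback) {
  // Step 1: send the credentials as a URL-encoded form body.
  client.post(
    { url: options.loginUrl, form: options.credentials },
    function (err, res) {
      if (err) return callback(err);
      // A successful login usually answers with 200 or a 3xx redirect,
      // so only 4xx/5xx responses are treated as a failed login here.
      if (res.statusCode >= 400) {
        return callback(new Error('Login failed with status ' + res.statusCode));
      }
      // Step 2: the cookie set by the login response is now in the jar,
      // so this request is sent as the logged-in user.
      client.get({ url: options.protectedUrl }, function (err, res, body) {
        if (err) return callback(err);
        callback(null, body);
      });
    }
  );
}

// Example call (placeholder values, adjust to the target site):
//   scrapeProtectedPage(client, {
//     loginUrl: 'https://example.com/login',
//     credentials: { username: 'me', password: 'secret' },
//     protectedUrl: 'https://example.com/account'
//   }, function (err, html) {
//     if (err) throw err;
//     console.log(html);
//   });
```

Passing the client in as an argument keeps the login flow separate from how the cookie jar is configured. Check the login form's HTML for the actual field names, since some sites also expect hidden fields such as a CSRF token.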

Note: this approach doesn't work if the login page uses something like reCAPTCHA.

Solution courtesy of: alessioalex



This post first appeared on Node.js Recipes.
