Wednesday, February 8, 2012

Quick web crawling

wget -r -l 1 --http-user="$USER" --http-password="$PASSWD" "$URL"

will get you a depth-1 crawl: wget downloads the page at $URL plus every page it links to directly. $USER and $PASSWD are optional, of course; they are only needed if the page requires HTTP authentication.
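If you do this often, the optional credentials can be wrapped in a small shell function. This is only a sketch: the function name `crawl` and the `DRY_RUN` switch are made up for illustration, and it assumes a wget recent enough to accept `--http-password`.

```shell
# crawl URL [USER [PASSWD]]
# Hypothetical helper: depth-1 recursive fetch, adding the HTTP
# authentication flags only when a user name is given.
crawl() {
    url=$1
    user=${2:-}
    pass=${3:-}

    # Start with the options that are always needed.
    set -- -r -l 1

    # Add credentials only if a user name was supplied.
    if [ -n "$user" ]; then
        set -- "$@" --http-user="$user" --http-password="$pass"
    fi

    set -- "$@" "$url"

    # DRY_RUN is an illustrative switch: print the command instead of running it.
    if [ -n "${DRY_RUN:-}" ]; then
        echo "wget $*"
    else
        wget "$@"
    fi
}
```

Call it as `crawl "$URL"` for a public page or `crawl "$URL" "$USER" "$PASSWD"` for a protected one. Keep in mind that a password passed on the command line is visible to other users in the process list.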

Now that is nice!
