erichynds

Welcome to my online development portfolio and blog. I'm Eric Hynds, a 23-year-old website developer living outside of Boston, Massachusetts, and I'm passionate about developing functional, standards-compliant, and user-friendly websites.

Archive for October, 2009

Automatically crawl a website looking for errors

Thursday, October 8th, 2009

For all you Linux/Mac folks out there, one useful way to crawl your website looking for errors (web server or server-side; any kind of error that would appear in your log files) is to use the terminal. Start by making sure your log levels are set to capture whatever you want to catch: if you’re using Apache, set an appropriate LogLevel; if you want to catch PHP errors, check your error_log and error_reporting settings; and so on. You’ll also want to back up and clear your existing log files so that only the errors triggered by the crawl show up.
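As a rough sketch of the backup-and-clear step, something like the following could work. The function name backup_and_clear_log and the example log path are my own assumptions, not part of any standard tool, and your log locations will differ depending on your setup.

```shell
# Hypothetical helper: copy a log to a timestamped backup, then empty it in place.
backup_and_clear_log() {
    log_file="$1"
    backup_file="${log_file}.$(date +%Y%m%d-%H%M%S).bak"

    # Keep a copy of the existing entries before clearing anything.
    cp -p "$log_file" "$backup_file" || return 1

    # Truncate rather than delete, so the file the server has open stays in place.
    : > "$log_file"

    # Print the backup location so it can be found later.
    printf '%s\n' "$backup_file"
}

# Example call; the path is an assumption and varies by distribution and config.
# backup_and_clear_log /var/log/apache2/error.log
```

You will probably need to run this as a user with write access to the log directory.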

Next, open up your terminal and issue this command:

wget -r --level=10 --delete-after -nd http://www.mydomain.com/

wget will crawl through the site and retrieve just about everything up to 10 levels deep (change --level=10 to however many levels you want; wget’s default is 5). -r turns on recursive retrieval, --delete-after deletes each downloaded file right after it is fetched, and -nd prevents wget from creating directories locally.
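If you run this often, the command can be wrapped in a small function with the depth as an optional argument. This is only a sketch: crawl_site is a name I made up, and it uses the long-form spellings of the same wget options (--recursive for -r, --no-directories for -nd).

```shell
# Hypothetical wrapper around the wget crawl above.
# Usage: crawl_site URL [DEPTH]   (DEPTH defaults to 10)
crawl_site() {
    url="$1"
    depth="${2:-10}"

    # --recursive       follow links found in each page
    # --level           maximum recursion depth
    # --delete-after    remove each file once it has been downloaded
    # --no-directories  do not recreate the site's directory tree locally
    wget --recursive --level="$depth" --delete-after --no-directories "$url"
}

# Example calls, using the placeholder domain from the command above:
# crawl_site http://www.mydomain.com/
# crawl_site http://www.mydomain.com/ 3
```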

Once the crawl is done, check your log files and see what it found. If you’re impatient like me, you can tail -f your log files on the server to watch new entries stream in while the crawl runs.
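For the access log, a quick summary of failed requests can save some scrolling. The sketch below assumes Apache’s common or combined log format, where the seventh whitespace-separated field is the requested path and the ninth is the status code; failed_requests is a hypothetical name and the log paths are guesses, so adjust for your server.

```shell
# Hypothetical helper: list requests that returned a 4xx or 5xx status,
# grouped by status and path, most frequent first.
# Assumes common/combined log format: field 7 is the path, field 9 the status.
failed_requests() {
    awk '$9 ~ /^[45][0-9][0-9]$/ { print $9, $7 }' "$1" | sort | uniq -c | sort -rn
}

# Example calls; the paths are assumptions and vary by distribution and config.
# failed_requests /var/log/apache2/access.log
# tail -f /var/log/apache2/error.log
```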