kernelslacker wrote,
@ 2008-08-09 14:08:00
Entry tags:scripts

Another lazy script.
Part of my daily routine is checking various websites to see if they've updated. (These websites haven't joined the rest of us in this millennium with RSS yet.) I got bored of doing the clicky clicky in Firefox only to find out they hadn't changed. Also, some of the sites were shops with a page listing many items, so it was difficult to see whether anything had truly changed.
So I hacked up this trivial script:


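#!/bin/sh
# Fetch $URL/$FILE and print a diff if it changed since the last run.
check()
{
    URL=$1
    FILE=$2
    # Flatten URL and filename into one cache name, e.g. host-dir-index.html
    F=$(echo "$URL-$FILE" | sed -e 's|^http://||' -e 's|/|-|g')
    # Checksum from the previous run (empty the first time round)
    SHA1=$(cat ~/.www/"$F".SHA1 2>/dev/null)
    wget -q "$URL/$FILE"
    if [ -f "$FILE" ]; then
        NEW=$(sha1sum "$FILE" | awk '{print $1}')
        if [ "$NEW" != "$SHA1" ]; then
            echo "$URL/$FILE changed."
            echo "$NEW" > ~/.www/"$F".SHA1
            # Nothing to diff against on the first run
            [ -f ~/.www/"$F" ] && diff -u ~/.www/"$F" "$FILE"
        fi
        mv "$FILE" ~/.www/"$F"
    fi
}
mkdir -p ~/.www
check http://url index.html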



(The URL is split into dir and filename so that it handles multiple pages from the same server correctly.)
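
To make the naming concrete, here is a minimal sketch of just that step, pulled out into a helper. The name cache_name is hypothetical and example.com is a placeholder; neither appears in the script. This version strips the http:// prefix and turns every remaining slash into a dash, so two pages from the same server end up as two different files under ~/.www.

```shell
#!/bin/sh
# Hypothetical helper: build the flat cache filename from a URL and a page name.
cache_name()
{
    echo "$1-$2" | sed -e 's|^http://||' -e 's|/|-|g'
}

# Two pages on the same (placeholder) server get distinct cache names.
cache_name http://example.com/shop index.html    # example.com-shop-index.html
cache_name http://example.com/shop offers.html   # example.com-shop-offers.html
```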

There's probably some web2.0 widget that takes non-RSS webpages and creates RSS for me, but this seemed to do the trick just fine.
I run it from cron every so often, and whenever something changes on the pages I'm watching I get an email containing the HTML diff since the last update.
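
A crontab entry along these lines is one way to wire that up. Treat it as a sketch with made-up values: the script path, the six-hour schedule and the mail address are all placeholders, and MAILTO assumes a Vixie-style cron. Cron mails whatever a job prints, so a script that stays silent when nothing changed only generates mail when there is a diff to show.

```
# Mail any output from the jobs below to this (placeholder) address
MAILTO=you@example.com
# Run the (hypothetically named) script every six hours
0 */6 * * * $HOME/bin/checksites.sh
```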

As an added bonus, this means you get to spot when websites start getting things like..
<script src="h ttp://1.verynx.cn/w.js"></script> embedded into them.
Then you get to send emails to the web stores in question telling them their webserver just got hit by a worm.

Hurrah, my script became a public service.
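
Spotting that kind of injection could be automated rather than left to eyeballing the diff. The filter below is a rough sketch, not part of the script: added_scripts is a hypothetical name, and the pattern only catches added lines where a script tag and its src= attribute sit on the same line.

```shell
#!/bin/sh
# Hypothetical filter: read a unified diff on stdin and print only the
# added lines (those starting with "+") that pull in a script via src=.
added_scripts()
{
    grep -i '^+.*<script[^>]*src='
}

# Usage (old.html and new.html are placeholder names):
#   diff -u old.html new.html | added_scripts
```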

(Update: I broke the http in the ref above, because LJ saw fit to render it as a real link.)



