blogical suspect

Friday, December 24, 2004

Suppose the bot

Suppose the bot ran much more frequently... Like once every ten minutes.
Suppose we also had a "speed" counter...
If we have clocked a site at updating once an hour, then the speed would be 6 (we expect it to be updated once every six runs).
We might start by checking it at 1:00... Then we check back at 2:00 (which is to say we wait six runs and then try again)... If the blog has changed, there's no way of knowing exactly when it was changed, just that it took less than an hour... So to narrow things down, we could speed up the checking by dropping the counter to 5 (or "an update every 5 checks").
Since we're checking every five ten-minute-chunks, then we'd check back at 2:50... If we caught another change, we could decrement the wait time by one again, and so on, until it's checked every time.
If a change is NOT caught, then we're checking too often, and should increase the wait...
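Here's a rough sketch of that one-step-at-a-time rule, just to make the arithmetic concrete. The names `adjust_wait` and `RUN_MINUTES` are made up for illustration, and the floor of one run is my reading of "until it's checked every time".

```python
RUN_MINUTES = 10  # assumed length of one bot run, per the ten-minute idea


def adjust_wait(wait_runs, changed):
    """Return the new wait (counted in runs) after one check of a blog."""
    if changed:
        # Caught an update: check a little sooner, but never less than every run.
        return max(1, wait_runs - 1)
    # No update: we're checking too often, so back off by one run.
    return wait_runs + 1


wait = 6  # clocked at roughly one update an hour
for changed in (True, True, False):
    wait = adjust_wait(wait, changed)
    print(wait, "runs =", wait * RUN_MINUTES, "minutes")
# prints 5 runs = 50 minutes, then 4 runs = 40 minutes, then 5 runs = 50 minutes
```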
All this should probably happen according to a nonlinear curve... (e.g., no update in 3 days means check back in 4 days, not 3 days and ten minutes)... There should also be a cap on it, so that we always check at least once a week.
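One possible shape for that curve is plain multiplicative backoff. This is only a sketch: the 4/3 factor is picked to match the "3 days becomes 4 days" example, shrinking by the same factor on an update is my own assumption, and `next_wait` is a hypothetical name.

```python
RUNS_PER_DAY = 24 * 6             # ten-minute runs in a day
MAX_WAIT_RUNS = 7 * RUNS_PER_DAY  # cap: always check at least once a week

# Assumed growth factor of 4/3, kept as integers to avoid float rounding.
GROW_NUM, GROW_DEN = 4, 3


def next_wait(wait_runs, changed):
    """Scale the wait down after an update, up after a miss, within limits."""
    if changed:
        new_wait = wait_runs * GROW_DEN // GROW_NUM       # shrink, rounding down
    else:
        new_wait = -(-wait_runs * GROW_NUM // GROW_DEN)   # grow, rounding up
    return min(MAX_WAIT_RUNS, max(1, new_wait))


wait = 3 * RUNS_PER_DAY  # no update seen in 3 days
for _ in range(3):
    wait = next_wait(wait, changed=False)
    print(wait / RUNS_PER_DAY, "days")
# prints 4.0 days, then about 5.33 days, then 7.0 days (the cap)
```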
This would serve to distribute the load between runs, allow for smaller jobs, and calibrate fast enough so that a blog is checked more often when people are awake and active, and less often when they're asleep or taking some time away.
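To picture how the load spreads out: if each blog remembers the run number at which it's next due, a given run only has to touch the blogs whose number has come up. A tiny sketch, with a made-up `schedule` dict and placeholder blog names:

```python
def due_blogs(schedule, run_number):
    """Return the blogs whose next scheduled run has arrived, in name order."""
    return sorted(url for url, next_run in schedule.items() if next_run <= run_number)


# Hypothetical state: blog name -> run number at which to check it next.
schedule = {
    "blog-a.example": 6,
    "blog-b.example": 12,
    "blog-c.example": 6,
}

print(due_blogs(schedule, 6))   # only the two blogs due at run 6
print(due_blogs(schedule, 12))  # by run 12 everything is due
```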
Generally, every update should increase the rate at which the blog is checked, and every check without an update should decrease the rate at which it's checked.
[Also some sites might not want 10-minute checks... maybe each blog should have its own minimum wait or something...]
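A per-blog minimum could be as simple as a floor stored next to the wait counter. Again just a sketch with invented names (`BlogSchedule`, `min_wait_runs`), using a one-run step for simplicity:

```python
from dataclasses import dataclass


@dataclass
class BlogSchedule:
    wait_runs: int = 6      # current wait between checks, in ten-minute runs
    min_wait_runs: int = 1  # this blog's own floor (1 = OK to check every run)

    def record_check(self, changed):
        """Update the wait after a check, never dropping below the blog's floor."""
        step = -1 if changed else 1
        self.wait_runs = max(self.min_wait_runs, self.wait_runs + step)
        return self.wait_runs


# A site that asked for no more than one check every 30 minutes.
shy = BlogSchedule(wait_runs=3, min_wait_runs=3)
print(shy.record_check(changed=True))   # stays at 3 despite the update
print(shy.record_check(changed=False))  # backs off to 4
```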