Selecting Random Items in Amazon SimpleDB

I am in the middle of working on a project that relies heavily on the ability to select random items from an Amazon Web Services SimpleDB domain. A little bit of Google-fu turned up an answer to a post over at Stack Overflow which described Amazon's recommended approach. A really rough pseudocode implementation looks like this:
Item selectRandomItem() {
    // Generate a random value
    String randomValue = generateRandomString();

    // Retrieve the data from SDB
    Item item = randomLE(randomValue);
    if (item == null) {
        // Handle the edge case where there are no items
        // with randomizers less than or equal to the random value
        item = randomGE(randomValue);
    }
    return item;
}

// Item with the largest randomizer at or below randomValue
Item randomLE(String randomValue) {
    return sdb.select(
        "select * " +
        "from MyStore " +
        "where randomizer <= '" + randomValue + "' " +
        "order by randomizer desc " +
        "limit 1"
    );
}

// Item with the smallest randomizer at or above randomValue
Item randomGE(String randomValue) {
    return sdb.select(
        "select * " +
        "from MyStore " +
        "where randomizer >= '" + randomValue + "' " +
        "order by randomizer asc " +
        "limit 1"
    );
}
The algorithm stores a randomizer attribute holding a random value on every item. When you need an item, generate another random value and select the item whose randomizer is the closest one at or below this new value; if no such item exists, fall back to the item with the smallest randomizer at or above it. Unfortunately, this approach is broken.
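To make the lookup itself concrete, here is a minimal in-memory sketch of the selection step exactly as the pseudocode describes it. It is not a fix and it does not talk to SimpleDB. The class name `RandomizerSketch`, the `put` and `select` helpers, and the sample randomizer values are all made up for illustration, and a `TreeMap` stands in for the domain on the assumption that randomizers are fixed-width ASCII strings, so that Java's string ordering matches the lexicographic comparison the queries rely on.

```java
import java.util.Map;
import java.util.TreeMap;

class RandomizerSketch {
    // Hypothetical stand-in for the SimpleDB domain:
    // key = randomizer attribute, value = item name.
    private final TreeMap<String, String> itemsByRandomizer = new TreeMap<>();

    void put(String randomizer, String itemName) {
        itemsByRandomizer.put(randomizer, itemName);
    }

    // Mirrors randomLE: the item with the largest randomizer <= randomValue,
    // or null when every randomizer is greater than randomValue.
    String randomLE(String randomValue) {
        Map.Entry<String, String> entry = itemsByRandomizer.floorEntry(randomValue);
        return entry == null ? null : entry.getValue();
    }

    // Mirrors randomGE: the item with the smallest randomizer >= randomValue,
    // or null when every randomizer is smaller than randomValue.
    String randomGE(String randomValue) {
        Map.Entry<String, String> entry = itemsByRandomizer.ceilingEntry(randomValue);
        return entry == null ? null : entry.getValue();
    }

    // Mirrors selectRandomItem, but takes the random value as a parameter
    // so the behaviour can be checked with known inputs.
    String select(String randomValue) {
        String item = randomLE(randomValue);
        return item != null ? item : randomGE(randomValue);
    }
}
```

With items stored under the randomizers "2a", "7f" and "c4", a random value of "80" picks the item stored under "7f", while "10" falls through to the fallback and picks the item stored under "2a".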