We had a website with >500,000 old revisions in their database. There is a nice module for knocking out small numbers of revisions called http://drupal.org/project/revision_deletion. But, to quickly remove a large set we wrote this script.
The general idea is to remove all revisions older than a certain date and then do the same for each CCK table. No matter what date limit you pick the current revision is not deleted.
WARNING: it makes assumptions about your table naming conventions -- use only on a backup
function remove_revisions() { // Set limit for revision age $days_ago = 30; // delete all revisions from node_revisions $limit = time() - (60*60*24*$days_ago); $sql = "DELETE FROM node_revisions WHERE timestamp < $limit AND vid NOT IN (SELECT vid FROM node)"; db_query($sql); $rows = db_affected_rows(); drupal_set_message("$rows revisions deleted -- node_revisions"); $table_types = array('content_field_%%', 'content_type_%%'); // remove orphaned data from CCK content tables foreach($table_types as $table_type) { $result= db_query("SHOW TABLES LIKE '$table_type'"); while ($row = db_fetch_array($result)) { $table = current($row); $sql = "DELETE FROM $table WHERE vid NOT IN (SELECT vid FROM node_revisions)"; db_query($sql); $rows = db_affected_rows(); drupal_set_message("$rows revisions deleted -- $table"); } } }
