I first developed this method in answer to a very specific problem posted on PHP Builder:keep last 5 logs and delete rest. It is an ideal solution to this knotty problem where there are an unknown number of records in each subset.
The problem is compounded because records will have been inserted in a more or less random order across the sets. To solve it we have to find the sequence within each sub-set.
Now some DBs such as MS SQL and Oracle have that functionality built in, but others do not. For these we need another approach, and the method is to use a self-join with grouping to generate the sequence number.
SELECT t1.userid, COUNT(t1.tableid) AS sequence, t1.tableid, t1.tableid >= t2.tableid AS flg
FROM table t1 INNER JOIN table t2 ON t1.userid = t2.userid
GROUP BY t1.userid, t1.tableid, flg
HAVING flg = TRUE
What this does is to join each row to itself and all those within the same sub-set, in this case ‘userid’, that were inserted before it and then count them. So the first entry for each user is only joined to itself = 1, the second is joined to itself and the first = 2, third to itself and the first two = 3, etc.
To reverse the sequence simply reverse the selection
t1.tableid <= t2.tableid AS flg
So to solve the original problem we just need the reverse sequence which we can then use to delete all records with a sequence greater than 5, neat huh.
DELETE FROM table INNER JOIN
(SELECT t1.userid, COUNT(t1.tableid) AS sequence, t1.tableid, t1.tableid <= t2.tableid AS flg
FROM table t1 INNER JOIN table t2 USING(userid)
GROUP BY t1.userid, t1.tableid, flg
HAVING flg = TRUE) AS q1 USING(tableid)
WHERE sequence > 5
PS this is of course MySQL syntax and many other RDBMS do not support the USING operator for joins
2 Trackbacks
[...] David Soussan My New Domain « How to Sequence each Sub-set of Records [...]
[...] fine use of my sequencing method is to calculate the running balance on a ledger: SELECT t1.CompanyID, t1.CompanyName, [...]