The is_deleted column is a pretty simple solution that we all use and there are a number of solutions to the problem of retrieving only the active columns, such as views.
Audit trails and performance are more interesting. A while back I worked for a web analytics company and we had the problem of the storage and performance costs of historical data. Only the 5% of the data was of any real interest, but the 95% historical data made writes slow because of the large number of indexes. They adopted the solution of historical tables with fewer indexes on cheaper drives.
I like the solution of serializing historical, deleted, and audit data and storing them in a NonSQL database of your choice. Then you can bring them back as individual undos, or into a data mining database for scenario playing.
I dont particularly like his suggestion of creating separate tables for each state of an element. I think that's needless complexity.
Why does updating an index take more time if there are many table rows?
i.e. if you have 5 indices on a table and 1 write/second, what's the difference in the work done by the db whether there are 1000 rows or 100k rows?
Even worse were the autogenerated queries that joined a dozen tables together. I saw some that were 1500 lines long.
Is he that ignorant of what a reasonable ORM (or at least, Ruby on Rails) will do for you? eg named_scope, etc...
If you are writing PHP, where you hard-code a query and then print out HTML as you iterate over the result set, then sure, you're fucked. But not because of soft deletes.
For example, if you have an :active named_scope in rails that respects an object's active state, you still need to remember to do MyObject.active.find everywhere you want to exclude inactive results (meaning it's still just as easy to forget it). This got a little bit better with default_scope, but now you have the problem of trying to jump through hoops in the 1 or 2 cases where you do want to bring back inactive objects -- which, in my opinion, still imposes some sort of tax (albeit a slightly more readable one).
Granted, I'm still fairly new to rails, so if there really is a way to just be able to do MyObject.find in all cases where I want to ignore inactive records and MyObject.include_inactive.find in the 1 or 2 edge cases where its needed, then I will readily concede the absence of said tax.
At the day job, when we do a soft delete as defined here, we tend to create a view of the active rows in the table. Accessing through the view rather than the table prevents many of the "Whoopsie, missed a WHERE clause, now I'm summing over deleted records" errors. I'm told it also improves performance but take anything I say about DBs with a grain of salt.
That ain't no man's soft delete, boy. You want to know what soft delete is? Well that's too bad, cuz there ain't no delete. Move from one table to another? What in tarnation?
Boy, you need yourself the sequence of actions that produce some kind of result. You don't store the result itself, ya'll just go an' fuck it up or something later. "Posted a job offer for Tractor Operator at 2009-11-21.12:44:26." See, that there's an action. "Accepted job offer for Tractor Operator." There's another action. Now what's the status of that job? You get your list of actions, then figure it out. Yes, it's taken, you would have just injured yourself on that tractor anyway.
All I do is setup two managers, one the default so in the admin I can browse my records without any filtering and then a second one which I use exclusively in my views that simply adds .filter(hidden=False) to the get_query_set method. So simple, and have never ran into any problems.
If it's something like a rapidly changing customers list then delete any customers where the soft delete date is greater than three years old.