Refactoring Databases - Reducing the Risk
I am reading the book "Refactoring Databases: Evolutionary Database Design" by Scott W Ambler and Pramod J Sadalage at the moment.
The biggest difficulty with refactoring databases is caused by having multiple client applications (that you may or may not know about) accessing your database in ways that are inconsistent with good programming practice, such as using "select * from table" and referring to the columns returned by numbered parameters in your code. I wrote an article a few months ago about the strengths and weaknesses of using stored procedures.
If you want to remove a column, you therefore need to be sure to change all your client applications.
The risk of doing this could be greatly reduced if there were a way in which you could trigger on select statements. This would mean that if you performed a "select *", you could replace it with a select statement that returned the columns in the correct order and inserted a null column in place of the one you have just removed. You could also record the fact a select had been made to an audit table, and use this to help identify if you have changed all your client applications.
Unfortunately, you can only trigger on insert, update and delete statements. Dejan Sarka describes a workaround for this by using the profiler and defining a FOR INSERT trigger on the table that it generates. However, this slows things down, and with the increased use of agile practices, a way to trigger on select statements should be introduced.
Until databases support this functionality, a first stage of any refactoring should be to reduce the numbers of ways in which you can connect to a database, and creating an interface that is as small as possible. I discussed the arguments for and against using stored procedures in January. I would now be more inclined to use SPs to connect to the database. If there are a small number of entry points to the database, there are less things to test when you refactor, and it will be more likely you will keep your database schema up to date.
The biggest difficulty with refactoring databases is caused by having multiple client applications (that you may or may not know about) accessing your database in ways that are inconsistent with good programming practice, such as using "select * from table" and referring to the columns returned by numbered parameters in your code. I wrote an article a few months ago about the strengths and weaknesses of using stored procedures.
If you want to remove a column, you therefore need to be sure to change all your client applications.
The risk of doing this could be greatly reduced if there were a way in which you could trigger on select statements. This would mean that if you performed a "select *", you could replace it with a select statement that returned the columns in the correct order and inserted a null column in place of the one you have just removed. You could also record the fact a select had been made to an audit table, and use this to help identify if you have changed all your client applications.
Unfortunately, you can only trigger on insert, update and delete statements. Dejan Sarka describes a workaround for this by using the profiler and defining a FOR INSERT trigger on the table that it generates. However, this slows things down, and with the increased use of agile practices, a way to trigger on select statements should be introduced.
Until databases support this functionality, a first stage of any refactoring should be to reduce the numbers of ways in which you can connect to a database, and creating an interface that is as small as possible. I discussed the arguments for and against using stored procedures in January. I would now be more inclined to use SPs to connect to the database. If there are a small number of entry points to the database, there are less things to test when you refactor, and it will be more likely you will keep your database schema up to date.
