Database refactoring

A database refactoring is a simple change to a database schema that improves its design while retaining both its behavioral and informational semantics. Database refactoring does not change the way data is interpreted or used and does not fix bugs or add new functionality. Every refactoring to a database leaves the system in a working state, thus not causing maintenance lags, provided the meaningful data exists in the production environment.

A database refactoring is conceptually more difficult than a code refactoring; code refactorings only need to maintain behavioral semantics while database refactorings also must maintain informational semantics.

A database schema is typically refactored for one of several reasons:

  1. To develop the schema in an evolutionary manner in parallel with the evolutionary design of the rest of the system.
  2. To fix design problems with an existing legacy database schema. Database refactorings are often motivated by the desire for database normalization of an existing production database, typically to "clean up" the design of the database.
  3. To implement what would be a large (and potentially risky) change as a series of small, low-risk changes.

Categories of database refactoring

In 2006 Scott Ambler, Pramod Sadalage[1] describe the following categories of database refactoring:[2]

  • Architecture Refactoring
A change which improves the overall manner in which external programs interact with a database.

Methods of Architecture Refactoring category: Add CRUD Methods; Add Mirror Table; Add Read Method; Encapsulate Table With View; Introduce Calculation Method; Introduce Index; Introduce Read Only Table; Migrate Method From Database; Migrate Method To Database; Replace Method(s) With View; Replace View With Methods(s); Use Official Data Source.

  • Structural Refactoring
A change to the table structure of your database schema.

Methods of Structural Refactoring category: Drop Column; Drop Table; Drop View; Introduce Calculated Column; Introduce Surrogate Key; Merge Columns; Merge Tables; Move Column; Rename Column; Rename Table; Rename View; Replace LOB With Table; Replace Column; Replace One-To-Many With Associative Tables; Replace Surrogate Key With Natural Key; Split Column; Split Table.

  • Data Quality Refactoring
A change which improves and/or ensures the consistency and usage of the values stored within the database.

Methods of Data Quality Refactoring category: Add Lookup Table; Apply Standard Codes; Apply Standard Type; Consolidate Key Strategy; Drop Column Constraint; Drop Default Value; Drop Non Nullable; Introduce Column Constraint; Introduce Common Format; Introduce Default Value; Make Column Non Nullable; Move Data; Replace Type Code With Property Flags.

  • Referential Integrity Refactoring
A change which ensures that a referenced row exists within another table and/or that ensures that a row which is no longer needed is removed appropriately.

Methods of Referential Integrity Refactoring category: Add Foreign Key Constraint; Add Trigger for Calculated Column; Drop Foreign Key Constraint; Introduce Cascading Delete; Introduce Hard Delete; Introduce Soft Delete; Introduce Trigger for History.

  • Transformation
A change which changes the semantics of your database schema by adding new elements to it or by modifying existing elements.

Methods of Transformation category: Insert Data; Introduce New Column; Introduce New Table; Introduce View; Update Data.

  • Method Refactoring
A change which improves the quality of a stored procedure, stored function, or trigger.

Methods of the Method Refactoring category: Parameterize Methods; Remove Parameter; Rename Method; Reorder Parameters; Replace Parameter with Explicit Methods; Consolidate Conditional Expression; Decompose Conditional; Extract Method; Introduce Variable; Remove Control Flag; Remove Middle Man; Replace Literal with Table Lookup; Replace Nested; Conditional with Guard Clauses; Split Temporary Variable; Substitute Algorithm.

In 2019 Vladislav Struzik supplemented the categories of database refactoring with a new one:[3]

  • Access Refactoring
A change which relates to data access.

Methods of the Access Refactoring category:[4][5] Change Authentication Attributes; Revoke Authorization Privileges; Grant Authorization Privileges; Extract Database Schema; Merge Database Schemas.

Process of database refactoring

The process of database refactoring is the act of applying database refactorings to evolve an existing database schema (database refactoring is a core practice of evolutionary database design). There are three considerations that need to be taken into account:

  1. How a single refactoring is implemented
  2. How database refactorings are tracked and shared within organizations
  3. How a series of database refactorings are applied

See also

References

  1. ^ Scott Ambler, Pramod Sadalage Refactoring Databases: Evolutionary Database Design - Addison-Wesley Professional; 1st edition (March 3, 2006) - 384 p. - ISBN 978-0321774514
  2. ^ Scott Ambler Catalog of Database Refactorings - Agile Data - URL: http://agiledata.org/essays/databaseRefactoringCatalog.html
  3. ^ Струзік, В. А. Категорія рефакторинг доступу / В. А. Струзік // Комп’ютерні науки, інформаційні технології та системи управління : Міжнародна науково-технічна конференція студентів, аспірантів та молодих вчених, 27–29 листопада 2019 р. – Івано-Франківськ : Прикарпатський національний університет ім. Василя Стефаника, 2019. – С. 20-21. URL: http://dspace.nuft.edu.ua/jspui/handle/123456789/31516
  4. ^ Струзік, В. А. Категорія рефакторинг доступу / В. А. Струзік, С. В. Грибков, В. В. Чобану // Наукові праці НУХТ. – Т. 26, № 2. – 2020. – С. 31–49. URL: http://dspace.nuft.edu.ua/jspui/handle/123456789/31515
  5. ^ Vladislav Struzik, PhD Refactoring: yesterday, today, tomorrow. URL: https://medium.com/@struzik/refactoring-yesterday-today-tomorrow-7fc8c845cfb1