Schema Design
Schema’s, the blueprint for every database, are the fundamental bases on which we dictate how to store data. Every schema is different and each is unique to fit the needs of each individual, company, or even user, but how we build our blueprints can affect a programmers ability to access the vital information held in it, and so to help in the making of a schema, these are some tips to build the best blueprint for you!
An important question to ask before making a design for a schema is to ask why you even need a design, and there are 4 essential answers. The first part is to reduce redundancies that can come from making duplicate data points or poor relationship structure from misplaced primary keys or lack thereof. Secondly is to provide access to information in a way that isn’t hindered due to relationships in the database. The third is to keep the accuracy and integrity of our data intact, essentially this boils down to maintaining the single source of truth in case the data gets changed at any point. Lastly, and most importantly is that the schema should meet YOUR needs.
After the why, comes the how, and this is the part where you lay out what you want. By this, I mean that it is important to think about what exactly you want from the schema be it a more effective way to store data or a faster way to access it. After thinking about what you want, then it's time to think about primary keys — where to put them and why you put them there. Primary keys form the majority of relationships built within the schema with foreign keys making up the rest of the majority. The placement of primary keys can heavily affect your ability to access certain information and how the database will cope with changes made to the information. This is why the single source of truth principle becomes important as one change can severely affect the database. Once you have the relationships built out, then it's time to refactor for data normalization. Data normalization is essentially the process of structuring a database with forms, which are rules for a specific outcome. The forms that will be mentioned in this article are normal form 1 (NF1) and normal form 2 (NF2). For NF1, the rules are that for every row and cell, only 1 value should be inputted. The rules for NF2 are just as simple except with a few changes. NF2 is just NF1, but every non-column key needs a primary key.
Lastly, to share an example, it is important to know that the complexity of a schema does not necessarily come from the length of it, but from the relationships formed in the schema. The first image represents an easy-looking schema as the relationship makes it easily accessible. However, the second image is much more complex due to the introduction of a third table with another relationship which can make queries difficult to execute depending on what you are accessing.
In conclusion, the way a schema develops depends solely on its creator and what rules they choose to follow.