Modular data and SQL Server
Posted 29 Oct 2024
Modular architectures like vertical slices and modular monoliths are nice ways to split up a large application. Smaller modules are easier to reason about and maintain than a single, large monolith. Modular software architecture also requires a strategy for persisting data along the same module boundaries.
There are a slew of data storage options available today, but most teams are comfortable with relational databases. They are proven technology, and that makes them a sensible choice for many companies. So, what is the most effective way to employ SQL databases with modular architectures?
Well, it depends. The solution I present here is based on the following constraints:
- A single, small team building and operating the system
- The team is proficient in SQL but prefers tooling support to write and test SQL
- Lower operational cost is better
- Modules don't need to scale all that much and are usually not independent of each other
- Multiple modules can be built around a shared set of data, a domain.
- The team should be able to build, test, and deploy each module independently.
- Each module owns a part of the domain and should be able to write that data.
- Related modules should be able to read some of the data from other modules.
- A fast and easy development inner loop is essential
- Robust deployment with data safety checks
Architecture
While it may be tempting to look at micro-services and follow the practice of strict data separation and independent scalability, that doesn't make sense in the majority of line-of-business applications. The cost and overhead of a database per module and syncing shared data around would make it nearly impossible for a small team to run a line-of-business application.
On the other hand, we do want the data storage to follow the same modular structure the rest of the system has, and not fall into the spaghetti trap.
SQL uses schemas to provide modular data. This allows multiple modules to reside in the same database, and even reference data from another module. It also provides a logical barrier between modules.
Developing with schemas
But... there is also a requirement to develop and deploy modules independently, right? How does that work then when all these modules use the same database?
This is where SQL Server Data-Tier Application package, or dacpac for short, can help. Dacpacs have been around for a long time and are a very robust deployment technique for SQL Server databases.
Deployment tooling
A dacpac represents the desired state of a database or a schema within the database. The tooling around it is able to determine the delta between the current schema and the desired schema. From that, it can produce an update script or directly update the schema to the desired state.
It can raise an error when data would be lost, so the CI/CD pipelines will be halted to prevent accidental data loss.
The downside of dacpacs is that there are limited options for data migrations so you'll need to plan breaking changes and data migrations to the data carefully.
Development tooling
Back in the .Net framework days, there was a Database project in Visual Studio that provided good development support for working with dacpacs. For modern dotnet, there's an open-source, cross-platform alternative: MSBuild.Sdk.SqlProj
The inner loop
For developers to work effectively, they need to be able to fire up an environment on their development box and iterate quickly on it. If we combine the dacpacs with deployment straight into a containerized SQL Server instance, it becomes really easy to spin up a database.
If a module needs to read data from another module, the dacpac can be shared through a Nuget package. This enables us to reference other tables from the database project, verify the schema during builds, and also deploy the required dependency when deploying a local development database.
The same setup for local deployments can be used to set up test containers and run integration tests.
This enables a rich an robust development cycle that would be difficult to achieve with other technology.
The code
I've posted a working sample repository on GitHub. The repository has lots more info, including a How-To for creating new modules.
Leave a comment