Avoid Structured Configuration Data

Previously, we discussed the benefits of getting configuration data out of local XML files and into a centralized configuration service, ideally using a context-aware key-value store. Today we will discuss how to shape the configuration data that goes into the central service.

Recall that we have a very narrow definition of “configuration” in mind. From an application’s perspective, configuration data is read-only – such data is maintained externally (probably by your operations staff). Any data that the application wishes to change (such as user preferences) does not fall under our “configuration” umbrella. Configuration data is typically used only at application start-up for bootstrapping purposes. Examples include database or service connection information, timeouts, feature switches, and performance tuning settings.

What should our properties and values look like? An appealing approach for object-oriented programmers is to group related settings together and store them as one object, serialized as XML or JSON. Under this approach, we might store connection settings for a service this way:

service.connection.info = 
{
    Address = 123.456.78.90
    Port = 1337
    TimeoutMs = 5000
    FailureStrategy = Retry
}

This is convenient because an application can request a single property, deserialize it to a strongly-typed object, and have everything it needs to connect to the service.
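To make the blob approach concrete, here is a minimal Python sketch of that deserialization step. It assumes the value is stored as JSON; the `ConnectionInfo` class and `parse_connection_info` function are hypothetical names of our own, not part of any particular configuration service.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionInfo:
    """Strongly-typed view of the blob (hypothetical type for illustration)."""
    address: str
    port: int
    timeout_ms: int
    failure_strategy: str


def parse_connection_info(raw: str) -> ConnectionInfo:
    """Deserialize the JSON text of the single blob property into a typed object."""
    data = json.loads(raw)
    return ConnectionInfo(
        address=data["Address"],
        port=int(data["Port"]),
        timeout_ms=int(data["TimeoutMs"]),
        failure_strategy=data["FailureStrategy"],
    )


# Assumed JSON form of the service.connection.info value shown above.
raw_value = (
    '{"Address": "123.456.78.90", "Port": 1337, '
    '"TimeoutMs": 5000, "FailureStrategy": "Retry"}'
)
info = parse_connection_info(raw_value)
```

Note that every caller receives all four fields, whether or not it uses them.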

However, there are problems with this approach. They fall into two basic categories.

First, not every application or component needs all the data stored in the blob. Perhaps the failure strategy and timeout are handled by a different class than the one that creates the socket connection (or perhaps they will be in the future). The server needs to know the port, but it doesn’t need to know the address (since it can’t control its own IP address). Putting all the data into a single blob therefore violates the single responsibility principle, because the blob has more data than its clients really need.

Second, combining the data into a blob makes it harder to share the data in multiple contexts. It is very likely that the server address will be different in dev than it is in prod. However, it is much less likely that the other properties will vary across environments, if they vary at all. If the data is a single blob, we must copy-paste the entire blob when we wish to override any part of it in a different context.

Instead of condensing the data into a blob stored as a single property value, we suggest storing each component as a separate property:

service.address = 123.456.78.90
service.port = 1337
service.timeout.ms = 5000
service.failure.strategy = Retry
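With separate properties, a context such as dev only needs to override the one value that actually differs. The following Python sketch illustrates the idea under stated assumptions: the in-memory `STORE`, the "default" context name, and the `get_property` function are all hypothetical stand-ins for whatever your configuration service provides.

```python
from typing import Dict, Tuple

DEFAULT_CONTEXT = "default"

# Hypothetical store contents: (context, property name) -> value.
# Only the address is overridden for dev; everything else is shared.
STORE: Dict[Tuple[str, str], str] = {
    (DEFAULT_CONTEXT, "service.address"): "123.456.78.90",
    (DEFAULT_CONTEXT, "service.port"): "1337",
    (DEFAULT_CONTEXT, "service.timeout.ms"): "5000",
    (DEFAULT_CONTEXT, "service.failure.strategy"): "Retry",
    ("dev", "service.address"): "localhost",
}


def get_property(name: str, context: str) -> str:
    """Return the value of `name` in `context`, falling back to the default context.

    Raises KeyError if the property is not defined in either place.
    """
    try:
        return STORE[(context, name)]
    except KeyError:
        return STORE[(DEFAULT_CONTEXT, name)]
```

Here `get_property("service.address", "dev")` picks up the dev override, while `get_property("service.port", "dev")` falls through to the shared default. Under the blob approach, the dev entry would have had to repeat the port, timeout, and failure strategy as well.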

This approach fixes the single responsibility and sharing problems we discussed above, but it is not without its drawbacks. The biggest challenge is keeping the list of properties (which will be much larger than it would be under the blob approach) easy to navigate. This challenge can be addressed by adopting and adhering to a strict naming convention with the following characteristics:

In a sorted list of all properties, properties that are related should be near each other. This means that property names should become more specific as they read from left to right. For example, prefer “order.service.address” to “address.of.order.service”.
The delimiter between words in property names should be consistent across all your properties. Having a smattering of ‘-’, ‘_’, and ‘.’ in your property names will slowly drive you insane. Pick one of the delimiters and stick with it. Consider enforcing this convention programmatically if you can.
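One way to enforce the delimiter rule programmatically is a small check that runs before new properties are accepted. This is only a sketch: the exact pattern (lowercase alphanumeric words separated by single dots) is our assumption about what a convention might look like, and `find_invalid_names` is a hypothetical helper.

```python
import re
from typing import Iterable, List

# Assumed convention: lowercase alphanumeric words, each starting with a
# letter, separated by single dots (for example "order.service.address").
PROPERTY_NAME = re.compile(r"[a-z][a-z0-9]*(\.[a-z][a-z0-9]*)*")


def find_invalid_names(names: Iterable[str]) -> List[str]:
    """Return the property names that do not follow the assumed convention."""
    return [name for name in names if PROPERTY_NAME.fullmatch(name) is None]


candidates = [
    "order.service.address",
    "order-service.port",
    "order_service.timeout.ms",
    "service.timeout.ms",
]
invalid = find_invalid_names(candidates)
```

In this example `invalid` contains the two names that use ‘-’ or ‘_’ as a delimiter. A check like this cannot tell whether a name reads from general to specific, so the ordering rule still relies on review.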

Avoid the temptation to create a taxonomy for your configuration data by shoving related groups of data into structured objects. Keeping each property simple and separate will maximize your flexibility going forward.