Do you love your configuration files?
Let’s ask a hypothetical developer:
I don’t mind them that much – they are a bit annoying, but I don’t have to deal with them that often and the users don’t seem to complain about them.
And now a hypothetical operator:
I HATE those $&#%*+@ config files! They make my life miserable on a daily basis!
What would a DevOpeler say?
What config files? We haven’t used those for years!
Let’s examine those attitudes a bit, shall we?
Why config files create problems
A local configuration file, like .NET’s “app.config”, is the simplest place to keep read-only configuration properties the app needs. Typically, such properties include database connection strings, external service host names, timer intervals, and error email addresses, to name a few. The solution works just fine when there are only a few application instances running in production. However, it quickly runs into problems as it scales up. Changing a config property across the organization is easy when it means manually tweaking a couple of config files. It is daunting when it means changing hundreds or thousands of files. It becomes terrifying when it has to be done under pressure (for example, an emergency cutover to a backup database or service). If there are multiple applications built by separate teams, those files may not even store the value in the same format. Many of the values differ across environments, so every deployment becomes another exercise in XML-munging. Auditing configuration firm-wide is also a huge challenge, as the needed information is scattered across all production machines (generating a report requires hitting them all). Automation is, as always, a way to dull the pain, but even the best XML-hacking scripts can’t make this flawed system completely reliable or give peace of mind to the operators responsible for it.
Fundamentally, the config-file-based approach is flawed because it breaks a bedrock principle of software development – DRY – don’t repeat yourself. All those applications that connect to the same database – why do they all need a personal invitation to connect to it? A better solution would allow us to specify that data in a single place, and ask the applications to query the single source of truth for the data they need.
Why developers don’t care
Even if the operations team complains loudly about this issue, odds are that the development team will not be bothered to fix it. Developers don’t feel the pain and stress associated with pushing out a config change via a Perl script across hundreds of machines. Developers rarely have to interact with the config file – perhaps there is some pain associated with initial setup, but after that they only need to make minor modifications. Perhaps most importantly, there are always “more important” things to be working on than making the deployment and configuration process easier on the operations team. The users don’t need to deal with the configuration files, and operations has some killer scripts to deal with the files, so who cares?
Configuration the DevOps way
This is perhaps the archetypal problem that can be solved by DevOps – cross-team collaboration that produces more maintainable software. The right approach here is not more automation on the ops side – it is a new approach to configuration on the dev side, influenced by the needs of ops.
Before proceeding, it should be noted that we have a very narrow definition of configuration data in mind here. We are talking exclusively about read-only configuration that the application uses to bootstrap itself. This includes things like database connection strings and excludes things like the user’s color preferences. User settings and other types of application data demand different solutions which we won’t discuss here.
The goal is to eliminate the config files entirely and create the “single source of truth” for configuration data that we alluded to earlier. This goal requires us to create a central configuration service that will store the data, allow ops to edit it, and make it available to applications. Existing applications will need to change to query the service for configuration data instead of the local XML file.
The central configuration service we need is effectively a key-value store with one very important twist. In a traditional key-value store, there is one value for each key. However, we could need several values for the same key – one for production and one for development; one for John and one for Jane; one for the UI and one for the server. We need more than a key-value store: we need a context-aware key-value store that can serve up different values for a key based on the “context” of the request. With such a system, we could define one database connection string for production, one for dev, and one for the DBA who is working on schema modifications. We can default a “feature toggle” to false, override it to true for a beta tester, then switch it to true at the default level when we’re ready to roll it out. We can change a connection timeout across the entire organization by changing a single value in the central service.
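To make the idea concrete, here is a minimal in-memory sketch of a context-aware key-value store. Everything in it is an assumption made for illustration: the class name ContextStore, the keys, the context fields (environment, user), and the rule that the override matching the most context fields wins. A real service would also need persistence, access control, and an editing interface for ops.

```python
class ContextStore:
    """Illustrative context-aware key-value store (hypothetical, in-memory)."""

    def __init__(self):
        # key -> {frozenset of (context field, expected value) pairs: value}
        self._entries = {}

    def set(self, key, value, **conditions):
        """Store a value that applies whenever all `conditions` match.

        Calling set() with no conditions defines the default for the key.
        """
        self._entries.setdefault(key, {})[frozenset(conditions.items())] = value

    def get(self, key, **context):
        """Return the most specific value whose conditions match `context`."""
        candidates = [
            (len(conditions), value)
            for conditions, value in self._entries.get(key, {}).items()
            if all(context.get(name) == expected for name, expected in conditions)
        ]
        if not candidates:
            raise KeyError(key)
        # The override with the most matching conditions is the most specific.
        return max(candidates, key=lambda candidate: candidate[0])[1]


store = ContextStore()

# One connection string per environment (values are made up).
store.set("db.connection", "Server=prod-db", environment="production")
store.set("db.connection", "Server=dev-db", environment="development")

# A feature toggle that defaults to off but is on for one beta tester.
store.set("feature.new_ui", False)
store.set("feature.new_ui", True, user="jane")

jane = store.get("feature.new_ui", environment="production", user="jane")
john = store.get("feature.new_ui", environment="production", user="john")
```

In this sketch jane is True and john is False. Rolling the feature out to everyone is a single call, store.set("feature.new_ui", True), which replaces the default without touching any application.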
From the point of view of application code, the configuration service looks just like a simple key-value store (you provide a key, you are returned a value). The client API determines the “context” and adds this to the query sent to the service. The “context” can include things like the environment, the application name, the “instance” of the application, the user name, the machine name, or perhaps something else that makes sense for your organization and architecture. It is also critical that the client API maintain a local file-based backup of configuration data so that the application can fall back to that local data in the event that the central service becomes unavailable. Finally, it is important that no configuration values live with or in the application code: the code should not provide or have knowledge of a “default” value for any configuration data, and all configuration must be fetched from the service. Providing values in code violates DRY and undermines our single source of truth.
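The client side can be sketched in the same spirit. The names below (ConfigClient, the fetch callable, the JSON backup file) are hypothetical, and the transport is deliberately left out: fetch stands in for whatever call reaches your central service and is assumed to raise OSError (for example ConnectionError) when the service is unreachable.

```python
import json
from pathlib import Path


class ConfigClient:
    """Illustrative client: simple get(key) API with a local file backup."""

    def __init__(self, fetch, context, backup_path):
        # fetch(key, context) -> value is a stand-in for the real service call.
        self._fetch = fetch
        # The client, not the application code, supplies the context.
        self._context = dict(context)
        self._backup_path = Path(backup_path)

    def get(self, key):
        """Return the value for `key`, falling back to the local backup."""
        try:
            value = self._fetch(key, self._context)
        except OSError:
            # Service unavailable: use the last value we saw for this key.
            # There is deliberately no in-code default; an unknown key fails.
            return self._load_backup()[key]
        backup = self._load_backup()
        backup[key] = value
        self._backup_path.write_text(json.dumps(backup))
        return value

    def _load_backup(self):
        """Read the backup file, treating a missing file as empty."""
        try:
            return json.loads(self._backup_path.read_text())
        except FileNotFoundError:
            return {}
```

Application code would then call something like client.get("db.connection") and never see the context or the backup file. Note that a key that has never been fetched raises KeyError while the service is down, which is consistent with the rule that the code carries no default values.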
A mantra of mainstream DevOps is “configuration as code”. We think this is a bit misguided. Configuration should not be an operational concern that requires special code or tools to automate. Instead, configuration should be a first-class citizen in your architecture, and configuration management should be about managing your data, not managing a suite of automation scripts. Doing configuration right requires your application to change. This is central to our view of DevOps: operational problems should be solved by the software, not thrown over the wall for ops to automate.