Configuration Done Right

Do you love your configuration files?

Let’s ask a hypothetical developer:

I don’t mind them that much – they are a bit annoying, but I don’t have to deal with them that often and the users don’t seem to complain about them.

And now a hypothetical operator:

I HATE those $&#%*+@ config files! They make my life miserable on a daily basis!

What would a DevOpeler say?

What config files? We haven’t used those for years!

Let’s examine those attitudes a bit, shall we?

Why config files create problems

A local configuration file, like .NET’s “app.config”, is the simplest place to keep read-only configuration properties the app needs. Typically, such properties include database connection strings, external service host names, timer intervals, and error email addresses, to name a few. The solution works just fine when there are only a few application instances running in production. However, it quickly runs into problems as it scales up. Changing a config property across the organization is easy when it means manually tweaking a couple of config files. It is daunting when it means changing hundreds or thousands of files. It becomes terrifying when it has to be done under pressure (an emergency cutover to a backup database or service, for example).

Those files may not even store the value in the same format if multiple applications are built by separate teams. Many of the values differ across environments, and hence deployments become another exercise in XML-munging. Auditing configuration firm-wide is also a huge challenge, as the needed information is scattered across all production machines (generating a report requires hitting them all). Automation is, as always, a way to dull the pain, but even the best XML-hacking scripts can’t make this flawed system completely reliable or give peace of mind to the operators responsible for it.

Fundamentally, the config-file-based approach is flawed because it breaks a bedrock principle of software development – DRY – don’t repeat yourself. All those applications that connect to the same database – why do they all need a personal invitation to connect to it? A better solution would allow us to specify that data in a single place, and ask the applications to query the single source of truth for the data they need.

Why developers don’t care

Even if the operations team complains loudly about this issue, odds are that the development team will not be bothered to fix it. Developers don’t feel the pain and stress associated with pushing out a config change via a Perl script across hundreds of machines. Developers rarely have to interact with the config file – perhaps there is some pain associated with initial setup, but after that they only need to make minor modifications. Perhaps most importantly, there are always “more important” things to be working on than making the deployment and configuration process easier on the operations team. The users don’t need to deal with the configuration files, and operations has some killer scripts to deal with the files, so who cares?

Configuration the DevOps way

This is perhaps the archetypal problem that can be solved by DevOps – cross-team collaboration that produces more maintainable software. The right approach here is not more automation on the ops side – it is a new approach to configuration on the dev side, influenced by the needs of ops.

Before proceeding, it should be noted that we have a very narrow definition of configuration data in mind here. We are talking exclusively about read-only configuration that the application uses to bootstrap itself. This includes things like database connection strings and excludes things like the user’s color preferences. User settings and other types of application data demand different solutions which we won’t discuss here.

The goal is to eliminate the config files entirely and create the “single source of truth” for configuration data that we alluded to earlier. This goal requires us to create a central configuration service that will store the data, allow ops to edit it, and make it available to applications. Existing applications will need to change to query the service for configuration data instead of the local XML file.

The central configuration service we need is effectively a key-value store with one very important twist. In a traditional key-value store, there is one value for each key. However, we could need several values for the same key – one for production and one for development; one for John and one for Jane; one for the UI and one for the server. We need more than a key-value store: we need a context-aware key-value store that can serve up different values for a key based on the “context” of the request. With such a system, we could define one database connection string for production, one for dev, and one for the DBA who is working on schema modifications. We can default a “feature toggle” to false, override it to true for a beta tester, then switch it to true at the default level when we’re ready to roll it out. We can change a connection timeout across the entire organization by changing a single value in the central service.
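To make the idea concrete, here is a minimal in-memory sketch of a context-aware lookup. Everything in it is an assumption made for illustration: the ContextStore class, the context field names (environment, user), the key names, and the rule that the entry matching the most context fields wins. A real service would also need persistence, an editing interface for ops, and a deliberate policy for ties.

```javascript
// Hypothetical sketch of a context-aware key-value store (not a real library).
// Each key can hold several values, each tagged with the context it applies to.
// An empty context ({}) acts as the default for every request.
class ContextStore {
  constructor() {
    this.entries = new Map(); // key -> Map(contextId -> { context, value })
  }

  // Build a stable identifier so the same context always overwrites itself.
  static contextId(context) {
    return Object.keys(context).sort().map((f) => `${f}=${context[f]}`).join('&');
  }

  set(key, value, context = {}) {
    if (!this.entries.has(key)) this.entries.set(key, new Map());
    this.entries.get(key).set(ContextStore.contextId(context), { context, value });
  }

  // Return the value whose context matches the request and is most specific
  // (assumption: "most specific" means "matches the most context fields").
  get(key, requestContext = {}) {
    let best = null;
    for (const entry of (this.entries.get(key) || new Map()).values()) {
      const fields = Object.keys(entry.context);
      const matches = fields.every((f) => entry.context[f] === requestContext[f]);
      if (matches && (best === null || fields.length > Object.keys(best.context).length)) {
        best = entry;
      }
    }
    if (best === null) throw new Error(`No value configured for "${key}"`);
    return best.value;
  }
}

const store = new ContextStore();
store.set('db.connection', 'Server=prod-db;Database=app', { environment: 'production' });
store.set('db.connection', 'Server=dev-db;Database=app', { environment: 'dev' });
store.set('db.connection', 'Server=schema-test;Database=app', { environment: 'dev', user: 'dba' });
store.set('feature.newCheckout', false);                         // default: off
store.set('feature.newCheckout', true, { user: 'beta-tester' }); // override for one user

console.log(store.get('db.connection', { environment: 'dev', user: 'jane' })); // Server=dev-db;Database=app
console.log(store.get('db.connection', { environment: 'dev', user: 'dba' }));  // Server=schema-test;Database=app
console.log(store.get('feature.newCheckout', { environment: 'production', user: 'jane' })); // false
```

Rolling the feature out to everyone is then a single call, store.set('feature.newCheckout', true), which replaces the default entry and leaves the overrides alone.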

From the point of view of application code, the configuration service looks just like a simple key-value store (you provide a key, you are returned a value). The client API determines the “context” and adds this to the query sent to the service. The “context” can include things like the environment, the application name, the “instance” of the application, the user name, the machine name, or perhaps something else that makes sense for your organization and architecture. It is also critical that the client API maintain a local file-based backup of configuration data so that the application can fall back to that local data in the event that the central service becomes unavailable. Also, it is important that no configuration values live with or in the application code: the code should not provide or have knowledge of a “default” value for any configuration data, and all configuration must be fetched from the service. Providing values in code violates DRY and undermines our single source of truth.
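A rough sketch of such a client, written for Node.js, might look like the following. The ConfigClient class, the injected fetchFromService function, the backup file location, and the context fields are all hypothetical. The transport is passed in as a plain synchronous function so the sketch runs without a network; a real client would most likely call the service asynchronously.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical client sketch. `fetchFromService(key, context)` stands in for
// the real call to the central service and is expected to throw when the
// service cannot be reached.
class ConfigClient {
  constructor({ fetchFromService, backupPath, context }) {
    this.fetchFromService = fetchFromService;
    this.backupPath = backupPath;
    this.context = context; // determined by the client, not by application code
  }

  get(key) {
    let value;
    try {
      value = this.fetchFromService(key, this.context);
    } catch (err) {
      return this.readFromBackup(key); // service unavailable: use local data
    }
    this.saveToBackup(key, value); // remember the last known good value
    return value;
  }

  loadBackup() {
    try {
      return JSON.parse(fs.readFileSync(this.backupPath, 'utf8'));
    } catch (err) {
      return {}; // no backup file yet, or it is unreadable
    }
  }

  saveToBackup(key, value) {
    const backup = this.loadBackup();
    backup[key] = value;
    fs.writeFileSync(this.backupPath, JSON.stringify(backup, null, 2));
  }

  readFromBackup(key) {
    const backup = this.loadBackup();
    if (Object.prototype.hasOwnProperty.call(backup, key)) return backup[key];
    // No default lives in code: without the service or a backup, fail loudly.
    throw new Error(`No value for "${key}": service unavailable and no local backup`);
  }
}

// Usage with a fake service that can be switched off.
let serviceUp = true;
const client = new ConfigClient({
  fetchFromService: (key, context) => {
    if (!serviceUp) throw new Error('service unreachable');
    return `${key} for ${context.application} in ${context.environment}`;
  },
  backupPath: path.join(os.tmpdir(), `config-backup-${process.pid}.json`),
  context: { environment: 'production', application: 'billing', machine: os.hostname() },
});

const live = client.get('db.connection');   // fetched from the service, then backed up
serviceUp = false;
const cached = client.get('db.connection'); // served from the local backup file
console.log(live === cached); // true
```

Note that the application code only ever calls get with a key. It neither builds the context nor supplies a fallback value, so a key that was never fetched and cannot be fetched now results in an error rather than a silent default.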

A mantra of mainstream DevOps is “configuration as code”. We think this is a bit misguided. Configuration should not be an operational concern that requires special code or tools to automate. Instead, configuration should be a first-class citizen in your architecture, and configuration management should be about managing your data, not managing a suite of automation scripts. Doing configuration right requires your application to change. This is central to our view of DevOps – operational problems should be solved by the software, not thrown over the wall for ops to automate.

 
Comments

(late here, but just answered your question on Stack and followed-through to your post here)

Good to see someone else thinking hard about it from a more holistic, dev-ops perspective … but like yourself, I have yet to see the problem cracked well or completely enough, or without requiring cumbersome or opinionated frameworks (and entire projects that try and solve this) that can often bring as much complexity and fragility (or consequently require or silo the problem space to a specialized role) as they try and solve in the first place.

Having been bitten with this problem in every language and environment in the last 20 years, and being very averse to adding new pre-deploy tools, processes and dependencies on our workflow, we are accomplishing these goals by using JSON (which is already very compact, readable and structure-able) and adding property-name annotations/suffixes to distinguish values that are environment (or any other arbitrary factor) -specific from those that are static/global … and then collapsing it (destructively and recursively) with a single function (at build, deploy, run or even render-time) according to any number of strings we feed it that match the annotated property/key names.

In a build/deploy environment (if, as in our case, NodeJS is available to host the JS function) the build script can be fed or self-determine factors about its environment that are passed to this function, and either collapse to a single config, or generate specific configuration files (ick, yes) but that at least originated from a single source of truth.

In our case, we started developing this originally to have a single REST API configuration for our JS-MVC applications, and feed both the hostname (and sometimes language) to the function, to dynamically select API endpoints and language or location-specific strings, but have extended it to internationalization, and server-side configuration as well.

Detailed answer here (http://stackoverflow.com/a/41292206/5670894) but short example sufficient for this post:

Pre-processed, annotated JSON:

const config = {
  'ver': '1.0',
  'help': {
    'BLURB': 'This pre-production environment is not supported. Contact Development Team with questions.',
    'PHONE': '808-867-5309',
    'EMAIL': 'coder.jen@lostnumber.com'
  },
  'help@www.productionwebsite.com': {
    'BLURB': 'Please contact Customer Service Center',
    'BLURB@fr': 'S\'il vous plaît communiquer avec notre Centre de service à la clientèle',
    'BLURB@de': 'Bitte kontaktieren Sie unseren Kundendienst!!1!',
    'PHONE': '1-800-CUS-TOMR',
    'EMAIL': 'customer.service@productionwebsite.com'
  }
};

Using our function, if, say, the request is on the production website for a user with German as their preferred browser language, prefer(config, ['www.productionwebsite', 'de']) collapses down to:

{
  'ver': '1.0',
  'help': {
    'BLURB': 'Bitte kontaktieren Sie unseren Kundendienst!!1!',
    'PHONE': '1-800-CUS-TOMR',
    'EMAIL': 'customer.service@productionwebsite.com'
  }
}
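For readers who want a feel for how such a collapse might work, here is a small sketch. It is not the function from the linked answer: the name collapseConfig and its rules are assumptions made for illustration. In particular, it treats everything after the first @ in a property name as the annotation, requires the annotation to equal one of the supplied strings exactly, lets an annotated value replace the plain one outright, and returns a new object instead of modifying the input. The real function may match and merge differently.

```javascript
// Hypothetical sketch of collapsing annotated config (not the linked implementation).
// Keys of the form "name@annotation" override "name" when the annotation
// exactly equals one of the supplied preference strings (assumed rule).
function collapseConfig(node, preferences) {
  if (Array.isArray(node)) return node.map((item) => collapseConfig(item, preferences));
  if (node === null || typeof node !== 'object') return node; // plain value

  const chosen = {};

  // Pass 1: un-annotated keys supply the defaults.
  for (const [key, value] of Object.entries(node)) {
    if (!key.includes('@')) chosen[key] = value;
  }

  // Pass 2: matching annotated keys replace the defaults, in preference order.
  // Annotated keys that match nothing are simply dropped.
  for (const preference of preferences) {
    for (const [key, value] of Object.entries(node)) {
      const at = key.indexOf('@');
      if (at !== -1 && key.slice(at + 1) === preference) {
        chosen[key.slice(0, at)] = value;
      }
    }
  }

  // Pass 3: collapse whatever was chosen, so nested annotations resolve too.
  const result = {};
  for (const [key, value] of Object.entries(chosen)) {
    result[key] = collapseConfig(value, preferences);
  }
  return result;
}

// Shortened sample data in the same shape as the example above.
const annotated = {
  ver: '1.0',
  help: { BLURB: 'Contact Development Team', PHONE: '808-867-5309' },
  'help@www.productionwebsite.com': {
    BLURB: 'Please contact Customer Service Center',
    'BLURB@de': 'Bitte kontaktieren Sie unseren Kundendienst',
    PHONE: '1-800-CUS-TOMR',
  },
};

console.log(collapseConfig(annotated, ['www.productionwebsite.com', 'de']));
console.log(collapseConfig(annotated, [])); // no preferences: defaults only
```

With both preferences supplied, help resolves to the production block and its BLURB to the German string; with none, only the un-annotated defaults survive. Either way no annotated key remains in the output.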

There are still lots of ways developers and dev-ops can mess this up, but it resolves the duplication, versioning (attributes added, renamed, etc.) and single-source-of-truth problem for us, while remaining flexible to use in a variety of scenarios.

The main caveat is that it’s JavaScript/JSON based, so if you don’t use or have NodeJS in your build environment, or are not using JavaScript as your front-end (depending on where you want/need to parse your config) … you’d need to figure out how to implement a similar function in your language of choice.
