Hashicorp Vault + Ansible + CD: open source infra, option 2

“How can we publish our server configuration scripts as open source code without exposing our secrets to the world?”

In my first take on this problem, I fell down the rabbit hole of Ansible’s Vault technology – a single-password-driven encryption implementation that encrypts whole files and demands they be decrypted by interactive input or static filesystem input at runtime. Not a bad first try, but it feels a little brittle (to changes in the devops team, to accidental inclusion in your git commits, or to division-of-labour concerns).

There’s another technology actively being developed for the devops world, by the Hashicorp project, also (confusingly/inevitably) called Vault. [I’ll call it HVault from here on, to distinguish from Ansible Vault >> AVault.]

HVault is a technology that (at least from a cursory review of the intro) promises to solve the brittle problems above. It’s an API-driven lockbox and runtime-proxy for all manner of secrets, making it possible to store and retrieve static secrets, provision secrets to some roles/users and not others, and create limited-time-use credentials for applications that have been integrated with HVault.

Implementation Options

So for our team’s purposes, we only need to worry about static secrets so far. There are two possible ways I can see us trying to integrate this:

  1. retrieve the secrets (SSH passphrases, SSL private keys, passwords) directly and one-by-one from HVault, or
  2. retrieve just an AVault password that then unlocks all the other secrets embedded in our Ansible YAML files (using reinteractive’s pseudo-leaf indirection scheme).

(1) has the advantage of requiring one fewer technology, which is a tempting decision factor – but it comes at the expense of creating a dependency/entanglement between HVault and our Ansible code (in naming and managing the key-value pairs for each secret) and of having to find/use a runtime solution for injecting each secret into the appropriate file(s).

(2) simplifies the problem of injecting secrets at runtime to a single secret (i.e. AVault can accept a script to insert the AVault password) and enables us to use a known quantity (AVault) for managing secrets in the Ansible YAMLs, but also means that (a) those editing the “secret-storing YAMLs” will still have to have access to a copy of the AVault password, (b) we face the future burden to plan for breaking changes introduced by both AVault and HVault, and (c) all secrets will be dumped to disk in plaintext on our continuous deployment (CD) server.
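To make option (2) a little more concrete, here is a minimal sketch of the kind of script AVault could call to obtain its password from HVault. It assumes HVault's HTTP API with a version 1 key-value store (where the secret's fields sit directly under "data" in the JSON response; a version 2 store nests them one level deeper and uses a different path). VAULT_ADDR and VAULT_TOKEN are the usual HVault environment variables; AVAULT_SECRET_PATH, its default value and the "password" field name are made up for illustration.

```python
#!/usr/bin/env python3
"""Sketch: fetch the AVault password from HVault so Ansible never prompts for it."""
import json
import os
import urllib.request


def build_request(addr, token, secret_path):
    """Build a GET request for one secret in HVault's key-value store."""
    url = f"{addr.rstrip('/')}/v1/{secret_path.lstrip('/')}"
    return urllib.request.Request(url, headers={"X-Vault-Token": token})


def extract_password(response_body, field="password"):
    """Pull one field out of the JSON envelope (KV version 1 layout assumed)."""
    payload = json.loads(response_body)
    return payload["data"][field]


def fetch_avault_password(environ=os.environ, urlopen=urllib.request.urlopen):
    """Read the AVault password from HVault using settings from the environment."""
    request = build_request(
        environ["VAULT_ADDR"],
        environ["VAULT_TOKEN"],
        # Hypothetical variable and default path; use whatever naming scheme you settle on.
        environ.get("AVAULT_SECRET_PATH", "secret/ansible/vault_password"),
    )
    with urlopen(request, timeout=10) as response:
        return extract_password(response.read())
```

Saved as an executable file that ends with print(fetch_avault_password()), this could be handed to Ansible via --vault-password-file, which runs an executable password file and reads the password from its stdout. Note that it does nothing about the bootstrapping problem: VAULT_TOKEN still has to live somewhere.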

Thoughts on Choosing For Our Team

Personally, I favour (1) or even just using AVault alone. While the theoretical “separation of duties” potential for AVault + HVault is supposed to be more attractive to a security geek like me, this just seems like needless complexity for effectively very little gain. Teaching our volunteers (now and in the future) how to manage two secrets-protecting technologies would be more painful, and we double the risks of dealing with a breaking change (or loss of active development) for a necessary and non-trivially-integrated technology in our stack.

Further, if I had to stick with one, I’d stay “single vendor” and use AVault rather than spread us across two projects with different needs & design philosophies. Once we accept that there’s an occasional “out of band initialization” burden for setting up either vault, and that we’d likely have to share access to larger numbers of secrets with a wider set of the team than ideal, I think the day-to-day management overhead of AVault is no worse (and possibly lighter) than HVault.

Pseudo-Solution for an HVault-only Implementation

Assuming for the moment that we proceed with (1), this (I think) is the logical setup to make it work:

  • Set up an HVault instance
  • Design a naming scheme for secrets
  • Populate HVault with secrets
  • Install Consul Template as a service
  • Rewrite all secret-containing Ansible YAMLs with Consul Template templating variables (matching the HVault naming)
  • Rewrite CD scripts to pull HVault secrets and rewrite all secret-containing Ansible YAMLs
  • Populate the HVault environment variables to enable CD scripts to authenticate to HVault
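As a rough illustration of the naming-scheme and rewrite steps above, here is a small sketch that stands in for Consul Template (it is not Consul Template, and the placeholder syntax is invented). It assumes secrets are named by environment, role and name, and that the secret-containing YAMLs carry placeholders of the form <<secret:path>> that the CD script fills in just before a deployment.

```python
import re

# Hypothetical placeholder syntax, e.g. <<secret:secret/prod/web/db_password>>
PLACEHOLDER = re.compile(r"<<secret:([A-Za-z0-9_/\-]+)>>")


def secret_path(environment, role, name):
    """One possible naming scheme: secret/<environment>/<role>/<name>."""
    return f"secret/{environment}/{role}/{name}"


def render(template_text, lookup):
    """Replace every placeholder with the value returned by lookup(path).

    lookup is any callable that maps a secret path to its value, for example
    a function that queries HVault. An unknown path should raise rather than
    silently leave a placeholder in a deployed file.
    """
    return PLACEHOLDER.sub(lambda match: lookup(match.group(1)), template_text)


# Usage with an in-memory stand-in for HVault:
fake_hvault = {secret_path("prod", "web", "db_password"): "hunter2"}
rendered = render(
    "db_password: <<secret:secret/prod/web/db_password>>\n",
    fake_hvault.__getitem__,
)
```

The point of keeping the path-building in one function is that the HVault naming and the placeholders in the YAMLs cannot drift apart unnoticed, which is the entanglement risk mentioned for option (1).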

Operational Concerns

If the HVault instance is running on a server in the production infrastructure, can HVault be configured to only allow connections from other servers that require access to the HVault secrets? This would reduce the risk that knowledge of the HVault authentication token and address (as used here) would provide instant access to the secrets from anywhere on the Internet. This would be considered a defense-in-depth measure in case ip_tables and SSH protections could be circumvented to allow incoming traffic at the network level.

The HVault discussions about “flexibility” and “developer considerations” lead me to conclude that – for a volunteer team using part-time time slivers to manage an open source project’s infrastructure – HVault Cubbyhole just isn’t low-impact, fully-baked enough at this time to make it worth the extra development effort to create a full solution for our needs. While Cubbyhole addresses an interesting edge case in making on-the-wire HVault tokens less vulnerable, it doesn’t substantially mitigate (for us, at least) the bootstrapping problem, especially when it comes to a single-server HVault+deployment service setup.

Residual Security Issues

  • All this gyration with HVault is meant to help solve the problems of (a) storing all Ansible YAML-bound secrets in plaintext, (b) storing a static secret (the AVault password) in plaintext on our CD server, and (c) finding some way to keep any secrets from showing up in our github repo.
  • However, there’s still the problem of authenticating a CD process to HVault to retrieve secret(s) in the first place
  • We’re still looking to remove human intervention from standard deployments, which means persisting the authentication secret (token, directory-managed user/pass, etc) somewhere on disk (e.g. export VAULT_TOKEN=xxxx)
  • Whatever mechanism we use will ultimately be documented – either directly in our github repo, or in documentation we end up publishing for use by other infrastructure operators and those who wish to follow our advice


This is not the final word – these are merely my initial thoughts, and I’m looking forward to members of the team bringing their take to these technologies, comparisons and issues.  I’m bound to learn something and we’ll check back with the results.

Reading List

Intro to Hashicorp Vault: https://www.vaultproject.io/intro/

Blog example using HVault with Chef: https://www.hashicorp.com/blog/using-hashicorp-vault-with-chef.html

Example Chef Recipe for using HVault: https://gist.github.com/sethvargo/6f1a315094fbd1a18c6d

Ansible lookup module to retrieve secrets from HVault: https://github.com/jhaals/ansible-vault

Ansible modules for interacting with HVault: https://github.com/TerryHowe/ansible-modules-hashivault

Ansible Vault for an open source project: adventures in simplified indirection

“How can we publish our server configuration scripts as open source code without exposing our secrets to the world?”

It seemed like a simple enough mission. There are untold numbers of open source projects publishing directly to github.com; most large projects have secrets of one form or another. Someone must have figured out a pattern for keeping the secrets *near* the code without actually publishing them (or a key leading to them) as plaintext *in* the code, yes?

However, a cursory examination of tutorials on Ansible Vault left me with an uneasy feeling. It appears that a typical pattern for this kind of setup is to partition your secrets as variables in an Ansible Role, encrypt the variables, and unlock them at runtime with reference to a password file (~/.vault_pass.txt) [or an interactive prompt at each Ansible run *shudder*]. The encrypted content is available as an AES256 blob, and the password file… well, here’s where I get the heebie-jeebies:

  1. While AES256 is a solid algorithm, it still feels…weird to publish such files to the WORLD. Distributed password cracking is quite a thing; how ridiculous of a password would we need to have to withstand an army of bots grinding away at a static password, used to unlock the encrypted secrets? Certainly not a password that anyone would feel comfortable typing by hand every time it’s prompted.
  2. Password files need to be managed, stored, backed up and distributed/distributable among project participants. Have you ever seen the docs for PGP re: handling the master passphrase? Last time I remember looking with a friend, he showed me four places where the docs said “DON’T FORGET THE PASSPHRASE”. [Worst case, what happens if the project lead gets hit by a bus?]

I guess I was expecting some kind of secured, daemon-based query-and-response RPC server, the way Jan-Piet Mens envisioned here.


  • We have a distributed, all-volunteer team – hit-by-a-bus scenarios must be part of the plan
  • (AFAIK) We have no permanent “off-the-grid” servers – no place to stash a secret that isn’t itself backed up on the Internet – so there will have to be at least periodic bootstrapping, and multiple locations where the vault password will live

Concerns re: Lifecycle of Ansible Vault secrets:

  1. Who should be in possession of the master secret? Can this be abstracted or does anyone using it have to know its value?
  2. What about editing encrypted files? Do you have to decrypt them each time and re-encrypt, or does “ansible-vault edit” hand-wave all that for you?
    • Answer: no manual decrypt/re-encrypt cycle is needed. “ansible-vault edit” decrypts to a temporary file, opens that in your editor, transparently re-encrypts on save, and removes the temporary copy afterwards.
  3. Does Ansible Vault use per-file AES keys or a single AES key for all operations with the same password (that is, is the vault password a seed for the key or does it encrypt the key)?
    • Answer: not confirmed, but neither the source code nor the docs mention per-file encryption, and the encrypted contents do not appear to store an encrypted AES key, so it looks like one AES key per vault password.
  4. Where to store the vault password if you want to integrate it into a CD pipeline?
    • Answer: --vault-password-file ~/.vault_pass.txt OR EVEN --vault-password-file ~/.vault_pass.py, where the script sends the password to stdout
  5. Does anyone have a viable scheme that doesn’t require a privileged operator to be present during every deployment (--ask-vault-pass)?
    • i.e. doesn’t that mean you’re in danger of including ~/.vault_pass.txt in your git commit at some point? If not, where does that secret live?
  6. If you incorporate LastPass into your workflow to keep a protected copy of the vault password, can *that* be incorporated into the CD pipeline somehow?
  7. Are there any prominent OSS projects that have published their infrastructure and used Ansible Vault to publish encrypted versions of their secrets?

Based on my reading of the docs and blogs, it seems like this is the proffered solution for maximum automation and maintainability:

  • Divvy up all your secrets as variables and use pseudo-leaf indirection (var files referencing prefixed variables in a separate file) as documented here.
  • Encrypt the leaf-node file(s) using a super-complex vault password
  • Store the vault password in ~/.vault_pass.txt
  • Call all ansible and ansible-playbook commands using the --vault-password-file option
  • Smart: wire up a pre-commit step in git to make sure the right files are always encrypted as documented here.
  • Backup the vault password in a password manager like LastPass (so that only necessary participants get access to that section)
  • Manually deploy the .vault_pass.txt file to your Jenkins server or other CI/CD master and give no one else access to that server/root/file.
  • Limit the number of individuals who need to edit the encrypted file(s), and make sure they list .vault_pass.txt in their .gitignore file.
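The pre-commit idea in the list above could look something like the following sketch. It relies on the fact that files encrypted by Ansible Vault start with a "$ANSIBLE_VAULT;" header line; the "*vault.yml" naming convention for the leaf-node secret files is an assumption of mine, so adjust the pattern to whatever your repo uses.

```python
#!/usr/bin/env python3
"""Sketch of a git pre-commit check: refuse to commit unencrypted vault files."""
import fnmatch
import subprocess
import sys

VAULT_HEADER = b"$ANSIBLE_VAULT;"
SECRET_PATTERNS = ["*vault.yml"]  # hypothetical naming convention for secret files


def is_secret_file(path, patterns=SECRET_PATTERNS):
    """True if the path matches one of the secret-file naming patterns."""
    return any(fnmatch.fnmatchcase(path, pattern) for pattern in patterns)


def is_encrypted(content):
    """True if the file content begins with the Ansible Vault header."""
    return content.startswith(VAULT_HEADER)


def unencrypted_secret_files(staged):
    """staged maps each staged path to its staged content (bytes)."""
    return sorted(
        path for path, content in staged.items()
        if is_secret_file(path) and not is_encrypted(content)
    )


def read_staged_files():
    """Collect the staged (added/copied/modified) files and their staged content."""
    names = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        check=True, capture_output=True, text=True,
    ).stdout.splitlines()
    return {
        name: subprocess.run(
            ["git", "show", f":{name}"], check=True, capture_output=True
        ).stdout
        for name in names
    }


def main():
    """Return 1 (blocking the commit) if any secret file is staged in plaintext."""
    offenders = unencrypted_secret_files(read_staged_files())
    for path in offenders:
        print(f"refusing to commit unencrypted secrets file: {path}", file=sys.stderr)
    return 1 if offenders else 0
```

To use it as a hook, the file would be saved as .git/hooks/pre-commit, made executable, and ended with sys.exit(main()). Since hooks are not cloned with the repo, each contributor who edits the encrypted files would have to install it themselves.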

P.S. Next up – look into the use of Hashicorp’s Vault project.


Can non-Microsoft ERM (electronic rights management) be integrated into MOSS 2007?

Fascinating question: can an organization that has deployed MOSS 2007 plug in another ERM/IRM (Electronic Rights Management) technology into the MOSS back-end, so that documents downloaded from MOSS would be automatically protected with that non-Microsoft ERM technology?

MOSS 2007 (aka SharePoint 2007) provides integration with the Microsoft Information Rights Management (IRM) technology – any documents that are uploaded to an “IRM-enabled” Document Library will automatically be (encrypted and) protected with a specific IRM policy whenever that document is downloaded again.  This depends both on the Microsoft implementation of IRM (RMS) policies (known as “Information Management Policy” in the MOSS SDK) as well as the inclusion of the Microsoft IRM “lockbox” (security processor) library on the MOSS server farm.  As I understand it, the procedure is basically:

  1. MOSS receives the download request from a remote client
  2. MOSS looks up the information management policy that is associated with the document’s List or Content Type (depending where the policy is applied)
  3. MOSS calls an instance of the IRM security processor (installed with the RMS Client on the front-end servers) to (a) encrypt the document, (b) generate the IRM license based on the associated policy, and (c) encrypt the content encryption key with the appropriate RM Server’s public key.
  4. MOSS delivers the protected document to the remote client – otherwise the same way that it would deliver an unprotected document.

Guessing How Third-Party ERM Could Integrate Into MOSS

So theoretically, for a third-party ERM solution to properly intercept the steps in this sequence:

  • the MOSS server would have to request a method/API that is “pluggable”
  • the MOSS server would have to support the ability to “plug” alternative ERM policy services in place of the native Microsoft IRM policy services
  • the MOSS server would have to support the ability to “plug” an alternative security processor in place of the native Microsoft RM security processor
  • the ERM solution would have to implement the pluggable responder for the “policy lookup” service, as well as a replacement UI and business logic framework for the server-side ERM policy “creation/assignment” capability that MOSS provides for IRM
  • the ERM solution would have to support a thread-safe, multi-threaded, rock-solid-stable security processor that could run in a potentially high-volume server environment

Given how much effort Microsoft has gone to in the past couple of years (not without external incentives, of course) to make available and document the means for ISVs to interoperate with Microsoft client and server technologies, I’d figured there must be some “open protocol” documentation that describes how an ISV would create compatible ERM components to plug into the appropriate locations in a MOSS environment.

I scoured the SharePoint protocol specifications, but there were no specific protocol documents, nor any mention of “information management” in any of the overview documents.

There are some occasional references in the Microsoft Forums and elsewhere that hint at details that might be relevant to a third-party ERM plugin for MOSS, but I can’t tell if this is actually related or if I’m just chasing spectres:

Aha!  It Appears the Answer is “Yes”

(I thought about erasing and rewriting the above, but there’s probably someone somewhere who thinks the same way I do about this, so I’ll leave it and just share my new insight below.)

As always, I really should’ve started with the WSS 3.0 SDK and then branched out into the MOSS SDK and other far-off lands.

It turns out that the WSS SDK had the “secret” locked up in a page entitled “Custom IRM Protectors” (not to be confused with the forum post linked above).  My theory above wasn’t quite right, but it most closely resembled the “Autonomous Protector”:

Create an autonomous protector if you want the protector to have total control over how the protected files are rights-managed. The autonomous protector has full control over the rights-management process, and can employ any rights-management platform. Unlike the process with an integrated protector, when Windows SharePoint Services invokes an autonomous protector, it passes the specific rights that the user has to the document. Based upon these rights, an autonomous protector is responsible for generating keys for the document and creating rights-managed metadata in the correct format.

The autonomous protector and the client application must use the same rights-management platform.

So for a third-party ERM vendor to support an integrated experience in MOSS, while still using its non-Microsoft ERM client (i.e. not the Microsoft RMS Client), it would have to:

  • provide a COM component on each MOSS web server that implements the I_IrmProtector interface and an I_IrmPolicyInfo_Class object (analogous to my theorized “alternative ERM policy service”).
  • provide a rights management platform that protects (at the server) in a way that’s compatible with protections enforced by their rights management client (e.g. an alternative security processor available either locally or remotely from each MOSS web server)
  • override the default “integrated protectors” for Microsoft Office document types, and (presumably) support the ability to protect the Microsoft Office document types with the autonomous protector(s)

If I’m reading this right, then with a server-accessible rights management platform and one or more autonomous protectors, MOSS would be able to handle the rest of the required functionality: policy storage, UI, management interfaces (business logic), etc.

Now I wonder if anyone has actually implemented this support in their ERM solution…

EFS Certificate Configuration Updater tool is released!

After weeks of battling with Visual Studio over some pretty gnarly code issues, I’ve released the first version of a tool that will make IT admins happy the world over (well, okay, only those few sorry IT admins who’ve struggled to make EFS predictable and recoverable for the past seven years).

EFS Certificate Configuration Updater is a .NET 2.0 application that will examine the digital certificates a user has enrolled and will make sure that the user is using a certificate that was issued by a Certificate Authority (CA).

“Yippee,” I hear from the peanut gallery. “So what?”

While this sounds pretty freakin lame to most of the planet’s inhabitants, for those folks who’ve struggled to make EFS work in a large organization, this should come as a great relief.

Here’s the problem: EFS is supposed to make it easy to migrate from one certificate to the next, so that if you start using EFS today but decide later to take advantage of a Certificate Server, then the certs you issue later will replace the ones that were first enrolled. [CIPHER /K specifically tried to implement this.]

Unfortunately, there are some persistent but subtle bugs in EFS that prevent the automatic migration from self-signed EFS certificates to what are termed “version 2” certificates. Why are “version 2” certificates so special? Well, they’re the “holy grail” of easy recovery for encrypted files – they allow an administrator to automatically and centrally archive the private key that is paired with the “version 2” certificate.

So: the EFS Certificate Configuration Updater provides a solution to this problem, by finding a version 2 EFS certificate that the user has enrolled and forcing it to be the active certificate for use by EFS. [Sounds pretty simple eh? Well, there are plenty of organizations out there that go to a lot of trouble to try to do it themselves.]
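For the curious, the selection step can be pictured roughly like this. This is only an illustrative sketch, not the tool's actual code: the EfsCert record, its template_version field and the "latest expiry wins" tie-break are all assumptions made up for the example, and treating "issuer equals subject" as the sign of a self-signed certificate is a simplification.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Sequence


@dataclass(frozen=True)
class EfsCert:
    """Hypothetical, simplified view of an enrolled EFS certificate."""
    thumbprint: str
    subject: str
    issuer: str
    not_after: datetime
    template_version: int  # 2 for a "version 2" template certificate, 0 if none


def is_self_signed(cert: EfsCert) -> bool:
    """Simplification: a self-signed certificate names itself as its issuer."""
    return cert.subject == cert.issuer


def pick_preferred(certs: Sequence[EfsCert], now: datetime) -> Optional[EfsCert]:
    """Choose an unexpired, CA-issued version 2 certificate, or None if there is none.

    Assumed tie-break: prefer the certificate that expires last.
    """
    candidates = [
        cert for cert in certs
        if not is_self_signed(cert)
        and cert.template_version == 2
        and cert.not_after > now
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda cert: cert.not_after)
```

Returning None when nothing qualifies matters: in that case the right move is to enroll a version 2 certificate first rather than leave a self-signed one in charge.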

Even though this application fills a significant need, it doesn’t (at present, anyway) do everything that might be needed in all scenarios. The additional steps that you might need to cover include:

  • Enrolling a version 2 EFS certificate. [You can automate this with autoenrollment policy and the Windows Server 2003-based CA that is already in place for issuing v2 certificates and Key Archival.]
  • Updating EFS’d files to use the new certificate. [You can automate this by using CIPHER /U, but it’ll take a while if the user has a lot of encrypted files. The good news, however, is that the update only has to re-encrypt the FEK, not re-encrypt the entire file, so it’s much quicker than encrypting the same set of files from scratch.]
  • Ensuring that the user’s EFS certificate doesn’t expire before a new or renewed certificate is enrolled. [This is very easy to accomplish with Autoenrollment policy, but without the use of Autoenrollment, there is a significant risk that when the user’s preferred EFS certificate expires, the EFS component driver could enroll for a self-signed EFS certificate.]
  • Archiving unwanted EFS certificates. [This is different from deleting a digital certificate, which also invalidates the associated private key and is NOT recommended. Archiving keeps the certificates in the user’s certificate store and preserves the private key, so that any files encrypted with that old certificate are still accessible. This is hard to do from UI or script, but is a feature I’m hoping to add to the EFS Certificate Configuration Updater in the near future. This is also optional – it just minimizes the chances of a pre-existing EFS certificate being used if the preferred certificate fails for some reason.]
  • Publishing the user’s current EFS certificate to Active Directory. [This is also optional. It is only necessary to make it possible — though still hardly scalable — to use EFS to encrypt files for access by multiple users (see MSDN for more information). This can be automated during Autoenrollment, but some organizations choose to disable publishing a 2nd or subsequent EFS certificate since the EFS component driver may get confused by multiple EFS certificates listed for a single user in Active Directory.]
  • Synchronizing the user’s EFS certificate and private key across all servers where encrypted files must be stored. [This is not needed if you’re merely ensuring that all sensitive data on the user’s notebook/laptop PC is encrypted, so that the loss or theft of that PC doesn’t lead to a data breach. However, if you must also enforce EFS encryption on one or more file servers, the EFS Certificate Configuration Updater will not help at all in this scenario.]

Try it out — Tell your friends (you have friends who’d actually *use* this beast? Man, your friends are almost as lame as mine – no offense) — Let me know what you think (but no flaming doo-doo on my front porch, please). And have a very crypto-friendly day. 😉

Encrypting %TEMP% with EFS: software installation concerns

One of the biggest bones of contention in the use of EFS is whether to encrypt the user’s %TEMP% folder or not.  It starts off pretty innocuously: many applications create temporary files in the user’s %TEMP% directory, and often these files can contain the same sensitive data that is contained in the original data files the users are opening.  That means the %TEMP% folder should be encrypted, right?

Microsoft originally recommended that %TEMP% be encrypted when using EFS.  Then reports of application compatibility issues came in, which created new “don’t encrypt %TEMP%” advice that has lingered long after those issues stopped being a real problem for most customers.  And yet there are still varying opinions on this (e.g. here and here).

However, there’s one case that continues to dog those of us trying to enforce protection of sensitive data using EFS: software installation.  If I encrypt my %TEMP% folder and then try to install a bunch of applications myself (e.g. download and run the install files through the Windows UI), chances are I’ll find a few applications that either (a) won’t install (e.g. an older version of MSN Messenger had this problem) or (b) won’t work correctly after install (see this KB article for example).

While at Microsoft, I doggedly reported these app compat issues every time I ran into one, getting them fixed one by one (at least in MS apps).  Then I heard that the Windows Installer team had implemented a fix around the time that Vista shipped, and I figured we’d finally licked the problem.

However, there are recent KB articles (here and here) that indicate this is still a problem with Windows Vista and Office 2007.

So here’s one more attempt to clear up the confusion this issue creates, and provide definitive guidance on how to avoid problems with encrypted %TEMP%.  [John Morello got it right in a recent Technet article – but I suspect he may have cribbed this tip from some of the talks I’ve given over the years. ;)]

The only scenario in which installing software could fail due to encrypting the user’s %TEMP% folder is when all of the following are true:

  1. The software is being interactively installed by the user, not by a software distribution package (e.g. SMS, Tivoli, Altiris, etc.).
  2. The installer doesn’t understand EFS.  (e.g. the version of Windows Installer that shipped with Windows Vista knows to decrypt any encrypted folders it creates before handing off to the Windows Installer service running as SYSTEM.)
  3. The installer moves (rather than copies) the files that it unpacks into the %TEMP% directory.  (Moving encrypted files to an unencrypted directory will leave the files encrypted.)
  4. The %TEMP% folder is left encrypted while the install takes place.  You could distribute software installs with pre- and post-install actions that run simple command-line scripts to decrypt the %TEMP% folder before the install and re-encrypt it afterwards, e.g.:
         cipher.exe /D "%TEMP%"
         cipher.exe /E "%TEMP%"
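If the pre- and post-install actions from item 4 are driven from a script rather than from the packaging tool, a wrapper along these lines would make sure %TEMP% gets re-encrypted even when the install fails. It is a sketch only: the function name is made up, and it assumes cipher.exe is on the PATH and that the installer can be launched as a plain command line.

```python
import os
import subprocess


def install_with_temp_decrypted(installer_cmd, temp_dir=None, run=subprocess.run):
    """Decrypt the TEMP folder, run the installer, then re-encrypt the folder.

    installer_cmd is the installer command as a list, for example
    ["msiexec", "/i", "app.msi"]. run is injectable so the sequence can be
    exercised without touching a real machine.
    """
    temp_dir = temp_dir or os.environ["TEMP"]
    # Mark the folder as unencrypted so files the installer creates are plaintext.
    run(["cipher.exe", "/D", temp_dir], check=True)
    try:
        return run(installer_cmd, check=True)
    finally:
        # Always restore encryption on the folder, even if the install failed.
        run(["cipher.exe", "/E", temp_dir], check=True)
```

Passing the folder as a separate list element sidesteps the quoting problem with %TEMP% paths that contain spaces.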


  • If all software installs are performed by a software distribution system such as SMS, Tivoli, Altiris, then you should be safe encrypting %TEMP%.
  • If your users are on Windows Vista, and
    • If the software being installed is packaged with MSI or other EFS-aware installers, then
    • You should be safe encrypting %TEMP%
  • If your users aren’t on Windows Vista, and
    • If your users install software themselves (e.g. download and run MSI install files), and
      • You can’t edit the install packages for the software that your users need to install, then
      • You should not encrypt %TEMP%.
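The guidance above boils down to a small decision function. This is my own condensation, with two assumptions: being able to edit the install packages counts as safe (because of the pre- and post-install cipher steps), and any combination the list doesn't cover is treated conservatively as "don't encrypt".

```python
def safe_to_encrypt_temp(
    installs_via_distribution_system: bool,
    on_vista: bool,
    installers_efs_aware: bool,
    can_edit_install_packages: bool,
) -> bool:
    """Return True if encrypting %TEMP% should not break software installs."""
    if installs_via_distribution_system:
        # SMS, Tivoli, Altiris and the like do not install as the interactive user.
        return True
    if on_vista and installers_efs_aware:
        # MSI on Vista (or another EFS-aware installer) handles encrypted folders.
        return True
    if can_edit_install_packages:
        # Assumption: the packages get pre/post-install decrypt/encrypt steps.
        return True
    # Users self-install with installers that may not cope: leave %TEMP% alone.
    return False
```

For example, a shop where users download and run their own installers on pre-Vista Windows and nobody can repackage them gets False, which matches the last branch of the list.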

Hey, in the long term I hope this issue gets buried once and for all – either EFS will become so ubiquitous that customers will report these issues in droves, and all the installer ISVs will finally fix their apps (including backports to earlier versions of Windows).  Or, EFS will be supplanted by some future implementation of ubiquitous encryption, making the need for file-based encryption a moot point.  [I don’t see that in the next few years, but never say never.]

Windows Vista’s Full Volume Encryption & TPM, part 6: more oddball TPM 1.2 links

Semi-random links to information I’ve used as reference for some of my rambling thoughts…

Whew! Now back to your regularly scheduled surfing.

Recent Articles on Data Security

Summaries and comments on some [not-so-] recent articles that caught my attention…

It’s Audit Time. Do You Know Where Your Private Data Is?

  • data encryption is becoming more commonplace, especially on mobile devices
  • “full disk encryption” is fashionable, but the security of that encrypted data depends heavily on key management and authentication
  • A little more user education on “physical security” can help avoid the risks for which encryption is layered on thick and gooey
  • “California’s Office of Privacy Protection issued a clarification [of CSB 1386] that defined encryption as AES, the government’s official encryption system.”

Commentary: I’m in full agreement that “full disk encryption” is the easy answer to multiple regulatory burdens, and that key management (i.e. being able to recover lost or damaged keys – to be able to recover the data) and authentication (i.e. strength of the authentication that stands between the keyboard and the decryption keys) are vital.

If you encrypt your whole disk but have no way of recovering if the disk sector [or TPM storage location] where the keys are stored is damaged/erased, then chances are you’ll lose legitimate access to the data more often (user frustration) than you’ll grant illegitimate access to the data (data exposure).

Sure, the AES clarification in California isn’t legally binding, but any organization that ignores this now (especially with wide availability of AES encryption technologies – e.g. RMS, EFS in Windows XP SP1, PGP, Pointsec) would be more than foolish – in my mind, they’d be deliberately negligent [obligatory “IANAL” hereby stated].

[Note: the article is incorrect about which versions of Windows support AES in EFS – EFS uses the AES algorithm only in Windows XP, and AES is the default only at SP1 and later.]

Study: ID Theft from Data Breaches Rare

  • Press release regurgitation: analysis and findings from a vendor of risk management technology

Commentary: in the “department of duh” category, not all security breaches involving identity data (credit cards, passwords, social security numbers, account numbers) resulted in massive identity theft.

US moves forward on data privacy

  • Proposed Federal law not only mandates data privacy and security – but also requires oversight of outside organizations you pay to handle/manage/process that data
  • Mandatory notification is required as well
  • Penalties for non-compliance include significant fines and possible jail time for willful disregard
  • Also mentions two additional pieces of legislation cooking: the “Identity Theft Protection Act” & the “Data Accountability and Trust Act”

Commentary: about freakin’ time.

Bonus article!!
Q&A: ETrade CIO calls token-based authentication a success

Commentary: “success” is measured in the interviewee’s first answer: customers who have adopted the SecurID token for access to their ETrade accounts “are therefore willing to move more assets to us.” Security is not useful if it doesn’t positively affect the core business.

Do you have more interest in strong authentication issues? Hit the site http://www.secureidnews.com/.