Mashing the marvelous wrapper until it responds, part 1: prereq/setup

I haven’t used a dynamic language for coding nearly as much as strongly-typed, compiled languages, so approaching Python was a little nervous-making for me.  It’s not every day you look into the abyss of your own technical inadequacies and find a way to keep going.

Here’s how embarrassing it got for me: I knew enough to clone the code to my computer and to copy the example code into a .py file, but beyond that it felt like I was doing the same thing I always do when learning a new language: trying to guess at the basics of using the language that everyone who’s writing about it already knows and has long since forgotten, they’re so obvious.  Obvious to everyone but the neophyte.

Second, I don’t respond well to the canonical means of learning a language (at least according to all the “Learn [language_x] from scratch” books I’ve picked up over the years), which goes something like this:

  • Chapter 1: History, Philosophy and Holy Wars of the Language
  • Chapter 2: Installing The Author’s Favourite IDE
  • Chapter 3: Everything You Don’t Have a Use For in Data Types
  • Chapter 4: Advanced Usage of Variables, Consts and Polymorphism
  • Chapter 5: Hello World
  • Chapter 6: Why Hello World Is a Terrible Lesson
  • Chapter 7: Author’s Favourite Language Tricks

… etc.

I tend to learn best by attacking a specific, relevant problem hands-on – having a real problem I felt motivated to attack is how these projects came to be (EFSCertUpdater, CacheMyWork).  So for now, despite a near-complete lack of context or mentors, I decided to dive into the code and start monkeying with it.

Riches of Embarrassment

I quickly found a number of “learning opportunities” – I didn’t know how to:

  1. Run the example script (hint: install Python for your OS, make sure the python binary is on your current shell’s PATH, and don’t use the Windows Git Bash shell, as there’s some weird bug currently at work there)
  2. Install the dependencies (hint: run “pip install xxxx”, where “xxxx” is whatever shows up at the end of an error message like this:
    C:\Users\Mike\code\marvelous>python example.py
    Traceback (most recent call last):
      File "example.py", line 5, in <module>
        from config import public_key, private_key
    ImportError: No module named config

    In this example, I ran “pip install config” to resolve this error.

  3. Set the public & private keys (hint: there was some mention of setting environment variables, but it turns out that for this example script I had to paste them into a file named “config” – no, for Python the file needs to be named “config.py” even though it’s plain text, not a script you would run on its own – and make sure the config.py file is stored in the same folder as the script you’re running.  Its contents should look similar to this (no, these aren’t really my keys):
        public_key = 81c4290c6c8bcf234abd85970837c97 
        private_key = c11d3f61b57a60997234abdbaf65598e5b96

    Nope, don’t forget – when you declare a variable in most languages, and the variable is not a numeric value, you have to wrap the variable’s value in some type of quotation marks.  [Y’see, this is one of the things that bugs me about languages that don’t enforce strong typing – without it, it’s easy for casual users to forget how strings have to be handled]:

        public_key = '81c4290c6c8bcf234abd85970837c97' 
        private_key = 'c11d3f61b57a60997234abdbaf65598e5b96'
  4. Properly call into other classes in your code – I started to notice in Robert’s marvelous wrapper that his Python code would do things like this – the comic.py file defined
         class ComicSchema(Schema):

    …and the calling code would state

        import comic 
        … 
        schema = comic.ComicSchema()

    This was initially confusing to me, because I’m used to compiled languages like C# where you import the defined name of the Class, not the filename container in which the class is defined.  If this were C# code, the calling code would probably look more like this:

        using ComicSchema;
        …
        Schema schema = new ComicSchema();

    (Yes, I’m sure I’ve borked the C# syntax somehow, but for sake of this sad explanation, I hope you get the idea where my brain started out.)

    I’m inferring that for a scripted/dynamic language like Python, the Python interpreter doesn’t have any preconceived notion of where to find the Classes – it has to be instructed to look at specific files first (import comic, which I’m guessing implies import comic.py), then further to inspect a specified file for the Class of interest (schema = comic.ComicSchema(), where comic. indicates the file to inspect for the ComicSchema() class).
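
    For anyone else arriving from C#, here’s a minimal sketch of both styles – qualifying the class through the module (file) name, and importing the class name directly, which reads closer to the C# habit:

        # comic.py defines: class ComicSchema(Schema): ...

        # Style 1: import the module (the comic.py file) and qualify the class
        import comic
        schema = comic.ComicSchema()

        # Style 2: import the class name itself -- closer to C#'s "using" habit
        from comic import ComicSchema
        schema = ComicSchema()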

Status: Learning

So far, I’m feeling (a) stupid that I have to admit these were not things with which I sprang from the womb, (b) grateful Python’s not *more* punishing, (c) smart-ish that fundamental debugging is something I’ve still retained and (d) good that I can pass along these lessons to other folks like me.


Coding Again? Experimenting with the Marvel API

I’ve been hanging around developers *entirely* too much lately.

These days I find myself telling myself the story that unless I get back into coding, I’m not going to be relevant in the tech industry any longer.

Hanging out (aka volunteering) at developer-focused conferences will do that to you.

Volunteering on open source projects will do that to you (jQuery Foundation’s infrastructure team).

Interviewing for engineering-focused Product Owner and Technical Product Manager roles will do that to you. (Note: when did “technical” become equivalent to “I actively code in my day job/spare time”?)

One of the hang-ups I have that keeps me from investing the immense amount of grinding time it takes to make working code is that I haven’t found an itch to scratch that bugs me enough that I’m willing to commit myself to the effort. Plenty of ideas float their way past my brain, but very few (like CacheMyWork) get me emotionally engaged enough to surmount the activation energy necessary to fight alone past all the barriers: lonely nights, painful problem articulation, lack of buddy to work on it, and general frustration that I don’t know all the tricks and vocabulary that most good coders do.

Well, it finally happened. I found something that should keep me engaged: creating a stripped-down search interface into the Marvel comics catalogue.  Marvel.com provides a search on their site but I:

  1. keep forgetting where they buried it,
  2. find it cumbersome and slow to use, and
  3. never know if the missing references (e.g. appearances of Captain Marvel as a guest in others’ comics that aren’t returned in the search results) are because the search doesn’t work, or because the data is actually missing

Marvel launched an API a couple of years ago – I heard about it at the time and felt excited that my favourite comics publisher had embraced the Age of APIs, but I didn’t feel like doing anything with it back then.

Fast forward two years: I’m a diehard user of Marvel Unlimited, my comics reading is about half-Marvel these days, and I’m spending a lot of time trying to weave together a picture of how the characters relate, when they’ve bumped into each other, what issue certain happenings occurred in, etc.

Possible questions I could answer if I write some code:

  • How socially-connected is Spidey compared with Wolverine?
  • When is the first appearance of any character?
  • What’s the chronological publication order of every comic crossover in any comics Event?

Possible language to use:

  • C# (know it)
  • F# (big hawtness at the .NET Fringe conf)
  • Python (feel like I should learn it)
  • TypeScript (like ES6 JavaScript, but with static types and other frustration-killers)
  • ScriptCS (a scriptable C#)

More important than choice of language, though, is availability of wrappers for the API – while I’m sure it would be very instructive to immediately start climbing the cliff of building “zero tech” code, I learn far faster when I have visible results than when I’m still fiddling with getting the right types for my variables or trying to remember where and when to set the right kind of closing braces.

So for sake of argument I’m going to try out the second package I found – Robert Kuykendall’s “marvelous” python wrapper: https://github.com/rkuykendall/marvelous
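
For contrast, here’s roughly what a wrapper should save me from writing: Marvel’s documented auth handshake, which wants a timestamp plus an md5 hash of (ts + private key + public key) on each request. A minimal sketch using the requests library – the keys are placeholders and error handling is omitted:

    import hashlib
    import time

    import requests

    public_key = "your_public_key"    # placeholders, not real keys
    private_key = "your_private_key"

    # Marvel's documented server-side auth: hash = md5(ts + privateKey + publicKey)
    ts = str(int(time.time()))
    digest = hashlib.md5((ts + private_key + public_key).encode("utf-8")).hexdigest()

    resp = requests.get(
        "https://gateway.marvel.com/v1/public/comics",
        params={"ts": ts, "apikey": public_key, "hash": digest},
    )
    print(resp.json()["data"]["results"][0]["title"])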

See you when I’ve got something to report.

Goodbye Evernote, you doublespeak snake

I got the note recently from Evernote, much like all my friends, that starts:

At Evernote, we are committed not only to making you as productive as you can be, but also to running our business in as transparent a way as possible. We’re making a change to our Basic service, and it’s important that you know about it.

They go on to drop the bombshell that they’re raising prices on the paid plans, changing feature allocation, and most egregiously, putting new limitations on the Basic (free) accounts. Limiting free users to only having two devices is a serious attack on all those who’ve been working under the assumption that we should continue to invest our memories, data and daily workflow into their platform with the knowledge that we’d always be able to use it everywhere we were.

My PM Perspective

It’s obvious that Evernote’s VCs are putting the screws to them – either “growth in user signups” is no longer what they’re after (top-line revenue growth is), or (more likely) Evernote’s growth in new users is fast waning (the market has matured), and they need to start farming those millions of naïve suckers for increased revenue.

Either way, the revenue play couldn’t be more obvious: make it nearly laughably inconvenient for most of their users, and at the same time boost the pricing on their subscription plans. Dangerous move, kids.

In one way I have to hand it to the Product Manager(s) who came up with this scheme – this is the boldest move they could make to convert free users over. I have to believe that they looked at their user base’s number of connected devices before deciding that “2” would be the magic tipping point (i.e. most users have at least three frequently-used devices on which they use Evernote).

It’s unfortunate that they had to rush the subscription price increase at the same time, however; had they waited another 6 months to let the hordes rush over to their more affordable plans, and *then* raised prices, they might not have caught our ire quite so widely.

Here’s a more winning alternative: convert me to one of the existing paid subscriptions, let me get used to the new normal, settle in to my old habits, let my addiction to the convenience of Evernote at these lower prices take hold again, and *then* ask for a little more money (see Netflix’s recent move to boost streaming prices by a couple of dollars, without changing features *or* the DVD subscription pricing).

I’m gladly giving Netflix the extra dollars, as I currently feel *happy* and agreeable to them, even though I don’t particularly feel like they should need more money for an increasingly-crappy catalogue of non-Netflix movies (which is precisely why I hang on to the DVD subscription).

Had they earned my trust either up to now or through these transitions, I suspect I’d be singing their praises, not migrating out wholesale.

Value Delivery Issues

The obvious value gap – for me – is in what I get back for the price at each subscription level.  At the Plus tier ($34.99/year), there isn’t one feature that I’ve ever actually needed – except the multi-device access (which just pisses me off).  At the Premium tier ($64.99/year), the only feature I would enjoy is the Business Card scanning feature – and as mentioned below, I have learned to mistrust it.

Perhaps I’m an outlier in feeling burned by Evernote, but I sincerely doubt it: there’s been an undercurrent of discontent with Evernote’s products for years:

  • Hello/Food starved out, features added then removed, unceremoniously dumped, and their rich content templates never migrated to Evernote.
  • Business Card scanning feature totally screwy – added to Hello with a “one-year free” promo then silently removed; added to Scannable with another “one-year free” promo then also silently removed from there; template for the scanned data changed at least once that I noticed, and made increasingly less useful *and* less searchable throughout my limited use of it. NOTE: I’ve never had any of these changes communicated to me through their apps, so I don’t know what noob Product/Community Manager thinks they are smoking when they say “running the business in as transparent a way as possible”. I’ve learned not to believe that for a second.
  • Many obvious features requested repeatedly with no acknowledgement let alone commitment (see the forums for a good laugh)
  • An “editor” feature that’s been abysmally written, such that pasting content from Evernote notes is laughably painful for those using that content elsewhere, like WordPress
  • A web-based interface that’s tolerable for read-only but horrific for writing – even in Chrome, it can’t seem to auto-save without creating multiple conflicting copies of the same note

And outlier or not, within 24 hours I went out to find the tools to migrate my eight years’ worth of notes over to OneNote. I’d looked into this a few times over the years, hearing good things Windows types said about OneNote, but never quite believed that it had outgrown its wart-on-the-ass-end-of-Office roots. Given the obvious migration incentive though, I’m willing to give it a try.

OneNote After A Week

So far OneNote has been performant and usable:

  • the migration tool seemed to bring over all the content, though (because of Evernote’s screwy CDATA embedded-HTML formatting) the layout of the content is loose and weirdly-fonted.
  • The difference between Notebooks/embedded notebooks and Notebooks/sections took me a bit to sort out (and two tries at the migration to get an acceptable arrangement of tagged content), but otherwise so far so good.
  • I have no evidence that I’ve lost any data or had any sync conflicts arise (unlike Evernote, in-app or on the web).
  • The in-app editor is clean and hasn’t done anything I didn’t want it to do.

Bonus: the web editing experience is seamless.

Tip: if you want full editing power in Windows, use the free Desktop client, not the default “Trusted App” (the Win10/Metro/tablet-oriented app).

Hashicorp Vault + Ansible + CD: open source infra, option 2

“How can we publish our server configuration scripts as open source code without exposing our secrets to the world?”

In my first take on this problem, I fell down the rabbit hole of Ansible’s Vault technology – a single-password-driven encryption implementation that encrypts whole files and demands they be decrypted by interactive input or static filesystem input at runtime. Not a bad first try, but feels a little brittle (to changes in the devops team, to accidental inclusion in your git commits, or to division-of-labour concerns).

There’s another technology actively being developed for the devops world, by the Hashicorp project, also (confusingly/inevitably) called Vault. [I’ll call it HVault from here on, to distinguish from Ansible Vault >> AVault.]

HVault is a technology that (at least from a cursory review of the intro) promises to solve the brittle problems above. It’s an API-driven lockbox and runtime-proxy for all manner of secrets, making it possible to store and retrieve static secrets, provision secrets to some roles/users and not others, and create limited-time-use credentials for applications that have been integrated with HVault.
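
For the static secrets we care about, the interaction is pleasantly small. A minimal sketch, assuming the hvac Python client and a local dev-mode Vault instance – the token, path and passphrase below are made-up examples, not our real layout:

    import hvac

    # dev-mode instance and token are illustrative only
    client = hvac.Client(url="http://127.0.0.1:8200", token="dev-only-token")

    # store a static secret...
    client.write("secret/deploy/ssh_passphrase", value="correct horse battery staple")

    # ...and retrieve it later, e.g. from a deployment job
    print(client.read("secret/deploy/ssh_passphrase")["data"]["value"])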

Implementation Options

So for our team’s purposes, we only need to worry about static secrets so far. There are two possible ways I can see us trying to integrate this:

  1. retrieve the secrets (SSH passphrases, SSL private keys, passwords) directly and one-by-one from HVault, or
  2. retrieve just an AVault password that then unlocks all the other secrets embedded in our Ansible YAML files (using reinteractive’s pseudo-leaf indirection scheme).

(1) has the advantage of requiring one fewer technology, which is a tempting decision factor – but it comes at the expense of creating a dependency/entanglement between HVault and our Ansible code (in naming and managing the key-value pairs for each secret) and of having to find/use a runtime solution for injecting each secret into the appropriate file(s).

(2) simplifies the problem of injecting secrets at runtime to a single secret (i.e. AVault can accept a script to insert the AVault password) and enables us to use a known quantity (AVault) for managing secrets in the Ansible YAMLs, but also means that (a) those editing the “secret-storing YAMLs” will still have to have access to a copy of the AVault password, (b) we face the future burden to plan for breaking changes introduced by both AVault and HVault, and (c) all secrets will be dumped to disk in plaintext on our continuous deployment (CD) server.

Thoughts on Choosing For Our Team

Personally, I favour (1) or even just using AVault alone. While the theoretical “separation of duties” potential for AVault + HVault is supposed to be more attractive to a security geek like me, this just seems like needless complexity for effectively very little gain. Teaching our volunteers (now and in the future) how to manage two secrets-protecting technologies would be more painful, and we double the risks of dealing with a breaking change (or loss of active development) for a necessary and non-trivially-integrated technology in our stack.

Further, if I had to stick with one, I’d stay “single vendor” and use AVault rather than spread us across two projects with different needs & design philosophies. Once we accept that there’s an occasional “out of band initialization” burden for setting up either vault, and that we’d likely have to share access to larger numbers of secrets with a wider set of the team than ideal, I think the day-to-day management overhead of AVault is no worse (and possibly lighter) than HVault.

Pseudo-Solution for an HVault-only Implementation

Assuming for the moment that we proceed with (1), this (I think) is the logical setup to make it work:

  • Set up an HVault instance
  • Design a naming scheme for secrets
  • Populate HVault with secrets
  • Install Consul Template as a service
  • Rewrite all secret-containing Ansible YAMLs with Consul Template templating variables (matching the HVault naming)
  • Rewrite CD scripts to pull HVault secrets and rewrite all secret-containing Ansible YAMLs (see the sketch after this list)
  • Populate the HVault environment variables to enable CD scripts to authenticate to HVault
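
In rough Python, the “pull HVault secrets and rewrite” step might look like this – using the hvac client as a stand-in for whatever retrieval mechanism we settle on, with illustrative paths, names and file layout (none of this is our settled design):

    import os

    import hvac

    # VAULT_ADDR and VAULT_TOKEN are Vault's standard environment variables,
    # populated per the last step in the list above
    client = hvac.Client(url=os.environ["VAULT_ADDR"], token=os.environ["VAULT_TOKEN"])

    # read a secret named per our (hypothetical) naming scheme
    secret = client.read("secret/ansible/ssl_private_key")  # None if the path is missing
    ssl_key = secret["data"]["value"]

    # naive stand-in for the Consul Template step: swap a placeholder token
    # in a vars file for the retrieved secret before ansible-playbook runs
    path = "group_vars/all/secrets.yml"
    with open(path) as f:
        rendered = f.read().replace("__SSL_PRIVATE_KEY__", ssl_key)
    with open(path, "w") as f:
        f.write(rendered)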

Operational Concerns

If the HVault instance is running on a server in the production infrastructure, can HVault be configured to only allow connections from other servers that require access to the HVault secrets? This would reduce the risk that knowledge of the HVault authentication token and address (as used here) would provide instant access to the secrets from anywhere on the Internet – a defense-in-depth measure in case iptables and SSH protections could be circumvented to allow incoming traffic at the network level.

The HVault discussions about “flexibility” and “developer considerations” lead me to conclude that – for a volunteer team using part-time time slivers to manage an open source project’s infrastructure – HVault Cubbyhole just isn’t low-impact, fully-baked enough at this time to make it worth the extra development effort to create a full solution for our needs. While Cubbyhole addresses an interesting edge case in making on-the-wire HVault tokens less vulnerable, it doesn’t substantially mitigate (for us, at least) the bootstrapping problem, especially when it comes to a single-server HVault+deployment service setup.

Residual Security Issues

  • All this gyration with HVault is meant to help solve the problems of (a) storing all Ansible YAML-bound secrets in plaintext, (b) storing a static secret (the AVault password) in plaintext on our CD server, and (c) finding some way to keep any secrets from showing up in our github repo.
  • However, there’s still the problem of authenticating a CD process to HVault to retrieve secret(s) in the first place
  • We’re still looking to remove human intervention from standard deployments, which means persisting the authentication secret (token, directory-managed user/pass, etc) somewhere on disk (e.g. export VAULT_TOKEN=xxxx)
  • Whatever mechanism we use will ultimately be documented – either directly in our github repo, or in documentation we end up publishing for use by other infrastructure operators and those who wish to follow our advice


This is not the final word – these are merely my initial thoughts, and I’m looking forward to members of the team bringing their take to these technologies, comparisons and issues.  I’m bound to learn something and we’ll check back with the results.

Reading List

Intro to Hashicorp Vault: https://www.vaultproject.io/intro/

Blog example using HVault with Chef: https://www.hashicorp.com/blog/using-hashicorp-vault-with-chef.html

Example Chef Recipe for using HVault https://gist.github.com/sethvargo/6f1a315094fbd1a18c6d

Ansible lookup module to retrieve secrets from HVault https://github.com/jhaals/ansible-vault

Ansible modules for interacting with HVault https://github.com/TerryHowe/ansible-modules-hashivault

Ansible Vault for an open source project: adventures in simplified indirection

“How can we publish our server configuration scripts as open source code without exposing our secrets to the world?”

It seemed like a simple enough mission. There are untold numbers of open source projects publishing directly to github.com; most large projects have secrets of one form or another. Someone must have figured out a pattern for keeping the secrets *near* the code without actually publishing them (or a key leading to them) as plaintext *in* the code, yes?

However, a cursory examination of tutorials on Ansible Vault left me with an uneasy feeling. It appears that a typical pattern for this kind of setup is to partition your secrets as variables in an Ansible Role, encrypt the variables, and unlock them at runtime with reference to a password file (~/.vault_pass.txt) [or an interactive prompt at each Ansible run *shudder*]. The encrypted content is available as an AES256 blob, and the password file… well, here’s where I get the heebie-jeebies:

  1. While AES256 is a solid algorithm, it still feels…weird to publish such files to the WORLD. Distributed password cracking is quite a thing; how ridiculous of a password would we need to have to withstand an army of bots grinding away at a static password, used to unlock the encrypted secrets? Certainly not a password that anyone would feel comfortable typing by hand every time it’s prompted.
  2. Password files need to be managed, stored, backed up and distributed/distributable among project participants. Have you ever seen the docs for PGP re: handling the master passphrase? Last time I remember looking with a friend, he showed me four places where the docs said “DON’T FORGET THE PASSPHRASE”. [Worst case, what happens if the project lead gets hit by a bus?]

I guess I was expecting some kind of secured, daemon-based query-and-response RPC server, the way Jan-Piet Mens envisioned here.

Challenges

  • We have a distributed, all-volunteer team – hit-by-a-bus scenarios must be part of the plan
  • (AFAIK) We have no permanent “off-the-grid” servers – no place to stash a secret that isn’t itself backed up on the Internet – so there will have to be at least periodic bootstrapping, and multiple locations where the vault password will live

Concerns re: Lifecycle of Ansible Vault secrets:

  1. Who should be in possession of the master secret? Can this be abstracted or does anyone using it have to know its value?
  2. What about editing encrypted files? Do you have to decrypt them each time and re-encrypt, or does “ansible-vault edit” hand-wave all that for you?
    • Answer: no, “ansible-vault edit” doesn’t persist the decrypted contents to disk, just sends them to your editor and transparently re-encrypts on save.
  3. Does Ansible Vault use per-file AES keys or a single AES key for all operations with the same password (that is, is the vault password a seed for the key or does it encrypt the key)?
    • Answer: not confirmed, but the docs never mention per-file encryption, and from perusing the source code the encrypted contents do not appear to store an encrypted AES key, so it looks like one AES key per vault password.
  4. Where to store the vault password if you want to integrate it into a CD pipeline?
    • Answer: --vault-password-file ~/.vault_pass.txt, or even --vault-password-file ~/.vault_pass.py, where the script sends the password to stdout (see the sketch after this list)
  5. Does anyone have a viable scheme that doesn’t require a privileged operator to be present during every deployment (--ask-vault-pass)?
    • i.e. doesn’t that mean you’re in danger of including ~/.vault_pass.txt in your git commit at some point? If not, where does that secret live?
  6. If you incorporate LastPass into your workflow to keep a protected copy of the vault password, can *that* be incorporated into the CD pipeline somehow?
  7. Are there any prominent OSS projects that have published their infrastructure and used Ansible Vault to publish encrypted versions of their secrets?
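
On question 4, the script variant is the interesting one for a CD pipeline: ansible-vault will execute a password file that’s marked executable and read the vault password from its stdout. A minimal sketch, assuming (purely for illustration) that the CD server holds the password in an environment variable:

    #!/usr/bin/env python
    # Hypothetical ~/.vault_pass.py -- must be chmod +x. ansible-vault runs it
    # and reads the vault password from stdout. The environment variable name
    # is our own convention for this sketch, not an Ansible built-in.
    import os
    import sys

    password = os.environ.get("ANSIBLE_VAULT_PASSWORD")
    if not password:
        sys.exit("ANSIBLE_VAULT_PASSWORD is not set")
    print(password)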

Based on my reading of the docs and blogs, it seems like this is the proffered solution for maximum automation and maintainability:

  • Divvy up all your secrets as variables and use pseudo-leaf indirection (var files referencing prefixed variables in a separate file) as documented here.
  • Encrypt the leaf-node file(s) using a super-complex vault password
  • Store the vault password in ~/.vault_pass.txt
  • Call all ansible and ansible-playbook commands using the --vault-password-file option
  • Smart: wire up a pre-commit step in git to make sure the right files are always encrypted as documented here.
  • Back up the vault password in a password manager like LastPass (so that only necessary participants get access to that section)
  • Manually deploy the ~/.vault_pass.txt file to your Jenkins server or other CI/CD master, and give no one else access to that server/root/file.
  • Limit the number of individuals who need to edit the encrypted file(s), and make sure they list .vault_pass.txt in their .gitignore file.

P.S. Next up – look into the use of Hashicorp’s Vault project.

Reading List

Ansible Vault Docs:
http://docs.ansible.com/ansible/playbooks_vault.html

This is an incredibly useful article on good practices for using Ansible (and Ansible Vault) in a reasonably productive way:
https://www.reinteractive.net/posts/167-ansible-real-life-good-practices

Occupied Neurons, early July 2016: security edition

Who are you, really: Safer and more convenient sign-in on the web – Google I/O 2016

The Google Chrome team shared some helpful tips for web developers on making it as easy as possible for users to securely sign in to your web site:

  • simple-if-annoying-that-we-still-have-to-use-these attributes to add to your forms to assist Password Manager apps
  • A Credential Management API that (though cryptically explained) smooths out some of the steps in retrieving creds from the Chrome Credential Manager
  • This API also addresses some of the security threats (plaintext networks, Javascript-in-the-middle, XSS)
  • Then they discuss the FIDO UAF and U2F specs – where the U2F “security key” signs the server’s secondary challenge with a private key whose public key is already enrolled with the online identity the server is authenticating

The U2F “security key” USB dongle idea is cute and useful – it requires the user’s interaction with the button (can’t be automatically scraped by silent malware), uses RSA signatures to provide strong proof of possession, and can’t be duplicated. But as with any physical “token”, it can be lost, and it requires a physical interface (e.g. USB) that not all devices have. Smart cards and RSA tokens (the one-time key generators) never entirely caught on either, despite their laudable security properties.

The Credential Manager API discussion reminds me of the Internet Explorer echo chamber from 10-15 years ago – Microsoft browser developers adding in all these proprietary hooks because they couldn’t imagine anyone *not* fully embracing IE as the one and only browser they would use everywhere. Disturbing to see Google slip into that same lazy arrogance – assuming that web developers will assume that their users will (a) always use Chrome and (b) be using Chrome’s Credential Manager (not an external password manager app) to store passwords.

Disappointing navel-gazing for the most part.

Google’s password-free logins may arrive on Android apps by year-end

Project Abacus creates a “Trust Score API” – an interesting concept which intends to supplant the need for passwords or other explicit authentication demands by taking ambient readings from sensors and user interaction patterns with their device, to determine how likely it is that the current holder/user is equivalent to the identity being asserted/authenticated.

This is certainly more interesting technology, if only because it allows for the possibility that any organization/entity that wishes to set their own tolerance/threshold per-usage can do so, using different “Trust Scores” depending on how valuable the data/API/interaction is that the user is attempting. A simple lookup of a bank balance could require a lower score than making a transfer of money out of an account, for example.

The only trick to this is the user must allow Google to continuously measure All The Thingz from the device – listen on the microphone, watch all typing, observe all location data, see what’s in front of the camera lens. Etc. Etc. Etc.

If launched today, I suspect this would trip over most users’ “freak-out” instinct and would fail, so kudos to Google for taking it slow. They’re going to need to shore up the reputation of Android phones – their inscrutably cryptic (if comprehensive) permissions model, and how well it’s all sandboxed – if they’ll ever get widespread trust for Google to watch everything you’re doing.

Microsoft wants to protect us from our own stupid passwords

Looks like Microsoft is incorporating “widely-used hacked passwords” into the set of password rules that Active Directory can enforce against users trying to establish a weak password. Hopefully this’ll be less frustrating than the “complex passwords” rules that AD and some of Microsoft’s more zealous customers like to enforce, which make it nigh-impossible to know what the rules are, let alone give a sentient human a chance of choosing a password you might want to type 20-50 times/day. [Not that I have any PTSD from that…]

Unfortunately, they do a piss-poor job of explaining how “Smart Password Lockout” works. I’m going to take a guess at how it works, and hopefully someday it’ll be spelled out. It appears they’ve got some extra smarts in the server-side AD password authentication routine – it can effectively determine whether the bad password authentication attempt came from an already-known device or not. This means that AD is keeping a rolling cache of the “familiar environments” – likely one that ages out the older records (e.g. flushing anything older than 30 days). What’s unclear is whether they’re recording remote IP addresses, remote computer names/identities, remote IP address subnets, or some new “cookie”-like data that wasn’t traditionally sent with the authentication stream.

If this is based on Kerberos/SAML exchanges, then it’s quite possible to capture the remote identity of the computer from which the exchange occurred (at least for machines that are part of the Active Directory domain). However, if this is meant as a more general-purpose mitigation for accounts used in a broader Internet (not Active Directory domain) setting, then unless Active Directory has added cookie-tracking capabilities it didn’t have a decade ago, I’d imagine they’re operating strictly on the remote IP address enveloped around any authentication request (Kerberos, NTLM, Basic, Digest).

Still, this seems a worthwhile effort – if it allows AD to lock out attackers trying to brute-force my account from locations where no successful authentication has taken place, AND continues to allow me to proceed past the “account lockout” at the same time, this is a big win for end users, especially where AD is used in Internet-facing settings like Azure.