Getting to know AWS with Python (the free way)

I’ve run through all the easy AWS tutorials and I was looking to roll sleeves up and take it one level deeper, so (given I’ve recently completed a Python class, and given I’m trying to stay on a budget) I hunted around and found some ideas for putting a sample Flask app up on AWS Elastic Beanstalk, that cost as little as possible to use.  Here’s how I wove them together (into an almost-functional cloud solution first time around).

Set aside Anaconda, install Python3

First step in the tutorial I found was building the Python app locally (then eventually “freezing” it and uploading to AWS EB).  [Note there’s a similar tutorial from AWS here, which is good for comparison.]  [Also note: after I mostly finished my work, I confirmed that the tutorial I’m using was written for Python 2.7, and yet I’d blundered into it with Python 3.4.  Not for the faint of heart, nor for those who like non-Degraded health status.]

Which start with using virtualenv to build up the app profile.

But for me, virtualenv threw an error after I installed it (with pip install virtualenv):

ERROR: virtualenv is not compatible with this system or executable 

Okay…so Ryan Wilcox gave us a clue that you might be using the wrong python.  Running which python told me I’m directed to the anaconda version, so I commented that PATH modification out of .bash_profile and installed python3 using homebrew (which I’d previously installed on my MacBook).

Setup the Flask environment Debug virtualenv

Surprising no one but myself, by removing anaconda and installing python3, I lost access not only to virtualenv but also to pip, so I guess all that was wrapped in the anaconda environment.  Running brew install pip reports the following:

Updating Homebrew...
Error: No available formula with the name "pip" 
Homebrew provides pip via: `brew install python`. However you will then
have two Pythons installed on your Mac, so alternatively you can install
pip via the instructions at:
  https://pip.readthedocs.io/en/stable/installing/

The only option from that article seems to be running get-pip.py, so I ran python3 get-pip.py (in case the script installed a different pip based on which version of python was running the script).  Running pip –version returned this, so I felt safe enough to proceed:

pip 9.0.1 from /usr/local/lib/python3.5/site-packages (python 3.5)

Ran pip install virtualenv, now we’re back where I started (but without the anaconda crutch that I don’t entirely understand, and didn’t entirely trust to be compatible with these AWS EB tutorials).

Running virtualenv flask-aws from within the local clone of the tutorial repo throws this:

Using base prefix '/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5'
Overwriting /Users/mike/code/flask-aws-tutorial/flask-aws/lib/python3.5/orig-prefix.txt with new content
New python executable in /Users/mike/code/flask-aws-tutorial/flask-aws/bin/python3.5
Not overwriting existing python script /Users/mike/code/flask-aws-tutorial/flask-aws/bin/python (you must use /Users/mike/code/flask-aws-tutorial/flask-aws/bin/python3.5)
Traceback (most recent call last):
  File "/usr/local/bin/virtualenv", line 11, in 
    sys.exit(main())
  File "/usr/local/lib/python3.5/site-packages/virtualenv.py", line 713, in main
    symlink=options.symlink)
  File "/usr/local/lib/python3.5/site-packages/virtualenv.py", line 925, in create_environment
    site_packages=site_packages, clear=clear, symlink=symlink))
  File "/usr/local/lib/python3.5/site-packages/virtualenv.py", line 1370, in install_python
    os.symlink(py_executable_base, full_pth)
FileExistsError: [Errno 17] File exists: 'python3.5' -> '/Users/mike/code/flask-aws-tutorial/flask-aws/bin/python3'

Hmm, is this what virtualenv does?  OK, if that’s true why is something intended to insulate from external dependencies seemingly getting borked by external dependencies?

Upon closer examination, it appears that what the virtualenv script tried to do the first time was generate a symlink to a python3.5 interpreter, and got tripped when it found a symlink already there.  However, the second time I ran the command (hoping that it is self-healing), I got yelled at about “too many symlinks”, and discover that the targets now have a circular loop:

Mac4Mike:bin mike$ ls -la
total 24
drwxr-xr-x  5 mike  staff  170 Dec 11 12:26 .
drwxr-xr-x  6 mike  staff  204 Dec 11 12:26 ..
lrwxr-xr-x  1 mike  staff    9 Dec 11 12:26 python -> python3.5
lrwxr-xr-x  1 mike  staff    6 Dec 11 11:52 python3 -> python
lrwxr-xr-x  1 mike  staff    6 Dec 11 11:52 python3.5 -> python

If I’m reading the above stack trace correctly, they were trying to create a symlink from python3.5 to some other target.  So all it should require is deleting the python3.5 symlink, yes?  Easy.

Aside: then running virtualenv flask-aws from the correct location (but an old, not-refreshed shell) spills this:

Using base prefix '/Users/mike/anaconda3'
Overwriting /Users/mike/code/flask-aws-tutorial/flask-aws/lib/python3.5/orig-prefix.txt with new content
New python executable in /Users/mike/code/flask-aws-tutorial/flask-aws/bin/python
Traceback (most recent call last):
  File "/Users/mike/anaconda3/bin/virtualenv", line 11, in 
    sys.exit(main())
  File "/Users/mike/anaconda3/lib/python3.5/site-packages/virtualenv.py", line 713, in main
    symlink=options.symlink)
  File "/Users/mike/anaconda3/lib/python3.5/site-packages/virtualenv.py", line 925, in create_environment
    site_packages=site_packages, clear=clear, symlink=symlink))
  File "/Users/mike/anaconda3/lib/python3.5/site-packages/virtualenv.py", line 1387, in install_python
    raise e
  File "/Users/mike/anaconda3/lib/python3.5/site-packages/virtualenv.py", line 1379, in install_python
    stdout=subprocess.PIPE)
  File "/Users/mike/anaconda3/lib/python3.5/subprocess.py", line 947, in __init__
    restore_signals, start_new_session)
  File "/Users/mike/anaconda3/lib/python3.5/subprocess.py", line 1551, in _execute_child
    raise child_exception_type(errno_num, err_msg)
OSError: [Errno 62] Too many levels of symbolic links

Try again from a shell where anaconda’s been removed from $PATH?  Works fine:

Installing setuptools, pip, wheel...done.

Step 2: Setting up the Flask environment

Installing the requirements (using pip install -r requirements.txt) went *mostly* smooth, but it broke down on either the distribute or itsdangerous dependency:

Collecting distribute==0.6.24 (from -r requirements.txt (line 11))
  Downloading distribute-0.6.24.tar.gz (620kB)
    100% |████████████████████████████████| 624kB 1.1MB/s 
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "", line 1, in 
      File "/private/var/folders/dw/yycg4bz1347cx5v_crcjl5580000gn/T/pip-build-6gettd5v/distribute/setuptools/__init__.py", line 2, in 
        from setuptools.extension import Extension, Library
      File "/private/var/folders/dw/yycg4bz1347cx5v_crcjl5580000gn/T/pip-build-6gettd5v/distribute/setuptools/extension.py", line 2, in 
        from setuptools.dist import _get_unpatched
      File "/private/var/folders/dw/yycg4bz1347cx5v_crcjl5580000gn/T/pip-build-6gettd5v/distribute/setuptools/dist.py", line 103
        except ValueError, e:
                         ^
    SyntaxError: invalid syntax
    
    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/dw/yycg4bz1347cx5v_crcjl5580000gn/T/pip-build-6gettd5v/distribute/

Unfortunately the “/pip-build-6gettd5v/” folder was already deleted by the time I got there to look at its contents.  So I re-ran the script and noticed that all dependencies through to distribute reported as “Using cached distribute-0.6.24.tar.gz”.

Figured I’d just pip install itsdangerous by hand, and if I got into too much trouble I could always uninstall it right?  Well, that didn’t seem to help arrest this error – presumably pip caches all the requirements files first and then installs them – so I figured I’d divide the requirements.txt file into two (creating requirements2.txt and stuffing the last three files there instead) and see if that helped bypass the problem in installing the first 11 requirements.

Nope, that didn’t work, so I also moved the line “distribute==0.6.24” out of requirements.txt into requirements2.txt:

Successfully installed Flask-0.10.1 Flask-SQLAlchemy-2.0 Flask-WTF-0.10.3 Jinja2-2.7.3 MarkupSafe-0.23 PyMySQL-0.6.3 SQLAlchemy-0.9.8 WTForms-2.0.1 Werkzeug-0.9.6 argparse-1.2.1

Yup, that did it, so off to pip install requirements2.txt:

Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/dw/yycg4bz1347cx5v_crcjl5580000gn/T/pip-build-5y_r3gg0/distribute/

OK, so then I removed the “distribute==0.6.24” line entirely from requirements2.txt:

Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/dw/yycg4bz1347cx5v_crcjl5580000gn/T/pip-build-5g9i5fpl/wsgiref/

Crap, wsgiref too?  OK, pip install wsigiref==0.1.2 by hand:

Collecting wsgiref==0.1.2
  Using cached wsgiref-0.1.2.zip
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "", line 1, in 
      File "/private/var/folders/dw/yycg4bz1347cx5v_crcjl5580000gn/T/pip-build-suqskeut/wsgiref/setup.py", line 5, in 
        import ez_setup
      File "/private/var/folders/dw/yycg4bz1347cx5v_crcjl5580000gn/T/pip-build-suqskeut/wsgiref/ez_setup/__init__.py", line 170
        print "Setuptools version",version,"or greater has been installed."
                                 ^
    SyntaxError: Missing parentheses in call to 'print'

Shit, even worse – the package itself craps out.  Good news is, I know exactly what’s wrong here, because the print method (function?) requires () in Python3.  (See, I *did* learn something from that anti-Zed rant.)

Double crap – wsgiref‘s latest version IS 0.1.2.

Triple-crap – that code’s docs haven’t been touched in forever, and don’t exist in a git repo against which issues can be filed.

Damn, this *really* sucks.  Seems I’ve been trapped in the hell that is “tutorials written for Python2”.  The only way I could get around this error is to edit the source of the cached wsgiref-0.1.2.zip file – do such files get deleted from $TMPDIR when pip exits?  Or are they stuffed in ~/Library/Caches/pip in some obfuscated filename?

Ultimately it didn’t matter – I found that pip install –download ~/ wsgiref==0.1.2 worked to dump the file exactly where I wanted.

And yes, wrapping lines 170 and 171 with () after the print verb (and re-zip’ing the folder contents) allowed me to fully install wsgiref-0.1.2 using pip install –no-index –find-links=~/ wsgiref-0.1.2.zip.

OK, finally ran pip install boto==2.28.0 and that’s all the dependencies installed.

Next!

Step 3: Create the AWS RDS

While the AWS UI has changed in subtle but UX-improving ways, this part wasn’t hard to follow.  I wasn’t happy about seeing the “all traffic, all sources” security group applied here, but not knowing what my ultimate target configuration might look like, I made the classic blunder of saying to self, “I’ll lock it down later.”

Step 4: Add tables

It wasn’t entirely clear, so here’s the construction of the SQLALCHEMY_DATABASE_URI parameter in config.py:

  • = Master Username you chose on the Specify DB Details page
    e.g. “awsflasktst”
  • = Master Password you chose on the Specify DB Details page
    e.g. “awsflasktstpwd”
  • = Endpoint string from DB Instance page
    e.g. “flask_tst.cxcmcajseabc.us-west-2.rds.amazonaws.com:3306”
  • = Database Name you chose on the Configure Advanced Settings page
    e.g. “awstflasktstdb”

So in my case that line read (well, it didn’t because I’m not telling you my *actual* configuration, but this lets you know how I parsed the tutorial guidance):

SQLALCHEMY_DATABASE_URI = 'mysql+pymysql://awsflasktst:awsflasktstpwd@flask_tst.cxcmcajseabc.us-west-2.rds.amazonaws.com:3306/awstflasktstdb'

Step 4.6 Install NewRelic

Because I’m a former Relic and I’m still enthused about APM technology, I decided to install the newrelic package via pip install newrelic.  A few steps into the Quick Start guide and it was reporting all the…tiny amounts of traffic I was giving myself.

God knows if this’ll survive the pip freeze, but it’s sure worth a shot.  [Spoiler: cloud deployment was ultimately unsuccessful, so it’s moot.]

Step 5: Elastic Beanstalk setup

Installing the CLI was easy enough:

Successfully installed awsebcli-3.8.7 blessed-1.9.5 botocore-1.4.85 cement-2.8.2 colorama-0.3.7 docker-py-1.7.2 dockerpty-0.4.1 docopt-0.6.2 docutils-0.13.1 jmespath-0.9.0 pathspec-0.5.0 python-dateutil-2.6.0 pyyaml-3.12 requests-2.9.1 semantic-version-2.5.0 six-1.10.0 tabulate-0.7.5 wcwidth-0.1.7 websocket-client-0.40.0

Setup of the user account was fairly easy, although again it’s clear that AWS has done a lot of refactoring of their UI flows.  Just read ahead a little in the tutorial and you’ll be able to fill in the blanks.  (Although again, it’s a little disturbing to see a tutorial be this cavalier about the access permissions being granted to a simple web app.)

When we got to the eb init command I ran into a snag:

ERROR: ConfigParseError :: Unable to parse config file: /Users/mike/.aws/config

I checked and I *have* a config file there, and it’s already been populated with Packer credentials info (for a separate experiment I was working on previously).

So what did I do?  I renamed the config and credentials files to set them aside for now, of course!

Like me, you might also see an EB application listed, so I chose “2” not “1”.  No big deal.

There was also an unexpected option to use AWS CodeCommit, but since I’m not even sure I’m keeping this app around, I did not take that choice for now.

(And once again, this tutorial was super-cavalier about ignoring the SSH option that *everyone* always uses, so just closed my eyes and prayed I never inherit these bad habits…)

“Time to deploy this bad boy.”  Bad boy indeed – Python 2.7, *no* security measures whatsoever.  This boy is VERY bad.

Step 6: Deployment

So there’s this apparent dependency between the EBCLI and your local git repo.  If your changes aren’t committed, then what gets deployed won’t reflect those changes.

However, the part about running git push on the repo seemed unnecessary.  If the local CLI tool checks my repo, they’re only checking the local one, not the remote one, so once we run git commit, it shouldn’t matter whether those changes were pushed upstream.

However, knowing that I monkeyed with the requirements.txt file, I thought I’d also git add that one to my commit history:

git add requirements.txt
git commit -m "added newrelic requirement"

After initiating eb create, I was presented with a question never mentioned in the tutorial (another new AWS feature!), and not documented specifically in any of the ELB documentation I reviewed:

Select a load balancer type
1) classic
2) application
(default is 1): 

While my experience in this space tells me “application” sounds like the right choice, I chose the default for sake of convenience/sanity.

Proceeded with eb deploy, and it was looking good until we ran into the problem I was expecting to see:

ERROR: Your requirements.txt is invalid. Snapshot your logs for details.
ERROR: [Instance: i-01b45c4d01c070555] Command failed on instance. Return code: 1 Output: (TRUNCATED)...)
  File "/usr/lib64/python2.7/subprocess.py", line 541, in check_call
    raise CalledProcessError(retcode, cmd)
CalledProcessError: Command '/opt/python/run/venv/bin/pip install -r /opt/python/ondeck/app/requirements.txt' returned non-zero exit status 1. 
Hook /opt/elasticbeanstalk/hooks/appdeploy/pre/03deploy.py failed. For more detail, check /var/log/eb-activity.log using console or EB CLI.
INFO: Command execution completed on all instances. Summary: [Successful: 0, Failed: 1].
WARN: Environment health has transitioned from Pending to Degraded. Command failed on all instances. Initialization completed 7 seconds ago and took 3 minutes.
ERROR: Create environment operation is complete, but with errors. For more information, see troubleshooting documentation.

Lesson

Ultimately unsuccessful, but was it satisfying?  Heck yes, grappling with all the tools and troubleshooting various aspects of the procedures was a wonderful learning experience.  [And as it turned out, was a great primer to lead into a less hand-holding tutorial I found later that I’ll try next.]

Coda

Well, heck.  That was easier than I thought.  Succeeded after another run-through by forcing Python 2.7 on both the local app configuration and in a new Elastic Beanstalk app.

Advertisements

Hiring in the Kafka universe of BigCorp: a play in infinite acts

Ever wondered why it takes so long for a company to get around to deciding whether to hire you?

The typical hiring “process” as I’ve observed it from inside the belly of two beasts (Microsoft and Intel – though I gather this is typical of most large, and many small, companies):

  • “yeah, we’ve got two heads requested, has to get through Mid-Year Budget Adjustment Review Fuckup”
  • “update? Yeah, MYBARF is taking a little longer than usual, but I’m hearing we’re likely to get the heads, so I’ve started drafting the job req”
  • “new emergency project announced – I’ll be heads-down for a few weeks with my key engineer – BTW we lost one of he heads to another project, last one isn’t approved yet”
  • “yeah, MYBARF got approved last month but the open head is still under negotiation”
  • “OK the head is approved – I lost the draft req, could someone volunteer to write one up for me?”
  • “HR had some feedback on the req language”
  • “we posted the req”
  • “I’ll have time to review resumes from HR in a week”
  • “HR has no idea how to screen for this job so I had to reject the initial batch of resumes”
  • “OK, I’ll have time to phone screen starting next week”
  • “I haven’t seen any mind-blowing candidates yet so I’m talking to HR *again* about my expectations”
  • “Can you do a tech screen tomorrow morning between 7:30 and 8:15? That’s the only time one candidate has for us to talk…”

Hashicorp Vault + Ansible + CD: open source infra, option 2

“How can we publish our server configuration scripts as open source code without exposing our secrets to the world?”

In my first take on this problem, I fell down the rabbit hole of Ansible’s Vault technology – a single-password-driven encryption implementation that encrypts whole files and demands they be decrypted by interactive input or static filesystem input at runtime. Not a bad first try, but feels a little brittle (to changes in the devops team, to accidental inclusion in your git commits, or to division-of-labour concerns).

There’s another technology actively being developed for the devops world, by the Hashicorp project, also (confusingly/inevitably) called Vault. [I’ll call it HVault from here on, to distinguish from Ansible Vault >> AVault.]

HVault is a technology that (at least from a cursory review of the intro) promises to solve the brittle problems above. It’s an API-driven lockbox and runtime-proxy for all manner of secrets, making it possible to store and retrieve static secrets, provision secrets to some roles/users and not others, and create limited-time-use credentials for applications that have been integrated with HVault.

Implementation Options

So for our team’s purposes, we only need to worry about static secrets so far. There’s two possible ways I can see us trying to integrate this:

  1. retrieve the secrets (SSH passphrases, SSL private keys, passwords) directly and one-by-one from HVault, or
  2. retrieve just an AVault password that then unlocks all the other secrets embedded in our Ansible YAML files (using reinteractive’s pseudo-leaf indirection scheme).

(1) has the advantage of requiring one fewer technologies, which is a tempting decision factor – but it comes at the expense of creating a dependency/entanglement between HVault and our Ansible code (in naming and managing the key-value pairs for each secret) and of having to find/use a runtime solution to injecting each secret into the appropriate file(s).

(2) simplifies the problem of injecting secrets at runtime to a single secret (i.e. AVault can accept a script to insert the AVault password) and enables us to use a known quantity (AVault) for managing secrets in the Ansible YAMLs, but also means that (a) those editing the “secret-storing YAMLs” will still have to have access to a copy of the AVault password, (b) we face the future burden to plan for breaking changes introduced by both AVault and HVault, and (c) all secrets will be dumped to disk in plaintext on our continuous deployment (CD) server.

Thoughts on Choosing For Our Team

Personally, I favour (1) or even just using AVault alone. While the theoretical “separation of duties” potential for AVault + HVault is supposed to be more attractive to a security geek like me, this just seems like needless complexity for effectively very little gain. Teaching our volunteers (now and in the future) how to manage two secrets-protecting technologies would be more painful, and we double the risks of dealing with a breaking change (or loss of active development) for a necessary and non-trivially-integrated technology in our stack.

Further, if I had to stick with one, I’d stay “single vendor” and use AVault rather than spread us across two projects with different needs & design philosophies. Once we accept that there’s an occasional “out of band initialization” burden for setting up either vault, and that we’d likely have to share access to larger numbers of secrets with a wider set of the team than ideal, I think the day-to-day management overhead of AVault is no worse (and possibly lighter) than HVault.

Pseudo-Solution for an HVault-only Implementation

Assuming for the moment that we proceed with (1), this (I think) is the logical setup to make it work:

  • Setup an HVault instance
  • Design a naming scheme for secrets
  • Populate HVault with secrets
  • Install Consul Template as a service
  • Rewrite all secret-containing Ansible YAMLs with Consul Template templating variables (matching the HVault naming)
  • Rewrite CD scripts to pull HVault secrets and rewrite all secret-containing Ansible YAMLs
  • Populate the HVault environment variables to enable CD scripts to authenticate to HVault

Operational Concerns

If the HVault instance is running on a server in the production infrastructure, can HVault be configured to only allow connections from other servers that require access to the HVault secrets? This would reduce the risk that knowledge of the HVault (authentication token and address as used here) wouldn’t provide instant access to the secrets from anywhere on the Internet. This would be considered a defense-in-depth measure in case ip_tables and SSH protections could be circumvented to allow incoming traffic at the network level.

The HVault discussions about “flexibility” and “developer considerations” lead me to conclude that – for a volunteer team using part-time time slivers to manage an open source project’s infrastructure – HVault Cubbyhole just isn’t low-impact, fully-baked enough at this time to make it worth the extra development effort to create a full solution for our needs. While Cubbyhole addresses an interesting edge case in making on-the-wire HVault tokens less vulnerable, it doesn’t substantially mitigate (for us, at least) the bootstrapping problem, especially when it comes to a single-server HVault+deployment service setup.

Residual Security Issues

  • All this gyration with HVault is meant to help solve the problems of (a) storing all Ansible YAML-bound secrets in plaintext, (b) storing a static secret (the AVault password) in plaintext on our CD server, and (c) finding some way to keep any secrets from showing up in our github repo.
  • However, there’s still the problem of authenticating a CD process to HVault to retrieve secret(s) in the first place
  • We’re still looking to remove human intervention from standard deployments, which means persisting the authentication secret (token, directory-managed user/pass, etc) somewhere on disk (e.g. export VAULT_TOKEN=xxxx)
  • Whatever mechanism we use will ultimately be documented – either directly in our github repo, or in documentation we end up publishing for use by other infrastructure operators and those who wish to follow our advice

 

This is not the final word – these are merely my initial thoughts, and I’m looking forward to members of the team bringing their take to these technologies, comparisons and issues.  I’m bound to learn something and we’ll check back with the results.

Reading List

Intro to Hashicorp Vault: https://www.vaultproject.io/intro/

Blog example using HVault with Chef: https://www.hashicorp.com/blog/using-hashicorp-vault-with-chef.html

Example Chef Recipe for using HVault https://gist.github.com/sethvargo/6f1a315094fbd1a18c6d

Ansible lookup module to retrieve secrets from HVault https://github.com/jhaals/ansible-vault

Ansible modules for interacting with HVault https://github.com/TerryHowe/ansible-modules-hashivault

Adventures in Data Modeling: Entity Framework, Model First

sun breaks through cloudsWorking up a data model for a new systems architecture – been spending the last few months working in MS Word, whiteboards and sticky notes.  Feel like we’ve hit diminishing returns by looking at this in the abstract, so I figured I’d hack up an instance in MS SQL to see how much of this thinking stands up to reality.

I tracked down a few online resources to get me back into SQL (it’s been nearly a decade since I dug in deep) – two of note were:

After a day or so messing around with this, my valued conspirator Dale Cox mentioned Entity Framework to me:
http://msdn.microsoft.com/en-us/data/ee712907

And thus was the Bright Light of Truth shone upon my works.  For a guy like me who’s working out the business needs, transforming into Stories, and usually has a lot more to say about UX than about MVVM constructs or SQL queries, this is the perfect level for me to dig into next.  Entity Framework, Model First, using Visual Studio 2013?  I still wouldn’t pay thousands of $$$ for this tool, but here’s one way that it stays relevant for me.