Scaling the Cliffs of Insanity (aka using Ansible from a Windows controller)

cliffs-of-insanity
Dread Pirate Roberts and Princess Buttercup, out for a morning stroll

This post is just record-keeping for me to remember why I abandoned Ansible on Windows a few months back.

here-there-be-dragons
Seriously, here be dragons

Here’s what I did to enable me to use Ansible for automation

  • Since I’m using a Windows box as my primary desktop for now, I wanted to see if I could avoid running a Linux VM just to manage other *NIX boxes
  • I thought I should be able to get away with using the Git for Windows environment rather than the full Cygwin environment that Jeff Geerling documents here (spoiler alert: I couldn’t, but it’s instructive to document how far I got and where it broke down)

Install and troubleshoot Ansible on Git for Windows

  • I followed installation instructions from step (3) (including using those legacy versions of PyYAML and Jinja2)
  • I skipped step (6) since I already had SSH keys
  • I ran ansible –version, and saw errors like:
     $ ansible --version
     Traceback (most recent call last):
       File "C:/Users/Mike/code/ansible/bin/ansible", line 46, in <module>
        from ansible.utils.display import Display
       File "C:\Users\Mike\code\ansible\lib\ansible\utils\display.py", line 21, in <module>
        import fcntl
     ImportError: No module named 'fcntl'
  • Tried resolving that dependency via http://stackoverflow.com/questions/1422368/fcntl-substitute-on-windows
  • This just led to the following (which seemed a hint that I might encounter many such issues, not worth it):
     $ ansible --version
     Traceback (most recent call last):
       File "C:/Users/Mike/code/ansible/bin/ansible", line 46, in <module>
        from ansible.utils.display import Display
       File "C:\Users\Mike\code\ansible\lib\ansible\utils\display.py", line 33, in <module>
        from termios import TIOCGWINSZ
     ImportError: No module named 'termios'
  • Gave in, uninstalled Python 3.5.1 and jumped back to latest Python 2.x (2.7.1)
  • Still couldn’t get past that error, so I gave up and went to Cygwin

Install and troubleshoot Ansible on Cygwin64

Then I found this article, followed a combo of the top answer:

  • apt-cyg remove python-cryptography
  • git clone –depth 1 git://github.com/ansible/ansible
  • apt-cyg install git python-{jinja2,six,yaml}
  • wget rawgit.com/transcode-open/apt-cyg/master/apt-cyg && install apt-cyg /bin
  • That results in this output:
     $ ansible --version
     ansible 2.2.0 (devel 37737ca6c1) last updated 2016/05/11 14:27:18 (GMT -700)
       lib/ansible/modules/core:  not found - use git submodule update --init lib/ansible/modules/core
       lib/ansible/modules/extras:  not found - use git submodule update --init lib/ansible/modules/extras
       config file =
       configured module search path = ['/opt/ansible/library']
  • Then running this command from within the local ansible repo (e.g. from /opt/ansible) gets the basic modules you’ll need to get started (e.g. the “ping” module for testing)
    git submodule update –init –recursive
  • Then I decided to test ansible’s ability to connect to the target host based on the command recommended in Ansible’s installation docs:
     $ ansible all -m ping --ask-pass
     SSH password:
     192.168.1.13 | FAILED! => {
        "failed": true,
        "msg": "to use the 'ssh' connection type with passwords, you must install the sshpass program"
     }
  • OK, let’s get sshpass installed on my control device [Windows box] so that we can quickly bootstrap the SSH keys to the target…
  • Since sshpass isn’t an available package in Cygwin package directory, the only way to get sshpass is to install it from source
    •  NOTE: I discovered this the hard way, after trying every trick I could think of to make all this work without having to deal with a C compiler
  • Found this article for OS X users and followed it in my Windows environment (NOTE: I had to install the “make” and “gcccore” packages from Cygwin setup – and then having to re-run apt-cyg remove python-cryptography from the Cygwin terminal again because it gets automatically reinstalled by Cygwin setup)
  • This time when running the command I get a different error – which means I must’ve got sshpass stuffed in the right location:
     $ ansible all -m ping --ask-pass
     SSH password:
     192.168.1.13 | FAILED! => {
        "failed": true,
        "msg": "Using a SSH password instead of a key is not possible because Host Key checking is enabled and sshpass does not support this.  Please add this host's fingerprint to your known_hosts file to manage this host."
     }
  • I’m guessing this means I need to obtain the target host’s SSH public key (and not that the target host refused to connect because it didn’t trust the control host).  So then I had to harvest the target host’s fingerprint
     $ ssh -o StrictHostKeyChecking=ask -l mike 192.168.1.13
     The authenticity of host '192.168.1.13 (192.168.1.13)' can't be established.
     ECDSA key fingerprint is SHA256:RqBcq8aCAgohFeiTlPeYd8hLDfTz1A25ZPlzyxlDrqI.
     Are you sure you want to continue connecting (yes/no)? yes
     Warning: Permanently added '192.168.1.13' (ECDSA) to the list of known hosts.
     mike@192.168.1.13's password:
  • This time around when running the ping command I saw this error:
     $ ansible all -m ping --ask-pass
     SSH password:
     192.168.1.13 | UNREACHABLE! => {
        "changed": false,
        "msg": "Authentication failure.",
        "unreachable": true
     }
  • I finally got the bright idea to poke around the target’s log files and see if there were any clues – sure enough, /var/log/auth.log made it clear enough to me:
     Invalid user Mike from 192.168.1.10
     input_userauth_request: invalid user Mike [preauth]
      pam_unix(sshd:auth): check pass; user unknown
      pam_unix(sshd:auth): authentication failure...
      Failed password for invalid user Mike from 192.168.1.10 port 58665 ssh2
      Connection closed by 192.168.1.10 [preauth]
  • (Boy do I hate *NIX case-sensitivity at times like this – my Windows username on the control device is “Mike” and the Linux username on the target is “mike”)
  • Tried this same command but adding the root user (ansible all -m ping –user root –ask-pass) but kept getting the same error back, and saw /var/log/auth.log reported “Failed password for root from 192.168.1.10”)
  • Re-ran su root in bash on the target just to make sure I wasn’t forgetting the password (I wasn’t)
  • Tried a basic ssh connection with that same password (to eliminate other variables):
     $ ssh root@192.168.1.13
     root@192.168.1.13's password:
     Permission denied, please try again.
     root@192.168.1.13's password:
     Permission denied, please try again.
     root@192.168.1.13's password:
     Permission denied (publickey,password).
  • Tried the same thing with my “mike” user, success:
     $ ssh mike@192.168.1.13
     mike@192.168.1.13's password:
     The programs included with the Debian GNU/Linux system are free software;
     the exact distribution terms for each program are described in the
     individual files in /usr/share/doc/*/copyright.
     Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
     permitted by applicable law.
     Last login: Sat May 14 16:10:05 2016 from 192.168.1.10
     mike@debianbox:~$
  • One more try with the ansible/SSH/password approach:
     $ ansible all -m ping --user mike --ask-pass
     SSH password:
     192.168.1.13 | UNREACHABLE! => {
        "changed": false,
        "msg": "Failed to connect to the host via ssh.",
        "unreachable": true
     }
  • So close!!  Search led to this possibility, so I re-ran with -vvvv param to get this:
     $ ansible all -m ping --user mike --ask-pass -vvvv
     No config file found; using defaults
     SSH password:
     Loaded callback minimal of type stdout, v2.0
     <192.168.1.13> ESTABLISH SSH CONNECTION FOR USER: mike
     <192.168.1.13> SSH: EXEC sshpass -d44 ssh -C -vvv -o ControlMaster=auto -o ControlPersist=60s -o User=mike -o ConnectTimeout=10 -o ControlPath=/home/Mike/.ansible/cp/ansible-ssh-%h-%p-%r 192.168.1.13 '/bin/sh -c '"'"'( umask 77 && mkdir -p "` echo $HOME/.ansible/tmp/ansible-tmp-1463273031.8-128175022355609 `" && echo "` echo $HOME/.ansible/tmp/ansible-tmp-1463273031.8-128175022355609 `" )'"'"''
     <192.168.1.13> PUT /tmp/tmp948_ec TO /home/mike/.ansible/tmp/ansible-tmp-1463273031.8-128175022355609/ping
     <192.168.1.13> SSH: EXEC sshpass -d44 sftp -b - -C -vvv -o ControlMaster=auto -o ControlPersist=60s -o User=mike -o ConnectTimeout=10 -o ControlPath=/home/Mike/.ansible/cp/ansible-ssh-%h-%p-%r '[192.168.1.13]'
     192.168.1.13 | UNREACHABLE! => {
        "changed": false,
        "msg": "SSH Error: data could not be sent to the remote host. Make sure this host can be reached over ssh",
        "unreachable": true
     }
  • I can definitely see the ansible-tmp-1463273031.8-128175022355609 file under ~/.ansible/tmp on the target system, so Ansible is getting authenticated and can run the initial shell commands
  • But I’m not seeing the /ping file under that directory, and I’m wondering if there’s something preventing sftp from connecting to the target host (since that’s the final command being run just before I get back the error). The “sftp” program is available on the controller though.
  • Digging around in Wireshark, I can see the SSHv2 traffic between controller and target, but after the initial key exchange I see exactly 3 encrypted packets sent from the controller (and encrypted responses from the target), and then no further communication between the two thereafter.  The only other activity I see on either system that’s unexplained is the target device ARP’ing for the local router three times, once every second, after the SSH traffic dies off
  • After the first two initial successes in getting the tmp files pushed via ssh, I’ve since only had failures “Failed to connect to the host via ssh.”  Using ProcExp.exe to verify that there is actual network traffic being sent to the target’s IP, and using Wireshark to get some idea what’s getting through and what’s not (but Wireshark is acting up and no longer showing me traffic from controller to target, only target to controller responses, so it’s getting a little nuts at this point)
  • I’ve added “PTR” records to the hosts files on both the controller and the target to resolve the IP address for each other to a defined name, but I’m still getting “failed to connect…” (even though I can confirm that the tools are using the newly-registered name, since I even tried substituting the falsified name in the ansible_hosts file for the previously-used IP address)
  • I tried the advice from here to switch to SCP if SFTP might not be working, but that didn’t help (so I emptied out the .ansible.cfg file again)
  • I don’t know where to look for log files to see exactly what errors are occurring locally, so I’m pretty much stumped at this point
  • !!!!!! I pushed my ssh key to the target host, and the very next run of ansible all -m ping succeeded!!!! 😦
  • CONCLUSION: ssh-pass doesn’t seem compatible with the Windows setup I’ve been using all weekend
  • EPILOGUE: “Failed to connect…” error is back again when running ansible from Windows – I can see successful auth in the target’s /var/log/auth.log, but even -m ping fails (e.g. ansible all -m ping -u root)

Other References

http://everythingshouldbevirtual.com/ansible-using-ansible-on-windows-via-cygwin

https://help.ubuntu.com/community/SSH/OpenSSH/Keys

Mashing the marvelous wrapper until it responds, part 1: prereq/setup

I haven’t used a dynamic language for coding nearly as much as strongly-typed, compiled languages so approaching Python was a little nervous-making for me.  It’s not every day you look into the abyss of your own technical inadequacies and find a way to keep going.

Here’s how embarrassing it got for me: I knew enough to clone the code to my computer and to copy the example code into a .py file, but beyond that it felt like I was doing the same thing I always do when learning a new language: trying to guess at the basics of using the language that everyone who’s writing about it already knows and has long since forgotten, they’re so obvious.  Obvious to everyone but the neophyte.

Second, is that I don’t respond well to the canonical means of learning a language (at least according to all the “Learn [language_x] from scratch” books I’ve picked up over the years), which is

  • Chapter 1: History, Philosophy and Holy Wars of the Language
  • Chapter 2: Installing The Author’s Favourite IDE
  • Chapter 3: Everything You Don’t Have a Use For in Data Types
  • Chapter 4: Advanced Usage of Variables, Consts and Polymorphism
  • Chapter 5: Hello World
  • Chapter 6: Why Hello World Is a Terrible Lesson
  • Chapter 7: Author’s Favourite Language Tricks

… etc.

I tend to learn best by attacking a specific, relevant problem hands-on – having a real problem I felt motivated to attack is how these projects came to be (EFSCertUpdater, CacheMyWork).  So for now, despite a near-complete lack of context or mentors, I decided to dive into the code and start monkeying with it.

Riches of Embarrassment

I quickly found a number of “learning opportunities” – I didn’t know how to:

  1. Run the example script (hint: install the python package for your OS, make sure the python binary is in your current shell’s path, and don’t use the Windows Git Bash shell as there’s some weird bug currently at work)
  2. Install the dependencies (hint: run “pip install xxxx”, where “xxxx” is whatever shows up at the end of an error message like this:
    C:\Users\Mike\code\marvelous>python example.py 
    Traceback (most recent call last):     
        File "example.py", line 5, in <module>
            from config import public_key, private_key 
    ImportError: No module named config

    In this example, I ran “pip install config” to resolve this error.

  3. Set the public & private keys (hint: there was some mention of setting environment variables, but it turns out that for this example script I had to paste them into a file named “config” – no, for python the file needs to be named “config.py even though it’s text not a script you would run on its own – and make sure the config.py file is stored in the same folder as the script you’re running.  Its contents should look similar to these (no, these aren’t really my keys):
        public_key = 81c4290c6c8bcf234abd85970837c97 
        private_key = c11d3f61b57a60997234abdbaf65598e5b96

    Nope, don’t forget – when you declare a variable in most languages, and the variable is not a numeric value, you have to wrap the variable’s value in some type of quotation marks.  [Y’see, this is one of the things that bugs me about languages that don’t enforce strong typing – without it, it’s easy for casual users to forget how strings have to be handled]:

        public_key = '81c4290c6c8bcf234abd85970837c97' 
        private_key = 'c11d3f61b57a60997234abdbaf65598e5b96'
  4. Properly call into other Classes in your code – I started to notice in Robert’s Marvelous wrapper that his Python code would do things like this – the comic.py file defined
         class ComicSchema(Schema):

    …and the calling code would state

        import comic 
        … 
        schema = comic.ComicSchema()

    This was initially confusing to me, because I’m used to compiled languages like C# where you import the defined name of the Class, not the filename container in which the class is defined.  If this were C# code, the calling code would probably look more like this:

        using ComicSchema;
        … 
        _schema Schema = ComicSchema();

    (Yes, I’m sure I’ve borked the C# syntax somehow, but for sake of this sad explanation, I hope you get the idea where my brain started out.)

    I’m inferring that for a scripted/dynamic language like Python, the Python interpreter doesn’t have any preconceived notion of where to find the Classes – it has to be instructed to look at specific files first (import comic, which I’m guessing implies import comic.py), then further to inspect a specified file for the Class of interest (schema = comic.ComicSchema(), where comic. indicates the file to inspect for the ComicSchema() class).

Status: Learning

So far, I’m feeling (a) stupid that I have to admit these were not things with which I sprang from the womb, (b) grateful Python’s not *more* punishing, (c) smart-ish that fundamental debugging is something I’ve still retained and (d) good that I can pass along these lessons to other folks like me.

Considering learning Python – idle thought (until something catalyses it)

A friend-of-a-friend asked me this week:

Hi Mike, [mutual friend] referred me to you as a good person to ask: what’s the best way to learn Python for someone like me, whose programming skills are essentially 1990-era (I know Perl and C, but haven’t made the leep to object oriented stuff)? I’d like to leapfrog into the present era, and make web 2.0-ish-looking sites and experiments. Is there a particular web hosting service I should use? Thanks for any advice you might have.

Funny you should ask – I was just wondering this week whether I shouldn’t dive into Python as a quick-and-dirty prototyping language.  I’d fancied myself for years as someone who might be able to reinvent myself as a programmer, and I’ve muddled around with C# and VB for a few years now – but only in short spurts.  Every time I come back to it, I feel like I’m climbing a steep hill all over again.
For some reason, I get the impression that  for folks that are just whipping something together quickly, interpreted scripting languages like Python, Perl or Javascript are easier to deal with – less overhead, less setup, just diving in and getting to the business of making something happen.  I’ve always felt like if I wanted to call myself a coder, that I’d be “cheating” by taking this route, so I never allowed myself the freedom to try this out.  But at the same time, I was never brave/patient enough to mess around with low-level code like C or C++ (who wants to write hundreds of lines of memory-handling routines that managed code gives you for ‘free’?), so I guess I’ve left myself between a rock and hard place – not quite as easy as “just do it” but not really forcing myself to learn the really “worthy” stuff either.
How to learn Python?  An almost-colleague of mine took the leap and blogged his process for going deep – start at the bottom:
I’ve never done it myself, but I trust that Mark is a smart guy who doesn’t muck around for the sake of making himself “look smart” (feel miserable).
For me, forcing myself to learn to code was an exercise in frustrating false starts – until I found a problem I couldn’t solve any way but coding it myself, and a problem that pissed me off enough to keep slogging through failures and dead ends until I got something working.
Web hosting?  No idea.  I know a few big names (AWS, Rackspace, Google Apps) but I have no clue where to get the pre-built infrastructure to just upload .py and let fly.
Is this helpful?  Do you have something specific in mind?  If you’re working on something specific and looking to work loosely with one or a few others, I’d be interested in hearing what it is and whether it fires my “that sucks!” instinct enough to want to contribute/walk alongside.

Security scrubbing of Python code – PyChecker or nothing?

I’m hardly versed in the history or design of the Python programming language (I just started reading up on it this week), but I know this much already: Python is intended to be a very easy-to-use scripting language, minimizing the burden of silly things like strongly typing your data (not to mention skipping the arguable burden of compiling your code).

Most developers don’t have two spare seconds to rub together, and are hardly excited at the prospect of taking code that they finally stabilized and having to review/revisit it to find and fix potential security bugs.  Manually droning through code has to be about the most mind-numbing work that most of us can think of eh?

On the other hand, static analysis tools are hardly an adequate substitute for good security design, threat modelling and code reviews.

Still, static analysis tools seem to me a great way to reduce the workload of secure code reviews and let the developer/tester/reviewer focus on more interesting and challenging work.

Is it really practical to expect to be able to perform complex, comprehensive static analysis of code developed in a scripting language?  I mean, theoretically speaking anyone can build a rules engine and write rules that are meant to test how code could instruct a CPU to manipulate bits.  It’s not that this is impossible – I’m just wondering how practical it is at our current level of sophistication in terms of developing software languages, scripting runtimes and modelling environments.  Can we realistically expect to be able to get away with both easy development, ease of maintenance (since the code isn’t compiled) and robustness of software quality/security/reliability?

I’m certainly not trying to disparage the incredible work that’s gone into PyChecker already – anything but.  However, when a colleague asks me if there are any other static analysis tools in addition to PyChecker, I have to imagine that (a) he has some basis for comparison among static analysis tool and (b) that PyChecker doesn’t quite meet the needs he’s come to expect for checkers targeted at other languages.