Jekyll2023-11-09T13:14:49+00:00http://kohi.uk/feed.xmlcoffee induced adventures in…Write an awesome description for your new site here. You can edit this line in _config.yml. It will appear in your document head meta (for Google search results) and in your feed.xml site description.
Rob BealMisuse of Presentations2022-08-12T10:59:00+00:002022-08-12T10:59:00+00:00http://kohi.uk/2022-08-12/presentations<p>
It's very commonplace to work somewhere where you're constantly bombarded with
high level, generic presentations, to the point that it now feels normal...
but should it be? Do presentations offer the value we think they do? Are we
using them correctly?
</p>
<h2>Thinking Differently</h2>
<p>
Let's first put aside the concept of a presentation (which can have its
merits) and instead focus on how the time could be used differently. Assume a
presentation is 30 minutes, or even worse an hour (with questions at the end).
If you were reading a book, how many pages do you think you could read in 30
minutes, or even an hour?
</p>
<p>
Quite a lot, it turns out! The average reading speed for an adult is 200-250
words per minute, so in 30 minutes a person can read around 6,000-7,500 words.
In fact, in that same 30 minutes a person could read 3,000 words, have a 5
minute tea break, take 5 minutes to absorb and reflect, and spend a further 5
minutes writing comments.
</p>
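<p>The arithmetic above is easy to sanity-check (a throwaway calculation, not part of the original post):</p>

```python
# Words readable in a 30 minute slot at the average 200-250 words per minute
minutes = 30
low, high = 200 * minutes, 250 * minutes
print(low, high)  # 6000 7500

# The alternative split: 15 minutes of reading plus a tea break,
# reflection time and commenting time (5 minutes each)
reading_minutes = minutes - 3 * 5
print(reading_minutes * 200)  # 3000
```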
<h2>The Problems With Presentations</h2>
<p>
The problem with presentations is that they're wrongly used as a way of
communicating information. This often manifests as presenters simply reading
text from the slides, slides with too much text (leaving the audience
squinting to read it!), or slides with too little detail (vague and
uninformative) - and as a consequence people aren't able to absorb, reflect on
and engage with the information the way they can with a written document. This
is often evident during the Q&A session at the end, where there are usually
either no questions or a few sporadic questions that lack depth. Presenters
may blame people for not being engaged when it's the delivery mechanism that
is at fault.
</p>
<p>
A side effect of remote working is that people now record presentations,
either for others to watch back or as a historical document. This has spawned
an industry set on shortening videos or extracting the vague, undetailed text
from them, which does not solve the problem at its core.
</p>
<h2>It takes a long time to write text...</h2>
<p>
It does... but, like everything, this gets easier with practice. And the time
can be greatly reduced if your text is a living, evolving piece, much like a
strategy document, that constantly reflects, analyses and adjusts as you move
through time. In such cases you need only make amendments and notify people
of those amendments, rather than writing an entire piece of text each time.
</p>
<p>
Stepping back from the time it takes to write text: the benefit of people
reading detailed, engaging text, likely in their own time, with time to
absorb, have a tea break, reflect and comment, far outweighs 30 minutes of
<em>everyone</em> joining a presentation (or watching a recording) to listen
to vague information lacking detail and context. If you want an aware,
informed team, the effort is worth the reward and will be reflected in their
outcomes, decisions, designs and so on.
</p>
<h2>So, no presentations?</h2>
<p>
Of course not! Presentations still have a place and can add real value. While
not great for communicating detailed information, they are good for things
where text can't be used or is of less use: a live demo, a thought provoking
topic (where the slides assist the presenter rather than drive the
presentation - think TED talks), or when you deliberately want to communicate
something high level (like lightning talks). Whether these uses of
presentations are done synchronously or asynchronously is a topic for another
time!
</p>RobSimple Multi-Arch Docker Builds2020-08-16T10:59:00+00:002020-08-16T10:59:00+00:00http://kohi.uk/2020-08-16/simple-multi-arch-builds<p>
I've been doing multi-arch builds for a while, as many images I use run on an
array of architectures at home, not least my small farm of Raspberry Pis.
I've recently decided to try and simplify my build setup and came across
`buildx`, as well as deciding to migrate to GitHub Actions.
</p>
<h2>Using 'docker buildx build'</h2>
<p>
As of writing this post, `docker buildx` is an experimental feature, but it
allows you to build an image for an array of architectures with a single
command, effectively wrapping what `docker manifest` does for you. Here's a
simple example:
</p>
<figure class="highlight"><pre><code class="language-docker" data-lang="docker"><span class="k">FROM</span><span class="s"> alpine:3.12</span>
<span class="k">ENTRYPOINT</span><span class="s"> ["echo", "hello!"]</span></code></pre></figure>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">docker buildx build <span class="se">\</span>
<span class="nt">--tag</span> hello:latest <span class="se">\</span>
<span class="nt">--tag</span> hello:1.0 <span class="se">\</span>
<span class="nt">--platform</span> linux/386,linux/amd64,linux/arm/v6,linux/arm/v7,linux/arm64/v8 .</code></pre></figure>
<p>You can even have `buildx` push your built images (given you're logged into a registry) by adding the `--push` flag to the command:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">docker buildx build <span class="se">\</span>
<span class="nt">--push</span> <span class="se">\</span>
<span class="nt">--tag</span> hello:latest <span class="se">\</span>
<span class="nt">--tag</span> hello:1.0 <span class="se">\</span>
<span class="nt">--platform</span> linux/386,linux/amd64,linux/arm/v6,linux/arm/v7,linux/arm64/v8 .</code></pre></figure>
<h2>Building From Source</h2>
<p>
Previously, using my Syncthing image as an example, I'd do multiple Docker
builds, one for each architecture, passing the architecture ("$ARCH") in as
a <i>--build-arg</i>; within the Dockerfile I'd then download the version of
Syncthing built for the architecture in question.
</p>
<p>
With <i>buildx</i>, the build runs as a single command, so I'm not able to
control which architecture's version of the app to download and install within
the Dockerfile. Building from source simplifies this, and is arguably a
better, more agnostic thing to do.
</p>
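<p>For reference, when building from source under `buildx`, BuildKit automatically supplies per-platform build arguments such as `TARGETOS` and `TARGETARCH` that a Dockerfile can declare and use. A minimal sketch (the Go app and base images are illustrative, not from this post):</p>

```docker
# Build stage runs on the build machine's own platform and cross-compiles
FROM --platform=$BUILDPLATFORM golang:1.15 AS build
# TARGETOS/TARGETARCH are populated by BuildKit for each --platform entry
ARG TARGETOS
ARG TARGETARCH
WORKDIR /src
COPY . .
RUN GOOS=$TARGETOS GOARCH=$TARGETARCH go build -o /app .

FROM alpine:3.12
COPY --from=build /app /app
ENTRYPOINT ["/app"]
```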
<h2>Tying It Together With GitHub Actions</h2>
<p>
I've recently been starting a migration to GitHub Actions from a mixture of
CircleCI and Travis. Neither are bad options, but GitHub Actions has a few
features that really interest me. Path matching is the first, effectively
allowing easier "mono" repo support (or at least repos with multiple
concepts), or simply stopping certain files (such as README.md files) from
triggering a build.
</p>
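<p>The path matching mentioned above uses the standard `on.push.paths` / `paths-ignore` workflow syntax; a small sketch (branch and file names are illustrative):</p>

```yaml
on:
  push:
    branches: [main]
    paths-ignore:
      - "**/README.md" # doc-only changes don't trigger a build
```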
<p>
The second feature is the ease of using community actions; both the Python and
Docker buildx actions were easy for me to hook up. Granted, GitHub Actions
isn't as fast when comparing equivalent builds, but I can live with that.
Here's a snippet taken from
<a
href="https://github.com/robertbeal/docker-syncthing/blob/main/.github/workflows/build.yml"
>my build file</a
>:
</p>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"> <span class="na">build</span><span class="pi">:</span>
<span class="na">runs-on</span><span class="pi">:</span> <span class="s">ubuntu-latest</span>
<span class="na">needs</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">lint</span><span class="pi">,</span> <span class="nv">test</span><span class="pi">]</span>
<span class="na">if</span><span class="pi">:</span> <span class="s">github.ref == 'refs/heads/main'</span>
<span class="na">steps</span><span class="pi">:</span>
<span class="pi">-</span> <span class="na">uses</span><span class="pi">:</span> <span class="s">actions/checkout@v2</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">variables</span>
<span class="na">run</span><span class="pi">:</span> <span class="s">curl --silent https://api.github.com/repos/syncthing/syncthing/releases/latest | jq -r '.tag_name' > version</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">build dependencies</span>
<span class="na">uses</span><span class="pi">:</span> <span class="s">crazy-max/ghaction-docker-buildx@v3</span>
<span class="na">with</span><span class="pi">:</span>
<span class="na">version</span><span class="pi">:</span> <span class="s">latest</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">docker hub login</span>
<span class="na">run</span><span class="pi">:</span> <span class="s">echo "$" | docker login -u "$" --password-stdin</span>
<span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">build</span>
<span class="na">run</span><span class="pi">:</span> <span class="pi">|</span>
<span class="s">docker buildx build \</span>
<span class="s">--push \</span>
<span class="s">--tag robertbeal/syncthing:latest \</span>
<span class="s">--tag robertbeal/syncthing:$(cat version) \</span>
<span class="s">--build-arg=VERSION="$(cat version)" \</span>
<span class="s">--build-arg=COMMIT_ID="$GITHUB_SHA" \</span>
<span class="s">--platform linux/386,linux/amd64,linux/arm/v6,linux/arm/v7,linux/arm64/v8 .</span></code></pre></figure>RobDependabot… a critical DevOps tool2020-01-11T10:59:00+00:002020-01-11T10:59:00+00:00http://kohi.uk/2020-01-11/dependabot<p>
Acquired by GitHub in 2019, Dependabot is a free bot application that checks
for dependency updates, creates pull requests and can even merge them for you.
It's one of the more exciting tools to use in a DevOps culture (and something
we use heavily at LandTech). I know, I know... what's so exciting about a
glorified dependency bot? For me, beyond the tool itself, it's largely what
being able to use it to its full potential implies. Let me explain...
</p>
<h2>Something I don't like doing as an engineer</h2>
<p>
At LandTech we manage thousands of dependencies across all of our codebases.
Regularly upgrading code dependencies is not something that really gives me
joy, nor, I imagine, does it most engineers. Partly because of that, it is
not something I do nearly often enough. I've seen endless codebases where
dependencies are months or even years behind: long out of support, many major
versions behind. More often than not it's not seen as a problem until
something needs to be changed... taking far, far longer than it should. Or,
arguably even worse, a breach or security incident occurs due to unpatched
dependencies.
</p>
<h2>A different way...</h2>
<p>
Having Dependabot track and create pull requests for dependency updates takes
away something engineers would otherwise have to be mindful of, something
that eats up part of their mental capacity. It allows codebases to stay
up-to-date, securely patched and within support, and reduces the feedback
time it takes to discover a breaking change (you don't have to fix the
breaking change immediately, but you are aware of it and have the option to).
It means codebase change remains easy and fast, rather than slowing and
increasing in difficulty over time.
</p>
<h2>Fully automating the process</h2>
<p>
However, merely tracking dependencies and creating pull requests still
requires input from engineers to approve them, which can quickly stack up and
become yet another thing to remember to do.
</p>
<p>
Enter auto-merging, an advanced feature in Dependabot that automatically
merges pull requests provided any necessary CI steps have passed. Auto-merging
can be something people struggle to agree with... What if my app breaks? What
if there's an outage? I usually ask the same question back... What do you need
to put in place to give you the confidence that it won't? Need more tests?
Then add more tests. Need dark/canary deploys? Then add better deploy tooling.
Need automated rollbacks based off metrics? Then add that too!
</p>
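<p>At the time of writing, Dependabot (Preview) is configured via a `.dependabot/config.yml` file; auto-merging looks roughly like the following sketch. This is from memory, so treat the exact field names as an assumption and check the Dependabot docs:</p>

```yaml
version: 1
update_configs:
  - package_manager: "javascript"
    directory: "/"
    update_schedule: "daily"
    automerged_updates:
      - match:
          dependency_type: "all"
          update_type: "semver:patch" # auto-merge patch bumps once CI passes
```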
<p>
It does mean investing time in such improvements, but it is time well spent.
You end up with a more robust CI/CD pipeline enabling you to auto-merge code
dependency updates. Applications stay up-to-date, in support, secure and
easy/fast to change. Engineers can focus more time on things that give them
joy, ie learning, solving problems and delivering value, rather than
repeating, joyless operational tasks.
</p>RobFavouring Servant Leadership2020-01-10T10:59:00+00:002020-01-10T10:59:00+00:00http://kohi.uk/2020-01-10/servant-leadership<p>
Servant leadership is a philosophy by Robert K. Greenleaf, first introduced in
the essay "The Servant as Leader" published in 1970, that I like to apply in
work and everyday life. In growing, fast paced engineering cultures I find it
to be a critical philosophy to apply and do well, but I often find it's
misunderstood...
</p>
<h2>A Simple Scenario</h2>
<p>
There are two competitors. At the first, problems and decisions are solved and
made only by certain people in leadership. At the second, everyone in the
organisation is empowered, solving problems, making decisions.
</p>
<p>Who would win? Who would you bet on?</p>
<h2>What Is Servant Leadership?</h2>
<p>
For me, it's simply about improving
<a href="https://hbr.org/2017/01/the-neuroscience-of-trust">performance</a>
allowing people to deliver the organisation's goals and strategies.
Understanding that as a single person/leader I can only do so much... I can
only scale so far! But build and grow a team of leaders and we'll do more by
orders of magnitude.
</p>
<p>
I measure my own success by reminding myself of the following... if I were to
suddenly, magically disappear and the team (after the shock, if any, of me
disappearing) could not only maintain but continue to improve their
performance and way of working, then I've been a successful leader. Simple.
</p>
<h2>What Servant Leadership Isn't...</h2>
<p>
Just to cover off a few misconceptions: servant leadership isn't about being
an easy or soft leader unable to make decisions, nor about literally "serving"
people by doing their work for them. Nor does it mean you won't have to make
difficult decisions, or any decisions at all. It's far more complex and
challenging than simply being "nice" (not that I'm averse to being nice!
Everyone should be, whether at work or not).
</p>
<h2>Why Isn't Everyone Doing It?</h2>
<p>
It may not be suited to all contexts (usually my first answer to many
topics!). But it also requires the right cultural environment and it's
difficult to do well.
</p>
<p>
The aim is to get the best out of those around you by putting them before
yourself, building trust... and inspiring, empowering, supporting, and guiding
in order to build a highly skilled team able to make effective decisions. It's
by no means an easy philosophy to learn/apply as it heavily relies on empathy,
observation, awareness, influence, foresight and putting trust in others. I do
find it a more challenging, but at the same time a more enjoyable and
rewarding form of leadership.
</p>
<h2>Self Guidance</h2>
<p>
These are some of my own realisations that I like to regularly reflect on as
guidance.
</p>
<ul>
<li>
A leader's availability/capacity (or lack of) should not block, slow or
stall progress
</li>
<li>
The further decisions are made from where they need to be, the poorer they
will be
</li>
<li>
The further decisions are made from where they need to be, the slower they
will be
</li>
<li>
A single point of failure (not just technical, but human too) can cause an
entire system to stop
</li>
<li>"Two heads are better than one"</li>
</ul>
<p>
I'm always looking to reduce delay and single points of failure in any system
as they can have a huge impact on a business. So in situations where a
decision has been delayed, or I've been a dependency in the decision making
process I always ask "why were the individuals/team unable to make a
decision?"
</p>
<ul>
<li>
Safety - did they feel "safe" and trusted to make a decision? If not, then
why? Do we truly have trust? How do we culturally treat mistakes? Do we have
a culture of blame or a culture of learning?
</li>
<li>
Empowered - did they feel they were able or allowed to make such a
decision? If not, why not? Are they best placed to make the decision? What's
the cost of delaying a decision or throwing responsibility to others?
</li>
<li>
Competent - did they feel confident, experienced and skilled to make a
decision? If not then what do we need to do so that they do?
</li>
</ul>
<p>
None of the above means that individuals/teams go rogue, or make decisions
that suit themselves. Done well, servant leadership should encourage quite
the opposite, so that people understand how to make effective decisions
together.
</p>
<h2>Servant Leadership Benefits In Examples...</h2>
<ul>
<li>
A team comes to me to make a decision around frontend architecture options.
Am I best placed to? No. Do I have the necessary knowledge or involvement
relative to those doing the work? No. If I make the decision anyway, it's
possibly a poor one, impacting our ability to deliver frontend features and
lowering team morale. Alternatively, rather than make the decision myself,
could I guide and facilitate the team to make a better decision? Yes! Is the
outcome likely to be better? Most likely.
</li>
<li>
The team need to make a decision fast but I'm on holiday/in meetings/at
lunch/sick etc... Do they heavily rely on me and so wait until I'm back to
make a decision? If yes, then I'm single-handedly blocking progress. Or do
they feel confident, empowered and prepared enough to make a decision? If
not, then I'm failing as a leader: I'm a dependency, a single point of
failure, and I'm impacting effective progress.
</li>
<li>
We discuss a problem. I make a decision (a terrible decision). No one speaks
up, either through not feeling safe/empowered to, or through lacking
experience/exposure to such decision making. Am I human? Yes. Will I make
mistakes? Yes. Alternatively, the other team members are experienced,
empowered and skilled in such situations through continuous learning;
feeling safe and trusted, they point out the failings in my decision. We
collectively make a better decision... disaster averted!
</li>
</ul>RobSSH Jumpbox in Docker2019-09-01T10:59:00+00:002019-09-01T10:59:00+00:00http://kohi.uk/2019-09-01/ssh-jumpbox<p>
<strong>Deprecated</strong> - this method, while still interesting in its make-up, is no longer our
favoured method (as it has added complexity and moving parts). There is a simpler solution using
an AWS AMI base image + the awscli (SSM) to
open a tunnel via a short-lived SSH key and random port. Or, even simpler, using
SSM directly and removing the need for SSH.
</p>
<p>
As part of our never-ending pursuit of staying secure, we have recently built
an SSH jumpbox as a central, secure way to access our production instances on
AWS. A fairly standard affair, although in this instance we solved the problem
using Docker, numerous services (such as rsyslog and fail2ban) and related the
jumpbox users to our AWS users for seamless management... so we thought we'd
share how we did it!
</p>
<h2>What is an SSH Jumpbox?</h2>
<p>
A jumpbox is a host you connect/tunnel through to access a target (hidden)
host. It performs no additional function beyond helping create a secure tunnel
between the user and the target host. SSH is the underlying technology used
between the user and the jumpbox to form the tunnel. Using SSH, any port on
the target host can be mapped back to the user; to do so, we need to allow
the jumpbox to talk to that host on the port in question using an EC2
Security Group.
</p>
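<p>As a concrete sketch (the host and user names here are hypothetical, not from our setup), the tunnelling described above might look like:</p>

```shell
# Forward local port 5433 through the jumpbox to a hidden host's
# Postgres port; the EC2 Security Group must allow jumpbox -> db:5432
ssh -N -L 5433:db.internal:5432 alice@jump.example.com

# Or hop straight onto an internal host using the jumpbox as a jump host
ssh -J alice@jump.example.com alice@app.internal
```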
<h2>Why an SSH Jumpbox?</h2>
<ul>
<li>
<strong>Simplicity</strong> - we don't yet need the overhead of a VPN, a
Jumpbox is enough for our current needs and far cheaper
</li>
<li>
<strong>Reduced Attack Surface</strong> - fewer hosts are publicly exposed.
Only the jumpbox is publicly accessible
</li>
<li>
<strong>Auditing</strong> - logging access is simpler as all users access
internal hosts via the jumpbox
</li>
<li>
<strong>Management</strong> - we have a single public host to
secure/maintain/update instead of numerous hosts
</li>
<li>
<strong>Single Responsibility</strong> - the jumpbox performs a single
function and performs it well
</li>
</ul>
<h2>Building an SSH Jumpbox</h2>
<p>
We settled on using Docker for creating our jumpbox, hosted on an EC2 instance
via ECS. Docker was largely chosen for the fast feedback loop: from being
able to test-drive the container (via `testinfra`), to running/debugging it
locally (and consistently), to standing it up and tearing it down on AWS
easily.
</p>
<p>
The container is based on an alpine image for a smaller size and attack
surface, running a handful of processes:
</p>
<ul>
<li>
`openssh` for our SSH server. We have locked down `/etc/ssh/sshd_config`:
<ul>
<li>
not allowing interactive mode (there's no reason to be on the jumpbox)
</li>
<li>not allowing password auth (as it is less secure than key-based)</li>
<li>
not allowing root login (as there should be no need to login as root)
</li>
<li>forcing SSH protocol 2</li>
</ul>
</li>
<li>
`rsyslog` for managing our logs (and sending them to our logging platform)
</li>
<li>`fail2ban` for banning malicious activity per ip address</li>
<li>
`s6` overlay as our process supervisor, managing all the above processes
</li>
</ul>
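<p>The `sshd_config` lockdown listed above corresponds to settings along these lines (a sketch of the relevant options, not our actual file):</p>

```
# /etc/ssh/sshd_config (excerpt)
Protocol 2                          # force SSH protocol 2
PermitRootLogin no                  # no root logins
PasswordAuthentication no           # key-based auth only
ChallengeResponseAuthentication no
PermitTTY no                        # no interactive sessions on the jumpbox
AllowTcpForwarding yes              # tunnelling is the jumpbox's sole job
```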
<p>Here is our `Dockerfile`:</p>
<figure class="highlight"><pre><code class="language-docker" data-lang="docker"><span class="k">FROM</span><span class="s"> alpine:latest</span>
<span class="k">ARG</span><span class="s"> AWS_ACCESS_KEY_ID</span>
<span class="k">ARG</span><span class="s"> AWS_SECRET_ACCESS_KEY</span>
<span class="k">ARG</span><span class="s"> IGNORED_IPS</span>
<span class="k">ARG</span><span class="s"> LOGGING_DEST</span>
<span class="k">ARG</span><span class="s"> OVERLAY_VERSION=1.21.7.0</span>
<span class="k">COPY</span><span class="s"> create-users.sh .</span>
<span class="k">COPY</span><span class="s"> keys /etc/ssh</span>
<span class="k">COPY</span><span class="s"> s6 /etc/services.d</span>
<span class="k">RUN </span>apk add <span class="nt">--no-cache</span> fail2ban openssh openssh-server-pam <span class="nb">grep </span>rsyslog rsyslog-tls <span class="se">\
</span> <span class="o">&&</span> apk add <span class="nt">--no-cache</span> <span class="nt">--repository</span> http://uk.alpinelinux.org/alpine/edge/testing aws-cli bash curl jq <span class="nt">--virtual</span><span class="o">=</span>dependencies <span class="se">\
</span> <span class="c"># s6 overlay</span>
&& curl -L "https://github.com/just-containers/s6-overlay/releases/download/v${OVERLAY_VERSION}/s6-overlay-amd64.tar.gz" | tar zx -C / \
# fail2ban
&& mv /etc/services.d/fail2ban/*.local /etc/fail2ban/ \
&& sed -i -e "s/{IGNORED_IPS}/$IGNORED_IPS/" /etc/fail2ban/jail.local \
# logging
&& sed -i -e "s/{LOGGIN_DEST}/$LOGGING_DEST/" /etc/services.d/rsyslog/rsyslog.conf \
# create users via aws
&& export AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID} \
&& export AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY} \
&& ./create-users.sh && rm create-users.sh \
&& chmod -R 600 /etc/ssh \
&& apk del --purge dependencies
<span class="k">EXPOSE</span><span class="s"> 22</span>
<span class="k">ENTRYPOINT</span><span class="s"> ["/init"]</span></code></pre></figure>
<p>
As part of building our jumpbox, we create a (password disabled) user on the
jumpbox for each user on AWS. This is easily done using a bash script and the
AWS CLI. For each user created on the jumpbox, we fetch the public SSH key(s)
associated with the respective AWS user and add them to
`~/.ssh/authorized_keys` (so the user is allowed to connect via SSH). This
relates our AWS users to the jumpbox users, means there is no user/key
sharing happening, and gives us cleaner/clearer auditing as a consequence! If
someone new starts or leaves, we simply update our AWS users and kick off a
CI build and deploy (which takes no more than 2 minutes) to refresh the
container.
</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="c">#!/bin/bash</span>
users=$(aws iam get-group --group-name users | jq '.Users[].UserName' -r) || exit 1
if [ -z "$users" ]; then
echo "No users retrieved, please check your AWS credentials and access"
exit 1
fi
while read -r username; do
echo "Creating user '$username'"
adduser -D "$username" || exit 1
echo "Fetching ssh key ids..."
ssh_key_ids=$(aws iam list-ssh-public-keys --user-name "$username" | jq '.SSHPublicKeys[].SSHPublicKeyId' -r) || exit 1
if [ -z "$ssh_key_ids" ]; then
echo "No key ids found"
continue
fi
echo "$ssh_key_ids"
mkdir -p /home/"$username"/.ssh
echo "Fetching ssh keys..."
while read -r ssh_key_id; do
ssh_key=$(aws iam get-ssh-public-key --user-name "$username" --ssh-public-key-id "$ssh_key_id" --encoding SSH | jq '.SSHPublicKey.SSHPublicKeyBody' -r) || exit 1
echo "$ssh_key" >> /home/"$username"/.ssh/authorized_keys
echo "$ssh_key"
done <<< "$ssh_key_ids"
chown -R "$username:$username" /home/"$username"/.ssh
chmod 700 /home/"$username"/.ssh
chmod 600 /home/"$username"/.ssh/authorized_keys
done <<< "$users"</code></pre></figure>
<h2>Testing our container</h2>
<p>
We use the rather awesome `testinfra` python package to help us test-drive our
container. Using it we can test numerous things from packages installed to
more complex tests such as checking a logging platform connection is
'ESTABLISHED' via `netstat`. We have over 20 tests, here's a snippet of some:
</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="kn">import</span> <span class="nn">boto3</span>
<span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">pytest</span>
<span class="kn">import</span> <span class="nn">subprocess</span>
<span class="kn">import</span> <span class="nn">testinfra</span>
<span class="o">@</span><span class="n">pytest</span><span class="p">.</span><span class="n">fixture</span><span class="p">(</span><span class="n">scope</span><span class="o">=</span><span class="s">'session'</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">host</span><span class="p">(</span><span class="n">request</span><span class="p">):</span>
<span class="n">subprocess</span><span class="p">.</span><span class="n">check_call</span><span class="p">([</span><span class="s">'make'</span><span class="p">,</span> <span class="s">'build'</span><span class="p">])</span>
<span class="n">docker_id</span> <span class="o">=</span> <span class="n">subprocess</span><span class="p">.</span><span class="n">check_output</span><span class="p">([</span><span class="s">'make'</span><span class="p">,</span> <span class="s">'--silent'</span><span class="p">,</span> <span class="s">'daemonise'</span><span class="p">]).</span><span class="n">decode</span><span class="p">().</span><span class="n">strip</span><span class="p">()</span>
<span class="k">yield</span> <span class="n">testinfra</span><span class="p">.</span><span class="n">get_host</span><span class="p">(</span><span class="s">"docker://"</span> <span class="o">+</span> <span class="n">docker_id</span><span class="p">)</span>
<span class="n">subprocess</span><span class="p">.</span><span class="n">check_call</span><span class="p">([</span><span class="s">'docker'</span><span class="p">,</span> <span class="s">'rm'</span><span class="p">,</span> <span class="s">'-f'</span><span class="p">,</span> <span class="n">docker_id</span><span class="p">])</span>
<span class="k">def</span> <span class="nf">test_sshd_process_is_running</span><span class="p">(</span><span class="n">host</span><span class="p">):</span>
<span class="n">process</span> <span class="o">=</span> <span class="n">host</span><span class="p">.</span><span class="n">process</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">comm</span><span class="o">=</span><span class="s">'sshd'</span><span class="p">)</span>
<span class="k">assert</span> <span class="n">process</span><span class="p">.</span><span class="n">user</span> <span class="o">==</span> <span class="s">'root'</span>
<span class="k">assert</span> <span class="n">process</span><span class="p">.</span><span class="n">group</span> <span class="o">==</span> <span class="s">'root'</span>
<span class="k">def</span> <span class="nf">test_rsyslog_is_connected_to_logging_platform</span><span class="p">(</span><span class="n">host</span><span class="p">):</span>
<span class="n">port</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">[</span><span class="s">'LOGGING_PLATFORM_PORT'</span><span class="p">]</span>
<span class="k">assert</span> <span class="n">host</span><span class="p">.</span><span class="n">run</span><span class="p">(</span><span class="sa">f</span><span class="s">'netstat -atn | grep -P ":</span><span class="si">{</span><span class="n">port</span><span class="si">}</span><span class="s">\s+ESTABLISHED"'</span><span class="p">).</span><span class="n">rc</span> <span class="o">==</span> <span class="mi">0</span>
<span class="k">def</span> <span class="nf">test_users</span><span class="p">(</span><span class="n">host</span><span class="p">):</span>
<span class="n">iam</span> <span class="o">=</span> <span class="n">boto3</span><span class="p">.</span><span class="n">resource</span><span class="p">(</span><span class="s">'iam'</span><span class="p">,</span> <span class="n">aws_access_key_id</span><span class="o">=</span><span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">[</span><span class="s">'AWS_ACCESS_KEY_ID'</span><span class="p">],</span> <span class="n">aws_secret_access_key</span><span class="o">=</span><span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">[</span><span class="s">'AWS_SECRET_ACCESS_KEY'</span><span class="p">])</span>
<span class="k">for</span> <span class="n">user</span> <span class="ow">in</span> <span class="n">iam</span><span class="p">.</span><span class="n">Group</span><span class="p">(</span><span class="s">'users'</span><span class="p">).</span><span class="n">users</span><span class="p">.</span><span class="nb">all</span><span class="p">():</span>
<span class="k">assert</span> <span class="n">host</span><span class="p">.</span><span class="n">user</span><span class="p">(</span><span class="n">user</span><span class="p">.</span><span class="n">name</span><span class="p">).</span><span class="n">exists</span>
<span class="n">home</span> <span class="o">=</span> <span class="s">'/home/%s'</span> <span class="o">%</span><span class="n">user</span><span class="p">.</span><span class="n">name</span>
<span class="k">if</span> <span class="n">host</span><span class="p">.</span><span class="nb">file</span><span class="p">(</span><span class="n">home</span> <span class="o">+</span> <span class="s">'.ssh'</span><span class="p">).</span><span class="n">exists</span><span class="p">:</span>
<span class="k">assert</span> <span class="n">host</span><span class="p">.</span><span class="nb">file</span><span class="p">(</span><span class="n">home</span> <span class="o">+</span> <span class="s">'.ssh'</span><span class="p">).</span><span class="n">mode</span> <span class="o">==</span> <span class="mo">0o700</span>
<span class="n">authorized_keys</span> <span class="o">=</span> <span class="n">home</span> <span class="o">+</span> <span class="s">'/authorized_keys'</span>
<span class="k">assert</span> <span class="n">host</span><span class="p">.</span><span class="nb">file</span><span class="p">(</span><span class="n">authorized_keys</span><span class="p">).</span><span class="n">exists</span>
<span class="k">assert</span> <span class="n">host</span><span class="p">.</span><span class="nb">file</span><span class="p">(</span><span class="n">authorized_keys</span><span class="p">).</span><span class="n">user</span> <span class="o">==</span> <span class="n">user</span><span class="p">.</span><span class="n">name</span>
<span class="k">assert</span> <span class="n">host</span><span class="p">.</span><span class="nb">file</span><span class="p">(</span><span class="n">authorized_keys</span><span class="p">).</span><span class="n">mode</span> <span class="o">==</span> <span class="mo">0o600</span></code></pre></figure>
<h2>Running our container</h2>
<p>
As `fail2ban` uses `iptables` under the hood to ban IP addresses, the
container needs to run with slightly elevated privileges. This means granting
the container the `NET_ADMIN` capability (a built-in Linux capability), the
minimum privilege needed for the container to be able to modify the host's
`iptables`. In a `docker run` command this translates to
`--cap-add=NET_ADMIN`.
</p>
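<p>
As a sketch, granting just that capability in a run command looks like this
(the image name `my-fail2ban` is a placeholder, not our real image):
</p>

```shell
# NET_ADMIN is the only extra capability granted; it lets the containerised
# fail2ban manipulate the host's iptables rules.
# "my-fail2ban" is a placeholder image name.
docker run -d --name fail2ban --cap-add=NET_ADMIN my-fail2ban
```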
<p>
The container can also run in `--read-only` mode, meaning its filesystem
can't be modified, for that added bit of security. We mount our log directory
into the container, meaning we also retain our log files after a deploy.
</p>RobDeprecated - this method, while still interesting in make-up, isn't our favoured method (as it has added complexity and moving parts). There is a simpler solution using an AWS AMI base image + the awscli (SSM) to open a tunnel via a short-lived SSH key and random port. Or even simpler, directly using SSM and removing the need for SSH.Getting rid of Google2017-12-21T10:59:00+00:002017-12-21T10:59:00+00:00http://kohi.uk/2017-12-21/degoogling<p>
As part of my venture into improving my personal privacy I've finally taken
the plunge and started to de-Google (ie become less reliant on Google's
services such as gmail, calendar, contacts, play store etc...)
</p>
<h1>Why am I doing this?</h1>
<p>
Simply put, I don't want to be a product or money maker for large corporations
at the expense of my privacy... and to some extent my security, given how
much companies like Google know about me (should there be a leak or social
hack etc...). Then there are the social implications: advertising and having
services targeted at, or pushed on, me.
</p>
<p>
There's also an aspect of curiosity and experimentation behind my motivation.
How deep into my life has Google rooted itself? How easily can I root it out?
</p>
<p>Either way, it's time to take back control.</p>
<h1>Where do I start?</h1>
<p>
It can be quite overwhelming getting started especially with Google as they're
integrated into so many day-to-day things... devices (PCs, laptops, phones),
operating systems, browsers, apps, websites, services (email, events,
messaging, music, photos) to name a few!
</p>
<p>Luckily, I'm already part way along my journey:</p>
<ul>
<li>On PC I solely use Linux (Arch to be specific)</li>
<li>
On phone I use LineageOS (an open source version of Android, formerly known
as CyanogenMod) albeit with OpenGApps pico (a stripped down, minimal version
of the Google suite of services)
</li>
<li>Firefox is my browser of choice across all devices</li>
</ul>
<h1>Reducing Google Dependencies</h1>
<ul>
<li>
Calendar/Contacts - self-hosted using Linode as a VPS host with my own
domain hosted on NameCheap. <a href="http://radicale.org/">Radicale</a> via
docker for all my calendars and contacts. Versioning is done via git (I
self-host Gogs), with backups going to AWS S3. I use gnome-calendar and
gnome-contacts on PC, and DAVDroid on Android for managing everything.
</li>
<li>
<a href="https://microg.org/">MicroG framework</a> - a replacement on
Android for the OpenGApps. While it still uses parts of Google (Google Cloud
Messaging for example), it does reduce some dependencies such as Location
services and it's all open source (no hidden snooping etc...).
</li>
<li>F-droid, an open source Android app repository for all my apps</li>
<li>
<a href="https://f-droid.org/en/packages/com.github.yeriomin.yalpstore/"
>Yalpstore</a
>, an open source app for grabbing apps from the Google Play Store without
needing the Google Play Store to be installed.
</li>
<li>
Email - switching to <a href="https://tutanota.com/">Tutanota</a> (an
encrypted, open source, Germany-hosted mail service) using my own domain for
all emails. While not as feature-full as Gmail, it does all I need (I
actually realised I simply didn't need most of the features Gmail offers).
</li>
</ul>
<h1>Easy switches</h1>
<ul>
<li>
Calendar/contacts. Exporting calendars and contacts from Google and
importing them into <a href="http://radicale.org/">Radicale</a> was very
easy. Hosting Radicale took more time, as did setting up a VPS, but since it
has been running I've had zero issues and 100% uptime.
</li>
<li>
F-droid. Using it for many of my Android apps has been easy, and is highly
recommended.
</li>
</ul>
<h1>Workable But Not Perfect Switches</h1>
<ul>
<li>
<a href="https://microg.org/">MicroG framework</a> - while it generally
works very well, it's not perfect. Using other location providers works well
enough, but they don't provide the same level of accuracy that Google offers.
Cloud Messaging works well most of the time, but does have the odd blip
which is annoying as it may mean I miss a notification.
</li>
<li>
<a href="https://f-droid.org/en/packages/com.github.yeriomin.yalpstore/"
>Yalpstore</a
>
- generally a great app but it lacks polish, and updating apps isn't fully
automated.
</li>
<li>
Email - switching to <a href="https://tutanota.com/">Tutanota</a> was very
easy, but the pain is in migrating all my accounts and contacts on to it
which will take time. Also their spam filters aren't as good as Gmail's.
</li>
</ul>RobAs part of my venture into improving my personal privacy I've finally taken the plunge and started to de-Google (ie become less reliant on Google's services such as gmail, calendar, contacts, play store etc...)A different kind of agile board2017-11-08T10:59:00+00:002017-11-08T10:59:00+00:00http://kohi.uk/2017-11-08/agile-board<p>
A few months back we took on a new product and in doing so thought we'd
refresh the way we work, largely focussing on our board to help speed up our
delivery. Just to be clear before going into detail, there is no right or
wrong in terms of boards, just what fits you as a team. The details below are
just our journey and findings that hopefully others may find interesting or at
least take ideas from.
</p>
<h1>The Previous Board</h1>
<p>
We used a very common and prescribed board, in Jira, as pictured below (this
is a quick mock-up so not exactly how it was):
</p>
<img
src="https://user-images.githubusercontent.com/296433/32545885-f4dcc834-c474-11e7-8147-2b2ddb05a332.png"
alt="Jira"
/>
<p>
It's a proven way of working and very effective in certain contexts, but in
our case we found it a bit heavy/clunky and additionally we'd let certain
problems manifest:
</p>
<ul>
<li>
No WIP limits. We actually used multiple "swim lanes", with no limits per
lane or overall. The work per lane was commonly contextually different. All
of this meant we found ourselves spread across multiple, differing pieces of
work (leading to context switching or knowledge silos) and starting many
things but finishing few. One consequence was needing to have regular
inter-team knowledge share sessions (again not a bad thing, but manifested
through a lack of working together).
</li>
<li>
Lack of vision/understanding. When working on complex systems or problems
having cards sat in columns wasn't helping us visualise and see what we were
trying to build. We'd churn through cards with little thought to the problem
at hand, or overall solution.
</li>
<li>
JIRA overhead. We'd let the "features" (subjectively seen as bloat/waste) of
JIRA complicate our board. Tags, epics, goals, stories, swim lanes,
estimates, statuses, tasks, sub-tasks etc... The board wasn't fun any more.
I often find when something becomes too complicated or a pain to use, people
stop using it properly, and that was beginning to happen. There was a
distinct lack of love for the board.
</li>
</ul>
<p>
One option was to spend time fixing those issues with our JIRA board, but
sometimes it's good to try something new, so that's what we did!
</p>
<h1>Diagrammatic Board</h1>
<p>
This is a technique I used at my previous company. Simply put, our board, for
what we're working on now, is a diagram. And that's the beauty of it: it can
be whatever we want it to be, whatever best represents the problem we're
trying to solve. Stories/tasks are stuck where work needs to be done.
And avatars are stuck where people are working (strictly only one avatar
instance per person!). When work is done, we change the colour (to green) or
simply take it off the board. We try to work in weekly blocks, although that
is by no means strict. If we finish a piece of work, then the board gets wiped
and we draw the next problem and get to work. A week is just a nice unit of
time we can all generally relate to, and as a team we try to deliver at least
weekly.
</p>
<p>
While I prefer a physical board, we currently use RealTimeBoard as some of the
team work in different locations (from home, or when visiting family abroad
etc...):
</p>
<img
src="https://user-images.githubusercontent.com/296433/32545882-f28e11c8-c474-11e7-92ec-46119afa8b41.png"
alt="Diagramtic board"
/>
<p>There are a few caveats:</p>
<ul>
<li>
Discipline - it's an open, free format. The idea is that what is drawn can
be whatever is needed in order to help visualise and solve the problem.
There are no rules or boundaries.
</li>
<li>
Regular Reviews - what you draw on Monday may not be appropriate come
Wednesday due to findings or decisions. Always question/analyse etc... the
board daily and redraw it if needed. Don't let it become stale or not
representative of the problem.
</li>
<li>
No Silver Bullet - it won't solve all your problems, it's a tool that *may*
solve *some* of your problems. For example it doesn't directly solve our WIP
limit issue mentioned above. It tries to help by us only solving a single
problem at a time (so that everyone is working on the same context), but the
bounds of that context are still defined by the team.
</li>
<li>
Keep It Simple - just like our JIRA board, if you over complicate your board
it'll end up hindering rather than helping you. We've gone through phases of
numerous multi-colour post-its, icons, graphics, even having a legend (we
have one, but a minimal one for accessibility).
</li>
<li>
Be Flexible - be like the board. The board should be flexible and
changeable. So should you in terms of adapting to the work you're doing. In
some cases this kind of board won't be effective but that's ok, don't use
it!
</li>
</ul>
<h1>Common Problems</h1>
<ul>
<li>
Something that commonly happens, is a discovery results in multiple
additional cards spawning onto the board (although this is a problem that
can happen on any board). We like to "jail" new items in a discussion area
before adding them to the diagram. For a while we didn't do this, and it
became difficult to track progress and scope as cards just kept on
appearing.
</li>
<li>
RealTimeBoard doesn't have any built in card types etc... We found when
working individually, everyone would put things on the board slightly
differently. Task or story, how things are worded, colours used, sizing
etc... Again a problem that can happen on any board, but looked particularly
messy on a diagrammatic board. We solved this by working together, at least
when breaking down work and updating the board, as this helped enforce a
team-accepted way of doing things.
</li>
</ul>
<h1>Backlog</h1>
<p>
Just for clarity, we still have a backlog which is an ordered list of work
(currently sat in JIRA). We still use JIRA for certain things (RealTimeBoard
even has JIRA integration although the cards look a little ugly). The great
thing about a diagrammatic board is that you can complement it with other
tools, in fact I'd encourage it.
</p>
<h1>Little Touches (Accessibility)</h1>
<p>
Having read about simulating colour blindness I decided to make our board a
little more accessible as we have a colourblind team member (hence the
slightly random colours in the picture above). RealTimeBoard isn't
particularly helpful with the available colours (something I've raised with
them and they've responded to very quickly) but using
<a href="http://colororacle.org/">Oracle Color</a> it is possible to pick more
accessible colours for the board. We typically stick with the 3 left most
colours (grey, orange and green) on the RealTimeBoard palette so that everyone
knows which ones to pick (and doesn't end up picking similar shades/variants
etc...). Here's how they might look to a colour-blind person according to
<a href="http://colororacle.org/">Oracle Color</a>:
</p>
<img
src="https://user-images.githubusercontent.com/296433/32545878-efe195da-c474-11e7-8b93-ee3c06f58741.png"
alt="Colour accessible"
/>
<h1>Go Try!</h1>
<p>
This article only covers so much, and is sure to raise quite a few questions
and doubts. It's far better to see in action by giving it a go. Remember it
doesn't need to be permanent and it may not work for you... but at least give
it a try!
</p>RobA few months back we took on a new product and in doing so thought we'd refresh the way we work, largely focussing on our board to help speed up our delivery. Just to be clear before going into detail, there is no right or wrong in terms of boards, just what fits you as a team. The details below are just our journey and findings that hopefully others may find interesting or at least take ideas from.Building production grade docker images2017-10-22T10:59:00+00:002017-10-22T10:59:00+00:00http://kohi.uk/2017-10-22/building-production-grade-docker-images<p>
Looking through Docker Hub at numerous Dockerfiles and seeing how Docker is
generally used, it's easy to see it's commonly being misused or giving a
false sense of security. What we think is a contained, secure environment
might be anything but. Here are some common mistakes to avoid and good
practices to follow.
</p>
<h1>Bad</h1>
<h3>Beware the "docker" group</h3>
<p>
Be very careful when adding host users to the `docker` group. The docker
daemon has root privileges so adding a user to the `docker` group is akin to
giving a user sudo access without requiring a password. For example you can
run: <figure class="highlight"><pre><code class="language-bash" data-lang="bash"> docker run <span class="nt">-ti</span> <span class="nt">--privileged</span> <span class="nt">-v</span> /:/host alpine <span class="nb">chroot</span>
/host </code></pre></figure>
</p>
<p>
It's a huge security risk, even for trusted users (should you leave your
keyboard unlocked, or get exploited). Don't be lazy, stick to requiring `sudo`
for using docker, or customise your `/etc/sudoers` file allowing `NOPASSWD`
for safe (non-destructive) docker commands but still require it for
destructive commands.
</p>
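<p>
A sketch of that `/etc/sudoers` idea (the username and command list are
illustrative - adapt to taste, and always edit via `visudo`):
</p>

```
# /etc/sudoers.d/docker (illustrative): password-less for safe, read-only
# docker commands; anything else (run, rm, exec...) still prompts
alice ALL=(root) NOPASSWD: /usr/bin/docker ps, /usr/bin/docker images, /usr/bin/docker logs *
```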
<h3>Root USER</h3>
<p>
In similar regards to the previous point, don't run your containers using the
root user. While you may think your container is isolated, it's a huge risk
allowing your container to run with root privileges in case it gets exploited.
Would you run your own apps (or other people's code) on the host using root? I
hope not...you shouldn't treat your containers any differently.
</p>
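<p>
Even when an image hasn't declared a `USER`, you can refuse root from the
outside (a sketch; `myapp` is a placeholder image name):
</p>

```shell
# Force the container to run as an unprivileged UID:GID, regardless of what
# the image defaults to ("myapp" is a placeholder image name)
docker run --rm --user 4999:4999 myapp
```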
<h3>Restarting containers</h3>
If you're restarting containers, you're doing it wrong. Always `run` a
container, stop it, remove it, then run a fresh one if needing to start it
again. It ensures your containers can be pulled down and built back up again
easily and quickly and aren't storing state.
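<p>
In practice that replace cycle is just three commands (names and version are
illustrative):
</p>

```shell
# Never "docker restart" - replace instead
docker stop myapp                          # stop the old container
docker rm myapp                            # remove it; no state worth keeping
docker run -d --name myapp myimage:1.2.3   # start a fresh one
```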
<h3>Avoid `latest`</h3>
While `latest` is great for getting the most recent changes, in production
it's a risk: at the point of pulling you'll simply get whatever happens to be
the latest at that moment. It's far better to specify the exact version you
want in the run command that you execute as part of the deploy.
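<p>
In other words, pin the tag in the deploy (the version number here is
illustrative):
</p>

```shell
# Risky: whatever was pushed last wins
docker run -d myimage:latest
# Better: an explicit, known version
docker run -d myimage:1.2.3
```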
<h1>Good</h1>
<h3>--read-only</h3>
<p>
For production-standard containers (I consider this a must) you should be
running them in read-only mode. If the app being contained is compromised, it
stops an attacker writing any exploits to the filesystem. If you do need some
writable access, there are two options:
</p>
<ul>
<li>`--tmpfs /tmp` - use a temporary file system</li>
<li>
`-v /foo:/foo` - mount a volume. I use this when I need to write to a data
directory, but want everything else locked down as read-only
</li>
</ul>
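<p>
Putting those together, a locked-down run might look like this (paths and
image name are illustrative):
</p>

```shell
# Read-only root filesystem, a throwaway /tmp, and a single writable data mount
docker run -d \
  --read-only \
  --tmpfs /tmp \
  -v /var/data:/data \
  myimage:1.2.3
```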
<h3>Read only mounts</h3>
<p>
Further to the above, if you do need to mount a volume, you can make it read
only too using `:ro`, for example `-v /foo:/foo:ro`. A good example is my use
of MiniDLNA. I mount my music and videos as volumes in my container, but
MiniDLNA doesn't need write access to those volumes so I mark them as `:ro`.
</p>
<h3>Managing init (PID1)</h3>
<p>
While not needed for all situations, if you find your entrypoint (for example
a Java app) is doing one of the following...
</p>
<ul>
<li>Is not reaping spawned zombie processes</li>
<li>Is not respecting or doesn't have signal handlers set up</li>
</ul>
Then you should be using an init system that will handle the above for you like
`tini` or `s6` (overlay). It'll ensure your containers shut down and fail
properly (which is important so you know when to recreate them).
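<p>
If you'd rather not bake an init into the image, Docker can inject `tini` for
you at run time with the `--init` flag (image name illustrative):
</p>

```shell
# --init makes Docker run its bundled tini as PID 1, which reaps zombies
# and forwards signals to your app
docker run -d --init myimage:1.2.3
```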
<h3>Managing mount ownership</h3>
Handling file ownership between the host and container can be tricky. Linux,
under the hood, uses user IDs (UIDs) for file ownership (usernames are mapped
to UIDs). You ideally want the UIDs to match between container and host
rather than having the container chown-ing files/folders during start up.
There are a few simple solutions:
<h4>Matching UIDs</h4>
This is my favourite and the most secure of all the options. When building an
image and adding a non-root user to run as, specify a random UID (rather than
defaulting to 1000), for example 4999. Push the image to your registry as per
normal. Now hosts can create a user with a matching UID...
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="c"># Host user.... </span>
<span class="nb">sudo </span>useradd <span class="nt">--no-create-home</span> <span class="nt">--system</span> <span class="nt">--shell</span> /bin/false <span class="nt">--uid</span> 4999 git
<span class="c"># Run the container... </span>
docker run <span class="se">\ </span>
<span class="nt">--rm</span> <span class="se">\ </span>
<span class="nt">--user</span> <span class="si">$(</span><span class="nb">id </span>git <span class="nt">-u</span><span class="si">)</span>:<span class="si">$(</span><span class="nb">id </span>git <span class="nt">-g</span><span class="si">)</span> <span class="se">\ </span>
<span class="nt">--read-only</span> <span class="se">\ </span>
<span class="nt">-v</span> /var/data:/data <span class="se">\ </span>
<span class="nt">-t</span> foo </code></pre></figure>
<h4>Matching username</h4>
Alternatively, if you don't want to play around with UIDs, you simply need a
user with a matching username - for example, building a container that runs as
the user `git`. You can mount `/etc/passwd` into the container, so the host
and container user will look up to the same UID. However, this does expose
your host's `/etc/passwd`.
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">
<span class="c"># Host user.... </span>
<span class="nb">sudo </span>useradd <span class="nt">--no-create-home</span> <span class="nt">--system</span> <span class="nt">--shell</span> /bin/false git
<span class="c"># Run the container... </span>
docker run <span class="se">\</span>
<span class="nt">--rm</span> <span class="se">\</span>
<span class="nt">--user</span> <span class="si">$(</span><span class="nb">id </span>foo <span class="nt">-u</span><span class="si">)</span>:<span class="si">$(</span><span class="nb">id </span>foo <span class="nt">-g</span><span class="si">)</span> <span class="se">\ </span>
<span class="nt">--read-only</span> <span class="se">\ </span>
<span class="nt">-v</span> /etc/passwd:/etc/passwd:ro <span class="se">\ </span>
<span class="nt">-v</span> /var/data:/data <span class="se">\ </span>
<span class="nt">-t</span> foo</code></pre></figure>
<h4>Specifying Id's</h4>
A final option is being able to specify the IDs when running the container.
Using `usermod` from the `shadow` package in Alpine you can modify the user the
container is running as on start up. The biggest compromise with this is that
you can't run in read-only mode, as you are modifying the file system
(`/etc/passwd`) at runtime, which is the reason I don't use this method very often.
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">
docker run <span class="se">\ </span>
<span class="nt">--rm</span> <span class="se">\ </span>
<span class="nt">-e</span> <span class="nv">PUID</span><span class="o">=</span><span class="si">$(</span><span class="nb">id </span>git <span class="nt">-u</span><span class="si">)</span> <span class="se">\ </span>
<span class="nt">-e</span> <span class="nv">PGID</span><span class="o">=</span><span class="si">$(</span><span class="nb">id </span>git <span class="nt">-g</span><span class="si">)</span> <span class="se">\ </span>
<span class="nt">-v</span> /etc/passwd:/etc/passwd:ro <span class="se">\ </span>
<span class="nt">-v</span> /var/data:/data <span class="se">\ </span>
<span class="nt">-t</span> foo</code></pre></figure>
<h3>Keep it small</h3>
<p>
For many of us this will mean using alpine as a base image rather than larger
images like Ubuntu or Debian although the more hardcore may start from
`scratch`. Not only does using a smaller image speed up pulls and deploys, but
having a lighter image means fewer default processes running and a smaller
attack surface. Unless you absolutely can't (in some cases Alpine can be
tricky to get running, mainly in my experience due to it using `musl` as
opposed to `glibc`) you should be using Alpine.
</p>
<p>
Also be aware of all the layers being created in your image. `docker
history myimage` should show you all the layers your image is made up of.
</p>RobLooking through Docker Hub at numerous Dockerfiles and seeing how Docker is generally used, it's easy to see it's commonly being mis-used or giving a false sense of security. What we think is a contained, secure environment might be anything but that. Here are some common mistakes to avoid and good practises to follow.Decoupling your tests to improve code quality2017-10-08T10:59:00+00:002017-10-08T10:59:00+00:00http://kohi.uk/2017-10-08/decoupling-your-test<p>
Connascence is a quality metric for describing how coupled two systems are, or
in the terms of this example, how coupled our implementation class and test
class are. Because it describes levels of coupling (in a structured order, see
below) we can use it to help prioritise what should be refactored first.
</p>
<img
src="https://user-images.githubusercontent.com/296433/32545866-e76da88a-c474-11e7-83e7-9c5b642a70b8.png"
alt="Connascence"
width="389"
height="342"
/>
<p>
As always there is a trade off in how far down the chart you work but from
personal experience I usually find beyond Connascence of Meaning the value
starts to drop off. I'll go over a simple example (so take it with a pinch of
salt) that should show firstly, how to clean up your code, but also how you
could end up with a better solution. Oh, and excuse my python (if it's not
overly pythonic), I'm not a native python coder. In the below example I'll
tackle two of the types of Connascence.
</p>
<h3>Fizz Buzz</h3>
<p>
A very typical kata most of you may have come across at some point in time.
The requirements are very simple.
</p>
<ul>
<li>You count in numbers, starting at 0, incrementing one at a time</li>
<li>For numbers divisible by 3, you say "Fizz"</li>
<li>For numbers divisible by 5, you say "Buzz"</li>
<li>
For numbers divisible by both 3 and 5, you say the rules in their above
order, ie "FizzBuzz"
</li>
<li>If the number does not match a rule, you say the number, ie "4"</li>
</ul>
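<p>
Before test-driving it, the rules above can be pinned down as a naive
reference sketch (my own, purely for illustration - not the design built
below):
</p>

```python
def fizzbuzz(n):
    """Naive reference for the rules above, for illustration only."""
    if n % 3 == 0 and n % 5 == 0:
        return "FizzBuzz"  # divisible by both, rules said in order
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)  # no rule matched: say the number, ie "4"
```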
<p>Easy! Right let's start test driving this...</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="k">class</span> <span class="nc">FizzBuzz</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">say</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="nb">input</span><span class="p">):</span>
<span class="k">if</span> <span class="nb">input</span> <span class="o">%</span> <span class="mi">3</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
<span class="k">return</span> <span class="s">"Fizz"</span>
<span class="k">return</span> <span class="nb">input</span>
<span class="k">class</span> <span class="nc">TestFizzBuzz</span><span class="p">(</span><span class="n">unittest</span><span class="p">.</span><span class="n">TestCase</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">test_it_says_Fizz_for_numbers_divisible_by_3</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">100</span><span class="p">):</span>
<span class="n">value</span> <span class="o">=</span> <span class="n">i</span><span class="o">*</span><span class="mi">3</span> <span class="bp">self</span><span class="p">.</span><span class="n">assertEqual</span><span class="p">(</span><span class="s">"Fizz"</span><span class="p">,</span> <span class="n">FizzBuzz</span><span class="p">().</span><span class="n">say</span><span class="p">(</span><span class="n">value</span><span class="p">))</span> </code></pre></figure>
<p>
I've not gone overly purist, and I've jumped in at getting "Fizz" to work. A
first test that passes as simply as I can make it. Many people may stop here
and move on to coding "Buzz", but I'd like to first tackle some coupling.
First up is <strong>Connascence of Value</strong> in that both my
implementation and test share knowledge of "Fizz" and "3". I'll fix this first
by injecting the values into my implementation so that my test can control the
scenario:
</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="k">class</span> <span class="nc">FizzBuzz</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">number</span><span class="p">,</span> <span class="n">value</span><span class="p">):</span>
<span class="bp">self</span><span class="p">.</span><span class="n">number</span> <span class="o">=</span> <span class="n">number</span>
<span class="bp">self</span><span class="p">.</span><span class="n">value</span> <span class="o">=</span> <span class="n">value</span>
<span class="k">def</span> <span class="nf">say</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="nb">input</span><span class="p">):</span>
<span class="k">if</span> <span class="nb">input</span> <span class="o">%</span> <span class="bp">self</span><span class="p">.</span><span class="n">number</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
<span class="k">return</span> <span class="bp">self</span><span class="p">.</span><span class="n">value</span>
<span class="k">return</span> <span class="nb">input</span>
<span class="k">class</span> <span class="nc">TestFizzBuzz</span><span class="p">(</span><span class="n">unittest</span><span class="p">.</span><span class="n">TestCase</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">test_it_says_if_divisible_by_a_specified_number</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="n">number</span> <span class="o">=</span> <span class="n">randint</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">9</span><span class="p">)</span>
<span class="n">word</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">uuid</span><span class="p">.</span><span class="n">uuid4</span><span class="p">())</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">100</span><span class="p">):</span>
<span class="n">value</span> <span class="o">=</span> <span class="n">i</span> <span class="o">*</span> <span class="n">number</span>
<span class="bp">self</span><span class="p">.</span><span class="n">assertEqual</span><span class="p">(</span><span class="n">word</span><span class="p">,</span> <span class="n">FizzBuzz</span><span class="p">(</span><span class="n">number</span><span class="p">,</span> <span class="n">word</span><span class="p">).</span><span class="n">say</span><span class="p">(</span><span class="n">value</span><span class="p">))</span></code></pre></figure>
<p>
In the above I've fixed <strong>Connascence of Value</strong>, and in doing so
I'm able to randomise the number and word said (as it doesn't matter what they
are). Contextually, if it can divide by the number, the word gets said.
</p>
<p>
Luckily I've no <strong>Connascence of Timing</strong> (the timing of the
execution of code doesn't impact me) or
<strong>Connascence of Execution Order</strong> (nor does execution order of
the implementation code affect me) and arguably no Connascence of Position.
</p>
<p>
I am however breaking <strong>Connascence of Algorithm</strong> as both my
implementation and test know that it's a multiple of a number. They are
coupled by both having to know the same "algorithm", in this case a simple mod
0. So let's fix this, again by injecting in the "algorithm" so that the test
controls the scenario...
</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="k">class</span> <span class="nc">FizzBuzz</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">rule</span><span class="p">,</span> <span class="n">value</span><span class="p">):</span>
<span class="bp">self</span><span class="p">.</span><span class="n">rule</span> <span class="o">=</span> <span class="n">rule</span>
<span class="bp">self</span><span class="p">.</span><span class="n">value</span> <span class="o">=</span> <span class="n">value</span>
<span class="k">def</span> <span class="nf">say</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="nb">input</span><span class="p">):</span>
<span class="k">if</span> <span class="bp">self</span><span class="p">.</span><span class="n">rule</span><span class="p">(</span><span class="nb">input</span><span class="p">):</span>
<span class="k">return</span> <span class="bp">self</span><span class="p">.</span><span class="n">value</span>
<span class="k">return</span> <span class="nb">input</span>
<span class="k">class</span> <span class="nc">TestFizzBuzz</span><span class="p">(</span><span class="n">unittest</span><span class="p">.</span><span class="n">TestCase</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">test_it_says_if_the_rule_applies</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="n">word</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">uuid</span><span class="p">.</span><span class="n">uuid4</span><span class="p">())</span>
<span class="bp">self</span><span class="p">.</span><span class="n">assertEqual</span><span class="p">(</span><span class="n">word</span><span class="p">,</span> <span class="n">FizzBuzz</span><span class="p">((</span><span class="k">lambda</span> <span class="n">i</span><span class="p">:</span> <span class="bp">True</span><span class="p">),</span> <span class="n">word</span><span class="p">).</span><span class="n">say</span><span class="p">(</span><span class="mi">0</span><span class="p">))</span></code></pre></figure>
<p>
In removing the <strong>Connascence of Algorithm</strong> I'm able to
completely change the context of my test. Instead of passing in a number I just
need to pass in a lambda expression (which would be "lambda i: i % 3 == 0").
In the case of my test though, I don't care what the expression is, only that
if it returns true, then FizzBuzz will say the word. I can also easily add a
negative test case:
</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="k">def</span> <span class="nf">test_it_says_the_number_if_the_rule_does_not_apply</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="n">word</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">uuid</span><span class="p">.</span><span class="n">uuid4</span><span class="p">())</span>
<span class="bp">self</span><span class="p">.</span><span class="n">assertEqual</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">FizzBuzz</span><span class="p">((</span><span class="k">lambda</span> <span class="n">i</span><span class="p">:</span> <span class="bp">False</span><span class="p">),</span> <span class="n">word</span><span class="p">).</span><span class="n">say</span><span class="p">(</span><span class="mi">0</span><span class="p">))</span></code></pre></figure>
<p>
Next up I hit <strong>Connascence of Meaning</strong> (an example would be,
returning an int to represent a monetary value. Is it pence? Pounds? Dollars?
Cents? etc...). This basic kata isn't really affected by it, so I'll halt my
refactoring there. In terms of testing "Buzz" I'm already covered by the above
tests. The next step would be to test saying two rules, i.e. "FizzBuzz", at which
point I'd replace the single injected rule/word with an array of rules/words.
</p>
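<p>
As a hypothetical sketch of that next step (this isn't the post's final code,
and the names here are my own), the constructor could take an array of
rule/word pairs and concatenate the words of every rule that applies:
</p>

```python
class FizzBuzz(object):
    # Hypothetical sketch: inject an array of (rule, word) pairs instead
    # of a single rule/word, so overlapping rules compose naturally.
    def __init__(self, rules):
        self.rules = rules

    def say(self, input):
        # Join the words of every rule that applies; 15 says "FizzBuzz".
        words = ''.join(word for rule, word in self.rules if rule(input))
        return words if words else input

rules = [((lambda i: i % 3 == 0), 'Fizz'), ((lambda i: i % 5 == 0), 'Buzz')]
```

The tests stay decoupled in the same way: they inject their own lambdas and
words, and never need to know the real multiples.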
<p>
So where are the actual rules now that I've extracted them out? And how are
they tested? Separately of course, and they're easily testable. Here's a Rules class I
eventually ended up making (that gets injected into my FizzBuzz class) that
represents my configuration:
</p>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="k">class</span> <span class="nc">Rules</span><span class="p">(</span><span class="nb">list</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="nb">list</span><span class="p">.</span><span class="n">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span>
<span class="bp">self</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">Rule</span><span class="p">((</span><span class="k">lambda</span> <span class="n">i</span><span class="p">:</span> <span class="n">i</span> <span class="o">%</span> <span class="mi">3</span> <span class="o">==</span> <span class="mi">0</span><span class="p">),</span> <span class="s">'Fizz'</span><span class="p">))</span>
<span class="bp">self</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">Rule</span><span class="p">((</span><span class="k">lambda</span> <span class="n">i</span><span class="p">:</span> <span class="n">i</span> <span class="o">%</span> <span class="mi">5</span> <span class="o">==</span> <span class="mi">0</span><span class="p">),</span> <span class="s">'Buzz'</span><span class="p">))</span></code></pre></figure>
<p>
In terms of testing the above class, all I need to test is that a rule exists
in the array for "Fizz" and for "Buzz". I'm effectively just testing my
configuration is correct, not the logic around how rules/words are applied.
Using Connascence I've been able to decouple my tests, and in doing so it's
helped drive out a solution where I've decoupled configuration from function. My
FizzBuzz class doesn't need to know the details of the rules, it just applies
them if necessary.
</p>
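<p>
The post doesn't show the Rule class itself, but from its usage above it pairs
a predicate with the word to say. A minimal sketch of Rule alongside the kind
of configuration test described (the test method names are my own) might look
like:
</p>

```python
import unittest

class Rule(object):
    # Hypothetical sketch: a Rule pairs a predicate with the word to say.
    def __init__(self, applies, word):
        self.applies = applies
        self.word = word

class Rules(list):
    def __init__(self):
        list.__init__(self)
        self.append(Rule((lambda i: i % 3 == 0), 'Fizz'))
        self.append(Rule((lambda i: i % 5 == 0), 'Buzz'))

class TestRules(unittest.TestCase):
    # Configuration tests: only check that the expected rules exist,
    # not how FizzBuzz applies them.
    def test_it_has_a_fizz_rule_for_multiples_of_three(self):
        self.assertTrue(any(r.word == 'Fizz' and r.applies(3) for r in Rules()))

    def test_it_has_a_buzz_rule_for_multiples_of_five(self):
        self.assertTrue(any(r.word == 'Buzz' and r.applies(5) for r in Rules()))
```

Because the tests assert on the presence of a rule rather than on mod
arithmetic inside FizzBuzz, changing a rule changes exactly one test.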
<h3>Context</h3>
<p>
To take a step back and look at the problem again... FizzBuzz is (as I know it)
a drinking game. New rules are constantly added as the game progresses (making
it harder and harder so more drink is consumed). If a rule applies, you say a
specified word. If not you say a number. The above solution I've ended up with
makes it very easy to add new rules without needing to change a lot of code.
It's only when rules override other rules etc... (new features!) that things
get more complicated, but the code is in a good/flexible position to adapt now
that I've refactored it and reduced coupling.
</p>
<h3>Summary</h3>
<p>
I've a nearly finished solution (lacking a couple of configuration tests)
here: <a href="https://github.com/robertbeal/kata-fizzbuzz-py">https://github.com/robertbeal/kata-fizzbuzz-py</a>, although keep in mind it
only shows the destination, not the journey (so to speak). And while the above
does test configuration and function, you would still complement it with
higher level tests as well.
</p>
<p>
This post is just a very simplistic example of using Connascence to decouple
your tests. I've not covered all of the levels of Connascence (although please
click any links above for code examples) but I hope it shows a rough idea of
how effective it can be at improving your code quality. It's obviously much
easier to apply to a kata like FizzBuzz than "real" code, but that's something
that comes with practice. All I can say is that once you try it (and see the
light) you'll get hooked as it's quite an eye-opening, measurable tool for
improving code quality.
</p>RobConnascence is a quality metric for describing how coupled two systems are, or in the terms of this example, how coupled our implementation class and test class are. Because it describes levels of coupling (in a structured order, see below) we can use it to help prioritise what should be refactored first.Maximising your day with stand-ups2017-09-01T10:59:00+00:002017-09-01T10:59:00+00:00http://kohi.uk/2017-09-01/micro-standups<p>
Stand-ups are a great way to synchronise as a team and create a plan for what
you'll do until the next stand-up but like any tool they're open to misuse,
over-dependency or misunderstanding. Many teams will most likely subscribe to
the Scrum model of a time-boxed, ~10-15 minute meeting that happens in the
morning as a way to plan out the work for the day. While this is a good first
step it's easy to forget why you have a stand-up and fall into the trap of
waiting (waste) for the stand-up to happen in order to truly start the day or
next piece of work.
</p>
<p>
One problem with stand-ups is when to have them. Too early and people may miss
it, or you lose the flexibility to accommodate different people's needs
(doctor's appointments, for example). Too late and you're not maximising your
day, as people may be waiting for it to happen (a consequence of
over-dependence) before they're able to really get going. So how can you be
both flexible and efficient?
</p>
<h1>(Micro) Stand-ups</h1>
<p>
A technique I've successfully used, largely due to people arriving at
different times in the morning, was to use micro stand-ups. Many people may
already do this, but if you don't it's worth formalising initially until it
becomes natural:
</p>
<ol>
<li>
The first team member arrives at work and starts working on the most
important thing.
</li>
<li>
Later the second team member arrives who then has a quick stand up with the
first person. A very short quick discussion; what are you working on? can I
help? etc...
</li>
<li>
Later still the third team member arrives, and again a quick stand-up takes
place and the three team members set to work.
</li>
<li>This continues as each person arrives.</li>
</ol>
<p>
Also, if you finish a feature or hit a problem part way through the day...
again, have a micro stand-up; synchronise; plan; get to work. Any time you
need a quick discussion or plan just shout out to the team for a micro
stand-up rather than waiting to raise it the next day.
</p>
<p>
The aim isn't to push people to 100% efficiency (I'm a big believer in having
slack), it's simply to have discussions and planning sooner so that people can
get on with work sooner. That doesn't mean they have to, but it
means they at least have the option.
</p>
<h1>Questions/doubts</h1>
<p>
<b>Isn't it disruptive?</b> It may seem so, but it encourages collaboration and
discussion, which aren't bad things, and the aim is to be brief and quick each
time. It also helps reduce waste as you're synchronising/planning right away
rather than waiting for a set timed meeting.
</p>
<p>
<b
>What about updating people outside the team, or people who need a set time
to attend?</b
>
There's no reason why you can't still do your main stand-up. Given all the
discussion that will have already happened in the micro stand-ups, then the
main stand-up should consequently be very quick and to the point.
</p>
<p>
<b
>What if we can't continue something because the person working on it isn't
in yet?</b
>
Firstly, try not to leave work in progress before going home. Collaborating
and working together also helps stop this happening as knowledge is shared
between the team.
</p>
<p>
<b>What if we all arrive at a similar time anyway?</b> Then maybe you don't
need micro stand-ups! But certainly have a stand-up the second everyone is
there, why wait? (you can still have your main stand-up later).
</p>
<p>
<b>How does the first person know what to work on?</b> I'll cover this
below...
</p>
<h1>Stand-downs</h1>
<p>
At the end of the day we would have a quick stand-up (stand-down), more like a
mini retrospective. Again, it should be very quick. We would often answer 3
questions (as an example)...
</p>
<ul>
<li>Were there any problems/blockers today?</li>
<li>
If yes, what can we do tomorrow to stop them repeating? (card/action goes on
the board etc...)
</li>
<li>Do we all know what we'd get on with if we were first in tomorrow?</li>
</ul>
<p>
The quick stand-down / retrospective meeting is a way to review the day and
ensure that people know what they are doing tomorrow. So if someone does get
in early they are able to get on with work straight away.
</p>
<h1>Give it a try</h1>
<p>
While this technique may not work for all teams, it's worth trying to at least
find out if that's the case. Stand-ups don't need to be a one off, set-time,
daily occurrence but instead can be used throughout the day to great effect,
especially if your team takes a while to get going in the mornings.
</p>RobStand-ups are a great way to synchronise as a team and create a plan for what you'll do until the next stand-up but like any tool they're open to misuse, over-dependency or misunderstanding. Many teams will most likely subscribe to the Scrum model of a time-boxed, ~10-15 minute meeting that happens in the morning as a way to plan out the work for the day. While this is a good first step it's easy to forget why you have a stand-up and fall into the trap of waiting (waste) for the stand-up to happen in order to truly start the day or next piece of work.