management - Brian Reich

What I Got Out of Agile + DevOps East 2023

An image generated by DALL-E to metaphorically represent the blending of the concepts of agile software development and artificial intelligence. Shows a humanoid figure made of code, surrounded by circuitry, and symbols evoking agile and devops processes

I’ve never gotten the chance to attend a tech conference in person. Maybe that changes next year. This year, I got the chance to virtually attend Agile + DevOps East.

What I wanted out of the conference isn’t exactly what I got. As a baby engineering manager and aspiring agilist I hoped to gain a better understanding of agile software development and DevOps, and take some lessons home to implement. What I got was a little confidence that I am already on the right path, and a whole lot of prognostication about AI.

But it was useful.

I’ll start with key take-aways that I pulled from the conference, and then move into a summary of each of the talks I attended.

My Key Take-Aways from Agile + DevOps East 2023

At a high level, this is what I took away from Agile + DevOps East 2023:

The software industry is losing its religion when it comes to agile.
We still care about the foundations of agile: the things the manifesto sought to correct. But people are tired of the systems and the rigidity. Agile is in its reformation.
The industry has long since moved on to DevOps and continuous integration, which to me looks a lot like “agility in practice.”
Software teams should be self-contained: all the expertise required to deliver working software should be on the team.
The AI is coming! But coming to make us more efficient, not replace us.

AI-Powered Agile and DevOps

This talk by Tariq King used, of all things, the evolution of Super Mario Bros. to demonstrate how and why software development processes have evolved to improve quality, efficiency, and flexibility as the industry matures. Though sometimes the metaphor was a little stretched, it made the point and definitely spoke to my inner 80’s kid who’s still playing Mario games with his kids!

We’re currently in the Agile + DevOps era. While we don’t know what the shape is going to be exactly, we can be pretty certain AI is going to shape what’s next.

But current AI is tuned to make us efficient, not accurate. We need to use the tools ethically and intelligently to shape positive outcomes.

Key Take-Aways

Software models change. We’re at the Agile + DevOps age right now. AI will influence what is next.
AI will speed up all the processes in our development lifecycle including product management, requirements analysis, development, testing, and release.
But productivity cannot be measured in quantity alone. Productivity combines quantity, efficiency, and quality.
We need trustworthy AI tools. Otherwise, we risk using AI to deliver garbage faster.
To deliver quality, AI needs to be testable, controllable, observable, and explainable. These are attributes the current iteration of AI lacks.
Don’t build on assumptions. Form a hypothesis, test, and verify.

A Minor Point of Disagreement

One of the slides in Tariq’s presentation offered an example of how AI can help make development processes more efficient. A business analyst used AI to convert chat transcripts to user stories.

In my experience that example misunderstands where the valuable work is being done. The real work was facilitating a conversation and asking the right questions to introspect the problem. All the AI did was crunch words into a different format. Which is, in fairness, valuable. It eliminated grunt work. It did not eliminate or even speed up the real work of the analyst.

In my experience, AI has not been great at processing raw transcripts of conversations. Conversations have a pace and cadence that can be hard to parse. People speak in fragments that are clear in the moment but sound fragmented when converted to a raw transcript. There is unspoken information and context in the negative space of the conversation that tools cannot help to capture.

How AI is Shaping High Performance DevOps Teams

Vitaly Gordon’s talk was mostly about measurement. He makes the point that engineering is often the least managed function in an organization (based on context, I think he really meant to say least measured). In DevOps, we should be measuring the health and productivity of our team and product (DORA metrics are one example).

In the future, AI can help us measure and improve these metrics.

Key Take-Aways

Engineering is often one of the least managed and measured functions of a business.
To reduce lead time we should reduce wait time. In other words, measure and identify blockers like slow PR approval, and figure out how to eliminate the blockers.
Use automated testing to reduce Change Failure Rate.
Use AI to generate more test coverage.
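
To make the measurement idea concrete, here is a minimal Python sketch of how a team might compute review wait time, lead time, and Change Failure Rate. It was not shown in the talk. The record layout, the sample data, the 24-hour threshold, and the function names (`wait_hours`, `lead_hours`, `slow_reviews`, `change_failure_rate`) are all assumptions for illustration; real data would come from your own Git hosting and deployment tooling.

```python
from datetime import datetime
from statistics import median

# Hypothetical pull-request records (assumed layout, made-up timestamps).
pull_requests = [
    {"id": 1, "opened": datetime(2023, 11, 6, 9, 0),
     "first_review": datetime(2023, 11, 6, 15, 0),
     "merged": datetime(2023, 11, 7, 10, 0)},
    {"id": 2, "opened": datetime(2023, 11, 6, 11, 0),
     "first_review": datetime(2023, 11, 8, 11, 0),
     "merged": datetime(2023, 11, 8, 16, 0)},
    {"id": 3, "opened": datetime(2023, 11, 7, 14, 0),
     "first_review": datetime(2023, 11, 7, 16, 0),
     "merged": datetime(2023, 11, 7, 17, 0)},
]


def wait_hours(pr):
    """Hours a PR sat idle before anyone reviewed it."""
    return (pr["first_review"] - pr["opened"]).total_seconds() / 3600


def lead_hours(pr):
    """Hours from opening a PR to merging it."""
    return (pr["merged"] - pr["opened"]).total_seconds() / 3600


def slow_reviews(prs, threshold_hours=24):
    """PRs whose wait for a first review exceeded the threshold."""
    return [pr for pr in prs if wait_hours(pr) > threshold_hours]


def change_failure_rate(deployment_failed):
    """Share of deployments that caused a failure.

    deployment_failed is a list of booleans, one per deployment.
    """
    if not deployment_failed:
        return 0.0
    return sum(deployment_failed) / len(deployment_failed)


median_wait = median([wait_hours(pr) for pr in pull_requests])
median_lead = median([lead_hours(pr) for pr in pull_requests])
blockers = slow_reviews(pull_requests)
failure_rate = change_failure_rate([False, False, True, False])
```

Comparing the median wait to the median lead time shows how much of the delivery time is spent simply waiting, and `blockers` points at the specific PRs worth a conversation.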

DevSecOps in a Bottle: The Care and Feeding of Pocket Pipelines

Jennifer Hwe’s talk focused on how her team improved security, maintainability, and delivery by bringing DevSecOps practices into an organization with a lot of complexity. Her team was charged with implementing DevSecOps, CI/CD, and containerization on a legacy product that required a focus on heightened security practices, and had to serve multiple teams that were previously working in silos, on separate networks with their own ops and security processes.

Key Take-Aways

Innovation was being held back by lack of DevSecOps automation. They couldn’t deliver new features quickly because manual processes held them back.
When you plan to implement DevOps or any Dev*Ops variant, you’ll likely cut across various parts of the organization with different cultures, and different opinions on how things should be done. Be prepared to identify and address both technical and cultural challenges.
Take a phased approach.
Change is slow. In an organization as large as Northrop Grumman, their transition was measured in years.

Lead Without Blame

This talk by Tricia Broderick felt like the philosophical sibling of Sarah Drasner’s Engineering Management for the Rest of Us. It’s all about the fact that organizations often hire up from the developer pool into management, but do not prepare former individual contributors for their new role. This talk felt like the missing manual.

Key Take-Aways

As a technical manager don’t write code because you’re good at it, or because it’s your happy place. That’s not your job anymore.
“Sitting together” doesn’t make you a team. That just makes you a group. Healthy collaboration makes you a team.
Individuals can win while whole teams and projects fail. That’s still a failure.
Transition yourself out of the “hub” of operations. You’re not that important. You’ll bottleneck productivity and team growth if you stay there too long.
Don’t focus so much on individual accountability.
Focus on building team members who are responsible, motivated learners.
Conflict good. Drama bad.
Further Reading: Lead Without Blame

The Potential of AI and Automated Testing, Conquer Test Script Challenges with AI

This talk by Jason Manning, Nyran Moodie, and Orane Findley was more of a high-level, open discussion about how AI has and will continue to change software testing. They discussed some of the pitfalls we need to be aware of as we build more reliance on AI to build tests and perform automated testing.

Key Take-Aways

AI can help you get data-driven metrics about your product (but the speakers didn’t really dive into “how”)
It may be possible for AI to scan web pages and generate tests for you (again, no “how”)
We need to consider risks to privacy and security as we plug AI into our products, our tests, and our intellectual property
Consider how to use AI without sharing sensitive data or IP
At this point, a human needs to be involved in order to ensure the results of AI-driven processes are accurate and secure.

We Got Our Monolith to Move at Light Speed

This talk by Corry Stanley and Marianna Chasnik hit a bit close to home for me. It was all about how they moved a legacy monolith at Discover Financial from a “few releases a year” to a two-week release cycle. Sounds a lot like the journey I’ve been on. Discover succeeded by bringing Ops skills into the product team, using modern tools, infrastructure, and techniques to drive productivity, release faster, and reduce defects.

Key Take-Aways

The product team needs DevOps skills built-in
Train your whole team in DevOps
DORA metrics are lagging indicators of health
Treat Pre-production (staging and test environment) failures as production failures. Act accordingly, and act fast.
Avoid broken baselines. Use tools and processes like standardized branching models, automated deploys, automated quality tools, automated testing, and branch protection rules to shift quality and validation as early in the process as possible.

The Art of Getting Less to be Faster, Smoother, and Better – Embracing the Agile Principle of Simplicity

Robert Clawson’s talk was near and dear to me as the head of a project that suffers from a legacy of organic, unnecessary complexity. Robert advocated for achieving simplicity and productivity by maximizing the work not done.

Key Take-Aways

People and time are finite.
Our incentive structures rarely reward subtraction, even though subtraction can be an incredibly intellectual, creative, and valuable endeavor.
Features “not worked on” are valuable. It means you saved your resources, or chose to use them to do something with more value.
Sometimes removing something is the most valuable thing you can do. Clawson’s example was the K-brick, which optimized cost and materials without sacrificing structural integrity.
Look for opportunities for reuse. What do you have? How can you reuse or further capitalize on it without adding complexity?
Further reading: Subtract, The Untapped Science of Less

AI and the Future of Coding

Christopher Harrison from GitHub gave a refreshingly down-to-Earth talk about the current and future state of AI.

Generative AI is an enhancement to software development that can make us faster, but AI cannot write full applications, write perfect code, or replace developers.

Inexperienced developers risk shipping bad code by over-relying on AI and not understanding the results it generates.

Experienced developers driving AI can use it to work faster and to reduce the pain and time spent on unpleasant tasks.

Key Take-Aways

Automated code review is coming. But don’t forget about other automated tools like GitHub Actions to automatically check security, code quality, etc.
AI can help with unpleasant tasks like writing unit tests.
AI can help with uncommon syntax, like figuring out regular expressions.
AI can help you rapidly prototype and experiment.
But AI can’t help you write good code if you don’t already know how to write good code.

Technical Debt for the Nontechnical

A photograph of several Jeeps that can't go anywhere because one is stuck in the mud

If you hang out with programmers long enough, you’re bound to hear one of them vent about technical debt. What is technical debt? Why is it so bad? And more importantly, why should you care?

Let’s begin with that classic cliché we all know and love, the dictionary definition.

Debt: a state of being under obligation to pay or repay someone or something in return for something received.
Merriam-Webster Dictionary

Technical Debt is created when we accept technical trade-offs for a short-term advantage with long-term consequences. Technical debt is a bargain we strike with our future selves. If we don’t want to suffer the consequences, it must be paid back.

I’m Not a Programmer. Why Should I Care?

“I’m not a programmer. I’m in sales, marketing, customer service, the C-suite, or somewhere else. Why should I care about your technical debt techno-babble?”

Why do we write software, and who do we write it for?

For most of us the cold, capitalist answer is we write software to make money for our organization. But that’s an outcome of doing the job well, not the reason we do it. We write software because we’ve identified a problem we can solve for customers. We code to solve problems, delight our users, and keep them coming back. If we do that, the money happens.

You have a stake in all of that too. Other roles in the organization have customer and market insights critical to the software team and the product’s success. If sales, marketing, and the rest of the organization are rowing in different directions, the right customers won’t know we solved their problem, and software won’t succeed. We’re all in this together.

So if we’re not careful, good intentions on the part of the rest of the organization to boost revenue, get a product or feature to market faster, or close a sale can create perverse incentives to take on technical debt.

Even worse: if a nontechnical person asks the technical team to sacrifice “doing it right” in order to “do it cheap” or “do it quick,” if they get what they want and don’t see a consequence, they’re going to keep asking for it.

This works, until it doesn’t.

And so, anyone that can influence software’s direction can create the conditions where technical debt can flourish and lead to failure. That’s why you should care about technical debt.

The consequences of technical debt. "goto" by XKCD

Examples of Technical Debt

Technical debt can be created all sorts of ways. Below are a few examples of what this looks like in practice:

Putting sloppy code in production. The code “works” but is poorly written. The customer may get the feature faster but the choice slows down future development because the work is hard to understand, buggy, hard to change, brittle, or inflexible.
Progress by Copy-and-Paste. You deliver new features by copy-pasting old ones. In the short-term this may deliver immediate value. But you’ve multiplied the complexity and time required for future changes, enhancements, and bug fixes.
Putting inefficient code in production. You know your code hogs resources. The customer may get your changes faster, but their experience suffers from poor performance, and the software may be significantly more expensive to host.
Putting code with known security vulnerabilities in production. You know your code has potential security risks and choose to release it anyway. The customer may get the feature faster, but you’ve introduced code that puts all customers, and your organization’s credibility, at risk.
Skipping Documentation. You don’t document your code in order to release it faster. In the future, developers have to stop and build an understanding of the “old code” before they can make any changes. Progress slows going forward. If you skip public-facing documentation, you may fail to build institutional and user-level knowledge of changes, missing the chance to educate and advocate for your own product.
Skipping Automated Testing. The code “works”… so you think. But you skipped automated testing to release faster. You miss bugs. You introduce regressions in features that used to work before. You eventually find yourself buried in toil: work that is pure overhead, devoid of long-term value, because you chose to skip QA.
Building in Toil. The software works but processes that could be automated are built on human intervention. This may help the product or feature release faster. But it introduces friction into the user’s experience of the product and results in a product that can only scale by adding more humans. (And those humans usually require salaries.)

These are examples of technical debt. And to reiterate: some debt is okay, so long as you pay it back.

What Are the Consequences of Technical Debt?

In the financial world, failing to pay your debts has consequences. The bank starts taking stuff, and eventually life starts sounding like a country tune.

A chart illustrating that technical debt causes the cost of change to increase over time.

In software, failing to pay your debts has consequences too. It results in a high Cost of Change. In other words, a product with high technical debt will be harder, slower, and more expensive to build on than the same product with less technical debt. Think of it like inflation: the same dollar buys less new feature development today than it did yesterday.

Here are some examples of what it looks like when software is over-leveraged on technical debt.

Too Much Toil. Developers are spending the majority of time engaged in toil: work that is pure overhead and has no long-term value to the organization. But without it, the system eventually grinds to a halt.
Stagnation. If you’ve spent enough time expecting “fast” solutions over “good” solutions, this eventually catches up with you. Your developers can’t get to new feature development because they are buried in fixing bugs.
Inefficiency. Adding a programmer to the team doesn’t result in “1 programmer worth of value to the organization.” You’ve just added an additional rower to a rowboat stuck in peanut butter, instead of water.
Turnover. You can’t keep talented developers because they want to solve interesting problems, not make a career of cleaning someone else’s code.
Declaring Technical Bankruptcy. Your software may become so cumbersome to maintain that the only sane path forward feels like starting from scratch (which has its own set of problems).

Summary

To sum it all up: technical debt is created when trade-offs are made to accept worse code in exchange for short-term gains. This can be strategically useful, but only if you honor the promise to pay it back.

Anyone that can influence software decisions can create the conditions for technical debt. We’re all in this together, and should default to promoting mature, sustainable engineering practices over shortcuts taken for short term gains.

Over-leveraging on technical debt has very real consequences that may not surface right away. If your progress stagnates because your engineering resources are stuck fixing problems caused by a history of ignoring mature, sustainable engineering practices, you’re probably over-leveraged on technical debt.

But now you know what technical debt is, and what it looks like in practice. You also know how to spot evidence that your organization has taken on too much in the past. Armed with this information, you’ve got the opportunity to help your organization make smarter, more sustainable decisions to reduce technical debt, and avoid creating more in the future.

How to Code as an Engineering Manager (Maybe Don’t?)

Your scientists were so preoccupied with whether or not they could, they didn’t stop to think if they should.
Fake Chaos Theorist and Dinosaur Assault Victim, Ian Malcolm

So, you’re an engineering manager. Your backlog seems overwhelming. You think, “what better way to support my team than to pick a ticket and reduce their workload?”

You could. But stop for a moment and consider if you should.

Maker’s Schedule vs. Manager’s Schedule

You assign yourself a ticket from the critical path.

Then what happens? You start with good intentions. But then you get distracted. In the negative space between meetings, you just barely have time to remember what you did last time. Three weeks later you haven’t completed the ticket.

Not only have you not helped your team, you’ve actually let them down by making an agreement you couldn’t keep, and preventing on-time delivery.

Maker’s Schedule, Manager’s Schedule is as true today as it was on the day it was written. To sum it up: programming takes time and focus. Programmers need the freedom to ignore distractions. In contrast, a manager’s schedule is all about distractions: a project meeting here, a presentation to leadership there, one-on-ones, agile ceremonies, “unsticking” individual contributors.

Every six seconds, a manager somewhere on the planet says, “when am I supposed to get the real work done?”
Source: September, 2023 Journal of Fabricated Statistics

Turns out the meetings were the real work all along, sucker!

But I Really Want to Code!

I know, right?

Coding is my happy place. Marking something Done can mean the difference between an emotionally draining day with nothing to show for itself, versus logging out with a sense of accomplishment. Let’s face it: even the worst requirements document still defines Done better than most management responsibilities.

But that’s not a good reason to pick a ticket and risk breaking your team’s agreement to deliver something.

So to scratch your itch, here are some suggestions:

Don’t assign yourself work on the critical path. I’m just repeating this in case it didn’t sink in. You mean well, but this is a “the road to Hell is paved with good intentions” kind of situation. If you want it released on time, assign it to someone else.
Select low-effort technical debt. Tech debt tasks such as cleaning up a smelly slice of code, refactors, test enhancement, and documentation, are often small tasks you can fit into the margins in your schedule, aren’t blocking others, and will help the team in the future.
Experiment. Is there an internal tool that would improve your team’s experience? Is there a process that could be automated? Go get started! A recent example from my team is a tool I built that monitors Jira, GitHub, and other tools and sends each team member a morning email to remind them of their obligations for the day, like PR assignments. It fit into the margins of my week just fine.

Just be aware that your team may get jealous if they see you plucking the “interesting” projects and leave them picking up scraps.
Defragment your calendar. If you really want to code, spend some time defragmenting your calendar. Basically, try to rearrange your planned engagements to be patched together with less empty space between them. A successfully defragmented calendar looks like large blocks of meetings (boo!) with large blocks of free space as a result. (yay!) Working for an organization that’s willing to embrace this concept collectively definitely helps.
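
For a sense of how small an “Experiment” project like the morning reminder email can start, here is a rough Python sketch of only the digest-building step. It is not the actual tool described above. The sample data, the field names, and the `build_digests` function are hypothetical, and fetching from Jira or GitHub and sending the email are deliberately left out.

```python
from collections import defaultdict

# Hypothetical sample data. A real tool would fetch these items from the
# Jira and GitHub APIs; that part is intentionally omitted from this sketch.
obligations = [
    {"assignee": "alex@example.com", "source": "GitHub",
     "summary": "Review PR: fix login redirect"},
    {"assignee": "sam@example.com", "source": "Jira",
     "summary": "Ticket due today: update billing report"},
    {"assignee": "alex@example.com", "source": "Jira",
     "summary": "Ticket in review: add audit log"},
]


def build_digests(items):
    """Group obligations by assignee and return {email: message body}."""
    by_person = defaultdict(list)
    for item in items:
        by_person[item["assignee"]].append(item)

    digests = {}
    for person, tasks in by_person.items():
        lines = [f"Good morning! You have {len(tasks)} item(s) today:"]
        lines.extend(f"- [{task['source']}] {task['summary']}" for task in tasks)
        digests[person] = "\n".join(lines)
    return digests


digests = build_digests(obligations)
```

Starting with plain data in and plain text out keeps the project small enough to fit between meetings; the integrations can be added one at a time later.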

Now the obligatory throat clearing: this is just a recommendation I’ve found works in my experience. I don’t always get it right. But when the urge strikes to raise my hand and say “I’ll look into that” I reflect on whether or not I actually have the time.