Two Color Journaling

I journal in black and red. Almost everything is black because I reserve red for anything on theme or particularly important to me. By “on theme”, I mean consistent with my yearly theme, which is to Make Art with Friends.

This has advantages both in review and while journaling during the day.

I am using a Recurring Journal this year, which means that when I journal I am seeing a past day on a different part of the page. If that past day was particularly red, I can try to match it. I can also easily spot the red ink as I flip around. At the end of the year, when I review the journal, it will be easy to pick out the important parts.

During the day, I can see how much red ink I have or have not used, and try to get something in to make the day more red.

Not every day needs to have some red on it, but sometimes I have a free moment, and since I have a low bar for making art, I can easily do it.

Morning Pages Make Me Feel Like ChatGPT

In the first episode of my podcast I said that I do morning pages to train myself to write on demand, and then I followed that up in Episode 3 where I explained that I use the momentum from morning pages to write a first draft of something.

While doing my morning pages last week I thought about how doing them is kind of like how ChatGPT generates text. It’s just statistically picking the next word based on all the words so far in the prompt and what it has already generated.

I am also doing something like that in my morning pages. I am writing writing writing, and I use the words so far to guide my next ones.

My mind strays as I write and a phrase might trigger a new thread, which I follow for a bit and then follow another and another. ChatGPT’s results are a lot more coherent than my morning pages. It has an uncanny ability to stay on topic because it is considering all of the text, and I don’t.
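As a toy illustration of that next-word process (this is not how ChatGPT actually works — a real model weighs the entire context, while this sketch only looks at the previous word, which makes it stray even more than my morning pages do):

```python
import random
from collections import defaultdict

# Toy next-word generator: learn bigram counts from a sample text,
# then pick each next word based only on the word before it.
text = "i write in the morning and i write in my journal every morning".split()

counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(text, text[1:]):
    counts[prev][nxt] += 1

def next_word(prev):
    # Sample the next word in proportion to how often it followed `prev`.
    options = counts[prev]
    words = list(options)
    weights = [options[w] for w in words]
    return random.choices(words, weights=weights)[0]

word = "i"
generated = [word]
for _ in range(8):
    word = next_word(word)
    generated.append(word)
print(" ".join(generated))
```

With only one word of context, the output wanders from phrase to phrase — which is roughly what my pages look like.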

First drafts are different. When I switch to writing a first draft, I do consider the entire text. I’m not as fast, because I am constantly looking at what I have so far. I also start with a prompt in the form of a simple message that I hope to convey, which I use as the working title.

I know I could get a first draft faster from ChatGPT, but it would not be as good (I think), or at least not specific to me. More importantly, I would not have improved as a writer.

[NOTE: While writing a draft of this post, I thought of a way to make my morning pages more directed and made a podcast about it]

Write While True Episode 19: Prompt Your Morning Pages

If you are just coming to this podcast on this episode, I have to tell you that I talk about Morning Pages a lot. It was the subject of Episode 1. Listen to that for the full description of what they are or read the book The Artist’s Way by Julia Cameron, which is where I was introduced to the idea.

The main thing to know is that I start each morning by writing three pages of longhand writing in an automatic, stream-of-consciousness style. I never show them to anyone and they aren’t meant to be published. The point is to train my brain to generate text on demand.

If you read the entire thing, you would find a couple of sentences in a row here and there that make some sense, but overall, it’s not well-structured in any way. My pages tend to stray from topic to topic because I’m not considering the entire text.

This is the breakthrough I had today.


The System Boundary is Defined by the External Pieces

In a C4 System Context diagram, you start by drawing a blue box in the center. That’s your system. Then you draw some blue stick figures with arrows pointing at that box. Those are your users.

An empty blue box next to a blue human shape. An arrow points from the human to the box.

Every system in the world pretty much looks the same if you stop there. Put some words on the parts to make it more specific.

A blue box next to a blue human shape. The blue box says "Sprint-o-Mat: a watchOS app to guide programmed runs", and under the human is the caption "Runner". An arrow points from the human to the box and says "sets up and runs with". This is the system in a context diagram.

But this diagram is called a Context diagram for a reason. The most important part is not the system box (the three other types of C4 diagrams will elaborate on it), but all of the gray boxes and stick figures you put around it.

The same blue box and Runner as before, now surrounded by gray boxes that say HealthKit, RunGap, and Running Social Networks. These are external context systems. There is also a gray set of humans labeled "other runners". The diagram shows the relationships between them.

These are the external pieces that are not the system and are not your users. They are out-of-scope, but they do a lot of work in the diagram to help describe the system.
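To see how much of the diagram lives outside the system box, here is the same context sketched as plain data (the names are from the diagram; the structure is my own, not from any official C4 tool):

```python
# A minimal sketch of the Sprint-o-Mat context diagram as data.
# The point: most of the entries are *outside* the system box.
context = {
    "system": "Sprint-o-Mat: a watchOS app to guide programmed runs",
    "people": [
        {"name": "Runner", "relationship": "sets up and runs with", "in_scope": True},
        {"name": "Other runners", "in_scope": False},
    ],
    "external_systems": ["HealthKit", "RunGap", "Running Social Networks"],
}

# Everything that defines the boundary is out of scope.
out_of_scope = context["external_systems"] + [
    p["name"] for p in context["people"] if not p["in_scope"]
]
print(out_of_scope)
```

Four of the six named things in the diagram are out-of-scope — and they are what make the system specific.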

Metrics that Resist Gaming

The other day I described CRAP, which is a metric that flags code that is risky to change. I suggested that if you need to change that code, you start with tests and refactoring.

Tests and refactoring are positive for the codebase and improve the readability of the PR they are in, which is why I like CRAP as a metric—it’s hard to game.

In contrast, story points (and velocity, which is based on them) are the exact opposite.

Story points are completely made-up numbers that have low accountability. Velocity is just points-over-time, so it’s also made up. If a team must improve velocity (for their manager), the easiest thing to do is to over-estimate the points per task, and like magic, velocity can meet any target. I don’t think engineers would do this consciously, but this is just a known phenomenon of metrics (see Goodhart’s Law).
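The arithmetic of the gaming is trivial (numbers here are invented for illustration):

```python
# Velocity is just points-over-time, so inflating estimates inflates it.
def velocity(points_per_task, tasks_done, sprints):
    return points_per_task * tasks_done / sprints

honest = velocity(points_per_task=3, tasks_done=10, sprints=2)   # 15.0
padded = velocity(points_per_task=5, tasks_done=10, sprints=2)   # 25.0

# Same work shipped, "better" velocity: Goodhart's Law in action.
print(honest, padded)
```

Nothing about the team changed between the two lines except the estimate.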

This is one of the reasons I don’t use Story Points. But to be honest, almost any estimation technique is ripe for gaming (à la Scotty from Star Trek).

When I wrote about DevEx, a new developer productivity methodology, I wrote that I thought that they “do help engineers deliver software better and faster”, but that they are most useful to the team itself (not stakeholders).

Looking over that article, I realized that the thing I like about these metrics is that they are hard to game. If I get several 3-hour blocks of uninterrupted coding time per week, then I am sure I can write more and better code than if I didn’t. Counting lines of code (and judging their quality) is fraught, but hours of uninterrupted time are easy to count, and more is better.

If you are worried that coders will shirk other duties (like code reviews or attending meetings), there is another metric to measure feedback loops, which is in tension with the flow metric.

My main critique of DevEx still stands—it’s not something to report outside of the team. But the more I think of it, the more I like it and will try implementing it on my (1-person) team.

My New Podcast Generating Workflow

In Accessibility First in Podcasts, I wrote that since my podcasts are scripted, I don’t have to work hard to get a transcript.

But I found that writing a script was hard and resulted in a podcast that sounded written. Even when I memorized and performed it well, it didn’t sound like my spoken “voice”.

So, I decided to:

  1. Start with a rough outline
  2. Make a recording of a lot of extemporaneous speaking on the subject
  3. Make a transcription of the recording using Whisper from OpenAI
  4. Edit the transcription into a coherent story, but try to preserve the phrasing
  5. Practice it
  6. Make a recording that basically follows the script. It’s ok to make mistakes, rephrase, or veer off.
  7. Edit the recording to remove mistakes and reduce overly long pauses
  8. Listen to the recording and fix the script so it’s now a transcript.
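Step #3 can be a few lines of Python with OpenAI’s open-source whisper package (a sketch, assuming you have it installed and a recording named rough-take.mp3):

```python
import whisper

# Transcribe the extemporaneous recording (step #3 of the workflow).
model = whisper.load_model("base")  # small and fast; "medium" is more accurate
result = model.transcribe("rough-take.mp3")

# Save the raw text so it can be edited into a script (step #4).
with open("rough-transcript.txt", "w") as f:
    f.write(result["text"])
```

The raw transcript is messy, but that’s fine — editing it is the point of step #4.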

The key thing is step #4, which helps me make a script that sounds like me talking (not writing).

Use Your First Commit to Fix CRAP

The CRAP metric combines cyclomatic complexity and test code coverage to flag functions that are both complex and under-tested, so that you can see which functions are risky to change.

There are extensions for many IDEs that will give you the metric directly or show its parts (test coverage and complexity). But you don’t really need them, because you know CRAP-y code when you see it: run unit tests to see if the function is under test, and eyeball the complexity by counting the branches and logical sub-expressions. You can stop counting at about four, because more than that is probably CRAP.
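For reference, the published formula is short enough to hold in your head (comp is cyclomatic complexity, cov is the fraction of the function covered by tests):

```python
def crap_score(complexity, coverage):
    """CRAP = comp^2 * (1 - cov)^3 + comp, with coverage as a fraction 0..1."""
    return complexity ** 2 * (1 - coverage) ** 3 + complexity

# A complex, untested function scores high...
print(crap_score(10, 0.0))   # 110.0
# ...and full coverage brings it down to just its complexity.
print(crap_score(10, 1.0))   # 10.0
```

Notice that coverage is cubed away: adding tests drops the score fast, which is why tests are the first step below.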

So, if you have to change a CRAP-y function, you could start the PR by trying to lower the score.

The first step to reducing CRAP scores is to add tests. Complex functions are often hard to test, but add whatever tests you can to start, because they help with the next step.

Next, lower complexity by refactoring the function into simpler parts. The tests you just added will make sure you do it right, but these should be simple, mechanical refactors that might even be automatable by your IDE. If they are not trivial, you need to add more tests. Do not restructure or rewrite code unless that’s the goal of the PR—all of your changes should preserve the observable behavior of the code.
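Here is the kind of mechanical refactor I mean (the code is invented for illustration; the behavior before and after is identical):

```python
# Before: one branchy function with duplicated logic.
def shipping_cost(weight, is_express):
    if is_express:
        if weight > 20:
            return weight * 2.0 + 10
        return weight * 2.0
    if weight > 20:
        return weight * 1.0 + 10
    return weight * 1.0

# After: extract the shared surcharge logic; observable behavior is unchanged.
def surcharge(weight):
    return 10 if weight > 20 else 0

def shipping_cost_refactored(weight, is_express):
    rate = 2.0 if is_express else 1.0
    return weight * rate + surcharge(weight)
```

Each extracted piece is simple enough to test on its own, which is what lowers the complexity half of the CRAP score.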

I start a lot of PRs this way. It’s a good way to get warmed up, and you know that you are improving the codebase in a place that benefits the most from it. You are paying technical debt down right before an interest payment comes due.

First Rule of Refactoring Club

Don’t talk about refactoring club.

A long time ago, I linked to this post by Martin Fowler (author of Refactoring), where he lamented the misuse of the word “refactoring”:

However the term “refactoring” is often used when it’s not appropriate. If somebody talks about a system being broken for a couple of days while they are refactoring, you can be pretty sure they are not refactoring. [This is] restructuring.

For me, refactoring might be part of every PR. My first commit is often a refactoring that makes the rest of the commits easier to do and understand. I might also refactor at the end, but those commits will be squashed before I PR since you don’t need to see how I got there.

In TDD, there’s a specific practice to Red, Green, Refactor your way to working code (or as I do it Green, Refactor, Red) that explicitly thinks of refactoring as a small thing you do often.

The tell that you are doing refactoring wrong is that it feels like something to talk about. Refactoring, when done well, is about as interesting as variable naming.

It’s not not interesting, but you don’t need to talk about it in a stand-up.

Making Sausage and Delivering Sausage

There’s an article about DevEx, a new developer productivity methodology, in ACM Queue. If you subscribe to the Pragmatic Engineer newsletter, there was an interview with the article’s authors last week. This is the latest methodology from the people behind DORA and SPACE.

DORA’s measurements were grounded in externally visible outcomes.

  • Deployment Frequency
  • Mean Time to Recovery
  • Change Failure Rate
  • Lead Time

The idea was to pick things that engineers could actually control. Even though the elements of DORA are not directly translatable to business outcomes, they are still understandable to external stakeholders.

In SPACE, these metrics are still one kind that we collect, but SPACE also recognizes that there are other things besides Performance and Activity metrics (the P and A of SPACE). It also considers Satisfaction, Communication, and Efficiency, which are more internal to the team.

In DevEx, the emphasis is on internal metrics: Flow, Cognitive Load, and Feedback Loops.

I want to say upfront that I completely agree that these things do help engineers deliver software better and faster. But they are hard to share outside of the team. It’s how the sausage is made. The business ultimately needs to deliver sausage.

Aside from the rest of the business not understanding or caring about these metrics, I also worry that they will try to get too involved in them. Engineering leadership should care a great deal about the cognitive load of the members of their teams, and should work to lower it, but they need to find a better way to express that outside of engineering if they do.

I know the DevEx authors know this, and the emphasis on these “making sausage” metrics doesn’t mean that they think externally visible performance is unimportant (they did, after all, design DORA and SPACE). But if you deliver on, for example, long flow states, but there isn’t more useful software on servers, you have failed the business objective. This is the same thing I said about Story Points—they are too far removed from things people outside of engineering care about:

[…] regular people just translate [story points] to some notion of time anyway, and in that regard, I am very regular. If you are going to take some random number I provide and assign a date to it, I’d much rather I do the translation for you.

To the extent that you report directly on DevEx, try to emphasize the parts outsiders can help with. Frequency of meetings and speed of external feedback loops (especially from product management) are good examples of that.

Write While True Episode 18: Taking My Own Advice

I thought it would be a good idea to re-listen to all of my podcasts from season one. Each of them is only about 10 minutes and there are only 15 episodes, so it doesn’t take too long.

This had two effects.

First, I realized they weren’t as bad as I thought, which made me feel better about restarting.

The second thing is that I started to hear the advice almost as if it was coming from a third party because I had recorded these so long ago. I had dropped many of these practices during my break, so it was almost like hearing from a different person. But that person was making podcasts and I wasn’t, so I decided to listen to him.