Author Archives: Lou Franco

Use S3 to Serve Podcast Episodes

I started a podcast about a month ago, and for various reasons I decided to self-host it rather than use a podcast service. I am doing this mainly because I want the episodes to be available indefinitely, even if I stop making new ones, and I don’t want to pay for just hosting. I also don’t care about analytics, and I have the skills and desire to learn how to self-host.

I think this is the wrong choice for almost everyone who podcasts.

But, if you got this far, I will say that it’s probably right not to just put your mp3 files on your web-host. I haven’t really done the math, but these are large files, and if you get any kind of traffic, it will probably be expensive and possibly send you over your caps.

I’ve decided that the minimum I need to do is to use S3. I think it’s probably technically correct to also use a CDN, but I’ll cross that bridge if I get more traffic.

(If you have no idea what S3 or a CDN is, I really recommend you do not go down this route.)

There are a lot of good guides out there for the specifics. I used these two:

In addition to setting up a bucket for your .mp3 files and artwork, I suggest you set up a separate bucket for logs and then send web access logs to that bucket. The AWS official docs are good to see how to do this.

By having logs stored you have enough to get some simple analytics. There are services that can read and graph the data in them.
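As a rough idea of what "simple analytics" can look like, here is a minimal Python sketch that counts episode downloads from S3 server access log lines. It is not a finished tool: the bucket name, keys, and sample lines are made up, and it assumes the standard space-delimited S3 access log layout (owner, bucket, time, IP, requester, request ID, operation, key, request URI, status, ...). Counting one download per (episode, IP) pair is my own simplification, chosen to damp the many partial-content requests podcast apps tend to make.

```python
import re
from collections import Counter

# Leading fields of an S3 server access log line (assumed standard layout).
LOG_LINE = re.compile(
    r'^(?P<owner>\S+) (?P<bucket>\S+) \[(?P<time>[^\]]+)\] (?P<ip>\S+) '
    r'(?P<requester>\S+) (?P<request_id>\S+) (?P<operation>\S+) (?P<key>\S+) '
    r'(?:"[^"]*"|-) (?P<status>\d{3}|-)'
)


def count_downloads(lines):
    """Count successful .mp3 GETs, at most one per (episode, IP) pair."""
    seen = set()
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # not a log line we understand
        if m["operation"] != "REST.GET.OBJECT":
            continue  # ignore listings, HEAD requests, etc.
        if m["status"] not in ("200", "206"):
            continue  # ignore errors; 206 is a partial (range) response
        if not m["key"].endswith(".mp3"):
            continue  # ignore artwork and other files
        seen.add((m["key"], m["ip"]))
    return Counter(key for key, _ip in seen)


# Made-up sample lines, for illustration only.
SAMPLE_LINES = [
    'OWNERID example-podcast-bucket [06/Feb/2021:00:00:38 +0000] 192.0.2.3 - REQ0001 '
    'REST.GET.OBJECT episodes/ep1.mp3 "GET /episodes/ep1.mp3 HTTP/1.1" 200 - 1000 1000 7 6 "-" "ExampleApp/1.0" -',
    'OWNERID example-podcast-bucket [06/Feb/2021:00:00:40 +0000] 192.0.2.3 - REQ0002 '
    'REST.GET.OBJECT episodes/ep1.mp3 "GET /episodes/ep1.mp3 HTTP/1.1" 206 - 500 1000 7 6 "-" "ExampleApp/1.0" -',
    'OWNERID example-podcast-bucket [06/Feb/2021:00:01:00 +0000] 203.0.113.9 - REQ0003 '
    'REST.GET.OBJECT episodes/ep1.mp3 "GET /episodes/ep1.mp3 HTTP/1.1" 200 - 1000 1000 7 6 "-" "ExampleApp/1.0" -',
    'OWNERID example-podcast-bucket [06/Feb/2021:00:02:00 +0000] 203.0.113.9 - REQ0004 '
    'REST.GET.OBJECT artwork.jpg "GET /artwork.jpg HTTP/1.1" 200 - 300 300 7 6 "-" "ExampleApp/1.0" -',
    'OWNERID example-podcast-bucket [06/Feb/2021:00:03:00 +0000] 203.0.113.9 - REQ0005 '
    'REST.GET.OBJECT episodes/ep2.mp3 "GET /episodes/ep2.mp3 HTTP/1.1" 404 NoSuchKey 0 - 7 - "-" "ExampleApp/1.0" -',
]

if __name__ == "__main__":
    for key, count in count_downloads(SAMPLE_LINES).most_common():
        print(key, count)
```

The numbers this produces are approximate. Listeners behind the same IP collapse into one, and one listener on two networks counts twice.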

I will post soon about how I scripted a simple way to get episode download counts.

Tech Debt Happens to You

In the original ANSI C, there are a bunch of library functions that use internal static variables to keep track of state. For example, strtok is a function that tokenizes strings. You call it once with the string, and then you call it with NULL over and over until you finish reading all of the tokens. To do this, strtok uses static variables to keep track of the string and an iterator into it. In early C usage, this was fine. You had to hope that any 3rd party library calls you made while iterating tokens weren’t also using strtok because there could only be one iterator at a time.
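To make the hidden-state problem concrete, here is a small Python model of how strtok behaves. It is not the C implementation, just a sketch that keeps the "rest of the string" in a module-level variable the way strtok keeps it in a static. The helper names (`count_words`, `tokens_with_helper`, `tokenize`) are invented for the example.

```python
# Module-level state, standing in for strtok's internal static variable.
_saved = None


def strtok(s, delims):
    """Model of C strtok: pass a string to start, None to continue."""
    global _saved
    if s is not None:
        _saved = s  # starting a new scan silently replaces any scan in progress
    if _saved is None:
        return None
    rest = _saved.lstrip(delims)  # skip leading delimiters
    if not rest:
        _saved = None
        return None
    end = 0
    while end < len(rest) and rest[end] not in delims:
        end += 1
    token = rest[:end]
    _saved = rest[end + 1:] if end < len(rest) else None
    return token


def tokenize(s, delims):
    """Collect every token from one uninterrupted scan."""
    out = []
    tok = strtok(s, delims)
    while tok is not None:
        out.append(tok)
        tok = strtok(None, delims)
    return out


def count_words(text):
    """Stand-in for a third-party call that happens to use strtok too."""
    return len(tokenize(text, " "))


def tokens_with_helper(line):
    """Outer scan over commas that calls count_words on each token."""
    out = []
    tok = strtok(line, ",")
    while tok is not None:
        out.append((tok, count_words(tok)))  # clobbers the outer scan's state
        tok = strtok(None, ",")
    return out


if __name__ == "__main__":
    print(tokenize("a b,c d,e", ","))        # all three comma-separated tokens
    print(tokens_with_helper("a b,c d,e"))   # stops after the first token
```

The outer loop ends after the first token because the inner scan left the shared state exhausted. With threads, the same clobbering happens without any nested call at all.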

But when threads were introduced to UNIX and C, this broke down fast. Now, your algorithms couldn’t live in background threads if they used strtok. This specific problem was solved with thread-local variables, but the pervasive use of global state inside of C-functions was a constant source of issues when multi-threading became mainstream.

The world was switching from desktop apps to web apps, so now a lot of your code lived in a multi-threaded back-end that serviced simultaneous requests. This was a problem because we took C-libraries out of our desktop apps and made them work in CGI executables or NSAPI/ISAPI web-server extensions (similar to Apache mod_ extensions).

To make this work, we had to use third-party memory allocation libraries because the standard malloc/free/new/delete implementations slowed down as you added more processors (from constant lock contention). Standard reference-counting implementations used normal ++ and -- which aren’t thread-safe, and so we needed to buy a source code implementation of stl that we could alter to use InterlockedIncrement/InterlockedDecrement (which are atomic, lock-free, and thread-safe).

As the world changed around us, we could keep moving forward with these tech-debt payments.

Also, this was a slow-paced problem: strtok, malloc, and the rest were written in the 70s and limped through the 90s. That’s actually not that bad.

But, the world doesn’t stop. Pretty soon, it was just too weird to implement back-ends as ISAPI extensions. So, you pick Java/SOAP because CORBA is just nuts, and well, that’s wrong because REST deprecates that, and then GraphQL deprecates that, and you picked Java, but were you supposed to wait for node/npm? Never mind what’s going on on the front-end as JS and CSS frameworks replace each other every 6 months. Even if you are happy with your choice, are you keeping your dependencies up to date, even through the major revisions that don’t follow Semantic Versioning?

And I think that this is the main source of tech debt, not intentional debt that you take on or the debt you accumulate from cutting corners due to time constraints. The debt that comes with dependency and environment changes.

Being able to bring code into your project or build on a framework is probably the only thing that makes modern programming possible, but like mortgages, they come with constant interest payments and a looming balloon payment at some point.

There are some you can’t avoid, like the OS, language, and probably database, but as you go down the dependency list, remember to factor in the debt they inevitably bring with them.

Timing Your Tech Debt Payments

It’s impossible to ignore that developers have a visceral reaction against tech debt, even if they agree that it’s worth it. That’s because they are the ones that need to service the debt.

Tech debt is a cost similar to real-life debt like a mortgage. If you can use tech debt to bring forward revenue and growth, you can pay off the debt later.

But, until then, the interest must be paid.

So, when you are calculating the cost of taking on some debt, a factor in that calculation is how much future work is going to happen on that code. The more work you do, the more interest you pay. If you fix bugs or add features to debt-laden code, you are servicing the debt by making an interest payment. If you refactor, you are paying off principal, and future interest payments are lowered, but that only matters if there are going to be future interest payments.

If you have a system that works and doesn’t need any changes, the fact that it has tech debt doesn’t matter.

To carry the analogy forward, some mortgages have penalties for early payment. Paying off tech debt also has a penalty, usually in QA and system stability.

This is why my favorite time to pay off tech debt is just before a major feature is being added to indebted code. You are trading off the looming interest payments (which will balloon) and your penalty is already being incurred, because you need to QA that whole area again anyway.
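The argument above can be turned into a toy calculation. This is only a sketch with invented numbers (think of the units as developer-days), and it assumes, unrealistically, that a refactor removes all future interest. The function and parameter names are mine.

```python
def refactor_pays_off(future_changes, interest_per_change, refactor_cost, qa_penalty):
    """True if the interest avoided exceeds the cost of paying off the principal.

    future_changes:      how many times the indebted code will be touched again
    interest_per_change: extra effort each of those changes costs because of the debt
    refactor_cost:       effort to pay off the principal
    qa_penalty:          extra QA and stability risk caused by the refactor itself
    """
    interest_avoided = future_changes * interest_per_change
    return interest_avoided > refactor_cost + qa_penalty


if __name__ == "__main__":
    # Code that will never change again: no interest, so the debt doesn't matter.
    print(refactor_pays_off(0, 2, 5, 3))
    # A few changes coming, but the QA penalty tips the balance against refactoring.
    print(refactor_pays_off(3, 2, 5, 3))
    # Same changes, just before a major feature: the area gets re-tested anyway,
    # so the penalty attributable to the refactor is treated as zero.
    print(refactor_pays_off(3, 2, 5, 0))
```

The third case is the one I care about: when the QA cost is already being paid for the feature, the same refactor flips from not worth it to worth it.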

AirTags Could Be Used for Precise Indoor Location

I don’t think there’s going to be an SDK for AirTags, and they seem to be designed to be found by a single person, but the same technology could be used to precisely locate myself indoors (if AirTags were installed to create a mesh).

This is supposedly what iBeacons do, but I’ve heard from people trying to deploy them that the technology doesn’t work very well. I don’t really know anything about this at all, but here’s a contrary view from someone who knows beacon tech better:

We don’t believe that these tags will replace the current generation of BLE beacons for a few reasons:

  • These UWB Tags will require a new (circa 2020) Apple or Samsung
  • They will not be compatible with most of the existing gateways
  • These tags will most likely initially only work with the proprietary applications on Apple or Samsung Phones
  • Apple and Samsung UWB seem to be geared towards finding lost items, not providing all of the other sensor data that current BLE beacons do
  • BLE Beacons will be much, much cheaper than these UWB Tags will be

And, this is probably true today with AirTags as they are. But, this article also says that Google is dropping support for BLE beacons, so there is some problem here.

What is Bicycle?

Bicycle is an open-source framework that I’ve been working on with a couple of friends. One way to think about it is to compare it to a spreadsheet.

In a spreadsheet, we are building a directed acyclic graph of cells, where cells are nodes and each formula in the cell defines the edges and direction.

If A1=B1+C1, then both B1 and C1 point to A1. The graph cannot contain cycles, so, in a spreadsheet, you can’t then say that B1=A1-C1 even though that is true, because it would cause a cycle in the graph.
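Here is a minimal Python sketch of the check a spreadsheet has to make before accepting a formula. It is an illustration of the idea, not how any particular spreadsheet implements it. The `formulas` dictionary (cell name to the set of cells its formula reads) and the function name are assumptions for the example.

```python
def creates_cycle(formulas, cell, inputs):
    """Would defining `cell` as a formula over `inputs` make `cell` depend on itself?

    formulas maps each cell that already has a formula to the set of cells it reads.
    """
    stack = list(inputs)
    seen = set()
    while stack:
        current = stack.pop()
        if current == cell:
            return True  # we walked back to the cell being defined
        if current in seen:
            continue
        seen.add(current)
        # Follow the cells that `current` itself is computed from.
        stack.extend(formulas.get(current, ()))
    return False


if __name__ == "__main__":
    formulas = {"A1": {"B1", "C1"}}  # A1 = B1 + C1
    print(creates_cycle(formulas, "B1", {"A1", "C1"}))  # B1 = A1 - C1
    print(creates_cycle(formulas, "D1", {"A1"}))        # D1 = A1 * 2
```

The first call reports a cycle because B1 would be computed from A1, which is already computed from B1. The second is fine because nothing depends on D1.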

Bicycle defines a data-structure and algorithm that gives meaning to a graph of formulas that does contain cycles where dependencies between nodes can be bidirectional (hence, Bicycle).

In Bicycle, you can define both of the formulas above and also complete the network with C1=A1-B1. In more complex networks, an individual field may have several different formulas, using different dependencies that can set its value.

Once you define a network, you can seed it with values. These are kept outside of the formula data-structure in a kind of priority queue. The highest priority values are seeded first, and each value is only accepted into the network if the network can remain consistent.

Meaning, I could define A1 as 2, B1 as 3, but if I then say C1 is 7, I have created an inconsistency. When you attach this to a UI, you would want the oldest value to be discarded.
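To show the seeding idea in code, here is a minimal Python sketch. It is not Bicycle's API or its actual algorithm (Bicycle is a Swift framework); it is a stand-in that uses the three-formula network from this post, tries seed values in priority order, and keeps each one only if no formula contradicts it. All names here are invented for the example.

```python
# The network from the text: each entry is (target, inputs, function).
FORMULAS = [
    ("A1", ("B1", "C1"), lambda b, c: b + c),
    ("B1", ("A1", "C1"), lambda a, c: a - c),
    ("C1", ("A1", "B1"), lambda a, b: a - b),
]


def propagate(values):
    """Derive every value the formulas allow, or return None on a contradiction."""
    known = dict(values)
    changed = True
    while changed:
        changed = False
        for target, inputs, fn in FORMULAS:
            if all(name in known for name in inputs):
                result = fn(*(known[name] for name in inputs))
                if target not in known:
                    known[target] = result
                    changed = True
                elif known[target] != result:
                    return None  # a formula disagrees with a value we already hold
    return known


def seed(candidates):
    """candidates: (field, value) pairs, highest priority first.

    Returns the accepted seeds and the full set of values they imply.
    """
    accepted = {}
    for field, value in candidates:
        trial = dict(accepted)
        trial[field] = value
        if propagate(trial) is not None:
            accepted = trial  # consistent, so keep this seed
    return accepted, propagate(accepted)


if __name__ == "__main__":
    # Oldest value has the highest priority: C1=7 is rejected.
    print(seed([("A1", 2), ("B1", 3), ("C1", 7)]))
    # Newest value has the highest priority (the UI case): A1=2 is discarded.
    print(seed([("C1", 7), ("B1", 3), ("A1", 2)]))
```

Reversing the priority order is all it takes to get the UI behavior described above, where the oldest entry is the one that gives way.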

We are also providing some help building SwiftUI based UIs with it. The network is meant to be hosted in an ObservableObject and we provide a TextField initializer that will bind to fields in it. Here’s a demo of a network that can convert between yards, feet, and inches.

Try to imagine replicating that in Excel. You’d have to pick one of the fields to be user-provided and put formulas in the other two. In Bicycle, you can provide as many formulas as you want (as long as they are consistent with each other) and seed-values are used as long as they are consistent.

It’s in very early stages and the API will probably change a lot, but if you want to take a look, see SwiftBicycle in GitHub.

Write While True Episode 6: Editing First Drafts

I was first exposed to this idea at The Business of Software conference in 2017. Joanna Wiebe gave a talk about copywriting for SaaS businesses. She’s an advertising copy writer, and the talk is mostly about that. It’s worth watching the whole thing, but near the end, she said something that astonished me.

Transcript

In Defense of Tech Debt

I’m a fan of tech debt if used properly. Just like real debt, if you can pull some value forward and then invest that value so that it outgrows the debt considerably, it’s good.

Mortgages and tuition-debt can possibly do this. Credit card consumption debt does not. If your tech debt looks like the former, do it.

For example, if you can close a big enterprise deal with some tech debt, and the alternative is another round of VC to “do it right”, I think it’s obvious to hack away. When you close that deal, your valuation goes up. Maybe you don’t even need to raise.

The decision depends on the specifics. Tech debt isn’t “bad”, it’s a cost. Calculate the cost.

It can be worth it.

Audio Variables

We can apply principles of visual design to designing audio. In visual design, we can manipulate “visual variables” to get different communication effects. The variables are:

  • Size
  • Value/Brightness
  • Hue/Color
  • Orientation/Rotation
  • Texture
  • Shape
  • Position

Continuing along with sonifications, here are some analogous audio variables:

  • Frequency / pitch
  • Beat
  • Amplitude / volume
  • Envelope / Waveform: e.g. a beep vs. a buzz
  • Source Location

We have different ways of perceiving visual variables. For example, for some of them, we have an order—size goes from small to big, but shapes are not ordered.

We also have different ideas about how many variations we can tell apart on the same canvas. We can distinguish between a huge number of shapes, but probably only a few levels of brightness.

These kinds of perception apply to audio variables as well. Waveform seems to be related to shape in that there can be many types (instruments), but they are not really ordered.

Amplitude is similarly related to size, pitch is like brightness, and source location is like position.

Using what we know about combining and contrasting visual variables is probably a good start for doing the same with audio.
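As a small illustration of treating pitch as an ordered variable, here is a Python sketch that maps data values onto frequencies and renders them as sine-wave samples. The frequency range (220 Hz to 880 Hz), the fixed amplitude, and the function names are arbitrary choices for the example, not recommendations.

```python
import math


def value_to_frequency(value, lo, hi, f_min=220.0, f_max=880.0):
    """Map a value in [lo, hi] to a frequency, spaced evenly in perceived pitch.

    Pitch perception is roughly logarithmic, so we interpolate the exponent
    rather than the frequency itself.
    """
    t = (value - lo) / (hi - lo)
    t = min(1.0, max(0.0, t))  # clamp out-of-range values
    return f_min * (f_max / f_min) ** t


def tone(frequency, amplitude, seconds=0.25, sample_rate=44100):
    """Return raw sine-wave samples in the range [-amplitude, amplitude]."""
    n = int(seconds * sample_rate)
    return [
        amplitude * math.sin(2 * math.pi * frequency * i / sample_rate)
        for i in range(n)
    ]


def sonify(series, lo, hi):
    """One short tone per data point: higher value, higher pitch."""
    samples = []
    for value in series:
        samples.extend(tone(value_to_frequency(value, lo, hi), amplitude=0.5))
    return samples


if __name__ == "__main__":
    samples = sonify([0, 5, 10], lo=0, hi=10)
    print(len(samples), "samples")
```

Amplitude is held constant here so that only one variable changes at a time. Mapping a second data dimension onto amplitude would be the audio version of varying size and brightness together.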

Self-hosting a Podcast in WordPress

I started a podcast about a month ago that helps programmers develop a writing habit. I looked at all the podcast hosts and ultimately decided to self-host. This is probably not for everyone, but here was my rationale:

  1. I already use WordPress for this blog, and I didn’t want a site specifically for the podcast as it is related to the other content here.
  2. I am unsure if I’ll make new episodes indefinitely, but I know that I want the episodes available indefinitely.
  3. I don’t have plans to add sponsors. If it ever got popular enough where that was an option, I think I’d rather point it towards my own products.
  4. I have enough technical skill to understand how podcast publishing works and can deal with rolling my own pieces if I need to.
  5. I am unwilling to compromise on privacy and the published URLs for files.

Given those attributes, most podcast hosts weren’t worth it for me. I just don’t care about analytics that much. I have no problem parsing web-access logs to get download counts.

So, I looked around, and for WordPress there is a great option: PowerPress, a free podcast plugin from Blubrry.

The plugin will handle generating the RSS feed and will walk you through submitting it to Apple, Google, and other directories. It has embeddable players that you can use on your episode pages.

If you don’t want to self-host, they provide a hosting service that you can access via the plugin with reasonable options, even for small shows.

But, they also support you hosting the mp3 files yourself and don’t require that you use their service at all. They even have a free, minimal analytics service for self-hosters. I don’t use it, because they require that you use their URLs and they redirect.

I’ll follow up this article with one about how I use Amazon S3 for the mp3 files and how I get some idea of what the download counts are.