PSA: Postmortem(s) rollup

A small graphical overview of the previous month, now that May 2026 is complete in the Log…

See the earlier March grid for a description of the annotations.

Making the grid itself clickable was getting a little unwieldy, so instead the <details> elements below can be expanded to get each row’s links.

  • Week of May 03: Grafana, secrets, SIN, FRA, BOM, SJC, certs

    incid. date | sources        | symbol | box description
    05/04       | s.f.n, forum   |        | Grafana logs
    05/05       | infra, s.f.n   |        | secrets
    05/06       | infra, s.f.n   | ½▪     | SIN listing Machines
    05/06       | forum, forum′  |        | FRA IPv6
    05/07       | s.f.n          | ½▪     | BOM Machine creates/updates
    05/07       | s.f.n          |        | SJC networking
    05/08       | s.f.n          |        | Let's Encrypt outage
  • Week of May 10: Redis×2, Grafana, billing, SSH, FRA MPG

    incid. date | sources               | symbol | box description
    05/11       | infra, s.f.n, forum   |        | Redis
    05/11       | s.f.n                 |        | Grafana logs
    05/12       | infra, forum, forum′  |        | Redis (again)
    05/12       | infra                 | ½│     | billing lag
    05/13       | infra                 | ½│     | billing lag (continued)
    05/15       | infra, s.f.n, forum   |        | SSH & OIDC due to Consul
    05/16       | infra, s.f.n, forum   |        | FRA MPG
  • Week of May 17: SIN×4, IAD, dashboard, SYD, ORD, BOM, SJC

    incid. date | sources       | symbol | box description
    05/19       | infra, s.f.n  |        | SIN Fly Proxy
    05/19       | infra, s.f.n  |        | IAD logs & metrics
    05/19       | infra, s.f.n  |        | dashboard
    05/20       | infra, s.f.n  |        | dashboard (continued)
    05/20       | infra, s.f.n  |        | SYD egress IPs
    05/20       | s.f.n         |        | SIN networking
    05/20       | s.f.n         | ½│     | SIN high latency (segue)
    05/21       | s.f.n         |        | ORD IPv6
    05/21       | s.f.n         |        | SIN networking (again)
    05/22       | s.f.n         |        | SIN IPv6
    05/23       | s.f.n         | ½▪     | BOM networking
    05/23       | s.f.n         | ½▪     | SJC networking
  • Week of May 24: EWR, app creates, SYD×2, DNS, SJC, LAX, ORD×3, deploys

    incid. date | sources                      | symbol | box description
    05/26       | s.f.n                        | ½▪     | EWR capacity
    05/27       | infra, s.f.n                 |        | app creates
    05/27       | s.f.n                        | ½▪     | SYD 6PN
    05/28       | infra                        |        | DNS cache
    05/28       | infra, s.f.n, forum, forum′  |        | SJC edge proxies
    05/28       | infra, s.f.n, forum, forum′  |        | LAX edge proxies
    05/29       | infra, forum, forum′         |        | LAX edge proxies (continued)
    05/29       | s.f.n                        | ½▪     | ORD networking
    05/29       | s.f.n                        | ½▪     | ORD networking (again)
    05/29       | s.f.n                        | ½▪     | ORD networking (again, again)
    05/30       | infra, s.f.n                 |        | deploys
    05/30       | s.f.n                        | ½▪     | SYD 6PN
  • Week of May 31: ORD

    incid. date | sources | symbol | box description
    05/31       | s.f.n   |        | ORD IPv6

(In the expanded tables’ second columns, “s.f.n” is status.flyio.net, the real-time status page, which serves as a second-tier source in this context.)

May followed the (lately) typical pattern of a handful of heftier boxes within an expansive speckling of relatively minor ones. The SVG pipeline that constructed the above enforces a minimum width and height; otherwise, some of these would barely be visible. Since there are more pixels to work with overall now, the minimums are roughly half of what they were in earlier renderings.
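
For anyone curious what that kind of minimum-size enforcement looks like, here is a minimal sketch in Python. It is not the actual pipeline: the `Box` type, the `clamp_box` and `to_svg_rect` helpers, and the 4-pixel minimums are all hypothetical, chosen only to illustrate the idea of never letting a box shrink below a visible size.

```python
from dataclasses import dataclass

# Hypothetical minimums, in pixels. The real pipeline's values are not
# stated in this post; these are placeholders for illustration.
MIN_WIDTH = 4.0
MIN_HEIGHT = 4.0


@dataclass
class Box:
    """One incident box in the grid (assumed shape, not the real schema)."""
    x: float
    y: float
    width: float
    height: float


def clamp_box(box: Box,
              min_width: float = MIN_WIDTH,
              min_height: float = MIN_HEIGHT) -> Box:
    """Return a copy of the box whose sides are at least the minimums."""
    return Box(
        box.x,
        box.y,
        max(box.width, min_width),
        max(box.height, min_height),
    )


def to_svg_rect(box: Box) -> str:
    """Render the box as a bare SVG <rect> element."""
    return (
        f'<rect x="{box.x:g}" y="{box.y:g}" '
        f'width="{box.width:g}" height="{box.height:g}"/>'
    )


# A very short incident that would otherwise be a sliver half a pixel wide.
tiny = Box(x=10, y=20, width=0.5, height=30)
print(to_svg_rect(clamp_box(tiny)))
```

Halving the minimums, as described above, would then just be a matter of passing smaller `min_width` and `min_height` values.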

Of the more memorable cases…

   

The widely used Redis extension went down May 11–12, due to a mismatch between Linux kernel and hypervisor versions. Pathological behavior was triggered by virtualization guest log traffic.


    

The essential secrets and/or certificates features glitched on May 5 and May 27, although each time for only half an hour. Both involved the PetSem codebase, including its growing pains in expanded roles.

[Edit: see @lillian’s clarification below.]


    

The West Coast proxy overloads on May 28 and a bit of May 29 affected a lot of people, due to the prominence of that part of the world in all things Internet, but fortunately the durations of those were mainly in the 2-hour range (albeit with aftershocks).