Wed, 04 Dec 2024 13:45:55 +0000
Pluralistic: "That Makes Me Smart" (04 Dec 2024)


Today's links



Uncle Sam in a leg-hold trap. The trap is staked to a log, against which rests a sign bearing the MAKE AMERICA GREAT AGAIN wordmark. In the background is a halftoned image of a waving US flag amidst the clouds.

"That Makes Me Smart" (permalink)

The Biden administration disappointed, frustrated and enraged in so many ways, including abetting a genocide – but one consistent bright spot over the past four years was the unseen-for-generations frontal assault on corporate power and corporate corruption.

The three words that define this battle above all others are "unfair and deceptive" – words that appear in Section 5 of the Federal Trade Commission Act and other legislation modeled on it, like 49 USC Section 41712(a), which gives the Department of Transportation the power to ban "unfair and deceptive" practices as well:

https://pluralistic.net/2023/01/10/the-courage-to-govern/#whos-in-charge

When Congress created an agency to punish "unfair and deceptive" conduct, they were saying to the American people, "You have a right not to be cheated." While this may sound obvious, it's hardly how the world works.

To get a sense of how many ripoffs are part of our daily lives, let's take a little tour of the ways that the FTC and other agencies have used the "unfair and deceptive" standard to defend you over the past four years.

Take Amazon Prime: Amazon executives emailed one another, openly admitting that in their user tests, the public was consistently fooled by Amazon's "get free shipping with Prime" dialog boxes, thinking they were signing up for free shipping and not understanding that they were actually signing up to send the company $140/year. They had tested other versions of the signup workflow that users were able to correctly interpret, but they decided to go with the confusing version because it made them more money:

https://arstechnica.com/tech-policy/2024/05/amazon-execs-may-be-personally-liable-for-tricking-users-into-prime-sign-ups/

Getting you signed up for Prime isn't just a matter of taking $140 out of your pocket once – because while Amazon has produced a greased slide that whisks you into a recurring Prime subscription, the process for canceling that recurring payment is more like a greased pole you must climb to escape the Prime pit. This is typical of many services, where signing up happens in a couple clicks, but canceling is a Kafkaesque nightmare. The FTC decided that this was an "unfair and deceptive" business practice and used its authority to create a "Click to Cancel" rule that says businesses have to make it as easy to cancel a recurring payment as it was to sign up for it:

https://www.theregister.com/2023/07/12/ftc_cancel_subscriptions/

Once businesses have you locked in, they also spy on you, ingesting masses of commercial surveillance data that you "consented" to by buying a car, or clicking through to a website, or installing an app, or just physically existing in space. They use this to implement "surveillance pricing," raising prices based on their estimation of your desperation. Uber got caught doing this a decade ago, raising the price of taxi rides for users whose batteries were about to die, but these days, everyone's in on the game. For example, McDonald's has invested in a company that spies on your finances to determine when your payday is, and then raises the price of your usual breakfast sandwich by a dollar the day you get paid:

https://pluralistic.net/2024/06/05/your-price-named/#privacy-first-again

Everything about this is "unfair and deceptive" – from switching prices the second you click into the store, to the sham of consent that consists of, say, picking up your tickets to a show and being ordered to download an app that comes with 20,000 words of terms and conditions that allow the company that sends you a QR code to spy on you for the rest of your life in any way it can and sell the data to anyone who'll buy it.

As bad as it is to be trapped in an abusive relationship as a shopper, it's a million times worse to be trapped as a worker. One in 18 American workers is under a noncompete "agreement" that makes it illegal for them to change jobs and work for someone else in the same industry. The vast majority of these workers are in low-waged food-service jobs. The primary use of the American noncompete is to stop the cashier at Wendy's from getting an extra $0.25/hour by taking a job at McDonald's.

Noncompetes are shrouded in a fog of easily dispelled bossly bullshit: claims that noncompetes raise wages (empirically, this is untrue), or that they enable "IP"-intensive industries to grow by protecting their trade secrets. This claim is such bullshit: you can tell by the fact that noncompetes are banned under California law, and yet the most IP-intensive industries have attracted hundreds of billions – if not trillions – in investment capital even though none of their workers can be bound by a noncompete. The FTC's rule banning noncompetes for every worker in America simply brings the labor regime that created Silicon Valley and Hollywood to the rest of the country:

https://pluralistic.net/2023/10/26/hit-with-a-brick/#graceful-failure

Noncompetes aren't the only "unfair and deceptive" practice used against American workers. The past decade has seen the rise of private equity consolidation in several low-waged industries, like pet grooming. The new owners of every pet grooming salon within 20 miles of your house haven't just slashed workers' wages, they've also cooked up a scheme that lets them charge workers thousands of dollars if they quit these shitty jobs. This scheme is called a "training repayment agreement provision" (TRAP!): workers who are TRAPped at PetSmart are made to work doing menial jobs like sweeping up the floor for three to four weeks. PetSmart calls this "training," and values it at $5,500. If you quit your pet grooming job in the next two years, you legally owe PetSmart $5,500 to "repay" them for the training:

https://pluralistic.net/2022/08/04/its-a-trap/#a-little-on-the-nose

Workers are also subjected to "unfair and deceptive" bossware: "AI" tools sold to bosses that claim they can sort good workers from bad, but actually serve as random-number generators that penalize workers in arbitrary, life-destroying ways:

https://pluralistic.net/2024/11/26/hawtch-hawtch/#you-treasure-what-you-measure

Some of the most "unfair and deceptive" conduct we endure happens in shadowy corners of industry, where obscure middlemen help consolidated industries raise prices and pick your pocket. All the meat you buy in the grocery store comes from a cartel of processing and packing companies that all subscribe to the same "price consulting" services that tell them how to coordinate across-the-board price rises (tell me again how greedflation isn't a thing?):

https://pluralistic.net/2023/10/04/dont-let-your-meat-loaf/#meaty-beaty-big-and-bouncy

It's not just food, it's all of Maslow's Hierarchy of Needs. Take shelter: the highly consolidated landlord industry uses apps like RealPage to coordinate rental price hikes, turning the housing crisis into a housing emergency:

https://pluralistic.net/2024/07/24/gouging-the-all-seeing-eye/#i-spy

And of course, health is the most "unfair and deceptive" industry of all. Useless middlemen like "Pharmacy Benefit Managers" ("a spreadsheet with political power" – Matt Stoller) coordinate massive price-hikes in the drugs you need to stay alive, which is why Americans pay substantially more for medicine than anyone else in the world, even as the US government spends more than any other government to fund pharma research, using public money:

https://pluralistic.net/2024/09/23/shield-of-boringness/#some-men-rob-you-with-a-fountain-pen

It's not just drugs: every piece of equipment – think hospital beds and nuclear medicine machines – as well as all the consumables – from bandages to saline – at your local hospital runs through a cartel of "Group Purchasing Organizations" that do for hospital equipment what PBMs do for medicine:

https://pluralistic.net/2021/09/27/lethal-dysfunction/#luxury-bones

For the past four years, we've lived in an America where a substantial portion of the administrative state went to war every day to stamp out unfair and deceptive practices. It's still happening: yesterday, the CFPB (which Musk has vowed to shut down) proposed a new rule that would ban the entire data brokerage industry, which nonconsensually harvests information about every American, packages it up into categories like "teenagers from red states seeking abortions" and "military service personnel with gambling habits" and "seniors with dementia," and sells this to marketers, stalkers, foreign governments and anyone else with a credit card:

https://www.consumerfinance.gov/about-us/newsroom/cfpb-proposes-rule-to-stop-data-brokers-from-selling-sensitive-personal-data-to-scammers-stalkers-and-spies/

And on the same day, the FTC banned the location brokers who spy on your every movement and sell your past and present location, again, to marketers, stalkers, foreign governments and anyone with a credit card:

https://www.404media.co/ftc-bans-location-data-company-that-powers-the-surveillance-ecosystem/

These are tantalizing previews of a better life for every American, one in which the rule is, "play fair." That's not the world that Trump and his allies want to build. Their motto isn't "cheaters never prosper" – it's "caveat emptor," let the buyer beware. Remember the 2016 debate where Clinton accused Trump of cheating on his taxes and he admitted to it, saying "That makes me smart"? Trumpism is the movement of "that makes me smart": if you get scammed, that's your own damned fault. Sorry, loser, you lost.

Nowhere do you see this more than in cryptocurrencyland, so it's not a coincidence that tens – perhaps hundreds – of millions in dark crypto money were flushed into the election, first to overpower Democratic primaries and kick out Dem legislators who'd used their power to fight the "unfair and deceptive" crowd:

https://www.politico.com/newsletters/california-playbook-pm/2024/02/13/crypto-comes-for-katie-porter-00141261

And then to fight Dems across the board (even the Dems whose primary victories were funded by dark crypto money) and elect the GOP as the party of "caveat emptor"/"that makes me smart":

https://www.coindesk.com/news-analysis/2024/12/02/crypto-cash-fueled-53-members-of-the-next-u-s-congress

Crypto epitomizes the caveat emptor economy. By design, fraudulent crypto transactions can't be reversed. If you get suckered, that's canonically a you problem. And boy oh boy, do crypto users get suckered (including and especially those who buy Trump's shitcoins):

https://www.web3isgoinggreat.com/

And for crypto users who get ripped off because they've parked their "money" in an online wallet, there's no sympathy, just "not your keys, not your coins":

https://www.ledger.com/academy/not-your-keys-not-your-coins-why-it-matters

A cornerstone of the "unfair and deceptive" world is that only suckers – that is, outsiders, marks and little people – have to endure consequences when they get rooked. When insiders get ripped off, all principle is jettisoned. So it's not surprising that when crypto insiders got taken for millions the first time they created a DAO, they tore up all the rules of the crypto world and gave themselves the mulligan that none of the rest of us are entitled to in cryptoland:

https://blog.ethereum.org/2016/07/20/hard-fork-completed

Where you find crypto, you find Elon Musk, the guy who epitomizes caveat emptor thinking. This is a guy who has lied to drivers to get them to buy Teslas by promising "full self-driving in one year," every year, since 2015:

https://www.consumerreports.org/cars/autonomous-driving/timeline-of-tesla-self-driving-aspirations-a9686689375/

Musk told investors that he had a "prototype" autonomous robot that could replace their workers, then demoed a guy in a robot suit, pretending to be a robot:

https://gizmodo.com/elon-musk-unveils-his-funniest-vaporware-yet-1847523016

Then Musk did it again, two years later, demoing a remote-controlled robot while lying and claiming that it was autonomous:

https://techcrunch.com/2024/10/14/tesla-optimus-bots-were-controlled-by-humans-during-the-we-robot-event

This is entirely typical of the AI sector, in which "AIs" are revealed, over and over, to be low-waged workers pretending to be robots, so much so that Indian tech industry insiders joke that "AI" stands for "Absent Indians":

https://pluralistic.net/2024/01/29/pay-no-attention/#to-the-little-man-behind-the-curtain

Musk's view is that he's not a liar, merely a teller of premature truths. Autonomous cars and robots are just around the corner (just like the chatbots that can do your job, and not merely convince your boss to fire you while failing to do your job). He's not tricking you, he's just faking it until he makes it. It's not a scam, it's inspirational. Of course, if he's wrong and you are scammed, well, that's a you problem. Caveat emptor. That makes him smart.

Musk does this all the time. Take the Twitter blue tick, originally conceived of as a way to keep Twitter users from being scammed ("unfair and deceptive") by con artists pretending to be famous people. Musk's inaugural act at Twitter was to take away blue ticks from verified users and sell them to anyone who'd pay $8/month. Almost no one coughed up for this – the main exception being scammers, who used their purchased, unverified blue ticks to steal from Twitter users ("that makes me smart").

As Twitter hemorrhaged advertising revenue and Musk became increasingly desperate to materialize an army of $8/month paid subscribers, he pulled another scam: he nonconsensually applied blue ticks to prominent accounts, in a bid to trick normies into thinking that widely read people valued blue ticks so much they were paying for them out of their own pockets:

https://www.bbc.com/news/technology-65365366

If you were tricked into buying a blue tick on this pretense, well, caveat emptor. Besides, it's not a lie, it's a premature truth. Someday all those widely read users with nonconsensual blue ticks will surely value them so highly that they do start to pay for them. And if they don't? Well, Musk got your $8: "that makes me smart."

Scammers will always tell you that they're not lying to you, merely telling premature truths. Sam Bankman-Fried's defenders will tell you that he didn't actually steal all those billions. He gambled them on a bet that (sorta-kinda) paid off. Eventually, he was able to make all his victims (sorta-kinda) whole, so it's not even a theft:

https://www.cnn.com/2024/05/08/business/ftx-bankruptcy-plan-repay-creditors/index.html

Likewise, consider Tether, a "stablecoin" that was unable to pass an audit for many years as it issued unbacked, unregulated securities while lying and saying that for every dollar it minted, it had a dollar in reserves. Tether now (maybe) has reserves equal to its outstanding coins, so obviously all those years when it made false claims, it wasn't lying, merely telling a premature truth:

https://creators.spotify.com/pod/show/cryptocriticscorner/episodes/Tether-wins--Skeptics-lose-the-end-of-an-era-e2rhf5e

If Tether had failed a margin call during those years and you'd lost everything, well, caveat emptor. The Tether insiders were always insulated from that risk, and that's all that matters: "that makes me smart."

When I think about the next four years, this is how I frame it: the victory of "that makes me smart" over "fairness and truth."

For years, progressives have pointed out the right's hypocrisy, despite the fact that Americans have been conditioned to be so cynical that even the rankest hypocrisy doesn't register. But "caveat emptor"? That isn't just someone else's bad belief or low ethics: it's the way that your life is materially, significantly worsened. The Biden administration – divided between corporate Dems and the Warren/Sanders wing that went to war on "unfair and deceptive" – was ashamed of, and nearly silent about, its groundbreaking work fighting for fairness and honesty. That was a titanic mistake.

Americans may not care about hypocrisy, but they really care about being stolen from. No one wants to be a sucker.


Hey look at this (permalink)



A Wayback Machine banner.

This day in history (permalink)

#15yrsago Measuring the smell of old books to find candidates for preservation https://www.sciencedaily.com/releases/2009/12/091202122041.htm

#15yrsago Are terms-of-service enforceable? https://www.eff.org/wp/clicks-bind-ways-users-agree-online-terms-service

#15yrsago Just look at this awesome banana bunker. https://memex.craphound.com/2009/12/04/just-look-at-this-awesome-banana-bunker/

#15yrsago Woman jailed, charged with felony camcordering after recording 4 mins of sister’s birthday party in a movie theater https://web.archive.org/web/20091206024138/https://consumerist.com/2009/12/charged-with-felony-after-taping-4-minutes-of-new-moon.html

#15yrsago Jonathan Lethem’s CHRONIC CITY, surreal and beautiful sf explores the authentic and the unreal https://memex.craphound.com/2009/12/04/jonathan-lethems-chronic-city-surreal-and-beautiful-sf-explores-the-authentic-and-the-unreal/

#10yrsago NSA leak reveals plans to subvert mobile network security around the world https://theintercept.com/2014/12/04/nsa-auroragold-hack-cellphones/

#10yrsago Brian Krebs’s “Spam Nation” https://memex.craphound.com/2014/12/04/brian-krebss-spam-nation/

#10yrsago Why journalists should be free speech partisans https://medium.com/backchannel/when-journalists-must-not-be-objective-fad5aadd8cb3

#10yrsago Fantastically detailed Boeing 777 model made from manila folders https://www.flickr.com/photos/lucaiaconistewart/sets/72157632208677161

#5yrsago FCC Chairman Pai’s former employer, Verizon, lied about coverage, and then Pai tried to bury the news https://arstechnica.com/tech-policy/2019/12/fcc-tries-to-bury-finding-that-verizon-and-t-mobile-exaggerated-4g-coverage/

#5yrsago The south’s latest culinary trend: inadequate, rotting prison food, supplemented by cattle feed https://www.southernfoodways.org/gravy/are-prison-diets-punitive-a-report-from-behind-bars/

#5yrsago Browser plugins from Avast and AVG yanked for stealing user data https://www.zdnet.com/article/mozilla-removes-avast-and-avg-extensions-from-add-on-portal-over-snooping-claims/

#5yrsago Second wave Algorithmic Accountability: from “What should algorithms do?” to “Should we use an algorithm?” https://lpeproject.org/blog/the-second-wave-of-algorithmic-accountability/

#5yrsago How Ken Liu went from engineer to lawyer to SF writer to the foremost translator of Chinese sf into English https://www.nytimes.com/2019/12/03/magazine/ken-liu-three-body-problem-chinese-science-fiction.html

#5yrsago The bizarre story of China’s most prolific bank-robbers, who stole literal tons of cash and spent it on losing lotto tickets https://marker.medium.com/jackpot-694063c4d867

#5yrsago Opendemocracy: the Libdems tried to censor our article about their sale of voter data, then used a forged email to intimidate us https://www.opendemocracy.net/en/opendemocracyuk/what-are-jo-swinsons-liberal-democrats-so-desperate-to-hide/

#1yrago Francis Spufford's "Cahokia Jazz" https://pluralistic.net/2023/12/04/cahokia/#the-sun-and-the-moon


Upcoming appearances (permalink)

A photo of me onstage, giving a speech, holding a mic.



A screenshot of me at my desk, doing a livecast.

Recent appearances (permalink)



A grid of my books with Will Stahle covers..

Latest books (permalink)



A cardboard book box with the Macmillan logo.

Upcoming books (permalink)

  • Picks and Shovels: a sequel to "Red Team Blues," about the heroic era of the PC, Tor Books, February 2025

  • Unauthorized Bread: a middle-grades graphic novel adapted from my novella about refugees, toasters and DRM, FirstSecond, 2025



Colophon (permalink)

Today's top sources:

Currently writing:

  • Enshittification: a nonfiction book about platform decay for Farrar, Straus, Giroux. Status: first pass edit underway (TKs and FCKs)

  • A Little Brother short story about DIY insulin PLANNING

  • Picks and Shovels, a Martin Hench noir thriller about the heroic era of the PC. FORTHCOMING TOR BOOKS FEB 2025

Latest podcast: Spill, part five (a Little Brother story) https://craphound.com/littlebrother/2024/12/01/spill-part-five-a-little-brother-story/


This work – excluding any serialized fiction – is licensed under a Creative Commons Attribution 4.0 license. That means you can use it any way you like, including commercially, provided that you attribute it to me, Cory Doctorow, and include a link to pluralistic.net.

https://creativecommons.org/licenses/by/4.0/

Quotations and images are not included in this license; they are included either under a limitation or exception to copyright, or on the basis of a separate license. Please exercise caution.


How to get Pluralistic:

Blog (no ads, tracking, or data-collection):

Pluralistic.net

Newsletter (no ads, tracking, or data-collection):

https://pluralistic.net/plura-list

Mastodon (no ads, tracking, or data-collection):

https://mamot.fr/@pluralistic

Medium (no ads, paywalled):

https://doctorow.medium.com/

Twitter (mass-scale, unrestricted, third-party surveillance and advertising):

https://twitter.com/doctorow

Tumblr (mass-scale, unrestricted, third-party surveillance and advertising):

https://mostlysignssomeportents.tumblr.com/tagged/pluralistic

"When life gives you SARS, you make sarsaparilla" -Joey "Accordion Guy" DeVilla

Tue, 03 Dec 2024 14:23:42 +0000
Pluralistic: Canada sues Google (03 Dec 2024)


Today's links



A gagged, bound figure plummeting off the CN Tower, which leans precipitously. On the left, a grimacing Statue of Liberty with Trump's haircut leans into the frame. The background is a hellscape from Bosch's 'Garden of Earthly Delights.'

Canada sues Google (permalink)

For a country obsessed with defining itself as "not America," Canada sure likes to copy US policies, especially the really, really terrible policies – especially the really, really, really terrible digital policies.

In Canada's defense: these terrible US policies are high priority for the US Trade Representative, who leans on Canadian lawmakers to ensure that any time America decides to collectively jump off the Empire State Building, Canadian politicians throw us all off the CN Tower. And to Canada's enduring shame, the USTR never has to look very hard to find a lickspittle who's happy to sell Canadians out.

Take anti-circumvention. In 1998, Bill Clinton signed the Digital Millennium Copyright Act, a gnarly hairball of copyright law whose Section 1201 bans reverse-engineering for any purpose. Under DMCA 1201, "access controls" for copyrighted works are elevated to sacred status, and it's a felony (punishable by a five-year prison sentence and a $500k fine) to help someone bypass these access controls.

That's pretty esoteric, even today, and in 1998, it was nearly incomprehensible, except to a small group of extremely alarmed experts who ran around trying to explain to lawmakers why they should not vote for this thing. But by the time Tony Clement and James Moore (Conservative ministers in the Harper regime) introduced a law to import America's stupidest tech idea and paste it into Canada's lawbooks in 2012, the evidence against anti-circumvention was plain for anyone to see.

Under America's anti-circumvention law, any company that added an "access control" to its products instantly felonised any modification to that product. For example, it's not illegal to refill an ink cartridge, but it is illegal to bypass the access control that gets the cartridge to recognise that it's full and start working again. It's not illegal for a Canadian software developer to sell a Canadian Iphone owner an app without cutting Apple in for 30% of the sale, but it is illegal to mod that Iphone so that it can run apps without downloading them from the App Store first. It's not illegal for a Canadian mechanic to fix a Canadian's car, but it is illegal for that mechanic to bypass the access controls that prevent third-party mechanics from decrypting the error codes the car generates.

We told Clement and Moore about this, and they ignored us. Literally: when they consulted on their proposal in 2010, we filed 6,138 comments explaining why this was a bad idea, while only 53 parties wrote in to support it. Moore publicly announced that he was discarding the objections, on the grounds that they had come from "babyish" "radical extremists":

https://www.cbc.ca/news/science/copyright-debate-turns-ugly-1.898216

For more than a decade, we've had Clement and Moore's Made-in-America law tied to our ankles. Even when Canada copies some good ideas from the US (by passing a Right to Repair law), or even some very good ideas of its own (passing an interoperability law), Canadians can't use those new rights without risking prosecution under Clement and Moore's poisoned gift to the nation:

https://pluralistic.net/2024/11/15/radical-extremists/#sex-pest

"Not America" is a pretty thin basis for a political identity anyway. There's nothing wrong with copying America's good ideas (like Right to Repair). Indeed, when it comes to tech regulation, the US has had some bangers lately, like prosecuting US tech giants for violating competition law. Given that Canada overhauled its competition law this year, the country's well-poised to tackle America's tech giants.

Which is exactly what's happening! Canada's Competition Bureau just filed a lawsuit against Google over its ad-tech monopoly, which isn't merely a big old Privacy Chernobyl, but is also a massively fraudulent enterprise that rips off both advertisers and publishers:

https://www.reuters.com/technology/canadas-antitrust-watchdog-sues-google-alleging-anti-competitive-conduct-2024-11-28/

The ad-tech industry scoops up about 51 cents out of every dollar (in the pre-digital advertising world, the net take by ad agencies was more like 15%). Fucking up Google's ad-tech ripoff is a much better way to get Canada's press paid than the link tax the country instituted in 2023:

https://www.eff.org/deeplinks/2023/05/save-news-we-must-ban-surveillance-advertising

After all, what tech steals from the news isn't content (helping people find the news and giving them a forum to discuss it is good) – tech steals news's money. Ad-tech is a giant ripoff. So is the app tax – the 30% Canadian newspapers have to kick up to the Google and Apple crime families every time a subscriber renews their subscription in an app. Using Canadian law to force tech to stop stealing the press's money is a far better policy than forcing tech to profit-share with the news. For tech to profit-share with the news, it has to be profitable, meaning that a profit-sharing press benefits from tech's most rapacious and extractive conduct, and rather than serving as a watchdog, it's at risk of becoming a cheerleader.

Smashing tech power is a better policy than forcing tech to share its stolen loot with newspapers. For one thing, it gets government out of the business of deciding what is and isn't a legit news entity. Maybe you're OK with Trudeau making that call (though I'm not), but how will you feel when PM Poilievre decides that Great Replacement-pushing, conspiracy-addled far right rags should receive a subsidy?

Taking on Google is a slam-dunk, not least because the US DoJ just got through prosecuting the exact same case, meaning that Canadian competition enforcers can do some good copying of their American counterparts – like copying the exhibits, confidential memos, and successful arguments the DoJ brought before the court:

https://www.justice.gov/opa/pr/justice-department-sues-google-monopolizing-digital-advertising-technologies

Indeed, this is already a winning formula! Because Big Tech commits the same crimes in every jurisdiction, trustbusters are doing a brisk business by copying each others' cases. The UK Digital Markets Unit released a big, deep market study into Apple's app market monopoly, which the EU Commission used as a roadmap to bring a successful case. Then, competition enforcers in Japan and South Korea recycled the exhibits and arguments from the EU's case to bring their own successful prosecutions:

https://pluralistic.net/2024/04/10/an-injury-to-one/#is-an-injury-to-all

Canada copying the DoJ's ad-tech case is a genius move – it's the kind of south-of-the-border import that Canadians need. Though, of course, it's a long shot that the Trump regime will produce much more worth copying. Instead, Trump has vowed to slap a 25% tariff on Canadian goods as of January 20.

Which is bad news for Canada's export sector, but it definitely means that Canada no longer has to worry about keeping the US Trade Rep happy. Repealing Clement and Moore's Bill C-11 should be Parliament's first order of business. Tariff or no tariff, Canadian tech entrepreneurs could easily export software-based repair diagnostic tools, Iphone jailbreaking tools, alternative firmware for tractors and medical implants, and alternative app stores for games consoles, phones and tablets. So long as they can accept a US payment, they can sell to US customers. This is a much bigger opportunity than, say, selling cheap medicine to Americans trying to escape Big Pharma's predation.

What's more, there's no reason this couldn't be policy under Poilievre and the Tories. After all, they're supposed to be the party of "respect for private property." What could be more respectful of private property than letting the owners of computers, phones, cars, tractors, printers, medical implants, smart speakers and anything else with a microchip decide for themselves how they want it to work? What could be more respectful of copyright than arranging things so that Canadian copyright holders – like a games studio or an app company – can sell their copyrighted works to Canadian buyers, without forcing the data and the payment to make a round trip through Silicon Valley and come back 30% lighter?

Canadian politicians have bound the Canadian public and Canadian industry to onerous and expensive obligations under treaties like the USMCA (AKA NAFTA2), on the promise of tariff-free access to American markets. With that access gone, why on Earth would we continue to voluntarily hobble ourselves?


Hey look at this (permalink)



A Wayback Machine banner.

This day in history (permalink)

#10yrsago UK police arrest man who built anti-immigrant nail-bomb, decline to press terrorism charges https://www.theguardian.com/uk-news/2014/nov/28/soldier-jailed-nailbomb-ryan-mcgee-manchester-bomb

#10yrsago Library’s seed sharing system threatened by Big Ag regulations https://www.mprnews.org/story/2014/11/30/duluth-librarys-seed-sharing-program-hits-a-hurdle

#10yrsago Haunted Mansion leg sleeve tattoo https://www.reddit.com/r/tattoos/comments/2o20s0/haunted_mansion_leg_sleeve_in_progress_by_darin/

#10yrsago Interview with fantasy writer Tim Powers about being a “secret historian” https://web.archive.org/web/20150103220737/http://likeiwassayingblog.com/2014/12/02/tim-powers-interview-with-a-secret-historian/

#5yrsago McKinsey designed ICE’s gulags, recommending minimal food, medical care and supervision https://www.propublica.org/article/how-mckinsey-helped-the-trump-administration-implement-its-immigration-policies#172207

#5yrsago Frustrated game devs automated the production of 1,500 terrible slot machine apps and actually made money https://www.gdcvault.com/play/1025766/1-500-Slot-Machines-Walk

#5yrsago The Supreme Court just heard the State of Georgia’s argument for copyrighting the law and charging for access to it https://arstechnica.com/tech-policy/2019/12/justices-debate-allowing-state-law-to-be-hidden-behind-a-pay-wall/

#5yrsago UK Apostrophe Protection Society surrender’s, saying “ignorance and lazines’s have won” https://www.standard.co.uk/news/uk/apostrophe-society-shuts-down-because-ignorance-has-won-a4301391.html

#5yrsago MMT: when does government deficit spending improve debt-to-GDP ratios? https://carnegieendowment.org/china-financial-markets/2019/10/mmt-heaven-and-mmt-hell-for-chinese-investment-and-us-fiscal-spending

#5yrsago Using the Challenger Disaster to illustrate the 8 symptoms of groupthink https://web.archive.org/web/20150326031934/https://courses.washington.edu/psii101/Powerpoints/Symptoms%20of%20Groupthink.htm

#5yrsago A sweeping new tech bill from Silicon Valley Democrats promises privacy, interoperability, and protection from algorithmic discrimination and manipulation https://web.archive.org/web/20191105215639/https://eshoo.house.gov/news-stories/press-releases/eshoo-lofgren-introduce-the-online-privacy-act/

#5yrsago Harry Shearer interviews Uber’s smartest critic: Hubert “Bezzle” Horan https://harryshearer.com/le-shows/december-01-2019/#t=10:10

#5yrsago Reading the “victory letter” a white nationalist sent to his followers after getting $2.5m from UNC, it’s obvious why he tried to censor it https://twitter.com/greg_doucette/status/1201547992748216322

#5yrsago “Harbinger households”: neighborhoods that consistently buy products that get discontinued, buy real-estate that underperforms, and donate to losing political candidates https://journals.sagepub.com/doi/abs/10.1177/0022243719867935

#5yrsago White nationalists who got a $2.5m payout from UNC abuse the DMCA to censor lawyer’s trove of documents about it https://twitter.com/greg_doucette/status/1201635924158881792



Colophon (permalink)

Today's top sources: Slashdot (https://slashdot.org).


2024-12-02T22:04:35+00:00
Read "Why So Few Matt Levines?"
Read:
Why are popularizing, newsletter-frequency writers for important fields (like Matt Levine for finance) so rare? Because most fields are too slow or ambiguous, and writers with the right combination of expertise, obsession, and persistence are also rare.
Illustration of Molly White sitting and typing on a laptop, on a purple background with 'Molly White' in white serif.
2024-12-02T18:24:56+00:00
Note published on December 2, 2024 at 6:24 PM UTC

for those of you who've seen the supposed Enron relaunch and crypto project: their website's terms of use say it's parody.

tl;dr apparently someone thinks it's a good and funny idea to make a big joke out of a company that ruined people, and this is about as much oxygen as i'm inclined to give it

Enron logo, above a screenshot of the site's terms: "The information on the website is first amendment protected parody, represents performance art, and is for entertainment purposes only"
Mon, 02 Dec 2024 13:01:21 +0000
Pluralistic: All the books I reviewed in 2024 (02 Dec 2024)


Today's links



Samuel Hollyer's 1875 engraving of Charles Dickens in his study, sitting at a desk, staring out a window, surrounded by bookcases.

All the books I reviewed in 2024 (permalink)

I reviewed 26 books this year: 15 novels, 5 nonfiction books, and 6 graphic novels. Even though I feel perennially behind on my reading (and objectively, I do have 10 linear feet of "to be read" books on the shelf), I think this is a pretty good haul.

Books are pretty much the ideal gift, if you ask me. Of course, I'm biased as a former bookseller and library worker, and as an author (of course) – I had three more books come out in 2024 (see the end of this post for details).

I started a lot more than 26 books this year. Long ago, I figured life was too short for books I wasn't enjoying, and I'm pretty ruthless about putting books down partway through if I think they're not going to reward finishing them. I probably start 10 books for every one I finish. However, I do review more than 90% of the books I get through. It's rare for me to keep reading a book all the way to the end if I'm not enjoying it enough to unconditionally recommend it. I rarely review books I don't like – there's not really any point in cataloging the list of books I think you won't enjoy reading, and most books I don't like very much are broken in ways that are too banal to comment upon.

The list below is pretty great, but if you're looking for more, here's the haul from 2023:

https://pluralistic.net/2023/12/01/bookmaker/#2023-in-review

NOVELS

I. Cahokia Jazz by Francis Spufford
The cover for Cahokia Jazz.

A fucking banger: it's a taut, unguessable whodunnit, painted in ultrablack noir, set in an alternate Jazz Age in a world where indigenous people never ceded most of the west to the USA. It's got gorgeously described jazz music, a richly realized modern indigenous society, and a spectacular romance. It's amazing.

https://pluralistic.net/2023/12/04/cahokia/#the-sun-and-the-moon


II. After World by Debbie Urbanski
The cover for After World.

An unflinching and relentlessly bleak tale of humanity's mass extinction, shot through with pathos and veined with seams of tragic tenderness and care. Sen Anon – the story's semi-protagonist – is 18 years old when the world learns that every person alive has been sterilized and so the human race is living out its last years.

The news triggers a manic insistence that this is a good thing – long overdue, in fact – and the perfect opportunity to scan every person alive for eventual reincarnation as virtual humans in an Edenic cloud metaverse called Gaia. That way, people can continue to live their lives without the haunting knowledge that everything they do makes the planet worse for every other living thing, and each other. Here, finally, is the resolution to the paradox of humanity: our desire to do good, and our inevitable failure on that score.

https://pluralistic.net/2023/12/18/storyworker-ad39-393a-7fbc/#digital-human-archive-project


III. Jonathan Abernathy You Are Kind by Molly McGhee
The cover for Jonathan Abernathy You Are Kind.

A dreamlike tale of a public-private partnership that hires the terminally indebted to invade the dreams of white-collar professionals and harvest the anxieties that prevent them from being fully productive members of the American corporate workforce.

We meet Jonathan as he is applying for a job that he was recruited for in a dream. As instructed in his dream, he presents himself at a shabby strip-mall office where an acerbic functionary behind scratched plexiglass takes his application and informs him that he is up for a gig run jointly by the US State Department and a consortium of large corporate employers. If he is accepted, all of his student debt repayments will be paused and he will no longer face wage garnishment. What's more, he'll be doing the job in his sleep, which means he'll be able to get a day job and pull a double income – what's not to like?

https://pluralistic.net/2024/01/08/capitalist-surrealism/#productivity-hacks


IV. The Book of Love by Kelly Link
The cover for The Book of Love

If you've read Link's short stories (which honestly, you must read), you know her signature move: a bone-dry witty delivery, used to spin tales of deceptive whimsy and quirkiness, disarming you with daffiness while she sets the hook and yanks. That's the unmistakeable, inimitable texture of a Kelly Link story: deft literary brushstrokes, painting a picture so charming and silly that you don't even notice when she cuts you without mercy.

Turns out that she can quite handily do this for hundreds of pages, and the effect only gets better when it's given space to unfold.

It's a long and twisting mystery about friendship, love, queerness, rock-and-roll, stardom, parenthood, loyalty, lust and duty.

https://pluralistic.net/2024/02/13/the-kissing-song/#wrack-and-roll


V. Lyorn by Steven Brust
The cover for Lyorn

The seventeenth book in Steven Brust's long-running Vlad Taltos series. For complicated reasons, Vlad has to hide out in a theater. Why a theater? Theaters are shielded from sorcery, as proof against magical spying by rival theater companies, and Vlad is on the run from the Left Hand of the Jhereg – the crime syndicate's all-woman sorceress squad.

The theater is mounting a production of a famous play that's about another famous play. The first famous play (the one the play is about – try and follow along, would you?) is about a famous massacre that took place thousands of years before. The play was mounted as a means of drumming up support for the whistleblower who reported on the massacre and was invited to a short-term berth in the Emperor's death row as a consequence.

The plot is a fantastic, fast-handed caper story that has a million moving parts, a beautiful prestige, and a coup de grace that'll have you cheering and punching the air.

https://pluralistic.net/2024/04/09/so-meta/#delightful-doggerel


VI. Till Human Voices Wake Us by Rebecca Roque
The cover for Till Human Voices Wake Us

A teen murder mystery told in the most technorealist way. Cia's best friend Alice has been trying to find her missing boyfriend for months, and in her investigation, she's discovered their small town's dark secret – a string of disappearances, deaths and fires that are the hidden backdrop to the town's out-of-control addiction problem.

Alice has something to tell Cia, something about the fire that orphaned her and cost her one leg when she was only five years old, but Cia refuses to hear it. Instead, they have a blazing fight, and part ways. It's the last time Cia and Alice ever see each other: that night, Alice kills herself.

Or does she? Cia is convinced that Alice has been murdered, and that her murder is connected to the drug- and death-epidemic that's ravaging their town. As Cia and her friends seek to discover the town's secret – and the identity of Alice's killer – we're dragged into an intense, gripping murder mystery/conspiracy story that is full of surprises and reversals, each more fiendishly clever than the last.

https://pluralistic.net/2024/04/16/dead-air/#technorealism


VII. The Steerswoman by Rosemary Kirstein
The cover for The Steerswoman

Randall "XKCD" Munroe pitched me on this over dinner: "All these different people kept recommending them to me, and they kept telling me that I would love them, but they wouldn't tell me what they were about because there's this huge riddle in them that's super fun to figure out for yourself. "The books were published in the eighties by Del Rey, and the cover of the first one had a huge spoiler on it. But the author got the rights back and she's self-published it."

How could I resist a pitch like that? So I ordered a copy. Holy moly is this a good novel! And yeah, there's a super interesting puzzle in it that I won't even hint at, except to say that even the book's genre is a riddle that you'll have enormous great fun solving.

https://pluralistic.net/2024/05/04/the-wulf/#underground-fave


VIII. Moonbound by Robin Sloan
The cover for Moonbound

Moonbound's protagonist is a "chronicler," a symbiotic fungus engineered to nestle in a human's nervous system, where it serves as a kind of recording angel, storing up the memories, experiences and personalities of its host. When we meet the chronicler, it has just made a successful leap from its old host – a 10,000-years-dead warrior who had been preserved in an anaerobic crashpod ever since her ship was shot out of the sky – into the body of Ariel, a 12-year-old boy who had just invaded the long-lost tomb.

This is doing fiction in hard mode, and Sloan nails it. The unraveling strangeness of Ariel's world is counterpointed with the amazing tale of the world the chronicler hails from, even as the chronicler consults with the preserved personalities of the heroes and warriors it had previously resided in and recorded.

https://pluralistic.net/2024/06/11/penumbraverse/#middle-anth


IX. Fight Me by Austin Grossman
The cover for Fight Me

Aging ex-teen superheroes weigh the legacy of Generation X, in a work that enrobes its savage critique with sweet melancholia, all under a coating of delicious snark. The Newcomers – an amped-up ninja warrior, a supergenius whose future self keeps sending him encouragement and technical schematics backwards through time, and an exiled magical princess turned preppie supermodel – have spent more than a decade scattered to the winds. While some have fared better than others, none of them have lived up to their potential or realized the dreams that seemed so inevitable when they were world famous supers with an entourage of fellow powered teens who worshipped them as the planet's greatest heroes.

As they set out to solve the mystery of the wizard who gave the protagonist his powers, they are reunited and must take stock of who they are and how they got there (cue Talking Heads' "Once In a Lifetime").

The publisher's strapline for this book is "The Avengers Meets the Breakfast Club," which is clever, but extremely wrong. The real comp for this book isn't "The Breakfast Club," it's "The Big Chill."

https://pluralistic.net/2024/07/01/the-big-genx-chill/#im-super-thanks-for-asking


X. Glass Houses by Madeline Ashby
The cover for Glass Houses

Kristen is the "Chief Emotional Manager" for Wuv, a hot startup that has defined the new field of "affective computing," which is when a computer tells you what everyone else around you is really feeling, based on the irrepressible tells emitted by their bodies, voices and gadgets.

Managing Sumter, Wuv's founder, through the company's tumultuous launch is hard work for Kristen, but at last, it's paid off. The company has been acquired, making Kristen – and all her coworkers on the founding core team – into instant millionaires. They're flying to a lavish celebration in an autonomous plane that Sumter chartered when the action begins: the plane has a malfunction and crashes on a desert island, killing all but ten of the Wuvvies.

As the survivors explore the island, they discover only one sign of human habitation: a huge, brutalist, featureless black glass house, which initially rebuffs all their efforts to enter it. But once they gain entry, they discover that the house is even harder to leave.

https://pluralistic.net/2024/08/13/influencers/#affective-computing


XI. The Sapling Cage by Margaret Killjoy
The cover for The Sapling Cage

A queer coming-of-age tale in the mode of epic fantasy. Lorel wants to be a witch, but that's the very last of the adventurous trades to be strictly gender-segregated. Boys and girls alike run away to be knights, brigands and sailors, but only girls can become witches. Indeed, Lorel's best friend, Lane, is promised to the witches, having been born to a witch herself.

Lorel has signed up for witching just as the land is turning against witches, thanks to a political plot by a scheming duchess who has scapegoated the witches as part of a plan to annex all the surrounding duchies, re-establishing the long-disintegrated kingdom with herself on the throne. To make things worse (for the witches, if not the duchess), there's a plague of monsters on the land, and the forests are blighted with a magical curse that turns trees to unmelting ice. This all softens up the peasantfolk for anti-witch pogroms.

So Lorel has to learn witching, even as her coven fights monsters, the duchess's knights, and the vigilante yokels who've been stirred up with anti-witch xenophobia.

https://pluralistic.net/2024/09/24/daughters-of-the-empty-throne/#witchy


XII. Blackheart Man by Nalo Hopkinson
The cover for Blackheart Man

A story that will make you drunk on language, on worldbuilding, and on its roaring, relentless plot. The action is set on Chynchin, a fantastic Caribbean island (or maybe Caribbeanesque – it's never clear whether this is some magical, imaginary world, or some distant future of our own). Chynchin is a multiracial, creole land with a richly realized gift economy that Hopkinson deftly rounds out with a cuisine, languages, and familial arrangements.

Chynchin was founded through a slave rebellion, in which the press-ganged soldiers of the iron-fisted Ymisen empire were defeated by three witches who caused them to be engulfed in tar that they magicked into a liquid state just long enough to entomb them, then magicked back into solidity. For generations, the Ymisen have tolerated Chynchin's self-rule, but as the story opens, a Ymisen armada sails into Chynchin's port and a "trade envoy" announces that it's time for the Chynchin to "voluntarily" re-establish trade with the Ymisen.

The story that unfolds is a staple of sf and fantasy: the scrappy resistance mounted against the evil empire, and this familiar backdrop is a sturdy scaffold to support Hopkinson's dizzying, phantasmagoric tale of psychedelic magic, possessed children, military intrigue, musicianship and sexual entanglements.

https://pluralistic.net/2024/08/20/piche/#cynchin


XIII. Julia by Sandra Newman
The cover for Julia

Julia is the kind of fanfic that I love, in the tradition of both The Wind Done Gone and Rosencrantz and Guildenstern Are Dead, in which a follow-on author takes the original author's throwaway world-building with deadly seriousness, elucidating the weird implications and buried subtexts of all the stuff and people moving around in the wings and background of the original.

For Newman, the starting point here is Julia, an enigmatic lover who comes to Winston with all kinds of rebellious secrets – tradecraft for planning and executing dirty little assignations and acquiring black market goods. Julia embodies a common contradiction in the depiction of young women (she is some twenty years younger than Winston): on the one hand, she is a "native" of the world, while Winston is a late arrival, carrying around all his "oldthink" baggage that leaves him perennially baffled, terrified and angry; on the other hand, she's a naive "girl," who "doesn't much care for reading," and lacks the intellectual curiosity that propels Winston through the text.

This contradiction is the cleavage line that Newman drives her chisel into, fracturing Orwell's world in useful, fascinating, engrossing ways. Through Julia's eyes, we experience Oceania as a paranoid autocracy, corrupt and twitchy. We witness the obvious corollary of a culture of denunciation and arrest: the ruling Party of such an institution must be riddled with internecine struggle and backstabbing, to the point of paralyzed dysfunction. The Orwellian trick of switching from being at war with Eastasia to Eurasia and back again is actually driven by real military setbacks – not just faked battles designed to stir up patriotic fervor. The Party doesn't merely claim to be under assault from internal and external enemies – it actually is.

https://pluralistic.net/2024/09/28/novel-writing-machines/#fanfic


XIV. The Wilding by Ian McDonald
The cover for The Wilding

McDonald's first horror novel, and it's fucking terrifying. It's set in a rural Irish peat bog that has been acquired by a conservation authority that is rewilding it after a century of industrial peat mining that stripped it back nearly to the bedrock. This rewilding process has been greatly accelerated by the covid lockdowns, which reduced the human footprint in the conservation area to nearly zero.

Lisa's last duty before she leaves the bog and goes home to Dublin is leading a school group on a wild campout in one of the bog's deep clearings. It's a routine assignment, and while it's not her favorite duty, it's also not a serious hardship.

But as the group hikes out to the campsite, one of her fellow guides is killed, without warning, by a mysterious beast that moves so quickly they can barely make out its monstrous form. Thus begins a tense, mysterious, spooky as hell story of survival in a haunted woods, written in the kind of poesy that has defined McDonald's career, and which – when deployed in service of terror – has the power to raise literal goosebumps.

https://pluralistic.net/2024/10/25/bogman/#erin-go-aaaaaaargh


XV. Polostan by Neal Stephenson
the cover of Polostan

Not a spy novel, but a science fiction novel about spies in an historical setting. This isn't to say that Stephenson tramples on, or ignores spy tropes: this is absolutely a first-rate spy novel. Nor does Stephenson skimp on the lush, gorgeously realized and painstakingly researched detail you'd want from an historical novel.

Polostan raises the curtain on the story of Dawn Rae Bjornberg, AKA Aurora Maximovna Artemyeva, whose upbringing is split between the American West in the early 20th century and the Leningrad of revolutionary Russia (her parents are an American anarchist and a Ukrainian Communist who meet when her father travels to America as a Communist agitator). Aurora's parents' marriage does not survive their sojourn to the USSR, and eventually Aurora and her father end up back in the States, after her father is tasked with radicalizing the veterans of the Bonus Army that occupied DC, demanding the military benefits they'd been promised.

All of this culminates in her return sojourn to the Soviet Union, where she first falls under suspicion of being an American spy, and is then recruited as a Soviet spy.

Also: she plays a lot of polo. Like, on a horse.

https://pluralistic.net/2024/11/04/bomb-light/#nukular


NONFICTION

I. A City on Mars by Kelly and Zach Weinersmith
The cover for A City on Mars

Biologist Kelly Weinersmith and cartoonist Zach Weinersmith set out to investigate the governance challenges of the impending space settlements they were told were just over the horizon. Instead, they discovered that humans aren't going to be settling space for a very long time, and so they wrote a book about that instead.

The Weinersmiths make the (convincing) case that every aspect of space settlement is vastly beyond our current or reasonably foreseeable technical capability. What's more, every argument in favor of pursuing space settlement is errant nonsense. And finally: all the energy we are putting into space settlement actually holds back real space science, which offers numerous benefits to our species and planet (and is just darned cool).

https://pluralistic.net/2024/01/09/astrobezzle/#send-robots-instead


II. Dark Wire by Joseph Cox
The cover for Dark Wire

Cox spent years on the crimephone beat, tracking vendors who sold modded phones (first Blackberries, then Android phones) to criminal syndicates with the promise that they couldn't be wiretapped by law-enforcement.

He tells the story of the FBI's plan to build an incredibly secure, best-of-breed crimephone, one with every feature that a criminal would want to truly insulate themselves from law enforcement while still offering everything a criminal could need to plan and execute crimes.

This is really two incredible tales. The first is the story of the FBI and its partners as they scaled up Anom, their best-of-breed crimephone business. This is a (nearly) classic startup tale, full of all-nighters, heroic battles against the odds, and the terror and exhilaration of "hockey-stick" growth.

The other one is the crime startup – the one that the hapless criminal syndicates who sign up to distribute Anom devices find themselves in the middle of. They, too, are experiencing hockey-stick growth. They, too, have a fantastically lucrative tiger by the tail. And they, too, have a unique set of challenges that make this startup different from any other.

Cox has been on this story for a decade, and it shows. He has impeccable sourcing and encyclopedic access to the court records and other public details that allow him to reproduce many of the most dramatic scenes in the Anom caper verbatim.

https://pluralistic.net/2024/06/04/anom-nom-nom/#the-call-is-coming-from-inside-the-ndrangheta


III. The Hidden History of Walt Disney World by Foxx Nolte
The cover for The Hidden History of Walt Disney World

No one writes about Disney theme parks like Foxx Nolte: no one else rises above the trivia, goes beyond the mere sleuthing of historical facts, or nails the essence of what makes these parks work – and fail.

The history of Walt Disney World is also a history of the American narrative from the 1960s to the turn of the millennium, especially once Epcot enters the picture and Disney sets out to market itself as a futuristic mirror to America and the world. There's a doomed plan to lead the nation in the provision of an airport for the largely hypothetical short-runway aircraft that never materialized, the Disney company's love-hate affair with Florida's orange growers, and the geopolitics of installing a permanent World's Fair, just as World's Fairs were disappearing from the world stage.

In focusing on the conflicts between different corporate managers, outside suppliers, and the gloriously flamboyant weirdos of Florida, Nolte's history of Disney World transcends amusing anecdotes and tittle-tattle – rather, it illustrates how the creative sparks thrown off by people smashing into each other sometimes created towering blazes of glory that burn to this day.

https://pluralistic.net/2024/07/15/disnefried/#dialectics


IV. Network Nation by Richard R John
The cover of Network Nation

An extremely important, brilliantly researched, deep history of America's love/hate affair with not just the telephone, but also the telegraph. It is unmistakably a history book, one that aims at a definitive takedown of various neat stories about the history of American telecommunications.

The monopolies that emerged in the telegraph and then the telephone weren't down to grand forces that made them inevitable, but rather, to the errors made by regulators and the successful gambits of the telecoms barons. At many junctures, things could have gone another way.

Most striking about this book were the parallels to contemporary fights over Big Tech trustbusting, in our new Gilded Age. Many of the apologies offered for Western Union or AT&T's monopoly could have been uttered by the Renfields who carry water for Facebook, Apple and Google. John's book is a powerful and engrossing reminder that variations on these fights have occurred in the not-so-distant past, and that there's much we can learn from them.

https://pluralistic.net/2024/07/18/the-bell-system/#were-the-phone-company-we-dont-have-to-care


V. A Natural History of Empty Lots by Christopher Brown
A Natural History of Empty Lots

A frustratingly hard-to-summarize book, because it requires a lot of backstory and explanation, and one of the things that makes this book so! fucking! great! is how skillfully Brown weaves disparate elements – the unique house he built in Austin, the wildlife he encounters in the city's sacrifice zones, the politics that created them – into his telling.

It's a series of loosely connected essays that explains how everything fits together: colonial conquest, Brown's failed marriage, his experience as a lawyer learning property law, and what he learned by mobilizing that learning to help his neighbors defend the pockets of wildness that refuse to budge.

It's filled with pastoral writing that summons Kim Stanley Robinson by way of Thoreau, and it sometimes frames its philosophical points the way a cyberpunk writer would.

The kind of book that challenges how you feel about the crossroads we're at, the place you live, and the place you want to be.

https://pluralistic.net/2024/09/17/cyberpunk-pastoralism/#time-to-mow-the-roof


GRAPHIC NOVELS

I. Death Strikes by Dave Maass and Patrick Lay
The cover for Death Strikes

"The Emperor of Atlantis," is an opera written by two Nazi concentration camp inmates, the librettist Peter Kien and the composer Viktor Ullmann, while they were interned in Terezin, a show-camp in Czechoslovakia that housed numerous Jewish artists, who were encouraged to make and display their work as a sham to prove to the rest of the world that Nazi camps were humane places.

Death Strikes was adapted by my EFF colleague Dave Maass, an investigator and muckraker and brilliant writer, who teamed up with illustrator Patrick Lay and character designer Ezra Rose (who worked from Kien and Ullmann's original designs, which survived along with the score and libretto).

The Emperor's endless wars have already tried Death's patience. Death brings mercy, not vengeance, and the endless killing has dismayed him. The Emperor's attempt to co-opt him drives him past the brink, and Death declares a strike, breaking his sword and announcing that henceforth, no one will die.

Needless to say, this puts a crimp in the Emperor's all-out war plan. People get shot and stabbed and drowned and poisoned, but they don't die. They just hang around, embarrassingly alive (there's a great comic subplot of the inability of the Emperor's executioners to kill a captured assassin).

While this is clearly an adaptation, Kien and Ullmann's spirit of creativity, courage, and bittersweet creative ferment shines through. It's a beautiful book, snatched from death itself.

https://pluralistic.net/2024/01/23/peter-kien-viktor-ullmann/#terez


II. My Favorite Thing Is Monsters Book Two by Emil Ferris
The cover for My Favorite Thing Is Monsters Book Two

The long, long-delayed sequel to the tale of Karen Reyes, a 10-year-old, monster-obsessed queer girl in 1968 Chicago who lives with her working-class single mother and her older brother, Deeze, in an apartment house full of mysterious, haunted adults. There's the landlord – a gangster – and his girlfriend; the one-eyed ventriloquist; and the beautiful Holocaust survivor and her jazz-drummer husband.

Ferris's storytelling style is dazzling, and it's matched and exceeded by her illustration style, which is grounded in the classic horror comics of the 1950s and 1960s. Characters in Karen's life – including Karen herself – are sometimes depicted in the EC horror style, and that same sinister darkness crowds around the edges of her depictions of real-world Chicago.

Book Two picks up from Book One's cliffhanger and then rockets forward. Everything brilliant about One is even better in Two – the illustrations more lush, the fine art analysis more pointed and brilliant, the storytelling more assured and propulsive, the shocks and violence more outrageous, the characters more lovable, complex and grotesque.

Everything about Two is more. The background radiation of the Vietnam War in One takes center stage with Deeze's machinations to beat the draft, and Deeze and Karen being ensnared in the Chicago Police Riots of '68. The allegories, analysis and reproductions of classical art get more pointed, grotesque and lavish. Annika's Nazi concentration camp horrors are more explicit and more explicitly connected to Karen's life. The queerness of the story takes center stage, both through Karen's first love and the introduction of a queer nightclub. The characters are more vivid, as is the racial injustice and the corruption of the adult world.

https://pluralistic.net/2024/06/01/the-druid/#


III. So Long Sad Love by Mirion Malle
The cover for So Long Sad Love

Cleo is a French comics creator who's moved to Montreal, in part to be with Charles, a Quebecois creator who helps her find a place in the city's tight-knit artistic scene. The relationship feels like a good one, with the normal ups and downs, but then Cleo travels to a festival, where she meets Farah, a vivacious and talented fellow artist. They're getting along great…until Farah discovers who Cleo's boyfriend is. Though Farah doesn't say anything, she is visibly flustered and makes her excuses before hurriedly departing.

This kicks off Cleo's hunt for the truth about her boyfriend, a hunt that is complicated by the fact that she's so far from home, that her friends are largely his friends, that he flies off the handle every time she raises the matter, and by her love for him.

Malle handles this all so deftly, showing how Cleo and her friends all play archetypal roles in the recurrent missing stair dynamic. It's a beautifully told story, full of charm and character, but it's also a kind of forensic re-enactment of a disaster, told from an intermediate distance that's close enough to the action that we can see the looming crisis, but also understand why the people in its midst are steering straight into it.

Packed with subtlety and depth, romance and heartbreak, subtext that carries through the dialog (in marvelous translation from the original French by Aleshia Jensen) and the body language in Malle's striking artwork.

https://pluralistic.net/2024/06/25/missing-step/#the-fog-of-love


IV. Bea Wolf by Zach Weinersmith and Boulet
The cover for Bea Wolf

A ferociously, amazingly great illustrated kids' graphic novel adaptation of Beowulf, the Old English epic poem that inspired Tolkien, who helped bring it to popularity after it had languished in obscurity for centuries.

Weinersmith and Boulet set themselves the task of bringing a Germanic heroic saga from more than a thousand years ago to modern children, while preserving the meter and the linguistic and literary tropes of the original. And they did it!

There are some changes, of course. Grendel – the boss monster that both Beowulf and Bea Wolf must defeat – is no longer obsessed with decapitating his foes and stealing their heads. In Bea Wolf, Grendel is a monstrously grown-up and boring adult who watches cable news and flosses twice per day, and when he defeats the kids whose destruction he is bent upon, he does so by turning them into boring adults, too.

The utter brilliance of Bea Wolf is as much due to the things it preserves from the original epic as it is to the updates and changes. Weinersmith has kept the Old English tradition of alliteration, right from the earliest passages, with celebrations of heroes like "Tanya, treat-taker, terror of Halloween, her costume-cache vast, sieging kin and neighbor, draining full candy-bins, fearing not the fate of her teeth. Ten thousand treats she took. That was a fine Tuesday."

https://pluralistic.net/2024/06/24/awesome-alliteration/#hellion-hallelujah


V. Youth Group by Bowen McCurdy and Jordan Morris
The cover for Youth Group

A charming tale of 1990s ennui, cringe Sunday School – and demon hunting.

Kay is a bitter, cynical teenager who's doing her best to help her mother cope with an ugly divorce that has seen her dad check out on his former family. Mom is going back to church, and she talks Kay into coming along with her to attend the church youth group.

But this is no ordinary youth group. Kay's ultra-boring suburban hometown is actually infested with demons who routinely possess the townspeople, and that baseline of demonic activity has suddenly gone critical, with a new wave of possessions. Suddenly, the possessed are everywhere – even Kay's shitty dad ends up with a demon inside of him.

That's when Kay discovers that the youth group and its corny pastor are also demon hunters par excellence. Their rec-rooms sport secret cubbies filled with holy weapons, and the words of exorcism come as readily to them as any embarrassing rewritten devotional pop song. Kay's discovery of this secret world convinces her that the youth group isn't so bad after all, and soon she is initiated into its mysteries, including the existence of rival demon-hunting kids from the local synagogue, Catholic church, and Wiccan coven.

https://pluralistic.net/2024/07/16/satanic-panic/#the-dream-of-the-nineties


VI. Justice Warriors: Vote Harder by Matt Bors and Ben Clarkson
The cover for Justice Warriors: Vote Harder

Vote Harder sees Bubble City facing its first election in living memory, as the mayor – who inherited his position from his "powerful, strapping Papa" – loses a confidence vote by the city's trustees. They're upset with his plan to bankrupt the city in order to buy a laser powerful enough to carve his likeness into the sun as a viral stunt for the launch of his comeback album. The trustees are in no way mollified by the fact that he expects to make a lot of money selling special branded sunglasses that allow Bubble City (and the mutant hordes of the Uninhabited Zone) to safely look into the sun and see what their tax dollars bought.

So it's time for an election, and the candidates are going hard: there's the incumbent, Mayor Prince; there's his half-sister and ex-girlfriend, Stufina Vipix XII; and there's a dark-horse candidate, Flauf Tanko, a mutant-tank cyborg that went rogue after a militant Home Owners Association disabled it and its owners abandoned it. Flauf Tanko is determined to give the masses of the Uninhabited Zone the representation they've been denied for so long, despite the structural impediments to this (UZers need to complete a questionnaire and its sub-forms, have three forms of ID, and present a rental contract, driver's license, work permit and breeding license. They also need to get their paperwork signed in person at a VERI-VOTE location, then wait 14 days to get their voter IDs by mail. Also, districts of 2 million or more mutants are allocated the equivalent of only 250,000 votes, but only if 51% of eligible voters show up to the polls; otherwise, their votes are parceled out to other candidates per the terms of the Undervoting and Apathy Allotment Act).

What unfolds is a funny, bitter, superb piece of political satire that could not be better timed.

https://pluralistic.net/2024/09/11/uninhabited-zone/#eremption-season


As I mentioned in the introduction to this roundup, I had three books out in 2024: a new hardcover, and the paperback editions of two books that came out in hardcover last year. There's more on the horizon – a new hardcover novel (PICKS AND SHOVELS) in Feb 2025, along with the paperback of my novel THE BEZZLE (also Feb 2025). I just turned in the manuscript for my next nonfiction book, ENSHITTIFICATION, which will also be adapted as a graphic novel. I'll also shortly be announcing the publication details for a YA graphic novel, a new essay collection, and a short story collection.

If you enjoy my work – the newsletter, the talks, the reviews – the best way to support me is to buy my books. I write for grownups, teens, middle-schoolers and little kids, so there's something for everyone!

I. The Lost Cause
The cover for The Lost Cause.
A solarpunk novel of hope in the climate emergency. "The first great YIMBY novel" -Bill McKibben. "Completely delightful…Neither utopian nor dystopian…I loved it" -Rebecca Solnit. A national bestseller!

https://us.macmillan.com/books/9781250865946/thelostcause/


II. The Internet Con: How to Seize the Means of Computation
The cover for The Internet Con.
A detailed disassembly manual for people who want to dismantle Big Tech. "A passionate case for 'relief from manipulation, high-handed moderation, surveillance, price-gouging, disgusting or misleading algorithmic suggestions.'" -Akash Kapur, New Yorker. Another national bestseller!

https://www.versobooks.com/products/3035-the-internet-con

III. The Bezzle
The cover for The Bezzle.
A seething rebuke of the privatized prison system that delves deeply into the arcane and baroque financial chicanery involved in the 2008 financial crash. "Righteously satisfying…A fascinating tale of financial skullduggery, long cons, and the delivery of ice-cold revenge." –Booklist. A third national bestseller!

https://us.macmillan.com/books/9781250865878/thebezzle/


Hey look at this (permalink)



A Wayback Machine banner.

This day in history (permalink)

#15yrsago Notes from a news-site paywall attempt https://lancewiggs.com/2009/11/29/2134-nbrs-performance-since-the-wall/

#15yrsago Iain Banks and other prominent Scots call for reform of Royal Bank of Scotland: “Royal Bank of Sustainability” https://www.theguardian.com/business/2009/nov/29/iain-banks-royal-bank-scotland

#15yrsago High-mag pollen photos highlight the invisible beauty of plants’ reproductive spritz https://web.archive.org/web/20091120085200/http://ngm.nationalgeographic.com/2009/12/pollen/oeggerli-photography

#15yrsago CCDs: a great disruptor lurking in the tech https://bitworking.org/news/2009/11/ccd/

#15yrsago Games Workshop declares war on best customers. Again. https://boardgamegeek.com/geeklist/48933/the-games-workshop-files-purge-of-09

#15yrsago Pub fined £8K after user infringes copyright with its WiFi https://web.archive.org/web/20091129040800/http://news.zdnet.co.uk/communications/0,1000000085,39909136,00.htm

#15yrsago DRM versus innovation https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1496058

#15yrsago Washington State to Microsoft: why aren’t you paying your taxes? https://web.archive.org/web/20120504044552/https://blogs.seattleweekly.com/dailyweekly/2009/11/an_open_letter_to_microsoft_ce.php

#15yrsago Disused call-box turned into world’s smallest lending library https://news.bbc.co.uk/2/hi/uk_news/england/somerset/8385313.stm

#15yrsago EU memo on secret copyright treaty confirms US desire for global DMCA https://web.archive.org/web/20091202005204/http://www.michaelgeist.ca/content/view/4575/125/

#15yrsago Turkey wants universal email surveillance from birth https://foreignpolicy.com/2009/11/30/turkey-tests-new-means-of-internet-control/

#15yrsago BBC photographer prevented from shooting St Paul’s because he might be “al Qaeda operative” https://news.bbc.co.uk/2/hi/uk_news/politics/8384972.stm

#15yrsago Business Software Alliance asks Britons to become paid informants https://web.archive.org/web/20091204180307/https://www.wired.com/threatlevel/2009/12/nark-on-your-boss/

#15yrsago Dane who ripped his DVDs demands to be arrested under DRM law https://torrentfreak.com/anti-piracy-group-refuses-bait-drm-breaker-goes-to-the-police-091201/

#15yrsago Somali pirate stock-market: “we’ve made piracy a community activity.” https://www.reuters.com/article/wtUSInvestingNews/idUSTRE5B01Z920091201/

#15yrsago Goldman Sachs bankers ready themselves to kill peasants in the inevitable uprising https://web.archive.org/web/20100131042233/http://www.bloomberg.com/apps/news?pid=20601039&sid=ahD2WoDAL9h0

#10yrsago A terrible restaurant for Canadian bankers called America https://www.theglobeandmail.com/life/food-and-wine/restaurant-reviews/america-at-the-trump-hotel-the-food-is-amazing-but-you-shouldnt-eat-here-ever/article21833277/

#10yrsago Chicago schools lost $100M by letting Wall Street engineer their finances https://www.nakedcapitalism.com/2014/11/chicago-public-schools-100-million-swaps-debacle-demonstrates-high-cost-high-finance.html

#10yrsago BMG and Rightscorp sue ISP for right to decide who may use the Internet https://arstechnica.com/tech-policy/2014/11/music-publishers-finally-pull-the-trigger-sue-an-isp-over-piracy/

#10yrsago Walmart holds food drive…for Walmart employees (again!) https://web.archive.org/web/20141127230642/http://dissenter.firedoglake.com/2014/11/26/walmart-again-holds-food-drive-for-own-underpaid-workers

#10yrsago John Oliver on Civil Forfeiture https://www.youtube.com/watch?v=3kEpZWGgJks

#5yrsago Three women independently accuse Gordon Sondland of repeated acts of highly similar sexual misconduct https://www.propublica.org/article/multiple-women-recall-sexual-misconduct-and-retaliation-by-gordon-sondland

#5yrsago This Thanksgiving, don’t have a political argument, have a “structured organizing conversation” https://jacobin.com/2019/11/thanksgiving-organizing-activism-friends-family-conversation-presidential-election

#5yrsago Profile of Mariana Mazzucato, the economist who’s swaying both left and right politicians with talk of “the entrepreneurial state” https://www.nytimes.com/2019/11/26/business/mariana-mazzucato.html

#5yrsago South Carolina’s magistrate judges are a clown-car of corrupt cronies, but they get to put people in jail https://www.propublica.org/article/these-judges-can-have-less-training-than-barbers-but-still-decide-thousands-of-cases-each-year

#5yrsago Italian cops raid neo-Nazis, find rifles, swords and Nazi literature https://www.bbc.com/news/world-europe-50590924.amp

#5yrsago Writer asks for an exclusive trademark on the use of the word “dark” in “Series of fiction works, namely, novels and books” https://twitter.com/Catrambo/status/1200060433216004096

#5yrsago Defense contractors gleefully report record earnings in divisions that bid on “classified” projects, the fastest-growing part of the Pentagon’s budget https://www.defenseone.com/business/2019/10/secret-pentagon-spending-rising-and-defense-firms-are-cashing/160802/

#5yrsago Meet the Krazy Klown Kavalcade of racists, homophobes, islamophobes and transphobes serving as appointed South Carolina magistrates https://www.propublica.org/article/he-defended-the-confederate-flag-and-insulted-immigrants-now-hes-a-judge#172036

#5yrsago DC Comics kills Batman image because China insisted it was supporting the Hong Kong protests https://variety.com/2019/film/news/dc-comics-warner-brothers-batman-1203419190/

#5yrsago The Oligarch Game: use coin-tosses to demonstrate “winner take all” and its power to warp perceptions https://brewster.kahle.org/2019/11/30/the-game-of-oligarchy/

#5yrsago Pennsylvania to Ohio: we see your terrible life-threatening anti-abortion bill and raise you with funerals for unimplanted, fertilized eggs https://www.vice.com/en/article/pennsylvania-fetal-burial-bill-death-certificates-for-miscarriage-abortion-fertilized-eggs-hb1890/

#5yrsago A quick trip through the ghastly, racist, sexist, eugenicist, authoritarian things that Boris Johnson has said in recent years https://www.businessinsider.com/boris-johnson-said-britain-poorest-chavs-losers-criminals-addicts-burglars-2019-11

#1yrago All the books I reviewed in 2023 https://pluralistic.net/2023/12/01/bookmaker/#2023-in-review

#1yrago Sponsored listings are a ripoff…for sellers https://pluralistic.net/2023/11/29/aethelred-the-unready/#not-one-penny-for-tribute

#1yrago Insurance companies are making climate risk worse https://pluralistic.net/2023/11/28/re-re-reinsurance/#useless-price-signals


Upcoming appearances (permalink)

A photo of me onstage, giving a speech, holding a mic.



A screenshot of me at my desk, doing a livecast.

Recent appearances (permalink)



A grid of my books with Will Staehle covers.

Latest books (permalink)



A cardboard book box with the Macmillan logo.

Upcoming books (permalink)

  • Picks and Shovels: a sequel to "Red Team Blues," about the heroic era of the PC, Tor Books, February 2025

  • Unauthorized Bread: a middle-grades graphic novel adapted from my novella about refugees, toasters and DRM, FirstSecond, 2025



Colophon (permalink)

Today's top sources:

Currently writing:

  • Enshittification: a nonfiction book about platform decay for Farrar, Straus, Giroux. Today's progress: 904 words (90067 words total). FIRST DRAFT COMPLETE

  • A Little Brother short story about DIY insulin PLANNING

  • Picks and Shovels, a Martin Hench noir thriller about the heroic era of the PC. FORTHCOMING TOR BOOKS FEB 2025

Latest podcast: Spill, part four (a Little Brother story) https://craphound.com/littlebrother/2024/10/28/spill-part-four-a-little-brother-story/


This work – excluding any serialized fiction – is licensed under a Creative Commons Attribution 4.0 license. That means you can use it any way you like, including commercially, provided that you attribute it to me, Cory Doctorow, and include a link to pluralistic.net.

https://creativecommons.org/licenses/by/4.0/

Quotations and images are not included in this license; they are included either under a limitation or exception to copyright, or on the basis of a separate license. Please exercise caution.


How to get Pluralistic:

Blog (no ads, tracking, or data-collection):

Pluralistic.net

Newsletter (no ads, tracking, or data-collection):

https://pluralistic.net/plura-list

Mastodon (no ads, tracking, or data-collection):

https://mamot.fr/@pluralistic

Medium (no ads, paywalled):

https://doctorow.medium.com/

Twitter (mass-scale, unrestricted, third-party surveillance and advertising):

https://twitter.com/doctorow

Tumblr (mass-scale, unrestricted, third-party surveillance and advertising):

https://mostlysignssomeportents.tumblr.com/tagged/pluralistic

"When life gives you SARS, you make sarsaparilla" -Joey "Accordion Guy" DeVilla

2024-12-01T02:01:29+00:00
Note published on December 1, 2024 at 2:01 AM UTC
2024-11-30T00:36:04+00:00
Note published on November 30, 2024 at 12:36 AM UTC
2024-11-29T08:23:31+00:00
Why pipes sometimes get "stuck": buffering

Here’s a niche terminal problem that has bothered me for years but that I never really understood until a few weeks ago. Let’s say you’re running this command to watch for some specific output in a log file:

tail -f /some/log/file | grep thing1 | grep thing2

If log lines are being added to the file relatively slowly, the result I’d see is… nothing! It doesn’t matter if there were matches in the log file or not, there just wouldn’t be any output.

I internalized this as “uh, I guess pipes just get stuck sometimes and don’t show me the output, that’s weird”, and I’d handle it by just running grep thing1 /some/log/file | grep thing2 instead, which would work.

So as I’ve been doing a terminal deep dive over the last few months, I was really excited to finally learn exactly why this happens.

why this happens: buffering

The reason why “pipes get stuck” sometimes is that it’s VERY common for programs to buffer their output before writing it to a pipe or file. So the pipe is working fine, the problem is that the program never even wrote the data to the pipe!

This is for performance reasons: writing each piece of output immediately, as soon as you have it, uses more system calls, so it’s more efficient to save up data until you have 8KB or so (or until the program exits) and THEN write it to the pipe.

In this example:

tail -f /some/log/file | grep thing1 | grep thing2

the problem is that grep thing1 is saving up all of its matches until it has 8KB of data to write, which might literally never happen.

programs don’t buffer when writing to a terminal

Part of why I found this so disorienting is that tail -f file | grep thing will work totally fine, but then when you add the second grep, it stops working!! The reason for this is that the way grep handles buffering depends on whether it’s writing to a terminal or not.

Here’s how grep (and many other programs) decides to buffer its output:

  • Check if stdout is a terminal or not using the isatty function
    • If it’s a terminal, use line buffering (print every line immediately as soon as you have it)
    • Otherwise, use “block buffering” – only print data if you have at least 8KB or so of data to print

So if grep is writing directly to your terminal then you’ll see the line as soon as it’s printed, but if it’s writing to a pipe, you won’t.
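
Here’s a minimal C sketch of that decision – not grep’s actual code (glibc’s stdio makes roughly this choice on its own), just a toy program that makes the isatty-based decision explicit:

#include <stdio.h>
#include <unistd.h>

int main(void) {
    if (isatty(fileno(stdout))) {
        /* stdout is a terminal: line buffering, flush at every newline */
        setvbuf(stdout, NULL, _IOLBF, BUFSIZ);
    } else {
        /* stdout is a pipe or file: block buffering, flush every BUFSIZ bytes */
        setvbuf(stdout, NULL, _IOFBF, BUFSIZ);
    }
    printf("hello\n");  /* appears immediately on a TTY, maybe much later in a pipe */
    return 0;
}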

Of course the buffer size isn’t always 8KB for every program, it depends on the implementation. For grep the buffering is handled by libc, and libc’s buffer size is defined in the BUFSIZ variable. Here’s where that’s defined in glibc.

(as an aside: “programs do not use 8KB output buffers when writing to a terminal” isn’t, like, a law of terminal physics, a program COULD use an 8KB buffer when writing output to a terminal if it wanted, it would just be extremely weird if it did that, I can’t think of any program that behaves that way)

commands that buffer & commands that don’t

One annoying thing about this buffering behaviour is that you kind of need to remember which commands buffer their output when writing to a pipe.

Some commands that don’t buffer their output:

  • tail
  • cat
  • tee

I think almost everything else will buffer output, especially if it’s a command where you’re likely to be using it for batch processing. Here’s a list of some common commands that buffer their output when writing to a pipe, along with the flag that disables block buffering.

  • grep (--line-buffered)
  • sed (-u)
  • awk (there’s a fflush() function)
  • tcpdump (-l)
  • jq (--unbuffered)
  • tr (-u)
  • cut (can’t disable buffering)

Those are all the ones I can think of; lots of other unix commands (like sort) may or may not buffer their output, but it doesn’t matter, because sort can’t do anything until it finishes receiving its input anyway.

Also I did my best to test both the Mac OS and GNU versions of these but there are a lot of variations and I might have made some mistakes.

programming languages where the default “print” statement buffers

Also, here are a few programming languages where the default print statement will buffer output when writing to a pipe, and some ways to disable buffering if you want:

  • C (disable with setvbuf)
  • Python (disable with python -u, or PYTHONUNBUFFERED=1, or sys.stdout.reconfigure(line_buffering=True), or print(x, flush=True))
  • Ruby (disable with STDOUT.sync = true)
  • Perl (disable with $| = 1)

I assume that these languages are designed this way so that the default print function will be fast when you’re doing batch processing.
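
To make the C entry in that list concrete, here’s a minimal sketch of what “disable with setvbuf” looks like (a toy program, not code from any real tool):

#include <stdio.h>

int main(void) {
    /* _IONBF = unbuffered: every write goes straight out, pipe or not.
       This has to run before the first write to the stream. */
    setvbuf(stdout, NULL, _IONBF, 0);

    printf("this line reaches the pipe immediately\n");
    return 0;
}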

Also whether output is buffered or not might depend on how you print, for example in C++ cout << "hello\n" buffers when writing to a pipe but cout << "hello" << endl will flush its output.

when you press Ctrl-C on a pipe, the contents of the buffer are lost

Let’s say you’re running this command as a hacky way to watch for DNS requests to example.com, and you forgot to pass -l to tcpdump:

sudo tcpdump -ni any port 53 | grep example.com

When you press Ctrl-C, what happens? In a magical perfect world, what I would want to happen is for tcpdump to flush its buffer, grep would search for example.com, and I would see all the output I missed.

But in the real world, what happens is that all the programs get killed and the output in tcpdump’s buffer is lost.

I think this problem is probably unavoidable – I spent a little time with strace to see how this works, and grep receives the SIGINT before tcpdump anyway, so even if tcpdump tried to flush its buffer, grep would already be dead.

After a little more investigation, there is a workaround: if you find tcpdump’s PID and kill -TERM $PID, then tcpdump will flush the buffer so you can see the output. That’s kind of a pain but I tested it and it seems to work.
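
Here’s a minimal C sketch of the pattern that makes that workaround possible – a program that flushes its stdio buffer when it receives SIGTERM. (This is my illustration, not tcpdump’s actual code; also, calling fflush inside a signal handler isn’t strictly async-signal-safe, so a careful program would set a flag and flush from its main loop instead.)

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static void on_term(int sig) {
    (void)sig;
    fflush(stdout);  /* push out whatever is sitting in the buffer */
    _exit(0);
}

int main(void) {
    signal(SIGTERM, on_term);
    for (int i = 0; ; i++) {
        printf("line %d\n", i);  /* block-buffered when stdout is a pipe */
        sleep(1);
    }
}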

redirecting to a file also buffers

It’s not just pipes, this will also buffer:

sudo tcpdump -ni any port 53 > output.txt

Redirecting to a file doesn’t have the same “Ctrl-C will totally destroy the contents of the buffer” problem though – in my experience it usually behaves more like you’d want, where the contents of the buffer get written to the file before the program exits. I’m not 100% sure whether this is something you can always rely on or not.

a bunch of potential ways to avoid buffering

Okay, let’s talk solutions. Let’s say you’ve run this command:

tail -f /some/log/file | grep thing1 | grep thing2

I asked people on Mastodon how they would solve this in practice and there were 5 basic approaches. Here they are:

solution 1: run a program that finishes quickly

Historically my solution to this has been to just avoid the “command writing to pipe slowly” situation completely and instead run a program that will finish quickly like this:

cat /some/log/file | grep thing1 | grep thing2 | tail

This doesn’t do the same thing as the original command but it does mean that you get to avoid thinking about these weird buffering issues.

(you could also do grep thing1 /some/log/file but I often prefer to use an “unnecessary” cat)

solution 2: remember the “line buffer” flag to grep

You could remember that grep has a flag to avoid buffering and pass it like this:

tail -f /some/log/file | grep --line-buffered thing1 | grep thing2

solution 3: use awk

Some people said that if they’re specifically dealing with a multiple greps situation, they’ll rewrite it to use a single awk instead, like this:

tail -f /some/log/file | awk '/thing1/ && /thing2/'

Or you could write a more complicated grep, like this:

tail -f /some/log/file | grep -E 'thing1.*thing2'

(awk also buffers, so for this to work you’ll want awk to be the last command in the pipeline)

solution 4: use stdbuf

stdbuf uses LD_PRELOAD to turn off libc’s buffering, and you can use it to turn off output buffering like this:

tail -f /some/log/file | stdbuf -o0 grep thing1 | grep thing2

Like any LD_PRELOAD solution it’s a bit unreliable – it doesn’t work on static binaries, I think it won’t work if the program isn’t using libc’s buffering, and it doesn’t always work on Mac OS. Harry Marr has a really nice How stdbuf works post.

solution 5: use unbuffer

unbuffer program will force the program’s output to be a TTY, which means that it’ll behave the way it normally would on a TTY (less buffering, colour output, etc). You could use it in this example like this:

tail -f /some/log/file | unbuffer grep thing1 | grep thing2

Unlike stdbuf it will always work, though it might have unwanted side effects: for example, grep thing1 will also colour its matches.

If you want to install unbuffer, it’s in the expect package.

that’s all the solutions I know about!

It’s a bit hard for me to say which one is “best”; I think personally I’m most likely to use unbuffer because I know it’s always going to work.

If I learn about more solutions I’ll try to add them to this post.

I’m not really sure how often this comes up

I think it’s not very common for me to have a program that slowly trickles data into a pipe like this, normally if I’m using a pipe a bunch of data gets written very quickly, processed by everything in the pipeline, and then everything exits. The only examples I can come up with right now are:

  • tcpdump
  • tail -f
  • watching log files in a different way like with kubectl logs
  • the output of a slow computation

what if there were an environment variable to disable buffering?

I think it would be cool if there were a standard environment variable to turn off buffering, like PYTHONUNBUFFERED in Python. I got this idea from a couple of blog posts by Mark Dominus in 2018. Maybe NO_BUFFER like NO_COLOR?

The design seems tricky to get right; Mark points out that NetBSD has environment variables called STDBUF, STDBUF1, etc, which give you a ton of control over buffering, but I imagine most developers don’t want to implement many different environment variables to handle a relatively minor edge case.

I’m also curious about whether there are any programs that just automatically flush their output buffers after some period of time (like 1 second). It feels like it would be nice in theory but I can’t think of any program that does that so I imagine there are some downsides.
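
For what it’s worth, a periodic flush is easy to sketch in C with alarm() – this is hypothetical, not something I’ve seen a real tool ship (and it has the same async-signal-safety caveat as the SIGTERM example above):

#include <signal.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

static void on_alarm(int sig) {
    (void)sig;
    fflush(stdout);  /* at most ~1 second of output ever sits in the buffer */
    alarm(1);        /* re-arm for the next second */
}

int main(void) {
    signal(SIGALRM, on_alarm);
    alarm(1);
    for (int i = 0; ; i++) {
        printf("tick %d\n", i);
        struct timespec ts = {3, 0};
        nanosleep(&ts, NULL);  /* wakes early when SIGALRM fires; fine for a demo */
    }
}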

stuff I left out

Some things I didn’t talk about in this post since these posts have been getting pretty long recently and seriously does anyone REALLY want to read 3000 words about buffering?

  • the difference between line buffering and having totally unbuffered output
  • how buffering to stderr is different from buffering to stdout
  • this post is only about buffering that happens inside the program, your operating system’s TTY driver also does a little bit of buffering sometimes
  • other reasons you might need to flush your output other than “you’re writing to a pipe”
2024-11-28T16:45:50+00:00
Note published on November 28, 2024 at 4:45 PM UTC
2024-11-27T21:02:47+00:00
Note published on November 27, 2024 at 9:02 PM UTC
2024-11-27T01:45:16+00:00
Note published on November 27, 2024 at 1:45 AM UTC

It's interesting to me that the Fifth Circuit only considered "control" at the smart contract level, and does not seem to consider the role of validators in their opinion. A substantial portion of ETH blocks are built with relays that censor transactions with OFAC-sanctioned contracts, and it seems to me there is now an open question as to whether validators that use non-censoring relays could be sanctioned directly.

The software codes here—the twenty Tornado Cash addresses for immutable smart contracts—are tools used in providing a service of pooling and mixing the deposited Ether prior to withdrawal. Indeed, the immutable smart contract provides a “service” only when an individual cryptocurrency owner makes the relevant input and withdrawal from the smart contract; at that point, and only at that point, the immutable smart contract mixes deposits, provides the depositor a withdrawal key, and, when provided with that key, sends the specified amount to the designated withdrawal account. In short, the immutable smart contract begins working only when prompted to do so by a deposit or entry of a key for withdrawal. More importantly, Tornado Cash, as defined by OFAC, does not own the services provided by the immutable smart contracts. A homeowner may own the right to trash-removal services and a client may own the right to legal services performed by a lawyer, but neither the homeowner nor the client owns the person performing the trash-removal services or the lawyer—for good reason. Similarly, Tornado Cash as an “entity” does not own the immutable smart contracts, separate and apart from any rights or benefits of the services performed by the immutable smart contracts.

(Not saying they should, just remarking on the fact that it seems to have gone completely unaddressed.)

Of course this was a concern already, but what with the Treasury focused on the Tornado Cash contracts, it was less central than I suspect it might be soon. This strategy would be somewhat in keeping with legal theories around other "malicious" code, where it's broadly speaking legal to write a devastating computer virus, but a whole lot less legal to run one.

2024-11-27T00:07:41+00:00
Note published on November 27, 2024 at 12:07 AM UTC
Tue, 26 Nov 2024 08:36:26 +0000
Pluralistic: Bossware is unfair (in the legal sense, too) (26 Nov 2024)


Today's links



A sweatshop: women sit around a table sewing. Through the lone window, we can see a 'code waterfall' effect as seen in the credits of the Wachowskis' 'Matrix' movies. To their left stands a man in a pin-stripe suit, looking at his watch. His body language radiates impatience. His eyes have been replaced by the staring red eyes of HAL9000 from Kubrick's '2001: A Space Odyssey.' Each woman's head is surmounted by a set of floating Victorian calipers.

Bossware is unfair (in the legal sense, too) (permalink)

You can get into a lot of trouble by assuming that rich people know what they're doing. For example, you might assume that ad-tech works – bypassing people's critical faculties, reaching inside their minds and brainwashing them with Big Data insights – because if that's not what's happening, then why would rich people pour billions into those ads?

https://pluralistic.net/2020/12/06/surveillance-tulip-bulbs/#adtech-bubble

You might assume that private equity looters make their investors rich, because otherwise, why would rich people hand over trillions for them to play with?

https://thenextrecession.wordpress.com/2024/11/19/private-equity-vampire-capital/

The truth is, rich people are suckers like the rest of us. If anything, succeeding once or twice makes you an even bigger mark, with a sense of your own infallibility that inflates to fill the bubble your yes-men seal you inside of.

Rich people fall for scams just like you and me. Anyone can be a mark. I was:

https://pluralistic.net/2024/02/05/cyber-dunning-kruger/#swiss-cheese-security

But though rich people can fall for scams the same way you and I do, the way those scams play out is very different when the marks are wealthy. As Keynes had it, "The market can remain irrational longer than you can remain solvent." When the marks are rich (or worse, super-rich), they can be played for much longer before they go bust, creating the appearance of solidity.

Noted Keynesian John Kenneth Galbraith had his own thoughts on this. Galbraith coined the term "bezzle" to describe "the magic interval when a confidence trickster knows he has the money he has appropriated but the victim does not yet understand that he has lost it." In that magic interval, everyone feels better off: the mark thinks he's up, and the con artist knows he's up.

Rich marks have looong bezzles. Empirically incorrect ideas grounded in the most outrageous superstition and junk science can take over whole sections of your life, simply because a rich person – or rich people – are convinced that they're good for you.

Take "scientific management." In the early 20th century, the con artist Frederick Taylor convinced rich industrialists that he could increase their workers' productivity through a kind of caliper-and-stopwatch driven choreography:

https://pluralistic.net/2022/08/21/great-taylors-ghost/#solidarity-or-bust

Taylor and his army of labcoated sadists perched at the elbows of factory workers (whom Taylor referred to as "stupid," "mentally sluggish," and "an ox") and scripted their motions to a fare-thee-well, transforming their work into a kind of kabuki of obedience. They weren't more efficient, but they looked smart, like obedient robots, and this made their bosses happy. The bosses shelled out fortunes for Taylor's services, even though the workers who followed his prescriptions were less efficient and generated fewer profits. Bosses were so dazzled by the spectacle of a factory floor of crisply moving people interfacing with crisply working machines that they failed to understand that they were losing money on the whole business.

To the extent they noticed that their revenues were declining after implementing Taylorism, they assumed that this was because they needed more scientific management. Taylor had a sweet con: the worse his advice performed, the more reasons there were to pay him for more advice.

Taylorism is a perfect con to run on the wealthy and powerful. It feeds into their prejudice and mistrust of their workers, and into their misplaced confidence in their own ability to understand their workers' jobs better than their workers do. There's always a long dollar to be made playing the "scientific management" con.

Today, there's an app for that. "Bossware" is a class of technology that monitors and disciplines workers, and it was supercharged by the pandemic and the rise of work-from-home. Combine bossware with work-from-home and your boss gets to control your life even when you're in your own place – "work from home" becomes "live at work":

https://pluralistic.net/2021/02/24/gwb-rumsfeld-monsters/#bossware

Gig workers are at the white-hot center of bossware. Gig work promises "be your own boss," but bossware puts a Taylorist caliper wielder into your phone, monitoring and disciplining you as you drive your own car around delivering parcels or picking up passengers.

In automation terms, a worker hitched to an app this way is a "reverse centaur." Automation theorists call a human augmented by a machine a "centaur" – a human head supported by a machine's tireless and strong body. A "reverse centaur" is a machine augmented by a human – like the Amazon delivery driver whose app goads them to make inhuman delivery quotas while punishing them for looking in the "wrong" direction or even singing along with the radio:

https://pluralistic.net/2024/08/02/despotism-on-demand/#virtual-whips

Bossware pre-dates the current AI bubble, but AI mania has supercharged it. AI pumpers insist that AI can do things it positively cannot do – rolling out an "autonomous robot" that turns out to be a guy in a robot suit, say – and rich people are groomed to buy the services of "AI-powered" bossware:

https://pluralistic.net/2024/01/29/pay-no-attention/#to-the-little-man-behind-the-curtain

For an AI scammer like Elon Musk or Sam Altman, the fact that an AI can't do your job is irrelevant. From a business perspective, the only thing that matters is whether a salesperson can convince your boss that an AI can do your job – whether or not that's true:

https://pluralistic.net/2024/07/25/accountability-sinks/#work-harder-not-smarter

The fact that AI can't do your job, but that your boss can be convinced to fire you and replace you with the AI that can't do your job, is the central fact of the 21st century labor market. AI has created a world of "algorithmic management" where humans are demoted to reverse centaurs, monitored and bossed about by an app.

The techbro's overwhelming conceit is that nothing is a crime, so long as you do it with an app. Just as fintech is designed to be a bank that's exempt from banking regulations, the gig economy is meant to be a workplace that's exempt from labor law. But this wheeze is transparent, and easily pierced by enforcers, so long as those enforcers want to do their jobs. One such enforcer is Alvaro Bedoya, an FTC commissioner with a keen interest in antitrust's relationship to labor protection.

Bedoya understands that antitrust has a checkered history when it comes to labor. As he's written, the history of antitrust is a series of incidents in which Congress revised the law to make it clear that forming a union was not the same thing as forming a cartel, only to be ignored by boss-friendly judges:

https://pluralistic.net/2023/04/14/aiming-at-dollars/#not-men

Bedoya is no mere historian. He's an FTC Commissioner, one of the most powerful regulators in the world, and he's profoundly interested in using that power to help workers, especially gig workers, whose misery starts with systemic, wide-scale misclassification as contractors:

https://pluralistic.net/2024/02/02/upward-redistribution/

In a new speech to NYU's Wagner School of Public Service, Bedoya argues that the FTC's existing authority allows it to crack down on algorithmic management – that is, algorithmic management is illegal, even if you break the law with an app:

https://www.ftc.gov/system/files/ftc_gov/pdf/bedoya-remarks-unfairness-in-workplace-surveillance-and-automated-management.pdf

Bedoya starts with a delightful analogy to The Hawtch-Hawtch, a mythical town from a Dr Seuss poem. The Hawtch-Hawtch economy is based on beekeeping, and the Hawtchers develop an overwhelming obsession with their bee's laziness, and determine to wring more work (and more honey) out of him. So they appoint a "bee-watcher." But the bee doesn't produce any more honey, which leads the Hawtchers to suspect their bee-watcher might be sleeping on the job, so they hire a bee-watcher-watcher. When that doesn't work, they hire a bee-watcher-watcher-watcher, and so on and on.

For gig workers, it's bee-watchers all the way down. Call center workers are subjected to "AI" video monitoring, and "AI" voice monitoring that purports to measure their empathy. Another AI times their calls. Two more AIs analyze the "sentiment" of the calls and the success of workers in meeting arbitrary metrics. On average, a call-center worker is subjected to five forms of bossware, which stand at their shoulders, marking them down and brooking no debate.

For example, when an experienced call center operator fielded a call from a customer with a flooded house who wanted to know why no one from her boss's repair plan system had come out to address the flooding, the operator was punished by the AI for failing to try to sell the customer a repair plan. There was no way for the operator to protest that the customer had a repair plan already, and had called to complain about it.

Workers report being sickened by this kind of surveillance, literally – stressed to the point of nausea and insomnia. Ironically, one of the most pervasive sources of automation-driven sickness are the "AI wellness" apps that bosses are sold by AI hucksters:

https://pluralistic.net/2024/03/15/wellness-taylorism/#sick-of-spying

The FTC has broad authority to block "unfair trade practices," and Bedoya builds the case that this is an unfair trade practice. Proving an unfair trade practice is a three-part test: a practice is unfair if it causes "substantial injury," can't be "reasonably avoided," and isn't outweighed by a "countervailing benefit." In his speech, Bedoya makes the case that algorithmic management satisfies all three steps and is thus illegal.

On the question of "substantial injury," Bedoya describes the workday of warehouse workers working for ecommerce sites. He describes one worker who is monitored by an AI that requires him to pick and drop an object off a moving belt every 10 seconds, for ten hours per day. The worker's performance is tracked by a leaderboard, and supervisors punish and scold workers who don't make quota, and the algorithm auto-fires if you fail to meet it.

Under those conditions, it was only a matter of time until the worker experienced injuries to two of his discs and was permanently disabled, with the company being found 100% responsible for this injury. OSHA found a "direct connection" between the algorithm and the injury. No wonder warehouses sport vending machines that sell painkillers rather than sodas. It's clear that algorithmic management leads to "substantial injury."

What about "reasonably avoidable?" Can workers avoid the harms of algorithmic management? Bedoya describes the experience of NYC rideshare drivers who attended a round-table with him. The drivers describe logging tens of thousands of successful rides for the apps they work for, on promise of "being their own boss." But then the apps start randomly suspending them, telling them they aren't eligible to book a ride for hours at a time, sending them across town to serve an underserved area and still suspending them. Drivers who stop for coffee or a pee are locked out of the apps for hours as punishment, and so drive 12-hour shifts without a single break, in hopes of pleasing the inscrutable, high-handed app.

All this, as drivers' pay is falling and their credit card debts are mounting. No one will explain to drivers how their pay is determined, though the legal scholar Veena Dubal's work on "algorithmic wage discrimination" reveals that rideshare apps temporarily increase the pay of drivers who refuse rides, only to lower it again once they're back behind the wheel:

https://pluralistic.net/2023/04/12/algorithmic-wage-discrimination/#fishers-of-men

This is like the pit boss who gives a losing gambler some freebies to lure them back to the table, over and over, until they're broke. No wonder they call this a "casino mechanic." There are only two major rideshare apps, and they both use the same high-handed tactics. For Bedoya, this satisfies the second test for an "unfair practice" – it can't be reasonably avoided. If you drive rideshare, you're trapped by the harmful conduct.

The final prong of the "unfair practice" test is whether the conduct has "countervailing value" that makes up for this harm.

To address this, Bedoya goes back to the call center, where operators' performance is assessed by "Speech Emotion Recognition" algorithms, a pseudoscientific hoax that purports to be able to determine your emotions from your voice. These SERs don't work – for example, they might interpret a customer's laughter as anger. But they fail differently for different kinds of workers: workers with accents – from the American south, or the Philippines – attract more disapprobation from the AI. Half of all call center workers are monitored by SERs, and a quarter of workers have SERs scoring them "constantly."

Bossware AIs also produce transcripts of these workers' calls, but workers with accents find them "riddled with errors." These are consequential errors, since their bosses assess their performance based on the transcripts, and yet another AI produces automated work scores based on them.

In other words, algorithmic management is a procession of bee-watchers, bee-watcher-watchers, and bee-watcher-watcher-watchers, stretching to infinity. It's junk science. It's not producing better call center workers. It's producing arbitrary punishments, often against the best workers in the call center.

There is no "countervailing benefit" to offset the unavoidable substantial injury of life under algorithmic management. In other words, algorithmic management fails all three prongs of the "unfair practice" test, and it's illegal.

What should we do about it? Bedoya builds the case for the FTC acting on workers' behalf under its "unfair practice" authority, but he also points out that the lack of worker privacy is at the root of this hellscape of algorithmic management.

He's right. The last major update Congress made to US privacy law was in 1988, when they banned video-store clerks from telling the newspapers which VHS cassettes you rented. The US is long overdue for a new privacy regime, and workers under algorithmic management are part of a broad coalition that's closer than ever to making that happen:

https://pluralistic.net/2023/12/06/privacy-first/#but-not-just-privacy

Workers should have the right to know which of their data is being collected, who it's being shared with, and how it's being used. We all should have that right. That's part of what motivated the actors' strike: actors were being ordered to wear mocap suits to produce data that could be used to create digital doubles of them – "training their replacement," except the replacement was a deepfake.

With a Trump administration on the horizon, the future of the FTC is in doubt. But the coalition for a new privacy law includes many of Trumpland's most powerful blocs – like Jan 6 rioters whose location was swept up by Google and handed over to the FBI. A strong privacy law would protect their Fourth Amendment rights – but also the rights of BLM protesters who experienced this far more often, and with far worse consequences, than the insurrectionists.

The "we do it with an app, so it's not illegal" ruse is wearing thinner by the day. When you have a boss for an app, your real boss gets an accountability sink, a convenient scapegoat that can be blamed for your misery.

The fact that this makes you worse at your job, that it loses your boss money, is no guarantee that you will be spared. Rich people make great marks, and they can remain irrational longer than you can remain solvent. Markets won't solve this one – but worker power can.

(Image: Cryteria, CC BY 3.0, modified)


Hey look at this (permalink)



A Wayback Machine banner.

This day in history (permalink)

#15yrsago Concordia University has a spy-squad that snooped on novelist for “bilingual interests” https://web.archive.org/web/20101119125330/http://artthreat.net/2009/11/concordia-university-spied-novelist/

#10yrsago DC cops budget their asset forfeiture income years in advance https://www.washingtonpost.com/investigations/dc-police-plan-for-future-seizure-proceeds-years-in-advance-in-city-budget-documents/2014/11/15/7025edd2-6b76-11e4-b053-65cea7903f2e_story.html

#10yrsago Analysis of leaked logs from Syria’s censoring national firewall https://www.techdirt.com/2014/11/26/lessons-censorship-syrias-internet-filter-machines/

#10yrsago The Shibboleth, the sequel to The Twelve Fingered Boy https://memex.craphound.com/2014/11/27/the-shibboleth-the-sequel-to-the-twelve-fingered-boy/

#10yrsago Tiny, transforming apartment made huge with massive wheeled storage-compartments https://vimeo.com/110871691

#5yrsago Open Memory Box: hundreds of hours of East German home movies, 1947-1990 https://open-memory-box.de/roll/013-06/00-00-41-20

#5yrsago Talking Adversarial Interoperability with Y Combinator https://www.youtube.com/watch?v=1RsI-Vh-KWI

#5yrsago Debullshitifying the Right to Repair excuses Apple sent to Congress https://www.ifixit.com/News/33977/apple-told-congress-how-repair-should-work-we-respond

#5yrsago NSO Group employees kicked off Facebook for spying for brutal dictators are suing Facebook for violating their privacy https://www.vice.com/en/article/nso-employees-take-legal-action-against-facebook-for-banning-their-accounts/

#5yrsago Amazon secretly planned to use facial recognition and Ring doorbells to create neighborhood “watch lists” https://theintercept.com/2019/11/26/amazon-ring-home-security-facial-recognition/

#5yrsago Great backgrounder on the Hong Kong protests: what’s at stake and how’d we get here? https://www.vox.com/world/2019/8/22/20804294/hong-kong-protests-9-questions

#5yrsago Apple poses a false dichotomy between “privacy” and “competition” https://www.washingtonpost.com/technology/2019/11/26/apple-emphasizes-user-privacy-lawmakers-see-it-an-effort-edge-out-its-rivals/

#5yrsago China wants to lead the UN’s World Intellectual Property Organization https://foreignpolicy.com/2019/11/26/china-bids-lead-world-intellectual-property-organization-wipo/

#1yrago The real AI fight https://pluralistic.net/2023/11/27/10-types-of-people/#taking-up-a-lot-of-space


Upcoming appearances (permalink)

A photo of me onstage, giving a speech, holding a mic.



A screenshot of me at my desk, doing a livecast.

Recent appearances (permalink)



A grid of my books with Will Staehle covers.

Latest books (permalink)



A cardboard book box with the Macmillan logo.

Upcoming books (permalink)

  • Picks and Shovels: a sequel to "Red Team Blues," about the heroic era of the PC, Tor Books, February 2025

  • Unauthorized Bread: a middle-grades graphic novel adapted from my novella about refugees, toasters and DRM, FirstSecond, 2025



Colophon (permalink)

Today's top sources:

Currently writing:

  • Enshittification: a nonfiction book about platform decay for Farrar, Straus, Giroux. Today's progress: 766 words (88164 words total).

  • A Little Brother short story about DIY insulin PLANNING

  • Picks and Shovels, a Martin Hench noir thriller about the heroic era of the PC. FORTHCOMING TOR BOOKS FEB 2025


Mon, 25 Nov 2024 01:23:31 +0000
Pluralistic: The far right grows through "disaster fantasies" (25 Nov 2024)


Today's links



A heavily armed and armored figure with the head of a foolishly grinning 19th century newsie. He stands in the atrium of a pink, vintage mall.

The far right grows through "disaster fantasies" (permalink)

The core of the prepper fantasy: "What if the world ended in the precise way that made me the most important person?" The ultra-rich fantasize about emerging from luxury bunkers with an army of mercs and thumbdrives full of bitcoin to a world in ruins that they restructure using their "leadership skills."

The ethnographer Richard Mitchell spent his career embedding with preppers, eventually writing the canonical book on the fantasies that power their obsessions, Dancing at Armageddon: Survivalism and Chaos in Modern Times:

https://www.press.uchicago.edu/ucp/books/book/chicago/D/bo3637295.html

Mitchell recounts how the disasters that preppers prepare for are the disasters that will call upon their skills, like the water chemist who's devoted his life to preparing to help his community recover from a terrorist attack on its water supply; and who, when pressed, has no theory as to why any terrorist would stage such an attack:

https://pluralistic.net/2020/03/22/preppers-are-larpers/#preppers-unprepared

Prepping is what happens when you are consumed by the fantasy of a terrible omnicrisis that you can solve, personally. It's an individualistic fantasy, and that makes it inherently neoliberal. Neoliberalism's mind-zap is to convince us all that our only role in society is as an individual ("There is no such thing as society" – M. Thatcher). If we have a workplace problem, we must bargain with our bosses, and if we lose, our choices are to quit or eat shit. Under no circumstances should we solve labor disputes through a union, especially not one that wins strong legal protections for workers and then holds the government's feet to the fire.

Same with bad corporate conduct: getting ripped off? Caveat emptor! Vote with your wallet and take your business elsewhere. Elections are slow and politics are boring. But "vote with your wallet" turns retail therapy into a form of civics.

This individualistic approach to problem solving does useful work for powerful people, because it keeps the rest of us thoroughly powerless. Voting with your wallet is casting a ballot in a rigged election that's always won by the people with the thickest wallets, and statistically, that's never you. That's why the right is so obsessed with removing barriers to election spending: the wealthy can't win a one-person/one-vote election (to be in the 1% is to be outnumbered 99:1), but unlimited campaign spending lets the wealthy vote in real elections with their wallets, not just their ballots.

You can't recycle your way out of the climate emergency. Practically speaking, you can't even recycle. All those plastics you lovingly washed and sorted ended up in a landfill or floating in the ocean. Plastics recycling is a hoax perpetrated by the petrochemical industry, who knew all along that their products would never be recycled. These despoilers convinced us to view the systemic rot of corporate ecocide as an individual matter, chiding us about "littering" and exhorting us to sort our garbage:

https://pluralistic.net/2020/09/14/they-knew/#doing-it-again

We are bombarded by real problems that require urgent solutions that can only be resolved through collective action, which we are told is impossible. This is an objectively frightening state of affairs, and it makes people go nuts.

At the start of this century, in the weeks before 9/11, a message-board poster calling himself Gecko45 went Web 1.0 viral by earnestly bullshitting about his job as a mall security guard, doing battle with heavily armed gangs, human traffickers, and ravening monsters. Gecko45's posts were unhinged: he started out seeking advice for doubling up on body-armor to protect him while he deployed his smoke bombs and his partner assembled a high-powered rifle. Though Gecko45 was apparently sincere, he drew tongue-in-cheek replies from the other posters on GlockTalk, who soon dubbed him the "Mall Ninja":

https://lonelymachines.org/mall-ninjas/

The Mall Ninja professed to patrolling a suburban shopping mall while armed with 15 firearms as he carried out his duties as "Sergeant of a three-man Rapid Tactical Force at one of America’s largest indoor retail shopping areas." His qualifications? Mastery "of three martial arts including ninjitsu, which means I can wear the special boots to climb walls."

The Mall Ninja's fantasy of a single brave individual, defending the sleepy populace from violent, armed mobs is instantly recognizable as an ancestor to today's right wing fantasy of America's cities as "no-go zones" filled with "open air drug markets," patrolled by MS-13 and antifa super-soldiers. And while the Mall Ninja drew derision – even from the kinds of people who hang out on a message board called "GlockTalk" – today, his brand of fantasy wins elections.

On Jacobin, Olly Haynes interviews the political writer Richard Seymour about this phenomenon:

https://jacobin.com/2024/11/disaster-nationalism-fantasies-far-right/

Seymour's latest book is Disaster Nationalism: The Downfall of Liberal Civilization, an exploration of the strange obsessions of the right with imaginary disasters in the midst of real ones:

https://www.versobooks.com/en-gb/products/3147-disaster-nationalism

You know these imaginary disasters: "FEMA death camps, 'great replacement theory,' the 'Great Reset,' fifteen-minute cities, 5G towers being beacons of mind control, and microchips installed in people through vaccines." As Seymour writes, these conspiracy fantasies are propagated by authoritarian regimes and their supporters, especially as real disasters rage around them.

For example, during the Oregon wildfires, people who were threatened by blazing forests that hit 800°C refused to evacuate because they'd been convinced that the fires were set by antifa arsonists in a bid to "wipe out white conservative Christians." They barricaded themselves in their fire-threatened homes, brandishing guns and prepping for the antifa mob.

Seymour says that this "disaster nationalism" "processes disaster in a way that is actually quite enlivening." Confronted with the helplessness of a real disaster that can only be solved through the collective action you've been told is both impossible and a Communist plot, you retreat to an individualistic disaster fantasy that you can play an outsized role in. Every crisis – the climate emergency, poverty, a toxic environment – is replaced by "bad people" and you can go get them.

For authoritarian politicians, a world of bad people at the gates who can only be stopped by "the good guys" makes for great politics. It impels proto-fascist movements to electoral victories all over the world: in the US, of course, but Seymour also sees this phenomenon behind the wins of authoritarian ethno-nationalists in India, Israel, Brazil, and beyond.

I find Seymour's analysis bracing and clarifying. It explains the right's tendency to obsess over the imaginary at the expense of the real. Think of conservatives' obsession with imaginary and hypothetical children, from Qanon's child trafficking conspiracies to the forced birth movement's fixation on "the unborn."

It's not just that these kids don't exist – it's that the right is either indifferent or actively hostile to real children. Qanon peaked at the same time as Trump's "kids in cages" family separation policy, which saw thousands of kids separated from their parents, many forever, as a deliberate policy.

The forced birth movement spent decades fighting to overturn Roe in the name of saving "the unborn" – even as its leaders were killing the expanded Child Tax Credit, the most successful child poverty alleviation measure in American history. Actual children were left to sink into food insecurity and precarity, to be enlisted to work overnight shifts in meat-packing plants, to fall into homelessness – even as the movement celebrated the "culture of life" that would rescue hypothetical children.

Lifting kids out of poverty and building a world where parents can afford to raise as many children as they care to have is a collective endeavor. Firebombing abortion clinics or storming into a pizza parlor with an assault rifle is an individual rescue fantasy that escapes into the world.

Mall Ninja politics are winning.


Hey look at this (permalink)



A Wayback Machine banner.

This day in history (permalink)

#15yrsago EULAs + Arbitration = endless opportunity for abuse https://archive.org/details/TheUnconcionabilityOfArbitrationAgreementsInEulas

#15yrsago Wikipedia’s facts-about-facts make the impossible real https://web.archive.org/web/20091116023225/http://www.make-digital.com/make/vol20/?pg=16

#15yrsago How Britain’s Pirate Finder General is trying to save the Analog Economy at the Digital Economy’s expense https://www.theguardian.com/technology/2009/nov/26/digital-economy-file-sharing-mandelson

#15yrsago Musician’s open letter, sung to Peter Mandelson, Britain’s Pirate-Finder General https://www.youtube.com/watch?v=6_P4lJD_OPI

#15yrsago Scientist explains why climate scientists talk trash https://rifters.com/crawl/?p=886

#10yrsago The clown-prince of DHS checkpoint refusal videos https://www.youtube.com/user/ttoutpost/featured

#10yrsago Song for Shaker: free the last UK Gitmo prisoner! https://standwithshakeraamer.tumblr.com

#10yrsago Vodafone made millions helping GCHQ spy on the world https://arstechnica.com/tech-policy/2014/11/new-snowden-docs-gchqs-ties-to-telco-gave-spies-global-surveillance-reach/

#10yrsago Uberdystopian: the surge-priced nightmare future https://www.vice.com/en/article/one-day-i-will-die-on-mars/

#10yrsago Essential reading: the irreconcilable tension between cybersecurity and national security https://opencanada.org/the-cyber-security-syndrome/

#10yrsago Strong Female Protagonist Book One https://memex.craphound.com/2014/11/26/strong-female-protagonist-book-one/

#10yrsago Youtube nukes 7 hours’ worth of science symposium audio due to background music during lunch break https://memex.craphound.com/2014/11/25/youtube-nukes-7-hours-worth-of-science-symposium-audio-due-to-background-music-during-lunch-break/

#10yrsago El Deafo: moving, fresh YA comic-book memoir about growing up deaf https://memex.craphound.com/2014/11/25/el-deafo-moving-fresh-ya-comic-book-memoir-about-growing-up-deaf/

#5yrsago Four union organizers fired from Google https://arstechnica.com/tech-policy/2019/11/firing-of-four-google-employees-is-retaliatory-activists-say/

#5yrsago 1941 film shows striking animators brandishing a working guillotine at the Disney studio gates https://web.archive.org/web/20191126175152/https://paleofuture.gizmodo.com/that-time-animators-brought-a-guillotine-to-the-disney-1839802702

#5yrsago Christian TV pastor Rick Wiles: Impeachment is a “Jew coup” https://web.archive.org/web/20191127005302/https://www.patheos.com/blogs/progressivesecularhumanist/2019/11/christian-tv-host-warns-followers-trump-impeachment-is-jew-coup/

#5yrsago In defamation case, Elon Musk will testify that “pedo guy” is a common South African phrase and not an accusation of pedophilia https://www.vice.com/en/article/the-california-dmv-is-making-dollar50m-a-year-selling-drivers-personal-information/

#5yrsago Across America, DMVs make millions selling your license data to private eyes — and randos https://www.vice.com/en/article/the-california-dmv-is-making-dollar50m-a-year-selling-drivers-personal-information/

#5yrsago Bloomberg’s $34m presidential campaign ad-buy is 1.1% of the taxes Bernie, Warren and Steyer want him to pay https://newrepublic.com/article/155844/michael-bloomberg-big-hedge-wealth-tax-2020

#5yrsago How to argue with your racist Facebook uncle this Thanksgiving https://action.dccc.org/pdf/knowyourstuffing-2019_print.pdf

#5yrsago Podcast: The Engagement-Maximization Presidency https://ia803104.us.archive.org/30/items/Cory_Doctorow_Podcast_316/Cory_Doctorow_Podcast_316_-_The_Engagement-Maximization_Presidency.mp3

#5yrsago Networked authoritarianism may contain the seeds of its own undoing https://crookedtimber.org/2019/11/25/seeing-like-a-finite-state-machine/

#5yrsago After Katrina, neoliberals replaced New Orleans’ schools with charters, which are now failing https://www.nola.com/news/education/article_0c5918cc-058d-11ea-aa21-d78ab966b579.html

#5yrsago Talking about Disney’s 1964 Carousel of Progress with Bleeding Cool: our lost animatronic future https://bleedingcool.com/pop-culture/castle-talk-cory-doctorow-on-disneys-carousel-of-progress-and-lost-optimism/

#5yrsago Tiny alterations in training data can introduce “backdoors” into machine learning models https://arxiv.org/abs/1903.06638

#5yrsago Leaked documents document China’s plan for mass arrests and concentration-camp internment of Uyghurs and other ethnic minorities in Xinjiang https://www.icij.org/investigations/china-cables/exposed-chinas-operating-manuals-for-mass-internment-and-arrest-by-algorithm/

#5yrsago Hong Kong elections: overconfident Beijing loyalist parties suffer a near-total rout https://www.scmp.com/news/hong-kong/politics/article/3039132/results-blog

#5yrsago Library Socialism: a utopian vision of a sustainable, luxuriant future of circulating abundance https://memex.craphound.com/2019/11/25/library-socialism-a-utopian-vision-of-a-sustaniable-luxuriant-future-of-circulating-abundance/

#1yrago The moral injury of having your work enshittified https://pluralistic.net/2023/11/25/moral-injury/#enshittification


Upcoming appearances (permalink)

A photo of me onstage, giving a speech, holding a mic.



A screenshot of me at my desk, doing a livecast.

Recent appearances (permalink)



A grid of my books with Will Staehle covers.

Latest books (permalink)



A cardboard book box with the Macmillan logo.

Upcoming books (permalink)

  • Picks and Shovels: a sequel to "Red Team Blues," about the heroic era of the PC, Tor Books, February 2025

  • Unauthorized Bread: a middle-grades graphic novel adapted from my novella about refugees, toasters and DRM, FirstSecond, 2025



Colophon (permalink)

Today's top sources:

Currently writing:

  • Enshittification: a nonfiction book about platform decay for Farrar, Straus, Giroux. Friday's progress: 796 words (87388 words total).

  • A Little Brother short story about DIY insulin PLANNING

  • Picks and Shovels, a Martin Hench noir thriller about the heroic era of the PC. FORTHCOMING TOR BOOKS FEB 2025

Latest podcast: Spill, part four (a Little Brother story) https://craphound.com/littlebrother/2024/10/28/spill-part-four-a-little-brother-story/


This work – excluding any serialized fiction – is licensed under a Creative Commons Attribution 4.0 license. That means you can use it any way you like, including commercially, provided that you attribute it to me, Cory Doctorow, and include a link to pluralistic.net.

https://creativecommons.org/licenses/by/4.0/

Quotations and images are not included in this license; they are included either under a limitation or exception to copyright, or on the basis of a separate license. Please exercise caution.


How to get Pluralistic:

Blog (no ads, tracking, or data-collection):

Pluralistic.net

Newsletter (no ads, tracking, or data-collection):

https://pluralistic.net/plura-list

Mastodon (no ads, tracking, or data-collection):

https://mamot.fr/@pluralistic

Medium (no ads, paywalled):

https://doctorow.medium.com/

Twitter (mass-scale, unrestricted, third-party surveillance and advertising):

https://twitter.com/doctorow

Tumblr (mass-scale, unrestricted, third-party surveillance and advertising):

https://mostlysignssomeportents.tumblr.com/tagged/pluralistic

"When life gives you SARS, you make sarsaparilla" -Joey "Accordion Guy" DeVilla

2024-11-24T18:37:46+00:00
Note published on November 24, 2024 at 6:37 PM UTC

social media platform users are going to link offsite. the only question is how obnoxious the platform will make it for them and everyone else.

Screenshot of an Instagram comments section where every comment just says the same single word.

(For context: Instagram prohibits links in post text. This, plus the incentive to inflate comments, has led to the proliferation of tools where creators instruct their followers to comment with a specific word to receive a link in their DMs — in this case, to a pie crust recipe)

2024-11-24T17:41:40+00:00
Read "The Worst of Crypto Is Yet to Come"
Sat, 23 Nov 2024 14:14:10 +0000
Pluralistic: Reverse engineers bust sleazy gig work platform (23 Nov 2024)


Today's links



An EU flag made up of circuit tracery. In the foreground is a huge figure in a top hat, with a sour expression, peering through a magnifying lens. In the figure's palm is a man on a pennyfarthing bike with a courier backpack. Behind them, the EU flag is disintegrating to reveal a code waterfall as seen in the credit sequences of the Wachowskis' 'Matrix' movies. In the opposite corner, a cyclist is entering the frame: she wears Victorian garb, and her head is a 'hacker in a hoodie' cliche image.

Reverse engineers bust sleazy gig work platform (permalink)

A COMPUTER CAN NEVER BE HELD ACCOUNTABLE

THEREFORE A COMPUTER MUST NEVER MAKE A MANAGEMENT DECISION

Supposedly, these lines were included in a 1979 internal presentation at IBM; screenshots of them routinely go viral:

https://twitter.com/SwiftOnSecurity/status/1385565737167724545?lang=en

The reason for their newfound popularity is obvious: the rise and rise of algorithmic management tools, in which your boss is an app. That IBM slide is right: turning an app into your boss allows your actual boss to create an "accountability sink" in which there is no obvious way to blame a human or even a company for your maltreatment:

https://profilebooks.com/work/the-unaccountability-machine/

App-based management-by-bossware turns the bug identified by the unknown author of that IBM slide into a feature. When an app is your boss, it can force you to scab:

https://pluralistic.net/2023/07/30/computer-says-scab/#instawork

Or it can steal your wages:

https://pluralistic.net/2023/04/12/algorithmic-wage-discrimination/#fishers-of-men

But tech giveth and tech taketh away. Digital technology is infinitely flexible: the program that spies on you can be defeated by another program that defeats spying. Every time your algorithmic boss hacks you, you can hack your boss back:

https://pluralistic.net/2022/12/02/not-what-it-does/#who-it-does-it-to

Technologists and labor organizers need one another. Even the most precarious and abused workers can team up with hackers to disenshittify their robo-bosses:

https://pluralistic.net/2021/07/08/tuyul-apps/#gojek

For every abuse technology brings to the workplace, there is a liberating use of technology that workers unleash by seizing the means of computation:

https://pluralistic.net/2024/01/13/solidarity-forever/#tech-unions

One tech-savvy group on the cutting edge of dismantling the Torment Nexus is Algorithms Exposed, a tiny, scrappy group of EU hacker/academics who recruit volunteers to reverse engineer and modify the algorithms that rule our lives as workers and as customers:

https://pluralistic.net/2022/12/10/e2e/#the-censors-pen

Algorithms Exposed have an admirable supply of seemingly boundless energy. Every time I check in with them, I learn that they've spun out yet another special-purpose subgroup. Today, I learned about Reversing Works, a hacking team that reverse engineers gig work apps, revealing corporate wrongdoing that leads to multimillion euro fines for especially sleazy companies.

One such company is Foodinho, an Italian subsidiary of the Spanish food delivery company Glovo. Foodinho/Glovo has been in the crosshairs of Italian labor enforcers since before the pandemic, racking up millions in fines – first for failing to file the proper privacy paperwork disclosing the nature of the data processing in the app that Foodinho riders use to book jobs. Then, after the Italian data commission investigated Foodinho, the company attracted new, much larger fines for its out-of-control surveillance conduct.

As all of this was underway, Reversing Works was conducting its own research into Glovo/Foodinho's app, running it on a simulated Android handset inside a PC so they could peer into the app's data collection and processing. They discovered a nightmarish world of pervasive, illegal worker surveillance, and published their findings a year ago, in November 2023:

https://www.etui.org/sites/default/files/2023-10/Exercising%20workers%20rights%20in%20algorithmic%20management%20systems_Lessons%20learned%20from%20the%20Glovo-Foodinho%20digital%20labour%20platform%20case_2023.pdf
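
For the curious, here's roughly what such a black-box audit looks like in practice: run the app in an Android emulator, point its network traffic at an intercepting proxy, and log which endpoints receive data. What follows is a minimal sketch using the off-the-shelf mitmproxy tool – not Reversing Works' actual tooling – and the "tracker" domain hints are hypothetical examples, not findings from their report:

    # tracker_audit.py - sketch of a traffic audit for a mobile app.
    # Run the app in an Android emulator with its proxy pointed at
    # mitmproxy (and mitmproxy's CA certificate installed), then:
    #   mitmproxy -s tracker_audit.py
    from mitmproxy import http

    # Hypothetical hints; a real audit builds this list from the traffic itself
    TRACKER_HINTS = ("firebase", "appsflyer", "mixpanel", "braze")

    class TrackerAudit:
        def __init__(self):
            self.seen = set()

        def request(self, flow: http.HTTPFlow) -> None:
            host = flow.request.pretty_host
            if host in self.seen:
                return  # log each endpoint only once
            self.seen.add(host)
            label = "third-party?" if any(h in host for h in TRACKER_HINTS) else "unclassified"
            print(f"[{label}] {host}{flow.request.path[:60]}")

    addons = [TrackerAudit()]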

That report reveals all kinds of extremely illegal behavior. Glovo/Foodinho makes its riders' data accessible across national borders, so Glovo managers outside of Italy can access fine-grained surveillance information and sensitive personal information – a major data protection no-no.

Worse, Glovo's app embeds trackers from a huge number of other tech platforms (for chat, analytics, and more), making it impossible for the company to account for all the ways that its riders' data is collected – again, a requirement under Italian and EU data protection law.

All this data collection continues even when riders have clocked out for the day – it's as though your boss followed you home after quitting time and spied on you.

The research also revealed evidence of a secretive worker scoring system that ranked workers based on undisclosed criteria and reserved the best jobs for workers with high scores. This kind of thing is pervasive in algorithmic management, from gig work to Youtube and Tiktok, where performers' videos are routinely suppressed because they crossed some undisclosed line. When an app is your boss, your every paycheck is docked because you violated a policy you're not allowed to know about, because if you knew why your boss was giving you shitty jobs, or refusing to show the video you spent thousands of dollars making to the subscribers who asked to see it, then maybe you could figure out how to keep your boss from detecting your rulebreaking next time.

All this data-collection and processing is bad enough, but what makes it all a thousand times worse is Glovo's data retention policy – they're storing this data on their workers for four years after the worker leaves their employ. That means that mountains of sensitive, potentially ruinous data on gig workers is just lying around, waiting to be stolen by the next hacker that breaks into the company's servers.

Reversing Works's report made quite a splash. A year after its publication, the Italian data protection agency fined Glovo another 5 million euros and ordered them to cut this shit out:

https://reversing.works/posts/2024/11/press-release-reversing.works-investigation-exposes-glovos-data-privacy-violations-marking-a-milestone-for-worker-rights-and-technology-accountability/

As the report points out, Italy is extremely well set up to defend workers' rights from this kind of bossware abuse. Not only do Italian enforcers have all the privacy tools created by the GDPR, the EU's flagship privacy regulation – they also have the benefit of Italy's 1970 Workers' Statute, a visionary piece of legislation that protects workers from automated management practices. Combined with later privacy regulation, it gave Italy's data regulators sweeping powers to defend Italian workers, like Glovo's riders.

Italy is also a leader in recognizing gig workers as de facto employees, despite the tissue-thin pretense that adding an app to your employment means that you aren't entitled to any labor protections. In the case of Glovo, the fine-grained surveillance and reputation scoring were deemed proof that Glovo was employer to its riders.

Reversing Works' report is a fascinating read, especially the sections detailing how the researchers recruited a Glovo rider who allowed them to log in to Glovo's platform on their account.

As Reversing Works points out, this bottom-up approach – where apps are subjected to technical analysis – has real potential for labor organizations seeking to protect workers. Their report established multiple grounds on which a union could seek to hold an abusive employer to account.

But this bottom-up approach also holds out the potential for developing direct-action tools that let workers flex their power, by modifying apps, or coordinating their actions to wring concessions out of their bosses.

After all, the whole reason for the gig economy is to slash wage-bills, by transforming workers into contractors, and by eliminating managers in favor of algorithms. This leaves companies extremely vulnerable, because when workers come together to exercise power, their employer can't rely on middle managers to pressure workers, deal with irate customers, or step in to fill the gap themselves:

https://projects.itforchange.net/state-of-big-tech/changing-dynamics-of-labor-and-capital/

Only by seizing the means of computation can workers and organized labor turn the tables on bossware – both by directly altering the conditions of their employment, and by producing the evidence and tools that regulators can use to force employers to make those alterations permanent.

(Image: EFF, CC BY 3.0, modified)


Hey look at this (permalink)



A Wayback Machine banner.

This day in history (permalink)

#20yrsago Disney turns movie screenings into search-and-harass ordeals https://web.archive.org/web/20041125033545/http://www.defamer.com/hollywood/movies/mpaa/piracy-paranoia-part-ii-the-life-aquatic-screening-026073.php

#20yrsago Copyrights are awarded without economic rationale https://archive.is/C6T1R

#20yrsago Ed Felten’s lecture: “Rip, Mix, Burn, Sue” https://www.cs.princeton.edu/~felten/rip/

#15yrsago Associated Press loves fair use (we just wish they’d share) https://tushnet.blogspot.com/2009/11/actually-ap-likes-fair-use-after-all.html

#15yrsago Two US senators demand publication of secret copyright treaty https://www.keionline.org/39045

#15yrsago Conscious “coma man”‘s words seemingly delivered via discredited “facilitated communications” technique https://www.wired.com/2009/11/houben-communication/

#15yrsago TV vs Web: consumption characteristics https://www.nngroup.com/articles/media-velocity-tv-vs-the-web/

#15yrsago EFF sets sights on abusive EULAs https://www.eff.org/issues/terms-of-abuse

#15yrsago Record exec arrested for refusing to send a tweet asking Bieber-maddened crowd to disperse https://www.abajournal.com/news/article/cops_arrest_record_exec_claim_he_refused_to_send_crowd-control_tweet

#10yrsago Handbook for fighting climate-denialism https://skepticalscience.com/Debunking-Handbook-now-freely-available-download.html

#5yrsago California’s housing bubble is spilling over into poor and exurban neighborhoods, creating waves of crises https://www.nytimes.com/2019/11/21/us/california-housing-crisis-rent.html

#5yrsago Elizabeth Warren calls Zuck and Thiel’s secret Trump White House dinner “corrupt” https://www.commondreams.org/news/2019/11/21/warren-raises-corruption-alarm-after-trump-zuckerberg-and-thiel-hold-secret-white

#5yrsago Ecommerce sites’ mobile templates hide information that shoppers use to save money https://aisel.aisnet.org/icis2019/behavior_is/behavior_is/16/

#5yrsago Lawyer’s long, weird sigfile setting out when and whether he’s willing to talk on the phone goes viral https://www.fitsnews.com/2019/10/30/is-this-the-worlds-most-self-important-email-signature/

#5yrsago The Labour manifesto: transformation of the welfare system, fair conditions for workers, universal housing, home care for elderly, fully funded NHS, fair taxes for the rich https://jacobin.com/2019/11/labour-party-manifesto-jeremy-corbyn/

#5yrsago The Lincoln Library executive director got fired for renting Glenn Beck the original Gettysburg Address https://www.cbsnews.com/chicago/news/lincoln-library-director-fired-after-renting-out-gettysburg-address-to-glenn-beck/

#5yrsago I made Wil Wheaton recite the digits of Pi for four minutes, then a fan set it to music https://soundcloud.com/nicholasland/pi-funk

#5yrsago A poor, Trump-voting Florida town opened a government grocery store to end its food desert, but it’s “not socialism” https://www.washingtonpost.com/nation/2019/11/22/baldwin-florida-food-desert-city-owned-grocery-store/

#5yrsago Peak billionaire: a billionaire tries to purchase a party nomination to outflank anti-billionaires so he can run against another billionaire https://time.com/5735384/capitalism-reckoning-elitism-in-america-2019/

#1yrago Thankful for class consciousness https://pluralistic.net/2023/11/24/coalescence/#solidarnosc

#1yrago Don't Be Evil https://pluralistic.net/2023/11/22/who-wins-the-argument/#corporations-are-people-my-friend


Upcoming appearances (permalink)

A photo of me onstage, giving a speech, holding a mic.



A screenshot of me at my desk, doing a livecast.

Recent appearances (permalink)



A grid of my books with Will Staehle covers.

Latest books (permalink)



A cardboard book box with the Macmillan logo.

Upcoming books (permalink)

  • Picks and Shovels: a sequel to "Red Team Blues," about the heroic era of the PC, Tor Books, February 2025

  • Unauthorized Bread: a middle-grades graphic novel adapted from my novella about refugees, toasters and DRM, FirstSecond, 2025



Colophon (permalink)

Today's top sources:

Currently writing:

  • Enshittification: a nonfiction book about platform decay for Farrar, Straus, Giroux. Friday's progress: 796 words (87388 words total).

  • A Little Brother short story about DIY insulin PLANNING

  • Picks and Shovels, a Martin Hench noir thriller about the heroic era of the PC. FORTHCOMING TOR BOOKS FEB 2025

Latest podcast: Spill, part four (a Little Brother story) https://craphound.com/littlebrother/2024/10/28/spill-part-four-a-little-brother-story/



Thu, 21 Nov 2024 10:41:03 +0000
Pluralistic: Expert agencies and elected legislatures (21 Nov 2024)


Today's links



A pair of balance scales high over the US Capitol Building. On one platform is a shouting banker holding a money-bag. On the other is a lap technician holding a giant testube larger than his torso, filled with various electronic gadgets. He uses tongs to hold a giant atomic motif over the tube's mouth. From behind the Capitol emerges an elephant in GOP logo livery, with the hair of Donald Trump. On the right is a gigantic telescoping platform terminating in a high-tech command chair from which a man observes the balance scales. Behind them is the DC cityscape, stretching off to the horizon.

Expert agencies and elected legislatures (permalink)

Since Trump hijacked the Supreme Court, his backers have achieved many of their policy priorities: legalizing bribery, formalizing forced birth, and – with the Loper Bright case – neutering the expert agencies that regulate business:

https://jacobin.com/2024/07/scotus-decisions-chevron-immunity-loper

What the Supreme Court began, Elon Musk and Vivek Ramaswamy are now poised to finish, through the "Department of Government Efficiency," a fake agency whose acronym ("DOGE") continues Musk's long-running cryptocurrency memecoin pump-and-dump. The new department is absurd – imagine a department devoted to "efficiency" with two co-equal leaders who are both famously incapable of getting along with anyone – but that doesn't make it any less dangerous.

Expert agencies are often all that stands between us and extreme misadventure, even death. The modern world is full of modern questions, the kinds of questions that require a high degree of expert knowledge to answer, but also the kinds of questions whose answers you'd better get right.

You're not stupid, nor are you foolish. You could go and learn everything you need to know to evaluate the firmware on your antilock brakes and decide whether to trust them. You could figure out how to assess the Common Core curriculum for pedagogical soundness. You could learn the material science needed to evaluate the soundness of the joists that hold the roof up over your head. You could acquire the biology and chemistry chops to decide whether you want to trust produce that's been treated with Monsanto's Roundup herbicide. You could do the same for cell biology, virology, and epidemiology and decide whether to wear a mask and/or get an mRNA vaccine and/or buy a HEPA filter.

You could do any of these. You might even be able to do two or three of them. But you can't do all of them, and that list is just a small slice of all the highly technical questions that stand between you and misery or an early grave. Practically speaking, you aren't going to develop your own robust meatpacking hygiene standards, nor your own water treatment program, nor your own Boeing 737 MAX inspection protocol.

Markets don't solve this either. If they did, we wouldn't have to worry about chunks of Boeing jets falling on our heads. The reason we have agencies like the FDA (and enabling legislation like the Pure Food and Drug Act) is that markets failed to keep people from being murdered by profit-seeking snake-oil salesmen and radium suppository peddlers.

These vital questions need to be answered by experts, but that's easier said than done. After all, experts disagree about this stuff. Shortcuts for evaluating these disagreements ("distrust any expert whose employer has a stake in a technical question") are crude and often lead you astray. If you dismiss any expert employed by a firm that wants to bring a new product to market, you will lose out on the expertise of people who are so legitimately excited about the potential improvements of an idea that they quit their jobs and go to work for whomever has the best chance of realizing a product based on it. Sure, that doctor who works for a company with a new cancer cure might just be shilling for a big bonus – but maybe they joined the company because they have an informed, truthful belief that the new drug might really cure cancer.

What's more, the scientific method itself speaks against the idea of there being one, permanent answer to any big question. The method is designed as a process of continual refinement, where new evidence is continuously brought forward and evaluated, and where cherished ideas that are invalidated by new evidence are discarded and replaced with new ideas.

So how are we to survive and thrive in a world of questions we ourselves can't answer, that experts disagree about, and whose answers are only ever provisional?

The scientific method has an answer for this, too: refereed, adversarial peer review. The editors of major journals act as umpires in disputes among experts, exercising their editorial discernment to decide which questions are sufficiently in flux as to warrant taking up, then asking parties who disagree with a novel idea to do their damndest to punch holes in it. This process is by no means perfect, but, like democracy, it's the worst form of knowledge creation except for all others which have been tried.

Expert regulators bring this method to governance. They seek comment on technical matters of public concern, propose regulations based on them, invite all parties to comment on these regulations, weigh the evidence, and then pass a rule. This doesn't always get it right, but when it does work, your medicine doesn't poison you, the bridge doesn't collapse as you drive over it, and your airplane doesn't fall out of the sky.

Expert regulators work with legislators to turn political choices into empirically grounded policies. Think of all the times you've heard about how the gerontocracy that dominates the House and the Senate is incapable of making good internet policy because "they're out of touch and don't understand technology." Even if this is true (and sometimes it is, as when Sen Ted Stevens ranted about the internet being "a series of tubes," not "a big truck"), that doesn't mean that Congress can't make good internet policy.

After all, most Americans can safely drink their tap water, a novelty in human civilization, whose history amounts to short periods of thriving shattered at regular intervals by water-borne plagues. The fact that most of us can safely drink our water, but people who live in Flint (or remote indigenous reservations, or Louisiana's Cancer Alley) can't, tells you that these neighbors of ours are being deliberately poisoned, as we know precisely how not to poison them.

How did we (most of us) get to the point where we can drink the water without shitting our guts out? It wasn't because we elected a bunch of water scientists! I don't know the precise number of microbiologists and water experts who've been elected to either house, but it's very small, and their contribution to good sanitation policy is negligible.

We got there by delegating these decisions to expert agencies. Congress formulates a political policy ("make the water safe") and the expert agency turns that policy into a technical program of regulation and enforcement, and your children live to drink another glass of water tomorrow.

Musk and Ramaswamy have set out to destroy this process. In their Wall Street Journal editorial, they explain that expert regulation is "undemocratic" because experts aren't elected:

https://www.wsj.com/opinion/musk-and-ramaswamy-the-doge-plan-to-reform-government-supreme-court-guidance-end-executive-power-grab-fa51c020

They've vowed to remove "thousands" of regulations, and to fire swathes of federal employees who are in charge of enforcing whatever remains:

https://www.theverge.com/2024/11/20/24301975/elon-musk-vivek-ramaswamy-doge-plan

And all this is meant to take place on an accelerated timeline, between now and July 4, 2026 – a timeline that precludes any meaningful assessment of the likely consequences of abolishing the regulations they'll get rid of.

"Chesterton's Fence" – a thought experiment from the novelist GK Chesterton – is instructive here:

There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, "I don't see the use of this; let us clear it away." To which the more intelligent type of reformer will do well to answer: "If you don't see the use of it, I certainly won't let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it."

A regulation that works might well produce no visible sign that it's working. If your water purification system works, everything is fine. It's only when you get rid of the sanitation system that you discover why it was there in the first place, a realization that might well arrive as you expire in a slick of watery stool with a rectum so prolapsed the survivors can use it as a handle when they drag your corpse to the mass burial pits.

When Musk and Ramaswamy decry the influence of "unelected bureaucrats" on your life as "undemocratic," they sound reasonable. If unelected bureaucrats were permitted to set policy without democratic instruction or oversight, that would be autocracy.

Indeed, it would resemble life on the Tesla factory floor: that most autocratic of institutions, where you are at the mercy of the unelected and unqualified CEO of Tesla, who holds the purely ceremonial title of "Chief Engineer" and who paid the company's true founders to falsely describe him as its founder.

But that's not how it works! At its best, expert regulation turns political choices into policy that reflects the will of democratically accountable, elected representatives. Sometimes this fails, and when it does, the answer is to fix the system – not abolish it.

I have a favorite example of this politics/empiricism fusion. It comes from the UK, where, in 2008, the eminent psychopharmacologist David Nutt was appointed as the government's "drug czar." Parliament had determined to overhaul its system of drug classification, and they wanted expert advice:

https://locusmag.com/2021/05/cory-doctorow-qualia/

To provide this advice, Nutt convened a panel of drug experts from different disciplines and asked them to rate each drug in question on how dangerous it was for its user; for its user's family; and for broader society. These rankings were averaged, and then a statistical model was used to determine which drugs were always very dangerous, no matter which group's safety you prioritized, and which drugs were never very dangerous, no matter which group you prioritized.

Empirically, the "always dangerous" drugs should be in the most restricted category. The "never very dangerous" drugs should be at the other end of the scale. Parliament had asked how to rank drugs by their danger, and for these categories, there were clear, factual answers to Parliament's question.

But there were many drugs that didn't always belong in either category: drugs whose danger score changed dramatically based on whether you were more concerned about individual harms, familial harms, or societal harms. This prioritization has no empirical basis: it's a purely political question.

So Nutt and his panel said to Parliament, "Tell us which of these priorities matter the most to you, and we will tell you where these changeable drugs belong in your schedule of restricted substances." In other words, politicians make political determinations, and then experts turn those choices into empirically supported policies.
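
To make the mechanics concrete, here's a toy version of that exercise – the drugs and harm scores below are invented for illustration, and the panel's real statistical model was far more sophisticated:

    # Toy model of the Nutt panel's method: score each drug's harm to
    # user, family, and society, then sweep across every weighting of
    # those three priorities. Drugs that stay dangerous (or safe) under
    # every weighting are empirical calls; the rest are political ones.
    import itertools

    harms = {  # (user, family, society) harm scores, 0-100, hypothetical
        "drug_a": (90, 85, 80),
        "drug_b": (15, 10, 12),
        "drug_c": (70, 20, 35),
    }

    def weighted(scores, w):
        return sum(s * x for s, x in zip(scores, w)) / sum(w)

    # Every mix of the three priorities, in 10% steps
    weightings = [w for w in itertools.product(range(11), repeat=3) if sum(w)]

    for drug, scores in harms.items():
        vals = [weighted(scores, w) for w in weightings]
        if min(vals) > 60:
            print(drug, "-> always dangerous, whatever you prioritize")
        elif max(vals) < 30:
            print(drug, "-> never very dangerous, whatever you prioritize")
        else:
            print(drug, "-> depends on priorities: a political question")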

This is how policy by "unelected bureaucrats" can still be "democratic."

But the Nutt story doesn't end there. Nutt butted heads with politicians, who kept insisting that he retract factual, evidence-supported statements (like "alcohol is more harmful than cannabis"). Nutt refused to do so. It wasn't that he was telling politicians which decisions to make, but he took it as his duty to point out when those decisions did not reflect the policies they were said to be in support of. Eventually, Nutt was fired for his commitment to empirical truth. The UK press dubbed this "The Nutt Sack Affair" and you can read all about it in Nutt's superb book Drugs Without the Hot Air, an indispensable primer on the drug war and its many harms:

https://www.bloomsbury.com/us/drugs-without-the-hot-air-9780857844989/

Congress can't make these decisions. We don't elect enough water experts, virologists, geologists, oncology researchers, structural engineers, aerospace safety experts, pedagogists, gerontologists, physicists and other experts for Congress to turn its political choices into policy. Mostly, we elect lawyers. Lawyers can do many things, but if you ask a lawyer to tell you how to make your drinking water safe, you will likely die a horrible death.

That's the point. The idea that we should just trust the market to figure this out, or that all regulation should be expressly written into law, is just a way of saying, "you will likely die a horrible death."

Trump – and his hatchet men Musk and Ramaswamy – are not setting out to create evidence-based policy. They are pursuing policy-based evidence, firing everyone capable of telling them how to turn the values they espouse (prosperity and safety for all Americans) into policy.

They dress this up in the language of democracy, but the destruction of the expert agencies that bring the political will of our representatives into our daily lives is anything but democratic. It's a prelude to transforming the nation into a land of epistemological chaos, where you never know what's coming out of your faucet.


Hey look at this (permalink)



A Wayback Machine banner.

This day in history (permalink)

#20yrsago Tech-support generation spends Thanksgiving patching for parents https://web.archive.org/web/20041120052426/http://www.msnbc.msn.com/id/6522314/site/newsweek/

#20yrsago Neal Stephenson’s System of the World concludes the Baroque trilogy https://memex.craphound.com/2004/11/20/neal-stephensons-system-of-the-world-concludes-the-baroque-trilogy/

#20yrsago Internet “Hopkin” meme unravelled https://mike.whybark.com/archives/1951

#20yrsago Full-back HTML tattoo https://web.archive.org/web/20050126081525/http://www.bmezine.com/tattoo/A41118/high/tattoo4.jpg

#15yrsago Owner of trendy Manhattan restaurant Paradou plumbs new depths of evil bad-bossitude https://gothamist.com/food/restaurant-owners-email-to-staff-belongs-in-tyrant-hall-of-fame

#15yrsago Traffic cameras used to harass and limit movement of peaceful protestors https://www.theguardian.com/uk/2009/oct/25/surveillance-police-number-plate-recognition

#10yrsago Firefox switches default search from Google to Yahoo https://www.cnet.com/tech/services-and-software/in-major-shift-firefox-to-use-yahoo-search-by-default-in-us/

#10yrsago Blackpool’s Broadway Hotel fines guests £100 for negative review https://www.bbc.co.uk/news/technology-30100973

#10yrsago Hacker, Hoaxer, Whistleblower, Spy: why only an anthropologist can tell the story of Anonymous https://web.archive.org/web/20141122163653/https://www.spectator.co.uk/books/9373852/the-anonymous-ghost-in-the-machine/

#10yrsago Secret history of the poop emoji https://www.fastcompany.com/3037803/the-oral-history-of-the-poop-emoji-or-how-google-brought-poop-to-america

#10yrsago Gates Foundation mandates open access for all the research it funds https://blogs.nature.com/news/2014/11/gates-foundation-announces-worlds-strongest-policy-on-open-access-research.html

#10yrsago Leaked docs detail Big Oil and Big PR’s plans for an opinion-manipulation platform https://www.vice.com/en/article/a-top-pr-firm-promised-big-oil-software-that-can-convert-average-citizens/

#5yrsago "Out of Home Advertising”: the billboards that spy on you as you move through public spaces https://www.consumerreports.org/electronics-computers/privacy/digital-billboards-are-tracking-you-and-they-want-you-to-see-their-ads-a1117246807/

#5yrsago How to recognize AI snake oil https://www.cs.princeton.edu/~arvindn/talks/MIT-STS-AI-snakeoil.pdf

#5yrsago High prices and debt mean millennials don’t plan to stop renting, and that’s before their parents retire and become dependent on them https://www.businessinsider.com/more-millennials-planning-to-rent-forever-cant-afford-housing-2019-11

#5yrsago Mayor Pete: Obama should have left Chelsea Manning to rot in prison for 35 years https://www.cbsnews.com/amp/news/2020-candidate-pete-buttigieg-troubled-by-clemency-for-chelsea-manning/

#5yrsago In an age of disappearing prison libraries, jail profiteers provide “free” crapgadget tablets that charge prisoners by the minute to read Project Gutenberg ebooks https://appalachianprisonbookproject.org/2019/11/20/how-much-does-it-cost-to-read-a-free-book-on-a-free-tablet/

#5yrsago DoJ to scrap the Paramount antitrust rule that prohibits movie studios from buying or strong-arming movie theaters https://www.reuters.com/article/us-usa-film-antitrust/justice-department-asks-court-to-scrap-decades-old-paramount-antitrust-decrees-idUSKBN1XS2G0/

#5yrsago When Republicans say “How will you pay for Medicare for All?” Democrats should answer: “Mexico will pay for it” https://theintercept.com/2019/11/20/democratic-debate-budget-deficit/

#5yrsago Twitter censures UK Tory Party for changing its blue-check account name to “FactCheckUK” during the prime ministerial debates https://edition.cnn.com/2019/11/19/world/conservative-party-fact-check-twitter-intl/index.html

#1yrago Larry Summers' inflation scare-talk incinerated climate action https://pluralistic.net/2023/11/20/bloodletting/#inflated-ego

#1yrago Naomi Kritzer's "Liberty's Daughter" https://pluralistic.net/2023/11/21/podkaynes-dad-was-a-dick/#age-of-consent


Upcoming appearances (permalink)

A photo of me onstage, giving a speech, holding a mic.



A screenshot of me at my desk, doing a livecast.

Recent appearances (permalink)



A grid of my books with Will Staehle covers.

Latest books (permalink)



A cardboard book box with the Macmillan logo.

Upcoming books (permalink)

  • Picks and Shovels: a sequel to "Red Team Blues," about the heroic era of the PC, Tor Books, February 2025

  • Unauthorized Bread: a middle-grades graphic novel adapted from my novella about refugees, toasters and DRM, FirstSecond, 2025



Colophon (permalink)

Today's top sources:

Currently writing:

  • Enshittification: a nonfiction book about platform decay for Farrar, Straus, Giroux. Today's progress: 812 words (85779 words total).

  • A Little Brother short story about DIY insulin PLANNING

  • Picks and Shovels, a Martin Hench noir thriller about the heroic era of the PC. FORTHCOMING TOR BOOKS FEB 2025

Latest podcast: Spill, part four (a Little Brother story) https://craphound.com/littlebrother/2024/10/28/spill-part-four-a-little-brother-story/



Tue, 19 Nov 2024 10:05:04 +0000
Pluralistic: Forcing Google to spin off Chrome (and Android?) (19 Nov 2024)


Today's links



An early 20th century editorial cartoon depicting the Standard Oil Company an a world-spanning octopus clutching the organs of state - White House, Capitol dome, etc - in its tentacles. It has been altered: to its left, curled within its tentacles, stands an early 20th century cartoon depicting Uncle Sam as a policeman with a billyclub, with a DOJ Antitrust Division crest on his chest. On its right, one of its tentacles clutches an early Google 'I'm Feeling Lucky' button. Its head has been colored in with bands in the colors of the Google logo, surmounted by the Chrome logo. Its eyes have been replaced with the eyes of HAL9000 from Kubrick's '2001: A Space Odyssey.' Nestled in one of its armpits is the Android robot.

Forcing Google to spin off Chrome (and Android?) (permalink)

Last August, a federal judge found that Google is "a monopolist" and has acted "as one to maintain its monopoly." The judge concluded that the vast troves of data Google collects and analyzes were key to its monopoly, and asked the parties to come up with remedies to address this.

Many trustbusters and Google competitors read this and concluded that Google should be forced to share its click and query data. The technical term for this is "apocalyptically stupid." Releasing Google's click and query data into the wild is a privacy Chernobyl waiting to happen. The secrets that we whisper to search engines have the power to destroy us a thousand times over.

Answers like "differential privacy" are promising, but remain largely theoretical at scale. The first large-scale live-fire exercise for these techniques should not be something as high-stakes as Google's click and query data. If anything, we should delete that data:

https://pluralistic.net/2024/08/07/revealed-preferences/#extinguish-v-improve

The last thing we want to do is use antitrust to democratize surveillance so that everyone can spy as efficiently as Google does. In theory, we could sanitize the click and query data by limiting sharing to queries that were made by multiple, independent users (say, only sharing queries that at least 30 users have made), but it's unlikely that this will do much to improve the performance of rival firms' search engines.
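
Here's a minimal sketch of that kind of threshold rule – a crude, k-anonymity-flavored filter, not a real privacy guarantee, since rare-but-shared queries can still be revealing (the log format and the 30-user threshold are just the hypotheticals from the paragraph above):

    # Release only queries observed from at least K distinct users.
    from collections import defaultdict

    K = 30  # the "at least 30 users" threshold

    def sanitize(log):
        """log: iterable of (user_id, query) pairs; returns releasable queries."""
        users_per_query = defaultdict(set)
        for user_id, query in log:
            users_per_query[query].add(user_id)
        return {q for q, users in users_per_query.items() if len(users) >= K}

    # A query made by 50 users survives; a unique query is withheld.
    log = [(u, "weather tomorrow") for u in range(50)] + [(1, "my rare query")]
    print(sanitize(log))  # {'weather tomorrow'}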

Google only retains 18 months' worth of click and query data, so once we cut off its capacity to collect more, whatever advantage it gets from surveillance will begin to decay immediately, falling to zero within 18 months.

(However: the 18 months figure is deceptive, and deliberately so. Google may only retain your queries for 18 months, but it is silent on how long it retains the inferences from those queries. It may discard your "how do I get an abortion in my red state" query after a year and a half, but indefinitely retain the "sought an illegal abortion" label it added to your profile. The US desperately needs a federal consumer privacy law!)

https://pluralistic.net/2023/12/06/privacy-first/#but-not-just-privacy

And just to be clear, there's other Google data that would be very useful to rival search engines, like Google's search index – the trove of pages from the internet. Google already licenses this out, and search engines like Kagi use it to produce substantially superior search results:

https://pluralistic.net/2024/04/04/teach-me-how-to-shruggie/#kagi

The DOJ has just filed its proposal for a remedy, and it's a doozy: forcing Google to sell off Chrome, on the basis that the browser is the source of much of Google's data, and no rival search engine is likely to also have a widely used browser:

https://9to5google.com/2024/11/18/us-doj-google-sell-chrome/

This represents something of a compromise position: the DOJ had initially signalled that it would also demand a selloff of Android, and that's been dropped. I think there's a good case for forcing the sale of Android as a source of data, too.

In competition theory, these selloffs are referred to as "structural separation" – when a company that provides infrastructure to other firms is prohibited from competing with those firms:

https://locusmag.com/2022/03/cory-doctorow-vertically-challenged/

For example, it used to be that banks were prohibited from competing with the companies they loaned money to. After all, if you borrow money from Chase to open a pizzeria, and then Chase opens a pizzeria of its own across the street, you can see how your business would be doomed. You have to make interest payments to Chase, and your rival doesn't, and if Chase wants to, it can subsidize that rival so it can sell pizzas below cost until you're out of business.

Likewise, rail companies were banned from owning freight companies, because otherwise they would destroy the businesses of every freight company that shipped on the railroad.

In theory, you could create fair play rules that required the bank or the railroad to play nice with the business customers that used their platforms, but in practice, there are so many ways of cheating that this would be unenforceable.

This principle is well established in all other areas of business, and we recoil in horror when it is violated. You wouldn't hire a lawyer who was also representing the person who's suing you. Judges (with the abominable exception of Supreme Court justices!) are required to recuse themselves when they have a personal connection with either of the parties in a case they preside over.

One of the weirdest sights of the new Gilded Age is when lawyers for monopoly companies argue that they can play fair with their customers despite their conflicts of interest. Think of Google or Meta, with their ad-tech duopoly. These are companies that purport to represent sellers of ads and buyers of ads in marketplaces they own and control, and where they compete with sellers and/or buyers. These companies suck up 51% of the revenue generated by advertising, while historically, the share taken by ad intermediaries was more like 15%!

https://pluralistic.net/2023/05/25/structural-separation/#america-act

Imagine if you and your partner discovered that the same lawyer was representing both of you in the divorce, while also serving as the judge, and trying to match with both of you on Tinder. Now imagine that when the divorce terms were finalized, the lawyer got your family home.

No Google lawyer would agree to argue on the company's behalf in a case where the judge was employed by the party that's suing them, but they will blithely argue that the reason they're getting 51% of the ad-rake is that they're providing 51% of the value.

Structural separation – like judicial recusal – comprehensively and unarguably resolves all the perceptions and realities of conflict between parties. The fact that platform owners compete with platform users is the source of bottomless corruption, from Google to Amazon:

https://pluralistic.net/2022/11/28/enshittification/#relentless-payola

In other words, I think the DOJ is onto something here. That said, the devil is – as always – in the details. If Google is forced to sell off Chrome, rather than standing it up as its own competing business, things could go very wrong indeed.

Any company that buys Chrome will know that it only has a certain number of years before Google will be permitted to spin up a new browser, and will be incentivized to extract as much value as it can from Chrome over that short period. So a selloff could make Chrome far worse than it is under Google, which, whatever its other failings, is oriented towards long-term dominance, not a quick buck.

But if Google is forced to spin Chrome out as a standalone business, the incentives change. Anyone who buys Chrome will have to run it as a functional business that is designed to survive a future Google competitor – they won't have another business they can fall back on if Google bounces back in five years.

There's a good history of this in antitrust breakups: both Standard Oil and AT&T were forced to spin out, rather than sell off, parts of their empire, and those businesses stood alone and provided competitive pressure. That is, until we stopped enforcing antitrust law and allowed them to start merging again – womp womp.

This raises another question: does any of this matter, given this month's election results? Will Trump's DOJ follow through on whatever priorities the current DOJ sets? That's an open question, but – unlike so many other questions about the coming Trump regime – the answer here isn't necessarily a nightmare.

After all, the Google antitrust case started under Trump, and Trump's pick for Attorney General, the credibly accused sexual predator Matt Gaetz, is a "Khanservative" who breaks with his fellow Trumpians in professing great admiration for Biden's FTC chief Lina Khan, and her project of breaking up corporate monopolies:

https://www.thebignewsletter.com/p/trump-nominates-khanservative-matt

What's more, Trump is a landing strip for a stroke or coronary, which would make JD Vance president – and Vance has also expressed his approval of Khan's work.

Google bosses seem to be betting on Trump's "transactional" (that is, corrupt) style of governance, and his willingness to overrule his own appointees to protect the interests of anyone who flatters or bribes him sufficiently, or convinces the hosts of Fox and Friends to speak on their behalf:

https://www.mediamatters.org/donald-trump/comprehensive-review-revolving-door-between-fox-and-second-trump-administration

That would explain why Google capo Sundar Pichai ordered his employees not to speak out against Trump:

https://www.businessinsider.com/google-employees-memes-poke-fun-company-rules-political-discussion-2024-11

And why he followed up by publicly osculating Trump's sphincter:

https://twitter.com/sundarpichai/status/1854207788290850888

(Image: Cryteria, CC BY 3.0, modified)


Hey look at this (permalink)



A Wayback Machine banner.

This day in history (permalink)

#20yrsago WIPO notes from day three: democracy == ignoring dissent https://web.archive.org/web/20041124024604/https://www.eff.org/deeplinks/archives/002130.php#002130

#15yrsago Britain’s new Internet law — as bad as everyone’s been saying, and worse. Much, much worse. https://memex.craphound.com/2009/11/19/britains-new-internet-law-as-bad-as-everyones-been-saying-and-worse-much-much-worse/

#5yrsago DJ Earworm: 100 songs from the past decade in one mashup https://music.youtube.com/watch?v=UhIte8t6BEg

#5yrsago Leaks reveal how the “Pitbull of PR” helped Purdue Pharma and the Sacklers ignite the opioid crisis https://www.propublica.org/article/inside-purdue-pharma-media-playbook-how-it-planted-the-opioid-anti-story#171238

#5yrsago Beyond the gig economy: “platform co-ops” that run their own apps https://www.vice.com/en/article/worker-owned-apps-are-trying-to-fix-the-gig-economys-exploitation/

#5yrsago Elizabeth Warren’s plan to denazify America https://medium.com/@teamwarren/fighting-back-against-white-nationalist-violence-87b0c550f51f

#5yrsago Youtube told them to use this “royalty-free” music; now rightsholders are forcing ads on their videos and claiming most of the revenue https://torrentfreak.com/royalty-free-music-supplied-by-youtube-results-in-mass-video-demonetization-191118/

#5yrsago The State of South Dakota wants you to know that it’s on meth https://www.washingtonpost.com/health/2019/11/18/meth-were-it-says-south-dakota-new-ad-campaign/

#5yrsago Sand thieves believed to be behind epidemic of Chinese GPS jamming https://www.technologyreview.com/2019/11/15/131940/ghost-ships-crop-circles-and-soft-gold-a-gps-mystery-in-shanghai/

#5yrsago Quiet Rooms: Illinois schools lead the nation in imprisoning very young, disabled children in isolation chambers https://features.propublica.org/illinois-seclusion-rooms/school-students-put-in-isolated-timeouts/#170648

#5yrsago Terabytes of data leaked from an oligarch-friendly offshore bank https://web.archive.org/web/20191117042726/https://data.ddosecrets.com/file/Sherwood/

#5yrsago Naomi Kritzer’s “Catfishing on the CatNet”: an AI caper about the true nature of online friendship https://memex.craphound.com/2019/11/19/naomi-kritzers-catfishing-on-the-catnet-an-ai-caper-about-the-true-nature-of-online-friendship/

#5yrsago Girl on Film: a graphic novel memoir of a life in the arts and the biological basis for memory-formation https://memex.craphound.com/2019/11/19/girl-on-film-a-graphic-novel-memoir-of-a-life-in-the-arts-and-the-biological-basis-for-memory-formation/


Upcoming appearances (permalink)

A photo of me onstage, giving a speech, holding a mic.



A screenshot of me at my desk, doing a livecast.

Recent appearances (permalink)



A grid of my books with Will Stahle covers.

Latest books (permalink)



A cardboard book box with the Macmillan logo.

Upcoming books (permalink)

  • Picks and Shovels: a sequel to "Red Team Blues," about the heroic era of the PC, Tor Books, February 2025

  • Unauthorized Bread: a middle-grades graphic novel adapted from my novella about refugees, toasters and DRM, FirstSecond, 2025



Colophon (permalink)

Today's top sources:

Currently writing:

  • Enshittification: a nonfiction book about platform decay for Farrar, Straus, Giroux. Today's progress: 791 words (84962 words total).

  • A Little Brother short story about DIY insulin PLANNING

  • Picks and Shovels, a Martin Hench noir thriller about the heroic era of the PC. FORTHCOMING TOR BOOKS FEB 2025

Latest podcast: Spill, part four (a Little Brother story) https://craphound.com/littlebrother/2024/10/28/spill-part-four-a-little-brother-story/



2024-11-19T04:44:57+00:00 Fullscreen Open in Tab
Note published on November 19, 2024 at 4:44 AM UTC

people talk about “the media” and “journalists” and they picture the New York Times, Wall Street Journal, or cable news.

and sure, “the media” is the Big Five. but it’s also non-profit newsrooms, independent journalists, international and/or non-US publications, worker-owned media collectives, bloggers, local newsrooms, citizen journalists, podcasters, critics, community radio stations, documentary filmmakers, trade publications, freelancers, fact checkers...

if you applaud attacks and legal intimidation against “the media” and “journalists” because you are picturing the former, remember that it is ultimately the smaller fish who will suffer the most from it.

Illustration of Molly White sitting and typing on a laptop, on a purple background with 'Molly White' in white serif.
2024-11-19T03:58:20+00:00 Fullscreen Open in Tab
Note published on November 19, 2024 at 3:58 AM UTC

don’t let the failures of some (major) news outlets disillusion you with media as a whole. and especially don’t let those failures desensitize you to attacks on free expression.

we can criticize media failures while also fiercely defending media freedom.

it is scary to see people responding to trump’s baseless lawsuits against the NYT and others with a shrug because of their complaints about those outlets’ coverage of him.

we can oppose lawfare against media institutions and also hold those institutions properly to account for poor coverage.

allowing authoritarians to target media institutions you don’t like only works until they decide to start targeting the ones you do — often ones with far fewer resources than the NYT and its ilk.

Illustration of Molly White sitting and typing on a laptop, on a purple background with 'Molly White' in white serif.
2024-11-18T16:30:00+00:00 Fullscreen Open in Tab
Read "Bitcoin's identity crisis"
2024-11-18T09:35:42+00:00 Fullscreen Open in Tab
Importing a frontend Javascript library without a build system

I like writing Javascript without a build system, and yesterday, for the millionth time, I ran into a problem: I needed to figure out how to import a Javascript library in my code without using a build system, and it took FOREVER because the library’s setup instructions assume that you’re using one.

Luckily at this point I’ve mostly learned how to navigate this situation and either successfully use the library or decide it’s too difficult and switch to a different library, so here’s the guide I wish I had to importing Javascript libraries years ago.

I’m only going to talk about using Javascript libraries on the frontend, and only about how to use them in a no-build-system setup.

In this post I’m going to talk about:

  1. the three main types of Javascript files a library might provide (ES Modules, the “classic” global variable kind, and CommonJS)
  2. how to figure out which types of files a Javascript library includes in its build
  3. ways to import each type of file in your code

the three kinds of Javascript files

There are 3 basic types of Javascript files a library can provide:

  1. the “classic” type of file that defines a global variable. This is the kind of file that you can just <script src> and it’ll Just Work. Great if you can get it but not always available
  2. an ES module (which may or may not depend on other files, we’ll get to that)
  3. a “CommonJS” module. This is for Node, you can’t use it in a browser at all without using a build system.

I’m not sure if there’s a better name for the “classic” type but I’m just going to call it “classic”. Also there’s a type called “AMD” but I’m not sure how relevant it is in 2024.

Now that we know the 3 types of files, let’s talk about how to figure out which of these the library actually provides!

where to find the files: the NPM build

Every Javascript library has a build which it uploads to NPM. You might be thinking (like I did originally) – Julia! The whole POINT is that we’re not using Node to build our library! Why are we talking about NPM?

But if you’re using a link from a CDN like https://cdnjs.cloudflare.com/ajax/libs/Chart.js/4.4.1/chart.umd.min.js, you’re still using the NPM build! All the files on the CDNs originally come from NPM.

Because of this, I sometimes like to npm install the library even if I’m not planning to use Node to build my library at all – I’ll just create a new temp folder, npm install there, and then delete it when I’m done. I like being able to poke around in the files in the NPM build on my filesystem, because then I can be 100% sure that I’m seeing everything that the library is making available in its build and that the CDN isn’t hiding something from me.

So let’s npm install a few libraries and try to figure out what types of Javascript files they provide in their builds!

example library 1: chart.js

First let’s look inside Chart.js, a plotting library.

$ cd /tmp/whatever
$ npm install chart.js
$ cd node_modules/chart.js/dist
$ ls *.*js
chart.cjs  chart.js  chart.umd.js  helpers.cjs  helpers.js

This library seems to have 3 basic options:

option 1: chart.cjs. The .cjs suffix tells me that this is a CommonJS file, for use in Node. This means it’s impossible to use it directly in the browser without some kind of build step.

option 2: chart.js. The .js suffix by itself doesn’t tell us what kind of file it is, but if I open it up, I see import '@kurkle/color'; which is an immediate sign that this is an ES module – the import ... syntax is ES module syntax.

option 3: chart.umd.js. “UMD” stands for “Universal Module Definition”, which I think means that you can use this file either with a basic <script src>, CommonJS, or some third thing called AMD that I don’t understand.

how to use a UMD file

When I was using Chart.js I picked Option 3. I just needed to add this to my code:

<script src="./chart.umd.js"> </script>

and then I could use the library through the global Chart variable. Couldn’t be easier. I just copied chart.umd.js into my Git repository so that I didn’t have to worry about NPM or the CDNs going down or anything.
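
For example, here’s roughly what using that global looks like (a sketch – the canvas id and the chart data are made up for illustration):

<canvas id="my-chart"></canvas>
<script src="./chart.umd.js"></script>
<script>
  // chart.umd.js defined a global Chart variable, so we can use it directly
  new Chart(document.getElementById("my-chart"), {
    type: "bar",
    data: {
      labels: ["a", "b", "c"],
      datasets: [{ label: "demo", data: [1, 2, 3] }],
    },
  });
</script>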

the build files aren’t always in the dist directory

A lot of libraries will put their build in the dist directory, but not always! The build files’ location is specified in the library’s package.json.

For example here’s an excerpt from Chart.js’s package.json.

  "jsdelivr": "./dist/chart.umd.js",
  "unpkg": "./dist/chart.umd.js",
  "main": "./dist/chart.cjs",
  "module": "./dist/chart.js",

I think this is saying that if you want to use an ES Module (module) you should use dist/chart.js, but the jsDelivr and unpkg CDNs should use ./dist/chart.umd.js. I guess main is for Node.

chart.js’s package.json also says "type": "module", which according to this documentation tells Node to treat files as ES modules by default. I think it doesn’t tell us specifically which files are ES modules and which ones aren’t but it does tell us that something in there is an ES module.
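
If you want to double-check those fields from the terminal, Node can print them – a sketch, run from the temp folder where we did the npm install:

$ node -p "require('./node_modules/chart.js/package.json').module"
./dist/chart.js
$ node -p "require('./node_modules/chart.js/package.json').type"
module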

example library 2: @atcute/oauth-browser-client

@atcute/oauth-browser-client is a library for logging into Bluesky with OAuth in the browser.

Let’s see what kinds of Javascript files it provides in its build!

$ npm install @atcute/oauth-browser-client
$ cd node_modules/@atcute/oauth-browser-client/dist
$ ls *js
constants.js  dpop.js  environment.js  errors.js  index.js  resolvers.js

It seems like the only plausible root file in here is index.js, which looks something like this:

export { configureOAuth } from './environment.js';
export * from './errors.js';
export * from './resolvers.js';

This export syntax means it’s an ES module. That means we can use it in the browser without a build step! Let’s see how to do that.

how to use an ES module with importmaps

Using an ES module isn’t as easy as just adding a <script src="whatever.js">. Instead, if the ES module has dependencies (like @atcute/oauth-browser-client does) the steps are:

  1. Set up an import map in your HTML
  2. Put import statements like import { configureOAuth } from '@atcute/oauth-browser-client'; in your JS code
  3. Include your JS code in your HTML like this: <script type="module" src="YOURSCRIPT.js"></script>

The reason we need an import map instead of just doing something like import { BrowserOAuthClient } from "./oauth-client-browser.js" is that internally the module has more import statements like import {something} from @atcute/client, and we need to tell the browser where to get the code for @atcute/client and all of its other dependencies.

Here’s what the importmap I used looks like for @atcute/oauth-browser-client:

<script type="importmap">
{
  "imports": {
    "nanoid": "./node_modules/nanoid/bin/dist/index.js",
    "nanoid/non-secure": "./node_modules/nanoid/non-secure/index.js",
    "nanoid/url-alphabet": "./node_modules/nanoid/url-alphabet/dist/index.js",
    "@atcute/oauth-browser-client": "./node_modules/@atcute/oauth-browser-client/dist/index.js",
    "@atcute/client": "./node_modules/@atcute/client/dist/index.js",
    "@atcute/client/utils/did": "./node_modules/@atcute/client/dist/utils/did.js"
  }
}
</script>

Getting these import maps to work is pretty fiddly. I feel like there must be a tool to generate them automatically, but I haven’t found one yet. It’s definitely possible to write a script that automatically generates the importmaps using esbuild’s metafile, but I haven’t done that, and maybe there’s a better way.

I decided to set up importmaps yesterday to get github.com/jvns/bsky-oauth-example to work, so there’s some example code in that repo.

Also someone pointed me to Simon Willison’s download-esm, which will download an ES module and rewrite the imports to point to the JS files directly so that you don’t need importmaps. I haven’t tried it yet but it seems like a great idea.

problems with importmaps: too many files

I did run into some problems with using importmaps in the browser though – it needed to download dozens of Javascript files to load my site, and my webserver in development couldn’t keep up for some reason. I kept seeing files fail to load randomly and then had to reload the page and hope that they would succeed this time.

It wasn’t an issue anymore when I deployed my site to production, so I guess it was a problem with my local dev environment.

Also, one slightly annoying thing about ES modules in general is that you need to be running a webserver to use them. I’m sure this is for a good reason, but it’s easier when you can just open your index.html file without starting a webserver.
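
Any static file server will do for local development. For example, Python ships with one (just one option among many – the port is arbitrary):

$ python3 -m http.server 8000
# then visit http://localhost:8000/ in the browser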

Because of the “too many files” thing I think using ES modules with importmaps in this way isn’t actually that appealing to me, but it’s good to know it’s possible.

how to use an ES module without importmaps

If the ES module doesn’t have dependencies then it’s even easier – you don’t need the importmaps! You can just do the following (both pieces are combined in the sketch after this list):

  • put <script type="module" src="YOURCODE.js"></script> in your HTML. The type="module" is important.
  • put import {whatever} from "https://example.com/whatever.js" in YOURCODE.js
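
Putting those two together, a minimal sketch (the URL and the whatever name are stand-ins for your real dependency):

<!-- index.html -->
<script type="module" src="YOURCODE.js"></script>

// YOURCODE.js
// no importmap needed: the import specifier is a full URL, not a "bare" name
import { whatever } from "https://example.com/whatever.js";

whatever();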

alternative: use esbuild

If you don’t want to use importmaps, you can also use a build system like esbuild. I talked about how to do that in Some notes on using esbuild, but this blog post is about ways to avoid build systems completely so I’m not going to talk about that option here. I do still like esbuild though and I think it’s a good option in this case.

what’s the browser support for importmaps?

CanIUse says that importmaps are in “Baseline 2023: newly available across major browsers” so my sense is that in 2024 that’s still maybe a little bit too new? I think I would use importmaps for some fun experimental code that I only wanted like myself and 12 people to use, but if I wanted my code to be more widely usable I’d use esbuild instead.

example library 3: @atproto/oauth-client-browser

Let’s look at one final example library! This is a different Bluesky auth library than @atcute/oauth-browser-client.

$ npm install @atproto/oauth-client-browser
$ cd node_modules/@atproto/oauth-client-browser/dist
$ ls *js
browser-oauth-client.js  browser-oauth-database.js  browser-runtime-implementation.js  errors.js  index.js  indexed-db-store.js  util.js

Again, it seems like the only real candidate file here is index.js. But this is a different situation from the previous example library! Let’s take a look at index.js:

There’s a bunch of stuff like this in index.js:

__exportStar(require("@atproto/oauth-client"), exports);
__exportStar(require("./browser-oauth-client.js"), exports);
__exportStar(require("./errors.js"), exports);
var util_js_1 = require("./util.js");

This require() syntax is CommonJS syntax, which means that we can’t use this file in the browser at all without some kind of build step (though as we’ll see below, esbuild can convert it, with some limitations).

Also in this library’s package.json it says "type": "commonjs" which is another way to tell it’s CommonJS.

how to use a CommonJS module with esm.sh

Originally I thought it was impossible to use CommonJS modules without learning a build system, but then someone on Bluesky told me about esm.sh! It’s a CDN that will translate anything into an ES Module. skypack.dev does something similar; I’m not sure what the difference is, but one person mentioned that if one doesn’t work, they’ll sometimes try the other.

For @atproto/oauth-client-browser, using it seems pretty simple. I just need to put this in my HTML:

<script type="module" src="script.js"> </script>

and then put this in script.js:

import { BrowserOAuthClient } from "https://esm.sh/@atproto/oauth-client-browser@0.3.0"

It seems to Just Work, which is cool! Of course this is still sort of using a build system – it’s just that esm.sh is running the build instead of me. My main concerns with this approach are:

  • I don’t really trust CDNs to keep working forever – usually I like to copy dependencies into my repository so that they don’t go away for some reason in the future.
  • I’ve heard of some issues with CDNs having security compromises, which scares me.
  • I don’t really understand what esm.sh is doing.

esbuild can also convert CommonJS modules into ES modules

I also learned that you can use esbuild to convert a CommonJS module into an ES module, though there are some limitations – the import { BrowserOAuthClient } from syntax doesn’t work. Here’s a github issue about that.

I think the esbuild approach is probably more appealing to me than the esm.sh approach because it’s a tool that I already have on my computer so I trust it more. I haven’t experimented with this much yet though.
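
Here’s roughly what that conversion looks like on the command line (a sketch – the entry point is the CommonJS index.js from above, and the output filename is arbitrary):

$ npx esbuild node_modules/@atproto/oauth-client-browser/dist/index.js \
    --bundle --format=esm --outfile=oauth-client-browser.esm.js

Then you can import from ./oauth-client-browser.esm.js in a <script type="module">, keeping in mind the named-import limitation mentioned above.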

summary of the three types of files

Here’s a summary of the three types of JS files you might encounter, options for how to use them, and how to identify them.

Unhelpfully, a .js or .min.js file extension could be any of these 3 types, so if the file is something.js you need to do more detective work to figure out what you’re dealing with (there’s a terminal sketch for this after the list).

  1. “classic” JS files
    • How to use it: <script src="whatever.js"></script>
    • Ways to identify it:
      • The website has a big friendly banner in its setup instructions saying “Use this with a CDN!” or something
      • A .umd.js extension
      • Just try to put it in a <script src=... tag and see if it works
  2. ES Modules
    • Ways to use it:
      • If there are no dependencies, just import {whatever} from "./my-module.js" directly in your code
      • If there are dependencies, create an importmap and import {whatever} from "my-module"
      • Use esbuild or any ES Module bundler
    • Ways to identify it:
      • Look for an import or export statement. (not module.exports = ..., that’s CommonJS)
      • An .mjs extension
      • maybe "type": "module" in package.json (though it’s not clear to me which file exactly this refers to)
  3. CommonJS Modules
    • Ways to use it:
      • Use https://esm.sh to convert it into an ES module, like https://esm.sh/@atproto/oauth-client-browser@0.3.0
      • Use a build somehow (??)
    • Ways to identify it:
      • Look for require() or module.exports = ... in the code
      • A .cjs extension
      • maybe "type": "commonjs" in package.json (though it’s not clear to me which file exactly this refers to)
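
A quick way to do some of that detective work from the terminal (a sketch – some-library is a placeholder, and heavily minified files may dodge these patterns):

$ head -20 node_modules/some-library/dist/index.js
$ grep -l -E "module\.exports|require\(" node_modules/some-library/dist/*.js
$ grep -l -E "^(import|export) " node_modules/some-library/dist/*.js

The first grep flags CommonJS suspects; the second flags files that look like ES modules.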

it’s really nice to have ES modules standardized

The main difference between CommonJS modules and ES modules from my perspective is that ES modules are actually a standard. This makes me feel a lot more confident using them, because browsers commit to backwards compatibility for web standards forever – if I write some code using ES modules today, I can feel sure that it’ll still work the same way in 15 years.

It also makes me feel better about using tooling like esbuild because even if the esbuild project dies, because it’s implementing a standard it feels likely that there will be another similar tool in the future that I can replace it with.

the JS community has built a lot of very cool tools

A lot of the time when I talk about this stuff I get responses like “I hate javascript!!! it’s the worst!!!”. But my experience is that there are a lot of great tools for Javascript (I just learned about https://esm.sh yesterday which seems great! I love esbuild!), and that if I take the time to learn how things work I can take advantage of some of those tools and make my life a lot easier.

So the goal of this post is definitely not to complain about Javascript, it’s to understand the landscape so I can use the tooling in a way that feels good to me.

questions I still have

Here are some questions I still have, I’ll add the answers into the post if I learn the answer.

  • Is there a tool that automatically generates importmaps for an ES Module that I have set up locally? (apparently yes: jspm)
  • How can I convert a CommonJS module into an ES module on my computer, the way https://esm.sh does? (apparently esbuild can sort of do this, though named exports don’t work)
  • When people normally build CommonJS modules into regular JS code, what code is doing that? Obviously there are tools like webpack, rollup, esbuild, etc, but do those tools all implement their own JS parsers/static analysis? How many JS parsers are there out there?
  • Is there any way to bundle an ES module into a single file (like atcute-client.js), but so that in the browser I can still import multiple different paths from that file (like both @atcute/client/lexicons and @atcute/client)?

all the tools

Here’s a list of every tool we talked about in this post:

  • npm (for downloading a library’s build to poke around in, even with no build system)
  • esbuild
  • esm.sh
  • skypack.dev
  • download-esm
  • jspm

Writing this post has made me think that even though I usually don’t want to have a build that I run every time I update the project, I might be willing to have a build step (using download-esm or something) that I run only once when setting up the project and never run again except maybe if I’m updating my dependency versions.

that’s all!

Thanks to Marco Rogers who taught me a lot of the things in this post. I’ve probably made some mistakes in this post and I’d love to know what they are – let me know on Bluesky or Mastodon!

Mon, 18 Nov 2024 08:49:52 +0000 Fullscreen Open in Tab
Pluralistic: Harpercollins wants authors to sign away AI training rights (18 Nov 2024)


Today's links



A bookcase. Rupert Murdoch's grinning head rests between a gap in the books. Centered in his forehead is the glaring red eye of HAL 9000 from Kubrick's '2001: A Space Odyssey.'

Harpercollins wants authors to sign away AI training rights (permalink)

Rights don't give you power. People with power can claim rights. Giving a "right" to someone powerless just transfers it to someone more powerful than them. Nowhere is this more visible than in copyright fights, where creative workers are given new rights that are immediately hoovered up by their bosses.

It's not clear whether copyright gives anyone the right to control whether their work is used to train an AI model. It's very common for people (including high ranking officials in entertainment companies, and practicing lawyers who don't practice IP law) to overestimate their understanding of copyright in general, and their knowledge of fair use in particular.

Here's a hint: any time someone says "X can never be fair use," they are wrong and don't know what they're talking about (same goes for "X is always fair use"). Likewise, anyone who says "fair use is assessed solely by considering the 'four factors'" is giving you an iron-clad sign that they do not understand fair use:

https://pluralistic.net/2024/06/27/nuke-first/#ask-questions-never

But let's say for the sake of argument that training a model on someone's work is a copyright violation, and so training is a licensable activity, and AI companies must get permission from rightsholders before they use their copyrighted works to train a model.

Even if that's not how copyright works today, it's how things could work. No one came down off a mountain with two stone tablets bearing the text of 17 USC chiseled in very, very tiny writing. We totally overhauled copyright in 1976, and again in 1998. There've been several smaller alterations since.

We could easily write a new law that requires licensing for AI training, and it's not hard to imagine that happening, given the current confluence of interests among creative workers (who are worried about AI pitchmen's proclaimed intention to destroy their livelihoods) and entertainment companies (who are suing many AI companies).

Creative workers are an essential element of that coalition. Without those workers as moral standard-bearers, it's hard to imagine the cause getting much traction. No one seriously believes that entertainment execs like Warner CEO David Zaslav actually care about creative works – Zaslav is a guy who happily deleted every copy of an unreleased major film with superb early notices because it was worth infinitesimally more as a tax-break than as a work of art:

https://collider.com/coyote-vs-acme-david-zaslav-never-seen/

The activists in this coalition commonly call it "anti AI." But is it? Does David Zaslav – or any of the entertainment execs who are suing AI companies – want to prevent gen AI models from being used in the production of their products? No way – these guys love AI. Zaslav and his fellow movie execs held out against screenwriters demanding control over AI in the writers' room for 148 days, and locked out their actors for another 118 days over the use of AI to replace actors. Studio execs forfeited at least $5 billion in a bid to insist on their right to use AI against workers:

https://sites.lsa.umich.edu/mje/2023/12/06/a-deep-dive-into-the-economic-ripples-of-the-hollywood-strike/

Entertainment businesses love the idea of replacing their workers with AI. Now, that doesn't mean that AI can replace workers: just because your boss can be sold an AI to do your job, it doesn't mean that the AI he buys can actually do your job:

https://pluralistic.net/2024/07/25/accountability-sinks/#work-harder-not-smarter

So if we get the right to refuse to allow our work to be used to train a model, the "anti AI" coalition will fracture. Workers will (broadly) want to exercise that right to prevent AI models from being trained at all, while our bosses will want to exercise that right to be sure that they're paid for AI training, and that they can steer production of the resulting model to maximize the number of workers they can fire after it's done.

Hypothetically, creative workers could simply say to our bosses, "We will not sell you this right to authorize or refuse AI training that Congress just gave us." But our bosses will then say, "Fine, you're fired. We won't hire you for this movie, or record your album, or publish your book."

Given that there are only five major publishers, four major studios, three major labels, two ad-tech companies and one company that controls the whole ebook and audiobook market, a refusal to deal on the part of a small handful of firms effectively dooms you to obscurity.

As Rebecca Giblin and I write in our 2022 book Chokepoint Capitalism, giving more rights to a creative worker who has no bargaining power is like giving your bullied schoolkid more lunch money. No matter how much lunch money you give that kid, the bullies will take it and your kid will remain hungry. To get your kid lunch, you have to clear the bullies away from the gate. You need to make a structural change:

https://chokepointcapitalism.com/

Or, put another way: people with power can claim rights. But giving powerless people more rights doesn't make them powerful – it just transfers those rights to the people they bargain against.

Or, put a third way: "just because you're on their side, it doesn't follow that they're on your side" (h/t Teresa Nielsen Hayden):

https://pluralistic.net/2024/10/19/gander-sauce/#just-because-youre-on-their-side-it-doesnt-mean-theyre-on-your-side

Last month, Penguin Random House, the largest publisher in the history of human civilization, started including a copyright notice in its books advising all comers that it would not permit AI training with the material between the covers:

https://pluralistic.net/2024/10/19/gander-sauce/#just-because-youre-on-their-side-it-doesnt-mean-theyre-on-your-side

At the time, people who don't like AI were very excited about this, even though it was – at the utmost – a purely theatrical gesture. After all, if AI training isn't fair use, then you don't need a notice to turn it into a copyright infringement. If AI training is fair use, it remains fair use even if you add some text to the copyright notice.

But far more important was the fact that the less that Penguin Random House pays its authors, the more it can pay its shareholders and executives. PRH didn't say it wouldn't sell the right to train a model to an AI company – they only said that an AI company that wanted to train a model on its books would have to pay PRH first. In other words, just because you're on their side, it doesn't follow that they're on your side.

When I wrote about PRH and its AI warning, I mentioned that I had personally seen one of the big five publishers hold up a book because a creator demanded a clause in their contract saying their work wouldn't be used to train an AI.

There's a good reason you'd want this in your contract; standard publishing contracts contain bizarrely overreaching language seeking "rights in all media now known and yet to be devised throughout the universe":

https://pluralistic.net/2022/06/19/reasonable-agreement/

But the publisher flat-out refused, and the creator fought and fought, and in the end, it became clear that this was a take-it-or-leave-it situation: the publisher would not include a "no AI training" clause in the contract.

One of the big five publishers is Rupert Murdoch's Harpercollins. Murdoch is famously of the opinion that any kind of indexing or archiving of the work he publishes must require a license. He even demanded to be paid to have his newspapers indexed by search engines:

https://www.inquisitr.com/46786/epic-win-news-corp-likely-to-remove-content-from-google

No surprise, then, that Murdoch sued an AI company over training on Newscorp content:

https://www.theguardian.com/technology/2024/oct/25/unjust-threat-murdoch-and-artists-align-in-fight-over-ai-content-scraping

But Rupert Murdoch doesn't object to the material he publishes being used in AI training, nor is he opposed to the creation and use of models. Murdoch's Harpercollins is now pressuring its authors to sign away their rights to have their works used to train an AI model:

https://bsky.app/profile/kibblesmith.com/post/3laz4ryav3k2w

The deal is not negotiable, and the email demanding that authors opt into it warns that AI might make writers obsolete (remember, even if AI can't do your job, an AI salesman can convince Rupert Murdoch – who is insatiably horny for not paying writers – that an AI is capable of doing your job):

https://www.avclub.com/harpercollins-selling-books-to-ai-language-training

And it's not hard to see why an AI company might want this; after all, if they can lock in an exclusive deal to train a model on Harpercollins' back catalog, their products will exclusively enjoy whatever advantage is to be had in that corpus.

In just a month, we've gone from "publishers won't promise not to train a model on your work" to "publishers are letting an AI company train a model on your work, and will pay you a nonnegotiable pittance for it." The next step is likely to be "publishers require you to sign away the right to train a model on your work."

The right to decide who can train a model on your work does you no good unless it comes with the power to exercise that right.

Rather than campaigning for the right to decide who can train a model on our work, we should be campaigning for the power to decide what terms we contract under. The Writers Guild spent 148 days on the picket line, a remarkable show of solidarity.

But the Guild's real achievement was in securing the right to unionize at all – to create a sectoral bargaining unit that could represent all the writers, writing for all the studios. The achievements of our labor forebears, in the teeth of ruthless armed resistance, resulted in the legalization and formalization of unions. Never forget that the unions that exist today were criminal enterprises once upon a time, and the only reason they exist is because people risked prison, violence and murder to organize when doing so was a crime:

https://pluralistic.net/2024/11/11/rip-jane-mcalevey/#organize

The fights were worth fighting. The screenwriters comprehensively won the right to control AI in the writers' room, because they had power:

https://pluralistic.net/2023/10/01/how-the-writers-guild-sunk-ais-ship/

(Image: Cryteria, CC BY 3.0; Eva Rinaldi, CC BY-SA 2.0; modified)


Hey look at this (permalink)



A Wayback Machine banner.

This day in history (permalink)

#20yrsago The Grey Video https://random.waxy.org/video/grey_video.mov

#20yrsago Canada’s DMCA: why is it a bad idea? https://web.archive.org/web/20050428093632/http://www.digital-copyright.ca/files/The_Truth__Final__clean__Nov_16_04_DAF.html

#20yrsago Internet Archive pages are admissable into evidence https://web.archive.org/web/20041120050733/https://cyberlaw.stanford.edu/packets/vol_2_no_3/002728.shtml/

#15yrsago Demonstrating TSA futility by stabbing dead pigs with pens https://pubmed.ncbi.nlm.nih.gov/17325460/

#15yrsago Chumby One: handsome successor to the cutest computer ever https://www.bunniestudios.com/blog/2009/chumby-one

#15yrsago EFF analyzes the legal creepiness of ACTA, the secret copyright treaty https://www.eff.org/deeplinks/2009/11/stopping-acta-juggernaut

#15yrsago Maricopa deputy steals defender’s paperwork during a court case http://www.heatcity.org/2009/11/judge-orders-officer-to-apologize-or-face-jail-for-taking-attorneys-file.html

#15yrsago SFPD cops from imaginary anti-dance-party squad steal laptops https://web.archive.org/web/20091120193839/https://www.sfweekly.com/2009-11-18/music/s-f-cops-may-have-gone-too-far-in-seizing-dj-gear-at-underground-parties/

#15yrsago David Moles’s “Down and Out in the Magic Kingdom” https://chrononaut.org/fiction/down-and-out-in-the-magic-kingdom/

#15yrsago Apple patents anti-user attention-complianceware https://www.nytimes.com/2009/11/15/business/15digi.html

#15yrsago Struts & Frets: an indie-rock YA novel with heart and authenticity https://memex.craphound.com/2009/11/15/struts-frets-an-indie-rock-ya-novel-with-heart-and-authenticity/

#15yrsago UN goons destroy academic poster describing China’s censorwall https://web.archive.org/web/20091118143608/http://news.idg.no/cw/art.cfm

#15yrsago Viacom’s top lawyer thinks lawsuits were “terrorism” – but he’s learned nothing from the experience https://memex.craphound.com/2009/11/17/viacoms-top-lawyer-thinks-lawsuits-were-terrorism-but-hes-learned-nothing-from-the-experience/

#10yrsago London council threatens freedom of information site for “leaking” info they say doesn’t exist https://www.mysociety.org/2014/11/17/can-you-leak-a-decision-that-has-not-yet-been-made/

#10yrsago What “the worst ride in Disney World” teaches us about media strategy https://web.archive.org/web/20140501000000*/https://passport2dreams.blogspot.com/2014/11/stitchs-great-escape-ten-years.html

#10yrsago Rudy Rucker and Terry Bisson’s “Where the Lost Things Are” https://reactormag.com/where-the-lost-things-are-rudy-rucker-terry-bisson/

#10yrsago Mesmerizing rebuild of a mechanical Fourier calculator https://www.youtube.com/playlist?list=PL0INsTTU1k2UYO9Mck-i5HNqGNW5AeEwq

#10yrsago Rightscorp is running out of money https://torrentfreak.com/anti-piracy-firm-rightscorp-on-the-brink-of-bankruptcy-141114/

#10yrsago Spain’s top piracy-fighter goes to jail for embezzling $50K to spend in brothels https://torrentfreak.com/anti-piracy-boss-spent-50k-in-brothels-to-protect-copyright-141114/

#10yrsago EFF makes DoJ admit it lied in court about FBI secret warrants https://web.archive.org/web/20141115172425/http://www.nationaljournal.com/tech/justice-department-admits-it-misled-court-about-fbi-s-secret-surveillance-program-20141113

#10yrsago 1,000-room palace for Turkey’s President Erdogan will cost twice initial $615M pricetag https://www.bbc.co.uk/news/world-europe-30061107

#10yrsago Director Lexi Alexander explains why she sides with pirates https://torrentfreak.com/why-hollywood-director-lexi-alexander-sides-with-pirates-141118/

#10yrsago Bumfights creator accused of stealing remains of dead children from Thai hospital museum https://www.bangkokpost.com/thailand/general/443837/body-parts-in-dhl-packages-stolen-from-siriraj-museum-hospital-says

#10yrsago New sf story: “Huxleyed into the Full Orwell” https://www.vice.com/en/article/huxleyed-into-the-full-cory-orwell-cory-doctorow/

#10yrsago Whatsapp integrates Moxie Marlinspike’s Textsecure end-to-end crypto https://www.wired.com/2014/11/whatsapp-encrypted-messaging/

#5yrsago Podcast: Jeannette Ng Was Right, John W. Campbell Was a Fascist https://ia803108.us.archive.org/19/items/Cory_Doctorow_Podcast_315/Cory_Doctorow_Podcast_315_-_Jeannette_Ng_Was_Right_John_W_Campbell_Was_a_Fascist.mp3

#5yrsago Coop’s tribute to Randotti Skulls, from the golden age of Haunted Mansion merchandise https://memex.craphound.com/2019/11/18/coops-tribute-to-randotti-skulls-from-the-golden-age-of-haunted-mansion-merchandise/

#5yrsago Beyond antitrust: the anti-monopoly movement and what it stands for https://onezero.medium.com/the-utah-statement-reviving-antimonopoly-traditions-for-the-era-of-big-tech-e6be198012d7

#5yrsago Massive leak of Chinese government documents reveal the “no mercy” plan for Muslims in Xinjiang https://www.nytimes.com/interactive/2019/11/16/world/asia/china-xinjiang-documents.html

#5yrsago 900 pages of leaked Iranian spy cables reveal how America’s failures after invasions allowed Iran to seize control of Iraqi politics https://theintercept.com/2019/11/18/iran-iraq-spy-cables/

#5yrsago Majority of Americans know they’re under constant surveillance, don’t trust the companies doing it, and feel helpless to stop it https://www.pewresearch.org/internet/2019/11/15/americans-and-privacy-concerned-confused-and-feeling-lack-of-control-over-their-personal-information/

#5yrsago Supercut of British voters insulting Boris Johnson on the campaign trail https://twitter.com/TheIDSmiths/status/1194954125772853248

#5yrsago Thanks to an article about why science fiction great John M Ford’s books are out of print, they’re coming back https://slate.com/culture/2019/11/john-ford-science-fiction-fantasy-books.html

#5yrsago Many Chinese manufacturers are behaving as though they have no future https://web.archive.org/web/20191114152903/https://www.chinalawblog.com/2019/11/how-to-conduct-business-with-chinese-companies-that-see-a-dark-future.html

#5yrsago Why are we still treating economics as if it were an empirical science that makes reliable predictions? https://www.nybooks.com/articles/2019/12/05/against-economics/

#5yrsago Uber pretended its drivers were contractors, and now it owes New Jersey $650m in employment tax https://news.bloomberglaw.com/daily-labor-report/uber-hit-with-650-million-employment-tax-bill-in-new-jersey

#5yrsago Labour pledges universal broadband and nationwide fibre, will renationalise the farcical, terrible BT Openreach https://www.bbc.com/news/election-2019-50427369

#5yrsago “Hope literacy,” “functional denial” and other ways to keep going in this difficult time https://www.earthisland.org/journal/index.php/articles/entry/despairing-about-climate-crisis

#5yrsago Hong Kong protesters’ little stonehenges impede police cars https://twitter.com/rhokilpatrick/status/1195350548062654465

#5yrsago Extinction Rebellion floats a drowned house down the Thames https://extinctionrebellion.uk/2019/11/10/act-now-our-house-is-flooding/

#5yrsago After workers tried to form a union, trans rights group ditches most of its staff https://npeu.org/news/2019/11/15/nonprofit-professional-employees-union-files-unfair-labor-practice-against-national-center-for-transgender-equality-leadership-for-retaliation-against-staff-organizing

#1yrago Red-teaming the SCOTUS code of conduct https://pluralistic.net/2023/11/17/red-team-black-robes/#security-theater

#1yrago Big Train managers earn bonuses for greenlighting unsafe cars https://pluralistic.net/2023/11/15/safety-third/#all-the-livelong-day




Colophon (permalink)

Today's top sources:

Currently writing:

  • Enshittification: a nonfiction book about platform decay for Farrar, Straus, Giroux. Friday's progress: 786 words (83404 words total).

  • A Little Brother short story about DIY insulin PLANNING

  • Picks and Shovels, a Martin Hench noir thriller about the heroic era of the PC. FORTHCOMING TOR BOOKS FEB 2025

Latest podcast: Spill, part four (a Little Brother story) https://craphound.com/littlebrother/2024/10/28/spill-part-four-a-little-brother-story/



2024-11-16T00:03:30+00:00 Fullscreen Open in Tab
Published on Citation Needed: "Issue 70 – The Cryptocurrency States of America"
Fri, 15 Nov 2024 08:26:41 +0000 Fullscreen Open in Tab
Pluralistic: Canada's ground-breaking, hamstrung repair and interop laws (15 Nov 2024)


Today's links



An e-waste dump. In the foreground are two waste-barrels. A limp Canadian flag emerges from the left barrel; the nude head and shoulders of a grinning Tony Clement emerge from the right barrel.

Canada's ground-breaking, hamstrung repair and interop laws (permalink)

When the GOP trifecta assumes power in just a few months, they will pass laws, and those laws will be terrible, and they will cast long, long shadows.

This is the story of how another far-right conservative government used its bulletproof majority to pass a wildly unpopular law that continues to stymie progress to this day. It's the story of Canada's Harper Conservative government, and two of its key ministers: Tony Clement and James Moore.

Starting in 1998, the US Trade Rep embarked on a long campaign to force every country in the world to enact a new kind of IP law: an "anticircumvention" law that would criminalize the production and use of tools that allowed people to use their own property in ways that the manufacturer disliked.

This first entered the US statute books with the 1998 passage of the Digital Millennium Copyright Act (DMCA), whose Section 1201 established a new felony for circumventing an "access control." Crucially, DMCA 1201's prohibition on circumvention did not confine itself to protecting copyright.

Circumventing an access control is a felony, even if you never violate copyright law. For example, if you circumvent the access control on your own printer to disable the processes that check to make sure you're using an official HP cartridge, HP can come after you.

You haven't violated any copyright, but the ink-checking code is a copyrighted work, and you had to circumvent a block in order to reach it. Thus, if I provide you a tool to escape HP's ink racket, I commit a felony with penalties of five years in prison and a $500k fine, for a first offense. So it is that HP ink costs more per ounce than the semen of a Kentucky Derby-winning stallion.

This was clearly a bad idea in 1998, though it wasn't yet clear just how bad. In 1998, chips were expensive and underpowered. By 2010, a chip that cost less than a dollar could easily implement a DMCA-triggering access control, and manufacturers of all kinds were adding superfluous chips – whose sole purpose was to transform modification into a felony – to everything from engine parts to smart lightbulbs. This is what Jay Freeman calls "felony contempt of business-model."

So when the Harper government set out to import US-style anticircumvention law to Canada, Canadians were furious. A consultation on the proposal received 6,138 responses opposing the law, and 54 in support:

https://www.michaelgeist.ca/2010/04/copycon-final-numbers/

And yet, James Moore and Tony Clement pressed on. When asked how they could advance such an unpopular bill, opposed by experts and the general public alike, Moore told the International Chamber of Commerce that every objector who responded to his consultation was a "radical extremist" with a "babyish" approach to copyright:

https://www.cbc.ca/news/science/copyright-debate-turns-ugly-1.898216

As is so often the case, history vindicated the babyish radical extremists. The DMCA actually has an official way to keep score on this one. Every three years, the US Copyright Office invites public submissions for exemptions to DMCA 1201, creating a detailed, evidence-backed record of all the legitimate activities that anticircumvention law interferes with.

Unfortunately, "a record" is all we get out of this proceeding. Even though the Copyright Office is allowed to grant "exemptions," these don't mean what you think they mean. The statute is very clear on this: the US Copyright Office is required to grant exemptions for the act of circumvention, but is forbidden from granting exemptions for tools needed to carry out these acts.

This is headspinningly and deliberately obscure, but there's one anecdote from my long crusade against this stupid law that lays it bare. As I mentioned, the US Trade Rep has made the passage of DMCA-like laws in other countries a top priority since the Clinton years. In 2001, the EU adopted the EU Copyright Directive, whose Article 6 copy-pastes the provisions of DMCA 1201.

In 2003, I found myself in Oslo, debating the minister who'd just completed Norway's EUCD implementation. The minister was very proud of his law, boasting that he'd researched the flaws in other countries' anticircumvention laws and addressed them in Norway's law. For example, Norway's law explicitly allowed blind people to bypass access controls on ebooks in order to feed them into text-to-speech engines, Braille printers and other accessibility tools.

I knew where this was going. I asked the minister how this would work in practice. Could someone sell a blind person a tool to break the DRM on their ebooks? Of course not, that's totally illegal. Could a nonprofit blind rights group make such a tool and give it away to blind people? No, that's illegal too. What about hobbyists, could they make the tool for their blind friends? No, not that either.

OK, so how do blind people exercise their right to bypass access controls on ebooks they own so they can actually read them?

Here's how. Each blind person, all by themself, is expected to decompile and reverse-engineer Adobe Reader, locate a vulnerability in the code and write a new program that exploits that vulnerability to extract their ebooks. While blind people are individually empowered to undertake this otherwise prohibited activity, they must do so on their own: they can't share notes with one another on the process. They certainly can't give each other the circumvention program they write in this way:

https://pluralistic.net/2024/10/28/mcbroken/#my-milkshake-brings-all-the-lawyers-to-the-yard

That's what a use-only exemption is: the right to individually put a locked down device up on your own workbench, and, laboring in perfect secrecy, figure out how it works and then defeat the locks that stop you from changing those workings so they benefit you instead of the manufacturer. Without a "tools" exemption, a use exemption is basically a decorative ornament.

So the many use exemptions that the US Copyright Office has granted since 1998 really amount to nothing more than a list of defects in the DMCA that the Copyright Office has painstakingly verified but is powerless to fix. We could probably save everyone a lot of time by scrapping the triennial exemptions process and replacing it with a permanent sign over the doors of the Library of Congress reading "Abandon hope, all ye who enter here."

All of this was well understood by 2010, when Moore and Clement were working on the Canadian version of the DMCA. All of this was explained in eye-watering detail to Moore and Clement, but was roundly ignored. I even had a go at it, publicly picking a fight with Moore on Twitter:

https://web.archive.org/web/20130407101911if_/http://eaves.ca/wp-content/uploads/2010/Conversations%20between%20@doctorow%20and%20@mpjamesmoore.jpg

Moore and Clement rammed their proposal through in the next session of Parliament, passing it as Bill C-11 in 2012:

https://en.wikipedia.org/wiki/Copyright_Modernization_Act

This was something of a grand finale for the pair. Today, Moore is a faceless corporate lawyer, while Clement was last seen grifting covid PPE. (Clement's political career ended abruptly when he sent dick pics to a young woman who turned out to be a pair of sextortionists from Côte d'Ivoire, and he was revealed as a serial sex-pest in the ensuing scandal:)

https://globalnews.ca/news/4646287/tony-clement-instagram-women/

Even though Moore and Clement are long gone from public life, their signature achievement remains a Canadian disgrace, an anchor chain tied around the Canadian economy's throat, and an impediment to Canadian progress.

This week, two excellent new Canadian laws received royal assent: Bill C-244 is a broad, national Right to Repair law; and Bill C-294 is a broad, national interoperability law. Both laws establish the right to circumvent access controls for the purpose of fixing and improving things, something Canadians deserve and need.

But neither law contains a tools exemption. Like the blind people of Norway, a Canadian farmer who wants to attach a made-in-Canada Honeybee tool to their John Deere tractor is required to personally, individually reverse-engineer the John Deere tractor and modify it to talk to the Honeybee accessory, laboring in total secrecy:

https://www.theregister.com/2024/11/12/canada_right_to_repair/

Likewise the Canadian repair tech who fixes a smart speaker or a busted smartphone – they are legally permitted to circumvent in order to torture the device's repair codes out of it or force it to recognize a replacement part, but each technician must personally figure out how to get the device firmware to do this, without discussing it with anyone else.

Thus do Moore and Clement stand athwart Canadian self-reliance and economic development, shouting "STOP!" though both men have been out of politics for years.

There has never been a better time to hit Clement and Moore's political legacy over the head with a shovel and bury it in a shallow grave. Canadian technologists could be making a fortune creating circumvention devices that repair and improve devices marketed by foreign companies.

They could make circumvention tools to allow owners of consoles to play games by Canadian studios that are directly sold to Canadian gamers, bypassing the stores operated by Microsoft, Sony and Nintendo and the 30% commissions they charge. Canadian technologists could be making diagnostic tools that allow every auto mechanic in Canada to fix any car manufactured anywhere in the world.

Canadian cloud servers could power devices long after their US-based manufacturers discontinue support for them, providing income to Canadian cloud companies and continued enjoyment for Canadian owners of these otherwise bricked gadgets.

Canada's gigantic auto parts sector could clone the security chips that foreign auto manufacturers use to block the use of third party parts, and every Canadian could enjoy a steep discount every time they fix their cars. Every farmer could avail themselves of third party parts for their tractors, which they could install themselves, bypassing the $200 service call from a John Deere technician who does nothing more than look over the farmer's own repair and then type an unlock code into the tractor's console.

Every Canadian who prints out a shopping list or their kid's homework could use third party ink that sells for pennies per liter, rather than HP's official colored water that costs more than vintage Veuve Clicquot.

A Canadian e-waste dump generates five low-paid jobs per ton of waste, and that waste itself will poison the land and water for centuries to come. A circumvention-enabled Canadian repair sector could generate 150 skilled, high-paid community jobs that save gadgets and the Earth, all while saving Canadians millions.

Canadians could enjoy the resiliency that comes of having a domestic tech and repair sector, and could count on it through pandemics and Trumpian trade-wars.

All of that and more could be ours, except for the cowardice and greed of Tony Clement and James Moore and the Harper Tories who voted C-11 into law in 2012.

Everything the "radical extremists" warned them of has come true. It's long past time Canadians tore up anticircumvention law and put the interests of the Canadian public and Canadian tech businesses ahead of the rent-seeking enshittification of American Big Tech.

Until we do that, we can keep on passing all the repair and interop laws we want, but each one will be hamstrung by Moore and Clement's "felony contempt of business model" law, and the contempt it showed for the Canadian people.

(Image: https://en.wikipedia.org/wiki/File:Tony_Clement_-_2007-06-30_in_Kearney,_Ontario.JPG, JeffJ, CC BY-SA 3.0; Jorge Franganillo, CC BY 2.0, modified)


Hey look at this (permalink)



A Wayback Machine banner.

This day in history (permalink)

#20yrsago My latest short story — CC-licensed, on Salon, all about gaming https://www.salon.com/2004/11/15/andas_game/

#10yrsago University of Michigan makes up a bunch of non-reasons why it doesn’t have to do record retention https://www.techdirt.com/2014/11/13/michigan-university-claims-its-public-records-retention-period-is-whatever-each-employee-wants-it-to-be/

#10yrsago Amazon and Hachette kiss and make up https://www.nytimes.com/2014/11/14/technology/amazon-hachette-ebook-dispute.html

#5yrsago Tpmfail: a timing attack that can extract keys from secure computing chips in 4-20 minutes https://www.zdnet.com/article/tpm-fail-vulnerabilities-impact-tpm-chips-in-desktops-laptops-servers/

#5yrsago Banned from Youtube, Chinese propagandists are using Pornhub to publish anti-Hong Kong videos https://qz.com/1747617/chinese-users-go-to-pornhub-to-spread-hong-kong-propaganda

#5yrsago Hong Kong protests: “Might as well go down fighting” https://www.theatlantic.com/international/archive/2019/11/escalating-violence-hong-kong-protests/601804/

#5yrsago Activists target Facebookers over “Gold Tier” sponsorship of Kavanaugh event https://www.theverge.com/2019/11/14/20963865/facebook-ads-brett-kavanaugh-federalist-society-employees

#5yrsago Trump’s signature tax break for poor people went to subsidize a superyacht marina in Florida https://www.propublica.org/article/superyacht-marina-west-palm-beach-opportunity-zone-trump-tax-break-to-help-the-poor-went-to-a-rich-gop-donor

#5yrsago Big Tech’s CEOs can’t possibly fix Big Tech https://medium.com/bloomberg-opinion/mark-zuckerberg-is-totally-out-of-his-depth-887682ba70b9

#5yrsago American health care’s life-destroying “surprise bills” are the fault of local, private-equity monopolies https://www.theamericanconservative.com/gougers-r-us-how-private-equity-is-gobbling-up-medical-care/

#5yrsago The poorest half of Americans have nothing left, so now the 1%’s growth comes from the upper middle class https://wolfstreet.com/2019/11/13/how-the-fed-boosts-the-1-even-the-upper-middle-class-loses-share-of-household-wealth-to-the-1-the-bottom-half-gets-screwed/

#1yrago The conservative movement is cracking up https://pluralistic.net/2023/11/14/when-youve-lost-the-fedsoc/#anti-buster-buster


Upcoming appearances (permalink)

A photo of me onstage, giving a speech, holding a mic.



A screenshot of me at my desk, doing a livecast.

Recent appearances (permalink)



A grid of my books with Will Staehle covers.

Latest books (permalink)



A cardboard book box with the Macmillan logo.

Upcoming books (permalink)

  • Picks and Shovels: a sequel to "Red Team Blues," about the heroic era of the PC, Tor Books, February 2025

  • Unauthorized Bread: a middle-grades graphic novel adapted from my novella about refugees, toasters and DRM, FirstSecond, 2025



Colophon (permalink)

Today's top sources:

Currently writing:

  • Enshittification: a nonfiction book about platform decay for Farrar, Straus, Giroux. Today's progress: 792 words (82608 words total).

  • A Little Brother short story about DIY insulin PLANNING

  • Picks and Shovels, a Martin Hench noir thriller about the heroic era of the PC. FORTHCOMING TOR BOOKS FEB 2025

Latest podcast: Spill, part four (a Little Brother story) https://craphound.com/littlebrother/2024/10/28/spill-part-four-a-little-brother-story/


This work – excluding any serialized fiction – is licensed under a Creative Commons Attribution 4.0 license. That means you can use it any way you like, including commercially, provided that you attribute it to me, Cory Doctorow, and include a link to pluralistic.net.

https://creativecommons.org/licenses/by/4.0/

Quotations and images are not included in this license; they are included either under a limitation or exception to copyright, or on the basis of a separate license. Please exercise caution.


How to get Pluralistic:

Blog (no ads, tracking, or data-collection):

Pluralistic.net

Newsletter (no ads, tracking, or data-collection):

https://pluralistic.net/plura-list

Mastodon (no ads, tracking, or data-collection):

https://mamot.fr/@pluralistic

Medium (no ads, paywalled):

https://doctorow.medium.com/

Twitter (mass-scale, unrestricted, third-party surveillance and advertising):

https://twitter.com/doctorow

Tumblr (mass-scale, unrestricted, third-party surveillance and advertising):

https://mostlysignssomeportents.tumblr.com/tagged/pluralistic

"When life gives you SARS, you make sarsaparilla" -Joey "Accordion Guy" DeVilla

2024-11-09T09:24:29+00:00
New microblog with TILs

I added a new section to this site a couple weeks ago called TIL (“today I learned”).

the goal: save interesting tools & facts I posted on social media

One kind of thing I like to post on Mastodon/Bluesky is “hey, here’s a cool thing”, like the great SQLite repl litecli, or the fact that cross compiling in Go Just Works and it’s amazing, or cryptographic right answers, or this great diff tool. Usually I don’t want to write a whole blog post about those things because I really don’t have much more to say than “hey this is useful!”

It started to bother me that I didn’t have anywhere to put those things: for example recently I wanted to use diffdiff and I just could not remember what it was called.

the solution: make a new section of this blog

So I quickly made a new folder called /til/, added some custom styling (I wanted to style the posts to look a little bit like a tweet), made a little Rake task to help me create new posts quickly (rake new_til), and set up a separate RSS Feed for it.

I think this new section of the blog might be more for myself than anything; now when I forget the link to Cryptographic Right Answers I can hopefully look it up on the TIL page. (you might think “julia, why not use bookmarks??” but I have been failing to use bookmarks for my whole life and I don’t see that changing ever; putting things in public is, for whatever reason, much easier for me)

So far it’s been working, often I can actually just make a quick post in 2 minutes which was the goal.

inspired by Simon Willison’s TIL blog

My page is inspired by Simon Willison’s great TIL blog, though my TIL posts are a lot shorter.

I don’t necessarily want everything to be archived

This came about because I spent a lot of time on Twitter, so I’ve been thinking about what I want to do about all of my tweets.

I keep reading the advice to “POSSE” (“post on your own site, syndicate elsewhere”), and while I find the idea appealing in principle, for me part of the appeal of social media is that it’s a little bit ephemeral. I can post polls or questions or observations or jokes and then they can just kind of fade away as they become less relevant.

I find it a lot easier to identify specific categories of things that I actually want to have on a Real Website That I Own:

  • blog posts
  • comics
  • TILs

and then let everything else be kind of ephemeral.

I really believe in the advice to make email lists though – the first two (blog posts & comics) both have email lists and RSS feeds that people can subscribe to if they want. I might add a quick summary of any TIL posts from that week to the “blog posts from this week” mailing list.

2024-11-04T09:18:03+00:00
My IETF 121 Agenda

Here's where you can find me at IETF 121 in Dublin!

Monday

Tuesday

  • 9:30 - 11:30 • oauth
  • 13:00 - 14:30 • spice
  • 16:30 - 17:30 • scim

Thursday

Get in Touch

My Current Drafts

2024-10-31T08:00:10+00:00
ASCII control characters in my terminal

Hello! I’ve been thinking about the terminal a lot and yesterday I got curious about all these “control codes”, like Ctrl-A, Ctrl-C, Ctrl-W, etc. What’s the deal with all of them?

a table of ASCII control characters

Here’s a table of all 33 ASCII control characters, and what they do on my machine (on Mac OS), more or less. There are about a million caveats, but I’ll talk about what it means and all the problems with this diagram that I know about.

You can also view it as an HTML page (I just made it an image so it would show up in RSS).

different kinds of codes are mixed together

The first surprising thing about this diagram to me is that there are 33 control codes, split into (very roughly speaking) these categories:

  1. Codes that are handled by the operating system’s terminal driver, for example when the OS sees a 3 (Ctrl-C), it’ll send a SIGINT signal to the current program
  2. Everything else is passed through to the application as-is and the application can do whatever it wants with them. Some subcategories of those:
    • Codes that correspond to a literal keypress of a key on your keyboard (Enter, Tab, Backspace). For example when you press Enter, your terminal gets sent 13.
    • Codes used by readline: “the application can do whatever it wants” often means “it’ll do more or less what the readline library does, whether the application actually uses readline or not”, so I’ve labelled a bunch of the codes that readline uses
    • Other codes, for example I think Ctrl-X has no standard meaning in the terminal in general but emacs uses it very heavily

There’s no real structure to which codes are in which categories, they’re all just kind of randomly scattered because this evolved organically.

(If you’re curious about readline, I wrote more about readline in entering text in the terminal is complicated, and there are a lot of cheat sheets out there)

there are only 33 control codes

Something else that I find a little surprising is that there are only 33 control codes – A to Z, plus 7 more (@, [, \, ], ^, _, ?). This means that if you want to have for example Ctrl-1 as a keyboard shortcut in a terminal application, that’s not really meaningful – on my machine at least Ctrl-1 is exactly the same thing as just pressing 1, Ctrl-3 is the same as Ctrl-[, etc.
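Here’s a rough way to see where the 33 comes from (roughly speaking, the terminal takes the key you pressed, uppercases it, and flips the 0x40 bit of its ASCII value, so only @, A–Z, [, \, ], ^, _ and ? map into the control range at all):

for key in "@cm[?":
    print(key, "->", ord(key.upper()) ^ 0x40)
# @ -> 0, c -> 3 (what Ctrl-C sends), m -> 13 (Enter),
# [ -> 27 (Escape), ? -> 127 (DEL)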

Also Ctrl+Shift+C isn’t a control code – what it does depends on your terminal emulator. On Linux Ctrl-Shift-X is often used by the terminal emulator to copy or open a new tab or paste for example, it’s not sent to the TTY at all.

Also I use Ctrl+Left Arrow all the time, but that isn’t a control code, instead it sends an ANSI escape sequence (ctrl-[[1;5D) which is a different thing which we absolutely do not have space for in this post.

This “there are only 33 codes” thing is totally different from how keyboard shortcuts work in a GUI where you can have Ctrl+KEY for any key you want.

the official ASCII names aren’t very meaningful to me

Each of these 33 control codes has a name in ASCII (for example 3 is ETX). When all of these control codes were originally defined, they weren’t being used for computers or terminals at all, they were used for the telegraph machine. Telegraph machines aren’t the same as UNIX terminals so a lot of the codes were repurposed to mean something else.

Personally I don’t find these ASCII names very useful, because 50% of the time the name in ASCII has no actual relationship to what that code does on UNIX systems today. So it feels easier to just ignore the ASCII names completely instead of trying to figure out which ones still match their original meaning.

It’s hard to use Ctrl-M as a keyboard shortcut

Another thing that’s a bit weird is that Ctrl-M is literally the same as Enter, and Ctrl-I is the same as Tab, which makes it hard to use those two as keyboard shortcuts.

From some quick research, it seems like some folks do still use Ctrl-I and Ctrl-M as keyboard shortcuts (here’s an example), but to do that you need to configure your terminal emulator to treat them differently than the default.

For me the main takeaway is that if I ever write a terminal application I should avoid Ctrl-I and Ctrl-M as keyboard shortcuts in it.

how to identify what control codes get sent

While writing this I needed to do a bunch of experimenting to figure out what various key combinations did, so I wrote this Python script echo-key.py that will print them out.

There’s probably a more official way but I appreciated having a script I could customize.
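In case it’s useful, here’s a minimal sketch of the idea (a reconstruction, not the actual echo-key.py): put the terminal into cbreak (noncanonical) mode and print the value of every byte that comes in.

import sys
import termios
import tty

fd = sys.stdin.fileno()
old = termios.tcgetattr(fd)  # save the terminal state so we can restore it
try:
    tty.setcbreak(fd)  # noncanonical mode: bytes arrive immediately, no Enter needed
    print("press keys to see their bytes, q to quit")
    while True:
        ch = sys.stdin.buffer.read(1)
        print(f"byte: {ch[0]:3d} repr: {ch!r}")
        if ch == b"q":
            break
finally:
    termios.tcsetattr(fd, termios.TCSADRAIN, old)  # put the terminal back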

caveat: on canonical vs noncanonical mode

Two of these codes (Ctrl-W and Ctrl-U) are labelled in the table as “handled by the OS”, but actually they’re not always handled by the OS: it depends on whether the terminal is in “canonical” mode or “noncanonical” mode.

In canonical mode, programs only get input when you press Enter (and the OS is in charge of deleting characters when you press Backspace or Ctrl-W). But in noncanonical mode the program gets input immediately when you press a key, and the Ctrl-W and Ctrl-U codes are passed through to the program to handle any way it wants.

Generally in noncanonical mode the program will handle Ctrl-W and Ctrl-U similarly to how the OS does, but there are some small differences. (There’s a small sketch of how a program switches modes after the examples below.)

Some examples of programs that use canonical mode:

  • probably pretty much any noninteractive program, like grep or cat
  • git, I think

Examples of programs that use noncanonical mode:

  • python3, irb and other REPLs
  • your shell
  • any full screen TUI like less or vim
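Here’s roughly what switching into noncanonical mode looks like using Python’s termios module (a sketch of the mechanism, not any specific program’s code):

import sys
import termios

fd = sys.stdin.fileno()
attrs = termios.tcgetattr(fd)
attrs[3] &= ~termios.ICANON  # attrs[3] is lflag: turn off canonical mode
attrs[3] &= ~termios.ECHO    # most full-screen programs turn off echo too
# a program like vim would also clear ISIG here to take over Ctrl-C
termios.tcsetattr(fd, termios.TCSANOW, attrs)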

caveat: all of the “OS terminal driver” codes are configurable with stty

I said that Ctrl-C sends SIGINT but technically this is not necessarily true, if you really want to you can remap all of the codes labelled “OS terminal driver”, plus Backspace, using a tool called stty, and you can view the mappings with stty -a.

Here are the mappings on my machine right now:

$ stty -a
cchars: discard = ^O; dsusp = ^Y; eof = ^D; eol = <undef>;
	eol2 = <undef>; erase = ^?; intr = ^C; kill = ^U; lnext = ^V;
	min = 1; quit = ^\; reprint = ^R; start = ^Q; status = ^T;
	stop = ^S; susp = ^Z; time = 0; werase = ^W;

I have personally never remapped any of these and I cannot imagine a reason I would (I think it would be a recipe for confusion and disaster for me), but I asked on Mastodon and people said the most common reasons they used stty were:

  • fix a broken terminal with stty sane
  • set stty erase ^H to change how Backspace works
  • set stty ixoff
  • some people even map SIGINT to a different key, like their DELETE key

caveat: on signals

Two signals caveats:

  1. If the ISIG terminal mode is turned off, then the OS won’t send signals. For example vim turns off ISIG
  2. Apparently on BSDs, there’s an extra control code (Ctrl-T) which sends SIGINFO

You can see which terminal modes a program is setting using strace like this, terminal modes are set with the ioctl system call:

$ strace -tt -o out  vim
$ grep ioctl out | grep SET

here are the modes vim sets when it starts (ISIG and ICANON are missing!):

17:43:36.670636 ioctl(0, TCSETS, {c_iflag=IXANY|IMAXBEL|IUTF8,
c_oflag=NL0|CR0|TAB0|BS0|VT0|FF0|OPOST, c_cflag=B38400|CS8|CREAD,
c_lflag=ECHOK|ECHOCTL|ECHOKE|PENDIN, ...}) = 0

and it resets the modes when it exits:

17:43:38.027284 ioctl(0, TCSETS, {c_iflag=ICRNL|IXANY|IMAXBEL|IUTF8,
c_oflag=NL0|CR0|TAB0|BS0|VT0|FF0|OPOST|ONLCR, c_cflag=B38400|CS8|CREAD,
c_lflag=ISIG|ICANON|ECHO|ECHOE|ECHOK|IEXTEN|ECHOCTL|ECHOKE|PENDIN, ...}) = 0

I think the specific combination of modes vim is using here might be called “raw mode”, man cfmakeraw talks about that.

there are a lot of conflicts

Related to “there are only 33 codes”, there are a lot of conflicts where different parts of the system want to use the same code for different things, for example by default Ctrl-S will freeze your screen, but if you turn that off then readline will use Ctrl-S to do a forward search.

Another example is that on my machine sometimes Ctrl-T will send SIGINFO and sometimes it’ll transpose 2 characters and sometimes it’ll do something completely different depending on:

  • whether the program has ISIG set
  • whether the program uses readline / imitates readline’s behaviour

caveat: on “backspace” and “other backspace”

In this diagram I’ve labelled code 127 as “backspace” and 8 as “other backspace”. Uh, what?

I think this was the single biggest topic of discussion in the replies on Mastodon – apparently there’s a LOT of history to this and I’d never heard of any of it before.

First, here’s how it works on my machine:

  1. I press the Backspace key
  2. The TTY gets sent the byte 127, which is called DEL in ASCII
  3. the OS terminal driver and readline both have 127 mapped to “backspace” (so it works both in canonical mode and noncanonical mode)
  4. The previous character gets deleted

If I press Ctrl+H, it has the same effect as Backspace if I’m using readline, but in a program without readline support (like cat for instance), it just prints out ^H.

Apparently Step 2 above is different for some folks – their Backspace key sends the byte 8 instead of 127, and so if they want Backspace to work then they need to configure the OS (using stty) to set erase = ^H.

There’s an incredible section of the Debian Policy Manual on keyboard configuration that describes how Delete and Backspace should work according to Debian policy, which seems very similar to how it works on my Mac today. My understanding (via this mastodon post) is that this policy was written in the 90s because there was a lot of confusion about what Backspace should do in the 90s and there needed to be a standard to get everything to work.

There’s a bunch more historical terminal stuff here but that’s all I’ll say for now.

there’s probably a lot more diversity in how this works

I’ve probably missed a bunch more ways that “how it works on my machine” might be different from how it works on other people’s machines, and I’ve probably made some mistakes about how it works on my machine too. But that’s all I’ve got for today.

Some more stuff I know that I’ve left out: according to stty -a Ctrl-O is “discard”, Ctrl-R is “reprint”, and Ctrl-Y is “dsusp”. I have no idea how to make those actually do anything (pressing them does not do anything obvious, and some people have told me what they used to do historically but it’s not clear to me if they have a use in 2024), and a lot of the time in practice they seem to just be passed through to the application anyway so I just labelled Ctrl-R and Ctrl-Y as readline.

not all of this is that useful to know

Also I want to say that I think the contents of this post are kind of interesting but I don’t think they’re necessarily that useful. I’ve used the terminal pretty successfully every day for the last 20 years without knowing literally any of this – I just knew what Ctrl-C, Ctrl-D, Ctrl-Z, Ctrl-R, Ctrl-L did in practice (plus maybe Ctrl-A, Ctrl-E and Ctrl-W) and did not worry about the details for the most part, and that was almost always totally fine except when I was trying to use xterm.js.

But I had fun learning about it so maybe it’ll be interesting to you too.

2024-10-27T07:47:04+00:00
Using less memory to look up IP addresses in Mess With DNS

I’ve been having problems for the last 3 years or so where Mess With DNS periodically runs out of memory and gets OOM killed.

This hasn’t been a big priority for me: usually it just goes down for a few minutes while it restarts, and it only happens once a day at most, so I’ve just been ignoring it. But last week it started actually causing a problem so I decided to look into it.

This was kind of a winding road where I learned a lot, so here’s a table of contents:

there’s about 100MB of memory available

I run Mess With DNS on a VM with about 465MB of RAM, which according to ps aux (the RSS column) is split up something like:

  • 100MB for PowerDNS
  • 200MB for Mess With DNS
  • 40MB for hallpass

That leaves about 110MB of memory free.

A while back I set GOMEMLIMIT to 250MB to try to make sure the garbage collector ran if Mess With DNS used more than 250MB of memory, and I think this helped but it didn’t solve everything.

the problem: OOM killing the backup script

A few weeks ago I started backing up Mess With DNS’s database for the first time using restic.

This has been working okay, but since Mess With DNS operates without much extra memory I think restic sometimes needed more memory than was available on the system, and so the backup script sometimes got OOM killed.

This was a problem because

  1. backups might be corrupted sometimes
  2. more importantly, restic takes out a lock when it runs, and so I’d have to manually do an unlock if I wanted the backups to continue working. Doing manual work like this is the #1 thing I try to avoid with all my web services (who has time for that!) so I really wanted to do something about it.

There’s probably more than one solution to this, but I decided to try to make Mess With DNS use less memory so that there was more available memory on the system, mostly because it seemed like a fun problem to try to solve.

what’s using memory: IP addresses

I’d run a memory profile of Mess With DNS a bunch of times in the past, so I knew exactly what was using most of Mess With DNS’s memory: IP addresses.

When it starts, Mess With DNS loads this database where you can look up the ASN of every IP address into memory, so that when it receives a DNS query it can take the source IP address like 74.125.16.248 and tell you that IP address belongs to GOOGLE.

This database by itself used about 117MB of memory, and a simple du told me that was too much – the original text files were only 37MB!

$ du -sh *.tsv
26M	ip2asn-v4.tsv
11M	ip2asn-v6.tsv

The way it worked originally is that I had an array of these:

type IPRange struct {
	StartIP net.IP
	EndIP   net.IP
	Num     int
	Name    string
	Country string
}

and I searched through it with a binary search to figure out if any of the ranges contained the IP I was looking for. Basically the simplest possible thing and it’s super fast, my machine can do about 9 million lookups per second.
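To illustrate (a Python sketch of the same approach, with made-up ranges instead of the real ip2asn data): sort the ranges by start IP, bisect on the query address, then check the end IP.

import bisect
import ipaddress

def as_int(s):
    return int(ipaddress.ip_address(s))

# made-up example ranges, sorted by start IP: (start, end, name)
ranges = [
    (as_int("1.0.0.0"), as_int("1.0.0.255"), "EXAMPLE-ASN-1"),
    (as_int("8.8.8.0"), as_int("8.8.8.255"), "EXAMPLE-ASN-2"),
]
starts = [r[0] for r in ranges]

def find_asn(ip):
    n = as_int(ip)
    i = bisect.bisect_right(starts, n) - 1  # last range whose start <= n
    if i >= 0 and n <= ranges[i][1]:        # is n before that range's end?
        return ranges[i][2]
    return None

print(find_asn("8.8.8.8"))  # EXAMPLE-ASN-2
print(find_asn("9.9.9.9"))  # None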

attempt 1: use SQLite

I’ve been using SQLite recently, so my first thought was – maybe I can store all of this data on disk in an SQLite database, give the tables an index, and that’ll use less memory.

So I:

  • wrote a quick Python script using sqlite-utils to import the TSV files into an SQLite database
  • adjusted my code to select from the database instead

This did solve the initial memory goal (after a GC it now hardly used any memory at all because the table was on disk!), though I’m not sure how much GC churn this solution would cause if we needed to do a lot of queries at once. I did a quick memory profile and it seemed to allocate about 1KB of memory per lookup.

Let’s talk about the issues I ran into with using SQLite though.

problem: how to store IPv6 addresses

SQLite doesn’t have support for big integers and IPv6 addresses are 128 bits, so I decided to store them as text. I think BLOB might have been better, I originally thought BLOBs couldn’t be compared but the sqlite docs say they can.

I ended up with this schema:

CREATE TABLE ipv4_ranges (
   start_ip INTEGER NOT NULL,
   end_ip INTEGER NOT NULL,
   asn INTEGER NOT NULL,
   country TEXT NOT NULL,
   name TEXT NOT NULL
);
CREATE TABLE ipv6_ranges (
   start_ip TEXT NOT NULL,
   end_ip TEXT NOT NULL,
   asn INTEGER,
   country TEXT,
   name TEXT
);
CREATE INDEX idx_ipv4_ranges_start_ip ON ipv4_ranges (start_ip);
CREATE INDEX idx_ipv6_ranges_start_ip ON ipv6_ranges (start_ip);
CREATE INDEX idx_ipv4_ranges_end_ip ON ipv4_ranges (end_ip);
CREATE INDEX idx_ipv6_ranges_end_ip ON ipv6_ranges (end_ip);

Also I learned that Python has an ipaddress module, so I could use ipaddress.ip_address(s).exploded to make sure that the IPv6 addresses were expanded so that a string comparison would compare them properly.
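For example, here’s what exploded does – once every address is padded out to the same 39-character form, comparing the strings lexicographically gives the same answer as comparing the addresses numerically:

import ipaddress

print(ipaddress.ip_address("2607:f8b0::200e").exploded)
# -> 2607:f8b0:0000:0000:0000:0000:0000:200e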

problem: it’s 500x slower

I ran a quick microbenchmark, something like this. It printed out that it could look up 17,000 IPv6 addresses per second, and similarly for IPv4 addresses.

This was pretty discouraging – being able to look up 17k addresses per second is kind of fine (Mess With DNS does not get a lot of traffic), but I compared it to the original binary search code and the original code could do 9 million per second.

	ips := []net.IP{}
	count := 20000
	for i := 0; i < count; i++ {
		// create a random IPv6 address
		bytes := randomBytes()
		ip := net.IP(bytes[:])
		ips = append(ips, ip)
	}
	now := time.Now()
	success := 0
	for _, ip := range ips {
		_, err := ranges.FindASN(ip)
		if err == nil {
			success++
		}
	}
	fmt.Println(success)
	elapsed := time.Since(now)
	fmt.Println("number per second", float64(count)/elapsed.Seconds())

time for EXPLAIN QUERY PLAN

I’d never really done an EXPLAIN in sqlite, so I thought it would be a fun opportunity to see what the query plan was doing.

sqlite> explain query plan select * from ipv6_ranges where '2607:f8b0:4006:0824:0000:0000:0000:200e' BETWEEN start_ip and end_ip;
QUERY PLAN
`--SEARCH ipv6_ranges USING INDEX idx_ipv6_ranges_end_ip (end_ip>?)

It looks like it’s just using the end_ip index and not the start_ip index, so maybe it makes sense that it’s slower than the binary search.

I tried to figure out if there was a way to make SQLite use both indexes, but I couldn’t find one and maybe it knows best anyway.

At this point I gave up on the SQLite solution, I didn’t love that it was slower and also it’s a lot more complex than just doing a binary search. I felt like I’d rather keep something much more similar to the binary search.

A few things I tried with SQLite that did not cause it to use both indexes:

  • using a compound index instead of two separate indexes
  • running ANALYZE
  • using INTERSECT to intersect the results of start_ip < ? and ? < end_ip. This did make it use both indexes, but it also seemed to make the query literally 1000x slower, probably because it needed to create the results of both subqueries in memory and intersect them.

attempt 2: use a trie

My next idea was to use a trie, because I had some vague idea that maybe a trie would use less memory, and I found this library called ipaddress-go that lets you look up IP addresses using a trie.

I tried using it (here’s the code), but I think I was doing something wildly wrong, because compared to my naive array + binary search:

  • it used WAY more memory (800MB to store just the IPv4 addresses)
  • it was a lot slower to do the lookups (it could do only 100K/second instead of 9 million/second)

I’m not really sure what went wrong here but I gave up on this approach and decided to just try to make my array use less memory and stick to a simple binary search.

some notes on memory profiling

One thing I learned about memory profiling is that you can use runtime package to see how much memory is currently allocated in the program. That’s how I got all the memory numbers in this post. Here’s the code:

func memusage() {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("Alloc = %v MiB\n", m.Alloc/1024/1024)
	// write mem.prof
	f, err := os.Create("mem.prof")
	if err != nil {
		log.Fatal(err)
	}
	pprof.WriteHeapProfile(f)
	f.Close()
}

Also I learned that if you use pprof to analyze a heap profile there are two ways to analyze it: you can pass either --alloc_space or --inuse_space to go tool pprof. I don’t know how I didn’t realize this before, but alloc_space will tell you about everything that was ever allocated, and inuse_space will just include memory that’s currently in use.

Anyway I ran go tool pprof -pdf --inuse_space mem.prof > mem.pdf a lot. Also every time I use pprof I find myself referring to my own intro to pprof, it’s probably the blog post I wrote that I use the most often. I should add --alloc_space and --inuse_space to it.

attempt 3: make my array use less memory

I was storing my ip2asn entries like this:

type IPRange struct {
	StartIP net.IP
	EndIP   net.IP
	Num     int
	Name    string
	Country string
}

I had 3 ideas for ways to improve this:

  1. There was a lot of repetition of the Name and the Country, because a lot of IP ranges belong to the same ASN
  2. net.IP is a []byte under the hood, which felt like it involved an unnecessary pointer, was there a way to inline it into the struct?
  3. Maybe I didn’t need both the start IP and the end IP, often the ranges were consecutive so maybe I could rearrange things so that I only had the start IP

idea 3.1: deduplicate the Name and Country

I figured I could store the ASN info in an array, and then just store the index into the array in my IPRange struct. Here are the structs so you can see what I mean:

type IPRange struct {
	StartIP netip.Addr
	EndIP   netip.Addr
	ASN     uint32
	Idx     uint32
}

type ASNInfo struct {
	Country string
	Name    string
}

type ASNPool struct {
	asns   []ASNInfo
	lookup map[ASNInfo]uint32
}

This worked! It brought memory usage from 117MB to 65MB – a 50MB savings. I felt good about this.

Here’s all of the code for that part.
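The same interning idea, sketched in Python for illustration (the real code is Go, linked above): each unique (country, name) pair is stored once, and every IP range keeps just a small integer index into the pool.

class ASNPool:
    def __init__(self):
        self.infos = []   # unique (country, name) pairs
        self.lookup = {}  # (country, name) -> index into infos

    def intern(self, country, name):
        key = (country, name)
        if key not in self.lookup:
            self.lookup[key] = len(self.infos)
            self.infos.append(key)
        return self.lookup[key]

pool = ASNPool()
idx = pool.intern("US", "GOOGLE")
assert pool.intern("US", "GOOGLE") == idx  # repeated ASNs share one entry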

how big are ASNs?

As an aside – I’m storing the ASN in a uint32, is that right? I looked in the ip2asn file and the biggest one seems to be 401307, though there are a few lines that say 4294901931 which is much bigger, but also are just inside the range of a uint32. So I can definitely use a uint32.

59.101.179.0	59.101.179.255	4294901931	Unknown	AS4294901931

idea 3.2: use netip.Addr instead of net.IP

It turns out that I’m not the only one who felt that net.IP was using an unnecessary amount of memory – in 2021 the folks at Tailscale released a new IP address library for Go which solves this and many other issues. They wrote a great blog post about it.

I discovered (to my delight) that not only does this new IP address library exist and do exactly what I want, it’s also now in the Go standard library as netip.Addr. Switching to netip.Addr was very easy and saved another 20MB of memory, bringing us to 46MB.

I didn’t try my third idea (remove the end IP from the struct) because I’d already been programming for long enough on a Saturday morning and I was happy with my progress.

It’s always such a great feeling when I think “hey, I don’t like this, there must be a better way” and then immediately discover that someone has already made the exact thing I want, thought about it a lot more than me, and implemented it much better than I would have.

all of this was messier in real life

Even though I tried to explain this in a simple linear way “I tried X, then I tried Y, then I tried Z”, that’s kind of a lie – I always try to take my actual debugging process (total chaos) and make it seem more linear and understandable because the reality is just too annoying to write down. It’s more like:

  • try sqlite
  • try a trie
  • second guess everything that I concluded about sqlite, go back and look at the results again
  • wait what about indexes
  • very very belatedly realize that I can use runtime to check how much memory everything is using, start doing that
  • look at the trie again, maybe I misunderstood everything
  • give up and go back to binary search
  • look at all of the numbers for tries/sqlite again to make sure I didn’t misunderstand

A note on using 512MB of memory

Someone asked why I don’t just give the VM more memory. I could very easily afford to pay for a VM with 1GB of memory, but I feel like 512MB really should be enough (and really that 256MB should be enough!) so I’d rather stay inside that constraint. It’s kind of a fun puzzle.

a few ideas from the replies

Folks had a lot of good ideas I hadn’t thought of. Recording them as inspiration if I feel like having another Fun Performance Day at some point.

  • Try Go’s unique package for the ASNPool. Someone tried this and it uses more memory, probably because Go’s pointers are 64 bits
  • Try compiling with GOARCH=386 to use 32-bit pointers to save space (maybe in combination with using unique!)
  • It should be possible to store all of the IPv6 addresses in just 64 bits, because only the first 64 bits of the address are public
  • Interpolation search might be faster than binary search since IP addresses are numeric
  • Try the MaxMind db format with mmdbwriter or mmdbctl
  • Tailscale’s art routing table package

the result: saved 70MB of memory!

I deployed the new version and now Mess With DNS is using less memory! Hooray!

A few other notes:

  • lookups are a little slower – in my microbenchmark they went from 9 million lookups/second to 6 million, maybe because I added a little indirection. Using less memory and a little more CPU seemed like a good tradeoff though.
  • it’s still using more memory than the raw text files do (46MB vs 37MB), I guess pointers take up space and that’s okay.

I’m honestly not sure if this will solve all my memory problems, probably not! But I had fun, I learned a few things about SQLite, I still don’t know what to think about tries, and it made me love binary search even more than I already did.

2024-10-07T09:19:57+00:00
Some notes on upgrading Hugo

Warning: this is a post about very boring yakshaving, probably only of interest to people who are trying to upgrade Hugo from a very old version to a new version. But what are blogs for if not documenting one’s very boring yakshaves from time to time?

So yesterday I decided to try to upgrade Hugo. There’s no real reason to do this – I’ve been using Hugo version 0.40 to generate this blog since 2018, it works fine, and I don’t have any problems with it. But I thought – maybe it won’t be as hard as I think, and I kind of like a tedious computer task sometimes!

I thought I’d document what I learned along the way in case it’s useful to anyone else doing this very specific migration. I upgraded from Hugo v0.40 (from 2018) to v0.135 (from 2024).

Here are most of the changes I had to make:

change 1: template "theme/partials/thing.html is now partial thing.html

I had to replace a bunch of instances of {{ template "theme/partials/header.html" . }} with {{ partial "header.html" . }}.

This happened in v0.42:

We have now virtualized the filesystems for project and theme files. This makes everything simpler, faster and more powerful. But it also means that template lookups on the form {{ template “theme/partials/pagination.html” . }} will not work anymore. That syntax has never been documented, so it’s not expected to be in wide use.

change 2: .Data.Pages is now site.RegularPages

This seems to be discussed in the release notes for 0.57.2

I just needed to replace .Data.Pages with site.RegularPages in the template on the homepage as well as in my RSS feed template.

change 3: .Next and .Prev got flipped

I had this comment in the part of my theme where I link to the next/previous blog post:

“next” and “previous” in hugo apparently mean the opposite of what I’d think they’d mean intuitively. I’d expect “next” to mean “in the future” and “previous” to mean “in the past” but it’s the opposite

It looks like they changed this in ad705aac064 so that “next” actually is in the future and “prev” actually is in the past. I definitely find the new behaviour more intuitive.

downloading the Hugo changelogs with a script

Figuring out why/when all of these changes happened was a little difficult. I ended up hacking together a bash script to download all of the changelogs from github as text files, which I could then grep to try to figure out what happened. It turns out it’s pretty easy to get all of the changelogs from the GitHub API.
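For reference, here’s a rough sketch of that idea in Python (the script I actually used was bash; this uses GitHub’s public releases endpoint, which returns at most 100 releases per page):

import json
import urllib.request

# fetch one page of Hugo releases (add &page=2, &page=3, ... for older ones)
url = "https://api.github.com/repos/gohugoio/hugo/releases?per_page=100"
with urllib.request.urlopen(url) as resp:
    releases = json.load(resp)

for release in releases:
    name = release["tag_name"].replace("/", "-")
    with open(f"changelog-{name}.txt", "w") as f:
        f.write(release.get("body") or "")  # "body" holds the release notes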

So far everything was not so bad – there was also a change around taxonomies that I can’t quite explain, but it was all pretty manageable. But then we got to the really tough one: the markdown renderer.

change 4: the markdown renderer (blackfriday -> goldmark)

The blackfriday markdown renderer (which was previously the default) was removed in v0.100.0. This seems pretty reasonable:

It has been deprecated for a long time, its v1 version is not maintained anymore, and there are many known issues. Goldmark should be a mature replacement by now.

Fixing all my Markdown changes was a huge pain – I ended up having to update 80 different Markdown files (out of 700) so that they would render properly, and I’m not totally sure I caught everything.

why bother switching renderers?

The obvious question here is – why bother even trying to upgrade Hugo at all if I have to switch Markdown renderers? My old site was running totally fine and I think it wasn’t necessarily a good use of time, but the one reason I think it might be useful in the future is that the new renderer (goldmark) uses the CommonMark markdown standard, which I’m hoping will be somewhat more futureproof. So maybe I won’t have to go through this again? We’ll see.

Also it turned out that the new Goldmark renderer does fix some problems I had (but didn’t know that I had) with smart quotes and how lists/blockquotes interact.

finding all the Markdown problems: the process

The hard part of this Markdown change was even figuring out what changed. Almost all of the problems (including #2 and #3 above) just silently broke the site, they didn’t cause any errors or anything. So I had to diff the HTML to hunt them down.

Here’s what I ended up doing:

  1. Generate the site with the old version, put it in public_old
  2. Generate the new version, put it in public
  3. Diff every single HTML file in public/ and public_old with this diff.sh script and put the results in a diffs/ folder
  4. Run variations on find diffs -type f | xargs cat | grep -C 5 '(31m|32m)' | less -r over and over again to look at every single change until I found something that seemed wrong
  5. Update the Markdown to fix the problem
  6. Repeat until everything seemed okay

(the grep 31m|32m thing is searching for red/green text in the diff)

This was very time consuming but it was a little bit fun for some reason so I kept doing it until it seemed like nothing too horrible was left.

the new markdown rules

Here’s a list of every type of Markdown change I had to make. It’s very possible these are all extremely specific to me but it took me a long time to figure them all out so maybe this will be helpful to one other person who finds this in the future.

4.1: mixing HTML and markdown

This doesn’t work anymore (it doesn’t expand the link):

<small>
[a link](https://example.com)
</small>

I need to do this instead:

<small>

[a link](https://example.com)

</small>

This works too:

<small> [a link](https://example.com) </small>

4.2: << is changed into «

I didn’t want this so I needed to configure:

markup:
  goldmark:
    extensions:
      typographer:
        leftAngleQuote: '&lt;&lt;'
        rightAngleQuote: '&gt;&gt;'

4.3: nested lists sometimes need 4 space indents

This doesn’t render as a nested list anymore if I only indent by 2 spaces, I need to put 4 spaces.

1. a
  * b
  * c
2. b

The problem is that the amount of indent needed depends on the size of the list markers. Here’s a reference in CommonMark for this.

4.4: blockquotes inside lists work better

Previously the > quote here didn’t render as a blockquote, and with the new renderer it does.

* something
> quote
* something else

I found a bunch of Markdown that had been kind of broken (which I hadn’t noticed) that works better with the new renderer, and this is an example of that.

Lists inside blockquotes also seem to work better.

4.5: headings inside lists

Previously this didn’t render as a heading, but now it does. So I needed to replace the # with &num;.

* # passengers: 20

4.6: + or 1) at the beginning of the line makes it a list

I had something which looked like this:

`1 / (1
+ exp(-1)) = 0.73`

With Blackfriday it rendered like this:

<p><code>1 / (1
+ exp(-1)) = 0.73</code></p>

and with Goldmark it rendered like this:

<p>`1 / (1</p>
<ul>
<li>exp(-1)) = 0.73`</li>
</ul>

Same thing if there was an accidental 1) at the beginning of a line, like in this Markdown snippet

I set up a small Hadoop cluster (1 master, 2 workers, replication set to 
1) on 

To fix this I just had to rewrap the line so that the + wasn’t the first character.

The Markdown is formatted this way because I wrap my Markdown to 80 characters a lot and the wrapping isn’t very context sensitive.

4.7: no more smart quotes in code blocks

There were a bunch of places where the old renderer (Blackfriday) was doing unwanted things in code blocks, like replacing ... with … (a single ellipsis character) or replacing straight quotes with smart quotes. I hadn’t realized this was happening and I was very happy to have it fixed.

4.8: better quote management

The way this gets rendered got better:

"Oh, *interesting*!"
  • old: “Oh, interesting!“
  • new: “Oh, interesting!”

Before there were two left smart quotes, now the quotes match.

4.9: images are no longer wrapped in a p tag

Previously if I had an image like this:

<img src="https://jvns.ca/images/rustboot1.png">

it would get wrapped in a <p> tag, now it doesn’t anymore. I dealt with this just by adding a margin-bottom: 0.75em to images in the CSS, hopefully that’ll make them display well enough.

4.10: <br> is now wrapped in a p tag

Previously this wouldn’t get wrapped in a p tag, but now it seems to:

<br><br>

I just gave up on fixing this though and resigned myself to maybe having some extra space in some cases. Maybe I’ll try to fix it later if I feel like another yakshave.

4.11: some more goldmark settings

I also needed to

  • turn off code highlighting (because it wasn’t working properly and I didn’t have it before anyway)
  • use the old “blackfriday” method to generate heading IDs so they didn’t change
  • allow raw HTML in my markdown

Here’s what I needed to add to my config.yaml to do all that:

markup:
  highlight:
    codeFences: false
  goldmark:
    renderer:
      unsafe: true
    parser:
      autoHeadingIDType: blackfriday

Maybe I’ll try to get syntax highlighting working one day, who knows. I might prefer having it off though.

a little script to compare blackfriday and goldmark

I also wrote a little program to compare the Blackfriday and Goldmark output for various markdown snippets, here it is in a gist.

It’s not really configured the exact same way Blackfriday and Goldmark were in my Hugo versions, but it was still helpful to have to help me understand what was going on.

a quick note on maintaining themes

My approach to themes in Hugo has been:

  1. pay someone to make a nice design for the site (for example wizardzines.com was designed by Melody Starling)
  2. use a totally custom theme
  3. commit that theme to the same Github repo as the site

So I just need to edit the theme files to fix any problems. Also I wrote a lot of the theme myself so I’m pretty familiar with how it works.

Relying on someone else to keep a theme updated feels kind of scary to me, I think if I were using a third-party theme I’d just copy the code into my site’s github repo and then maintain it myself.

which static site generators have better backwards compatibility?

I asked on Mastodon if anyone had used a static site generator with good backwards compatibility.

The main answers seemed to be Jekyll and 11ty. Several people said they’d been using Jekyll for 10 years without any issues, and 11ty says it has stability as a core goal.

I think a big factor in how appealing Jekyll/11ty are is how easy it is for you to maintain a working Ruby / Node environment on your computer: part of the reason I stopped using Jekyll was that I got tired of having to maintain a working Ruby installation. But I imagine this wouldn’t be a problem for a Ruby or Node developer.

Several people said that they don’t build their Jekyll site locally at all – they just use GitHub Pages to build it.

that’s it!

Overall I’ve been happy with Hugo – I started using it because it had fast build times and it was a static binary, and both of those things are still extremely useful to me. I might have spent 10 hours on this upgrade, but I’ve probably spent 1000+ hours writing blog posts without thinking about Hugo at all so that seems like an extremely reasonable ratio.

I find it hard to be too mad about the backwards incompatible changes, most of them were quite a long time ago, Hugo does a great job of making their old releases available so you can use the old release if you want, and the most difficult one is removing support for the blackfriday Markdown renderer in favour of using something CommonMark-compliant which seems pretty reasonable to me even if it is a huge pain.

But it did take a long time and I don’t think I’d particularly recommend moving 700 blog posts to a new Markdown renderer unless you’re really in the mood for a lot of computer suffering for some reason.

The new renderer did fix a bunch of problems so I think overall it might be a good thing, even if I’ll have to remember to make 2 changes to how I write Markdown (4.1 and 4.3).

Also I’m still using Hugo 0.54 for https://wizardzines.com so maybe these notes will be useful to Future Me if I ever feel like upgrading Hugo for that site.

Hopefully I didn’t break too many things on the blog by doing this, let me know if you see anything broken!

2024-10-01T10:01:44+00:00
Terminal colours are tricky

Yesterday I was thinking about how long it took me to get a colorscheme in my terminal that I was mostly happy with (SO MANY YEARS), and it made me wonder what about terminal colours made it so hard.

So I asked people on Mastodon what problems they’ve run into with colours in the terminal, and I got a ton of interesting responses! Let’s talk about some of the problems and a few possible ways to fix them.

problem 1: blue on black

One of the top complaints was “blue on black is hard to read”. Here’s an example of that: if I open Terminal.app, set the background to black, and run ls, the directories are displayed in a blue that isn’t that easy to read:

To understand why we’re seeing this blue, let’s talk about ANSI colours!

the 16 ANSI colours

Your terminal has 16 numbered colours – black, red, green, yellow, blue, magenta, cyan, white, and “bright” version of each of those.

Programs can use them by printing out an “ANSI escape code” – for example if you want to see each of the 16 colours in your terminal, you can run this Python program:

def color(num, text):
    return f"\033[38;5;{num}m{text}\033[0m"

for i in range(16):
    print(color(i, f"number {i:02}"))

what are the ANSI colours?

This made me wonder – if blue is colour number 4, who decides what hex color that should correspond to?

The answer seems to be “there’s no standard, terminal emulators just choose colours and it’s not very consistent”. Here’s a screenshot of a table from Wikipedia, where you can see that there’s a lot of variation:

problem 1.5: bright yellow on white

Bright yellow on white is even worse than blue on black, here’s what I get in a terminal with the default settings:

That’s almost impossible to read (and some other colours like light green cause similar issues), so let’s talk about solutions!

two ways to reconfigure your colours

If you’re annoyed by these colour contrast issues (or maybe you just think the default ANSI colours are ugly), you might think – well, I’ll just choose a different “blue” and pick something I like better!

There are two ways you can do this:

Way 1: Configure your terminal emulator: I think most modern terminal emulators have a way to reconfigure the colours, and some of them even come with some preinstalled themes that you might like better than the defaults.

Way 2: Run a shell script: There are ANSI escape codes that you can print out to tell your terminal emulator to reconfigure its colours. Here’s a shell script that does that, from the base16-shell project. You can see that it has a few different conventions for changing the colours – I guess different terminal emulators have different escape codes for changing their colour palette, and so the script is trying to pick the right style of escape code based on the TERM environment variable.
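The key escape code in scripts like that is OSC 4, which redefines one entry of the palette. For example this little Python snippet (which should work in most xterm-compatible emulators; the hex value is just an arbitrary lighter blue I picked) redefines colour 4:

import sys

# OSC 4 ; <index> ; <colour> BEL -- redefine palette entry 4 ("blue")
sys.stdout.write("\033]4;4;#6699ff\007")
sys.stdout.flush()
print("\033[34mblue is now easier to read on black\033[0m")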

what are the pros and cons of the 2 ways of configuring your colours?

I prefer to use the “shell script” method, because:

  • if I switch terminal emulators for some reason, I don’t need to learn a different configuration system, my colours still Just Work
  • I use base16-shell with base16-vim to make my vim colours match my terminal colours, which is convenient

some advantages of configuring colours in your terminal emulator:

  • if you use a popular terminal emulator, there are probably a lot more nice terminal themes out there that you can choose from
  • not all terminal emulators support the “shell script method”, and even if they do, the results can be a little inconsistent

This is what my shell has looked like for probably the last 5 years (using the solarized light base16 theme), and I’m pretty happy with it. Here’s htop:

Okay, so let’s say you’ve found a terminal colorscheme that you like. What else can go wrong?

problem 2: programs using 256 colours

Here’s what some output of fd, a find alternative, looks like in my colorscheme:

The contrast is pretty bad here, and I definitely don’t have that lime green in my normal colorscheme. What’s going on?

We can see what color codes fd is using by running it under the unbuffer program, which captures its output including the color codes:

$ unbuffer fd . > out
$ vim out
^[[38;5;48mbad-again.sh^[[0m
^[[38;5;48mbad.sh^[[0m
^[[38;5;48mbetter.sh^[[0m
out

^[[38;5;48 means “set the foreground color to color 48”. Terminals don’t only have 16 colours – many terminals these days actually have 3 ways of specifying colours:

  1. the 16 ANSI colours we already talked about
  2. an extended set of 256 colours
  3. a further extended set of 24-bit hex colours, like #ffea03
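Each of those has its own escape code. Here’s a quick Go sketch (mine, with arbitrary colour choices) that prints one example of each form:

package main

import "fmt"

func main() {
	// 1. one of the 16 ANSI colours (foreground codes 30-37 and 90-97)
	fmt.Println("\033[34mblue, ANSI colour 4\033[0m")
	// 2. the extended 256-colour set: ESC[38;5;<n>m
	fmt.Println("\033[38;5;48mcolour 48, the one fd was using\033[0m")
	// 3. 24-bit colour: ESC[38;2;<r>;<g>;<b>m
	fmt.Println("\033[38;2;255;234;3m#ffea03 as a 24-bit colour\033[0m")
}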

So fd is using one of the colours from the extended 256-color set. bat (a cat alternative) does something similar – here’s what it looks like by default in my terminal.

This looks fine, though, and it really seems like it’s trying to work well with a variety of terminal themes.

some newer tools seem to have theme support

I think it’s interesting that some of these newer terminal tools (fd, bat, delta, and probably more) have support for arbitrary custom themes. I guess the downside of this approach is that the default theme might clash with your terminal’s background, but the upside is that it gives you a lot more control over theming the tool’s output than just choosing 16 ANSI colours.

I don’t really use bat, but if I did I’d probably use bat --theme ansi to just use the ANSI colours that I have set in my normal terminal colorscheme.

problem 3: the grays in Solarized

A bunch of people on Mastodon mentioned a specific issue with grays in the Solarized theme: when I list a directory, the base16 Solarized Light theme looks like this:

but iTerm’s default Solarized Light theme looks like this:

This is because in the iTerm theme (which is the original Solarized design), colors 9-14 (the “bright blue”, “bright red”, etc) are mapped to a series of grays, and when I run ls, it’s trying to use those “bright” colours to color my directories and executables.

My best guess for why the original Solarized theme is designed this way is to make the grays available to the vim Solarized colorscheme.

I’m pretty sure I prefer the modified base16 version I use, though, where the “bright” colours are actually colours instead of all being shades of gray. (I didn’t actually realize the version I was using wasn’t the “original” Solarized theme until I wrote this post.)

In any case I really love Solarized and I’m very happy it exists so that I can use a modified version of it.

problem 4: a vim theme that doesn’t match the terminal background

If my vim theme has a different background colour than my terminal theme, I get an ugly border, like this:

This one is a pretty minor issue though and I think making your terminal background match your vim background is pretty straightforward.

problem 5: programs setting a background color

A few people mentioned problems with terminal applications setting an unwanted background colour, so let’s look at an example of that.

Here ngrok has set the background to color #16 (“black”), but the base16-shell script I use sets color 16 to be bright orange, so I get this, which is pretty bad:

I think the intention is for ngrok to look something like this:

I think base16-shell sets color #16 to orange (instead of black) so that it can provide extra colours for use by base16-vim. This feels reasonable to me – I use base16-vim in the terminal, so I guess I’m using that feature and it’s probably more important to me than ngrok (which I rarely use) behaving a bit weirdly.

This particular issue is a somewhat obscure clash between ngrok and my colorscheme, but I think this kind of clash is pretty common when a program sets an ANSI background color that the user has remapped for some reason.

a nice solution to contrast issues: “minimum contrast”

A bunch of terminals (iTerm2, tabby, kitty’s text_fg_override_threshold, and folks tell me also Ghostty and Windows Terminal) have a “minimum contrast” feature that will automatically adjust colours to make sure they have enough contrast.

Here’s an example from iTerm. This ngrok accident from before has pretty bad contrast – I find it pretty difficult to read:

With “minimum contrast” set to 40 in iTerm, it looks like this instead:

I didn’t have minimum contrast turned on before but I just turned it on today because it makes such a big difference when something goes wrong with colours in the terminal.

problem 6: TERM being set to the wrong thing

A few people mentioned that they’ll SSH into a system that doesn’t support the TERM environment variable that they have set locally, and then the colours won’t work.

I think the way TERM works is that systems have a terminfo database, so if the value of the TERM environment variable isn’t in the system’s terminfo database, then it won’t know how to output colours for that terminal. I don’t know too much about terminfo, but someone linked me to this terminfo rant that talks about a few other issues with terminfo.

I don’t have a system on hand to reproduce this one so I can’t say for sure how to fix it, but this stackoverflow question suggests running something like TERM=xterm ssh instead of ssh.

problem 7: picking “good” colours is hard

A couple of problems people mentioned with designing / finding terminal colorschemes:

  • some folks are colorblind and have trouble finding an appropriate colorscheme
  • accidentally making the background color too close to the cursor or selection color, so they’re hard to find
  • generally finding colours that work with every program is a struggle (for example you can see me having a problem with this with ngrok above!)

problem 8: making nethack/mc look right

Another problem people mentioned is using a program like nethack or midnight commander which you might expect to have a specific colourscheme based on the default ANSI terminal colours.

For example, midnight commander has a really specific classic look:

But in my Solarized theme, midnight commander looks like this:

The Solarized version feels like it could be disorienting if you’re very used to the “classic” look.

One solution Simon Tatham mentioned for this is using palette-customization ANSI codes (like the ones base16 uses that I talked about earlier) to change the colour palette right before starting the program – for example, remapping yellow to a brighter yellow before starting Nethack so that the yellow characters look better.

problem 9: commands disabling colours when writing to a pipe

If I run fd | less, I see something like this, with the colours disabled.

In general I find this useful – if I pipe a command to grep, I don’t want it to print out all those color escape codes, I just want the plain text. But what if you want to see the colours?

To see the colours, you can run unbuffer fd | less -r! I just learned about unbuffer recently and I think it’s really cool: unbuffer runs the command with a pseudo-terminal attached, so the command thinks it’s writing to a TTY. It also fixes issues with programs buffering their output when writing to a pipe, which is why it’s called unbuffer.

Here’s what the output of unbuffer fd | less -r looks like for me:

Also some commands (including fd) support a --color=always flag which will force them to always print out the colours.

problem 10: unwanted colour in ls and other commands

Some people mentioned that they don’t want ls to use colour at all – perhaps because ls uses blue, which is hard to read on black, and they don’t feel like customizing their terminal’s colourscheme to make the blue more readable, or they just don’t find the use of colour helpful.

Some possible solutions to this one:

  • you can run ls --color=never, which is probably easiest
  • you can also set LS_COLORS to customize the colours used by ls. I think some programs other than ls support the LS_COLORS environment variable too.
  • also some programs support setting NO_COLOR=true (there’s a list here)

Here’s an example of running LS_COLORS="fi=0:di=0:ln=0:pi=0:so=0:bd=0:cd=0:or=0:ex=0" ls:

problem 11: the colours in vim

I used to have a lot of problems with configuring my colours in vim – I’d set up my terminal colours in a way that I thought was okay, and then I’d start vim and it would just be a disaster.

I think what was going on here is that today, there are two ways to set up a vim colorscheme in the terminal:

  1. using your ANSI terminal colours – you tell vim which ANSI colour number to use for the background, for functions, etc.
  2. using 24-bit hex colours – instead of ANSI terminal colours, the vim colorscheme can use hex codes like #faea99 directly

20 years ago when I started using vim, terminals with 24-bit hex color support were a lot less common (or maybe they didn’t exist at all), and vim certainly didn’t have support for using 24-bit colour in the terminal. From some quick searching through git, it looks like vim added support for 24-bit colour in 2016 – just 8 years ago!

So to get colours to work properly in vim before 2016, you needed to synchronize your terminal colorscheme and your vim colorscheme. Here’s what that looked like: the colorscheme needed to map the vim color classes like cterm05 to ANSI colour numbers.

But in 2024, the story is really different! Vim (and Neovim, which I use now) support 24-bit colours, and as of Neovim 0.10 (released in May 2024), the termguicolors setting (which tells Vim to use 24-bit hex colours for colorschemes) is turned on by default in any terminal with 24-bit color support.

So this “you need to synchronize your terminal colorscheme and your vim colorscheme” problem is not an issue anymore for me in 2024, since I don’t plan to use terminals without 24-bit color support in the future.

The biggest consequence for me of this whole thing is that I don’t need base16 to set colors 16-21 to weird stuff anymore to integrate with vim – I can just use a terminal theme and a vim theme, and as long as the two themes use similar colours (so it’s not jarring for me to switch between them) there’s no problem. I think I can just remove those parts from my base16 shell script and totally avoid the problem with ngrok and the weird orange background I talked about above.

some more problems I left out

I think there are a lot of issues around the intersection of multiple programs, like some combination of tmux/ssh/vim, that I couldn’t figure out how to reproduce well enough to talk about here. Also I’m sure I missed a lot of other things too.

base16 has really worked for me

I’ve personally had a lot of success with using base16-shell with base16-vim – I just need to add a couple of lines to my fish config to set it up (+ a few .vimrc lines) and then I can move on and accept any remaining problems that that doesn’t solve.

I don’t think base16 is for everyone though, some limitations I’m aware of with base16 that might make it not work for you:

  • it comes with a limited set of builtin themes and you might not like any of them
  • the Solarized base16 theme (and maybe all of the themes?) sets the “bright” ANSI colours to be exactly the same as the normal colours, which might cause a problem if you’re relying on the “bright” colours to be different from the regular ones
  • it sets colours 16-21 in order to give the vim colorschemes from base16-vim access to more colours, which might not be relevant if you always use a terminal with 24-bit color support, and can cause problems like the ngrok issue above
  • also the way it sets colours 16-21 could be a problem in terminals that don’t have 256-color support, like the linux framebuffer terminal

Apparently there’s a community fork of base16 called tinted-theming, which I haven’t looked into much yet.

some other colorscheme tools

Just one so far but I’ll link more if people tell me about them:

okay, that was a lot

We talked about a lot in this post and while I think learning about all these details is kind of fun if I’m in the mood to do a deep dive, I find it SO FRUSTRATING to deal with it when I just want my colours to work! Being surprised by unreadable text and having to find a workaround is just not my idea of a good day.

Personally I’m a zero-configuration kind of person and it’s not that appealing to me to have to put together a lot of custom configuration just to make my colours in the terminal look acceptable. I’d much rather just have some reasonable defaults that I don’t have to change.

minimum contrast seems like an amazing feature

My one big takeaway from writing this was to turn on “minimum contrast” in my terminal – I think it’s going to fix most of the occasional accidental unreadable-text issues I run into, and I’m pretty excited about it.

2024-09-27T11:16:00+00:00 Fullscreen Open in Tab
Some Go web dev notes

I spent a lot of time in the past couple of weeks working on a website in Go that may or may not ever see the light of day, but I learned a couple of things along the way I wanted to write down. Here they are:

go 1.22 now has better routing

I’ve never felt motivated to learn any of the Go routing libraries (gorilla/mux, chi, etc), so I’ve been doing all my routing by hand, like this.

	// DELETE /records:
	case r.Method == "DELETE" && n == 1 && p[0] == "records":
		if !requireLogin(username, r.URL.Path, r, w) {
			return
		}
		deleteAllRecords(ctx, username, rs, w, r)
	// POST /records/<ID>
	case r.Method == "POST" && n == 2 && p[0] == "records" && len(p[1]) > 0:
		if !requireLogin(username, r.URL.Path, r, w) {
			return
		}
		updateRecord(ctx, username, p[1], rs, w, r)

But apparently as of Go 1.22, Go now has better support for routing in the standard library, so that code can be rewritten to something like this:

	mux.HandleFunc("DELETE /records/", app.deleteAllRecords)
	mux.HandleFunc("POST /records/{record_id}", app.updateRecord)

Though it would also need login handling, so maybe something more like this, with a requireLogin middleware:

	mux.Handle("DELETE /records/", requireLogin(http.HandlerFunc(app.deleteAllRecords)))
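A requireLogin middleware in that style might look something like this (a sketch – getUsername is a stand-in for however the app pulls the user out of the request, not a real function from this codebase):

	func requireLogin(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			// getUsername is hypothetical: read the user from a cookie/session
			if username := getUsername(r); username == "" {
				http.Error(w, "login required", http.StatusUnauthorized)
				return
			}
			next.ServeHTTP(w, r)
		})
	}

Inside a handler like app.updateRecord, the {record_id} part of the pattern can then be read with r.PathValue("record_id"), which is also new in Go 1.22.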

a gotcha with the built-in router: redirects with trailing slashes

One annoying gotcha I ran into was: if I make a route for /records/, then a request for /records will be redirected to /records/.

I ran into an issue with this where sending a POST request to /records redirected to a GET request for /records/, which broke the POST request because it removed the request body. Thankfully Xe Iaso wrote a blog post about the exact same issue which made it easier to debug.

I think the solution to this is just to use API endpoints like POST /records instead of POST /records/, which seems like a more normal design anyway.
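With Go 1.22’s patterns, a path with no trailing slash only matches that exact path, so registering the route like this avoids the redirect entirely (app.createRecord is a made-up handler name):

	// "POST /records" matches only /records – no subtree, so no redirect
	mux.HandleFunc("POST /records", app.createRecord)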

sqlc automatically generates code for my db queries

I got a little bit tired of writing so much boilerplate for my SQL queries, but I didn’t really feel like learning an ORM, because I know what SQL queries I want to write, and I didn’t feel like learning the ORM’s conventions for translating things into SQL queries.

But then I found sqlc, which will compile a query like this:


-- name: GetVariant :one
SELECT *
FROM variants
WHERE id = ?;

into Go code like this:

const getVariant = `-- name: GetVariant :one
SELECT id, created_at, updated_at, disabled, product_name, variant_name
FROM variants
WHERE id = ?
`

func (q *Queries) GetVariant(ctx context.Context, id int64) (Variant, error) {
	row := q.db.QueryRowContext(ctx, getVariant, id)
	var i Variant
	err := row.Scan(
		&i.ID,
		&i.CreatedAt,
		&i.UpdatedAt,
		&i.Disabled,
		&i.ProductName,
		&i.VariantName,
	)
	return i, err
}

What I like about this is that if I’m ever unsure about what Go code to write for a given SQL query, I can just write the query I want, read the generated function and it’ll tell me exactly what to do to call it. It feels much easier to me than trying to dig through the ORM’s documentation to figure out how to construct the SQL query I want.

Reading Brandur’s sqlc notes from 2024 also gave me some confidence that this is a workable path for my tiny programs. That post gives a really helpful example of how to conditionally update fields in a table using CASE statements (for example if you have a table with 20 columns and you only want to update 3 of them).

sqlite tips

Someone on Mastodon linked me to this post called Optimizing sqlite for servers. My projects are small and I’m not so concerned about performance, but my main takeaways were:

  • have a dedicated object for writing to the database, and run db.SetMaxOpenConns(1) on it. I learned the hard way that if I don’t do this then I’ll get SQLITE_BUSY errors from two threads trying to write to the db at the same time.
  • if I want to make reads faster, I could have 2 separate db objects, one for writing and one for reading
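Here’s roughly what that setup might look like (a sketch – it assumes imports of database/sql and log, plus a SQLite driver registered as “sqlite”, like modernc.org/sqlite):

writeDB, err := sql.Open("sqlite", "app.db")
if err != nil {
	log.Fatal(err)
}
// only ever 1 open write connection, to avoid SQLITE_BUSY errors
writeDB.SetMaxOpenConns(1)

// a separate handle just for reads, which is allowed more connections
readDB, err := sql.Open("sqlite", "app.db")
if err != nil {
	log.Fatal(err)
}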

There are more tips in that post that seem useful (like “COUNT queries are slow” and “Use STRICT tables”), but I haven’t done those yet.

Also sometimes if I have two tables where I know I’ll never need to do a JOIN between them, I’ll just put them in separate databases so that I can connect to them independently.

Go 1.19 introduced a way to set a GC memory limit

I run all of my Go projects in VMs with relatively little memory, like 256MB or 512MB. I ran into an issue where my application kept getting OOM killed and it was confusing – did I have a memory leak? What?

After some Googling, I realized that maybe I didn’t have a memory leak, maybe I just needed to reconfigure the garbage collector! It turns out that by default (according to A Guide to the Go Garbage Collector), Go’s garbage collector will let the application allocate memory up to 2x the current heap size.

Mess With DNS’s base heap size is around 170MB and the amount of memory free on the VM is around 160MB right now, so if its memory doubled, it’d get OOM killed.

In Go 1.19, they added a way to tell Go “hey, if the application starts using this much memory, run a GC”. So I set the GC memory limit to 250MB and it seems to have resulted in the application getting OOM killed less often:

export GOMEMLIMIT=250MiB
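If you’d rather set the limit from inside the program instead of with an environment variable, the same knob is exposed as runtime/debug.SetMemoryLimit (also added in Go 1.19):

package main

import "runtime/debug"

func main() {
	// equivalent to GOMEMLIMIT=250MiB: a soft memory limit, in bytes
	debug.SetMemoryLimit(250 << 20)
	// ... rest of the program
}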

some reasons I like making websites in Go

I’ve been making tiny websites (like the nginx playground) in Go on and off for the last 4 years or so and it’s really been working for me. I think I like it because:

  • there’s just 1 static binary – all I need to do to deploy it is copy the binary. If there are static files I can just embed them in the binary with embed (see the sketch after this list).
  • there’s a built-in webserver that’s okay to use in production, so I don’t need to configure WSGI or whatever to get it to work. I can just put it behind Caddy or run it on fly.io or whatever.
  • Go’s toolchain is very easy to install, I can just do apt-get install golang-go or whatever and then a go build will build my project
  • it feels like there’s very little to remember to start sending HTTP responses – basically all there is are functions like Serve(w http.ResponseWriter, r *http.Request) which read the request and send a response. If I need to remember some detail of how exactly that’s accomplished, I just have to read the function!
  • also net/http is in the standard library, so you can start making websites without installing any libraries at all. I really appreciate this one.
  • Go is a pretty systems-y language, so if I need to run an ioctl or something that’s easy to do
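As a sketch of the embedding from the first bullet (the static/ directory name is just an example):

package main

import (
	"embed"
	"log"
	"net/http"
)

//go:embed static
var staticFiles embed.FS

func main() {
	// serve the embedded files straight out of the binary – nothing on disk
	http.Handle("/static/", http.FileServer(http.FS(staticFiles)))
	log.Fatal(http.ListenAndServe(":8080", nil))
}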

In general everything about it feels like it makes projects easy to work on for 5 days, abandon for 2 years, and then get back into writing code without a lot of problems.

For contrast, I’ve tried to learn Rails a couple of times and I really want to love Rails – I’ve made a couple of toy websites in Rails and it’s always felt like a really magical experience. But ultimately when I come back to those projects I can’t remember how anything works and I just end up giving up. It feels easier to me to come back to my Go projects that are full of a lot of repetitive boilerplate, because at least I can read the code and figure out how it works.

things I haven’t figured out yet

some things I haven’t done much of yet in Go:

  • rendering HTML templates: usually my Go servers are just APIs and I make the frontend a single-page app with Vue. I’ve used html/template a lot in Hugo (which I’ve used for this blog for the last 8 years) but I’m still not sure how I feel about it.
  • I’ve never made a real login system, usually my servers don’t have users at all.
  • I’ve never tried to implement CSRF

In general I’m not sure how to implement security-sensitive features so I don’t start projects which need login/CSRF/etc. I imagine this is where a framework would help.

it’s cool to see the new features Go has been adding

Both of the Go features I mentioned in this post (GOMEMLIMIT and the routing) are new in the last couple of years and I didn’t notice when they came out. It makes me think I should pay closer attention to the release notes for new Go versions.

2024-09-12T15:09:12+00:00 Fullscreen Open in Tab
Reasons I still love the fish shell

I wrote about how much I love fish in this blog post from 2017 and, 7 years of using it every day later, I’ve found even more reasons to love it. So I thought I’d write a new post with both the old reasons I loved it and some new ones.

This came up today because I was trying to figure out why my terminal doesn’t break anymore when I cat a binary to it – the answer was “fish fixes the terminal!”, and I just thought that was really nice.

1. no configuration

In 10 years of using fish I have never found a single thing I wanted to configure. It just works the way I want. My fish config file just has:

  • environment variables
  • aliases (alias ls eza, alias vim nvim, etc)
  • the occasional direnv hook fish | source to integrate a tool like direnv
  • a script I run to set up my terminal colours

I’ve been told that configuring things in fish is really easy if you ever do want to configure something though.

2. autosuggestions from my shell history

My absolute favourite thing about fish is that as I type, it’ll automatically suggest (in light grey) a matching command that I ran recently. I can press the right arrow key to accept the completion, or keep typing to ignore it.

Here’s what that looks like. In this example I just typed the “v” key and it guessed that I want to run the previous vim command again.

2.5 “smart” shell autosuggestions

One of my favourite subtle autocomplete features is how fish handles autocompleting commands that contain paths in them. For example, if I run:

$ ls blah.txt

that command will only be autocompleted in directories that contain blah.txt – it won’t show up in a different directory. (here’s a short comment about how it works)

As an example, if in this directory I type bash scripts/, it’ll only suggest history commands including files that actually exist in my blog’s scripts folder, and not the dozens of other irrelevant scripts/ commands I’ve run in other folders.

I didn’t understand exactly how this worked until last week – it just felt like fish was magically able to suggest the right commands. It still feels a little like magic and I love it.

3. pasting multiline commands

If I copy and paste multiple lines, bash will run them all, like this:

[bork@grapefruit linux-playground (main)]$ echo hi
hi
[bork@grapefruit linux-playground (main)]$ touch blah
[bork@grapefruit linux-playground (main)]$ echo hi
hi

This is a bit alarming – what if I didn’t actually want to run all those commands?

Fish will paste them all at a single prompt, so that I can press Enter if I actually want to run them. Much less scary.

bork@grapefruit ~/work/> echo hi

                         touch blah
                         echo hi

4. nice tab completion

If I run ls and press tab, it’ll display all the filenames in a nice grid. I can use either Tab, Shift+Tab, or the arrow keys to navigate the grid.

Also, I can tab complete from the middle of a filename – if the filename starts with a weird character (or if it’s just not very unique), I can type some characters from the middle and press tab.

Here’s what the tab completion looks like:

bork@grapefruit ~/work/> ls 
api/  blah.py     fly.toml   README.md
blah  Dockerfile  frontend/  test_websocket.sh

I honestly don’t complete things other than filenames very much so I can’t speak to that, but I’ve found the experience of tab completing filenames to be very good.

5. nice default prompt (including git integration)

Fish’s default prompt includes everything I want:

  • username
  • hostname
  • current folder
  • git integration
  • status of last command exit (if the last command failed)

Here’s a screenshot with a few different variations on the default prompt, including if the last command was interrupted (the SIGINT) or failed.

6. nice history defaults

In bash, the maximum history size is 500 by default, presumably because computers used to be slow and not have a lot of disk space. Also, by default, commands don’t get added to your history until you end your session. So if your computer crashes, you lose some history.

In fish:

  1. the default history size is 256,000 commands. I don’t see any reason I’d ever need more.
  2. if you open a new tab, everything you’ve ever run (including commands in open sessions) is immediately available to you
  3. in an existing session, the history search will only include commands from the current session, plus everything that was in history at the time that you started the shell

I’m not sure how clearly I’m explaining how fish’s history system works here, but it feels really good to me in practice. My impression is that the way it’s implemented is the commands are continually added to the history file, but fish only loads the history file once, on startup.

I’ll mention here that if you want to have a fancier history system in another shell it might be worth checking out atuin or fzf.

7. press up arrow to search history

I also like fish’s interface for searching history: for example if I want to edit my fish config file, I can just type:

$ config.fish

and then press the up arrow to go back to the last command that included config.fish. That’ll complete to:

$ vim ~/.config/fish/config.fish

and I’m done. This isn’t so different from using Ctrl+R in bash to search your history but I think I like it a little better over all, maybe because Ctrl+R has some behaviours that I find confusing (for example you can end up accidentally editing your history which I don’t like).

8. the terminal doesn’t break

I used to run into issues with bash where I’d accidentally cat a binary to the terminal, and it would break the terminal.

Every time fish displays a prompt, it’ll try to fix up your terminal so that you don’t end up in weird situations like this. I think this is some of the code in fish to prevent broken terminals.

Some things that it does are:

  • turn on echo so that you can see the characters you type
  • make sure that newlines work properly so that you don’t get that weird staircase effect
  • reset your terminal background colour, etc

I don’t think I’ve run into any of these “my terminal is broken” issues in a very long time, and I actually didn’t even realize that this was because of fish – I thought that things somehow magically just got better, or maybe I wasn’t making as many mistakes. But I think it was mostly fish saving me from myself, and I really appreciate that.

9. Ctrl+S is disabled

Also related to terminals breaking: fish disables Ctrl+S (which freezes your terminal and then you need to remember to press Ctrl+Q to unfreeze it). It’s a feature that I’ve never wanted and I’m happy to not have it.

Apparently you can disable Ctrl+S in other shells with stty -ixon.

10. fish_add_path

I have mixed feelings about this one, but in Fish you can use fish_add_path /opt/whatever/bin to add a path to your PATH, globally, permanently, across all open shell sessions. This can get a bit confusing if you forget where those PATH entries are configured but overall I think I appreciate it.

11. nice syntax highlighting

By default commands that don’t exist are highlighted in red, like this.

12. easier loops

I find the loop syntax in fish a lot easier to type than the bash syntax. It looks like this:

for i in *.yaml
  echo $i
end

Also it’ll add indentation in your loops which is nice.

13. easier multiline editing

Related to loops: you can edit multiline commands much more easily than in bash (just use the arrow keys to navigate the multiline command!). Also when you use the up arrow to get a multiline command from your history, it’ll show you the whole command the exact same way you typed it instead of squishing it all onto one line like bash does:

$ bash
$ for i in *.png
> do
> echo $i
> done
$ # press up arrow
$ for i in *.png; do echo $i; done

14. Ctrl+left arrow

This might just be me, but I really appreciate that fish has the Ctrl+left arrow / Ctrl+right arrow keyboard shortcut for moving between words when writing a command.

I’m honestly a bit confused about where this keyboard shortcut is coming from (the only documented keyboard shortcut for this I can find in fish is Alt+left arrow / Alt + right arrow which seems to do the same thing), but I’m pretty sure this is a fish shortcut.

A couple of notes about getting this shortcut to work / where it comes from:

  • one person said they needed to switch their terminal emulator from the “Linux console” keybindings to “Default (XFree 4)” to get it to work in fish
  • on Mac OS, Ctrl+left arrow switches workspaces by default, so I had to turn that off.
  • Also apparently Ubuntu configures libreadline in /etc/inputrc to make Ctrl+left/right arrow go back/forward a word, so it’ll work in bash on Ubuntu and maybe other Linux distros too. Here’s a stack overflow question talking about that

a downside: not everything has a fish integration

Sometimes tools don’t have instructions for integrating them with fish. That’s annoying, but:

  • I’ve found this has gotten better over the last 10 years as fish has gotten more popular. For example Python’s virtualenv has had a fish integration for a long time now.
  • If I need to run a POSIX shell command real quick, I can always just run bash or zsh
  • I’ve gotten much better over the years at translating simple commands to fish syntax when I need to

My biggest day-to-day annoyance is probably that for whatever reason I’m still not used to fish’s syntax for setting environment variables – I get confused about set vs set -x.

on POSIX compatibility

When I started using fish, you couldn’t do things like cmd1 && cmd2 – it would complain “no, you need to run cmd1; and cmd2” instead.

It seems like over the years fish has started accepting a little more POSIX-style syntax than it used to, like:

  • cmd1 && cmd2
  • export a=b to set an environment variable (though this seems a bit limited, you can’t do export PATH=$PATH:/whatever so I think it’s probably better to learn set instead)

on fish as a default shell

Changing my default shell to fish is always a little annoying, I occasionally get myself into a situation where

  1. I install fish somewhere like maybe /home/bork/.nix-stuff/bin/fish
  2. I add the new fish location to /etc/shells as an allowed shell
  3. I change my shell with chsh
  4. at some point months/years later I reinstall fish in a different location for some reason and remove the old one
  5. oh no!!! I have no valid shell! I can’t open a new terminal tab anymore!

This has never been a major issue because I always have a terminal open somewhere where I can fix the problem and rescue myself, but it’s a bit alarming.

If you don’t want to use chsh to change your shell to fish (which is very reasonable, maybe I shouldn’t be doing that), the Arch wiki page has a couple of good suggestions – either configure your terminal emulator to run fish or add an exec fish to your .bashrc.

I’ve never really learned the scripting language

Other than occasionally writing a for loop interactively on the command line, I’ve never really learned the fish scripting language. I still do all of my shell scripting in bash.

I don’t think I’ve ever written a fish function or if statement.

I ran a highly unscientific poll on Mastodon asking people what shell they use interactively. The results were (of 2600 responses):

  • 46% bash
  • 49% zsh
  • 16% fish
  • 5% other

I think 16% for fish is pretty remarkable, since (as far as I know) there isn’t any system where fish is the default shell, and my sense is that it’s very common to just stick to whatever your system’s default shell is.

It feels like a big achievement for the fish project, even if maybe my Mastodon followers are more likely than the average shell user to use fish for some reason.

who might fish be right for?

Fish definitely isn’t for everyone. I think I like it because:

  1. I really dislike configuring my shell (and honestly my dev environment in general), I want things to “just work” with the default settings
  2. fish’s defaults feel good to me
  3. I don’t spend that much time logged into random servers using other shells so there’s not too much context switching
  4. I liked its features so much that I was willing to relearn how to do a few “basic” shell things, like using parentheses (seq 1 10) to run a command instead of backticks or using set instead of export

Maybe you’re also a person who would like fish! I hope a few more of the people who fish is for can find it, because I spend so much of my time in the terminal and it’s made that time much more pleasant.

2024-08-31T18:36:50-07:00 Fullscreen Open in Tab
Thoughts on the Resiliency of Web Projects

I just did a massive spring cleaning of one of my servers, trying to clean up what has become quite the mess of clutter. For every website on the server, I either:

  • Documented what it is, who is using it, and what version of language and framework it uses
  • Archived it as static HTML flat files
  • Moved the source code from GitHub to a private git server
  • Deleted the files

It feels good to get rid of old code, and to turn previously dynamic sites (with all of the risk they come with) into plain HTML.

This is also making me seriously reconsider the value of spinning up any new projects. Several of these are now 10 years old, still churning along fine, but difficult to do any maintenance on because of versions and dependencies. For example:

  • indieauth.com - this has been on the chopping block for years, but I haven't managed to build a replacement yet, and is still used by a lot of people
  • webmention.io - this is a pretty popular service, and I don't want to shut it down, but there's a lot of problems with how it's currently built and no easy way to make changes
  • switchboard.p3k.io - this is a public WebSub (PubSubHubbub) hub, like Superfeedr, and has weirdly gained a lot of popularity in the podcast feed space in the last few years

One that I'm particularly happy with, despite it being an ugly pile of PHP, is oauth.net. I inherited this site in 2012, and it hasn't needed any framework upgrades since it's just using PHP templates. My ham radio website w7apk.com is similarly a small amount of templated PHP, and it is low stress to maintain, and actually fun to quickly jot some notes down when I want. I like not having to go through the whole ceremony of setting up a dev environment, installing dependencies, upgrading things to the latest version, checking for backwards incompatible changes, git commit, deploy, etc. I can just sftp some changes up to the server and they're live.

Some questions for myself for the future, before starting a new project:

  • Could this actually just be a tag page on my website, like #100DaysOfMusic or #BikeTheEclipse?
  • If it really needs to be a new project, then:
  • Can I create it in PHP without using any frameworks or libraries? Plain PHP ages far better than pulling in any dependencies which inevitably stop working with a version 2-3 EOL cycles back, so every library brought in means signing up for annual maintenance of the whole project. Frameworks can save time in the short term, but have a huge cost in the long term.
  • Is it possible to avoid using a database? Databases aren't inherently bad, but using one does make the project slightly more fragile, since it requires plans for migrations and backups.
  • If a database is required, is it possible to create it in a way that does not result in ever-growing storage needs?
  • Is this going to store data or be a service that other people are going to use? If so, plan on a registration form so that I have a way to contact people eventually when I need to change it or shut it down.
  • If I've got this far with the questions, am I really ready to commit to supporting this code base for the next 10 years?

One project I've been committed to maintaining and doing regular (ok fine, "semi-regular") updates for is Meetable, the open source events website that I run on a few domains:

I started this project in October 2019, excited for all the IndieWebCamps we were going to run in 2020. Somehow that is already 5 years ago now. Well that didn't exactly pan out, but I did quickly pivot it to add a bunch of features that are helpful for virtual events, so it worked out ok in the end. We've continued to use it for posting IndieWeb events, and I also run an instance for two IETF working groups. I'd love to see more instances pop up, I've only encountered one or two other ones in the wild. I even spent a significant amount of time on the onboarding flow so that it's relatively easy to install and configure. I even added passkeys for the admin login so you don't need any external dependencies on auth providers. It's a cool project if I may say so myself.

Anyway, this is not a particularly well thought out blog post, I just wanted to get my thoughts down after spending all day combing through the filesystem of my web server and uncovering a lot of ancient history.

2024-08-29T12:59:53-07:00 Fullscreen Open in Tab
OAuth Oh Yeah!

The first law of OAuth states that
the total number of authorized access tokens
in an isolated system
must remain constant over time. Over time.

In the world of OAuth, where the sun always shines,
Tokens like treasures, in digital lines.
Security's a breeze, with every law so fine,
OAuth, oh yeah, tonight we dance online!

The second law of OAuth states that
the overall security of the system
must always remain constant over time.
Over time. Over time. Over time.

In the world of OAuth, where the sun always shines,
Tokens like treasures, in digital lines.
Security's a breeze, with every law so fine,
OAuth, oh yeah, tonight we dance online!

The third law of OAuth states that
as the security of the system approaches absolute,
the ability to grant authorized access approaches zero. Zero!

In the world of OAuth, where the sun always shines,
Tokens like treasures, in digital lines.
Security's a breeze, with every law so fine,
OAuth, oh yeah, tonight we dance online!

Tonight we dance online!
OAuth, oh yeah!
Lyrics and music by AI, prompted and edited by Aaron Parecki
2024-08-19T08:15:28+00:00 Fullscreen Open in Tab
Migrating Mess With DNS to use PowerDNS

About 3 years ago, I announced Mess With DNS in this blog post, a playground where you can learn how DNS works by messing around and creating records.

I wasn’t very careful with the DNS implementation though (to quote the release blog post: “following the DNS RFCs? not exactly”), and people started reporting problems that, eventually, I decided I wanted to fix.

the problems

Some of the problems people have reported were:

  • domain names with underscores weren’t allowed, even though they should be
  • If there was a CNAME record for a domain name, it allowed you to create other records for that domain name, even though it shouldn’t
  • you could create 2 different CNAME records for the same domain name, which shouldn’t be allowed
  • no support for the SVCB or HTTPS record types, which seemed a little complex to implement
  • no support for upgrading from UDP to TCP for big responses

And there are certainly more issues that nobody got around to reporting, for example that if you added an NS record for a subdomain to delegate it, Mess With DNS wouldn’t handle the delegation properly.

the solution: PowerDNS

I wasn’t sure how to fix these problems for a long time – technically I could have started addressing them individually, but it felt like there were a million edge cases and I’d never get there.

But then one day I was chatting with someone else who was working on a DNS server and they said they were using PowerDNS: an open source DNS server with an HTTP API!

This seemed like an obvious solution to my problems – I could just swap out my own crappy DNS implementation for PowerDNS.

There were a couple of challenges I ran into when setting up PowerDNS that I’ll talk about here. I really don’t do a lot of web development and I think I’ve never built a website that depends on a relatively complex API before, so it was a bit of a learning experience.

challenge 1: getting every query made to the DNS server

One of the main things Mess With DNS does is give you a live view of every DNS query it receives for your subdomain, using a websocket. To make this work, it needs to intercept every DNS query before it gets sent to the PowerDNS server:

There were 2 options I could think of for how to intercept the DNS queries:

  1. dnstap: dnsdist (a DNS load balancer from the PowerDNS project) has support for logging all DNS queries it receives using dnstap, so I could put dnsdist in front of PowerDNS and then log queries that way
  2. Have my Go server listen on port 53 and proxy the queries myself

I originally implemented option #1, but for some reason there was a 1 second delay before every query got logged. I couldn’t figure out why, so I implemented my own very simple proxy instead.
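For a sense of the shape of option #2, here’s a very stripped-down sketch (the PowerDNS address is made up, and the real thing also needs error handling, the websocket logging hook, and TCP support):

package main

import (
	"log"
	"net"
)

func main() {
	pc, err := net.ListenPacket("udp", ":53")
	if err != nil {
		log.Fatal(err)
	}
	buf := make([]byte, 4096)
	for {
		n, addr, err := pc.ReadFrom(buf)
		if err != nil {
			continue
		}
		query := append([]byte(nil), buf[:n]...)
		go forward(pc, addr, query)
	}
}

// forward relays one raw DNS query to PowerDNS and copies the response back.
func forward(pc net.PacketConn, client net.Addr, query []byte) {
	// (this is where the query would also get pushed to the live view)
	upstream, err := net.Dial("udp", "127.0.0.1:5300")
	if err != nil {
		return
	}
	defer upstream.Close()
	upstream.Write(query)
	resp := make([]byte, 4096)
	n, err := upstream.Read(resp)
	if err != nil {
		return
	}
	pc.WriteTo(resp[:n], client)
}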

challenge 2: should the frontend have direct access to the PowerDNS API?

The frontend used to have a lot of DNS logic in it – it converted emoji domain names to ASCII using punycode, had a lookup table to convert numeric DNS query types (like 1) to their human-readable names (like A), did a little bit of validation, and more.

Originally I considered keeping this pattern and just giving the frontend (more or less) direct access to the PowerDNS API to create and delete records, but writing even more complex code in Javascript didn’t feel that appealing to me – I don’t really know how to write tests in Javascript and it seemed like it wouldn’t end well.

So I decided to take all of the DNS logic out of the frontend and write a new DNS API for managing records, shaped something like this:

  • GET /records
  • DELETE /records/<ID>
  • DELETE /records/ (delete all records for a user)
  • POST /records/ (create record)
  • POST /records/<ID> (update record)

This meant that I could actually write tests for my code, since the backend is in Go and I do know how to write tests in Go.

what I learned: it’s okay for an API to duplicate information

I had this idea that APIs shouldn’t return duplicate information – for example if I get a DNS record, it should only include a given piece of information once.

But I ran into a problem with that idea when displaying MX records: an MX record has 2 fields, “preference”, and “mail server”. And I needed to display that information in 2 different ways on the frontend:

  1. In a form, where “Preference” and “Mail Server” are 2 different form fields (like 10 and mail.example.com)
  2. In a summary view, where I wanted to just show the record (10 mail.example.com)

This is kind of a small problem, but it came up in a few different places.

I talked to my friend Marco Rogers about this, and based on some advice from him I realized that I could return the same information in the API in 2 different ways! Then the frontend just has to display it. So I started just returning duplicate information in the API, something like this:

{
  values: {'Preference': 10, 'Server': 'mail.example.com'},
  content: '10 mail.example.com',
  ...
}

I ended up using this pattern in a couple of other places where I needed to display the same information in 2 different ways and it was SO much easier.

I think what I learned from this is that if I’m making an API that isn’t intended for external use (there are no users of this API other than the frontend!), I can tailor it very specifically to the frontend’s needs and that’s okay.

challenge 3: what’s a record’s ID?

In Mess With DNS (and I think in most DNS user interfaces!), you create, add, and delete records.

But that’s not how the PowerDNS API works. In PowerDNS, you create a zone, which is made of record sets. Records don’t have any ID in the API at all.

I ended up solving this by generating a fake ID for each record, which is made of:

  • its name
  • its type
  • and its content (base64-encoded)

For example one record’s ID is brooch225.messwithdns.com.|NS|bnMxLm1lc3N3aXRoZG5zLmNvbS4=
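Building an ID like that is just string concatenation plus encoding/base64 (recordID is my name for it, not necessarily what the real code calls it):

func recordID(name, rrtype, content string) string {
	// base64-encode the content so the "|" separators stay unambiguous
	return name + "|" + rrtype + "|" +
		base64.StdEncoding.EncodeToString([]byte(content))
}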

Then I can search through the zone and find the appropriate record to update it.

This means that if you update a record then its ID will change, which isn’t usually what I want in an ID, but that seems fine.

challenge 4: making clear error messages

I think the error messages that the PowerDNS API returns aren’t really intended to be shown to end users, for example:

  • Name 'new\032site.island358.messwithdns.com.' contains unsupported characters (this error encodes the space as \032, which is a bit disorienting if you don’t know that the space character is 32 in ASCII)
  • RRset test.pear5.messwithdns.com. IN CNAME: Conflicts with pre-existing RRset (this talks about RRsets, which aren’t a concept that the Mess With DNS UI has at all)
  • Record orange.beryl5.messwithdns.com./A '1.2.3.4$': Parsing record content (try 'pdnsutil check-zone'): unable to parse IP address, strange character: $ (mentions “pdnsutil”, a utility which Mess With DNS’s users don’t have access to in this context)

I ended up handling this in two ways:

  1. Do some initial basic validation of values that users enter (like IP addresses), so I can just return errors like Invalid IPv4 address: "1.2.3.4$"
  2. If that goes well, send the request to PowerDNS and if we get an error back, then do some hacky translation of those messages to make them clearer.

Sometimes users will still get errors from PowerDNS directly, but I added some logging of all the errors that users see, so hopefully I can review them and add extra translations if there are other common errors that come up.

I think what I learned from this is that if I’m building a user-facing application on top of an API, I need to be pretty thoughtful about how I resurface those errors to users.

challenge 5: setting up SQLite

Previously Mess With DNS was using a Postgres database. This was problematic because I only gave the Postgres machine 256MB of RAM, which meant that the database got OOM killed almost every single day. I never really worked out exactly why it got OOM killed every day, but that’s how it was. I spent some time trying to tune Postgres’ memory usage by setting the max connections / work-mem / maintenance-work-mem and it helped a bit but didn’t solve the problem.

So for this refactor I decided to use SQLite instead, because the website doesn’t really get that much traffic. There are some choices involved with using SQLite, and I decided to:

  1. Run db.SetMaxOpenConns(1) to make sure that we only open 1 connection to the database at a time, to prevent SQLITE_BUSY errors from two threads trying to access the database at the same time (just setting WAL mode didn’t work)
  2. Use separate databases for each of the 3 tables (users, records, and requests) to reduce contention. This maybe isn’t really necessary, but there was no reason I needed the tables to be in the same database so I figured I’d set up separate databases to be safe.
  3. Use the cgo-free modernc.org/sqlite, which translates SQLite’s source code to Go. I might switch to a more “normal” sqlite implementation instead at some point and use cgo though. I think the main reason I prefer to avoid cgo is that cgo has landed me with difficult-to-debug errors in the past.
  4. use WAL mode

I still haven’t set up backups, though I don’t think my Postgres database had backups either. I think I’m unlikely to use litestream for backups – Mess With DNS is very far from a critical application, and I think daily backups that I could recover from in case of a disaster are more than good enough.

challenge 6: upgrading Vue & managing forms

This has nothing to do with PowerDNS but I decided to upgrade Vue.js from version 2 to 3 as part of this refresh. The main problem with that is that the form validation library I was using (FormKit) completely changed its API between Vue 2 and Vue 3, so I decided to just stop using it instead of learning the new API.

I ended up switching to some form validation tools that are built into the browser like required and oninvalid (here’s the code). I think it could use some improvement – I still don’t understand forms very well.

challenge 7: managing state in the frontend

This also has nothing to do with PowerDNS, but when modifying the frontend I realized that my state management was a mess – everywhere I made an API request that modified state, I had to remember to add a “refresh records” call afterwards, and I wasn’t always consistent about it.

With some more advice from Marco, I ended up implementing a single global state management store which stores all the state for the application, and which lets me create/update/delete records.

Then my components can just call store.createRecord(record), and the store will automatically resynchronize all of the state as needed.

challenge 8: sequencing the project

This project ended up having several steps because I reworked the whole integration between the frontend and the backend. I ended up splitting it into a few different phases:

  1. Upgrade Vue from v2 to v3
  2. Make the state management store
  3. Implement a different backend API, move a lot of DNS logic out of the frontend, and add tests for the backend
  4. Integrate PowerDNS

I made sure that the website was (more or less) 100% working and then deployed it in between phases, so that the amount of changes I was managing at a time stayed somewhat under control.

the new website is up now!

I released the upgraded website a few days ago and it seems to work! The PowerDNS API has been great to work on top of, and I’m relieved that there’s a whole class of problems that I now don’t have to think about at all, other than potentially trying to make the error messages from PowerDNS a little clearer. Using PowerDNS has fixed a lot of the DNS issues that folks have reported in the last few years and it feels great.

If you run into problems with the new Mess With DNS I’d love to hear about them here.

2024-08-06T08:38:35+00:00 Fullscreen Open in Tab
Go structs are copied on assignment (and other things about Go I'd missed)

I’ve been writing Go pretty casually for years – the backends for all of my playgrounds (nginx, dns, memory, more DNS) are written in Go, but many of those projects are just a few hundred lines and I don’t come back to those codebases much.

I thought I more or less understood the basics of the language, but this week I’ve been writing a lot more Go than usual while working on some upgrades to Mess with DNS, and ran into a bug that revealed I was missing a very basic concept!

Then I posted about this on Mastodon and someone linked me to this very cool site (and book) called 100 Go Mistakes and How To Avoid Them by Teiva Harsanyi. It just came out in 2022 so it’s relatively new.

I decided to read through the site to see what else I was missing, and found a couple of other misconceptions I had about Go. I’ll talk about some of the mistakes that jumped out to me the most, but really the whole 100 Go Mistakes site is great and I’d recommend reading it.

Here’s the initial mistake that started me on this journey:

mistake 1: not understanding that structs are copied on assignment

Let’s say we have a struct:

type Thing struct {
    Name string
}

and this code:

thing := Thing{"record"}
other_thing := thing
other_thing.Name = "banana"
fmt.Println(thing)

This prints “record” and not “banana” (play.go.dev link), because thing is copied when you assign it to other_thing.

the problem this caused me: ranges

The bug I spent 2 hours of my life debugging last week was effectively this code (play.go.dev link):

type Thing struct {
  Name string
}
func findThing(things []Thing, name string) *Thing {
  for _, thing := range things {
    if thing.Name == name {
      return &thing
    }
  }
  return nil
}

func main() {
  things := []Thing{Thing{"record"}, Thing{"banana"}}
  thing := findThing(things, "record")
  thing.Name = "gramaphone"
  fmt.Println(things)
}

This prints out [{record} {banana}] – because findThing returned a copy, we didn’t change the name in the original array.

This mistake is #30 in 100 Go Mistakes.

I fixed the bug by changing it to something like this (play.go.dev link), which returns a reference to the item in the array we’re looking for instead of a copy.

func findThing(things []Thing, name string) *Thing {
  for i := range things {
    if things[i].Name == name {
      return &things[i]
    }
  }
  return nil
}

why didn’t I realize this?

When I learned that I was mistaken about how assignment worked in Go I was really taken aback, like – it’s such a basic fact about how the language works! If I was wrong about that then what ELSE am I wrong about in Go????

My best guess for what happened is:

  1. I’ve heard for my whole life that when you define a function, you need to think about whether its arguments are passed by reference or by value
  2. So I’d thought about this in Go, and I knew that if you pass a struct as a value to a function, it gets copied – if you want to pass a reference then you have to pass a pointer
  3. But somehow it never occurred to me that you need to think about the same thing for assignments, perhaps because in most of the other languages I use (Python, JS, Java) I think everything is a reference anyway. Except for Rust, where you do have values that you make copies of, but most of the time I had to run .clone() explicitly. (though apparently structs will be automatically copied on assignment if the struct implements the Copy trait)
  4. Also obviously I just don’t write that much Go so I guess it’s never come up.

mistake 2: side effects appending slices (#25)

When you subset a slice with x[2:3], the original slice and the sub-slice share the same backing array, so if you append to the new slice, it can unintentionally change the old slice:

For example, this code prints [1 2 3 555 5] (code on play.go.dev)

x := []int{1, 2, 3, 4, 5}
y := x[2:3]
y = append(y, 555)
fmt.Println(x)

I don’t think this has ever actually happened to me, but it’s alarming and I’m very happy to know about it.

Apparently you can avoid this problem by changing y := x[2:3] to y := x[2:3:3], which restricts the new slice’s capacity so that appending to it will re-allocate a new slice. Here’s some code on play.go.dev that does that.

mistake 3: not understanding the different types of method receivers (#42)

This one isn’t a “mistake” exactly, but it’s been a source of confusion for me and it’s pretty simple so I’m glad to have it cleared up.

In Go you can declare methods in 2 different ways:

  1. func (t Thing) Function() (a “value receiver”)
  2. func (t *Thing) Function() (a “pointer receiver”)

My understanding now is that basically:

  • If you want the method to mutate the struct t, you need a pointer receiver.
  • If you want to make sure the method doesn’t mutate the struct t, use a value receiver.
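Here’s a small example of mine showing the difference:

package main

import "fmt"

type Thing struct{ Name string }

// value receiver: t is a copy, so the caller's struct is left unchanged
func (t Thing) RenameValue(name string) { t.Name = name }

// pointer receiver: t points at the caller's struct, so this mutates it
func (t *Thing) RenamePointer(name string) { t.Name = name }

func main() {
	thing := Thing{"record"}
	thing.RenameValue("banana")
	fmt.Println(thing.Name) // still "record"
	thing.RenamePointer("banana")
	fmt.Println(thing.Name) // now "banana"
}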

Explanation #42 has a bunch of other interesting details though. There’s definitely still something I’m missing about value vs pointer receivers (I got a compile error related to them a couple of times in the last week that I still don’t understand), but hopefully I’ll run into that error again soon and I can figure it out.

more interesting things I noticed

Some more notes from 100 Go Mistakes:

Also there are some things that have tripped me up in the past, like:

this “100 common mistakes” format is great

I really appreciated this “100 common mistakes” format – it made it really easy for me to skim through the mistakes and very quickly mentally classify them into:

  1. yep, I know that
  2. not interested in that one right now
  3. WOW WAIT I DID NOT KNOW THAT, THAT IS VERY USEFUL!!!!

It looks like “100 Common Mistakes” is a series of books from Manning and they also have “100 Java Mistakes” and an upcoming “100 SQL Server Mistakes”.

Also I enjoyed what I’ve read of Effective Python by Brett Slatkin, which has a similar “here are a bunch of short Python style tips” structure where you can quickly skim it and take what’s useful to you. There’s also Effective C++, Effective Java, and probably more.

some other Go resources

other resources I’ve appreciated:

2024-07-21T12:54:40-07:00 Fullscreen Open in Tab
My IETF 120 Agenda

Here's where you can find me at IETF 120 in Vancouver!

Monday

  • 9:30 - 11:30 • alldispatch • Regency C/D
  • 13:00 - 15:00 • oauth • Plaza B
  • 18:30 - 19:30 • Hackdemo Happy Hour • Regency Hallway

Tuesday

  • 15:30 - 17:00 • oauth • Georgia A
  • 17:30 - 18:30 • oauth • Plaza B

Wednesday

  • 9:30 - 11:30 • wimse • Georgia A
  • 11:45 - 12:45 • Chairs Forum • Regency C/D
  • 17:30 - 19:30 • IETF Plenary • Regency A/B/C/D

Thursday

  • 17:00 - 18:00 • spice • Regency A/B
  • 18:30 - 19:30 • spice • Regency A/B

Friday

  • 13:00 - 15:00 • oauth • Regency A/B

My Current Drafts

2024-07-08T13:00:15+00:00 Fullscreen Open in Tab
Entering text in the terminal is complicated

The other day I asked what folks on Mastodon find confusing about working in the terminal, and one thing that stood out to me was “editing a command you already typed in”.

This really resonated with me: even though entering some text and editing it is a very “basic” task, it took me maybe 15 years of using the terminal every single day to get used to using Ctrl+A to go to the beginning of the line (or Ctrl+E for the end – before that I used Home/End instead).

So let’s talk about why entering text might be hard! I’ll also share a few tips that I wish I’d learned earlier.

it’s very inconsistent between programs

A big part of what makes entering text in the terminal hard is the inconsistency between how different programs handle entering text. For example:

  1. some programs (cat, nc, git commit --interactive, etc) don’t support using arrow keys at all: if you press arrow keys, you’ll just see ^[[D^[[D^[[C^[[C
  2. many programs (like irb, python3 on a Linux machine and many many more) use the readline library, which gives you a lot of basic functionality (history, arrow keys, etc)
  3. some programs (like /usr/bin/python3 on my Mac) do support very basic features like arrow keys, but not other features like Ctrl+left or reverse searching with Ctrl+R
  4. some programs (like the fish shell or ipython3 or micro or vim) have their own fancy system for accepting input which is totally custom

So there’s a lot of variation! Let’s talk about each of those a little more.

mode 1: the baseline

First, there’s “the baseline” – what happens if a program just accepts text by calling fgets() or whatever and does absolutely nothing else to provide a nicer experience. Here’s what using these tools typically looks like for me – if I start the version of dash installed on my machine (a pretty minimal shell) and press the left arrow key, it just prints ^[[D to the terminal.

$ ls l-^[[D^[[D^[[D

At first it doesn’t seem like all of these “baseline” tools have much in common, but there are actually a few features that you get for free just from your terminal, without the program needing to do anything special at all.

The things you get for free are:

  1. typing in text, obviously
  2. backspace
  3. Ctrl+W, to delete the previous word
  4. Ctrl+U, to delete the whole line
  5. a few other things unrelated to text editing (like Ctrl+C to interrupt the process, Ctrl+Z to suspend, etc)

This is not great, but it means that if you want to delete a word you generally can do it with Ctrl+W instead of pressing backspace 15 times, even if you’re in an environment which is offering you absolutely zero features.
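
To see that these freebies come from the terminal and not the program, here’s a minimal Go sketch of a “baseline” reader – it does nothing special at all, and yet backspace, Ctrl+W, and Ctrl+U all still work while you’re typing a line:

package main

import (
  "bufio"
  "fmt"
  "os"
)

func main() {
  // no input library here: just read lines from stdin. Any line editing
  // you get (backspace, Ctrl+W, Ctrl+U) happens in the terminal driver,
  // before the text ever reaches this program.
  scanner := bufio.NewScanner(os.Stdin)
  fmt.Print("> ")
  for scanner.Scan() {
    fmt.Printf("you typed: %q\n", scanner.Text())
    fmt.Print("> ")
  }
}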

You can get a list of all the ctrl codes that your terminal supports with stty -a.
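
On my Linux machine the relevant part of the output looks something like this (note kill = ^U and werase = ^W):

$ stty -a
speed 38400 baud; rows 50; columns 160; line = 0;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>;
eol2 = <undef>; swtch = <undef>; start = ^Q; stop = ^S; susp = ^Z;
rprnt = ^R; werase = ^W; lnext = ^V; discard = ^O; min = 1; time = 0;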

mode 2: tools that use readline

The next group is tools that use readline! Readline is a GNU library to make entering text more pleasant, and it’s very widely used.

My favourite readline keyboard shortcuts are:

  1. Ctrl+E (or End) to go to the end of the line
  2. Ctrl+A (or Home) to go to the beginning of the line
  3. Ctrl+left/right arrow to go back/forward 1 word
  4. up arrow to go back to the previous command
  5. Ctrl+R to search your history

And you can use Ctrl+W / Ctrl+U from the “baseline” list, though Ctrl+U deletes from the cursor to the beginning of the line instead of deleting the whole line. I think Ctrl+W might also have a slightly different definition of what a “word” is.

There are a lot more (here’s a full list), but those are the only ones that I personally use.

The bash shell is probably the most famous readline user (when you use Ctrl+R to search your history in bash, that feature actually comes from readline), but there are TONS of programs that use it – for example psql, irb, python3, etc.

tip: you can make ANYTHING use readline with rlwrap

One of my absolute favourite things is that if you have a program like nc without readline support, you can just run rlwrap nc to turn it into a program with readline support!

This is incredible and makes a lot of tools that are borderline unusable MUCH more pleasant to use. You can even apparently set up rlwrap to include your own custom autocompletions, though I’ve never tried that.

some reasons tools might not use readline

Some reasons tools might not use readline include:

  • the program is very simple (like cat or nc) and maybe the maintainers don’t want to bring in a relatively large dependency
  • license reasons, if the program’s license is not GPL-compatible – readline is GPL-licensed, not LGPL
  • only a very small part of the program is interactive, and maybe readline support isn’t seen as important. For example git has a few interactive features (like git add -p), but not very many, and usually you’re just typing a single character like y or n – when you really need to type something significant in git, it’ll drop you into a text editor instead.

For example idris2 says they don’t use readline to keep dependencies minimal and suggest using rlwrap to get better interactive features.

how to know if you’re using readline

The simplest test I can think of is to press Ctrl+R, and if you see:

(reverse-i-search)`':

then you’re probably using readline. This obviously isn’t a guarantee (some other library could use the term reverse-i-search too!), but I don’t know of another system that uses that specific term to refer to searching history.

the readline keybindings come from Emacs

Because I’m a vim user, it took me a very long time to understand where these keybindings come from (why Ctrl+A to go to the beginning of a line??? so weird!).

My understanding is that these keybindings actually come from Emacs – Ctrl+A and Ctrl+E do the same thing in Emacs as they do in readline, and I assume the other keyboard shortcuts mostly do as well. (Though I tried out Ctrl+W and Ctrl+U in Emacs and they don’t do the same thing as they do in the terminal, so I guess there are some differences.)

There’s some more history of the Readline project here.

mode 3: another input library (like libedit)

On my Mac laptop, /usr/bin/python3 is in a weird middle ground where it supports some readline features (for example the arrow keys), but not the other ones. For example when I press Ctrl+left arrow, it prints out ;5D, like this:

$ python3
>>> import subprocess;5D

Folks on Mastodon helped me figure out that this is because in the default Python install on Mac OS, the Python readline module is actually backed by libedit, a similar library with fewer features, presumably because readline is GPL-licensed.

Here’s how I was eventually able to figure out that Python was using libedit on my system:

$ python3 -c "import readline; print(readline.__doc__)"
Importing this module enables command line editing using libedit readline.

Generally, though, Python uses readline if you install it on Linux or through Homebrew – it’s just that the specific version that Apple includes on their systems doesn’t have readline. Also, Python 3.13 is going to remove the readline dependency in favour of a custom library, so “Python uses readline” won’t be true in the future.

I assume that there are more programs on my Mac that use libedit but I haven’t looked into it.

mode 4: something custom

The last group of programs is programs that have their own custom (and sometimes much fancier!) system for editing text. This includes:

  • most terminal text editors (nano, micro, vim, emacs, etc)
  • some shells (like fish), for example it seems like fish supports Ctrl+Z for undo when typing in a command. Zsh’s line editor is called zle.
  • some REPLs (like ipython), for example IPython uses the prompt_toolkit library instead of readline
  • lots of other programs (like atuin)

Some features you might see are:

  • better autocomplete which is more customized to the tool
  • nicer history management (for example with syntax highlighting) than the default you get from readline
  • more keyboard shortcuts

custom input systems are often readline-inspired

I went looking at how Atuin (a wonderful tool for searching your shell history that I started using recently) handles text input. Looking at the code and some of the discussion around it, their implementation is custom but it’s inspired by readline, which makes sense to me – a lot of users are used to those keybindings, and it’s convenient for them to work even though atuin doesn’t use readline.

prompt_toolkit (the library IPython uses) is similar – it actually supports a lot of options (including vi-like keybindings), but the default is to support the readline-style keybindings.

This is like how you see a lot of programs which support very basic vim keybindings (like j for down and k for up). For example Fastmail supports j and k even though most of its other keybindings don’t have much relationship to vim.

I assume that most “readline-inspired” custom input systems have various subtle incompatibilities with readline, but this doesn’t really bother me at all personally because I’m extremely ignorant of most of readline’s features. I only use maybe 5 keyboard shortcuts, so as long as they support the 5 basic commands I know (which they always do!) I feel pretty comfortable. And usually these custom systems have much better autocomplete than you’d get from just using readline, so generally I prefer them over readline.
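
As a small illustration: Go’s golang.org/x/term package ships a minimal line editor of exactly this kind – it handles arrow keys, up-arrow history, and a few readline-style shortcuts like Ctrl+A/Ctrl+E itself, with no readline dependency. Here’s a rough sketch (my reading of that package’s API, not code from any of the tools above) of a custom prompt built on it:

package main

import (
  "fmt"
  "io"
  "os"

  "golang.org/x/term"
)

func main() {
  fd := int(os.Stdin.Fd())
  // raw mode: keystrokes go straight to the program instead of being
  // line-buffered (and edited) by the terminal driver
  oldState, err := term.MakeRaw(fd)
  if err != nil {
    panic(err)
  }
  defer term.Restore(fd, oldState)

  // term.Terminal implements its own small readline-inspired line editor
  screen := struct {
    io.Reader
    io.Writer
  }{os.Stdin, os.Stdout}
  t := term.NewTerminal(screen, "> ")

  for {
    line, err := t.ReadLine()
    if err != nil { // io.EOF when you press Ctrl+D on an empty line
      return
    }
    fmt.Fprintf(t, "you typed: %q\r\n", line)
  }
}

I assume this is roughly the shape of what atuin and prompt_toolkit do internally, just with many more features layered on top.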

lots of shells support vi keybindings

Bash, zsh, and fish all have a “vi mode” for entering text. In a very unscientific poll I ran on Mastodon, 12% of people said they use it, so it seems pretty popular.

Readline also has a “vi mode” (which is how Bash’s support for it works), so by extension lots of other programs have it too.

I’ve always thought that vi mode seems really cool, but for some reason even though I’m a vim user it’s never stuck for me.

understanding what situation you’re in really helps

I’ve spent a lot of my life being confused about why a command line application I was using wasn’t behaving the way I wanted, and it feels good to be able to more or less understand what’s going on.

I think this is roughly my mental flowchart when I’m entering text at a command line prompt:

  1. Do the arrow keys not work? Probably there’s no input system at all, but at least I can use Ctrl+W and Ctrl+U, and I can rlwrap the tool if I want more features.
  2. Does Ctrl+R print reverse-i-search? Probably it’s readline, so I can use all of the readline shortcuts I’m used to, and I know I can get some basic history and press up arrow to get the previous command.
  3. Does Ctrl+R do something else? This is probably some custom input library: it’ll probably act more or less like readline, and I can check the documentation if I really want to know how it works.

Being able to diagnose what’s going on like this makes the command line feel more predictable and less chaotic.

some things this post left out

There are lots more complications related to entering text that we didn’t talk about at all here, like:

  • issues related to ssh / tmux / etc
  • the TERM environment variable
  • how different terminals (gnome terminal, iTerm, xterm, etc) have different kinds of support for copying/pasting text
  • unicode
  • probably a lot more
2024-07-03T08:00:20+00:00 Fullscreen Open in Tab
Reasons to use your shell's job control

Hello! Today someone on Mastodon asked about job control (fg, bg, Ctrl+z, wait, etc). It made me think about how I don’t use my shell’s job control interactively very often: usually I prefer to just open a new terminal tab if I want to run multiple terminal programs, or use tmux if it’s over ssh. But I was curious about whether other people used job control more often than me.

So I asked on Mastodon for reasons people use job control. There were a lot of great responses, and it even made me want to consider using job control a little more!

In this post I’m only going to talk about using job control interactively (not in scripts) – the post is already long enough just talking about interactive use.

what’s job control?

First: what’s job control? Well – in a terminal, your processes can be in one of 3 states:

  1. in the foreground. This is the normal state when you start a process.
  2. in the background. This is what happens when you run some_process &: the process is still running, but you can’t interact with it anymore unless you bring it back to the foreground.
  3. stopped. This is what happens when you start a process and then press Ctrl+Z. This pauses the process: it won’t keep using the CPU, but you can restart it if you want.

“Job control” is a set of commands for seeing which processes are running in a terminal and moving processes between these 3 states.

how to use job control

  • fg brings a process to the foreground. It works on both stopped processes and background processes. For example, if you start a background process with cat < /dev/zero &, you can bring it back to the foreground by running fg
  • bg restarts a stopped process and puts it in the background.
  • Pressing Ctrl+z stops the current foreground process.
  • jobs lists all processes that are active in your terminal
  • kill sends a signal (like SIGKILL) to a job (this is the shell builtin kill, not /bin/kill)
  • disown removes the job from the list of running jobs, so that it doesn’t get killed when you close the terminal
  • wait waits for all background processes to complete. I only use this in scripts though.
  • apparently in bash/zsh you can also just type %2 instead of fg %2

I might have forgotten some other job control commands but I think those are all the ones I’ve ever used.

You can also give fg or bg a specific job to foreground/background. For example if I see this in the output of jobs:

$ jobs
Job Group State   Command
1   3161  running cat < /dev/zero &
2   3264  stopped nvim -w ~/.vimkeys $argv

then I can foreground nvim with fg %2. You can also kill it with kill -9 %2, or just kill %2 if you want to be more gentle.

how is kill %2 implemented?

I was curious about how kill %2 works – does %2 just get replaced with the PID of the relevant process when you run the command, the way environment variables are? Some quick experimentation shows that it isn’t:

$ echo kill %2
kill %2
$ type kill
kill is a function with definition
# Defined in /nix/store/vicfrai6lhnl8xw6azq5dzaizx56gw4m-fish-3.7.0/share/fish/config.fish

So kill is a fish builtin that knows how to interpret %2. Looking at the source code (which is very easy in fish!), it uses jobs -p %2 to expand %2 into a PID, and then runs the regular kill command.
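
Here’s a rough Go sketch of that same logic for a Unix system – the job table is made up for illustration, and a real shell tracks process groups rather than single PIDs:

package main

import (
  "fmt"
  "strconv"
  "strings"
  "syscall"
)

// jobTable stands in for the shell's real bookkeeping:
// job numbers mapped to PIDs (these values are made up)
var jobTable = map[int]int{1: 3161, 2: 3264}

// killJob interprets an argument like "%2": look the job number up
// in the table, then send the signal with the kill(2) system call
func killJob(arg string, sig syscall.Signal) error {
  numStr := strings.TrimPrefix(arg, "%")
  if numStr == arg {
    return fmt.Errorf("not a job spec: %s", arg)
  }
  jobNum, err := strconv.Atoi(numStr)
  if err != nil {
    return err
  }
  pid, ok := jobTable[jobNum]
  if !ok {
    return fmt.Errorf("no such job: %s", arg)
  }
  return syscall.Kill(pid, sig)
}

func main() {
  if err := killJob("%2", syscall.SIGTERM); err != nil {
    fmt.Println(err)
  }
}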

on differences between shells

Job control is implemented by your shell. I use fish, but my sense is that the basics of job control work pretty similarly in bash, fish, and zsh.

There are definitely some shells which don’t have job control at all, but I’ve only used bash/fish/zsh so I don’t know much about that.

Now let’s get into a few reasons people use job control!

reason 1: kill a command that’s not responding to Ctrl+C

I run into processes that don’t respond to Ctrl+C pretty regularly, and it’s always a little annoying – I usually switch terminal tabs to find and kill the process. A bunch of people pointed out that you can do this in a faster way using job control!

How to do this: Press Ctrl+Z, then kill %1 (or the appropriate job number if there’s more than one stopped/background job, which you can get from jobs). You can also kill -9 if it’s really not responding.

reason 2: background a GUI app so it’s not using up a terminal tab

Sometimes I start a GUI program from the command line (for example with wireshark some_file.pcap), forget to start it in the background, and don’t want it eating up my terminal tab.

How to do this:

  • move the GUI program to the background by pressing Ctrl+Z and then running bg.
  • you can also run disown to remove it from the list of jobs, to make sure that the GUI program won’t get closed when you close your terminal tab.

Personally I try to avoid starting GUI programs from the terminal if possible because I don’t like how their stdout pollutes my terminal (on a Mac I use open -a Wireshark instead because I find it works better), but sometimes you don’t have another choice.

reason 2.5: accidentally started a long-running job without tmux

This is basically the same as the GUI app thing – you can move the job to the background and disown it.

I was also curious about if there are ways to redirect a process’s output to a file after it’s already started. A quick search turned up this Linux-only tool which is based on nelhage’s reptyr (which lets you for example move a process that you started outside of tmux to tmux) but I haven’t tried either of those.

reason 3: running a command while using vim

A lot of people mentioned that if they want to quickly test something while editing code in vim or another terminal editor, they like to use Ctrl+Z to stop vim, run the command, and then run fg to go back to their editor.

You can also use this to check the output of a command that you ran before starting vim.

I’ve never gotten in the habit of this, probably because I mostly use a GUI version of vim. I feel like I’d also be likely to switch terminal tabs and end up wondering “wait… where did I put my editor???” and have to go searching for it.

reason 4: preferring interleaved output

A few people said that they prefer to have the output of all of their commands interleaved in the terminal. This really surprised me because I usually think of having the output of lots of different commands interleaved as being a bad thing, but one person said that they like to do this with tcpdump specifically and I think that actually sounds extremely useful. Here’s what it looks like:

# start tcpdump
$ sudo tcpdump -ni any port 1234 &
tcpdump: data link type PKTAP
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type PKTAP (Apple DLT_PKTAP), snapshot length 524288 bytes

# run curl
$ curl google.com:1234
13:13:29.881018 IP 192.168.1.173.49626 > 142.251.41.78.1234: Flags [S], seq 613574185, win 65535, options [mss 1460,nop,wscale 6,nop,nop,TS val 2730440518 ecr 0,sackOK,eol], length 0
13:13:30.881963 IP 192.168.1.173.49626 > 142.251.41.78.1234: Flags [S], seq 613574185, win 65535, options [mss 1460,nop,wscale 6,nop,nop,TS val 2730441519 ecr 0,sackOK,eol], length 0
13:13:31.882587 IP 192.168.1.173.49626 > 142.251.41.78.1234: Flags [S], seq 613574185, win 65535, options [mss 1460,nop,wscale 6,nop,nop,TS val 2730442520 ecr 0,sackOK,eol], length 0
 
# when you're done, kill the tcpdump in the background
$ kill %1 

I think it’s really nice here that you can see the output of tcpdump inline in your terminal – when I’m using tcpdump I’m always switching back and forth and I always get confused trying to match up the timestamps, so keeping everything in one terminal seems like it might be a lot clearer. I’m going to try it.

reason 5: suspend a CPU-hungry program

One person said that sometimes they’re running a very CPU-intensive program, for example converting a video with ffmpeg, and they need to use the CPU for something else, but don’t want to lose the work that ffmpeg already did.

You can do this by pressing Ctrl+Z to pause the process, and then running fg when you want to start it again.

reason 6: you accidentally ran Ctrl+Z

Many people replied that they didn’t use job control intentionally, but that they sometimes accidentally ran Ctrl+Z, which stopped whatever program was running, so they needed to learn how to use fg to bring it back to the foreground.

There were also some mentions of accidentally running Ctrl+S (which stops your terminal and I think can be undone with Ctrl+Q). My terminal totally ignores Ctrl+S so I guess I’m safe from that one though.

reason 7: already set up a bunch of environment variables

Some folks mentioned that they already set up a bunch of environment variables that they need to run various commands, so it’s easier to use job control to run multiple commands in the same terminal than to redo that work in another tab.

reason 8: it’s your only option

Probably the most obvious reason to use job control to manage multiple processes is “because you have to” – maybe you’re in single-user mode, or on a very restricted computer, or SSH’d into a machine that doesn’t have tmux or screen and you don’t want to create multiple SSH sessions.

reason 9: some people just like it better

Some people also said that they just don’t like using terminal tabs: for instance a few folks mentioned that they prefer to be able to see all of their terminals on the screen at the same time, so they’d rather have 4 terminals on the screen and then use job control if they need to run more than 4 programs.

I learned a few new tricks!

I think my two main takeaways from this post are that I’ll probably try out job control a little more for:

  1. killing processes that don’t respond to Ctrl+C
  2. running tcpdump in the background with whatever network command I’m running, so I can see both of their output in the same place
2024-06-03T09:45:11+00:00 Fullscreen Open in Tab
New zine: How Git Works!

Hello! I’ve been writing about git on here nonstop for months, and the git zine is FINALLY done! It came out on Friday!

You can get it for $12 here: https://wizardzines.com/zines/git, or get a 14-pack of all my zines here.

Here’s the cover:

the table of contents

Here’s the table of contents:

who is this zine for?

I wrote this zine for people who have been using git for years and are still afraid of it. As always – I think it sucks to be afraid of the tools that you use in your work every day! I want folks to feel confident using git.

My goals are:

  • To explain how some parts of git that initially seem scary (like “detached HEAD state”) are pretty straightforward to deal with once you understand what’s going on
  • To show some parts of git you probably should be careful around. For example, the stash is one of the places in git where it’s easiest to lose your work in a way that’s incredibly annoying to recover from, and I avoid using it heavily because of that.
  • To clear up a few common misconceptions about how the core parts of git (like commits, branches, and merging) work

what’s the difference between this and Oh Shit, Git!

You might be wondering – Julia! You already have a zine about git! What’s going on? Oh Shit, Git! is a set of tricks for fixing git messes. “How Git Works” explains how Git actually works.

Also, Oh Shit, Git! is the amazing Katie Sylor-Miller’s concept: we made it into a zine because I was such a huge fan of her work on it.

I think they go really well together.

what’s so confusing about git, anyway?

This zine was really hard for me to write because when I started writing it, I’d been using git pretty confidently for 10 years. I had no real memory of what it was like to struggle with git.

But thanks to a huge amount of help from Marie as well as everyone who talked to me about git on Mastodon, eventually I was able to see that there are a lot of things about git that are counterintuitive, misleading, or just plain confusing. These include:

  • confusing terminology (for example “fast-forward”, “reference”, or “remote-tracking branch”)
  • misleading messages (for example how Your branch is up to date with 'origin/main' doesn’t necessarily mean that your branch is up to date with the main branch on the origin)
  • uninformative output (for example how I STILL can’t reliably figure out which code comes from which branch when I’m looking at a merge conflict)
  • a lack of guidance around handling diverged branches (for example how when you run git pull and your branch has diverged from the origin, it doesn’t give you great guidance on how to handle the situation)
  • inconsistent behaviour (for example how git’s reflogs are almost always append-only, EXCEPT for the stash, where git will delete entries when you run git stash drop)

The more I heard from people about how confusing they find git, the more it became clear that git really does not make it easy to figure out what its internal logic is just by using it.

handling git’s weirdnesses becomes pretty routine

The previous section made git sound really bad, like “how can anyone possibly use this thing?”.

But my experience is that after I learned what git actually means by all of its weird error messages, dealing with it became pretty routine! I’ll see an error: failed to push some refs to 'github.com:jvns/wizard-zines-site', realize “oh right, probably a coworker made some changes to main since I last ran git pull”, run git pull --rebase to incorporate their changes, and move on with my day. The whole thing takes about 10 seconds.

Or if I see a You are in 'detached HEAD' state warning, I’ll just make sure to run git checkout mybranch before continuing to write code. No big deal.

For me (and for a lot of folks I talk to about git!), dealing with git’s weird language can become so normal that you totally forget why anybody would even find it weird.

a little bit of internals

One of my biggest questions when writing this zine was how much to focus on what’s in the .git directory. We ended up deciding to include a couple of pages about internals (“inside .git”, pages 14-15), but otherwise focus more on git’s behaviour when you use it and why sometimes git behaves in unexpected ways.

This is partly because there are lots of great guides to git’s internals out there already (1, 2), and partly because I think even if you have read one of these guides to git’s internals, it isn’t totally obvious how to connect that information to what you actually see in git’s user interface.

For example: it’s easy to find documentation about remotes in git – for example this page says:

Remote-tracking branches […] remind you where the branches in your remote repositories were the last time you connected to them.

But even if you’ve read that, you might not realize that the statement Your branch is up to date with 'origin/main' in git status doesn’t necessarily mean that you’re actually up to date with the remote main branch.

So in general in the zine we focus on the behaviour you see in Git’s UI, and then explain how that relates to what’s happening internally in Git.

the cheat sheet

The zine also comes with a free printable cheat sheet: (click to get a PDF version)

it comes with an HTML transcript!

The zine also comes with an HTML transcript, to (hopefully) make it easier to read on a screen reader! Our Operations Manager, Lee, transcribed all of the pages and wrote image descriptions. I’d love feedback about the experience of reading the zine on a screen reader if you try it.

I really do love git

I’ve been pretty critical about git in this post, but I only write zines about technologies I love, and git is no exception.

Some reasons I love git:

  • it’s fast!
  • it’s backwards compatible! I learned how to use it 10 years ago and everything I learned then is still true
  • there’s tons of great free Git hosting available out there (GitHub! Gitlab! a million more!), so I can easily back up all my code
  • simple workflows are REALLY simple (if I’m working on a project on my own, I can just run git commit -am 'whatever' and git push over and over again and it works perfectly)
  • Almost every internal file in git is a pretty simple text file (or has a version which is a text file), which makes me feel like I can always understand exactly what’s going on under the hood if I want to.

I hope this zine helps some of you love it too.

people who helped with this zine

I don’t make these zines by myself!

I worked with Marie Claire LeBlanc Flanagan every morning for 8 months to write clear explanations of git.

The cover is by Vladimir Kašiković, Gersande La Flèche did copy editing, James Coglan (of the great Building Git) did technical review, our Operations Manager Lee did the transcription as well as a million other things, my partner Kamal read the zine and told me which parts were off (as he always does), and I had a million great conversations with Marco Rogers about git.

And finally, I want to thank all the beta readers! There were 66 this time which is a record! They left hundreds of comments about what was confusing, what they learned, and which of my jokes were funny. It’s always hard to hear from beta readers that a page I thought made sense is actually extremely confusing, and fixing those problems before the final version makes the zine so much better.

get the zine

Here are some links to get the zine again:

As always, you can get either a PDF version to print at home or a print version shipped to your house. The only caveat is print orders will ship in July – I need to wait for orders to come in to get an idea of how many I should print before sending it to the printer.

thank you

As always: if you’ve bought zines in the past, thank you for all your support over the years. And thanks to all of you (1000+ people!!!) who have already bought the zine in the first 3 days. It’s already set a record for most zines sold in a single day and I’ve been really blown away.

2024-05-12T07:39:30-07:00 Fullscreen Open in Tab
FedCM for IndieAuth

IndieWebCamp Düsseldorf took place this weekend, and I was inspired to work on a quick hack for demo day to show off a new feature I've been working on for IndieAuth.

Since I do actually use my website to log in to different websites on a regular basis, I am often presented with the login screen asking for my domain name, which is admittedly an annoying part of the process. I don't even like having to enter my email address when I log in to a site, and entering my domain isn't any better.

So instead, I'd like to get rid of this prompt, and let the browser handle it for you! Here's a quick video of logging in to a website using my domain with the new browser API:

So how does this work?

For the last couple of years, there has been an ongoing effort at the Federated Identity Community Group at the W3C to build a new API in browsers that can sit in the middle of login flows. It's primarily being driven by Google for their use case of letting websites show a Google login popup dialog without needing 3rd party cookies and doing so in a privacy-preserving way. There's a lot to unpack here, more than I want to go into in this blog post. You can check out Tim Cappalli's slides from the OAuth Security Workshop for a good explainer on the background and how it works.

However, there are a few experimental features that are being considered for the API to accommodate use cases beyond the "Sign in with Google" case. The one that's particularly interesting to the IndieAuth use case is the IdP Registration API. This API allows any website to register itself as an identity provider that can appear in the account chooser popup, so that a relying party website doesn't have to list out all the IdPs it supports, it can just say it supports "any" IdP. This maps to how IndieAuth is already used today, where a website can accept any user's IndieAuth server without any prior relationship with the user. For more background, check out my previous blog post "OAuth for the Open Web".

So now, with the IdP Registration API in FedCM, your website can tell your browser that it is an IdP, then when a website wants to log you in, it asks your browser to prompt you. You choose your account from the list, the negotiation happens behind the scenes, and you're logged in!

One of the nice things about combining FedCM with IndieAuth is it lends itself nicely to running the FedCM IdP as a separate service from your actual website. I could run an IndieAuth IdP service that you could sign up for and link your website to. Since your identity is your website, your website would be the thing ultimately sent to the relying party that you're signing in to, even though it was brokered through the IdP service. Ultimately this means much faster adoption is possible, since all it takes to turn your website into a FedCM-supported site is adding a single <link> tag to your home page.

So if this sounds interesting to you, leave a comment below! The IdP registration API is currently an early experiment, and Google needs to see actual interest in it in order to keep it around! In particular, they are looking for Relying Parties who would be interested in actually using this to log users in. I am planning on launching this on webmention.io as an experiment. If you have a website where users can sign in with IndieAuth, feel free to get in touch and I'd be happy to help you set up FedCM support as well!

2024-05-02T15:06:00-07:00 Fullscreen Open in Tab
OAuth for Browser-Based Apps Working Group Last Call!

The draft specification OAuth for Browser-Based Applications has just entered Working Group Last Call!

https://datatracker.ietf.org/doc/html/draft-ietf-oauth-browser-based-apps

This begins a two-week period to collect final comments on the draft. Please review the draft and reply on the OAuth mailing list if you have any comments or concerns. And if you've reviewed the document and are happy with the current state, it is also extremely helpful if you can reply on the list to just say "looks good to me"!

If joining the mailing list is too much work, you're also welcome to comment on the Last Call issue on GitHub.

In case you were wondering, yes your comments matter! Even just a small indication of support goes a long way in these discussions!

I am extremely happy with how this draft has turned out, and would like to again give a huge thanks to Philippe De Ryck for the massive amount of work he's put in to the latest few versions to help get this over the finish line!

2024-04-10T12:43:14+00:00 Fullscreen Open in Tab
Notes on git's error messages

While writing about Git, I’ve noticed that a lot of folks struggle with Git’s error messages. I’ve had many years to get used to these error messages so it took me a really long time to understand why folks were confused, but having thought about it much more, I’ve realized that:

  1. sometimes I actually am confused by the error messages, I’m just used to being confused
  2. I have a bunch of strategies for getting more information when the error message git gives me isn’t very informative

So in this post, I’m going to go through a bunch of Git’s error messages, list a few things that I think are confusing about them for each one, and talk about what I do when I’m confused by the message.

improving error messages isn’t easy

Before we start, I want to say that trying to think about why these error messages are confusing has given me a lot of respect for how difficult maintaining Git is. I’ve been thinking about Git for months, and for some of these messages I really have no idea how to improve them.

Some things that seem hard to me about improving error messages:

  • if you come up with an idea for a new message, it’s hard to tell if it’s actually better!
  • work like improving error messages often isn’t funded
  • the error messages have to be translated (git’s error messages are translated into 19 languages!)

That said, if you find these messages confusing, hopefully some of these notes will help clarify them a bit.

error: git push on a diverged branch

$ git push
To github.com:jvns/int-exposed
! [rejected]        main -> main (non-fast-forward)
error: failed to push some refs to 'github.com:jvns/int-exposed'
hint: Updates were rejected because the tip of your current branch is behind
hint: its remote counterpart. Integrate the remote changes (e.g.
hint: 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

$ git status
On branch main
Your branch and 'origin/main' have diverged,
and have 2 and 1 different commits each, respectively.

Some things I find confusing about this:

  1. You get the exact same error message whether the branch is just behind or the branch has diverged. There’s no way to tell which it is from this message: you need to run git status or git pull to find out.
  2. It says failed to push some refs, but it’s not totally clear which references it failed to push. I believe everything that failed to push is listed with ! [rejected] on the previous line – in this case just the main branch.

What I like to do if I’m confused:

  • I’ll run git status to figure out what the state of my current branch is.
  • I think I almost never try to push more than one branch at a time, so I usually totally ignore git’s notes about which specific branch failed to push – I just assume that it’s my current branch

error: git pull on a diverged branch

$ git pull
hint: You have divergent branches and need to specify how to reconcile them.
hint: You can do so by running one of the following commands sometime before
hint: your next pull:
hint:
hint:   git config pull.rebase false  # merge
hint:   git config pull.rebase true   # rebase
hint:   git config pull.ff only       # fast-forward only
hint:
hint: You can replace "git config" with "git config --global" to set a default
hint: preference for all repositories. You can also pass --rebase, --no-rebase,
hint: or --ff-only on the command line to override the configured default per
hint: invocation.
fatal: Need to specify how to reconcile divergent branches.

The main thing I think is confusing here is that git is presenting you with a kind of overwhelming number of options: it’s saying that you can either:

  1. configure pull.rebase false, pull.rebase true, or pull.ff only locally
  2. or configure them globally
  3. or run git pull --rebase or git pull --no-rebase

It’s very hard to imagine how a beginner to git could easily use this hint to sort through all these options on their own.

If I were explaining this to a friend, I’d say something like “you can use git pull --rebase or git pull --no-rebase to resolve this with a rebase or merge right now, and if you want to set a permanent preference, you can do that with git config pull.rebase false or git config pull.rebase true.”

git config pull.ff only feels a little redundant to me because that’s git’s default behaviour anyway (though it wasn’t always).

What I like to do here:

  • run git status to see the state of my current branch
  • maybe run git log origin/main or git log to see what the diverged commits are
  • usually run git pull --rebase to resolve it
  • sometimes I’ll run git push --force or git reset --hard origin/main if I want to throw away my local work or remote work (for example because I accidentally committed to the wrong branch, or because I ran git commit --amend on a personal branch that only I’m using and want to force push)

error: git checkout asdf (a branch that doesn't exist)

$ git checkout asdf
error: pathspec 'asdf' did not match any file(s) known to git

This is a little weird because my intention was to check out a branch, but git checkout is complaining about a path that doesn’t exist.

This is happening because git checkout’s first argument can be either a branch or a path, and git has no way of knowing which one you intended. This seems tricky to improve, but I might expect something like “No such branch, commit, or path: asdf”.

What I like to do here:

  • in theory it would be good to use git switch instead, but I keep using git checkout anyway
  • generally I just remember that I need to decode this as “branch asdf doesn’t exist”

error: git switch asdf (a branch that doesn't exist)

$ git switch asdf
fatal: invalid reference: asdf

git switch only accepts a branch as an argument (unless you pass -d), so why is it saying invalid reference: asdf instead of invalid branch: asdf?

I think the reason is that internally, git switch is trying to be helpful in its error messages: if you run git switch v0.1 to switch to a tag, it’ll say:

$ git switch v0.1
fatal: a branch is expected, got tag 'v0.1'

So what git is trying to communicate with fatal: invalid reference: asdf is “asdf isn’t a branch, but it’s not a tag either, or any other reference”. From my various git polls my impression is that a lot of git users have literally no idea what a “reference” is in git, so I’m not sure if that’s coming across.

What I like to do here:

90% of the time when a git error message says reference I just mentally replace it with branch in my head.

error: git checkout HEAD^

$ git checkout HEAD^
Note: switching to 'HEAD^'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at 182cd3f add "swap byte order" button

This is a tough one. Definitely a lot of people are confused about this message, but obviously there's been a lot of effort to improve it too. I don't have anything smart to say about this one.

What I like to do here:

  • my shell prompt tells me if I’m in detached HEAD state, and generally I can remember not to make new commits while in that state
  • when I’m done looking at whatever old commits I wanted to look at, I’ll run git checkout main or something to go back to a branch

message: git status when a rebase is in progress

This isn’t an error message, but I still find it a little confusing on its own:

$ git status
interactive rebase in progress; onto c694cf8
Last command done (1 command done):
   pick 0a9964d wip
No commands remaining.
You are currently rebasing branch 'main' on 'c694cf8'.
  (fix conflicts and then run "git rebase --continue")
  (use "git rebase --skip" to skip this patch)
  (use "git rebase --abort" to check out the original branch)

Unmerged paths:
  (use "git restore --staged ..." to unstage)
  (use "git add ..." to mark resolution)
  both modified:   index.html

no changes added to commit (use "git add" and/or "git commit -a")

Two things I think could be clearer here:

  1. I think it would be nice if You are currently rebasing branch 'main' on 'c694cf8'. were on the first line instead of the 5th line – right now the first line doesn’t say which branch you’re rebasing.
  2. In this case, c694cf8 is actually origin/main, so I feel like You are currently rebasing branch 'main' on 'origin/main' might be even clearer.

What I like to do here:

My shell prompt includes the branch that I’m currently rebasing, so I rely on that instead of the output of git status.

error: git rebase when a file has been deleted

$ git rebase main
CONFLICT (modify/delete): index.html deleted in 0ce151e (wip) and modified in HEAD.  Version HEAD of index.html left in tree.
error: could not apply 0ce151e... wip

The thing I still find confusing about this is – index.html was modified in HEAD. But what is HEAD? Is it the commit I was working on when I started the merge/rebase, or is it the commit from the other branch? (The answer is “HEAD is your branch if you’re doing a merge, and it’s the other branch if you’re doing a rebase”, but I always find that hard to remember.)

I think I would personally find it easier to understand if the message listed the branch names if possible, something like this:

CONFLICT (modify/delete): index.html deleted on `main` and modified on `mybranch`

error: git status during a merge or rebase (who is "them"?)

$ git status 
On branch master
You have unmerged paths.
  (fix conflicts and run "git commit")
  (use "git merge --abort" to abort the merge)

Unmerged paths:
  (use "git add/rm <file>..." as appropriate to mark resolution)
        deleted by them: the_file

no changes added to commit (use "git add" and/or "git commit -a")

I find this one confusing in exactly the same way as the previous message: it says deleted by them:, but what “them” refers to depends on whether you did a merge or rebase or cherry-pick.

  • for a merge, them is the other branch you merged in
  • for a rebase, them is the branch that you were on when you ran git rebase
  • for a cherry-pick, I guess it’s the commit you cherry-picked

What I like to do if I’m confused:

  • try to remember what I did
  • run git show main --stat or something to see what I did on the main branch if I can’t remember

error: git clean

$ git clean
fatal: clean.requireForce defaults to true and neither -i, -n, nor -f given; refusing to clean

I just find it a bit confusing that you need to look up what -i, -n and -f are to be able to understand this error message. I’m personally way too lazy to do that so even though I’ve probably been using git clean for 10 years I still had no idea what -i stood for (interactive) until I was writing this down.

What I like to do if I’m confused:

Usually I just chaotically run git clean -f to delete all my untracked files and hope for the best, though I might actually switch to git clean -i now that I know what -i stands for. Seems a lot safer.

that’s all!

Hopefully some of this is helpful!

2024-04-01T07:37:00+00:00 Fullscreen Open in Tab
Making crochet cacti

I noticed some tech bloggers I follow have been making April Cools Day posts about topics they don’t normally write about (like decaf or microscopes). The goal isn’t to trick anyone, just to write about something different for a day.

I thought those posts were fun so here is a post with some notes on learning to crochet tiny cacti.

first, the cacti

I’ve been trying to do some non-computer hobbies, without putting a lot of pressure on myself to be “good” at them. Here are some cacti I crocheted:

They are a little wonky and I like them.

a couple of other critters

Here are a couple of other things I made: an elephant, an orange guy, a much earlier attempt at a cactus, and an in-progress cactus

Some of these are also pretty wonky, but sometimes it adds to the charm: for example the elephant’s head is attached at an angle which was not on purpose but I think adds to the effect. (orange guy pattern, elephant pattern)

I haven’t really been making clothing: I like working in a pretty chaotic way and I think you need to be a lot more careful when you make clothing so that it will actually fit.

the first project: a mouse

The first project I made was this little mouse. It took me a few hours (maybe 3 hours?) and I made a lot of mistakes and it definitely was not as cute as it was in the pictures in the pattern, but it was still good! I can’t find a picture right now though.

buying patterns is great

Originally I started out using free patterns, but I found some cacti patterns I really liked in an ebook called Knotmonsters: Cactus Gardens Edition, so I bought it.

I like the patterns in that book and also buying patterns seems like a nice way to support people who are making fun patterns. I found this guide to designing your own patterns through searching on Ravelry and it seems like a lot of work! Maybe I will do it one day but for now I appreciate the work of other people who make the patterns.

modifying patterns chaotically is great too

I’ve been modifying all of the patterns I make in a somewhat chaotic way, often just because I made a mistake somewhere along the way and then decide to move forward and change the pattern to adjust for the mistake instead of undoing my work. Some of the changes I’ve made are:

  • remove rows
  • put fewer stitches in a row
  • use a different stitch

This doesn’t always work but often it works well enough, and I think all of the mistakes help me learn.

no safety eyes

A lot of the patterns I’ve been seeing for animals suggest using “safety eyes” (plastic eyes). I didn’t really feel like buying those, so I’ve been embroidering eyes on instead. “Embroidering” might not be accurate, really I just sew some black yarn on in a haphazard way and hope it doesn’t come out looking too weird.

My crochet kit came with a big plastic yarn needle that I’ve been using to embroider.

no stitch markers

My crochet kit came with some plastic “stitch markers” which you can use to figure out where the beginning of your row is, so you know when you’re done. I’ve been finding it easier to just use a short piece of scrap yarn instead.

on dealing with all the counting

In crochet there is a LOT of counting. Like “single crochet 3 times, then double crochet 1 time, then repeat that 6 times”. I find it hard to do that accurately without making mistakes, and all of the counting is not that fun! A few things that have helped:

  • go back and look at my stitches to see what I did (“have I done 1 single crochet, or 2?”). I’m not actually very good at doing this, but I find it easier to see my stitches with wool/cotton yarn than with acrylic yarn for some reason.
  • count how many stitches in total I’ve done since the last row, and make sure it seems approximately right (“well, I’m supposed to have 20 stitches and I have 19, that’s pretty close!”). Then I’ll maybe just add an extra stitch in the wrong place to adjust, or maybe just leave it the way it is.

notes on yarn

So far I’ve tried three kinds of yarn: merino (for the elephant), cotton (for the cacti), and acrylic (for the orange dude). I still don’t know which one I like best, but since I’m doing small projects it feels like the right move is still to just buy small amounts of yarn and experiment. I think I like the cotton and merino more than the acrylic.

For the cacti I used Ricorumi cotton yarn, which comes in tiny balls (which is good for me because if I don’t end up liking it, I don’t have a lot of extra!) and in a lot of different colours.

There are a lot of yarn weights (lace! sock! sport! DK! worsted! bulky! and more!). I don’t really understand them yet but I think so far I’ve been mostly using DK and worsted yarn.

hook size? who knows!

I’ve mostly been using a 3.5mm hook, probably because I read a tutorial that said to use a 3.5mm hook. It seems to work fine! I used a larger hook size when making a hat, and that also worked.

I still don’t really know how to choose hook sizes but that doesn’t seem to have a lot of consequences when making cacti.

every stitch I’ve learned

I think I’ve probably only learned how to do a handful of things in crochet so far:

  • magic ring (mr)
  • single crochet (sc)
  • half double crochet (hdc)
  • front post half double crochet (fphdc)
  • double crochet (dc)
  • back loops only/front loops only (flo/blo)
  • increase/decrease

The way I’ve been approaching learning new crochet stitches is:

  1. find a pattern I want to make
  2. start it without reviewing it very much at all
  3. when I get to a stitch I don’t know, watch youtube videos
  4. don’t watch it very carefully and get it wrong
  5. eventually realize that it doesn’t look right at all, rewatch the video, and continue

I’ve been using Sarah Maker’s pages a lot, except for the magic ring where I used this 3-minute youtube video.

The magic ring took me a very long time to learn to do correctly, I didn’t pay attention very closely to the 3-minute youtube video so I did it wrong in maybe 4 projects before I figured out how to do it right.

every single thing I’ve bought

So far I’ve only needed:

  1. a crochet kit (which I got as a gift). it came with yarn, a bunch of crochet needles in different sizes, big sewing needles, and some other things I haven’t needed yet.
  2. some Ricorumi cotton (for the cacti)
  3. 1 ball of gray yarn (for the elephant)

I’ve been trying to not buy too much stuff, because I never know if I’ll get bored with a new hobby, and if I get bored it’s annoying to have a bunch of stuff lying around. Some examples of things I’ve avoided buying so far:

  • Instead of buying polyester fiberfill, to fill all of the critters I’ve just been cutting up an old sweater I have that was falling apart.
  • I’ve been embroidering the eyes instead of buying safety eyes

Everything I have right now fits in the box the crochet kit came in (which is about the size of a large shoebox), and my plan is to keep it that way for a while.

that’s all!

Mainly what I like about crochet so far is that:

  • it’s a way to not be on the computer, and you can chat with people while doing it
  • you can do it without buying too much stuff, it’s pretty compact
  • I end up with cacti in our living room which is great (I also have a bunch of real succulents, so they go with those)
  • it seems extremely forgiving of mistakes and experimentation

There are definitely still a lot of things I’m doing “wrong” but it’s fun to learn through trial and error.

2024-03-29T08:15:24-07:00 Fullscreen Open in Tab
OAuth: "grant" vs "flow" vs "grant type"
Is it called an OAuth "grant" or a "flow"? What about "grant type"?

These are common questions when writing documentation for OAuth-related things. While these terms are all used in RFC 6749 and many extensions, the differences between the terminology is never actually explained.

I wanted to finally write down a definition of the terms, along with examples of when each is appropriate.

  • flow - use "flow" when referring to the end-to-end process, for example:
    • "the client initiates the flow by..."
    • "the flow ends with the successful issuance of an access token"
    • This can also be combined with the type of flow, for example:
    • "The Authorization Code flow starts by..."
  • grant - use "grant" when referring to the specific POST request to the token endpoint, for example:
    • "The authorization code grant includes the PKCE code verifier..."
    • "The refresh token grant can be used with or without client authentication..."
    • "Grant" also refers to the abstract concept of the user having granted authorization, which is expressed as the authorization code, or implicitly with the client credentials grant. This is a bit of an academic definition of the term, and is used much less frequently in normal conversation around OAuth.
  • grant type - use "grant type" when referring to the definition of the flow in the spec itself, for example:
    • "there are several drawbacks to the Implicit grant type"
    • "the Authorization Code grant type enables the use of..."

Let me know if you have any suggestions for clarifying any of this, or any other helpful examples to add! I'm planning on adding this summary to OAuth 2.1 so that we have a formal reference for it in the future!

2024-03-28T08:35:56+00:00 Fullscreen Open in Tab
Some Git poll results

A new thing I’ve been trying while writing this Git zine is doing a bunch of polls on Mastodon to learn about:

  • which git commands/workflows people use (like “do you use merge or rebase more?” or “do you put your current git branch in your shell prompt?”)
  • what kinds of problems people run into with git (like “have you lost work because of a git problem in the last year or two?”)
  • which terminology people find confusing (like “how confident do you feel that you know what HEAD means in git?”)
  • how people think about various git concepts (“how do you think about git branches?”)
  • in what ways my usage of git is “normal” and in what ways it’s “weird”. Where am I pretty similar to the majority of people, and where am I different?

It’s been a lot of fun and some of the results have been surprising to me, so here are some of the results. I’m partly just posting these so that I can have them all in one place for myself to refer to, but maybe some of you will find them interesting too.

these polls are highly unscientific

Polls on social media that I thought about for approximately 45 seconds before posting are not the most rigorous way of doing user research, so I’m pretty cautious about drawing conclusions from them. Potential problems include: I phrased the poll badly, the set of possible responses aren’t chosen very carefully, some of the poll responses I just picked because I thought they were funny, and the set of people who follow me on Mastodon is not representative of all git users.

But here are a couple of examples of why I still find these poll results useful:

  • The first poll is “what’s your approach to merge commits and rebase in git”? 600 people (30% of responders) replied “I usually use merge, rarely/never rebase”. It’s helpful for me to know that there are a lot of people out there who rarely/never use rebase, because I use rebase all the time – it’s a good reminder that my experience isn’t necessarily representative.
  • For the poll “how confident do you feel that you know what HEAD means in git?”, 14% of people replied “literally no idea”. That tells me to be careful about assuming that people know what HEAD means in my writing.

where to read more

If you want to read more about any given poll, you can click at the date at the bottom – there’s usually a bunch of interesting follow-up discussion.

Also this post has a lot of CSS so it might not work well in a feed reader.

Now! Here are the polls! I’m mostly just going to post the results without commenting on them.

merge and rebase

poll: what's your approach to merge commits and rebase in git?

merge conflicts

poll: if you use git, how often do you deal with nontrivial merge conflicts? (like where 2 people were really editing the same code at the same time and you need to take time to think about how to reconcile the edits)

another merge conflict poll:

have you ever seen a bug in production caused by an incorrect merge conflict resolution? I've heard about this as a reason to prefer merges over rebase (because it makes the merge conflict resolution easier to audit) and I'm curious about how common it is

I thought it was interesting in the next one that “edit the weird text file by hand” was most people’s preference:

poll: when you have a merge conflict, how do you prefer to handle it?

merge conflict follow up: if you prefer to edit the weird text file by hand instead of using a dedicated merge conflict tool, why is that?

poll: did you know that in a git merge conflict, the order of the code is different when you do a merge/rebase?

merge:

<<<<<<< HEAD
YOUR CODE
=======
OTHER BRANCH'S CODE
>>>>>>> c694cf8aabe

rebase:

<<<<<<< HEAD
OTHER BRANCH'S CODE
=======
YOUR CODE
>>>>>>> d945752 (your commit message)

(where "YOUR CODE" is the code from the branch you were on when you ran `git merge` or `git rebase`)

git pull

poll: do you prefer `git fetch` or `git pull`?

(no lectures about why you think `git pull` is bad please but if you use both I'd be curious to hear in what cases you use fetch!)

commits

[poll] how do you think of a git commit?

(sorry, you can't pick “it’s all 3”, I'm curious about which one feels most true to you)

branches

poll: how do you think about git branches? (I'll put an image in a reply with pictures for the 3 options)

as with all of these polls obviously all 3 are valid, I'm curious which one feels the most true to you

git environment

poll: do you put your current git branch in your shell prompt?

poll: do you use git on the command line or in a GUI?

(you can pick more than one option if it’s a mix of both, sorry magit users I didn't have space for you in this poll)

losing work

poll: have you lost work because of a git problem in the last year or two? (it counts even if it was "your fault" :))

meaning of various git terms

These polls gave me the impression that for a lot of git terms (fast-forward, reference, HEAD), there are a lot of git users who have “literally no idea” what they mean. That makes me want to be careful about using and defining those terms.

poll: how confident do you feel that you know what HEAD means in git?

another poll: how do you think of HEAD in git?

poll: when you see this message in `git status`:

“Your branch is up to date with 'origin/main'.”

do you know that your branch may not actually be up to date with the `main` branch on the remote?

poll: how confident do you feel that you know what the term "fast-forward" means in git, for example in this error message:

`! [rejected] main -> main (non-fast-forward)`

or this one:

fatal: Not possible to fast-forward, aborting.

(I promise this is not a trick question, I'm just writing a blog post about git terminology and I'm trying to gauge how people feel about various core git terms)

poll: how confident do you feel that you know what a "ref" or "reference" is in git? (“ref” and “reference” are the same thing)

for example in this error message (from `git push`)

error: failed to push some refs to 'github.com:jvns/int-exposed'

or this one: (from `git switch mybranch`)

fatal: invalid reference: mybranch

another git terminology poll: how confident do you feel that you know what a git commit is?

(not a trick question, I'm mostly curious how this one relates to people's reported confidence about more "advanced" terms like reference/fast-forward/HEAD)

poll: in git, do you think of "detached HEAD state" and "not having any branch checked out" as being the same thing?

poll: how confident do you feel that you know what the term "current branch" means in git?

(deleted & reposted to clarify that I'm asking about the meaning of the term)

other version control systems

I occasionally hear “SVN was better than git!” but this “svn vs git” poll makes me think that’s a minority opinion. I’m much more cautious about concluding anything from the hg-vs-git poll but it does seem like some people prefer git and some people prefer Mercurial.

poll 2: if you've used both svn and git, which do you prefer?

(no replies please, i have already read 300 comments about git vs other version control systems today and they were great but i can't read more)

gonna do a short thread of git vs other version control systems polls just to get an overall vibe

poll 1: if you've used both hg and git, which do you prefer?

(no replies please though, i have already read 300 comments about git vs other version control systems today and i can't read more)

that’s all!

It’s been very fun to run all of these polls and I’ve learned a lot about how people use and think about git.

2024-03-22T08:15:02+00:00 Fullscreen Open in Tab
The "current branch" in git

Hello! I know I just wrote a blog post about HEAD in git, but I’ve been thinking more about what the term “current branch” means in git and it’s a little weirder than I thought.

four possible definitions for “current branch”

  1. It’s what’s in the file .git/HEAD. This is how the git glossary defines it.
  2. It’s what git status says on the first line
  3. It’s what you most recently checked out with git checkout or git switch
  4. It’s what’s in your shell’s git prompt. I use fish_git_prompt so that’s what I’ll be talking about.

I originally thought that these 4 definitions were all more or less the same, but after chatting with some people on Mastodon, I realized that they’re more different from each other than I thought.

So let’s talk about a few git scenarios and how each of these definitions plays out in each of them. I used git version 2.39.2 (Apple Git-143) for all of these experiments.
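
If you want to poke at the first two definitions yourself, here's a quick sketch (run inside any repository):

# definition 1: the contents of .git/HEAD
cat .git/HEAD

# definition 2: the first line of git status
git status | head -n 1

# bonus: this is roughly how prompts compute the "current branch" -- it fails
# with "fatal: ref HEAD is not a symbolic ref" when HEAD is detached
git symbolic-ref --short HEAD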

scenario 1: right after git checkout main

Here’s the most normal situation: you check out a branch.

  1. .git/HEAD contains ref: refs/heads/main
  2. git status says On branch main
  3. The thing I most recently checked out was: main
  4. My shell’s git prompt says: (main)

In this case the 4 definitions all match up: they’re all main. Simple enough.

scenario 2: right after git checkout 775b2b399

Now let’s imagine I check out a specific commit ID (so that we’re in “detached HEAD state”).

  1. .git/HEAD contains 775b2b399fb8b13ee3341e819f2aaa024a37fa92
  2. git status says HEAD detached at 775b2b39
  3. The thing I most recently checked out was 775b2b399
  4. My shell’s git prompt says ((775b2b39))

Again, these all basically match up – some of them have truncated the commit ID and some haven’t, but that’s it. Let’s move on.

scenario 3: right after git checkout v1.0.13

What if we’ve checked out a tag, instead of a branch or commit ID?

  1. .git/HEAD contains ca182053c7710a286d72102f4576cf32e0dafcfb
  2. git status says HEAD detached at v1.0.13
  3. The thing I most recently checked out was v1.0.13
  4. My shell’s git prompt says ((v1.0.13))

Now things start to get a bit weirder! .git/HEAD disagrees with the other 3 indicators: git status, the git prompt, and what I checked out are all the same (v1.0.13), but .git/HEAD contains a commit ID.

The reason for this is that git is trying to help us out: commit IDs are kind of opaque, so if there’s a tag that corresponds to the current commit, git status will show us that instead.

Some notes about this:

  • If we check out the commit by its ID (git checkout ca182053c7710a286d72) instead of by its tag, what shows up in git status and in my shell prompt are exactly the same – git doesn’t actually “know” that we checked out a tag.
  • it looks like you can find the tags matching HEAD by running git describe HEAD --tags --exact-match (here’s the fish git prompt code)
  • You can see where git-prompt.sh added support for describing a commit by a tag in this way in commit 27c578885 in 2008.
  • I don’t know if it makes a difference whether the tag is annotated or not.
  • If there are 2 tags with the same commit ID, it gets a little weird. For example, if I add the tag v1.0.12 to this commit so that it’s tagged with both v1.0.12 and v1.0.13, you can see here that my git prompt changes, and then the prompt and git status disagree about which tag to display:
bork@grapefruit ~/w/int-exposed ((v1.0.12))> git status
HEAD detached at v1.0.13

(my prompt shows v1.0.12 and git status shows v1.0.13)
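
For what it's worth, that git describe incantation is easy to try. Back in the single-tag version of this scenario, it prints the tag (a sketch):

$ git describe --tags --exact-match HEAD
v1.0.13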

scenario 4: in the middle of a rebase

Now: what if I check out the main branch, do a rebase, but then there was a merge conflict in the middle of the rebase? Here’s the situation:

  1. .git/HEAD contains c694cf8aabe2148b2299a988406f3395c0461742 (the commit ID of the commit that I’m rebasing onto, origin/main in this case)
  2. git status says interactive rebase in progress; onto c694cf8
  3. The thing I most recently checked out was main
  4. My shell’s git prompt says (main|REBASE-i 1/1)

Some notes about this:

  • I think that in some sense the “current branch” is main here – it’s what I most recently checked out, it’s what we’ll go back to after the rebase is done, and it’s where we’d go back to if I run git rebase --abort
  • in another sense, we’re in a detached HEAD state at c694cf8aabe2. But it doesn’t have the usual implications of being in “detached HEAD state” – if you make a commit, it won’t get orphaned! Instead, assuming you finish the rebase, it’ll get absorbed into the rebase and put somewhere in the middle of your branch.
  • it looks like during the rebase, the old “current branch” (main) is stored in .git/rebase-merge/head-name (there’s a quick check below). Not totally sure about this though.
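
Here's that quick check, run in the middle of the conflicted rebase (a sketch based on what I saw; I haven't verified this across git versions):

$ cat .git/rebase-merge/head-name
refs/heads/main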

scenario 5: right after git init

What about when we create an empty repository with git init?

  1. .git/HEAD contains ref: refs/heads/main
  2. git status says On branch main (and “No commits yet”)
  3. The thing I most recently checked out was, well, nothing
  4. My shell’s git prompt says: (main)

So here everything mostly lines up, except that we’ve never run git checkout or git switch. Basically Git automatically switches to whatever branch was configured in init.defaultBranch.

scenario 6: a bare git repository

What if we clone a bare repository with git clone --bare https://github.com/rbspy/rbspy?

  1. HEAD contains ref: refs/heads/main
  2. git status says fatal: this operation must be run in a work tree
  3. The thing I most recently checked out was, well, nothing, git checkout doesn’t even work in bare repositories
  4. My shell’s git prompt says: (BARE:main)

So #1 and #4 match (they both agree that the current branch is “main”), but git status and git checkout don’t even work.

Some notes about this one:

  • I think HEAD in a bare repository mainly affects one thing: it’s the branch that gets checked out when you clone the repository. (It’s also used when you run git log.)
  • if you really want to, you can update HEAD in a bare repository to a different branch with git symbolic-ref HEAD refs/heads/whatever. I’ve never needed to do that though and it seems weird because git symbolic-ref doesn’t check if the thing you’re pointing HEAD at is actually a branch that exists. Not sure if there’s a better way.

all the results

Here’s a table with all of the results:

                      .git/HEAD             git status                                        checked out  prompt
1. checkout main      ref: refs/heads/main  On branch main                                    main         (main)
2. checkout 775b2b    775b2b399...          HEAD detached at 775b2b39                         775b2b399    ((775b2b39))
3. checkout v1.0.13   ca182053c...          HEAD detached at v1.0.13                          v1.0.13      ((v1.0.13))
4. inside rebase      c694cf8aa...          interactive rebase in progress; onto c694cf8      main         (main|REBASE-i 1/1)
5. after git init     ref: refs/heads/main  On branch main                                    n/a          (main)
6. bare repository    ref: refs/heads/main  fatal: this operation must be run in a work tree  n/a          (BARE:main)

“current branch” doesn’t seem completely well defined

My original instinct when talking about git was to agree with the git glossary and say that HEAD and the “current branch” mean the exact same thing.

But this doesn’t seem as ironclad as I used to think anymore! Some thoughts:

  • .git/HEAD is definitely the one with the most consistent format – it’s always either a branch or a commit ID. The others are all much messier
  • I have a lot more sympathy than I used to for the definition “the current branch is whatever you last checked out”. Git does a lot of work to remember which branch you last checked out (even if you’re currently doing a bisect or a merge or something else that temporarily moves HEAD off of that branch) and it feels weird to ignore that.
  • git status gives a lot of helpful context – these 5 status messages say a lot more than just what HEAD is set to currently
    1. on branch main
    2. HEAD detached at 775b2b39
    3. HEAD detached at v1.0.13
    4. interactive rebase in progress; onto c694cf8
    5. on branch main, no commits yet

some more “current branch” definitions

I’m going to try to collect some other definitions of the term current branch that I heard from people on Mastodon here and write some notes on them.

  1. “the branch that would be updated if i made a commit”
  • Most of the time this is the same as .git/HEAD
  • Arguably if you’re in the middle of a rebase, it’s different from HEAD, because ultimately that new commit will end up on the branch in .git/rebase-merge/head-name
  2. “the branch most git operations work against”
  • This is sort of the same as what’s in .git/HEAD, except that some operations (like git status) will behave differently in some situations, like how git status won’t tell you the current branch if you’re in a bare repository

on orphaned commits

One thing I noticed that wasn’t captured in any of this is whether the current commit is orphaned or not – the git status message (HEAD detached from c694cf8) is the same whether or not your current commit is orphaned.

I imagine this is because figuring out whether or not a given commit is orphaned might take a long time in a large repository: you can find out if the current commit is orphaned with git branch --contains HEAD, and that command takes about 500ms in a repository with 70,000 commits.

Git will warn you if the commit is orphaned (“Warning: you are leaving 1 commit behind, not connected to any of your branches…”) when you switch to a different branch though.

that’s all!

I don’t have anything particularly smart to say about any of this. The more I think about git the more I can understand why people get confused.

2024-03-08T10:13:27+00:00 Fullscreen Open in Tab
How HEAD works in git

Hello! The other day I ran a Mastodon poll asking people how confident they were that they understood how HEAD works in Git. The results (out of 1700 votes) were a little surprising to me:

  • 10% “100%”
  • 36% “pretty confident”
  • 39% “somewhat confident?”
  • 15% “literally no idea”

I was surprised that people were so unconfident about their understanding – I’d been thinking of HEAD as a pretty straightforward topic.

Usually when people say that a topic is confusing when I think it’s not, the reason is that there’s actually some hidden complexity that I wasn’t considering. And after some follow up conversations, it turned out that HEAD actually was a bit more complicated than I’d appreciated!

Here’s a quick table of contents:

HEAD is actually a few different things

After talking to a bunch of different people about HEAD, I realized that HEAD actually has a few different closely related meanings:

  1. The file .git/HEAD
  2. HEAD as in git show HEAD (git calls this a “revision parameter”)
  3. All of the ways git uses HEAD in the output of various commands (<<<<<<< HEAD, (HEAD -> main), detached HEAD state, On branch main, etc)

These are extremely closely related to each other, but I don’t think the relationship is totally obvious to folks who are starting out with git.

the file .git/HEAD

Git has a very important file called .git/HEAD. The way this file works is that it contains either:

  1. The name of a branch (like ref: refs/heads/main)
  2. A commit ID (like 96fa6899ea34697257e84865fefc56beb42d6390)

This file is what determines what your “current branch” is in Git. For example, when you run git status and see this:

$ git status
On branch main

it means that the file .git/HEAD contains ref: refs/heads/main.

If .git/HEAD contains a commit ID instead of a branch, git calls that “detached HEAD state”. We’ll get to that later.

(People will sometimes say that HEAD contains a name of a reference or a commit ID, but I’m pretty sure that the reference has to be a branch. You can technically make .git/HEAD contain the name of a reference that isn’t a branch by manually editing .git/HEAD, but I don’t think you can do it with a regular git command. I’d be interested to know if there is a regular-git-command way to make .git/HEAD a non-branch reference though, and if so why you might want to do that!)

HEAD as in git show HEAD

It’s very common to use HEAD in git commands to refer to a commit ID, like:

  • git diff HEAD
  • git rebase -i HEAD^^^^
  • git diff main..HEAD
  • git reset --hard HEAD@{2}

All of these things (HEAD, HEAD^^^, HEAD@{2}) are called “revision parameters”. They’re documented in man gitrevisions, and Git will try to resolve them to a commit ID.

(I’ve honestly never actually heard the term “revision parameter” before, but that’s the term that’ll get you to the documentation for this concept)

HEAD in git show HEAD has a pretty simple meaning: it resolves to the current commit you have checked out! Git resolves HEAD in one of two ways:

  1. if .git/HEAD contains a branch name, it’ll be the latest commit on that branch (for example by reading it from .git/refs/heads/main)
  2. if .git/HEAD contains a commit ID, it’ll be that commit ID
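
For example, here's that resolution happening in the branch case (a sketch; if your refs are packed, the loose file .git/refs/heads/main might not exist and git will read .git/packed-refs instead):

$ cat .git/HEAD
ref: refs/heads/main
$ cat .git/refs/heads/main
96fa6899ea34697257e84865fefc56beb42d6390
$ git rev-parse HEAD
96fa6899ea34697257e84865fefc56beb42d6390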

next: all the output formats

Now we’ve talked about the file .git/HEAD, and the “revision parameter” HEAD, like in git show HEAD. We’re left with all of the various ways git uses HEAD in its output.

git status: “on branch main” or “HEAD detached”

When you run git status, the first line will always look like one of these two:

  1. on branch main. This means that .git/HEAD contains a branch.
  2. HEAD detached at 90c81c72. This means that .git/HEAD contains a commit ID.

I promised earlier I’d explain what “HEAD detached” means, so let’s do that now.

detached HEAD state

“HEAD is detached” or “detached HEAD state” mean that you have no current branch.

Having no current branch is a little dangerous because if you make new commits, those commits won’t be attached to any branch – they’ll be orphaned! Orphaned commits are a problem for 2 reasons:

  1. the commits are more difficult to find (you can’t run git log somebranch to find them)
  2. orphaned commits will eventually be deleted by git’s garbage collection

Personally I’m very careful about avoiding creating commits in detached HEAD state, though some people prefer to work that way. Getting out of detached HEAD state is pretty easy though, you can either:

  1. Go back to a branch (git checkout main)
  2. Create a new branch at that commit (git checkout -b newbranch)
  3. If you’re in detached HEAD state because you’re in the middle of a rebase, finish or abort the rebase (git rebase --abort)

Okay, back to other git commands which have HEAD in their output!

git log: (HEAD -> main)

When you run git log and look at the first line, you might see one of the following 3 things:

  1. commit 96fa6899ea (HEAD -> main)
  2. commit 96fa6899ea (HEAD, main)
  3. commit 96fa6899ea (HEAD)

It’s not totally obvious how to interpret these, so here’s the deal:

  • inside the (...), git lists every reference that points at that commit, for example (HEAD -> main, origin/main, origin/HEAD) means HEAD, main, origin/main, and origin/HEAD all point at that commit (either directly or indirectly)
  • HEAD -> main means that your current branch is main
  • If that line says HEAD, instead of HEAD ->, it means you’re in detached HEAD state (you have no current branch)

If we use these rules to explain the 3 examples above, the result is:

  1. commit 96fa6899ea (HEAD -> main) means:
    • .git/HEAD contains ref: refs/heads/main
    • .git/refs/heads/main contains 96fa6899ea
  2. commit 96fa6899ea (HEAD, main) means:
    • .git/HEAD contains 96fa6899ea (HEAD is “detached”)
    • .git/refs/heads/main also contains 96fa6899ea
  3. commit 96fa6899ea (HEAD) means:
    • .git/HEAD contains 96fa6899ea (HEAD is “detached”)
    • .git/refs/heads/main either contains a different commit ID or doesn’t exist

merge conflicts: <<<<<<< HEAD is just confusing

When you’re resolving a merge conflict, you might see something like this:

<<<<<<< HEAD
def parse(input):
    return input.split("\n")
=======
def parse(text):
    return text.split("\n\n")
>>>>>>> somebranch

I find HEAD in this context extremely confusing and I basically just ignore it. Here’s why.

  • When you do a merge, HEAD in the merge conflict is the same as what HEAD was when you ran git merge. Simple.
  • When you do a rebase, HEAD in the merge conflict is something totally different: it’s the other commit that you’re rebasing on top of. So it’s totally different from what HEAD was when you ran git rebase. It’s like this because rebase works by first checking out the other commit and then repeatedly cherry-picking commits on top of it.

Similarly, the meaning of “ours” and “theirs” are flipped in a merge and rebase.

The fact that the meaning of HEAD changes depending on whether I’m doing a rebase or merge is really just too confusing for me and I find it much simpler to just ignore HEAD entirely and use another method to figure out which part of the code is which.

some thoughts on consistent terminology

I think HEAD would be more intuitive if git’s terminology around HEAD were a little more internally consistent.

For example, git talks about “detached HEAD state”, but never about “attached HEAD state” – git’s documentation never uses the term “attached” at all to refer to HEAD. And git talks about being “on” a branch, but never “not on” a branch.

So it’s very hard to guess that on branch main is actually the opposite of HEAD detached. How is the user supposed to guess that HEAD detached has anything to do with branches at all, or that “on branch main” has anything to do with HEAD?

that’s all!

If I think of other ways HEAD is used in Git (especially ways HEAD appears in Git’s output), I might add them to this post later.

If you find HEAD confusing, I hope this helps a bit!

2024-02-16T10:42:27+00:00 Fullscreen Open in Tab
Popular git config options

Hello! I always wish that command line tools came with data about how popular their various options are, like:

  • “basically nobody uses this one”
  • “80% of people use this, probably take a look”
  • “this one has 6 possible values but people only really use these 2 in practice”

So I asked about people’s favourite git config options on Mastodon:

what are your favourite git config options to set? Right now I only really have git config push.autosetupremote true and git config init.defaultBranch main set in my ~/.gitconfig, curious about what other people set

As usual I got a TON of great answers and learned about a bunch of very popular git config options that I’d never heard of.

I’m going to list the options, starting with (very roughly) the most popular ones. Here’s a table of contents:

All of the options are documented in man git-config, or this page.

pull.ff only or pull.rebase true

These two were the most popular. These both have similar goals: to avoid accidentally creating a merge commit when you run git pull on a branch where the upstream branch has diverged.

  • pull.rebase true is the equivalent of running git pull --rebase every time you git pull
  • pull.ff only is the equivalent of running git pull --ff-only every time you git pull

I’m pretty sure it doesn’t make sense to set both of them at once, since --ff-only overrides --rebase.

Personally I don’t use either of these since I prefer to decide how to handle that situation every time, and now git’s default behaviour when your branch has diverged from the upstream is to just throw an error and ask you what to do (very similar to what git pull --ff-only does).

merge.conflictstyle zdiff3

Next: making merge conflicts more readable! merge.conflictstyle zdiff3 and merge.conflictstyle diff3 were both super popular (“totally indispensable”).

The consensus seemed to be: “diff3 is great, and zdiff3 (which is newer) is even better!”

So what’s the deal with diff3? Well, by default in git, merge conflicts look like this:

<<<<<<< HEAD
def parse(input):
    return input.split("\n")
=======
def parse(text):
    return text.split("\n\n")
>>>>>>> somebranch

I’m supposed to decide whether input.split("\n") or text.split("\n\n") is better. But how? What if I don’t remember whether \n or \n\n is right? Enter diff3!

Here’s what the same merge conflict looks like with merge.conflictstyle diff3 set:

<<<<<<< HEAD
def parse(input):
    return input.split("\n")
||||||| b9447fc
def parse(input):
    return input.split("\n\n")
=======
def parse(text):
    return text.split("\n\n")
>>>>>>> somebranch

This has extra information: now the original version of the code is in the middle! So we can see that:

  • one side changed \n\n to \n
  • the other side renamed input to text

So presumably the correct merge conflict resolution is return text.split("\n"), since that combines the changes from both sides.

I haven’t used zdiff3, but a lot of people seem to think it’s better. The blog post Better Git Conflicts with zdiff3 talks more about it.

rebase.autosquash true

Autosquash was also a new feature to me. The goal is to make it easier to modify old commits.

Here’s how it works:

  • You have a commit that you would like to be combined with some commit that’s 3 commits ago, say add parsing code
  • You commit it with git commit --fixup OLD_COMMIT_ID, which gives the new commit the commit message fixup! add parsing code
  • Now, when you run git rebase --autosquash main, it will automatically combine all the fixup! commits with their targets
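
Here's a sketch of that workflow (the commit ID is made up; older versions of git only support --autosquash with an interactive rebase, so I'm using -i here):

# "abc1234" is the ID of the old "add parsing code" commit
git commit --fixup abc1234

# the fixup! commit gets reordered next to abc1234 and squashed into it
git rebase -i --autosquash main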

rebase.autosquash true means that --autosquash always gets passed automatically to git rebase.

rebase.autostash true

This automatically runs git stash before a git rebase and git stash pop after. It basically passes --autostash to git rebase.

Personally I’m a little scared of this since it potentially can result in merge conflicts after the rebase, but I guess that doesn’t come up very often for people since it seems like a really popular configuration option.

push.default simple, push.default current, push.autoSetupRemote true

These push options tell git push to automatically push the current branch to a remote branch with the same name.

  • push.default simple is the default in Git. It only works if your branch is already tracking a remote branch
  • push.default current is similar, but it’ll always push the local branch to a remote branch with the same name.
  • push.autoSetupRemote true is a little different – this one makes it so when you first push a branch, it’ll automatically set up tracking for it

I think I prefer push.autoSetupRemote true to push.default current because push.autoSetupRemote true also lets you pull from the matching remote branch (though you do need to push first to set up tracking). push.default current only lets you push.

I believe the only thing to be careful of with push.autoSetupRemote true and push.default current is that you need to be confident that you’re never going to accidentally make a local branch with the same name as an unrelated remote branch. Lots of people have branch naming conventions (like julia/my-change) that make this kind of conflict very unlikely, or just have few enough collaborators that branch name conflicts probably won’t happen.

init.defaultBranch main

Create a main branch instead of a master branch when creating a new repo.

commit.verbose true

This adds the whole commit diff in the text editor where you’re writing your commit message, to help you remember what you were doing.

rerere.enabled true

This enables rerere ("reuse recorded resolution"), which remembers how you resolved merge conflicts during a git rebase and automatically resolves conflicts for you when it can.

help.autocorrect 10

By default git’s autocorrect tries to check for typos (like git ocmmit), but won’t actually run the corrected command.

If you want it to run the suggestion automatically, you can set help.autocorrect to 1 (run after 0.1 seconds), 10 (run after 1 second), immediate (run immediately), or prompt (run after prompting)

core.pager delta

The “pager” is what git uses to display the output of git diff, git log, git show, etc. People set it to:

  • delta (a fancy diff viewing tool with syntax highlighting)
  • less -x5,9 (sets tabstops, which I guess helps if you have a lot of files with tabs in them?)
  • less -F -X (not sure about this one, -F seems to disable the pager if everything fits on one screen, but my git seems to do that already anyway)
  • cat (to disable paging altogether)

I used to use delta but turned it off because somehow I messed up the colour scheme in my terminal and couldn’t figure out how to fix it. I think it’s a great tool though.

I believe delta also suggests that you set up interactive.diffFilter delta --color-only to syntax highlight code when you run git add -p.

diff.algorithm histogram

Git’s default diff algorithm often handles functions being reordered badly. For example look at this diff:

-.header {
+.footer {
     margin: 0;
 }

-.footer {
+.header {
     margin: 0;
+    color: green;
 }

I find it pretty confusing. But with diff.algorithm histogram, the diff looks like this instead, which I find much clearer:

-.header {
-    margin: 0;
-}
-
 .footer {
     margin: 0;
 }

+.header {
+    margin: 0;
+    color: green;
+}

Some folks also use patience, but histogram seems to be more popular. When to Use Each of the Git Diff Algorithms has more on this.

core.excludesfile: a global .gitignore

core.excludesFile = ~/.gitignore lets you set a global gitignore file that applies to all repositories, for things like .idea or .DS_Store that you never want to commit to any repo. It defaults to ~/.config/git/ignore.
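
A global ignore file might look something like this (just an example of the kinds of entries people put in there):

# ~/.config/git/ignore (the default location)
.DS_Store
.idea/
*.swp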

includeIf: separate git configs for personal and work

Lots of people said they use this to configure different email addresses for personal and work repositories. You can set it up something like this:

[includeIf "gitdir:~/code/<work>/"]
	path = "~/code/<work>/.gitconfig"
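
The work-specific config file then contains whatever should be different at work -- the email here is a made-up example:

# ~/code/<work>/.gitconfig
[user]
	email = "me@work.example"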

url."git@github.com:".insteadOf 'https://github.com/'

I often accidentally clone the HTTP version of a repository instead of the SSH version and then have to manually go into the repository’s .git/config and edit the remote URL. This seems like a nice workaround: it’ll replace https://github.com in remotes with git@github.com:.

Here’s what it looks like in ~/.gitconfig since it’s kind of a mouthful:

[url "git@github.com:"]
	insteadOf = "https://github.com/"

One person said they use pushInsteadOf instead to only do the replacement for git push because they don’t want to have to unlock their SSH key when pulling a public repo.

A couple of other people mentioned setting insteadOf = "gh:" so they can run git remote add origin gh:jvns/mysite to add a remote with less typing.
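
Here's roughly what those two variants look like in ~/.gitconfig:

[url "git@github.com:"]
	pushInsteadOf = "https://github.com/"

[url "git@github.com:"]
	insteadOf = "gh:"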

fsckobjects: avoid data corruption

A couple of people mentioned this one. Someone explained it as “detect data corruption eagerly. Rarely matters but has saved my entire team a couple times”.

transfer.fsckobjects = true
fetch.fsckobjects = true
receive.fsckObjects = true

submodule stuff

I’ve never understood anything about submodules but a couple of people said they like to set:

  • status.submoduleSummary true
  • diff.submodule log
  • submodule.recurse true

I won’t attempt to explain those but there’s an explanation on Mastodon by @unlambda here.

and more

Here’s everything else that was suggested by at least 2 people:

  • blame.ignoreRevsFile .git-blame-ignore-revs lets you specify a file with commits to ignore during git blame, so that giant renames don’t mess up your blames
  • branch.sort -committerdate: makes git branch sort by most recently used branches instead of alphabetical, to make it easier to find branches. tag.sort taggerdate is similar for tags.
  • color.ui false: to turn off colour
  • commit.cleanup scissors: so that you can write #include in a commit message without the # being treated as a comment and removed
  • core.autocrlf false: on Windows, to work well with folks using Unix
  • core.editor emacs: to use emacs (or another editor) to edit commit messages
  • credential.helper osxkeychain: use the Mac keychain for managing credentials
  • diff.tool difftastic: use difftastic (or meld or nvimdiff) to display diffs
  • diff.colorMoved default: uses different colours to highlight lines in diffs that have been “moved”
  • diff.colorMovedWS allow-indentation-change: with diff.colorMoved set, also ignores indentation changes
  • diff.context 10: include more context in diffs
  • fetch.prune true and fetch.pruneTags true: automatically delete remote-tracking branches and tags that have been deleted on the remote
  • gpg.format ssh: allow you to sign commits with SSH keys
  • log.date iso: display dates as 2023-05-25 13:54:51 instead of Thu May 25 13:54:51 2023
  • mergetool.keepBackup false: to get rid of the .orig files git creates during a merge conflict
  • merge.tool meld (or nvim, or nvimdiff) so that you can use git mergetool to help resolve merge conflicts
  • push.followTags true: push new tags along with commits being pushed
  • rebase.missingCommitsCheck error: don’t allow deleting commits during a rebase
  • rebase.updateRefs true: makes it much easier to rebase multiple stacked branches at a time. Here’s a blog post about it.

how to set these

I generally set git config options with git config --global NAME VALUE, for example git config --global diff.algorithm histogram. I usually set all of my options globally because it stresses me out to have different git behaviour in different repositories.

If I want to delete an option I’ll edit ~/.gitconfig manually, where they look like this:

[diff]
	algorithm = histogram

config changes I’ve made after writing this post

My git config is pretty minimal, I already had:

  • init.defaultBranch main
  • push.autoSetupRemote true
  • merge.tool meld
  • diff.colorMoved default (which actually doesn’t even work for me for some reason but I haven’t found the time to debug)

and I added these 3 after writing this blog post:

  • diff.algorithm histogram
  • branch.sort -committerdate
  • merge.conflictstyle zdiff3

I’d probably also set rebase.autosquash if making carefully crafted pull requests with multiple commits were a bigger part of my life right now.

I’ve learned to be cautious about setting new config options – it takes me a long time to get used to the new behaviour and if I change too many things at once I just get confused. branch.sort -committerdate is something I was already using anyway (through an alias), and I’m pretty sold that diff.algorithm histogram will make my diffs easier to read when I reorder functions.

that’s all!

I’m always amazed by how useful it is to just ask a lot of people what stuff they like and then list the most commonly mentioned ones, like with this list of new-ish command line tools I put together a couple of years ago. Having a list of 20 or 30 options to consider feels so much more efficient than combing through a list of all 600 or so git config options.

It was a little confusing to summarize these because git’s default options have actually changed a lot over the years, so people occasionally have options set that were important 8 years ago but today are the default. Also a couple of the experimental options people were using have been removed and replaced with a different version.

I did my best to explain things accurately as of how git works right now in 2024 but I’ve definitely made mistakes in here somewhere, especially because I don’t use most of these options myself. Let me know on Mastodon if you see a mistake and I’ll try to fix it.

I might also ask people about aliases later, there were a bunch of great ones that I left out because this was already getting long.

2023-12-01T19:38:05-08:00 Fullscreen Open in Tab
I took the High-Speed Brightline Train from Miami to Orlando with only two hours notice

It was 11am at the Fort Lauderdale airport, an hour after my non-stop flight to Portland was supposed to have boarded. As I had been watching our estimated departure get pushed back in 15 minute increments, I finally received the dreaded news over the loudspeaker - the flight was cancelled entirely. As hordes of people started lining up to rebook their flights with the gate agent, I found a quiet spot in the corner and opened up my laptop to look at my options.

The other Alaska Airlines flight options were pretty terrible. There was a Fort Lauderdale to Seattle to Portland option that would have me landing at midnight. A flight on a partner airline had a 1-hour connection through Dallas, and there were only middle seats available on both legs. So I started to get creative, and searched for flights from Orlando, about 200 miles north. There was a non-stop on Alaska Airlines at 7pm, with plenty of available seats, so I called up customer service and asked them to change my booking. Since the delay was their fault, there were no change fees even though the flight was leaving from a different airport.

So now it was my responsibility to get myself from Miami to Orlando by 7pm. I could have booked a flight on a budget airline for $150, but it wouldn't have been a very nice experience, and I'd have a lot of time to kill in the Orlando airport. Then I remembered the Brightline train recently opened new service from Miami to Orlando, supposedly taking less time than driving there.

Brightline Station Fort Lauderdale

Brightline Station

Never having tried to take that train before, I didn't realize they run a shuttle service from the Fort Lauderdale airport to the train station, so I jumped in an Uber headed to the station. On the way there, I booked a ticket on my phone. The price from Miami to Orlando was $144 for Coach, or $229 for Premium class. Since this will probably be the only time I take this train for the foreseeable future, I splurged for the Premium class ticket to see what that experience is like.

Astute readers will have noticed that I mentioned I booked a ticket from Miami rather than Fort Lauderdale. We'll come back to that in a bit. Once I arrived at the station, I began my Brightline experience.

Walking in to the station felt like something between an airport and a car rental center.

Brightline Station entrance

There was a small ticket counter in the lobby, but I already had a ticket on my phone so I went up the escalators.

Brightline Station escalator

At the top of the escalators was an electronic gate where you scan your QR code to go through. Mine didn't work (again, more on that later), but it was relatively empty and a staff member was able to look at my ticket on my phone and let me through anyway. There was a small X-Ray machine, I tossed my roller bag and backpack onto the belt, but kept my phone and wallet in my pocket, and walked through the security checkpoint.

Once through the minimal security checkpoint, I was up in the waiting area above the platform with a variety of different sections. There was a small bar with drinks and snacks, a couple large seating areas, an automated mini mart, some tall tables...

Stool seating

More seating

Even more seating

Shop entrance

... and the entrance to the Premium lounge.

Brightline Station Premium Lounge

Premium Lounge entrance

The Premium Lounge entrance had another electronic gate with a QR code scanner. I tried getting in but it also rejected my boarding pass. My first thought was that I had booked my ticket just 10 minutes earlier so it hadn't synced up yet, so I went back to the security checkpoint and asked what was wrong. They looked at my boarding pass, had no idea what was wrong, and let me into the lounge via the back employee-only entrance instead.

Once inside the lounge, I did a quick loop to see what kind of food and drink options there were. The lounge was entirely un-attended, the only staff I saw were at the security checkpoint, and someone occasionally coming through to take out dirty dishes.

The first thing you're presented with after entering the lounge is the beverage station. There are 6 taps with beer and wine, and you use a touch screen to make your selection and pour what you want.

Beverages

On the other side of the wall is the food. I arrived at the tail end of the breakfast service, so there were pretty slim pickings by the end.

Breakfast

There were yogurts, granola, a bowl of bacon and egg mix, several kinds of pastries, and a bowl of fruit that nobody seemed to have touched. I don't know if this was just because this was the end of the morning, but if you were vegan or gluten free there was really nothing you could eat there.

There was also a coffee and tea station with some minimal options.

Coffee station

Shortly after I arrived, it rolled over to lunch time, so the staff came out to swap out the food at the food station. The lunch options were also minimal, but there was a bit more selection.

Lunch

There was a good size meat and cheese spread. I'm not a big fan of when they mix the meat and cheese on the same plate, but there was enough of a cheese island in the middle I was reasonably confident I wasn't eating meat juice off the side of the cheeses. The pasta dish also had meat so I didn't investigate further. Two of the three wraps had meat and I wasn't confident about which were which so I skipped those. There was a pretty good spinach and feta salad, and some hummus as well as artichoke dip, and a variety of crackers. If you like desserts, there was an even better selection of small desserts as well.

At this point I was starting to listen for my train's boarding announcement. There was barely any staff visible anywhere, but the few people I saw had made it clear that trains would be announced over the loudspeakers when it was time. There was also a sign at the escalators to the platform that said boarding opens 10 minutes before the train departs.

Ten minute warning

The trains run northbound and southbound every 1-2 hours, so it's likely that you'll only hear one announcement for a train other than yours the entire time you're there.

Departure board

The one train announcement I heard was a good demonstration of how quickly the whole process actually is once the train shows up. The train pulls up, they call everyone down to the platform, and you have ten minutes to get onto the train. Ten minutes isn't much, but you're sitting literally right on top of the train platform so it takes no time to get down there.

View from the lounge

Once your train is called, it's time to head down the escalator to the train platform!

Boarding the Train

Escalators

Escalators

But wait, I mentioned my barcode had failed to scan a couple of times at this point. Let me explain. Apparently, in my haste in the back of the Uber, I had actually booked a ticket from Miami to Orlando, but since I was already at the Fort Lauderdale airport, I had gone to the Fort Lauderdale Brightline station because it was the closest. So the departure time I saw on my ticket didn't match the time the train arrived at Fort Lauderdale, and the ticket gates refused to let me in because the ticket didn't depart from that station. I don't know why none of the employees who looked at my ticket ever mentioned this. It didn't end up being a big deal because thankfully Miami was earlier in the route, so I essentially just got on my scheduled train 2 stops late.

Brightline Route

So anyway, I made my way down to the platform to board the train. I should also mention at this point that I was on a conference call from my phone. I had previously connected my phone to the free wifi at the station, and it was plenty good enough for the call. As I went down the escalator to the platform, it broke up a bit in the middle of the escalator, but picked back up once I was on the platform outside.

Platform

There were some signs on the platform to indicate "Coach 1", "Coach 2" and "Coach 3" cars. However my ticket was a "Premium" ticket, so I walked to where I assumed the front of the train would be when it pulled up.

Train approach

I got on the train on the front car marked "SMART" and "3", seats 9-17. It wasn't clear what "SMART" was since I didn't see that option when booking online. My seat was seat 9A, so I wasn't entirely sure I was in the right spot, but I figured better to be on the train than on the platform, so I just went in. We started moving shortly after. As soon as I walked in, I had to walk past the train attendant pushing a beverage cart through the aisles. I made it to seat 9, but it was occupied. I asked the attendant where my seat was, and she said it was in car 1 at the "front", and motioned to the back of the train. I don't know why their cars are in the opposite order you'd expect. So I took my bags back to car 1 where I was finally greeted with the "Premium" sign I was looking for.

Premium

I was quickly able to find my seat, which was not in fact occupied. The Premium car was configured with 2 seats on one side and 1 seat on the other side.

The Brightline Premium Car

Premium Seats

Some of the seats are configured to face each other, so there is a nice variety of seating options. You could all be sitting around a table if you booked a ticket for 4 people, or you could book 2 tickets and sit either next to each other or across from each other.

Seating across

Since I had booked my ticket so last minute, I had basically the last available seat in the car so I was sitting next to someone. As soon as I sat down, the beverage cart came by with drinks. The cart looked like the same type you'd find on an airplane, and even had some identical warning stickers on it such as the "must be secured for takeoff and landing" sign. The drink options were also similar to what you'd get on a Premium Economy flight service. I opted for a glass of prosecco, and made myself comfortable.

The tray table at the seat had two configurations. You could either drop down a small flap or the whole tray.

Small tray table

Large tray table

The small tray was big enough to hold a drink or an iPad or phone, but not much else. The large tray was big enough for my laptop with a drink next to it as well as an empty glass or bottle behind it.

Under the seat there was a single power outlet for the 2 seats with 120v power as well as two USB-C ports.

Power outlets

Shortly after I had settled in, the crew came back with a snack tray and handed me these four snacks without really giving me the option of refusing any of them.

Snacks

At this point I wasn't really hungry since I had just eaten at the airport, so I stuffed the snacks in my bag, except for the prosciutto, which I offered to my seat mate but he refused.

By this point we were well on our way to the Boca Raton stop. A few people got off and on there, and we continued on. I should add here that I always feel a bit unsettled when there is that much movement of people getting on and off all the time. These stops were about 20-30 minutes away from each other, which meant that for the beginning of the ride I never really felt completely settled in. This is the same reason I prefer a 6 hour flight over two 3 hour flights. I like to be able to settle in and just not think about anything until we arrive.

We finally left the last of the South Florida stops, West Palm Beach, and started the rest of the trip to Orlando. A bunch of people got off at West Palm Beach, enough that the Premium cabin was nearly empty at that point. I was able to move to the seat across the aisle which was a window/aisle seat all to myself!

My own seat

Finally I could settle in for the long haul. Shortly before 3, the crew came by with the lunch cart. The options were either a vegetarian or non-vegetarian option, so that made the choice easy for me.

Lunch

The vegetarian option was a tomato basil mozzarella sandwich, a side of fruit salad, and some vegetables with hummus. The hummus was surprisingly good, not like the little plastic tubs you get at the airport. The sandwich was okay, but did have a nice pesto spread on it.

After lunch, I opened up my computer to start writing this post and worked on it for most of the rest of the trip.

As the train started making a left turn to head west, the conductor came on the loudspeaker and made an announcement along the lines of "we're about to head west onto the newest tracks that have been built in the US in 100 years. We'll be reaching 120 miles per hour, so feel free to feel smug as we whiz by the cars on the highway." And sure enough, we really picked up the speed on that stretch! While we had reached 100-120mph briefly during the trip north, that last stretch was a solid 120mph sustained for about 20 minutes!

Orlando Station

Orlando Station

We finally slowed down and pulled into the Orlando station at the airport.

Disembarking the train was simple enough. This was the last stop of the train so there wasn't quite as much of a rush to get off before the train started again. There's no need to mind the gap as you get off since there's a little platform that extends from the train car.

Don't mind the gap

At the Orlando station there was a short escalator up and then you exit through the automated gates.

Exit gates

I assumed I would have to scan my ticket when exiting but that ended up not being the case. Which actually meant that the only time my ticket was ever checked was when entering the station. I never saw anyone come through to check tickets on the train.

At this point I was already in the airport, and it was a short walk around the corner to the tram that goes directly to the airport security checkpoint.

The whole trip took 176 minutes for 210 miles, which is an average speed of 71 miles per hour. When moving, we were typically moving at anywhere from 80-120 miles per hour.

Summary

  • The whole experience was way nicer than an airplane, I would take this over a short flight from Miami to Orlando any day.
  • It felt similar to a European train, but with service closer to an airline.
  • The service needs to be better timed with the stops when people are boarding.
  • The only ticket check was when entering the station, nobody came to check my ticket or seat on the train, or even when I left the destination station.
  • While the Premium car food and drinks were free, I'm not sure it was worth the $85 extra ticket price over just buying the food you want.
  • Unfortunately the ticket cost was similar to that of budget airlines, I would have preferred the cost to be slightly lower. But even still, I would definitely take this train over a budget airline at the same cost.

We need more high speed trains in the US! I go from Portland to Seattle often enough that a train running every 90 minutes that was faster than a car and easier and more comfortable than an airplane would be so nice!

2023-10-23T09:12:55-07:00 Fullscreen Open in Tab
OAuth for Browser-Based Apps Draft 15

After a lot of discussion on the mailing list over the last few months, and after some excellent discussions at the OAuth Security Workshop, we've been working on revising the draft to provide clearer guidance and clearer discussion of the threats and consequences of the various architectural patterns in the draft.

I would like to give a huge thanks to Philippe De Ryck for stepping up to work on this draft as a co-author!

This version is a huge restructuring of the draft and now starts with a concrete description of possible threats of malicious JavaScript as well as the consequences of each. The architectural patterns have been updated to reference which of the threats each pattern mitigates. This restructuring should help readers make a better informed decision by being able to evaluate the risks and benefits of each solution.

https://datatracker.ietf.org/doc/html/draft-ietf-oauth-browser-based-apps

https://www.ietf.org/archive/id/draft-ietf-oauth-browser-based-apps-15.html

Please give this a read, I am confident that this is a major improvement to the draft!

2023-03-09T17:09:09-08:00 Fullscreen Open in Tab
OAuth Support in Bluesky and AT Protocol

Bluesky, a new social media platform, and its underlying AT Protocol are unsurprisingly running up against the same challenges and limitations that Flickr, Twitter and many other social media platforms faced in the 2000s: passwords!

yelp asks you to enter your gmail password

You wouldn't give your Gmail password to Yelp, right? Why should you give your Bluesky password to random apps either!

The current official Bluesky iOS application unsurprisingly works by logging in with a username and password. It's the easiest form of authentication to implement, even if it is the least secure. Since Bluesky and the AT Protocol are actually intending to create an entire ecosystem of servers and clients, this is inevitably going to lead to a complete security disaster. In fact, we're already seeing people spin up prototype Bluesky clients, sharing links around to them, which results in users being taught that there's nothing wrong with handing out their account passwords to random websites and applications that ask for them. Clearly there has to be a solution, right?

The good news is that a solution has existed for about 15 years -- OAuth! This is exactly the problem that OAuth was created to solve: how do we let third party applications access data in a web service without sharing the password with that application?

What's novel about Bluesky (and other similarly decentralized and open services like WordPress, Mastodon, Micro.blog, and others) is that there is an expectation that any user should be able to bring any client to any server, without prior relationships between client developers and servers. This is in contrast to consumer services like Twitter and Google, where they limit which developers can access their API by going through a developer registration process. I wrote more about this problem in a previous blog post, OAuth for the Open Web.

There are two separate problems that Bluesky can solve with OAuth, especially a flavor of OAuth like IndieAuth.

  1. How apps can access data in the user's Personal Data Server (PDS)
  2. How the user logs in to their PDS

How apps can access the user's data

This is the problem OAuth solved when it was originally created, and the problem ATProto currently has. It's obviously very unsafe to have users give their PDS password to every third party application that's created, especially since the ecosystem is totally open so there's no way for a user to know how legitimate a particular application is. OAuth solves this by having the application redirect to the OAuth server, the user logs in there, and then the application gets only an access token.

ATProto already uses access tokens and refresh tokens (although it strangely calls them accessJwt and refreshJwt), so this is a small leap to make. OAuth support in mobile apps has gotten a lot better than it was 10 years ago, and there is now first-class support for this pattern on iOS and Android that makes the experience much smoother than the plain redirect model of a decade ago.
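
To see how small that leap is, here's roughly what the current password-based session creation looks like. This is a simplified sketch; the com.atproto.server.createSession method and its accessJwt/refreshJwt fields come from the public lexicon, but treat the details as illustrative:

```typescript
// Simplified sketch of today's password-based ATProto login. The XRPC
// method name and response fields follow the public lexicon; error
// handling and proper types are omitted for brevity.
async function createSession(pds: string, identifier: string, password: string) {
  const res = await fetch(`${pds}/xrpc/com.atproto.server.createSession`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ identifier, password }),
  });
  // accessJwt and refreshJwt are effectively an OAuth access token and
  // refresh token under different names.
  const { accessJwt, refreshJwt, did, handle } = await res.json();
  return { accessJwt, refreshJwt, did, handle };
}
```

Moving to OAuth mostly means replacing the password parameter here with an authorization code obtained from a redirect.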

Here is the rough experience the user would see when logging in to an app (a sketch of the underlying OAuth exchange follows the list):

app login flow

  1. The user launches the app and taps the "Sign In" button
  2. The user enters their handle or server name (e.g. jay.bsky.social, bsky.social, or aaronpk.com)
  3. The app discovers the user's OAuth server, and launches an in-app browser
  4. The user lands on their own PDS server, and logs in there (however they log in is not relevant to the app, it could be with a password, via email magic link, a passkey, or even delegated login to another provider)
  5. The user is presented with a dialog asking if they want to grant access to this app (this step is optional, but it's up to the OAuth server whether to do this and what it looks like)
  6. The application receives the authorization code and exchanges it at the PDS for an access token and refresh token
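
Under the hood, steps 3-6 are a standard authorization code flow with PKCE. Here's a minimal sketch of what the app would do; the endpoint URLs and client_id are hypothetical placeholders, since Bluesky doesn't define any of this yet:

```typescript
import { createHash, randomBytes } from "crypto";

const base64url = (buf: Buffer) =>
  buf.toString("base64").replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");

// 1. Generate a PKCE verifier/challenge pair before redirecting.
const codeVerifier = base64url(randomBytes(32));
const codeChallenge = base64url(createHash("sha256").update(codeVerifier).digest());

// 2. Build the authorization URL to open in the in-app browser.
//    (Endpoint and client_id are hypothetical placeholders.)
const authUrl = new URL("https://bsky.social/oauth/authorize");
authUrl.searchParams.set("response_type", "code");
authUrl.searchParams.set("client_id", "https://app.example.com/");
authUrl.searchParams.set("redirect_uri", "https://app.example.com/callback");
authUrl.searchParams.set("code_challenge", codeChallenge);
authUrl.searchParams.set("code_challenge_method", "S256");
authUrl.searchParams.set("state", base64url(randomBytes(16)));

// 3. After the user is redirected back, exchange the code for tokens.
async function exchangeCode(code: string) {
  const res = await fetch("https://bsky.social/oauth/token", {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({
      grant_type: "authorization_code",
      code,
      redirect_uri: "https://app.example.com/callback",
      client_id: "https://app.example.com/",
      code_verifier: codeVerifier,
    }),
  });
  return res.json(); // { access_token, refresh_token, ... }
}
```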


Most of this is defined in the core OAuth specifications. The parts that are missing from OAuth are:

  • how to discover an OAuth server given a server name
  • how clients should be identified when there is no client preregistration step

That's where IndieAuth fills in these gaps. With IndieAuth, the user's authorization server is discovered by fetching the web page at their URL. IndieAuth also avoids the need for client registration by using URLs as OAuth client_ids.
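
As a rough sketch, that discovery step could look something like this. A real client should use a proper HTML parser and also check HTTP Link headers; the regex here is only for illustration:

```typescript
// Sketch of IndieAuth-style discovery: fetch the user's profile URL and
// look for a rel link pointing at their authorization server.
async function discoverAuthServer(profileUrl: string): Promise<string | null> {
  const res = await fetch(profileUrl);
  const html = await res.text();

  // The current spec uses rel="indieauth-metadata"; older clients looked
  // for rel="authorization_endpoint". (Naive regex: assumes rel precedes href.)
  const match =
    html.match(/<link[^>]*rel=["']indieauth-metadata["'][^>]*href=["']([^"']+)["']/i) ??
    html.match(/<link[^>]*rel=["']authorization_endpoint["'][^>]*href=["']([^"']+)["']/i);

  // Resolve relative hrefs against the profile URL.
  return match ? new URL(match[1], profileUrl).toString() : null;
}
```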

This does mean IndieAuth assumes there is an HTML document hosted at the URL the user enters, which works well for web-based solutions, and might even work well for Bluesky given the number of people who have already rushed to set their Bluesky handle to the same URL as their personal website. But long term, it might be an additional burden for people who want to bring their own domain to Bluesky if they aren't also hosting a website there.

There's a new discussion happening in the OAuth working group to enable this kind of authorization server discovery from a URL which could rely on DNS or a well-known endpoint. This is in-progress work at the IETF, and I would love to have ATProto/Bluesky involved in those discussions!
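
One existing building block is RFC 8414, which defines a well-known metadata document for OAuth authorization servers. A URL-based discovery flow might end up looking something like this sketch, though since the exact mechanism is what's being discussed, take it as speculative:

```typescript
// Speculative sketch: resolve a bare domain to OAuth server metadata using
// the RFC 8414 well-known path. The final discovery mechanism is still
// being worked out at the IETF.
async function fetchServerMetadata(domain: string) {
  const res = await fetch(`https://${domain}/.well-known/oauth-authorization-server`);
  if (!res.ok) throw new Error(`no authorization server metadata at ${domain}`);
  const meta = await res.json();
  // meta.authorization_endpoint tells the client where to send the user;
  // meta.token_endpoint is where it exchanges the authorization code.
  return meta;
}
```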

How the user logs in to their PDS

Currently, the AT Protocol specifies that login happens with a username and password to get the tokens the app needs. Once clients start using OAuth to log in to apps, this method can be dropped from the specification, which interestingly opens up a lot of new possibilities.

Passwords are inherently insecure, and there has been a multi-year effort to improve the security of every online service by adding two-factor authentication and even moving away from passwords entirely by using passkeys instead.

Imagine that Bluesky wants to add multifactor authentication to its current service today. There's no good way to add this to the existing API, since the Bluesky client sends the password to the API and expects an access token immediately. If Bluesky switches to the OAuth flow described above, the app never sees the password, which means the Bluesky server can start doing more fun things with multifactor auth, and even passwordless flows!

Logging in with a passkey

Here is the same sequence of steps, but this time swapping out the password step for a passkey.

app login flow with passkey

  1. The user launches the app and taps the "Sign In" button
  2. The user enters their handle or server name (e.g. jay.bsky.social, bsky.social, or aaronpk.com)
  3. The app discovers the user's OAuth server, and launches an in-app browser
  4. The user lands on their own PDS server, and logs in there with a passkey
  5. The user is presented with a dialog asking if they want to grant access to this app (this step is optional, but it's up to the OAuth server whether to do this and what it looks like)
  6. The application receives the authorization code and exchanges it at the PDS for an access token and refresh token

This is already a great improvement, and the nice thing is that app developers don't need to worry about implementing passkeys -- they just need to implement OAuth! The user's PDS implements passkeys and abstracts them away by providing the OAuth API instead.
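
For the curious, here's a sketch of what the passkey step might look like inside the PDS's own login page. This is browser-side WebAuthn code that the app developer never has to touch; the rpId and challenge handling are illustrative:

```typescript
// Sketch of the PDS login page invoking a passkey via WebAuthn. The app
// never runs this code; it lives entirely on the PDS's own login page.
async function loginWithPasskey(challenge: Uint8Array): Promise<Credential | null> {
  // Triggers the platform's passkey UI (Face ID, fingerprint, security key...).
  return navigator.credentials.get({
    publicKey: {
      challenge,           // random bytes issued by the PDS for this login
      rpId: "bsky.social", // illustrative relying party ID
      userVerification: "required",
    },
  });
}
```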

Logging in with IndieAuth

Another variation of this would be if the Bluesky service itself supported delegating logins instead of managing any passwords or passkeys at all.

Since Bluesky already supports users setting their handle to their own personal website, it's a short leap to imagine allowing users to authenticate themselves to Bluesky using their website as well!

That is exactly the problem IndieAuth already solves, and there are quite a few implementations of IndieAuth providers in the wild, including Micro.blog, a WordPress plugin, a Drupal module, and many options for self-hosting an endpoint.

Let's look at what the sequence would look like for a user to use the bsky.social PDS with their custom domain handle mapped to it.

app login flow with indieauth

  1. The user launches the app and taps the "Sign In" button
  2. The user enters their server name (e.g. bsky.social)
  3. The app discovers the OAuth server and launches an in-app browser
  4. The user enters their handle, and bsky.social determines whether to prompt for a password or do an IndieAuth flow to their server
  5. The user is redirected to their own website (IndieAuth server) and authenticates there, and is then redirected back to bsky.social
  6. The user is presented by bsky.social with a dialog asking if they want to grant access to this app
  7. The application receives the authorization code and exchanges it at the PDS for an access token and refresh token

This is very similar to the previous flows, the difference being that in this version, bsky.social is the OAuth server as far as the app is concerned. The app never sees the user's actual IndieAuth server at all.

Further Work

These are some ideas to kick off the discussion of improving the security of Bluesky and the AT Protocol. Let me know if you have any thoughts on this! There is of course a lot more detail to discuss about the specifics, so if you're interested in diving in, a good place to start is reading up on OAuth, as well as the IndieAuth extension to OAuth, which has already solved some of the problems in this space.

You can reply to this post by sending a Webmention from your own website, or you can get in touch with me via Mastodon or, of course, find me on Bluesky as @aaronpk.com!