What Happens If Someone Reimplements Your Open Source Software with LLMs And Relicenses It?

It seems there’s a new use case for LLMs: letting them reimplement open-source software in order to re-license the result.

Armin Ronacher has some interesting thoughts on the licensing consequences:

What I think is more interesting about this question is the consequences of where we are. Copyleft code like the GPL heavily depends on copyrights and friction to enforce it. But because it’s fundamentally in the open, with or without tests, you can trivially rewrite it these days.

There are huge consequences to this. When the cost of generating code goes down that much, and we can re-implement it from test suites alone, what does that mean for the future of software? Will we see a lot of software re-emerging under more permissive licenses? Will we see a lot of proprietary software re-emerging as open source? Will we see a lot of software re-emerging as proprietary?

For me personally, what is more interesting is that we might not even be able to copyright these creations at all. A court still might rule that all AI-generated code is in the public domain, because there was not enough human input in it. That’s quite possible, though probably not very likely.

In the GPL case, though, I think it warms up some old fights about copyleft vs permissive licenses that we have not seen in a long time. It probably does not feel great to have one’s work rewritten with a Clanker and one’s authorship eradicated. Unlike the Ship of Theseus, though, this seems more clear-cut: if you throw away all code and start from scratch, even if the end result behaves the same, it’s a new ship. It only continues to carry the name. Which may be another argument for why authors should hold on to trademarks rather than rely on licenses and contract law.

Simon Willison has a timeline of how the “LLM rewrite” of chardet came about and summarizes the arguments of those involved. There’s also a comment by Richard Fontana, one of the authors of the GPLv3 and LGPLv3:

[…] FWIW, IANDBL, TINLA, etc., I don’t currently see any basis for concluding that chardet 7.0.0 is required to be released under the LGPL. AFAIK no one including Mark Pilgrim has identified persistence of copyrightable expressive material from earlier versions in 7.0.0 nor has anyone articulated some viable alternate theory of license violation. I don’t think I personally would have used the MIT license here, even if I somehow rewrote everything from scratch without the use of AI in a way that didn’t implicate obligations flowing from earlier versions of chardet, but that’s irrelevant.

Bionic Duckweed

bionic duckweed, noun

An as-yet-non-existent innovation, hyped with the aim not of selling it or inventing it, but simply of stopping or stalling the actually-existing competition.

In its broader sense, bionic duckweed can be thought of as a sort of unobtainium that renders investment in present-day technologies pointless, unimaginative, and worst of all On The Wrong Side Of History. […] A sort of promissory note in reverse, forcing us into inaction today in the hope of wonders tomorrow.

from Bionic Duckweed: making the future the enemy of the present.

On Moltbook

Bruce Schneier has probably found the best and most succinct quotes to summarize Moltbook:

Many people have pointed out that a lot of the viral comments were in fact posted by people posing as bots. But even the bot-written posts are ultimately the result of people pulling the strings, more puppetry than autonomy.

But his post also has a very dystopian outlook on what might follow:

The theory is simple: First, AI gets accessible enough that anyone can use it. Second, AI gets good enough that you can’t reliably tell what’s fake. Third, and this is the crisis point, regular people realize there’s nothing online they can trust. At that moment, the internet stops being useful for anything except entertainment.

Those ones were the expensive headcount anyway

Ars Technica reports on a study that measured the productivity of software developers from different open-source projects across a variety of (including non-coding) tasks.

In the comments there’s a snarky summary of the article’s main point:

“These factors lead the researchers to conclude that current AI coding tools may be particularly ill-suited to “settings with very high quality standards, or with many implicit requirements (e.g., relating to documentation, testing coverage, or linting/formatting) that take humans substantial time to learn.” While those factors may not apply in “many realistic, economically relevant settings” involving simpler code bases, they could limit the impact of AI tools in this study and similar real-world situations.”

So as long as I cull the experienced people and commit to lousy software the glorious Age of AI will deliver productivity gains? Awesome, those ones were the expensive headcount!

Century-Scale Storage

What would you use to keep (digital) data safe for at least a hundred years? Maxwell Neely-Cohen looks at all the factors, possible technologies, and social and economic challenges that you have to contend with if you intentionally want to store data for a century. He explicitly chose that time scale because it is at the edge of what a human can experience, yet it exceeds a single human’s working life as well as the lifetime of most companies or institutions. So the premise sets you up for a host of problems to be solved. He also analyses past and present strategies for recording and keeping data and evaluates their potential for keeping data safe at century scale.
It’s long, but worth it.

We’ll Ask The AI How to Make Money

We have no current plans to make revenue.

We have no idea how we may one day generate revenue.

We have made a soft promise to investors that once we’ve built a general intelligence system, basically we will ask it to figure out a way to generate an investment return for you.

Sam Altman to VCs in 2024

There’s a video of this memorable moment … you can’t make this up.