The relicensing of chardet [LWN.net]

The relicensing of chardet

By Jonathan Corbet
March 5, 2026
Chardet is a Python module that attempts to determine which character set was used to encode a text string. It was originally written by Mark Pilgrim, who is also the author of a number of Python books; the 1.0 release happened in 2006. For many years, this module has been under the maintainership of Dan Blanchard. Chardet has always been licensed under the LGPL, but, with the 7.0.0 release, Blanchard changed the terms to the permissive MIT license. That has led to an extensive (and ongoing) discussion on when code can be relicensed against the wishes of its original author, and whether using a large language model to rewrite code is a legitimate way to strip copyleft requirements from code.

The fact that chardet is LGPL-licensed has indeed caused some unhappiness in the past. That license is incompatible with the requirements for the Python standard library, frustrating those who would like to see chardet become one of the "batteries" that are included with Python; that licensing has also blocked the inclusion of some other modules that use chardet. Blanchard bemoaned his inability to relicense the code back in 2021:

> Unfortunately, because the code that chardet was originally based on was LGPL, we don't really have a way to relicense it. Believe me, if we could, I would. There was talk of chardet being added to the standard library, and that was deemed impossible because of being unable to change the license.

In 2026, though, that inability has, according to Blanchard, been overcome by virtue of a complete rewrite — done using Anthropic's Claude LLM — of the source. Pilgrim did not see it that way:

> However, it has been brought to my attention that, in the release 7.0.0, the maintainers claim to have the right to "relicense" the project. They have no such right; doing so is an explicit violation of the LGPL. Licensed code, when modified, must be released under the same LGPL license. Their claim that it is a "complete rewrite" is irrelevant, since they had ample exposure to the originally licensed code (i.e. this is not a "clean room" implementation). Adding a fancy code generator into the mix does not somehow grant them any additional rights.

Blanchard, unsurprisingly, disagreed. A clean-room reimplementation, he said, "is a means to an end, not the end itself", and there are other ways to reach that end, including an LLM rewrite. He pointed to results from a code-comparison tool showing that there was almost no similarity between version 7.0 and the previous versions, and concluded:

> I then started in an empty repository with no access to the old source tree, and explicitly instructed Claude not to base anything on LGPL/GPL-licensed code. I then reviewed, tested, and iterated on every piece of the result using Claude. You can see the history of all the design and implementation plans that were used to create 7.0.0 here. I did not write the code by hand, but I was deeply involved in designing, reviewing, and iterating on every aspect of it.
>
> I understand this is a new and uncomfortable area, and that using AI tools in the rewrite of a long-standing open source project raises legitimate questions. But the evidence here is clear: 7.0 is an independent work, not a derivative of the LGPL-licensed codebase. The MIT license applies to it legitimately.

Simon Willison has observed, though, that the LLM did indeed access the LGPL-licensed source at one point. Beyond that, as others have pointed out, it is easy to ask an LLM to reimplement a body of code in a style different from the original, with the result that similarity checkers will see something entirely new. That does not necessarily break the derived-work link, though. Had an LLM been employed to translate chardet to, say, Lisp, the level of similarity would be quite low, but most would agree that the new code was derived from the original. The fact that the training corpus for Claude surely included all previous versions of chardet also muddies the picture.
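The point about similarity checkers can be illustrated with a minimal sketch. It uses Python's standard `difflib` module as a stand-in for a code-comparison tool; this is an assumption for illustration, not the tool Blanchard used, and the two `count_high_bytes` functions are hypothetical examples that do not come from chardet. The sketch compares a function against a restyled version that behaves the same on byte strings but shares less text with the original.

```python
import difflib

# Hypothetical "original" function; not taken from chardet.
ORIGINAL = '''
def count_high_bytes(data):
    total = 0
    for byte in data:
        if byte >= 0x80:
            total += 1
    return total
'''

# The same behavior for byte strings, expressed in a different style.
RESTYLED = '''
def count_high_bytes(data):
    return sum(1 for b in data if b & 0x80)
'''


def textual_similarity(a: str, b: str) -> float:
    """Return a 0.0-1.0 character-level similarity ratio for two texts."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def load(source: str):
    """Execute the source text and return the function it defines."""
    namespace = {}
    exec(source, namespace)
    return namespace["count_high_bytes"]


if __name__ == "__main__":
    sample = b"na\xc3\xafve caf\xc3\xa9"  # UTF-8 bytes with four high bytes
    print("same result:", load(ORIGINAL)(sample) == load(RESTYLED)(sample))
    print("similarity to itself:", textual_similarity(ORIGINAL, ORIGINAL))
    print("similarity to restyled:", textual_similarity(ORIGINAL, RESTYLED))
```

A textual score of this kind measures how much text two files share; it says nothing on its own about whether one was produced from the other.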

A lot of people who are not lawyers have offered opinions on whether chardet 7.0 is derived from previous versions. I, too, am not a lawyer, and will not add to that pile. But it is worth saying that, if instructing an LLM to rewrite an existing body of code is sufficient to strip copyleft requirements from that code, then the future of copyleft looks even dimmer than it did before. But, then, the future of any sort of software licensing scheme could be threatened. The death of copyleft could, ironically, be part of its real goal: the end of copyright.

Meanwhile, of course, had Blanchard simply shown up with a new Python module, let's call it "detectchar", that implemented the same API as chardet, the overall level of eyebrow elevation would have been considerably lower. Replacing the existing code, under the same name but with a different license, drew a lot more attention to this move than it would have otherwise attracted.

Nobody involved in the current discussion is showing any sign of backing down. That means the license change seems likely to stand, unless Pilgrim decides to bring in real lawyers, which would be an expensive and uncertain prospect at best. But if the change stands, it would not be surprising to see a lot more people engaging in this sort of license-stripping exercise. That may eventually lead to a court decision (or, more likely, a series of conflicting decisions) on whether an LLM can be used to launder source code in this way. The old Chinese curse — may you live in interesting times — would certainly appear to be upon us.


Can of worms..

Posted Mar 5, 2026 20:03 UTC (Thu) by dskoll (subscriber, #1630) [Link] (63 responses)

Is there not precedent saying that the output of an LLM cannot be copyrighted? So how does the maintainer get around that... make a few changes here and there to provide some "human" creative input?

This copyright-washing scenario is exactly what I'm trying to avoid with my anti-AI stance that was covered in the article about Remind a week or so ago. I do not know of a legal and effective way to do this, though.

Can of worms..

Posted Mar 5, 2026 20:24 UTC (Thu) by vadim (subscriber, #35271) [Link] (22 responses)

I don't think the maintainer has to get around anything. The MIT is already very close to no copyright at all, so if a good chunk of the code is in the public domain I doubt they care.

Where you really want copyright is licenses like the GPL.

Can of worms..

Posted Mar 5, 2026 21:02 UTC (Thu) by dskoll (subscriber, #1630) [Link] (21 responses)

Except that anyone can take the public-domain version and add chunks to it licensed under the LGPL, effectively converting the whole thing back to LGPL. I don't think you can relicense MIT-licensed software without the copyright holder's consent, can you?

The MIT license also includes requirements to include a copyright notice and a disclaimer of warranty, so it's not the same as public-domain.

Can of worms..

Posted Mar 6, 2026 0:56 UTC (Fri) by comex (subscriber, #71521) [Link] (16 responses)

You can indeed take MIT-licensed code and relicense it under the [L]GPL, as long as the original copyright notice is kept in addition to the new license. Or at least, this is the standard interpretation that the software world has used for decades; I don’t know if it has been tested in court.

As for the requirements of the MIT license, yes they exist, but they just aren’t as significant as the GPL’s, so people using MIT are probably less likely to care if they can’t be enforced.

Can of worms..

Posted Mar 6, 2026 14:27 UTC (Fri) by Wol (subscriber, #4433) [Link] (15 responses)

> You can indeed take MIT-licensed code and relicense it under the [L]GPL, as long as the original copyright notice is kept in addition to the new license. Or at least, this is the standard interpretation that the software world has used for decades; I don’t know if it has been tested in court.

It almost certainly has been tested in court, if not in that guise, and the answer is "If the original licence does not give you permission to relicence, then you cannot relicence". MIT does NOT give you that permission!!!

Your take is the layperson's take, which is close enough to have no legal consequences, but as far as the detail goes it's completely wrong. And it's what the (L)GPL takes advantage of!

Let's say I write a big project under the MPL (which would be my licence of choice). You then add a (not insignificant) amount of work under the LGPL. The *source code* is now an *aggregate* work, which contains my MPL work and your LGPL work. My licence applies to my code, your licence applies to your code. We can both distribute the source, and each individual part is distributed under its own licence.

BUT if both of us want to distribute binaries *different rules apply to each of us* (which you may find surprising). If I build a binary containing your code, I have to comply with the LGPL only (the MPL doesn't apply to me). If you build a binary, you have to comply with the MPL only. If Joe Bloggs builds a binary, he has to comply with both the MPL and the LGPL - the MPL for my code, the LGPL for yours.

And this is the magic of the LGPL - it is written in such a way that the only way you can mix licences is to ensure any other licence grants a superset of the (L)GPL's grants. So by complying with the (L)GPL, you comply with any other compatible licence as a matter of course.

So you, me, and Joe Bloggs can all distribute the BINARY under the terms of the LGPL, and be comfortable we are all complying with all applicable licences. But only YOU (and yes you can) can distribute the binary under the terms of the MPL. Your downstream has to distribute under MPL/LGPL.

Cheers,
Wol

Can of worms..

Posted Mar 6, 2026 15:58 UTC (Fri) by intelfx (subscriber, #130118) [Link] (9 responses)

> It almost certainly has been tested in court, if not in that guise, and the answer is "If the original licence does not give you permission to relicence, then you cannot relicence". MIT does NOT give you that permission!!!

Text of the MIT license:

> Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, **sublicense**, and/or sell copies of the Software <...>

Can of worms..

Posted Mar 6, 2026 16:03 UTC (Fri) by dskoll (subscriber, #1630) [Link] (7 responses)

Yes, but also text of the MIT license: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

Can of worms..

Posted Mar 6, 2026 16:19 UTC (Fri) by intelfx (subscriber, #130118) [Link] (5 responses)

> Yes, but also text of the MIT license: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

And?

The permission notice shall be included, but it shall not need to apply to the derivative work. That is the point of a "permissive" license. Have you never seen an "About" page in various proprietary software packages informing the user of all the permissive licenses of libraries that were used in development of said software package?

Can of worms..

Posted Mar 6, 2026 22:44 UTC (Fri) by dskoll (subscriber, #1630) [Link] (4 responses)

If the permission notice is included, it does apply to the derivative work because the notice you are obliged to include says:

> Permission is hereby granted, free of charge, to any person obtaining a copy of this software...

Can of worms..

Posted Mar 9, 2026 17:41 UTC (Mon) by NYKevin (subscriber, #129325) [Link] (3 responses)

This cannot be the correct analysis. The MIT license explicitly grants a right to "sublicense." That is lawyer speak for "distributing under a different license," like the following:

> [Name of upstream project] is licensed under the MIT license, which is as follows: [Text of the MIT license]
>
> The MIT license does not apply to [name of this project]. Instead, [this project] is licensed under [other terms].

Can of worms..

Posted Mar 10, 2026 0:28 UTC (Tue) by Wol (subscriber, #4433) [Link] (2 responses)

But a sub-licence does not allow you to grant rights you do not yourself have, so it can't be a re-licence.

Cheers,
Wol

Can of worms..

Posted Mar 18, 2026 11:47 UTC (Wed) by vonbrand (guest, #4458) [Link] (1 responses)

The (L)GPL gives fewer rights than MIT, so applying the (L)GPL to an MIT work is fine.

Can of worms..

Posted Mar 18, 2026 14:02 UTC (Wed) by dskoll (subscriber, #1630) [Link]

Sure, but how do we square that with the wording of the MIT license that requires a copy of "this" permission notice (ie, the MIT license itself) to be included with the software?

Can of worms..

Posted Mar 12, 2026 22:21 UTC (Thu) by immibis (subscriber, #105511) [Link]

Including the permission notice isn't the same as actually granting permission. You can precede it with a sentence which states it applies to just a certain portion, or that it does not apply but you must copy the notice anyway.

On my website there are software files that include a full MIT license text and a statement it applies to the software, but in order to access these files you must click through a notice that says the MIT license does not apply, the AGPL does, and these files are provided for purposes that need bit-for-bit compatibility with the previous iteration of the files before I relicensed them. The legal status is clear: I am not granting you an MIT license, even though the content of the file says it has an MIT license. The file contains green-coloured bits (https://ansuz.sooke.bc.ca/entry/23) that say "These bits are orange." which does not actually make them orange.

Can of worms..

Posted Mar 7, 2026 0:14 UTC (Sat) by Wol (subscriber, #4433) [Link]

> **sublicense**

Ianal, but to me sub-licence means "only pass on some of the rights I received". I can't actually change the licence, I can only *further* restrict what you're allowed to do (pretty easy here, I can erase most of your rights by not passing on the source).

Cheers,
Wol

Can of worms..

Posted Mar 12, 2026 3:53 UTC (Thu) by milesrout (subscriber, #126894) [Link] (1 responses)

This is a very confident comment to make as a nonlawyer Wol. I am a lawyer and I am not nearly as confident in my knowledge of this area as you are.

Can of worms..

Posted Mar 12, 2026 8:27 UTC (Thu) by Wol (subscriber, #4433) [Link]

Mebbe.

But PJ and Groklaw were very good teachers :-)

The big problem is what actually IS a derived work. That is vague. But imho once you've decided that something is a derived work, the legal consequences are clear and simple (mostly).

And apologies for being extremely cynical about professionals, but I've had far too much to do with - to take two, doctors and lawyers - and there is ABSOLUTELY NO WAY I would take what they say on trust unless I know them - "Trust me, I'm an X" raises red flags like nobody's business. "Verify then trust", and I trust (for the most part) my judgement.

Cheers,
Wol

Can of worms..

Posted Mar 12, 2026 22:14 UTC (Thu) by immibis (subscriber, #105511) [Link]

The MIT license gives you the permission to "sublicense", which (AFAIK) means to give downstream people only a subset of the permissions you received. The LGPL is a subset of "do basically anything you want to" so it's allowed. You can (and people do!) also make your downstream version proprietary, or almost any other license.

Can of worms..

Posted Mar 13, 2026 1:57 UTC (Fri) by comex (subscriber, #71521) [Link] (1 responses)

That may be true for the LGPL. But the GPL requires this (in v3):

> You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy.

Or this in v2:

> b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License

The “standard interpretation” I was referring to is the treatment of MIT as GPL-compatible. GPL-compatible of course means that you can mix together MIT code and GPL code to create a combined work which can be legally distributed. But by the GPL’s own terms, this is only possible if you can license the entire work, including the MIT parts, under the GPL.

If you can license a combined work under the GPL, then presumably you can also license it under the LGPL (even though the LGPL does not require you to). And if you can license a combined work under the LGPL, then you can probably license the MIT code by itself under the LGPL, because the MIT License’s text does not distinguish between derived works and the original work.

However, there is still a lot that a court could unpack. Does “sublicense” in the MIT License really include licensing under a different license? If so, what about the BSD License, which is also considered GPL-compatible but doesn’t explicitly mention sublicensing? Is there any difference between “must cause [..] to be licensed” from the GPLv2 and “must license” from the GPLv3?

And what does it even mean to “license” something? To license is to grant permission. But the GPL doesn’t merely demand that you grant your personal permission to others to use the work. It wouldn’t do to say: “I’m combining Company X’s proprietary code with GPL code, and it’s fine by me if you use the result under the GPL (not my problem if Company X objects).” Rather, the GPL implies that your “license” has to be enough for the recipient to actually exercise their GPL rights (at least with respect to copyright; patents are another story). But does that mean you need the formal right to grant a license on behalf of everyone who owns a copyright in the work, or is it enough to combine your license with an assurance that all other copyright holders have also given permission? Is there even a legal difference between those two things?

Ultimately, it probably doesn’t matter. Courts try to interpret contract and license terms based on intent. It’s pretty clear that people who release their code under open source licenses, both (L)GPL and permissive, intend to allow (L)GPL and permissive code to be combined. (The exception is code released before the GPL was invented, but even then you have intent from the GPL side.) So courts would probably interpret the unclear provisions in favor of allowing these combinations.

But precisely because these details don’t make much practical difference, nobody has started any lawsuits over them (as far as I know), so we don’t get any actual court rulings on them. The GPL as a whole only has a few lawsuits, and they tend to be over spicier issues than GPL-MIT combinations.

Can of worms..

Posted Mar 13, 2026 8:55 UTC (Fri) by Wol (subscriber, #4433) [Link]

> The “standard interpretation” I was referring to is the treatment of MIT as GPL-compatible. GPL-compatible of course means that you can mix together MIT code and GPL code to create a combined work which can be legally distributed. But by the GPL’s own terms, this is only possible if you can license the entire work, including the MIT parts, under the GPL.

Note the words "as a whole".

Inasmuch as you can separate the parts, they are under their own licence (so there the individual licences trump). The binary is like an anthology - the publisher owns the copyright in the book, the contributors own the copyright in their individual parts. But you can't rip the book up into its individual pages and pretend the publisher's copyright no longer applies.

So effectively you are applying a "publisher's copyright" of (L)GPL to the binary as a whole.

Cheers,
Wol

Can of worms..

Posted Mar 6, 2026 15:50 UTC (Fri) by frankie (subscriber, #13593) [Link]

... which exactly implies that the whole license matter is toilet paper right now. ANYONE can take ANY source, rewrite it in another language with a certain set of human-in-the-loop changes, and relicense it under a different license, with a different copyright holder. Plain and clean. I will ignore implications for proprietary software, because that's out of legality for a totally different reason, but practically also feasible...

Can of worms..

Posted Mar 12, 2026 3:51 UTC (Thu) by milesrout (subscriber, #126894) [Link] (2 responses)

Why would you not be able to license a derived work of an MIT-licensed work under the LGPL?

Can of worms..

Posted Mar 12, 2026 8:20 UTC (Thu) by Wol (subscriber, #4433) [Link]

Bear in mind, the SOURCE is an aggregate work, not a derived work.

The binary is a derived work, so if you distribute the binary of an MIT/LGPL work you do get caught by the LGPL.

Cheers,
Wol

Can of worms..

Posted Mar 12, 2026 14:19 UTC (Thu) by dskoll (subscriber, #1630) [Link]

Well, maybe you could relicense. But the text of the MIT license requires you to include "this permission notice in all copies or substantial portions of the Software", where this refers to the MIT license itself.

Can of worms..

Posted Mar 5, 2026 20:35 UTC (Thu) by smcv (subscriber, #53363) [Link] (31 responses)

> Is there not precedent saying that the output of an LLM cannot be copyrighted?

If the maintainer wants a permissively-licensed, non-copyleft project, they're unlikely to see that as a problem.

> This copyright-washing scenario

I think you're right to describe it as copyright-washing rather than copyleft-washing: if the result is a derivative work then the limits on the permissions the copyright holder is willing to give are relevant (it doesn't really matter whether those permissions are LGPL or CC-NC-ND or "you can download one copy if you pay me first"), and if it isn't a derivative work then they have no control.

Perhaps an interesting/relevant thought experiment is: I ask an LLM to generate a book as similar as possible to (say) Lord of the Rings. Is the result a derivative work (infringing the Tolkien estate's copyright) or not? I suspect that the copyright industry wants the answer to be "yes, derivative work", the LLM industry wants the answer to be "no, not derivative work", and the answer the law gives in each country will come down to which well-funded group can get the law clarified or changed in their favour.

My understanding is that laws aren't applied like a deterministic machine and courts try to take intent into account, so whether this is viable might depend on whether the prompts to the LLM are an obvious attempt at copyrightwashing, or a good-faith attempt to generate something new, or somewhere in between those extremes, more than it depends on the actual content that is produced.

Can of worms..

Posted Mar 5, 2026 21:05 UTC (Thu) by dskoll (subscriber, #1630) [Link] (9 responses)

Well, if courts take intent into account, I think the maintainer who converted the license might be in trouble, since it's pretty clear his only motivation was to avoid the restrictions of the original license and not to add any other value to the project.

Can of worms..

Posted Mar 6, 2026 0:33 UTC (Fri) by rgmoore (✭ supporter ✭, #75) [Link] (8 responses)

> since it's pretty clear his only motivation was to avoid the restrictions of the original license and not to add any other value to the project.

That appears not to be the case. According to the release notes, 7.0 massively improves performance relative to 6.0. It's supposed to be 30x faster and improve accuracy on their test suite from 88.2% to 98.1%. It also increased the number of supported encodings from 84 to 99 and added features like language detection. That sure sounds like he's done more than just rewrite things to justify changing the license.

Can of worms..

Posted Mar 6, 2026 0:46 UTC (Fri) by bluca (subscriber, #118303) [Link] (7 responses)

Also it takes some real kind of obtuse zealotry to try and suggest that a maintainer who has been keeping a project alive by themselves for ~15 years has no interest in "adding any other value to the project"

Can of worms..

Posted Mar 6, 2026 0:54 UTC (Fri) by dskoll (subscriber, #1630) [Link] (5 responses)

I meant that the LLM shenanigans were designed to change the license. I don't mean the maintainer hasn't added value in the intervening years.

Can of worms..

Posted Mar 6, 2026 14:32 UTC (Fri) by Wol (subscriber, #4433) [Link] (4 responses)

The other thing I haven't seen mentioned is how much 15-yr-old code is left in the project?

The mere fact the new guy has been maintaining the project for 15 years might mean the old guy has no copyright left to complain with.

It would have made a lot more sense to do a "git blame" and just rewrite the bits needed, than to make a big noise about replacing the lot.
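As a rough sketch of that "git blame" idea: `git blame --line-porcelain <file>` prints, for every line of a file, a header block containing an `author <name>` field, and those fields can be tallied per author. The helper names, the sample data, and the author names below are hypothetical. Line-level attribution like this shows only who last touched each line; it is not a determination of who holds copyright in it.

```python
import subprocess
from collections import Counter


def lines_per_author(porcelain: str) -> Counter:
    """Tally lines per author from `git blame --line-porcelain` output."""
    counts = Counter()
    for line in porcelain.splitlines():
        # Header fields start at column 0; file content lines start with a tab,
        # and "author-mail"/"author-time" do not match the "author " prefix.
        if line.startswith("author "):
            counts[line[len("author "):]] += 1
    return counts


def blame_file(path: str) -> Counter:
    """Run git blame on one file in the current repository (needs git installed)."""
    result = subprocess.run(
        ["git", "blame", "--line-porcelain", path],
        capture_output=True, text=True, check=True,
    )
    return lines_per_author(result.stdout)


# Hypothetical, abbreviated porcelain output for a three-line file.
SAMPLE = (
    "1111111111111111111111111111111111111111 1 1 2\n"
    "author Original Author\n"
    "author-mail <original@example.invalid>\n"
    "filename detector.py\n"
    "\tdef detect(data):\n"
    "1111111111111111111111111111111111111111 2 2\n"
    "author Original Author\n"
    "author-mail <original@example.invalid>\n"
    "filename detector.py\n"
    "\t    author = None\n"
    "2222222222222222222222222222222222222222 3 3 1\n"
    "author Current Maintainer\n"
    "author-mail <maintainer@example.invalid>\n"
    "filename detector.py\n"
    "\t    return author\n"
)

if __name__ == "__main__":
    for name, count in lines_per_author(SAMPLE).most_common():
        print(f"{name}: {count}")
```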

Cheers,
Wol

Can of worms..

Posted Mar 6, 2026 17:16 UTC (Fri) by rgmoore (✭ supporter ✭, #75) [Link] (3 responses)

It's not just a question of the provenance of each line of code in the current project. There's a Ship of Theseus argument about any copyleft project. If you replace or add a small amount to the project- rewriting a function or adding a small feature- then the result is clearly a derivative work of what was there before. Because the previous version was under a copyleft license, the new version must be, too. Even if you keep tinkering with the code until none of the original code remains, each step was a derivative of the previous, so the completely rewritten version is still a derivative work that needs to keep the original license. If you want to change the license, you need to replace the whole thing all at once with a new version that does things a completely different way so you can break the chain of derivative works. At least that's the argument a copyleft maximalist would make.

My impression is this case is an example of that kind of ground-up rewrite. The maintainer decided to completely redesign the way the software works but keep it under the same project name. As I understand it, at least, the only thing that's left from the previous version is the public API. Because he had rewritten the whole thing from scratch (with the aid of an LLM), relicensing became possible. It's not 100% clear if this was mostly to justify relicensing or if it was mostly for performance reasons and the relicensing was an added benefit (from his point of view).

It might have been cleaner if he had released the new code under a new project name rather than just a new version of the old name. That said, I've done the same general kind of thing- completely rewriting an existing project but keeping it under the same name- for non-technical reasons, so I can understand the decision. I personally think people are paying way too much attention to the LLM part, and they need to dig into the code to see if the new version is just a derivative of the old or if it's genuinely different.

Can of worms..

Posted Mar 6, 2026 23:37 UTC (Fri) by muase (subscriber, #178466) [Link] (2 responses)

> It's not just a question of the provenance of each line of code in the current project. There's a Ship of Theseus argument about any copyleft project. If you replace or add a small amount to the project- rewriting a function or adding a small feature- then the result is clearly a derivative work of what was there before. Because the previous version was under a copyleft license, the new version must be, too. Even if you keep tinkering with the code until none of the original code remains, each step was a derivative of the previous, so the completely rewritten version is still a derivative work that needs to keep the original license. If you want to change the license, you need to replace the whole thing all at once with a new version that does things a completely different way so you can break the chain of derivative works. At least that's the argument a copyleft maximalist would make.

It should be noted though that this argument is usually not considered sound, nor does it have any legal precedent (that I know of). Thing is that licenses only work because they use copyright mechanisms as leverage: If you are NOT the copyright owner, by default you generally are not allowed to use the code freely. The license now makes use of this default by granting an exception: you may still use this code under certain conditions.

The important part here is that the license is only effective as an agreement between the copyright holders and an unauthorized third party – it is totally irrelevant for the copyright holders themselves; and it also cannot be applied if you otherwise have permission to use the code anyways (e.g. via other legal copyright exceptions, fair use, ...). So, as long as you are the copyright owner of all lines of code, you can do whatever you want whenever you want; there are actually lots of real-world examples where companies relicense their software under less permissive licenses overnight, or introduce business models like GPL-by-default with alternative commercial licenses, etc.

And I've never heard anyone seriously considering the ship of Theseus argument when it comes to copyright itself before – to stay in the metaphor, your planks are your planks. The logical error is here:

> If you replace or add a small amount to the project- rewriting a function or adding a small feature- then the result is clearly a derivative work of what was there before.

Nope, there is nothing clear about that – quite the contrary: If I add a small function, I still have the copyright for that function. If I rewrite something, I have the copyright for the rewritten parts. And if I have the copyright for something, I can relicense it. Let's say I write a small optimized XYZ-function and add it into a GPL project – now, this function is GPL licensed.
But: I'm still the copyright owner – and I still can do whatever I want with that specific function. If I want to copy-paste it into my other proprietary projects, I can do that. If I want to sell commercial closed-source licenses for that, I can do that. And finally, if I have provenance for all lines of code – either because I originally added them, or I have rewritten them – I'm the copyright holder for everything I have produced, with all the consequences.

(It is important to note though that "rewritten" really means "recreated in a sufficiently different way", and not just changed formatting or similarly trivial refactoring. That's also where "clean room implementations" come into play – not because they are strictly necessary, but because they make it easier to argue that you did not simply perform an obfuscated copy-and-paste.)

Can of worms..

Posted Mar 8, 2026 18:46 UTC (Sun) by NAR (subscriber, #1313) [Link] (1 responses)

> And I've never heard anyone seriously considering the ship of Theseus argument when it comes to copyright itself before

Didn't that happen with the BSD vs System V UNIX case? I mean BSD started out from the System V, but eventually all source files were replaced.

Can of worms..

Posted Mar 12, 2026 17:55 UTC (Thu) by anton (subscriber, #25547) [Link]

And eventually BSD was free.

Can of worms..

Posted Mar 6, 2026 20:40 UTC (Fri) by ballombe (subscriber, #9523) [Link]

It is unclear whether the new maintainer actually asked the old maintainer for permission to relicense at any point.

Can of worms..

Posted Mar 5, 2026 23:46 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (19 responses)

> My understanding is that laws aren't applied like a deterministic machine and courts try to take intent into account, so whether this is viable might depend on whether the prompts to the LLM are an obvious attempt at copyrightwashing, or a good-faith attempt to generate something new, or somewhere in between those extremes, more than it depends on the actual content that is produced.

Well... Not really.

Yes, courts do generally take intent into account when ruling on things. No, you can't assume that will generalize to all things that all courts rule on for all time. In the case of infringement-by-derivative-work, there are two considerations (under US law, but other jurisdictions are likely to be similar in practice):

1. Did the defendant have access to the plaintiff's work? In this case, the answer is probably "yes," since the defendant was actively maintaining prior versions of chardet for years and years before this whole kerfuffle.
2. Is the defendant's work substantially similar to the plaintiff's work? Since this is a software case, the US would (probably) apply the abstraction-filtration-comparison test. Fully explaining the AFC test is well beyond the scope of an LWN comment, but the gist is that a judge compares the source code of the two works (7.0 and whatever prior version the plaintiff claims has been infringed, which must be a version that the plaintiff authored in part or whole) and decides whether the defendant's work has a lot of copyrightable elements in common with the plaintiff's work.

In this case, the critical question is how similar 7.0 is to whatever version Mark last contributed to (assuming he's the plaintiff). I'm not totally sold on Dan's similarity score approach[1], because this is generally a case-by-case question decided by a judge, not something you can evaluate mechanically. However, he is at least trying to evaluate the right sort of thing, as opposed to all this talk of what the LLM may or may not have done in the course of creating 7.0. I think the courts are very unlikely to care about what the LLM did, how it was trained, etc., unless Dan tries to contest the "access" prong of the test. But the plaintiff must satisfy both prongs, so that arguably does not matter - if there is no substantial similarity, then the plaintiff loses, the end.

[1]: https://github.com/chardet/chardet/issues/327#issuecommen...

Can of worms..

Posted Mar 6, 2026 1:03 UTC (Fri) by bluca (subscriber, #118303) [Link] (5 responses)

Would also be interesting to know whether the presumptive plaintiff has any standing to sue like that in the first place. If the last time they contributed was 20 years ago, and the current maintainer has been working on this solo for 15 years, it wouldn't be difficult to imagine that anything copyrightable from the former has long since been replaced.

Can of worms..

Posted Mar 6, 2026 2:09 UTC (Fri) by NYKevin (subscriber, #129325) [Link] (2 responses)

If there's nothing left of Mark's work in 6.x, then it could've been relicensed without a rewrite, at least in principle. But Mark would likely allege non-literal copying, which is harder to evaluate. You cannot just count identical lines of code and call it a day. The overall structure and organization of the code is also subject to copyright.

Can of worms..

Posted Mar 6, 2026 9:35 UTC (Fri) by joib (subscriber, #8541) [Link]

For a tautological(?) example, you could add "# Hello world" comments to the end of every line, and pronto, per "git blame" there's nothing left of the original author's work.
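A minimal Python sketch of that point, not taken from chardet or from any analysis mentioned in this thread. The sample source and the helper names (`append_comment`, `identical_line_ratio`, `code_tokens`) are hypothetical. It shows a line-based comparison, roughly the view that "git blame" gives, dropping to zero while the code tokens stay identical:

```python
import difflib
import io
import tokenize

# Hypothetical sample source, for illustration only (not chardet code).
ORIGINAL = "\n".join([
    "def detect(data):",
    "    if not data:",
    "        return None",
    "    return 'ascii'",
])


def append_comment(source: str, comment: str = "# Hello world") -> str:
    """Append a trailing comment to every non-empty line."""
    return "\n".join(
        f"{line}  {comment}" if line.strip() else line
        for line in source.splitlines()
    )


def identical_line_ratio(old: str, new: str) -> float:
    """Fraction of lines in `old` that survive unchanged in `new`."""
    old_lines = old.splitlines()
    new_lines = new.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(old_lines) if old_lines else 0.0


def code_tokens(source: str) -> list[str]:
    """Token strings with comments and layout-only tokens removed."""
    skip = {
        tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
        tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
    }
    readline = io.StringIO(source).readline
    return [t.string for t in tokenize.generate_tokens(readline) if t.type not in skip]


REWRITTEN = append_comment(ORIGINAL)

# No line is textually identical any more...
print(identical_line_ratio(ORIGINAL, REWRITTEN))
# ...yet the code, comments aside, is token-for-token the same.
print(code_tokens(ORIGINAL) == code_tokens(REWRITTEN))
```

Neither number says anything about whether one text is legally a derivative work of the other; the sketch only shows that counting identical lines is easy to defeat.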

Can of worms..

Posted Mar 6, 2026 10:21 UTC (Fri) by bluca (subscriber, #118303) [Link]

Yes, and the explanation given by the maintainer for the new version is that it shares no overall structure and organization with the old code. A cursory look would seem to confirm it.

Can of worms..

Posted Mar 6, 2026 10:53 UTC (Fri) by ehiggs (subscriber, #90713) [Link] (1 responses)

It would be very interesting if courts found that version x+1 was NOT a derivative work of version x.

Can of worms..

Posted Mar 6, 2026 14:34 UTC (Fri) by Wol (subscriber, #4433) [Link]

But version x+99? When the last version the plaintiff committed to was version x?

Cheers,
Wol

Can of worms..

Posted Mar 6, 2026 6:26 UTC (Fri) by gdt (subscriber, #6284) [Link] (12 responses)

A legal question is whether the use of the LLM, if it was trained on the original work, amounts to unclean hands.

Can of worms..

Posted Mar 6, 2026 10:24 UTC (Fri) by bluca (subscriber, #118303) [Link] (11 responses)

No, as the training process is explicitly excepted from copyright, by law.

Copyright exemption

Posted Mar 6, 2026 14:10 UTC (Fri) by corbet (editor, #1) [Link] (10 responses)

You keep saying that training is exempt from copyright "by law". To which law are you referring? In the US, at least, there are ongoing court cases, and I'm allegedly someday going to get some sort of settlement for the use of my books. It does not seem so cut-and-dried to me.

Copyright exemption

Posted Mar 6, 2026 14:15 UTC (Fri) by bluca (subscriber, #118303) [Link] (6 responses)

The only one that matters, of course ;-) EU law, which applies where I live. I have no idea about the US, I think fair use is what gets mentioned? But that's case-by-case ofc, so it's messy. That's something your legislators should really take care of, but if the news is anything to go by, they seem otherwise busy these days...

Copyright exemption

Posted Mar 6, 2026 14:38 UTC (Fri) by Wol (subscriber, #4433) [Link] (5 responses)

It's only TRAINING that's exempt, not OUTPUT.

I'm sure that an EU court would have ruled that Mozart's rewrite of that mass was a copyright violation. Going to a performance and listening was perfectly legal. Then going home and regurgitating a copy onto fresh manuscript most certainly isn't under pretty much any modern law.

And afaict, that applies to LLMs as well.

Cheers,
Wol

Copyright exemption

Posted Mar 6, 2026 15:10 UTC (Fri) by bluca (subscriber, #118303) [Link] (4 responses)

Yes, and the question above was about training, not output

Copyright exemption

Posted Mar 7, 2026 0:56 UTC (Sat) by josh (subscriber, #17465) [Link] (3 responses)

The issue posed by the chardet case, though, is one of output: is the output derived from the training data? Whether or not any particular locale determines there's a special privilege for AI training to ignore licensing, that doesn't necessarily change whether derivation can arise from the resulting AI slopification, unless an even broader privilege to violate licensing were granted.

Copyright exemption

Posted Mar 7, 2026 1:12 UTC (Sat) by bluca (subscriber, #118303) [Link] (2 responses)

It is not derivative work of the "training data", because that's not how derivative work works.
And we know it's not a copy or a minor incremental modification of the previous version, as static analysis was done to show how insignificant the similarity is: https://github.com/chardet/chardet/issues/327#issuecommen...

Copyright exemption

Posted Mar 7, 2026 1:32 UTC (Sat) by josh (subscriber, #17465) [Link] (1 responses)

That analysis is a *guess* trying to approximate the underlying reality, not any kind of proof. I can easily think of ways to make those particular metrics small while writing a work very obviously derived from the original.

There is no metric you can use to decide programmatically or measure whether something is a derivative work or not. And, in fact, you can have two bit-for-bit identical files, where one is a derivative work of a third, and the other is not. https://ansuz.sooke.bc.ca/entry/23 is useful reading here.

Copyright exemption

Posted Mar 7, 2026 11:21 UTC (Sat) by bluca (subscriber, #118303) [Link]

That analysis is a guess based on tangible data. Your guess that it is definitely a derivative work is based on vibes and hot air.

Copyright exemption

Posted Mar 6, 2026 14:37 UTC (Fri) by farnz (subscriber, #17727) [Link] (2 responses)

EU law says that copyright does not prevent use of a work for data mining, and AI training has been ruled to be "use of a work for data mining". As a consequence of this, the model itself has been ruled to not be a derived work of the input data.

However, it's silent on whether the model's output can be a derived work of the training data; at the moment, it looks like the eventual state is going to be that a model's output can be a derived work of the training data, just as a human's output can be a derived work of their training inputs, and with the same tests required to show that the model's output is a derived work of the training data.

Copyright exemption

Posted Mar 6, 2026 20:22 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

It's probably going to end up being case-by-case.

I mean, I can reproduce from memory lyrics to many copyrighted songs. The same probably is going to be applicable to AI.

Copyright exemption

Posted Mar 6, 2026 20:53 UTC (Fri) by rgmoore (✭ supporter ✭, #75) [Link]

I think this is the right general idea. Whether or not the output of a LLM infringes copyright depends on that output. The big difference as it pertains to copyright lawsuits would be any part that requires or would be simplified by access to the infringed work. Because the training corpus of an LLM is so huge, it's going to be much more difficult for it to use "I never saw that before" as a defense. I wouldn't be surprised if LLM companies sued for copyright infringement would just stipulate that their training corpus includes the material so they can avoid discovery.

Can of worms..

Posted Mar 6, 2026 0:56 UTC (Fri) by dvdeug (subscriber, #10998) [Link]

> Perhaps an interesting/relevant thought experiment is: I ask a LLM to generate a book as similar as possible to (say) Lord of the Rings.

That's a very different game, though. A fictional work has stronger protections than a useful tool, especially with Google v. Oracle. I can use the API of chardet, and I can pretty freely use whatever data structures I want, provided I can justify them outside of copying chardet.

As a notable example, Torbjörn Granlund, the principal author of GNU GMP, asserted that NVIDIA's CGBN library infringed on GMP. NVIDIA argued that the similarity was merely from both being highly optimized libraries using similar algorithms. The FSF decided to take no action. Maybe if the FSF had sued, they could have found evidence through discovery, but they decided that, with what they knew, it wasn't worth pursuing.

Can of worms..

Posted Mar 6, 2026 14:04 UTC (Fri) by Wol (subscriber, #4433) [Link]

> Is there not precedent saying that the output of an LLM cannot be copyrighted? So how does the maintainer get around that... make a few changes here and there to provide some "human" creative input?

Imho (based on common sense and EU law) that should be interpreted as "The LLM cannot provide creative input, therefore the LLM has no copyright interest in the output." It says (like EU law) absolutely nothing about OTHER PEOPLE'S copyright in the input, and whether that passes through to the output.

A lot of people see me as creative, and in many ways I am. But probably most, if not all, of my creative output is derivative. I'm very good at building on other people's work, but crap at *original* creativity.

So my reading on all this stuff is, if what goes in is copyrighted, then what comes out MAY carry THE SAME copyright. What it WON'T carry is "(c) The LLM".

Cheers,
Wol

Can of worms..

Posted Mar 9, 2026 4:19 UTC (Mon) by drago01 (subscriber, #50715) [Link] (4 responses)

Why would an LLM's output not be copyrightable? Copyright law does not limit the tools used to create the work.

Can of worms..

Posted Mar 9, 2026 10:41 UTC (Mon) by justincormack (subscriber, #70439) [Link] (3 responses)

Because copyright protects human creativity, not "stuff generated by machine". A US court case ruled this, but it is likely to be appealed and other jurisdictions may well differ.

Can of worms..

Posted Mar 9, 2026 11:02 UTC (Mon) by bluca (subscriber, #118303) [Link] (1 responses)

No, that's not what the court case decided. The court case was about a very specific and narrow situation where the claimant explicitly relinquished all claims of authorship, and then tried to get the office to grant copyright to the machine. In that situation, given there is no person involved, it upheld the decision to refuse the request.

The maintainer of this package did not disclaim anything, so the same conditions do not apply. Nor do they apply to any other normal situation where claude/copilot/etc are involved, given they act on input from a person.

Code generators are not new, they have existed for decades, and nobody ever tried to claim they are not protected by copyright because they are "stuff generated by machine".

Can of worms..

Posted Mar 9, 2026 13:06 UTC (Mon) by bluca (subscriber, #118303) [Link]

This makes for good reading on that court case: https://sfconservancy.org/blog/2026/mar/04/scotus-deny-ce...

Can of worms..

Posted Mar 9, 2026 14:01 UTC (Mon) by farnz (subscriber, #17727) [Link]

Assuming you mean Thaler v. Perlmutter, the court's ruling is that a human has to be involved for there to be something that gets copyright protection, and thus that, absent a human claiming authorship, the work is not eligible for copyright protection.

There's a decent chunk of case law, relating to work where it's claimed that the author was a deity, or a monkey, or other non-human entity cited in the decision, and a discussion of why, for procedural reasons relating to his original attempts to file for copyright as a "work for hire", the court did not take a view on whether he provided enough input to qualify as the author himself.

Reading the judgement and reasoning, though, it reads like the judge would have been open to the idea that an AI is a tool like a camera, and thus that the output of the AI is protected by copyright in as far as the output of a camera is protected by copyright.

Can of worms..

Posted Mar 15, 2026 1:38 UTC (Sun) by wtanksleyjr (subscriber, #74601) [Link] (1 responses)

Does the precedent say it CANNOT be copyrighted, or that the person directing the output has the copyright? I would think the latter is the case ... I don't think "uncopyrightable" would be possible to enforce, as it would have to taint all derived works to be even slightly legally relevant, which in turn makes it strangely toxic.

Can of worms..

Posted Mar 15, 2026 11:06 UTC (Sun) by bluca (subscriber, #118303) [Link]

Yep, pretty much: https://sfconservancy.org/blog/2026/mar/04/scotus-deny-ce...

The recent court case that keeps getting cited as proof that "all LLM output is public domain" is simply a misunderstanding based on clickbait and wishful thinking.

Copyleft is a means, not an end

Posted Mar 5, 2026 21:13 UTC (Thu) by geofft (subscriber, #59789) [Link] (54 responses)

> But it is worth saying that, if instructing an LLM to rewrite an existing body of code is sufficient to strip copyleft requirements from that code, then the future of copyleft looks even dimmer than it did before. But, then, the future of any sort of software licensing scheme could be threatened. The death of copyleft could, ironically, be part of its real goal: the end of copyright.

This is a very important point! I think we as a community have gotten so wrapped up in (justifiably) protecting FOSS from people who want to subvert it to proprietary ends that we've become reliant on copyright as a tool, and forgotten that the goal of copyleft was to work around copyright, not to embrace it.

Suppose it is true (as it seems to be) that within a couple of years, you can point an LLM at any piece of software and say, give me a clean-room permissive-licensed version of this. Wouldn't that mean we've won?

The FSF's four freedoms are:

  • The freedom to run the program as you wish, for any purpose (freedom 0).
  • The freedom to study how the program works, and change it so it does your computing as you wish (freedom 1). Access to the source code is a precondition for this.
  • The freedom to redistribute copies so you can help others (freedom 2).
  • The freedom to distribute copies of your modified versions to others (freedom 3). By doing this you can give the whole community a chance to benefit from your changes. Access to the source code is a precondition for this.
If I can take an existing piece of software, perhaps even a binary, and point an LLM at it and get a permissively-licensed reimplementation, I have the freedom to run that reimplementation as I wish, for any purpose. I can study how it works and change it, because I now have source code. I can redistribute copies, including modifications, without being burdened by the original program's license. So we're now in a world where we've guaranteed the four freedoms for every single piece of code in the world. Shouldn't we be celebrating?

I would argue that we've been significantly helped by the fact that every major LLM provider has steamrolled over copyright in their training process and we now have a significantly expanded view of fair use, i.e., that the output of LLMs is not considered a derivative work of every work in the training set, and that they owe at most financial compensation to authors for the initial copies but not for ongoing inference on the trained model. If megacorporations had all lobbied to, say, limit copyright to two weeks after the death of the author instead of 70 years, we probably would cheer that on, even though it would weaken the GPL compliance case for some software and even though it would transparently be in the service of the megacorporations being more profitable. I think we should see the current situation similarly.

It is weird to admit that the GPL has run its course. But that was already true for the megacorporations anyway. Richard Stallman convinced Steve Jobs that he had to release the source for his patches to GCC, which worked for a while, but eventually Apple invested heavily in the development of LLVM and was free from any requirements of GCC's license. Now the ability to do that is in the hands of everyone, which certainly feels worse, but I would argue it's strictly better than it only being in the hands of Apple-sized companies. And as it happens, Apple still posts their patches to LLVM publicly, not because they're obligated to, but because they get better results from opening up their code and sending it upstream. (There's also an interesting argument about that link being about CLISP and Readline, and how Readline is certainly no longer leverage with the existence of libedit and several other options.)

I also wonder how proprietary software companies are feeling about this. The example in the comment thread of Windows and ReactOS is a good one. Or how do companies making proprietary, out-of-tree Linux drivers feel? Is there a world where they are soon convinced that there is no competitive advantage in shipping a closed-source Linux driver, because their competitors can learn everything that's in that driver anyway, and so they decide to open-source things because this brave new LLM world means that source code isn't as much of an asset? That also seems like it would be a significant victory.

Copyleft is a means, not an end

Posted Mar 5, 2026 22:47 UTC (Thu) by bluca (subscriber, #118303) [Link] (52 responses)

> So we're now in a world where we've guaranteed the four freedoms for every single piece of code in the world. Shouldn't we be celebrating?

We should, except we are now finding out that, as is extremely often the case, it was really never about free software in itself, for many people. It is, and always was, a matter of identity. A very large number of people have built their whole personal identity around the idea of free software as an end, not as a means, and so they unsurprisingly lash out in violent, angry and cringeworthy rants against a maintainer who has spent ~15 years working alone to keep a free software project alive, as we can see in some of the comments in the GH issue linked in the article. Not because they really care about some random python module that they never knew existed until last week, and that they'll have forgotten about by next week, but because they don't know how to answer the question: "what happens to me after the thing I've built my whole identity upon collapses into dust?" - and that must be truly terrifying.

Copyleft is a means, not an end

Posted Mar 6, 2026 0:40 UTC (Fri) by josh (subscriber, #17465) [Link] (51 responses)

If the maintainer was seeking help, they could post about seeking help.

The maintainer violated the license on the work of prior contributors, falsely claimed a "clean room" reimplementation, and uploaded the resulting slop as a "new version" of an existing package.

As with attempts to bait-and-switch Open Source projects with proprietary alternatives, this needs a fork of the last LGPLed version, and a backport of the useful optimizations in the new version.

> Not because they really care about some random python module that they never knew existed until last week, and that they'll have forgotten about by next week

I've known about chardet since long before LLMs existed. If its maintainer was seeking help, I would happily have amplified that call for help and sent some sponsorship its way. But its maintainer wasn't seeking help; its maintainer was looking to replace copyleft code with permissive code.

AI reimplementations aren't some magical realization of the FOSS dream. They're laundering and producing fallout that's drowning out and alienating human collaboration.

Copyleft is a means, not an end

Posted Mar 6, 2026 0:43 UTC (Fri) by josh (subscriber, #17465) [Link] (44 responses)

And to be more specific: if copyright were abolished tomorrow, I'd celebrate. The current situation with LLMs is "rules for thee, not for me"; LLMs slurp in and launder people's work ignoring proprietary and FOSS licenses alike, but *people* still get sued for copyright violations.

Copyleft is a means, not an end

Posted Mar 6, 2026 13:33 UTC (Fri) by dskoll (subscriber, #1630) [Link] (43 responses)

if copyright were abolished tomorrow, I'd celebrate

I wouldn't. I happen to think that composers, authors, photographers, film-makers and other artists should be compensated for their work, and copyright is pretty much the only means they have to guarantee such compensation.

I think a world without any copyright protections would be much worse and it would make artists and creators lives even more precarious than they are now.

Copyleft is a means, not an end

Posted Mar 6, 2026 13:48 UTC (Fri) by bluca (subscriber, #118303) [Link] (35 responses)

> I think a world without any copyright protections would be much worse and it would make artists and creators lives even more precarious than they are now.

They live precarious lives because of capitalism and large corporations syphoning off all the wealth. Copyright helps those corporations to do so. Aside from a tiny minority of ultra-rich super-stars, everyone else would be better off.

Copyleft is a means, not an end

Posted Mar 6, 2026 15:18 UTC (Fri) by dskoll (subscriber, #1630) [Link] (34 responses)

No, I disagree. Even small artists benefit from copyright. I get royalties still from comedy tracks I recorded a few years ago, and without copyright, I wouldn't be earning that money.

Copyleft is a means, not an end

Posted Mar 6, 2026 15:27 UTC (Fri) by bluca (subscriber, #118303) [Link] (33 responses)

You get the crumbs, while a handful of corporations get literal truckloads of money. The system is broken, and copyright is part of it.

Copyleft is a means, not an end

Posted Mar 6, 2026 15:29 UTC (Fri) by dskoll (subscriber, #1630) [Link] (32 responses)

You are missing the point. First of all, I'm pretty happy with the royalties I receive relative to the amount of work I did to earn them. Secondly, without copyright protection, I would receive nothing instead of something.

So sure, let's stick it to those greedy corporations! And if independent artists lose their sources of income, well hey... collateral damage. ¯\_(ツ)_/¯

Copyleft is a means, not an end

Posted Mar 6, 2026 15:38 UTC (Fri) by bluca (subscriber, #118303) [Link] (31 responses)

You are the one missing the point. A different system to support artists is needed, that benefits the majority, instead of the 0.x%. This is not possible as long as the current system is in place, and vested interests have unlimited amounts of capital to direct legislation to their leisure - see the literal fucking mickey mouse act as an obvious example

Copyleft is a means, not an end

Posted Mar 6, 2026 15:57 UTC (Fri) by dskoll (subscriber, #1630) [Link] (29 responses)

Copyright law needs reform, not elimination. At least, not before you have a proposal for something better and hard evidence that it will actually work.

But just imagining that getting rid of copyright will improve things, without having a viable alternative in place, is naive at best and disingenuous at worst.

Copyleft is a means, not an end

Posted Mar 6, 2026 16:18 UTC (Fri) by bluca (subscriber, #118303) [Link] (28 responses)

The death of copyright is the first of many steps needed, not the only one, of course. But a very welcome and needed one for sure.

As for alternatives, there is plenty of evidence already that UBI works, and works well, for example.

Copyleft is a means, not an end

Posted Mar 6, 2026 19:35 UTC (Fri) by dskoll (subscriber, #1630) [Link] (27 responses)

Sorry, but BS.

Copyright is necessary to protect and incentivize creators. In an earlier comment, you wrote: You get the crumbs, while a handful of corporations get literal truckloads of money.

Here's the reality: For streams of my comedy tracks, I get 90% and the record label gets 10%. For broadcasts on Sirius XM, I get 50% and the record label gets 50%. Streaming revenue is next to nothing because streaming companies are terrible, but Sirius XM royalties are decent.

If you think I'm getting "crumbs" while the record company is getting "truckloads", then it just goes to show you know nothing. And I'm fine with that split; the record label took the risk of producing the show, taping and producing the album, and marketing it.

UBI is great. But someone has to pay for it, and it provides no incentive for artists... if you get UBI whether your music is heard by 10 people or 100,000 people, where's the incentive?

The anti-copyright zealots are full of horsefeathers and live in some idealistic world that does not resemble reality in the slightest. Copyright reform could fix most of the problems and not cause additional harms.

Copyleft is a means, not an end

Posted Mar 6, 2026 19:57 UTC (Fri) by bluca (subscriber, #118303) [Link] (12 responses)

Have you considered that it is not all about you and yourself only? Have a look at the total amount of profits of all the media corporations, and then look at what share of that mountain goes to the 99% of artists. If you can't be bothered, just do an easy one, and look at Spotify and what it pays to the majority of earners.

> UBI is great. But someone has to pay for it, and it provides no incentive for artists... if you get UBI whether your music is heard by 10 people or 100,000 people, where's the incentive?

You know, there are people out there who do those kinds of activities because they actually _enjoy_ it. If you refuse to do "art" unless it gets you filthy rich, then actually I am pretty sure the world is better off without your "art". Everyone deserves to make a good living out of what they love doing, whether it's arts, crafts or whatever else. Nobody deserves to be filthy rich.

Copyleft is a means, not an end

Posted Mar 6, 2026 20:11 UTC (Fri) by dskoll (subscriber, #1630) [Link] (11 responses)

Are you an artist who has income that would be lost if copyright were abolished? If not, then I don't think your opinion should count for much.

It's not just about me. I know many artists, comedians and musicians who make a reasonable living off of their work. They are not superstars by any means, and their income isn't enormous, but it's significant and not something they'd enjoy losing.

Again: Copyright reform to get rid of the leeches is a great idea.

there are people out there who do those kind of activities because they actually _enjoy_ it

Ah yes. Straight out of the mouths of exploitative producers who pay musicians and comedians in "exposure". I've certainly dealt with my share of them.

And thanks for saying that I said I need to be "filthy rich" from my creative endeavours; nothing like putting words into someone else's mouth to bolster your argument.

Copyleft is a means, not an end

Posted Mar 6, 2026 20:36 UTC (Fri) by pizza (subscriber, #46) [Link]

> Ah yes. Straight out of the mouths of exploitative producers who pay musicians and comedians in "exposure". I've certainly dealt with my share of them.

...Not only does "exposure" not pay your bills, enough of it will quite literally kill you.

(and s/musicians/all creative endeavors, including most foss authors)

Copyleft is a means, not an end

Posted Mar 6, 2026 20:37 UTC (Fri) by bluca (subscriber, #118303) [Link] (9 responses)

> Are you an artist who has income that would be lost if copyright were abolished?

What part of "Everyone deserves to make a good living out of what they love doing, whether it's arts, crafts or whatever else." is not registering exactly?

Regardless of whether it earns _you_ a satisfying amount of money, copyright has been perverted into an exploitative system that benefits the few at the expense of the many. Like many other aspects of capitalism, there's no "reforming" it, it needs to die.

Copyleft is a means, not an end

Posted Mar 6, 2026 20:43 UTC (Fri) by dskoll (subscriber, #1630) [Link] (8 responses)

> Are you an artist who has income that would be lost if copyright were abolished?

What part of "Everyone deserves to make a good living out of what they love doing, whether it's arts, crafts or whatever else." is not registering exactly?

A simple "No" would have sufficed. Plenty of people make good livings doing things that don't require copyright protection. It's a little rich of them to say that people who do rely on copyright protection for their incomes should just abandon the protection and do without their incomes.

copyright has been perverted into an exploitative system

If it has been perverted, it can be de-perverted. But please don't promote a course that will harm millions of creators just to satisfy your personal vendetta against aspects of capitalism.

Copyleft is a means, not an end

Posted Mar 6, 2026 23:09 UTC (Fri) by bluca (subscriber, #118303) [Link] (6 responses)

> should just abandon the protection and do without their incomes.

I'm really not sure what is so hard to understand about:

"Everyone deserves to make a good living out of what they love doing, whether it's arts, crafts or whatever else."

It's really, really not that hard. Try reading it again a couple of times.

Copyleft is a means, not an end

Posted Mar 6, 2026 23:39 UTC (Fri) by dskoll (subscriber, #1630) [Link] (5 responses)

You're being disingenuous. It is fantastically easy to trumpet:

Everyone deserves to make a good living out of what they love doing, whether it's arts, crafts or whatever else.

while not offering any feasible, proven way to do that. Heck, if it were up to me, everyone would have kittens and rainbows and all the chocolate they could eat while never gaining an ounce.

Go on. Come up with a feasible alternative to copyright that protects creators' livelihoods and that actually has a hope in hell of happening, and then maybe I'll take you seriously. Until then, you're just wasting electrons with meaningless slogans. (And don't say UBI. Much as I support UBI, it's not going to do the trick.)

Copyleft is a means, not an end

Posted Mar 6, 2026 23:49 UTC (Fri) by bluca (subscriber, #118303) [Link] (4 responses)

Of course it's UBI. Everywhere it's been tried, it worked. People were happier, healthier, more productive, you name it.

"give me an alternative"
"ok, here's the alternative"
"NO NOT LIKE THAT"

Copyleft is a means, not an end

Posted Mar 6, 2026 23:59 UTC (Fri) by dskoll (subscriber, #1630) [Link] (3 responses)

I agree with you about UBI!

If you're claiming UBI will provide sufficient income for people to live on whatever they enjoy doing, you're dreaming in technicolor.

I am not aware of a single UBI program that's generous enough, by itself, to even make people's income meet the poverty level. If you know of one, post a link.

If you know of a way to give UBI to every single person in a country without causing enormous deficits, I'm all ears. As an example, I live in Canada and there are over 30 million of us over the age of 18. If we gave every one of those people $1750/month, which is much, much lower than the lowest minimum wage in Canada and certainly not enough to live on, we'd have to find $630B somewhere... about a quarter of our entire GDP. How do we do that? Magic?
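The arithmetic behind that figure can be checked with a quick sketch. The population and payment values are the ones quoted above; the GDP number is only what the "about a quarter" estimate implies, not an independently sourced statistic.

```python
# Back-of-the-envelope check of the figures quoted in the comment above.
adults = 30_000_000        # "over 30 million of us over the age of 18"
monthly_payment = 1_750    # dollars per adult per month

annual_cost = adults * monthly_payment * 12
print(f"Annual cost: ${annual_cost / 1e9:.0f}B")  # prints "Annual cost: $630B"

# Assumption: "about a quarter of our entire GDP" is taken at face value,
# so the GDP below is merely implied by the comment, not looked up.
implied_gdp = annual_cost * 4
print(f"GDP implied by 'about a quarter': ${implied_gdp / 1e12:.2f}T")
```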

Copyleft is a means, not an end

Posted Mar 7, 2026 0:08 UTC (Sat) by bluca (subscriber, #118303) [Link] (2 responses)

> If you're claiming UBI will provide sufficient income for people to live on whatever they enjoy doing, you're dreaming in technicolor.

The most recent trial:

https://www.theguardian.com/world/2026/feb/10/ireland-bas...

> If you know of a way to give UBI to every single person in a country without causing enormous deficits

Yes. Tax the fucking rich and their corporations.

https://www.oxfamamerica.org/explore/issues/economic-just...

"12 people own more wealth than half the world"

That's where the money is - hoarded by literal dragons

Copyleft is a means, not an end

Posted Mar 7, 2026 0:26 UTC (Sat) by dskoll (subscriber, #1630) [Link] (1 responses)

I'm sympathetic to taxing the rich. But even if we did that, it would not be enough for UBI for everyone on Earth. As I mentioned, this would cost one-quarter of the entire GDP of Canada were it implemented here, and would still not be enough to live on. We don't have enough rich people or corporations to tax to afford this. [We should tax them anyway, but...]

The Irish study gave a small number (2000) of artists €325/week. The poverty level in Ireland for a single adult is €346/week. Great. 0.04% of Irish people were given income below the poverty level. Woo.

Now sure, the Irish artists in the study probably had other income... income that was protected because of copyright law.

UBI simply cannot be scaled up to be truly universal, even if we taxed all the rich and all the corporations on Earth. The numbers just don't work.

Copyleft is a means, not an end

Posted Mar 7, 2026 0:57 UTC (Sat) by bluca (subscriber, #118303) [Link]

Of course they do work. You missed the most important part of the study: overall tax intake _increases_ with the UBI pilot. Healthier, happier, safer people are more productive people, and increase the overall economic activity in the community.
Billionaires and corporations hoarding unimaginable amounts of wealth in tax havens don't.

Copyleft is a means, not an end

Posted Mar 8, 2026 21:52 UTC (Sun) by da4089 (subscriber, #1195) [Link]

The problem with revolutions, assuming they even achieve a worthy goal, is that they tend to destroy a lot in the transition.

It might be that a world where creatives are justly rewarded for their work while unjust exploitation is avoided is possible, but eliminating copyright as a first step to that means a lot of collateral damage.

And without broad agreement on what the end goal should be, that damage will continue for some time until a new stable state is established.

So sure, right now there’s a heap of unjust exploitation enabled by copyright laws. But there’s also a lot of people who justly make a living off it too. Getting there from here needs more than just smashing the system.

Copyleft is a means, not an end

Posted Mar 6, 2026 20:16 UTC (Fri) by marcH (subscriber, #57642) [Link]

Long before UBI, it would be much easier to fix streaming revenues so a bot that listens 24/7 using a single subscription cannot skew revenues that much. That would be a start.

It looks like ACPS has already fixed this: https://support.deezer.com/hc/en-gb/articles/360002471277...

Now it's just the small matter of generalizing it. It's basic maths, so there's probably as little hope to get it fixed as getting rid of "winner takes all" voting systems. Basic maths almost never makes a good TikTok story.
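The "basic maths" can be shown with a toy sketch comparing a pooled, play-count-based split with a per-subscriber split. This is only an illustration under made-up numbers: the function names, the subscriber data and the flat fee are all hypothetical, and it is not a description of any streaming service's actual formula.

```python
from collections import defaultdict

def pro_rata(plays, fee):
    """Pool every subscriber's fee and split the pool by share of total plays."""
    pool = fee * len(plays)
    total_plays = sum(sum(per_user.values()) for per_user in plays.values())
    payout = defaultdict(float)
    for per_user in plays.values():
        for artist, count in per_user.items():
            payout[artist] += pool * count / total_plays
    return dict(payout)

def user_centric(plays, fee):
    """Split each subscriber's own fee among the artists that subscriber played."""
    payout = defaultdict(float)
    for per_user in plays.values():
        user_total = sum(per_user.values())
        if user_total == 0:
            continue  # a subscriber who played nothing contributes nothing here
        for artist, count in per_user.items():
            payout[artist] += fee * count / user_total
    return dict(payout)

# Hypothetical data: two human listeners and one bot streaming around the clock.
plays = {
    "alice": {"indie_band": 50},
    "bob": {"indie_band": 50},
    "bot": {"spam_artist": 9900},
}

pooled = pro_rata(plays, fee=10.0)
per_user = user_centric(plays, fee=10.0)
print("pooled:", pooled)
print("per-subscriber:", per_user)
```

With the pooled split, the single bot subscription captures almost the whole pool; with the per-subscriber split it can never direct more than its own fee.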

Copyleft is a means, not an end

Posted Mar 6, 2026 21:05 UTC (Fri) by joib (subscriber, #8541) [Link] (12 responses)

This is starting to sound a lot like these "open source will destroy the livelihood of programmers" arguments that were thrown about a couple decades ago. :)

Copyleft is a means, not an end

Posted Mar 6, 2026 21:23 UTC (Fri) by dskoll (subscriber, #1630) [Link] (11 responses)

Not really. Participating in open-source is optional. If copyright is abolished, artists and creators will be forced into a new regime without any choice.

Copyleft is a means, not an end

Posted Mar 6, 2026 21:53 UTC (Fri) by josh (subscriber, #17465) [Link] (10 responses)

Copyright is a privilege granted to authors because we think it'll give us more works produced on net. It's not some inherent good. Bits are infinitely copyable by default, copyright is a non-default propped up by explicit laws punishing copying.

There is no inherent right to a business model that depends on preventing copying. If that puts people out of business, so be it; copyright does more harm than good, now that far more people have the means to benefit from the ability to copy.

All that said, the worst of all worlds would be the one in which *large companies* can misappropriate licensed FOSS works and proprietary work alike as AI training data, while smaller entities are still punished for copying. I'd like to see copyright abolished *across the board*, not just waived for AI training.

Copyleft is a means, not an end

Posted Mar 6, 2026 22:42 UTC (Fri) by dskoll (subscriber, #1630) [Link] (9 responses)

> If that puts people out of business, so be it;

Very easy to say when you're not the one being put out of business.

> now that far more people have the means to benefit from the ability to copy.

This is why copyright is needed. Back before modern technology, if you were a painter or a musician or an actor, you made your money directly by performing live or from patrons, and it was infeasible to simply copy your work to deprive you of income. Now that copying is so easy, why would anyone spend months or years working on a creative work, only to be completely unable to make money from it?

> I'd like to see copyright abolished *across the board*

OK. How are you proposing to compensate musicians, performers, composers, authors, photographers, painters, and film-makers? Or are they part of your callous "so be it" attitude?

Copyleft is a means, not an end

Posted Mar 6, 2026 23:11 UTC (Fri) by mb (subscriber, #50428) [Link] (6 responses)

> Now that copying is so easy, why would anyone spend months or years working on a creative work, only to be completely unable to make money from it?

This is completely different from what has been done with chardet, though.

Copyleft is a means, not an end

Posted Mar 6, 2026 23:40 UTC (Fri) by dskoll (subscriber, #1630) [Link] (5 responses)

I was responding to bluca's comments, not the original article.

Copyleft is a means, not an end

Posted Mar 6, 2026 23:43 UTC (Fri) by mb (subscriber, #50428) [Link] (3 responses)

You don't say!
More explicit: Please stop responding.

Copyleft is a means, not an end

Posted Mar 6, 2026 23:47 UTC (Fri) by dskoll (subscriber, #1630) [Link] (2 responses)

That was uncalled for. If an LWN editor asks me to stop, then I will. But random people should not make such a request.

Copyleft is a means, not an end

Posted Mar 6, 2026 23:52 UTC (Fri) by mb (subscriber, #50428) [Link]

Ok. Welcome to my block list.

Copyleft is a means, not an end

Posted Mar 7, 2026 0:35 UTC (Sat) by corbet (editor, #1) [Link]

I have been hesitant to intervene here; copyright is obviously highly relevant to both our community and this article. I do think that the time is coming, though, for the various folks involved in this discussion to conclude that any possible changing of minds will have occurred by now, and that perhaps it's time to declare victory and enjoy the weekend.

Copyleft is a means, not an end

Posted Mar 6, 2026 23:44 UTC (Fri) by dskoll (subscriber, #1630) [Link]

Ugh, sorry. I was responding to your "so be it" comment.

Copyleft is a means, not an end

Posted Mar 7, 2026 0:42 UTC (Sat) by josh (subscriber, #17465) [Link] (1 responses)

> How are you proposing to compensate musicians, performers, composers, authors, photographers, painters, and film-makers?

How are you proposing to allow people to creatively remix and build upon literally everything that gets released the moment it's released? Or are you writing that off, as something you don't envision because you prioritize copyright higher?

As for your point, I pay literally well over a hundred per month to authors whose work I enjoy reading, even though their work will be released for free to everyone a few weeks later, and I'd pay that even if someone was posting all the content elsewhere. I subscribe to LWN, even though their content becomes free a week later. I work professionally as a software developer, and would continue to do so if copyright didn't exist. I would propose a combination of services, patronage, crowdfunding, UBI, and any number of other things that don't depend on copyright.

I'm not being callous about it. I acknowledge that the world will be vastly different without copyright. Some things will stop being as profitable; there will almost certainly be fewer billion-dollar movies, and far more indies. Some things will be more so. And on net the world will be better.

Copyleft is a means, not an end

Posted Mar 7, 2026 0:46 UTC (Sat) by josh (subscriber, #17465) [Link]

(Oh, and a few others I forgot: concerts and shows and other live experiences; commissions; interactive streams with paid interactivity.)

Copyleft is a means, not an end

Posted Mar 6, 2026 15:58 UTC (Fri) by dskoll (subscriber, #1630) [Link]

Sorry, forgot to add this in my other reply.

Do you work with or know many artists? Because I do, and they're certainly not the 0.01% of superstars. And I can tell you that not a single one of them wants to get rid of copyright and all of them would be fierce in their opposition to your position.

Copyleft is a means, not an end

Posted Mar 6, 2026 14:49 UTC (Fri) by Wol (subscriber, #4433) [Link] (6 responses)

> I wouldn't. I happen to think that composers, authors, photographers, film-makers and other artists should be compensated for their work, and copyright is pretty much the only means they have to guarantee such compensation.

The other thing, of course, is all those works that are being lost because nobody knows the copyright status, and nobody dares copy it.

Imho copyright should be maybe 25 years, and the ORIGINAL AUTHOR and their descendants *alive at creation* (real people) have the right to renew indefinitely.

For things like Mickey Mouse, or the Marvel Universe, or whatever, we should use trademarks not copyright.

Most works have no real value after ten years or so, LET them fall into the Public domain.

Cheers,
Wol

Copyleft is a means, not an end

Posted Mar 6, 2026 15:20 UTC (Fri) by dskoll (subscriber, #1630) [Link] (5 responses)

Yes, I do agree that copyright term should be limited to 25 years after the creation of the work and should not be renewable. IMO, that would balance the interests of creators and of society.

Copyleft is a means, not an end

Posted Mar 6, 2026 17:47 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

I'm fine with exponential (Nx, N>1 last renewal price for Y%, Y<100 more time) or proportional (N% of global turnover, also increasing over time) fees for renewal. If companies *really* want it, make them *really* pay for it.
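One way to read the exponential variant is sketched below: each renewal costs N times the previous one and buys only Y% of the previous term. The starting term, starting fee, N and Y are hypothetical values chosen for illustration, not part of the proposal.

```python
def renewal_schedule(first_term, first_fee, n, y, renewals):
    """Yield (renewal number, fee, years granted) for each successive renewal.

    n > 1 multiplies the fee each time; 0 < y < 1 shrinks the term each time.
    """
    fee, term = first_fee, first_term
    for number in range(1, renewals + 1):
        term *= y          # each renewal grants a fraction of the previous term
        yield number, fee, term
        fee *= n           # and the next one costs n times as much

# Hypothetical parameters: 25-year initial term, 1000 for the first renewal,
# fee doubling (n=2) and term halving (y=0.5) on every renewal.
schedule = list(renewal_schedule(25, 1000, 2, 0.5, 3))
for number, fee, years in schedule:
    print(f"renewal {number}: fee {fee}, +{years} years")
```

Because the granted terms form a shrinking geometric series while the fees grow geometrically, the total term is bounded (here below 50 years) no matter how much is paid.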

Copyleft is a means, not an end

Posted Mar 6, 2026 19:17 UTC (Fri) by rgmoore (✭ supporter ✭, #75) [Link]

It's not just balancing the needs of creators against the rest of society. It's also about balancing the needs of past and present creators. Copyright is a double-edged sword. On the one hand, giving a creator exclusive right to their work makes it easier for them to earn money from it. On the other hand, all creative work is derivative at some level, so giving past creators the power to interfere with present ones interferes with the creative process. Extending copyright benefits past creators at the expense of present ones. Of course the dichotomy isn't quite so precise; most present creators are also either real or anticipatory past creators, so cutting copyright terms too short would hurt them. There has to be some happy medium, and I'm pretty sure we've gone way overboard in making copyright terms too long.

Copyleft is a means, not an end

Posted Mar 7, 2026 0:09 UTC (Sat) by Wol (subscriber, #4433) [Link] (2 responses)

> and should not be renewable.

So you don't want to be able to retire on the benefits of your creativity?

There's a whole bunch of reasons behind the various extensions to copyright length - I don't know the reason for 50 years, or lifetime, or lifetime+50, but the 70 year extension is interesting - it's all those people who died young in the 1st or 2nd World War.

Imho, my proposal would allow copyright to leave a legacy to the artist's family - which could be important, but restricting it imho to real people who were around "at the start" gives a good compromise between "no legacy" and "protecting my family".

Cheers,
Wol

Copyleft is a means, not an end

Posted Mar 7, 2026 19:50 UTC (Sat) by nix (subscriber, #2304) [Link] (1 responses)

> So you don't want to be able to retire on the benefits of your creativity?

No. No other kind of productive work gets to earn money for work done decades ago: you earn money as you work, and (in the case of things later sold) for a reasonable period afterwards: not indefinitely, and not for decades. Even copyright accepts this principle via the first-sale doctrine: you don't get to claim royalties on secondhand sales.

You retire based on pensions, investments, or other things of that nature, just like everyone else, not based on endless revenue streams for things you did fifty years ago.

Copyleft is a means, not an end

Posted Mar 8, 2026 9:04 UTC (Sun) by Wol (subscriber, #4433) [Link]

> You retire based on pensions, investments, or other things of that nature, just like everyone else, not based on endless revenue streams for things you did fifty years ago.

And what is a pension or an investment, but an endless revenue stream for things you did 50 years ago? (Plus, of course, copyrights are investments :-)

Be careful what you wish for - do you really want to find the value of your work going through the floor when they abolish copyright, just as your need for revenue goes through the roof because they abolished copyright?

"There's always a solution that is simple, easy, and WRONG". I really don't see what's wrong with rewarding the creator of an artistic work, and I don't have any problem with that mechanism being copyright. It's all fiat money anyway.

The problem is the greed and rent-seeking culture that's grown up around it, with perpetual copyrights, and trying to tie everything up, and copyright franchises still spewing money long after the creator is dead - and the amount of work that's rotting because everybody's scared to preserve it because of copyright!

Cheers,
Wol

Copyleft is a means, not an end

Posted Mar 6, 2026 0:58 UTC (Fri) by bluca (subscriber, #118303) [Link] (3 responses)

> If the maintainer was seeking help, they could post about seeking help.

Or, how about they can do whatever the heck they want with their project? Or are you paying their salary now?

> The maintainer violated the license on the work of prior contributors, falsely claimed a "clean room" reimplementation, and uploaded the resulting slop as a "new version" of an existing package.

Yeah, well, you know, that's just, like, your opinion, man.

> They're laundering and producing fallout that's drowning out and alienating human collaboration.

If Stallman apologists and other such religious fanatics, as can be seen in that GH issue, end up being alienated, I, for one, won't be missing any of them. Time to find something else to tie their whole persona around. Copyleft is dead, copyright is dying, and not a moment too soon.

Copyleft is a means, not an end

Posted Mar 6, 2026 1:41 UTC (Fri) by Kluge (subscriber, #2881) [Link] (2 responses)

>Or, how about they can do whatever the heck they want with their project? Or are you paying their salary now?

Who says it's *their* project? It's only their project if they own the copyright, and that's what's at issue.

I won't bother responding to your other "points".

Copyleft is a means, not an end

Posted Mar 6, 2026 10:19 UTC (Fri) by bluca (subscriber, #118303) [Link]

The fact that they are the only maintainer working on it for the past ~15 years says it's their project.

Copyleft is a means, not an end

Posted Mar 6, 2026 10:35 UTC (Fri) by kleptog (subscriber, #1183) [Link]

> Who says it's *their* project? It's only their project if they own the copyright, and that's what's at issue

Depends on what you consider the "project". As just literal source, you may be right. But when it comes to the community: the ones actually doing the maintenance, managing releases, etc, then ISTM the original author ceded ownership a long time ago.

I've always been of the view that open-source projects are more like mini-cooperatives, with members doing their part. Ownership in such cases becomes a bit fuzzy. But someone who is not present at all is not a candidate.

The (L)GPL is a means, not the goal. You have projects like PostgreSQL which maintain a vibrant community with no copyleft licence support.

Copyleft is a means, not an end

Posted Mar 6, 2026 14:44 UTC (Fri) by Wol (subscriber, #4433) [Link]

> The maintainer violated the license on the work of prior contributors, falsely claimed a "clean room" reimplementation, and uploaded the resulting slop as a "new version" of an existing package.

Is there any of that work left, to violate the licence of? Serious question, if all that work is at least 15 years old ... ie out-of-copyright as per the US's original copyright laws :-)

Cheers,
Wol

High-quality "slop"?

Posted Mar 9, 2026 4:17 UTC (Mon) by gmatht (subscriber, #58961) [Link]

The "slop" you refer to reports 30+ times faster performance with better accuracy and support than the previous version. If we were to be pedantic it might be more accurate to refer to 6.0.0 as human slop ¯\_(ツ)_/¯. Unless AI has greatly improved however, I imagine that it was designed by a human with AI assistance and not just vibe coded.

Copyleft is a means, not an end

Posted Mar 6, 2026 1:43 UTC (Fri) by dvdeug (subscriber, #10998) [Link]

> the goal of copyleft was to work around copyright

Each person who has contributed has their own motivations. Many are motivated by teamwork and community without getting ripped off, the same sort of attitude that created a lot of non-commercial shared-source software like POV-Ray. A lot of people use the GPL as the most protective license generally accepted as open source and thus by Debian, Red Hat and friends.

> If I can take an existing piece of software, perhaps even a binary, and point an LLM at it and get a permissively-licensed reimplementation

Except that's not really true. There's a lot of things that will always be non-trivial to clone; basically, unless you already have an open-source driver, you don't have enough information to write a hardware driver without reverse engineering and playing around with actual hardware.

> I can study how it works and change it, because I now have source code.

To the extent that's true, you clearly have a copyright infringement. You're not studying the original code, you're studying code that an AI wrote.

>If megacorporations had all lobbied to, say, limit copyright to two weeks after the death of the author instead of 70 years, we probably would cheer that on

No. I doubt even RMS would support his work under ND licenses, like much of the GNU webpages, going into the public domain right after he dies. It would have hurt historical uses of copyright, like Grant's writing of his autobiography right before he died (producing a valuable historical artifact) to provide money for his family.

>I also wonder how proprietary software companies are feeling about this.

Microsoft has huge budgets to sue for copyright infringement. Even in a world where you can feed a program into an AI and get out a copy that is by law a non-infringing work, that's only going to come after the big companies have spent years suing everyone.

If LLM can strip LGPL code of copyleft, it can strip copyright out of any copyrighted work

Posted Mar 6, 2026 1:08 UTC (Fri) by atai (subscriber, #10977) [Link] (6 responses)

So LGPL is not the specialty here; if LLM rewrites Harry Potter, then a new version of public domain Harry Potter is produced?

If LLM can strip LGPL code of copyleft, it can strip copyright out of any copyrighted work

Posted Mar 6, 2026 1:18 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link] (5 responses)

But what if an English author writes a story about a boy with extraordinary abilities who grew up in an abusive family with adoptive parents. This boy was then forced by circumstances to rise to the challenge and save all the people around him from death, caused by the machinations of a wannabe tyrant.

Should that author be guilty of copyright infringement?

Harry Potter? What? I'm talking about "Starman Jones", of course.

If LLM can strip LGPL code of copyleft, it can strip copyright out of any copyrighted work

Posted Mar 6, 2026 1:41 UTC (Fri) by pizza (subscriber, #46) [Link] (3 responses)

> Harry Potter? What? I'm talking about "Starman Jones", of course.

Well done, sir.. well done.

If LLM can strip LGPL code of copyleft, it can strip copyright out of any copyrighted work

Posted Mar 6, 2026 14:54 UTC (Fri) by Wol (subscriber, #4433) [Link] (2 responses)

Or the guy who wrote a story about kids going to a school of magic?

No again I'm not talking about Hogwarts. He *really was* accused of copying Hogwarts in order to create Unseen University :-)

Cheers,
Wol

If LLM can strip LGPL code of copyleft, it can strip copyright out of any copyrighted work

Posted Mar 7, 2026 19:53 UTC (Sat) by nix (subscriber, #2304) [Link] (1 responses)

Ursula Le Guin got distinctly tired of people claiming that _A Wizard of Earthsea_ was an obvious ripoff of Harry Potter, since, uh, both have magical schools in and obviously this idea was so stunning that Rowling must have been the first person to think of it ever. (Clearly this ripping off was done with the aid of a time machine, given that Earthsea predated Harry Potter by three decades.)

If LLM can strip LGPL code of copyleft, it can strip copyright out of any copyrighted work

Posted Mar 8, 2026 8:54 UTC (Sun) by Wol (subscriber, #4433) [Link]

Not by as much, but Unseen University predates Hogwarts by about a decade ...

Cheers,
Wol

Starman Jones

Posted Apr 1, 2026 21:26 UTC (Wed) by sammythesnake (guest, #17693) [Link]

My dad read that to me and my brother as a bedtime story when we were kids. It wasn't until we got to the end of the book that we realised the last couple of pages were ripped out. I still don't know quite how it ended...

Joke

Posted Mar 6, 2026 1:11 UTC (Fri) by pabs (subscriber, #43278) [Link]

Seems like the Chardet folks didn't notice that MALUS corporation was a joke, not something to be emulated.

https://malus.sh/
https://fosdem.org/2026/schedule/event/SUVS7G-lets_end_op...

Rewrite proprietary software too!

Posted Mar 6, 2026 1:15 UTC (Fri) by pabs (subscriber, #43278) [Link] (2 responses)

There is an upside to this; we can now rewrite proprietary software from scratch with LLMs much faster than the manual clean room rewrites that have had to be done in the past!

Rewrite proprietary software too!

Posted Mar 6, 2026 7:56 UTC (Fri) by LtWorf (subscriber, #124958) [Link] (1 responses)

Except we don't have an LLM trained on their source.

Rewrite proprietary software too!

Posted Mar 7, 2026 2:45 UTC (Sat) by pabs (subscriber, #43278) [Link]

Apparently that is irrelevant. I read the HN threads related to this LWN post, and found some interesting things.

While LLMs can't yet do decompilation satisfactorily, they can clean up the output of NSA Ghidra satisfactorily, and convert between source code languages easily.

https://reorchestrate.com/posts/your-binary-is-no-longer-...
https://reorchestrate.com/posts/your-binary-is-no-longer-...

Another post claimed to be able to recreate a proprietary backend service when given FOSS frontend apps or even just a minified web frontend.

https://news.ycombinator.com/item?id=47259485

I can't find the post now but ISTR a post claiming LLMs were good at reversing WASM already.

win-win?

Posted Mar 6, 2026 1:20 UTC (Fri) by shironeko (subscriber, #159952) [Link] (1 responses)

I think this is a prime opportunity for a lawsuit: either the LGPL gets better protected, or we get a great tool so we can train an LLM on a bunch of stolen proprietary code and produce clean free code out of it. Seems like a no-brainer, no one looses.

win-win?

Posted Mar 6, 2026 1:59 UTC (Fri) by atai (subscriber, #10977) [Link]

> no one looses.

No. Everything gets loose.

Conservancy

Posted Mar 6, 2026 1:26 UTC (Fri) by pabs (subscriber, #43278) [Link]

Seems like Conservancy are thinking about AI related stuff, they recently posted this:

https://sfconservancy.org/blog/2026/mar/04/scotus-deny-ce...

Scary misunderstanding of how LLMs work

Posted Mar 6, 2026 1:34 UTC (Fri) by marcH (subscriber, #57642) [Link] (14 responses)

> "I then started in an empty repository with no access to the old source tree, and explicitly instructed Claude not to base anything on LGPL/GPL-licensed code."

How does that prove anything? Are there several variants of Claude, some trained on (L)GPL code and others not? If a Claude variant had GPL code in its training, then that GPL code will influence its output no matter what the prompt is. This is only an LLM, it's not "intelligent".

I'm not saying that the influence of that GPL code in the training is enough to affect the license of the output (like everyone else, I have absolutely no idea) I'm only saying that the "explicit instruction not to base anything on (L)GPL code" in the prompt seems utterly ignorant and ridiculous. As ridiculous as asking chatbots not to lie.

Scary misunderstanding of how LLMs work

Posted Mar 6, 2026 1:39 UTC (Fri) by bluca (subscriber, #118303) [Link] (13 responses)

> How does that prove anything?

It proves something extremely important, as it was explained by that same comment: the old code was not in the local context of the tool.

The local context is extremely different from the training set. Training is performed under explicit exceptions to copyright law, so the license of the dataset is irrelevant. The local context is a whole different story, obviously.

Scary misunderstanding of how LLMs work

Posted Mar 6, 2026 9:30 UTC (Fri) by excors (subscriber, #95769) [Link] (1 responses)

> It proves something extremely important, as it was explained by that same comment: the old code was not in the local context of the tool.

It doesn't prove that, because Claude Code is capable of downloading the old LGPL chardet code from GitHub and inserting that into its LLM context. From Simon Willison's post linked just after that quote, Claude Code wrote a plan that explicitly said it was going to do that for one source file. It's evidently quite happy to ignore the instruction to not base anything on LGPL code.

Scary misunderstanding of how LLMs work

Posted Mar 6, 2026 10:23 UTC (Fri) by bluca (subscriber, #118303) [Link]

No, it cannot access external resources like that, without explicit permission to do so. Unrestricted access is only to the local repository, which the maintainer explained was empty.

Scary misunderstanding of how LLMs work

Posted Mar 6, 2026 14:03 UTC (Fri) by marcH (subscriber, #57642) [Link] (2 responses)

> Training is performed under explicit exceptions to copyright law, so the license of the dataset is irrelevant.

So what are all those debates raging about then? Copyright is dead, let's celebrate and move on!

Explicit where? Afraid I (and others) missed that legal milestone.

Scary misunderstanding of how LLMs work

Posted Mar 6, 2026 14:57 UTC (Fri) by Wol (subscriber, #4433) [Link] (1 responses)

> Explicit where? Afraid I (and others) missed that legal milestone.

EU law explicitly treats training an LLM the same as training a schoolchild. It is not a copyright violation to READ a book.

Cheers,
Wol

Scary misunderstanding of how LLMs work

Posted Mar 6, 2026 19:57 UTC (Fri) by marcH (subscriber, #57642) [Link]

This is not about READING a book. This is about reading a book and writing another book "inspired" by it - for many different, complex and nuanced definitions of "inspired".

Scary misunderstanding of how LLMs work

Posted Mar 6, 2026 18:46 UTC (Fri) by valderman (subscriber, #56479) [Link] (7 responses)

It proves nothing, as the original chardet source is part of the training data and instructing an LLM to "not use any code with X license" is mere theatrics. It doesn't prevent code with X license from ending up in the output, it only makes it slightly less likely.

And no, the training exception does not make the LLM output exempt from violating the copyright of its training data. It only protects the model itself.

Scary misunderstanding of how LLMs work

Posted Mar 6, 2026 18:52 UTC (Fri) by bluca (subscriber, #118303) [Link] (6 responses)

And yet the output doesn't violate anything, as it's wildly different from the original

Scary misunderstanding of how LLMs work

Posted Mar 6, 2026 20:00 UTC (Fri) by marcH (subscriber, #57642) [Link] (3 responses)

That's a good point and TBH more relevant to the greater picture, but it's pretty far from the specific sentence "I explicitly instructed Claude not to base anything on LGPL/GPL-licensed code."

Scary misunderstanding of how LLMs work

Posted Mar 6, 2026 20:07 UTC (Fri) by bluca (subscriber, #118303) [Link] (2 responses)

That's not all the maintainer said? They posted some detailed text-based analysis of the similarities/differences of the two versions, on top of all the benchmarks showing massive differences in results too

Scary misunderstanding of how LLMs work

Posted Mar 6, 2026 20:19 UTC (Fri) by marcH (subscriber, #57642) [Link] (1 responses)

If you're ever proven wrong or confusing, never admit it. Like in: https://lwn.net/Articles/1061665/

Or, switch to a different topic instead.

Scary misunderstanding of how LLMs work

Posted Mar 6, 2026 20:34 UTC (Fri) by bluca (subscriber, #118303) [Link]

Are you trying to go for the chewbacca defence?

Scary misunderstanding of how LLMs work

Posted Mar 10, 2026 12:49 UTC (Tue) by diegor (subscriber, #1967) [Link] (1 responses)

Can't you also process code so that the new code is wildly different from the old, so that the result is considered something unrelated to the old?

Also, who is the author of the new version? The LLM? The maintainer? If it's the maintainer, can he really claim that it is a clean room implementation? If he's not the author, can he license the code, not being the copyright holder?

I'm not convinced that "wildly different" is a proof of a clean room implementation.
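The point about processing code can be illustrated with a toy sketch: two snippets that differ textually but are identical once identifiers are normalised. Both snippets and the `Normalize` helper are made up for this example, they have nothing to do with the actual chardet sources, and the comparison says nothing about legal status either way.

```python
import ast
import difflib

ORIGINAL = '''
def count_high_bytes(data):
    total = 0
    for byte in data:
        if byte >= 0x80:
            total += 1
    return total
'''

REWRITTEN = '''
def tally(buf):
    n = 0
    for b in buf:
        if b >= 0x80:
            n += 1
    return n
'''

class Normalize(ast.NodeTransformer):
    """Rename every identifier to a placeholder based on first appearance."""

    def __init__(self):
        self.names = {}

    def _alias(self, name):
        return self.names.setdefault(name, f"v{len(self.names)}")

    def visit_FunctionDef(self, node):
        node.name = self._alias(node.name)
        self.generic_visit(node)
        return node

    def visit_arg(self, node):
        node.arg = self._alias(node.arg)
        return node

    def visit_Name(self, node):
        node.id = self._alias(node.id)
        return node

def shape(source):
    """Return a dump of the syntax tree with identifiers normalised away."""
    return ast.dump(Normalize().visit(ast.parse(source)))

# Character-level similarity of the raw text (1.0 would mean identical text).
text_ratio = difflib.SequenceMatcher(None, ORIGINAL, REWRITTEN).ratio()
# Structural equality after renaming identifiers.
same_shape = shape(ORIGINAL) == shape(REWRITTEN)

print(f"text similarity: {text_ratio:.2f}, same structure: {same_shape}")
```

A text-level comparison alone therefore cannot distinguish an independent rewrite from a mechanical transformation of the original.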

Scary misunderstanding of how LLMs work

Posted Mar 10, 2026 12:54 UTC (Tue) by bluca (subscriber, #118303) [Link]

There is no legal requirement of "clean room implementation". It's _one_ strategy that is often adopted to be on the safe side, but it doesn't guarantee anything by itself, and it is not a legal concept in the first place.

Is a port to another language a derived work?

Posted Mar 6, 2026 6:30 UTC (Fri) by rsidd (subscriber, #2582) [Link] (2 responses)

Had an LLM been employed to translate chardet to, say, Lisp, the level of similarity would be quite low, but most would agree that the new code was derived from the original.

If this argument is correct, chardet should have been under the MPL and not LGPL, since it is apparently a straightforward port of the C++ version of chardet which is under the MPL:

[dan-blanchard] The only reason I am advocating for MPL in this particular case is that a straight literal port like chardet is usually considered a modification/derivative of the original code. Given the terms of the MPL, it seems to me like chardet should have been required to be released under MPL in the first place, and that the LGPL choice was a legal oversight on Mark's part.

Elsewhere Blanchard hints that Pilgrim is not easy to reach on email; but apparently he is able to send email, and it is weird that, after so many years, he comes out of the woodwork demanding that his original licence be respected.

Is a port to another language a derived work?

Posted Mar 6, 2026 16:15 UTC (Fri) by Wol (subscriber, #4433) [Link]

> it seems to me like chardet should have been required to be released under MPL in the first place, and that the LGPL choice was a legal oversight on Mark's part.

MPL I or II? Iirc MPL II explicitly permits relicencing to (L)GPL?

There were complaints that MPL I and (L)GPL weren't compatible so I believe that was the fix that went into MPL II.

Cheers,
Wol

Is a port to another language a derived work?

Posted Mar 6, 2026 17:20 UTC (Fri) by Kluge (subscriber, #2881) [Link]

> Elsewhere Blanchard hints that Pilgrim is not easy to reach on email; but apparently he is able to send email, and it is weird that, after so many years, he comes out of the woodwork demanding that his original licence be respected.

Why is it odd if there was no indication that his license would not be respected until now?

Just rename it and move along.

Posted Mar 6, 2026 8:01 UTC (Fri) by edomaur (subscriber, #14520) [Link] (9 responses)

OK, if chardet v7.0.0 is causing a licensing issue, why not just rename it as something-else v26.04 or cataracte v1.0.0 or unchardet v7.0.0 etc.?

Just rename it and move along.

Posted Mar 7, 2026 7:47 UTC (Sat) by josh (subscriber, #17465) [Link] (8 responses)

It would still be a license violation of all the other contributors to chardet, no matter what it's called.

Just rename it and move along.

Posted Mar 8, 2026 5:41 UTC (Sun) by edomaur (subscriber, #14520) [Link] (4 responses)

I really don't understand why it should be that way: the overall "effect" of v7.0.0 is the same as the non-v7.0.0, but if the source is different I cannot see how it could be a license violation. OK, using an LLM makes it ethically disputable, but what if it was completely rewritten only by human means, would it still be a violation of the license? Because if yes, I see a can of worms there that Oracle and its legal team would be happy to open (e.g. Java and API stuff).

Just rename it and move along.

Posted Mar 8, 2026 5:46 UTC (Sun) by edomaur (subscriber, #14520) [Link]

Also, something else I read somewhere else: it seems that non-curated AI-generated code actually has 1.7 times more issues than human-made code, so even if AI helps somewhat, it still leaves a lot of work on the table. And not really the fun work.

Just rename it and move along.

Posted Mar 8, 2026 8:40 UTC (Sun) by josh (subscriber, #17465) [Link] (2 responses)

> if the source is different I cannot see how it could be a license violation. OK, using an LLM makes it ethically disputable, but what if it was completely rewritten only by human means, would it still be a violation of the license?

Yes, it would be. You can rewrite a piece of software and still be a derivative work of the software you based it on. By way of example, if you spend a while staring at the code of some software, then set it aside, go into an airgapped room, and type in a fresh implementation that follows the same spec, the result is likely still a derivative work, as you've based it on your knowledge of the source code. (The same thing applies if you try to write a copy of Lord of the Rings from memory and publish it, even if you don't end up copying any specific sentences word-for-word.)

The reason people talk about a "clean-room reimplementation" of a piece of software is that doing so typically involves having a careful firewall between those who read the source or reverse engineer the software (who use it to write documentation that explains *behavior* but not *implementation*, where that documentation may need to be reviewed by a lawyer to make sure it includes nothing copyrightable), and those in a "clean room" who read only the documentation and use that to reimplement the software based solely on its documented behavior. There's case law upholding that doing *that* is a defense against copyright infringement and can produce an independent work that isn't derived from the original.

Contrast that with having a person who has worked on the software for years and is deeply familiar with it, taking an LLM trained on most of the Internet *including the software source code in question*, and using that LLM to reimplement the software, with access to the original source that it has been *told* to use only for its test suite. Consider the vast gulf between those two processes. There is zero legal precedent that the extremely *non-clean-room* approach used here would avoid producing a derivative work. It is my sincere hope that there will soon be legal precedent that doing so *does* produce a derivative work.

Just rename it and move along.

Posted Mar 8, 2026 12:50 UTC (Sun) by pizza (subscriber, #46) [Link]

> with access to the original source that it has been *told* to use only for its test suite.

...Methinks copyleft projects need to start making their test suites private.

(I'd argue the same goes for all F/OSS but Apache/BSD/MIT-licensed stuff already allows for proprietarization)

Just rename it and move along.

Posted Mar 8, 2026 12:59 UTC (Sun) by bluca (subscriber, #118303) [Link]

> Yes, it would be.

You keep saying this, and yet providing zero proof for it. It doesn't work like that. A "clean room" implementation is one way to err on the safe side, but it is not a legal requirement. There are several criteria and tests that are typically taken into consideration in various jurisdictions to decide whether something is a derivative work or isn't, and feelings and vibes and wishful thinking do not come into play.

> Contrast that with having a person who has worked on the software for years and is deeply familiar with it

Doesn't matter, never did, never will. Otherwise it would be impossible for people to do things that are routinely done, like changing jobs and going to work for competitors. You obviously cannot take IP with you, and obviously cannot reimplement the same thing verbatim, but there is absolutely no requirement not to be "deeply familiar" with copyrighted work in order to be able to work on similar but different projects.

That's one of the reasons why large companies, especially in the US, try to impose draconian non-compete private agreements on their employees - because the law is not on their side on this. And thank fuck for that, because otherwise it would provide _yet another_ way for big players in capitalism to screw over ordinary people.

Just rename it and move along.

Posted Mar 8, 2026 9:08 UTC (Sun) by Wol (subscriber, #4433) [Link] (2 responses)

> It would still be a license violation of all the other contributors to chardet,

As has been pointed out repeatedly, *what* other contributors? The empty set?

I wouldn't have done it this way, I would have done it the way they converted the GPL'd Star Office into the MPL'd Libre Office, and the tragedy is it sounds like that would have been an easy thing to do! But what's done is done.

Cheers,
Wol

Just rename it and move along.

Posted Mar 8, 2026 9:24 UTC (Sun) by josh (subscriber, #17465) [Link] (1 responses)

> As has been pointed out repeatedly, *what* other contributors?

`git shortlog -es` shows several dozen, and that's from the git era, not counting those who contributed prior to the first git commit (which includes the original author Mark Pilgrim).

> But what's done is done.

That's not how license violations work, no.

Just rename it and move along.

Posted Mar 8, 2026 9:59 UTC (Sun) by Wol (subscriber, #4433) [Link]

> `git shortlog -es` shows several dozen, and that's from the git era, not counting those who contributed prior to the first git commit (which includes the original author Mark Pilgrim).

Ah! Okay. But you do know, you are the first person in this long thread to actually come up with a list?

How many of these are recent, and how many predate the current maintainer taking over? Sounds like he really should have gone down the LibreOffice type route ...

Cheers,
Wol

I could suggest

Posted Mar 6, 2026 15:35 UTC (Fri) by frankie (subscriber, #13593) [Link]

to follow the channel of Simone Aliprandi (Italian lawyer and expert on such topics). Most of the stuff is in Italian, but auto-translation could be enough, and some slides are also in English.

https://www.youtube.com/@SimoneAliprandi

Has the horse left the barn?

Posted Mar 12, 2026 12:52 UTC (Thu) by karim (subscriber, #114) [Link]

I've been watching LLM code generation unfold and I keep asking myself whether the horse hasn't already left the barn ...

Companies are already shipping code generated by LLMs -- whether proprietary or not. OpenClaw was vibe coded and was sold for $1B. Need we say more? Any PM with half a brain is likely planning their way towards this direction, licensing be damned.

Don't get me wrong. I'm not promoting copyright infringement, etc. But if the cost of generating a given functionality is essentially zero then what "right" is being protected?

That said, would I lay my life down on code generated by Claude? Nope :)

Pulling this off requires some special circumstances ....

Posted Mar 21, 2026 12:46 UTC (Sat) by ras (subscriber, #33059) [Link] (1 responses)

It's not surprising to me that an LLM pulled this off, but it does require some special circumstances if you are going to do a "clean-room" implementation.

- You need a definition of the API. It's in the source that implements the API of course, but you can't look at that and claim it is clean-room. Perhaps you are allowed to look at the documentation? chardet seems to be small and have good documentation. Failing that, you can give it lots of examples of other modules calling the API.

- You need a very good test suite. Without this, you get what would be called crap (if you are being kind). With it, you usually get something that passes 100% of the tests. There is a surprisingly sharply defined moment when LLMs "became useful". Before that point, when an LLM tried to fix some code to make failing unit tests pass, it usually introduced more bugs than it fixed, so it spiralled into a complete mess. In June 2024, Claude 3.5 Sonnet usually converged on code that made all tests pass. Once you have convergence the model just needs to bash on it long enough.

- The lower-end APIs (the ones the program uses to get stuff done) also have to be well defined. chardet looks to be pure python which is not only pretty well defined, it's also something the existing LLM models know very well.

- It can't be too big. The current models advertise 200k tokens, but that isn't nearly as large as it sounds. Some models advertise 1M tokens, but those models tend to forget random bits of the 1M tokens so they become less reliable, and reliability isn't stellar to begin with. Everything the LLM reads, outputs and thinks to itself in its chain of thought gets appended to that window. If source gets read multiple times, which will happen when it's searching for a bug, then it gets appended multiple times. If it has to scan unit test output, or raw data fed to unit tests, that gets appended too. Running out of context is handled by a mechanism that is both less reliable than the OOM killer and more devastating, as it gets rid of 90% of what it knows. Tell me again - how do I compile this, I've forgotten.
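
To make the context-window arithmetic in the last point concrete, here is a rough sketch. It is only an illustration: the `ContextWindow` class, the per-file token counts, and the "keep 10% on overflow" rule are assumptions taken from the 200k and 90% figures above, not a model of how any particular tool actually manages its context.

```python
from dataclasses import dataclass


@dataclass
class ContextWindow:
    """Toy model of an append-only context window (hypothetical)."""

    limit: int = 200_000      # advertised window size, in tokens
    keep_percent: int = 10    # assumed share that survives an overflow
    used: int = 0
    compactions: int = 0

    def append(self, tokens: int) -> None:
        # Everything read, written, or "thought" is appended to the window.
        self.used += tokens
        if self.used > self.limit:
            # On overflow, assume all but keep_percent of the window is lost.
            self.used = self.used * self.keep_percent // 100
            self.compactions += 1


window = ContextWindow()
for _ in range(5):            # the same 30k-token source file, read five times
    window.append(30_000)
window.append(60_000)         # plus one large dump of unit test output
print(window.used, window.compactions)
```

Under these made-up numbers, five reads of one file plus a single test log are enough to overflow the window once, which is why re-reading source while hunting a bug is so costly.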

chardet met all those conditions. I'm guessing it was fed the documentation, told to make the unit tests pass, and it delivered something that passed the unit tests. I am not qualified to say if that means it complies with the LGPL. Others have pointed out in chardet's issues that it can't be MIT licensed as LLM output can't be copyrighted, but I suspect that becomes mushy if Dan hand-modified the source. Besides copyright law tends to change to accommodate new technologies, and we haven't seen the law makers hammer out a change to accommodate LLMs yet.

Any software that meets those conditions is reproducible via an LLM. Not effortlessly, as passing unit tests does not mean it works. A human has to review the code (or at least the result), and go through many rounds of pointing out the errors and getting the model to fix them. But if you do that, human + model in my experience turns out something better than a human could alone. If you already have documented API and unit tests (they are a fair chunk of the work), it's going to happen faster than the original was written.

One final twist - it is possible to get one LLM to read the original work, and produce an English specification. Then get a 2nd LLM to produce code that does the same as the original. That is what I was taught was the "clean-room" process you could use to work around copyright, albeit done by two teams of humans rather than LLMs. I dunno if doing it with an LLM changes that calculation. I've done it as a lark to see how well it works. It worked very, very well. You don't need the documentation or the unit tests in that case, but to get a useful result the LLM has to be able to run the old and new versions of the program and compare outputs. If this process is a way around copyright, any open source project that meets the last two conditions is reproducible with an LLM.
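
The "run the old and new versions and compare outputs" step is essentially differential testing. A minimal sketch follows, assuming both versions can be called as functions from bytes to an encoding name; `old_detect` and `new_detect` are toy stand-ins written for this example, not chardet's API.

```python
from typing import Callable, Iterable

Detector = Callable[[bytes], str]


def differential_test(
    old: Detector, new: Detector, samples: Iterable[bytes]
) -> list[tuple[bytes, str, str]]:
    """Return (sample, old_result, new_result) for every disagreement."""
    mismatches = []
    for sample in samples:
        expected = old(sample)   # the original is treated as the oracle
        actual = new(sample)
        if expected != actual:
            mismatches.append((sample, expected, actual))
    return mismatches


def old_detect(data: bytes) -> str:
    """Hypothetical 'original': try decoders in order."""
    for name in ("ascii", "utf-8"):
        try:
            data.decode(name)
            return name
        except UnicodeDecodeError:
            continue
    return "unknown"


def new_detect(data: bytes) -> str:
    """Hypothetical 'rewrite': same behavior, different code."""
    if all(byte < 0x80 for byte in data):
        return "ascii"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "unknown"
    return "utf-8"


samples = [b"hello", "héllo".encode("utf-8"), b"\xff\xfe", b""]
print(differential_test(old_detect, new_detect, samples))
```

An empty mismatch list only says the two versions agree on the samples tried; it says nothing about whether one is legally derived from the other.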

Pulling this off requires some special circumstances ....

Posted Mar 21, 2026 14:09 UTC (Sat) by karath (subscriber, #19025) [Link]

> Others have pointed out in chardet's issues that it can't be MIT licensed as LLM output can't be copyrighted
If my understanding is correct, the case rulings in the US regarding copyright of the output of an LLM are very, very narrow. The relevant ruling does not state that the output of an LLM cannot be copyrighted. It states only that an LLM is not (currently) eligible to be the copyright holder of the output.

IIRC, in the case, the plaintiff had disclaimed copyright in the prompt, and was using an LLM where the creator had disclaimed copyright in the output of the LLM. The plaintiff claimed that the LLM was creative enough that the output could be copyrighted by the LLM. The judge declined to rule on the creative-enough part, and ruled only on whether the LLM was eligible to claim copyright.

I'm not a lawyer, I don't live in the USA, and I don't obsessively follow the relevant cases. So it is entirely possible that there are later rulings that I have not yet heard of.

Protecting the Four Freedoms

Posted Mar 21, 2026 18:52 UTC (Sat) by jondo (guest, #69852) [Link] (1 responses)

Has anyone already thought/written about how we can continue to protect the Four Freedoms from the Free Software Definition? When I apply the (A,L-)GPL, can I in addition explicitly forbid the use of the source as LLM training material?

Protecting the Four Freedoms

Posted Mar 21, 2026 20:04 UTC (Sat) by Wol (subscriber, #4433) [Link]

> can I in addition explicitly forbid the use of the source as LLM training material?

Probably not. Certainly legally in many jurisdictions, and indeed morally in the view of many, such a restriction would be seen as making the work non-free, because you are explicitly forbidding the freedom to study and learn from the source.

(Never mind that an LLM is not a person, EU and others explicitly equate training an LLM to teaching a person. And that does make a lot of sense. I'm personally extremely happy with that - treating an LLM exactly the same as a person ie copyright material can go *IN* unhindered, what comes out may or may not be a copyright violation.)

Cheers,
Wol

SFC is analyzing chardet LGPL situation

Posted Mar 27, 2026 21:40 UTC (Fri) by bkuhn (subscriber, #58642) [Link]

Take a look at my comment on chardet's issue tracker.

TL;DR: I'm leading an effort at Software Freedom Conservancy (SFC) to analyze this situation. The results will be published. It will take a long time — for good reason. Meanwhile, anyone using chardet commercially should call their lawyer.

bkühn, Policy Fellow & Hacker-in-Residence at SFC


Copyright © 2026, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds