The relicensing of chardet
The fact that chardet is LGPL-licensed has indeed caused some unhappiness in the past. That license is incompatible with the requirements for the Python standard library, frustrating those who would like to see chardet become one of the "batteries" that are included with Python; that licensing has also blocked the inclusion of some other modules that use chardet. Blanchard bemoaned his inability to relicense the code back in 2021:
Unfortunately, because the code that chardet was originally based on was LGPL, we don't really have a way to relicense it. Believe me, if we could, I would. There was talk of chardet being added to the standard library, and that was deemed impossible because of being unable to change the license.
In 2026, though, that inability has, according to Blanchard, been overcome by virtue of a complete rewrite — done using Anthropic's Claude LLM — of the source. Pilgrim did not see it that way:
However, it has been brought to my attention that, in the release 7.0.0, the maintainers claim to have the right to "relicense" the project. They have no such right; doing so is an explicit violation of the LGPL. Licensed code, when modified, must be released under the same LGPL license. Their claim that it is a "complete rewrite" is irrelevant, since they had ample exposure to the originally licensed code (i.e. this is not a "clean room" implementation). Adding a fancy code generator into the mix does not somehow grant them any additional rights.
Blanchard, unsurprisingly, disagreed.
A clean-room reimplementation, he said, "is a means to an end, not the
end itself", and there are other ways to reach that end, including
an LLM rewrite. He pointed to results from a code-comparison tool showing
that there was almost no similarity between version 7.0 and the
previous versions, and concluded:
I then started in an empty repository with no access to the old source tree, and explicitly instructed Claude not to base anything on LGPL/GPL-licensed code. I then reviewed, tested, and iterated on every piece of the result using Claude. You can see the history of all the design and implementation plans that were used to create 7.0.0 here. I did not write the code by hand, but I was deeply involved in designing, reviewing, and iterating on every aspect of it.
I understand this is a new and uncomfortable area, and that using AI tools in the rewrite of a long-standing open source project raises legitimate questions. But the evidence here is clear: 7.0 is an independent work, not a derivative of the LGPL-licensed codebase. The MIT license applies to it legitimately.
Simon Willison has observed, though, that the LLM did indeed access the LGPL-licensed source at one point. Beyond that, as others have pointed out, it is easy to ask an LLM to reimplement a body of code in a style different from the original, with the result that similarity checkers will see something entirely new. That does not necessarily break the derived-work link, though. Had an LLM been employed to translate chardet to, say, Lisp, the level of similarity would be quite low, but most would agree that the new code was derived from the original. The fact that the training corpus for Claude surely included all previous versions of chardet also muddies the picture.
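The point about similarity checkers can be illustrated with a small sketch. It is not the comparison tool that Blanchard cited, and neither function below comes from chardet; both are hypothetical stand-ins, and the standard library's difflib is used here only as an assumed, simple textual similarity measure. The two snippets compute the same thing, yet share little text:

```python
import difflib

# Hypothetical "original" code; not taken from chardet.
ORIGINAL = '''
def count_high_bytes(data):
    total = 0
    for byte in data:
        if byte >= 0x80:
            total += 1
    return total
'''

# Hypothetical restyled rewrite with the same behavior.
RESTYLED = '''
def tally_non_ascii(buf):
    return sum(1 for b in buf if b > 127)
'''


def similarity(a: str, b: str) -> float:
    """Textual similarity in [0, 1] as reported by difflib."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def same_behaviour(src_a, name_a, src_b, name_b, samples):
    """Run both snippets and compare their results on each sample input."""
    ns_a, ns_b = {}, {}
    exec(src_a, ns_a)
    exec(src_b, ns_b)
    return all(ns_a[name_a](s) == ns_b[name_b](s) for s in samples)


if __name__ == "__main__":
    samples = [b"", b"abc", bytes([0x80, 0x7F, 0xFF])]
    print("textual similarity:", round(similarity(ORIGINAL, RESTYLED), 2))
    print("same behavior on samples:",
          same_behaviour(ORIGINAL, "count_high_bytes",
                         RESTYLED, "tally_non_ascii", samples))
```

A low textual score here says nothing about whether the second snippet was written with the first in view, which is the question that matters for derivation.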
A lot of people who are not lawyers have offered opinions on whether chardet 7.0 is derived from previous versions. I, too, am not a lawyer, and will not add to that pile. It is worth saying, though, that if instructing an LLM to rewrite an existing body of code is sufficient to strip copyleft requirements from that code, then the future of copyleft looks even dimmer than it did before. Then again, the future of any sort of software-licensing scheme could be threatened in the same way. The death of copyleft could, ironically, be part of copyleft's real goal: the end of copyright.
Meanwhile, of course, had Blanchard simply shown up with a new Python module, let's call it "detectchar", that implemented the same API as chardet, the overall level of eyebrow elevation would have been considerably lower. Replacing the existing code, under the same name but with a different license, drew a lot more attention to this move than it would have otherwise attracted.
Nobody involved in the current discussion is showing any sign of backing
down. That means the license change seems likely to stand, unless Pilgrim
decides to bring in real lawyers, which would be an expensive and uncertain
prospect at best. But if the change stands, it would not be surprising to
see a lot more people engaging in this sort of license-stripping exercise.
That may eventually lead to a court decision (or, more likely, a series of
conflicting decisions) on whether an LLM can be used to launder source code
in this way. The old Chinese curse — may you live in interesting times —
would certainly appear to be upon us.