Tag Archives: Licensing

Fixing our Self Defeating Licence Compatibility Problems in Open Source

Much angst (and discussion ink) is wasted in open source over whether pulling in code from one project with a different licence into another is allowable based on the compatibility of the two licences. I call this problem self defeating because it creates sequestered islands of incompatibly licensed but otherwise fully open source code that can never ever meet in combination. Everyone from the most permissive open source person to the most ardent free software one would agree this is a problem that should be solved, but most of the islands would only agree to it being solved on their terms. Practically, we have got around this problem by judicious use of dual licensing but that requires permission from the copyright holders, which can sometimes be hard to achieve; so dual licensing is more a band aid than a solution.

In this blog post, I’m going to walk you through the reasons behind cone the most intractable compatibility disputes in open source: Apache-2 vs GPLv2. However, before we get there, I’m first going to walk through several legal issues in general contract and licensing law and then get on to the law and politics of open source licensing.

The Law of Contracts and Licences

Contracts and Licences come from very similar branches of the law and concepts that apply to one often apply to the other. For this legal tour we’ll begin with materiality in contracts followed by licences then look at repairable and irreparable legal harms and finally the conditions necessary to take court action.

Materiality in Contracts

This is actually a well studied and taught bit of the law. The essence is that every contract has a “heart” or core set of clauses which really represent what the parties want from each other and often has a set of peripheral clauses which don’t really affect the “heart” of the contract if they’re not fulfilled. Not fulfilling the latter are said to cause non-material breaches of the contract (i.e. breaches which don’t terminate the contract if they happen, although a party may still have an additional legal claim for the breach if it caused some sort of harm). A classic illustration, often used in law schools, is a contract for electrical the electrical wiring of a house that specifies yellow insulation. The contractor can’t find yellow, so wires the house with blue insulation. The contract doesn’t suffer a material breach because the wires are in the wall (where no-one can see) and there’s no safety issue with the colour and the heart of the contract was about wiring the house not about wire colour.

Materiality in Licensing

This is actually much less often discussed, but it’s still believed that licences are subject to the same materiality constraints as contracts and for this reason, licences often contain “materiality clauses” to describe what the licensor considers to be material to it. So for the licensing example, consider a publisher wishing to publish a book written by a famous author known as the “Red Writer”. A licence to publish for per copy royalties of 25% of the purchase price of the book is agreed but the author inserts a clause specifying by exact pantone number the red that must be the predominant colour of the binding (it’s why they’re known as the “Red Writer”) and also throws in a termination of copyright licence for breaches clause. The publisher does the first batch of 10,000 copies, but only after they’ve been produced discovers that the red is actually one pantone shade lighter than that specified in the licence. Since the cost of destroying the batch and reprinting is huge, the publisher offers the copies for sale knowing they’re out of spec. Some time later the “Red Writer” comes to know of the problem, decides the licence is breached and therefore terminated, so the publisher owes statutory damages (yes, they’ve registered their copyright) per copy on 10,000 books (about $300 million maximum), would the author win?

The answer of course is that no court is going to award the author $300 million. Most courts would take the view that the heart of the contract was about money and if the author got their royalties per book, there was no material breach and the licence continues in force for the publisher. The “Red Writer” may have a separate tort claim for reputational damage if any was caused by the mis-colouring of the book, but that’s it.

Open Source Enforcement and Harm

Looking at the examples above, you can see that most commercial applications of the law eventually boil down to money: you go to court alleging a harm, the court must agree and then assess the monetary compensation for the harm which becomes damages. Long ago in community open source, we agreed that money could never compensate for a continuing licence violation because if it could we’d have set a price for buying yourself out of the terms of the licence (and some Silicon Valley Rich Companies would actually be willing to pay it, since it became the dual licence business model of companies like MySQL)1. The principle that mostly applies in open source enforcement actions is that the harm is to the open source ecosystem and is caused by non-compliance with the licence. Since such harm can only be repaired by compliance that’s the essence of the demand. Most enforcement cases have been about egregious breaches: lack of any source code rather than deficiencies in the offer to provide source code, so there’s actually very little in court records with regard to materiality of licence breaches.

One final thing to note about enforcement cases is there must always be an allegation of material harm to someone or something because you can’t go into court and argue on abstract legal principles (as we seem to like to do in various community mailing lists), you must show actual consequences as well. In addition to consequences, you must propose a viable remedy for the harm that a court could impose. As I said above in open source cases it’s often about harms to the open source ecosystem caused by licence breaches, which is often accepted unchallenged by the defence because the case is about something obviously harmful to open source, like failure to provide source code (and the remedy is correspondingly give us the source code). However, when considering about the examples below it’s instructive to think about how an allegation of harm around a combination of incompatible open source licences would play out. Since the source code is available, there would be much more argument over what the actual harm to the ecosystem, if any, was and even if some theoretical harm could be demonstrated, what would the remedy be?

Applying this to Apache-2 vs GPLv2

The divide between the Apache Software Foundation (ASF) and the Free Software Foundation (FSF) is old and partly rooted in politics. For proof of this notice the FSF says that the two licences (GPLv2 and Apache-2) are legally incompatible and in response the ASF says no-one should use any GPL licences anyway. The purpose of this section is to guide you through the technicalities of the incompatibility and then apply the materiality lessons from above to see if they actually matter.

Why GPLv2 is Incompatible with Apache-2

The argument is that Apache-2 contains two incompatible clauses: the patent termination clause (section 3) which says that if you launch an action against anyone alleging the licensed code infringes your patent then all your rights to patents in the code under the Apache-2 licence terminate; and the Indemnity clause (Section 9) which says that if you want to offer an a warranty you must indemnify every contributor against any liability that warranty might incur. By contrast, GPLv2 contains an implied patent licence (Section 7) and a No Warranty clause (Section 11). Licence scholars mostly agree that the patent and indemnity terms in GPLv2 are weaker than those in Apache-2.

The incompatibility now occurs because GPLv2 says in Section 2 that the entire work after the combination must be shipped under GPLv2, which is possible: Apache is mostly permissive except for the stronger patent and indemnity clauses. However, it is arguable that without keeping those stronger clauses on the Apache-2 code, you’ve violated the Apache-2 licence and the GPLv2 no additional restrictions clause (Section 6) prevents you from keeping the stronger licensing and indemnity clauses even on the Apache-2 portions of the code. Thus Apache-2 and GPLv2 are incompatible.

Materiality and Incompatibility

It should be obvious from the above that it’s hard to make a materiality argument for dropping the stronger apache2 provisions because someone, somewhere might one day get into a situation where they would have helped. However, we can look at the materiality of the no additional restrictions clause in GPLv2. The FSF has always taken the absolutist position on this, which is why they think practically every other licence is GPLv2 incompatible: when you dig at least one clause in every other open source licence can be regarded as an additional restriction. We also can’t take the view that the whole clause is not material: there are obviously some restrictions (like you must pay me for every additional distribution of the code) that would destroy the open source nature of the licence. This is the whole point of the no additional restrictions clause: to prevent the downstream addition of clauses incompatible with the free software goal of the licence.

I mentioned in the section on Materiality in Licences that some licences have materiality clauses that try to describe what’s important to the licensor. It turns out that GPLv2 actually does have a materiality clause: the preamble. We all tend to skip the preamble when analysing the licence, but there’s no denying it’s 7 paragraphs of justification for why the licence looks like it does and what its goals are.

So, to take the easiest analysis first, does the additional indemnity Apache-2 requires represent a material additional restriction. The preamble actually says “for each author’s protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors’ reputations.” Even on a plain reading an additional strengthening of that by providing an indemnity to the original authors has to be consistent with the purpose as described, so the indemnity clause can’t be regarded as a material additional restriction (a restriction which would harm the aims of the licence) when read in combination with the preamble.

Now the patent termination clause. The preamble has this to say about patents “Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone’s free use or not licensed at all.” So giving licensees the ability to terminate the patent rights for patent aggressors would appear to be an additional method of fulfilling the last sentence. And, again, the patent termination clause seems to be consistent with the licence purpose and thus must also not be a material additional restriction.

Thus the final conclusion is that while the patent and indemnity clauses of Apache-2 do represent additional restrictions, they’re not material additional restrictions according to the purpose of the licence as outlined by its materiality clause and thus the combination is permitted. This doesn’t mean the combination is free of consequences: the added code still carries the additional restrictions and you must call that out to the downstream via some mechanism like licensing tags, but it can be done.

Proving It

The only way to prove the above argument is to win in court on it. However, here lies the another good reason why combining Apache-2 and GPLv2 is allowed: there’s no real way to demonstrate harm to anything (either the copyright holder who agreed to GPLv2 or the Community) and without a theory of actual Harm, no-one would have standing to get to court to test the argument. This may look like a catch-22, but it’s another solid reason why, even in the absence of the materiality arguments, this would ultimately be allowed (if you can’t prevent it, it must be allowable, right …).

Community Problems with the Materiality Approach

The biggest worry about the loosening of the “no additional restrictions” clause of the GPL is opening the door to further abuse of the licence by unscrupulous actors. While I agree that this should be a concern, I think it is adequately addressed by rooting the materiality of the licence in the preamble or in provable harm to the open source community. There is also the flip side of this: licences are first and foremost meant to serve the needs of their development community rather than become inflexible implements for a group of enforcers, so even if there were some putative additional abuse in this approach, I suspect it would be outweighed by the licence compatibility benefit to the development communities in general.

Conclusion

The first thing to note is that Open Source incompatible licence combination isn’t as easy as simply combining the code under a single licence: You have to preserve the essential elements of both licences in the code which is combined (although not necessarily the whole project), so for an Apache-2/GPLv2 combination, you’ll need a note on the files saying they follow the stronger Apache patent termination and indemnity even if they’re otherwise GPLv2. However, as long as you’re careful the combination works for either of two reasons: because the Apache-2 restrictions aren’t material additional restrictions under the GPLv2 preamble or because no-one was actually harmed in the making of the combination (or both).

One can see from the above that similar arguments can be applied to various other supposedly incompatible licence combinations (exercise for the reader: try it with BSD-4-Clause and GPLv2). One final point that should be made is that licences and contracts are also all about what was in the minds of the parties, so for open source licences on community code, the norms and practices of the community matter in addition to what the licence actually says and what courts have made of it. In the final analysis, if the community norm of, say, a GPLv2 project is to accept Apache-2 code allowing for the stronger patent and indemnity clauses, then that will become the understood basis for interpreting the GPLv2 licence in that community.

For completeness, I should point out I’ve used the no harm no foul reasoning before when arguing that CDDL and GPLv2 are compatible.

The Community Corrosive Effects of CLAs

As one of the kernel DCO advocates, I’ve written many times about using the DCO instead of a CLA for copyright and patent contributions under open source licences. In spite of my obvious biases, I’ll try to give a factual overview of the cases for the DCO and CLA system. First, it should be noted that both the DCO and any CLA are types of Contribution Agreements (a set of terms by which contributors are agreeing to be bound). It should also be acknowledged that the DCO is a far more recent invention than CLAs. The DCO was first pioneered by the Linux kernel in 2004 (having been designed by Diane Peters, then of OSDL) and was subsequently adopted by a broad range of open source projects. However, in legal terms, the DCO is much less well understood than a standard CLA type agreement between the contributor and some entity, which is largely the reason you find a number of lawyers still advocating for the use of CLAs in various open source projects: because they’d like to stick with something that has more miles on it, or because they’re invested in the older model of community, largely pioneered by Apache. The biggest problem today is that the operation of most CLAs is asymmetrical: they take from the contributor more rights than the open source code actually needs, so lets begin with a summary of each type of Contribution Agreement.

DCO

The DCO is a legal representation by the contributor to everyone who might ever use the code. It requires no second party on the other side to counter sign it or act as the receiving entity, so it exactly mirrors the inbound=outbound licensing model first coined by Richard Fontana. The DCO explicitly grants to all downstream recipients only the exact rights the Open Source licence requires (and nothing more). In this sense it is fully symmetrical: the rights granted by the contributor are the same as the rights received by the downstream (i.e. inbound=outbound). Every contributor under the DCO retains their own copyright (or their company does if the contribution is a work for hire). The main alleged disadvantage of the DCO is that it encourages distributed ownership and makes it very hard to change the licence of the project because each contributor has only granted the rights necessary for the current licence, so if the new one requires more or different rights, all the current contributors have to re-grant those new or different rights (which can be a huge number of people for large long running projects). Since the DCO is a representation to everyone and requires no receiving entity, the project collecting the code doesn’t require any formal legal entity, like a foundation, to operate and thus the DCO gives rise to a truly lightweight structure for any project. The other big advantage of the DCO is that all of the representations are tracked by the Signed-off-by: tag on the commit, which goes in the git repository of the project code, so anyone with a clone of the repository has complete access to information about who changed what and where their DCO signoff is.

CLA

All current Open Source CLAs are structured as agreements between the contributor and a second party. Most often, the second party is a Foundation or a Corporation, making them quite heavy weight in terms of setup, admin and overhead. Every current CLA that I know about takes more rights from the contributor than the open source licence actually requires. For instance the Apache Individual CLA grants the right to copy, derive and sublicence to the Apache foundation who then relicence the contribution to the project usually under the Apache 2.0 licence. This is a classic asymmetric grant because the Apache foundation receives far more rights in the contribution than it grants to the downstream recipients. The FSF CLA is even more extreme because they require assignment of the copyright (so they will own the code and you, the author, will have no further right or interest in it except possibly for minimal moral rights to be named the author). Apart from the asymmetric grant, which places the receiving entity in a privileged position in the ecosystem, the other problem with CLAs is that they’re legal agreements, so they require a lawyer to prepare them, a mechanism to ensure people sign them and a mechanism to keep all the signatures … sometimes this can be in filing cabinets if paper instead of electronic copies are used. This repository of agreements then isn’t available to anyone except the tracking entity, meaning that if someone needs to know if John Doe signed a CLA, they have to reach out and ask. In some cases the actual filing cabinets got lost as projects changed offices, so some CLA based projects don’t actually have complete records of all their CLAs.

CLAs Catalyse Community Corrosion

The main driver of community corrosion is the temptation to abuse a position of power (this temptation becomes irresistable over time because, as Baron Acton put it, “all power corrupts”). Since CLAs by their nature force a power imbalance between the contributor and the receiving entity, they act as focal points for this corrosion. Communities are very sensitive to what they see as their work being misused, so the fastest way to lose community trust is to abuse the power the CLA gave you to go against the community itself. There are numerous examples of this in the Corporate World, the most topical one today being the Elastic change from Apache 2.0 to SSPL to better monetize the code the community contributed freely to. One might think the solution to this is never to sign a CLA if the holder of the power imbalance is a corporation … i.e. only do it if the other entity is a not for profit foundation. But ask yourself, how much do you trust the people running the foundation and do its bylaws guarantee your rights in the code? Relicensing for commercial gain isn’t the only way the community could be abused, so how sure are you of the power you’re handing to a foundation which, after all, is an entity governed by some type of board, all of whom likely have political agendas, won’t be abused? To see some examples of foundations not being in tune with their community, one only has to look at the FSF and Richard Stallman. Based on all of this I conclude, like Drew DeVault, that you should never sign a CLA under any circumstances.

The bottom line is that if you do sign a CLA some decision will happen at some point that you don’t agree with but which you already gave away the power to block because of the rights imbalance inherent in the CLA you signed. Inevitably this decision will cause you to feel betrayed because your views are being ignored and as a contributor you feel you should be heard, so you’ll sour on the project. This is the community corrosion catalyst buried deep inside all CLAs.

One final thing to note is that it is possible to craft a CLA that only takes the rights it needs, in the same way the DCO does, it’s just that no project I know has ever done this. However, even if this experiment were attempted, you still need a recipient entity, plus all the infrastructure to do signing and track the signed agreements, so you’d still be better off using a lightweight DCO process.

Conclusion: For Community Small is Beautiful

The way to avoid the community corrosion problem is to do everything minimally: use a DCO to take only the rights the downstream requires and to avoid all the heavyweight recipient, signing and tracking infrastructure. Don’t set up a foundation unless you absolutely need an entity, say to handle cash, and if you must set one up, never give it any control over the project (like appointing a change control or architecture control board for instance) everything you set up should be as small as possible and clearly serve the project and its community. Above all, don’t use a CLA because it will cause a rights imbalance that corrodes your community and it will require a large amount of overhead to run.