Assigning CC-BY-4.0 licence to CIF dictionaries

James H jamesrhester at gmail.com
Wed Apr 24 07:11:45 BST 2024


I want to analyse Herbert's concerns about CC-BY-SA. Before doing so, I
should say that I'm proposing that we determine a default license, but any of
CC0/CC-BY/CC-BY-SA would be acceptable depending on which way the dictionary
authors want to jump. This is similar to Wikimedia, which allows these
licenses but lets individual projects choose which one they prefer.

On Thu, 4 Apr 2024 at 13:06, Herbert J. Bernstein <yayahjb at gmail.com> wrote:

> The problem with simple CC-BY without SA is that:
>
> "CC BY. This license enables reusers to distribute, remix, adapt, and
> build upon the material in any medium or format, so long as attribution is
> given to the creator. The license allows for commercial use. CC BY
> includes the following elements: BY: credit must be given to the creator."
>
> This means that the licensee is free to create a derivative work that is
> closed and does something completely different without having to disclose
> to anybody what this new thing does or how it does it.  For an app, that can
> be very useful for encouraging creative development of new and
> wonderful apps.  For a dictionary it can lead to the creation of a very
> different dialect with no clear documentation of how it differs and no way
> to find out. If you add a patent, you can make it even worse by being able
> to charge large fees to create documents that conform to the derived
> dictionary.
>

In the specific context of a CIF dictionary (or other text file), what does
"closed" mean? Any changes to the text file are transparent, so nothing can
be hidden, unlike compiled software. "Creation of a different dialect" in a
dictionary context means either use of different DDL attributes or
definition of different datanames, which is possible in both SA and non-SA
variants of the license. The way in which something differs has to be
stated under the terms of CC-BY, so it's not true to say that there is no
way of finding out the differences - and anyway, as text files it's easy
enough to compare with the original.

"If you add a patent": this seems irrelevant, as the publication of the
dictionary creates prior art (insofar as anything is patentable within it),
and in any case nothing in the SA variant protects from patents any more
than the non-SA version. "Charge fees to create documents conforming to the
derived dictionary" i.e. CIF files containing your special data names? You
can't patent a data name.
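
On the point that edited text files are easy to compare with the original,
here is a minimal sketch using Python's standard difflib module. The two
dictionary fragments and the data name _acme_detector.secret_gain are
invented purely for illustration; they are not taken from any real CIF
dictionary, and this is only one of many ways to make the comparison.

```python
import difflib

# Hypothetical dictionary fragments, invented for illustration only.
ORIGINAL = """\
data_EXAMPLE_DIC
save_cell.length_a
    _definition.id    '_cell.length_a'
save_
"""

# A "dialect": the original plus one vendor-specific definition (assumed name).
MODIFIED = ORIGINAL + """\
save_acme_detector.secret_gain
    _definition.id    '_acme_detector.secret_gain'
save_
"""


def added_lines(original: str, modified: str) -> list[str]:
    """Return the lines that a unified diff reports as added in `modified`."""
    diff = difflib.unified_diff(
        original.splitlines(),
        modified.splitlines(),
        fromfile="original.dic",
        tofile="modified.dic",
        lineterm="",
    )
    # Keep '+' lines and drop the '+++' file header. This simple filter
    # would also drop an added line whose own text begins with '++'.
    return [
        line[1:]
        for line in diff
        if line.startswith("+") and not line.startswith("+++")
    ]


if __name__ == "__main__":
    for line in added_lines(ORIGINAL, MODIFIED):
        print(line)
```

Run as a script, this prints only the lines that the modified fragment adds,
which is the sense in which changes to a plain-text dictionary cannot be
hidden from anyone who holds both files.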


>
> What CC-BY-SA does is
>
> The Creative Commons Attribution Share-Alike license allows
> re-distribution and re-use of a licensed work on the conditions that the
> creator is appropriately credited and that any derivative work is made
> available under “the same, similar or a compatible license”.
>
> While that approach can be deadly to commercial app development, which is
> why CBFlib allows use of the LGPL with the operating system exception in
> addition to  allowing full GPL licensing, it is exactly what a dictionary
> needs in order to allow derivative dialects to be fully understood.  It
> won't stop people from making better dialects, but it will make sure that
> the derivative dictionary is fully open and available so that it can be
> understood, avoiding the chaos of proprietary (i.e. secret internals or
> pay-wall protected) dialects.
>

A dictionary is a plain-text file, not compiled software, which is why we
have separate license types for documents: things like the GPL don't make
sense for them. The "dialects" you are talking about are edited text
files, which are just as understandable as the original. If I can restate
your concern, under CC-BY a commercial entity could take a CIF dictionary,
add some of its own data names for its proprietary equipment that describe
super secret settings, then bundle that dictionary with its equipment under
a restrictive license. And other companies could do the same, leading to a
bunch of similar but different data names whose definitions are hidden but
which appear in CIF data files. Under SA, they couldn't hide their
dictionaries.

This, with respect, makes no sense as an objection. Firstly, to create
their different data names they don't need to go to the trouble of writing
a dictionary. Secondly, we already have the situation where companies
register CIF prefixes and happily use data names that could shadow other
data names, so plain CC-BY would change nothing compared to CC-BY-SA.
Thirdly, the point of a dictionary is to communicate standards, and keeping
your standards-describing document secret makes it pointless to write it in
the first place. Fourthly, the IUCr is the authoritative source from which
the community takes its dictionaries, which limits the spread of
dialects. Adding SA to our license doesn't change that.

> This may seem like a far-fetched concern, but it is exactly what has
> happened many times with telecommunications, graphics, and compression
> protocols, especially in interacting with patents.  For example, we went
> through a couple of painful decades of essentially being unable to use the
> LZW-derived compression algorithms in crystallography until the patents
> expired, and it made several graphics formats unusable without paying large
> fees.  Use of CC-BY for documents, like BSD for software, is an invitation
> to some nasty patent troll to cause trouble.  Yes, this is not a serious
> problem in large parts of the world that are not as litigious as the US,
> but in the US patent trolls are a very real problem.
>

Yes, it is far-fetched, as per my previous paragraph. The examples you
provide above involve patents or copyrighted protocols. None of those are
in play here. The dictionaries have been published, they are prior art, so
no new patents would read against them. The dictionary copyright belongs
with us, so there are no protocols to be hidden away from us. Furthermore,
it is not at all clear how a patent troll would operate in the context of a
plain text file describing data names. Most obviously, if they claim that
our dictionary file infringes their IP, that is true regardless of license.
Please provide a concrete scenario involving dictionaries where SA would
prevent something bad that non-SA would allow.


> I would suggest consulting with a good lawyer.  IP law is very
> entertaining in the abstract, but running afoul of it can be very expensive.
>

The CC licenses have been lawyered very carefully so that they do what they
claim to do in as many jurisdictions as possible. They are used every day
by thousands of organisations and individuals. Our situation (wanting to
publicly share documents) is precisely the one envisaged by Creative
Commons. I don't think this is a situation requiring consulting a lawyer.

all the best,
James.

>
>
> On Wed, Apr 3, 2024 at 9:25 PM James H <jamesrhester at gmail.com> wrote:
>
>> Herbert, can you describe a concrete scenario in which CC-BY-SA would
>> reduce chaos more than CC-BY? CC-BY-SA still allows modification and
>> distribution of altered dictionaries, and it is not clear to me that e.g.
>> the liberal MIT licence has actually brought more chaos to projects using
>> it than the GPL.
>>
>>
>> On Thu, 4 Apr 2024 at 00:10, Herbert J. Bernstein via comcifs <
>> comcifs at iucr.org> wrote:
>>
>>> Personally, I would prefer CC-BY-SA for any dictionary to CC-BY, since
>>> the SA clause
>>> reduces the chance of abuse by malicious actors, for the same reason
>>> that the infectious
>>> GPL is the most effective license for open source software to prevent
>>> abuse.   Yes,
>>> Steve is right that the CC-BY licenses are hard to enforce -- they have
>>> not been
>>> tested in court the way the GPL has -- but if our intention is to limit
>>> the spread of
>>> conflicting dialects CC-BY-SA at least makes our intention to reduce the
>>> chaos clear.
>>>
>>> On Wed, Apr 3, 2024 at 7:10 AM Brian McMahon via comcifs <
>>> comcifs at iucr.org> wrote:
>>>
>>>> I confirm that CC-BY-4.0 would fit in with the projected workflow that
>>>> the Chester office has in place for assigning DOIs to future releases
>>>> of the dictionaries.
>>>>
>>>> Two corollaries:
>>>>
>>>> (1) Should we then have a _dictionary.licence term in the DDLm
>>>> dictionary?
>>>>     That would advertise the licence explicitly upon opening the
>>>> dictionary.
>>>>     Perhaps one also needs a _dictionary.licence_url to allow the full
>>>>     content of the licence to be retrieved?
>>>> (2) If so, we can enforce a single enumeration value (CC-BY-4.0) or we
>>>> can
>>>>     allow additional values (if the community needs that for the
>>>> exemptions
>>>>     that might be required e.g. by funding bodies as James mentions).
>>>>
>>>> Brian
>>>>
>>>>
>>>> On 03/04/2024 05:09, James H via comcifs wrote:
>>>> > Dear COMCIFS,
>>>> >
>>>> > It may come as some surprise that no licence is attached to our
>>>> > dictionaries. As these are machine-readable, they are available for
>>>> > other automated ontology-management systems (e.g. EMMO) to ingest and
>>>> > transform; however, the lack of a licence opens them up to perceived
>>>> > legal jeopardy. From time to time in the past licensing has been raised
>>>> > but not followed through on, the latest as far as I can tell being 2011.
>>>> > An educational thread from 1999 can be read at
>>>> > https://www.iucr.org/__data/iucr/lists/comcifs-l/msg00032.html and the
>>>> > statement of IUCr policy originating at that time is at
>>>> > https://www.iucr.org/resources/cif/comcifs/policy
>>>> >
>>>> > Since that time, Creative Commons have produced licences for material
>>>> > that is intended to be shared. These licenses are designed to work
>>>> > across international legal systems. The two which seem most
>>>> appropriate
>>>> > to us are CC0 (public domain), which is essentially renouncing all
>>>> > rights conferred by copyright, and CC-BY, which does the same, but
>>>> > requires attribution and that any changes to the original are clearly
>>>> > indicated. I urge you to have a look at
>>>> > https://creativecommons.org/share-your-work/ for background on
>>>> > Creative Commons.
>>>> >
>>>> > Having pondered the above, I would like now to propose that our
>>>> > dictionaries are licensed as CC-BY, for the following reasons, based
>>>> on
>>>> > the decision points in the Creative Commons "chooser" tool:
>>>> >
>>>> > 1. We need to pick a licence for clarity (see above)
>>>> > 2. CC0 (public domain) would theoretically allow somebody to take our
>>>> > dictionaries and claim them as their own or to distribute subtly but
>>>> > incorrectly modified versions. Note that the wwPDB does license their
>>>> > data as CC0, so this concern on my part may be misguided, particularly
>>>> > in a scientific community where the IUCr is an authoritative source
>>>> > 3. We do not wish to restrict use of our dictionaries for commercial
>>>> > purposes, for example, if a diffractometer manufacturer wished to
>>>> bundle
>>>> > a dictionary and add their own data names to it, they should not need
>>>> to
>>>> > spend their time or our time gaining permission. Simply following the
>>>> > rules for attribution and flagging modifications should be enough.
>>>> > 4. Transformation and adaptation of our dictionaries is an
>>>> increasingly
>>>> > common approach as neighbouring disciplines realise that they can
>>>> save a
>>>> > lot of time (e.g. the ongoing EMMO work). Allowing this type of
>>>> > modification is just normal scientific practice, where one group
>>>> builds
>>>> > on the openly available results of other groups, so we should not
>>>> > restrict it
>>>> > 5. We could require that any modified versions are published under the
>>>> > same licence, which would then make it CC-BY-ShareAlike. My opinion is
>>>> > that this type of restriction just introduces friction, for example,
>>>> > some funding body may require all outputs to be licensed according to
>>>> > some quite liberal licence that is not clearly compatible with
>>>> > CC-BY-ShareAlike, and so there's a need to seek an exemption.
>>>> >
>>>> > Please discuss. Those with insight into the wwPDB's choice of CC0 are
>>>> > welcome to weigh in. If there are no outstanding objections by the end
>>>> > of the month I will take that as agreement.
>>>> >
>>>> > best wishes,
>>>> > James.
>>>> _______________________________________________
>>>> comcifs mailing list
>>>> comcifs at iucr.org
>>>> http://mailman.iucr.org/cgi-bin/mailman/listinfo/comcifs
>>>>
>>>
>>
>>
>> --
>> T +61 (02) 9717 9907
>> F +61 (02) 9717 3145
>> M +61 (04) 0249 4148
>>
>
