Provenance and property rights

Brian McMahon bm at iucr.org
Mon Sep 20 14:51:31 BST 2004


A quick initial comment. It's inevitable that CIF data will be repurposed
in CIF format, so assertions about intellectual property and redistribution
rights should be 

(a) part of the audit history of the file; and
(b) verifiable against checksums.

So we might have (without any claim that the suggested data names are
optimal)

loop_
_audit_copyright_date
_audit_copyright_owner
_audit_copyright_details
_audit_copyright_checksum_md5

2002   'W. Plinge'   .   '25219b1586fa67a279ef9fb988d23c19'
2003   'J. Doe'      ?   '6cd63e9ef1f1e3117f67addfb497bb9c'
2004   'American Chemical Society'
             'Transferred when submitted for publication'
             '6cd63e9ef1f1e3117f67addfb497bb9c'

Comments:
(1) While not relevant to the technical discussion, I'm curious to know the
circumstances in which Peter envisages Plinge transferring the copyright to
Doe - or do they jointly own the copyright, but from different dates?

(2) The purpose of the checksum is to validate that a file matching that
checksum is (probably) the identical file to which the associated assertion
relates. If the file has been changed in any way, there is no way to
reverse-engineer the changes to reproduce the file corresponding to the
stated checksum. On the other hand, if there is a dispute and Plinge (let us
say) can produce an original file with the relevant checksum, that will
provide evidence to support his intellectual property claims.

(3) In my example, the checksums for Doe and the ACS are the same (which
almost certainly wouldn't be the case if a true MD5 checksum were used). Do
we want a checksum that validates the *exact* content of a file (so that
you need to preserve OS-dependent line endings, comments etc) or that simply
in some way validates the "essential contents" of the file, e.g. excluding
the copyright assertions?

(4) Making checksum generation mandatory may be too heavy a burden on
older CIF writers, but perhaps we can aim, for a start, to generate such
things for the CIFs redistributed from the IUCr web site.

(5) Is there a case for including some sort of digital signature (where
available) into each loop packet to strengthen the associated rights
assertion?
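
To make the choice in comment (3) concrete, here is a minimal Python sketch
of the two kinds of checksum. It is an illustration only. The function names
and the normalisation rules are my assumptions, not an agreed convention:
blank lines and whole-line comments are dropped, surrounding whitespace is
stripped, and any item or loop using the _audit_copyright_ names suggested
above is excluded. The line-based parsing is deliberately naive and ignores
semicolon-delimited text fields.

```python
import hashlib

# Hypothetical prefix, taken from the data names suggested in the example loop.
COPYRIGHT_PREFIX = "_audit_copyright_"


def exact_md5(data: bytes) -> str:
    """MD5 of the file exactly as stored (line endings and comments included)."""
    return hashlib.md5(data).hexdigest()


def _has_copyright_name(lines) -> bool:
    return any(l.lower().startswith(COPYRIGHT_PREFIX) for l in lines)


def essential_lines(text: str) -> list:
    """Return the lines that count as 'essential contents' under the
    assumed rules described above."""
    kept, loop = [], []
    in_loop = seen_values = False
    for raw in text.splitlines():          # handles LF, CR and CRLF endings
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                        # skip blank lines and comments
        is_name = line.startswith("_")
        is_keyword = line.lower().startswith(("loop_", "data_"))
        # A loop ends at the next keyword, or at a data name following values.
        if in_loop and (is_keyword or (is_name and seen_values)):
            if not _has_copyright_name(loop):
                kept.extend(loop)
            loop, in_loop = [], False
        if line.lower() == "loop_":
            in_loop, seen_values = True, False
            loop.append(line)
        elif in_loop:
            seen_values = seen_values or not is_name
            loop.append(line)
        elif not line.lower().startswith(COPYRIGHT_PREFIX):
            kept.append(line)
    if in_loop and not _has_copyright_name(loop):
        kept.extend(loop)
    return kept


def essential_md5(data: bytes) -> str:
    """MD5 of the normalised 'essential contents' only."""
    text = data.decode("utf-8", errors="replace")
    return hashlib.md5("\n".join(essential_lines(text)).encode("utf-8")).hexdigest()
```

With something like this, two copies of a file that differ only in line
endings, comments and the copyright loop would give different exact_md5
values but the same essential_md5 value. Whether that is the behaviour we
want is exactly the question in (3).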


Present practice for Acta C and E papers is that they are submitted as CIFs.
These enhanced CIFs include the text of the paper; the author transfers
copyright of this material to us. (By the way, that is simply present
practice; we're happy to make other arrangements if the author wishes to
retain copyright or if there is a general movement in that direction.)
Since we may change the text during editing, in practice we carry the
copyright along into the final version of the paper, and we don't wish to
expose the early draft to public redistribution. Therefore the CIFs served
as supplementary materials represent only the data component of the
submitted CIF - that is, they are a subset.

A general legal question: is a licence to redistribute (according to the NIH
or BOAI model, say) the sole prerogative of the copyright owner? If so, then
we would need to think rather carefully how to manage the serving of data
CIFs from our site that came from different authors who wish to retain
copyright but license redistribution under various conditions. A controlled
vocabulary would most certainly help here, so that we could implement a
policy to redistribute anything tagged with certain approved protocols.
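
As a rough sketch of how such a policy check might look in Python: the tag
values below are invented purely for illustration (no such vocabulary has
been defined in any CIF dictionary), and the function name is likewise
hypothetical.

```python
# Hypothetical controlled vocabulary of approved licence tags.
# These values are illustrative assumptions, not defined CIF terms.
APPROVED_LICENCES = {"boai", "nih"}


def may_redistribute(licence_tag) -> bool:
    """Return True only if the file carries an approved licence tag.

    Missing tags, and the CIF placeholders '.' (inapplicable) and
    '?' (unknown), are treated as not approved.
    """
    if licence_tag is None:
        return False
    tag = licence_tag.strip().lower()
    if tag in (".", "?", ""):
        return False
    return tag in APPROVED_LICENCES
```

The point of the controlled vocabulary is that the check reduces to a simple
set membership test; free-text licence statements would need human review.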

Brian


