[Cif2-encoding] Splitting of imgCIF and other sub-topics. .. .

SIMON WESTRIP simonwestrip at btinternet.com
Sat Sep 11 09:59:54 BST 2010


Dear all

I have found recent exchanges, especially Herbert's contributions regarding
the real-world use of imgCIF, very enlightening. Primarily for reasons of
flexibility, I now find myself inclined to support a CIF specification that
allows a variety of encodings, provided that they are "clearly and
unambiguously defined".

To me, the clear and unambiguous definition should encompass a clear and
unambiguous *declaration* of the encoding; in the absence of such a
declaration in the CIF or in its container, a default encoding should be
assumed: either the default CIF encoding (which I think most agree should be
UTF-8) or one inherited from the container?
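
As a rough illustration only (nothing here comes from a CIF2 draft), the
fallback order described above could be sketched in Python as follows. The
names resolve_encoding and decode_cif and their parameters are hypothetical,
and the sketch assumes one possible answer to the open question: an explicit
declaration in the CIF wins, then anything the container states, and only then
the UTF-8 default.

```python
from typing import Optional

# Assumption: UTF-8 is the default CIF2 encoding, as most participants favour.
CIF2_DEFAULT_ENCODING = "utf-8"


def resolve_encoding(declared: Optional[str] = None,
                     container: Optional[str] = None,
                     default: str = CIF2_DEFAULT_ENCODING) -> str:
    """Choose the encoding used to decode a CIF (hypothetical helper).

    declared  -- encoding declared inside the CIF itself, if any
    container -- encoding stated by the enclosing container, if any
    default   -- fallback when neither declaration is present
    """
    for candidate in (declared, container):
        if candidate:
            return candidate.lower()
    return default


def decode_cif(raw: bytes,
               declared: Optional[str] = None,
               container: Optional[str] = None) -> str:
    """Decode raw bytes into CIF text using the resolved encoding."""
    return raw.decode(resolve_encoding(declared, container))
```

Swapping the order of the two candidates in the loop would give the container
priority over the CIF's own declaration; which order is right is part of what
remains to be decided.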

Though CIF1 has been successful without such a declaration (largely because
of the ASCII restriction), I believe it is essential in the case of CIF2.

Cheers

Simon

________________________________
From: "Bollinger, John C" <John.Bollinger at STJUDE.ORG>
To: Group for discussing encoding and content validation schemes for CIF2 
<cif2-encoding at iucr.org>
Sent: Friday, 10 September, 2010 19:24:05
Subject: Re: [Cif2-encoding] Splitting of imgCIF and other sub-topics. ..  .


On Friday, September 10, 2010 11:02 AM, Herbert J. Bernstein wrote:
>As I have said before, we went through this approach
>in 1997 and ended up going the other way -- treating the
>text-based CIF and the binary CBF as parts of the _same_
>format, not two different formats, not one being a serialization
>of the other, but the same format.  This may seem like a
>minor distinction, but it actually has strong implications
>for software design and implementation, ensuring that
>binaries in a CIF context are just a particular type of data
>handled with all the same mechanisms as ASCII data, allowing,
>for example, multiple diffraction images and thumbnails in
>one file in an order-independent way.
>
>You may be interested to know that the false dichotomy between
>binary and text-based representations is now starting
>to impact HDF5, requiring some significant effort to now
>work in database access, an aspect CIF1 supports -- why
>throw it away for CIF2?

Herb,

Perhaps you're reading more into my comments than I intended to put there.  In 
particular, I did not aim to suggest one on-disk/wire format should be a 
serialization of another, but rather that *all* on-disk/wire formats be 
characterized in terms of serialization of the Unicode character sequences 
described by most of the spec.  I meant "text" in that sense -- a sequence of 
Unicode characters -- not in the sense of a sequence of bytes conforming to some 
particular set of local conventions for text.  I meant "serialization" in the 
general sense of any reversible transformation of CIF text into a byte sequence, 
including those that rely on interpreting the CIF syntax.  That's aimed 
primarily at recognizing the use case in which CIF2 is embedded in or 
transformed into some other format, such as XML.
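
A minimal Python sketch of the reversibility requirement, under stated
assumptions: it covers only plain character encodings, not syntax-aware forms
such as an XML embedding, and the helper name round_trips and the sample CIF
text are hypothetical. It treats an encoding as a usable serialization of a
given CIF text only if encoding and then decoding returns exactly the same
character sequence.

```python
def round_trips(cif_text: str, encoding: str) -> bool:
    """Return True if `encoding` serializes `cif_text` reversibly.

    Hypothetical helper: a failed encode (characters the encoding cannot
    represent) counts as "not a serialization of this text".
    """
    try:
        data = cif_text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return data.decode(encoding) == cif_text


# Sample CIF-like text containing one non-ASCII character (U+00C5).
sample = "data_example\n_cell.length_a 5.43\n_note 'unit: \u00c5'\n"

utf8_ok = round_trips(sample, "utf-8")
utf16_ok = round_trips(sample, "utf-16")
ascii_ok = round_trips(sample, "ascii")
```

Here utf8_ok and utf16_ok are True while ascii_ok is False, because ASCII
cannot represent the non-ASCII character at all.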

I postulate, but do not specify, a serialization form defining the CIF2 version 
of what we have conventionally called "a CIF."  The details of that form are 
exactly what this list was established to discuss, and I did not intend to imply 
a particular resolution of our ongoing debate.  It was perhaps a mistake to 
include imgCIF/CBF on the list of possible alternative serialization forms, as 
it is far from settled whether it will fit under the umbrella of the 'CIF File' 
serialization form.  I apologize if that caused confusion.

[... I wrote:]
>> I think this matter would be best addressed by explicitly adopting an idea that 
>>we have discussed before: a formal separation between the definition of CIF text 
>>(i.e. James's "CIF2-conformant character stream") and the particular kind of 
>>packaging that we are accustomed to calling "a CIF" or "a CIF file".  James's 
>>suggestion implies such a separation anyway, so let's not do it halfway.  Given 
>>such a separation, the explanatory comment could be as simple as:
>>
>> "This specification's definition of the 'CIF File' serialization form for CIF2 
>>text is not intended to preclude definition or use of other serialization forms, 
>>such as HDF5-based forms, XML-based forms, or imgCIF/CBF."
>>
>> I choose the term "serialization form" because it puts primary emphasis on the 
>>CIF text (which after all is the subject of the bulk of the specification).  
>>Every correct serialization of CIF text is, by definition, transformable into 
>>CIF text form.


Regards,

John
--
John C. Bollinger, Ph.D.
Department of Structural Biology
St. Jude Children's Research Hospital


