[Imgcif-l] proposed change in first line of imgcif files

Tue Oct 7 05:06:01 BST 2008

---------- Forwarded message ----------
From: James Hester <jamesrhester at gmail.com>
Date: Tue, Oct 7, 2008 at 2:05 PM
Subject: Re: [Imgcif-l] proposed change in first line of imgcif files
To: "Herbert J. Bernstein" <yaya at bernstein-plus-sons.com>

Dear Herbert and Harry:

Herbert's suggestion is a fine one which I agree with.  What I took from
Harry's email was that he would be prepared to live in a regime where the
course of action to be taken if header and contents didn't match would be
application-specific, so that's good.

Perhaps we should nut out the technical details of the proposal in a new
thread for tidiness.

Best wishes,
James.

On Thu, Oct 2, 2008 at 11:39 PM, Herbert J. Bernstein <yaya at bernstein-plus-sons.com> wrote:

> Dear James,
>
>  I have read your remarks and Harry's.  How about we say what we all seem
> to agree on:
>
>  If both a magic number and one or more CIF tags specify values for the
> same or related parameter(s), those values should agree according to the
> dictionary specified relationships among the parameter(s).  Similarly, if
> two CIF tags specify values for the same or related parameters, those values
> should agree according to the dictionary specified relationships among the
> parameter(s).  In all dictionaries, those relationships should be clearly
> explained in the explanatory text of the dictionaries.  In DDLm
> dictionaries, the ability to algorithmically specify those relationships
> should be exploited where appropriate.
>
>  This would clearly specify our common intent for clearly documented
> agreement among multiple presentations of the same information, and leave
> it to specific applications to follow whatever approach seems appropriate
> to the application developer in dealing with the error case of
> disagreement.
>
>  Now what we really need to do is to agree on the CIF tags that should
> agree with the imgCIF magic number.
>
>  To get everything in the same place, here is the magic number proposal
> along with a _ws...  tag to allow the information to be referenced
> algorithmically and, finally, a variant on James' specific tags for the
> newly added style and style_version.
>
> 1.  What problem is being solved?  As the use of imgCIF has increased, two
> very distinct sets of files have appeared: the "miniCBFs" used for the
> Pilatus 6m detector and more fully populated imgCIF files, such as the ones
> produced for ADSC detectors.  While the information necessary for processing
> can be discovered from context in handling a miniCBF, it may be necessary to
> read fairly far into the file to discover that the file is indeed a miniCBF,
> complicating the design of reading software.
>
> 2.  The proposed solution.  Currently CBF files begin with a magic number
> comment line
>            1         2         3         4         5
>   12345678901234567890123456789012345678901234567890
>   ###CBF: VERSION n.m
>
> We propose to extend the magic number comment line with two optional fields
> to read
>
>            1         2         3         4         5
>   12345678901234567890123456789012345678901234567890
>   ###CBF: VERSION n.m     style     style_version
>
> where "style" is a unique CBF style identifier left justified as a single
> word in columns 25-34 and "style_version" is a left justified integer in
> columns 35-44.
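
As a rough illustration of the fixed-column layout described above, here is a minimal Python sketch of how a reader might pull the two optional fields out of the first line. The function name parse_magic_line and the style name "minicbf" are made up for the example and are not registered identifiers. The sketch also assumes that the version field fits within columns 17-24 and that nothing else is written in columns 25-44.

```python
def parse_magic_line(line):
    """Split a CBF magic number line into (version, style, style_version).

    Columns are 1-based in the proposal, so columns 25-34 are line[24:34]
    and columns 35-44 are line[34:44] in Python's 0-based slicing.
    Missing optional fields are returned as None.
    """
    line = line.rstrip("\r\n")
    if not line.startswith("###CBF: VERSION"):
        raise ValueError("not a CBF magic number line")

    # Assumption: the "n.m" version sits between "VERSION" and column 25.
    version = line[15:24].strip()

    # Slicing past the end of a short line yields "", so the old
    # one-field form of the magic number still parses.
    style = line[24:34].strip() or None
    version_text = line[34:44].strip()
    style_version = int(version_text) if version_text else None

    return version, style, style_version


# Hypothetical example line; "minicbf" is not a registered style name.
example = "###CBF: VERSION 1.5".ljust(24) + "minicbf".ljust(10) + "1".ljust(10)
print(parse_magic_line(example))
```

Because both new fields are optional, a reader written this way would accept existing files unchanged and only needs to look at the first line to decide how to proceed.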
>
> Each style will be registered in a central repository along with
> information on the tags that will be carried for that style and a template
> of the tags that would be needed to fully populate the file.
>
> 3.  To facilitate writing DDLm methods to work with this or any other magic
> number convention, a pseudo-tag _ws.prologue would allow application
> manipulation of the comments and whitespace from before a data block. The
> prefix ws would be reserved for this purpose and for similar, related tags.
>  No parser would have to work with this tag.  It is provided simply to have
> an unambiguous algorithmic way to state the relationship with the following
> actual CIF tags.
>
> 4.  James Hester has proposed two new tags to be carried within an imgCIF
> file to agree with the style and style_version: _diffrn_detector.data_style
> and _diffrn_detector.data_style_version. Ignoring the 0-base vs. 1-base
> indexing issues, just to state the relationship between the first comment
> line and these tags in pseudocode:
>
> _diffrn_detector.data_style = trim(_ws.prologue[25:34])
> _diffrn_detector.data_style_version = trim(_ws.prologue[35:44])
>
> I would suggest, however, that these two tags do not quite fit in the
> diffrn_detector category, inasmuch as they do not really describe the
> detector.  They actually describe the format of the data block being used to
> present the detector information.  Therefore I suggest that we start a new
> category:  data_block_format and define
>
> _data_block_format.data_style = trim(_ws.prologue[25:34])
> _data_block_format.data_style_version = trim(_ws.prologue[35:44])
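
To make the indexing point concrete, the pseudocode relationship could be checked along the following lines in Python, where the 1-based columns 25-34 and 35-44 become the 0-based slices [24:34] and [34:44]. This is only a sketch under stated assumptions: the prologue is taken to be the raw text preceding the data block, the tags are taken to be a plain dict of strings, and the helper names style_fields and find_disagreements are hypothetical. It only reports mismatches, since what to do about them is left to the application.

```python
STYLE_TAG = "_data_block_format.data_style"
VERSION_TAG = "_data_block_format.data_style_version"


def style_fields(prologue):
    """Return (style, style_version) text from the first prologue line."""
    first = prologue.splitlines()[0] if prologue else ""
    # trim(_ws.prologue[25:34]) and trim(_ws.prologue[35:44]), 1-based
    return first[24:34].strip(), first[34:44].strip()


def find_disagreements(prologue, tags):
    """List (tag, header_value, tag_value) for every mismatch found.

    A tag or header field that is absent is not treated as an error,
    because both the header fields and the tags are optional.
    """
    style, style_version = style_fields(prologue)
    expected = {STYLE_TAG: style, VERSION_TAG: style_version}

    problems = []
    for tag, header_value in expected.items():
        if tag not in tags or not header_value:
            continue
        tag_value = str(tags[tag]).strip()
        if tag_value != header_value:
            problems.append((tag, header_value, tag_value))
    return problems
```

An application could then warn, abort, or prefer one source over the other depending on whether the returned list is empty.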
>
>
> Regards,
>  Herbert
>
> P.S.  I think we should explore formally creating a standard following
> the ISO processes, working under the IUCr, but seeing if we can
> eventually get ISO to accept what we do.
>
> =====================================================
>  Herbert J. Bernstein, Professor of Computer Science
>   Dowling College, Kramer Science Center, KSC 121
>        Idle Hour Blvd, Oakdale, NY, 11769
>
>                 +1-631-244-3035
>                 yaya at dowling.edu
> =====================================================
>
>

-- 
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
