CIF formal specification

Herbert J. Bernstein yaya at bernstein-plus-sons.com
Thu Mar 3 20:50:58 GMT 2005


Re: global_

   1.  global_ is a reserved word in STAR, so it is not a good idea to
write CIFs that use it.  That remains true even if use of global_ in
the CIF world is deprecated.

   2.  While I would prefer that we minimize confusion by barring all
unquoted use of any string that begins with global_ (and loop_ and
stop_ for that matter), the essential step is not to have unquoted
global_ strings in a CIF.

Bottom line:  I like Brian's rewrite of 55.
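
To make the distinction between the two readings of 55 concrete, here is
a minimal Python sketch of the reserved-word test only.  The function
name, the constant names and the strict flag are hypothetical and chosen
for illustration.  The split between whole words (loop_, stop_, global_)
and prefixes (data_, save_) follows the description of para 57 given in
this thread, and strict=True models the stricter option mentioned in
point 2.  The other rules for unquoted strings are not checked.

```python
# Reserved words that take no label: forbidden only as the complete string.
WHOLE_WORD_RESERVED = ("loop_", "stop_", "global_")
# Reserved words that may be extended with a label: forbidden as a prefix.
PREFIX_RESERVED = ("data_", "save_")


def reserved_word_conflict(value: str, strict: bool = False) -> bool:
    """Return True if an unquoted string collides with a reserved word.

    Comparison is case-insensitive. With strict=True, any string that
    merely begins with loop_, stop_ or global_ is also rejected.
    """
    lowered = value.lower()
    if lowered.startswith(PREFIX_RESERVED):
        return True
    if strict:
        return lowered.startswith(WHOLE_WORD_RESERVED)
    return lowered in WHOLE_WORD_RESERVED
```

Under the rewritten 55, a value such as global_warming passes the
default test and fails only when strict=True, while GLOBAL_ on its own
fails in both modes.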

Re: ordering

   The proposed new wording is not accurate.  There is significance to
the ordering of data names, but certain reorderings do not change
the meaning of the CIF. I would suggest the following combined rewrite
of 7:

7. A given data name (tag) (see 2.4 and 2.7) may appear no more than
    once in a given data block or save frame.  A tag may be followed
    by a single value, or a list of one or more tags may be marked by
    the preceding reserved case-insensitive word loop_ as the headings
    of the columns of a table of values.  White space is used to
    separate a data block or save frame header from the contents of
    the data block or save frame, and to separate tags, values and
    the reserved word loop_.  Data items (tags along with their
    associated values) that are not presented in a table of values
    may be relocated along with their values within the same data
    block or save frame without changing the meaning of the data block
    or save frame.  Complete tables of values (the table column headings
    along with all columns of data) may be relocated within the same
    data block or save frame without changing the meaning of the data
    block or save frame.  Within a table of values, each tag may be
    relocated along with its associated column of values within the
    same table of values without changing the meaning of the table of
    values.  In general each row of a table of values may also be
    relocated within the same table of values without changing the
    meaning of the table of values.  Combining tables of values
    or breaking up tables of values would change the meanings, and
    is likely to violate the rules for constructing such tables
    of values.

I apologize for the complexity of this, but it is actually harder to
specify the meaning of an unordered set than it is to specify the
meaning of an ordered tuple, since the former requires specification
of equivalence classes, while the latter does not.
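
One way to picture the equivalence classes is to reduce each data block
to a canonical form and compare the results.  The following Python
sketch is only an illustration under stated assumptions: a data block is
modelled as a dict of single-valued items plus a list of tables of
values, all values are plain strings, tags are compared exactly as
written, and rows are treated as freely reorderable (the "in general"
case above).  The names canonical_loop, canonical_block and same_meaning
are hypothetical.

```python
from typing import Dict, List, Tuple

# A table of values: (column tags, rows of values).
Loop = Tuple[List[str], List[List[str]]]
# A data block: (single-valued items, tables of values).
Block = Tuple[Dict[str, str], List[Loop]]


def canonical_loop(tags: List[str], rows: List[List[str]]):
    """Sort columns by tag, then sort the rows of the reordered table."""
    order = sorted(range(len(tags)), key=lambda i: tags[i])
    sorted_tags = tuple(tags[i] for i in order)
    sorted_rows = tuple(sorted(tuple(row[i] for i in order) for row in rows))
    return (sorted_tags, sorted_rows)


def canonical_block(items: Dict[str, str], loops: List[Loop]):
    """Sort the single-valued items and the canonicalised tables."""
    canonical_items = tuple(sorted(items.items()))
    canonical_loops = tuple(sorted(canonical_loop(t, r) for t, r in loops))
    return (canonical_items, canonical_loops)


def same_meaning(a: Block, b: Block) -> bool:
    """Two blocks are equivalent if their canonical forms are equal."""
    return canonical_block(*a) == canonical_block(*b)
```

Relocating items, whole tables, columns or rows leaves the canonical
form unchanged, whereas splitting one table into two produces a
different canonical form, so the two blocks are not reported as
equivalent.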

    -- Herbert


At 3:58 PM -0500 3/3/05, John Westbrook wrote:
>Brian McMahon wrote:
>
>>While working through final proofs of the Volume G chapter on the CIF
>>specification, I have taken the opportunity to address two minor
>>problems.
>>
>>(1) John Bollinger wrote to me to say:
>>
>>>I was going over the specifications for CIF 1.1 currently available from
>>>IUCr at http://www.iucr.org/iucr-top/cif/spec/version1.1/cifsyntax.html,
>>>and I found a minor inconsistency related to use of the string
>>>"global_".  Paragraphs 11, 33, and 57 seem to say that only the string
>>>"global_" itself is forbidden as an unquoted data value, whereas
>>>paragraph 55 says that string is forbidden as the _start_ of an unquoted
>>>data value.  Which is correct?
>>
>>
>>global_ in STAR does not take a label, and so its behaviour should be the
>>same as loop_ and stop_ (rather than data_ and save_, both of which may be
>>expanded with a label). While all these constructs are classified as
>>reserved words, the difference in handling the complete and partial words is
>>made explicit in para 57 of the syntax document. I have therefore changed
>>paragraph (55) to read
>>
>>55. The reserved word global_ (in a case insensitive form).
>>     This is actually a reserved word of STAR, but is defined
>>     here so that it may be explicitly excluded as an
>>     unquoted string. This is done so that any possible future
>>     adoption of STAR features will not invalidate existing CIFs.
>>
>>instead of
>>
>>55. The reserved word global_ (in a case insensitive form).
>>     This is actually a reserved word of STAR, but is defined
>>     here so that it may be explicitly excluded as THE START OF an
>>     unquoted string. This is done so that any possible future
>>     adoption of STAR features will not invalidate existing CIFs.
>>
>
>I thought the use of global_ had been deprecated.  It has no
>counterpart in DDL2 applications and I would prefer to see this
>just die a quiet death.
>
>>(2) Peter Murray-Rust told me that the formal specs do not actually
>>state explicitly that the order of data names is irrelevant. I have
>>addressed this by adding the text in capitals to para (7):
>>
>>7. A given data name (tag) (see 2.4
>>    and 2.7) may appear no more than once in a given data block or
>>    save frame. THERE IS NO SIGNIFICANCE TO THE ORDERING OF DATA NAMES
>>    WITHIN A DATA BLOCK OR SAVE FRAME. THAT IS, A DATA NAME WITH ITS
>>    ASSOCIATED DATA VALUE OR SET OF DATA VALUES MAY BE RELOCATED WITHIN
>>    THE SAME DATA BLOCK OR SAVE FRAME WITHOUT CHANGING THE INTERPRETATION
>>    OF THE DATA. A tag may be followed by a single value, or a list of one
>>    or more tags may be marked by the preceding reserved case-insensitive
>>    word loop_ as the headings of the columns of a table of
>>    values. White space is used to separate a data block or save frame
>>    header from the contents of the data block or save frame, and to
>>    separate tags, values and the reserved word loop_.
>>
>Ordering does have some implications.  While the order of categories is
>of no importance, data items within categories must be collected
>together at one point in each data block.  The repetition of a category
>section in mmCIF is both a logical and a syntax error.
>
>>If there are no objections, these changes will be incorporated in
>>Volume G.
>>
>>Regards
>>Brian
>>_________________________________________________________________________
>>Brian McMahon                                       tel: +44 1244 342878
>>Research and Development Officer                    fax: +44 1244 314888
>>International Union of Crystallography            e-mail:  bm at iucr.org
>>5 Abbey Square, Chester CH1 2HU, England                   bm at iucr.ac.uk
>>_______________________________________________
>>comcifs mailing list
>>comcifs at iucr.org
>>http://scripts.iucr.org/mailman/listinfo/comcifs
>
>
>--
>******************************************************************
>   John Westbrook, Ph.D.
>   Rutgers, The State University of New Jersey
>   Department of Chemistry and Chemical Biology
>   610 Taylor Road
>   Piscataway, NJ 08854-8087
>   e-mail: jwest at rcsb.rutgers.edu
>   Ph:  (732) 445-4290  Fax: (732) 445-4320
>******************************************************************

-- 
=====================================================
  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

                  +1-631-244-3035
                  yaya at dowling.edu
=====================================================

