Revised CIF syntax guidelines

Herbert J. Bernstein yaya at bernstein-plus-sons.com
Tue Apr 5 14:04:16 BST 2011


Dear Colleagues,

   I will refrain from commenting on James's and Brian's remarks,
except to note that the "conservation principles" from PMR are
in a document in which he goes on to say:


"The technical options for software to support CIF/STAR are:
   - continue with CIF-specific software and commit much more
     resource than we currently do
   - re-use non-CIF tools already written in other contexts
   - re-define what we wish to do using CIF and what using
     other representations
I believe that only the last two are feasible."

Regards,
   Herbert


At 5:16 PM +1000 4/5/11, James Hester wrote:
>It is not my intention to create an absolute code, rather to create a
>framework for discussions and promote a certain prejudice in favour of
>the positions outlined in the guidelines.  COMCIFS voting members
>still remain the final arbiters.
>
>Moving on to the particular comments about points 1(i) and (ii), note
>that the phrase "domain level" is defined in the preamble to include
>DDL dictionaries.  Insofar as a particular DDL can be used across all
>domains, changes that do not satisfy all of the criteria in 1, but do
>satisfy 1(ii), would logically be implemented using DDL mechanisms (ie
>at the "domain level").  So, for example, while fancy syntax could be
>introduced to indicate more detail about the relationships among data
>items in a data file, which would certainly be broadly useful, this
>has been done at a DDL level instead. Does this adequately answer this
>aspect of your concerns, Brian?
>
>The intent of 1(i) and 1(ii) is to shelter the syntax from unnecessary
>complexity.  Simplicity is both a philosophical and a practical goal.
>It is up to COMCIFS to decide on whether or not it supports
>simplicity as a philosophical goal; there appears to be no fundamental
>mandate on COMCIFS to support such a philosophy, so it is entirely
>possible for COMCIFS to reject these goals.
>
>The practical outcome is that a simpler syntax makes implementation
>easier. Consider the conservation principles put forward by Peter
>Murray-Rust many years ago (see
>http://www.iucr.org/__data/iucr/lists/comcifs-l/msg00115.html)
>
>* You cannot hide complexity, you can only move it around
>* for everything you define in a specification, someone has to write code
>* it is far easier to write a specification than to implement it
>
>The draft guidelines move complexity to the domain level wherever
>possible, thereby reducing the workload on those who need to write
>software that needs only to understand CIF syntax and a set of
>specific datanames/values (a large class of software).
>
>On Mon, Apr 4, 2011 at 7:22 PM, Brian McMahon <bm at iucr.org> wrote:
>>  Where matters of principle are involved, I have a certain reluctance
>>  to work from an absolute written code. Perhaps it stems from cultural
>>  familiarity with British polity, that does not operate from a written
>>  constitution (though be aware that the modern Irish state, founded on
>>  a written constitution, also features large in my cultural baggage).
>>  In any case, unyielding adherence to formal principles is not wise;
>>  and a code of principle should always, in my view, be seen as guiding
>>  rather than prohibitive. Of course, the better formulated the
>>  principles, the less will they be tested, on the whole.
>>
>>  Having said which, I'm in broad agreement with the thrust of the
>>  CIF 2.0 statements. Where I have some concerns:
>>
>>  Treating (i) and (ii) as exclusive does seem dangerous. Where it is
>>  seen that a base-line change could facilitate operation across many
>>  domains, it might make more sense to implement it at that level,
>>  rather than have the different domains separately (and possibly
>>  incompatibly) implement it in dictionaries. The merits would need
>>  to be considered case by case; but I would be happier if these two
>>  statements were rolled into an OR complex:
>>
>>   (i) Implementation of the desired behavior by changes at the domain
>>  level is not feasible, or else such changes, while feasible,  would
>>  significantly reduce human readability; OR the change provides
>>  significant new functionality that is widely applicable to those
>>  scientific domains where CIF is used
>>
>>  As a principle, point 3 ("Non-technical issues should be dealt with in
>>  non-technical arenas") seems to me tautologous. As such, I do not
>>  contest it; but neither would I be troubled if it were omitted from
>>  the list. In practice, of course, it's fair to insist that the
>>  scope of particular discussions be constrained to technical issues
>>  (or, in other circumstances, strictly non-technical issues) where
>>  it is clear what these might be.
>>
>>  (Re-reading that, it seems to me rather opaque. What I mean is that,
>>  broadly speaking, I support the way we have been approaching things,
>>  with the establishment of working groups to consider proposed
>>  technical developments, and occasional referral upwards to COMCIFS
>>  for advice when progress seems to be getting hung up on concerns
>>  over whether we're missing wider implications. Perhaps that referral
>>  might at times come in at a much earlier stage.)
>>
>>  Regards
>>  Brian
>>
>>  On Thu, Mar 31, 2011 at 03:20:57PM +1100, James Hester wrote:
>>>  Dear COMCIFS,
>>>
>>>  Please find below a slightly revised version of the guidelines for
>>>  developing base CIF syntax and semantics.  I have taken the version
>>>  most recently posted by John Bollinger, and following discussion with
>>>  John Westbrook, have added a couple of points relating to maintaining
>>>  compatibility with previous versions of CIF (1 (vi) and (vii)).  Note
>>>  also that I have adjusted the text in 1 (ii) to refer to
>>>  scientific domains in which CIF is used, rather than scientific
>>>  domains in general.
>>>
>>>  I plan to call a vote on accepting these guidelines following a short
>>>  further period for discussion of the revision below.
>>>
>>>  ==============================================
>>>  Principles guiding development of Base CIF 2.0
>>>  ----------------------------------------------------------------------
>>>
>>>  Preamble
>>>
>>>  CIF is a framework for exchanging and archiving scientific data,
>>>  featuring a human-readable, machine-parseable, file format designed to
>>>  serve as an exchange and archive medium.  'Base' CIF comprises the
>>>  definitions and constraints that underlie CIF and apply to all CIF
>>>  files; those aspects defining the CIF file format are documented in
>>>  the CIF Syntax specification and the CIF Common Semantic Features
>>>  specification.
>>>
>>>  Base CIF aims to remain as simple as possible by delegating
>>>  considerations such as ontology, vocabulary, data relationships, and
>>>  complex and rich data types to domain dictionaries and the DDL
>>>  formalisms by which those dictionaries are defined.  In the following,
>>>  the phrase 'domain level' refers to such documents (though it is
>>>  anticipated that only dictionaries, not DDLs, will be
>>>  domain-specific).  Definitions and constraints at domain level apply
>>>  to a particular CIF file only as declared by that file or as required
>>>  by a particular CIF processor in a particular context.
>>>
>>>  Principles
>>>
>>>  The design of base CIF 2.0 is guided by these principles:
>>>
>>>  1. A feature should be added to or changed in base CIF only if all of
>>>  the following are satisfied:
>>>
>>>   (i) Implementation of the desired behavior by changes at the domain
>>>  level is not feasible, or else such changes, while feasible,  would
>>>  significantly reduce human readability;
>>>   (ii) the change provides significant new functionality that is widely
>>>  applicable to those scientific domains where CIF is used
>>>   (iii) reliable transfer and archiving of data is not compromised
>>>   (iv) there is no simpler way of achieving the desired behaviour
>>>   (v) it has been shown possible to implement the change at a cost
>>>  commensurate with its benefits, as demonstrated in part by a rough
>>>  consensus and running code.
>>>   (vi) Where possible, any new CIF syntax features should be developed
>>>  as an extension to the current standard, and thus not change the
>>>  interpretation of archival files that conform with previous versions
>>>  of the CIF standard.
>>>   (vii) Where it is impractical to provide for full backward
>>>  compatibility as described in (vi), the relevant archival
>>>  repositories and software developers should be consulted to arrive
>>>  at a solution that will minimize the impact of such changes.
>>>
>>>
>>>  2. As long as the requirements in (1) are satisfied, base CIF should:
>>>   (i) behave in a way that is consistent with common usage
>>>   (ii) align with pre-existing standards where those standards provide
>>>  the required behaviour. CIF 1.1 can be considered a pre-existing
>>>  standard for CIF 2.0 in this context.
>>>
>>>  3. Non-technical issues should be dealt with in non-technical arenas.
>>>
>>>  4. Draft changes to base CIF will be made available on the IUCr
>>>  website for public comment for a period of at least 6 weeks, following
>>>  which COMCIFS voting members, after consideration of any objections
>>>  raised, can vote to accept the change. A change will be accepted if
>>>  3/4 of COMCIFS voting members approve it.
>>>  ===============
>>>
>>>  --
>>>  T +61 (02) 9717 9907
>>>  F +61 (02) 9717 3145
>>>  M +61 (04) 0249 4148
>>>  _______________________________________________
>>>  comcifs mailing list
>>>  comcifs at iucr.org
>>>  http://scripts.iucr.org/mailman/listinfo/comcifs
>>
>
>
>
>--
>T +61 (02) 9717 9907
>F +61 (02) 9717 3145
>M +61 (04) 0249 4148


-- 
=====================================================
  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

                  +1-631-244-3035
                  yaya at dowling.edu
=====================================================

