Advice on COMCIFS policy regarding compatibility of CIF syntax with other domains
jamesrhester at gmail.com
Fri Mar 4 11:47:11 GMT 2011
Thanks Peter for your comments. While you may not be a voting member
of COMCIFS, you and other COMCIFS members fulfill an important
advisory role and I would encourage everybody to take the opportunity
to provide their perspectives.
I assume you have no particular disagreement with the principles that
you haven't commented on explicitly?
I've inserted my responses to your comments below:
On Fri, Mar 4, 2011 at 7:25 PM, Peter Murray-Rust <pm286 at cam.ac.uk> wrote:
> I add some comments arising out of my own experience with XML/CML which may
> be useful. I don't think I am a full member of COMCIFS so feel free to
> ignore all or any. I comment after significant paragraphs.
> On Fri, Mar 4, 2011 at 6:03 AM, James Hester <jamesrhester at gmail.com> wrote:
>> 1. A feature should only be added to CIF syntax if all of the
>> following are satisfied:
>> (i) implementation or use of equivalent behaviour at dictionary level
>> is either significantly more cumbersome or not possible;
>> (ii) the feature provides significant new functionality that is widely
>> applicable to most scientific domains
>> (iii) reliable transfer and archiving of data is not compromised
>> (iv) there is no simpler way of achieving the desired behaviour
> I would add:
> * a feature should only be added if it has been shown possible to implement
> it with "reasonable ease". "Rough consensus and running code"
I agree that this is a reasonable requirement. I would express it in
terms of cost/benefit, so that a feature with a significant benefit
would justify extra implementation effort.
>> Example 2: Unicode support in CIF2. This is broadly useful, given the
>> international nature of science and range of symbols used in
>> scientific papers. It could have been implemented in dictionaries
>> using ASCII escapes, but this would have been cumbersome to use, so it
>> satisfies Principle 1. We have adopted Unicode (rather than created
>> our own international character set) and copied the XML character
>> ranges (Principle 2)
> I found the original ASCII escapes difficult/tedious for some code points
> and would urge full Unicode support (with numeric values).
I perhaps wasn't clear that we have already taken this step. The
current CIF2 draft envisions full Unicode support using UTF-8
encoding. Some provision has been made for allowing other encodings
in the future. The point of the example was to show how this decision
to adopt Unicode was justifiable in terms of these principles.
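
As a rough illustration of what "UTF-8 encoding plus the XML character
ranges" means for software, here is a minimal Python sketch. It assumes
the ranges are those of the XML 1.0 "Char" production; the exact set
allowed by the CIF2 draft may differ, and the function names are
hypothetical, not part of any CIF library.

```python
# Inclusive code point ranges of the XML 1.0 "Char" production.
# Assumption: used here as a stand-in for the ranges the CIF2 draft copied.
XML10_CHAR_RANGES = [
    (0x9, 0xA),           # tab, line feed
    (0xD, 0xD),           # carriage return
    (0x20, 0xD7FF),       # most of the Basic Multilingual Plane
    (0xE000, 0xFFFD),     # skips the surrogate block, U+FFFE and U+FFFF
    (0x10000, 0x10FFFF),  # supplementary planes
]


def is_xml10_char(ch: str) -> bool:
    """Return True if the single character ch lies in one of the ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in XML10_CHAR_RANGES)


def first_disallowed(data: bytes):
    """Decode data as UTF-8 and return (index, char) for the first
    character outside the ranges, or None if every character is allowed.
    Malformed UTF-8 raises UnicodeDecodeError."""
    text = data.decode("utf-8")
    for i, ch in enumerate(text):
        if not is_xml10_char(ch):
            return i, ch
    return None
```

A reader could call first_disallowed() on the raw bytes of a file before
parsing it, so that non-ASCII symbols are accepted directly as UTF-8
rather than through ASCII escapes.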
[rest edited out]