CIF Infoset
David Brown idbrown at mcmaster.ca
Tue Sep 7 15:03:55 BST 2004
Here are a few IDB comments on the comments of DDB

>>The core dictionary defines three items which can be looped:
>>    _audit_conform_dict_name
>>    _audit_conform_dict_version
>>    _audit_conform_dict_location  # Contains the URL where the
>>                                  # dictionary can be found
>>As far as I know these have not been widely used - Acta Cryst. should
>>start insisting that these be included in submitted papers. There is no
>>need to give the dictionary version in anything as ephemeral as a comment.
>
>That sounds like a positive step, but would that go in every data_block or
>is it a global_ thing?

Since each datablock is independent, each would have its own
_audit_conform items, at least until such time as we develop a better
linkage between datablocks.

>The problem I see is that the effort invested in implementing it for all
>newly created and submitted CIFs is wasted because it is an
>incomplete solution and no current software uses it or needs it.

There are already editor/browsers that read in the dictionaries and use
them to validate a CIF. They do not yet check the _audit_conform items,
so the dictionaries have to be identified to the program by the user (or
the program loads all the dictionaries it can find, willy-nilly).
However, we are looking to the future, not just trying to keep up with
the past.

>So, to try and resolve the namespace of each name, you would need to
>(1) check the _audit_conform list of dictionaries in reverse order
>(2) check against the list of registered prefixes for accidental matches
>(3) check all versions of all publicly accessible dictionaries
>(4) then give up.

If an _audit_conform loop is present, it should list all the
dictionaries that were used in writing the CIF together with their URLs,
so an application should be able to download all the dictionaries it
needs. If there are data names appearing in the CIF that do not appear
in these dictionaries, then the items are undefined and the user can do
what seems most appropriate.
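A minimal sketch, in Python, of how an application might act on this:
read the _audit_conform loop from a datablock, then separate the data
names that the listed dictionaries define from those they do not. The
sample CIF, the dictionary file names, the example.org URLs, the ad hoc
item and all function names are assumptions made for illustration. The
tokenizer only splits on white space (no quoted strings or text
fields), and the DEFINED set stands in for the names an application
would collect after downloading the listed dictionaries.

```python
# Hypothetical sample datablock; names, versions and URLs are illustrative only.
CIF_TEXT = """\
data_example
loop_
_audit_conform_dict_name
_audit_conform_dict_version
_audit_conform_dict_location
cif_core.dic 2.3 http://example.org/cif_core.dic
cif_local.dic 1.0 http://example.org/cif_local.dic
_cell_length_a 5.43
_my_adhoc_item 42
"""

# Stand-in for the data names defined by the downloaded dictionaries.
DEFINED = {
    "_audit_conform_dict_name",
    "_audit_conform_dict_version",
    "_audit_conform_dict_location",
    "_cell_length_a",
}


def _is_structural(token):
    """True for tokens that end a run of loop values (simplified)."""
    low = token.lower()
    return low.startswith(("_", "data_", "save_")) or low == "loop_"


def audit_conform_loop(cif_text):
    """Return one dict per row of the first _audit_conform loop, or []."""
    tokens = cif_text.split()  # simplification: no quoted values supported
    for i, tok in enumerate(tokens):
        if tok.lower() != "loop_":
            continue
        j = i + 1
        names = []
        while j < len(tokens) and tokens[j].startswith("_"):
            names.append(tokens[j].lower())
            j += 1
        if not names or not all(n.startswith("_audit_conform_") for n in names):
            continue
        values = []
        while j < len(tokens) and not _is_structural(tokens[j]):
            values.append(tokens[j])
            j += 1
        width = len(names)
        return [
            dict(zip(names, values[k:k + width]))
            for k in range(0, len(values) - width + 1, width)
        ]
    return []


def data_names(cif_text):
    """All data names in the text, lower-cased, in order of appearance."""
    return [t.lower() for t in cif_text.split() if t.startswith("_")]


def split_by_definition(names, defined):
    """Partition names into (defined, undefined) lists."""
    known = [n for n in names if n in defined]
    undefined = [n for n in names if n not in defined]
    return known, undefined


if __name__ == "__main__":
    for row in audit_conform_loop(CIF_TEXT):
        print(row["_audit_conform_dict_name"], row["_audit_conform_dict_location"])
    known, undefined = split_by_definition(data_names(CIF_TEXT), DEFINED)
    print("undefined items:", undefined)
```

What to do with the undefined list is left to the application; the
sketch only reports it.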
In an editor written by some of my students, items not located in the
dictionary are loaded into a category called 'miscellaneous' where the
user can view them and decide whether they are legitimate or the result
of a syntactic error.

>If it's important enough to create a name for it then isn't it important
>enough to define its purpose somewhere? Ad hoc data names seem to provide
>nothing useful besides a legitimate excuse for laziness in the
>specification. There's no incentive to organize things tidily.
>Maybe they were important originally when COMCIFS were exploring
>the field, before dictionaries were introduced, but is it still important
>to be able to make up arbitrary stuff and stick it in a CIF without
>definition?
>Who is doing this and how are they using it?
>Do they really intend to save it for posterity?

New concepts are continually being developed in crystallography and it
is impractical to assign them names until it is clear that the concept
has some permanence, otherwise the dictionaries quickly become filled
with a legacy of discarded ideas. Thus people are encouraged to develop
software that involves ad hoc names that may later be adopted by CIF or
discarded. Yes, this does lead to potential problems in the archive,
though such items can be defined in a local dictionary which is listed
in the _audit_conform loop. In practice this is not likely to be a
problem because such items are not usually used in archived CIFs. We
wish to retain the flexibility of CIF to develop with the field and not
make people think they have to get the permission of the Academy
(COMCIFS) before they try out a new idea.

>>>I had a hazy recollection that "this is a string" and this_is_a_string
>>>were equally valid CIF constructs containing identical information
>>>content, used for example in space group names. Would they be formally
>>>identical in an infoset?
>>>Does the white space in all strings have to be normalised (is that
>>>the right word?)?
>>
>>We had a discussion of this point while preparing the symmetry_CIF
>>dictionary and came to the decision that these two strings were not
>>equivalent, i.e., underscore is not white space.
>
>Bummer. I know one program that needs changes made :-(

Because there is a legacy of underscore space group names (etc.) it is
wise to be able to read them, but they should not be written.

>But perhaps I could also draw your attention to this:
>  http://journals.iucr.org/services/cif/stdcodes.html#Appdx4.3
>as evidence that underscores do seem to be an
>officially sanctioned form of white space in uchar data types.

The instructions in this URL refer to an item in the 2.2 version of the
dictionary that has now been replaced in 2.3 by three separate items
that are fully enumerated. Thus this problem is resolved in the latest
dictionary version. Tightening up the dictionaries is an ongoing
process.

David

--
Dr. I.D.Brown, Professor Emeritus,
Department of Physics and Astronomy
McMaster University, Hamilton
Ontario, Canada
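The read-but-do-not-write advice for legacy underscore space group
names could be sketched in Python as follows. This is an illustration
under stated assumptions, not a prescribed implementation: the set of
items treated as legacy-tolerant, the function names and the simple
single-quote writer are all hypothetical. Underscores are converted to
spaces only for the listed items, since in general an underscore is not
white space.

```python
# Assumption: only these items are allowed the legacy underscore form on input.
LEGACY_UNDERSCORE_ITEMS = {"_symmetry_space_group_name_h-m"}


def read_value(name, value):
    """Accept the legacy underscore form on input for selected items only."""
    if name.lower() in LEGACY_UNDERSCORE_ITEMS:
        # Treat underscores as separators and collapse repeated white space.
        return " ".join(value.replace("_", " ").split())
    # For every other item an underscore is an ordinary character.
    return value


def format_item(name, value):
    """Write the value with real spaces, quoting it when it contains any.

    Simplified: assumes the value itself contains no quote characters.
    """
    if any(ch.isspace() for ch in value):
        return f"{name} '{value}'"
    return f"{name} {value}"


if __name__ == "__main__":
    item = "_symmetry_space_group_name_H-M"
    value = read_value(item, "P_21/c")
    print(format_item(item, value))
    # An unlisted item keeps its underscores unchanged.
    print(read_value("_chemical_name_common", "this_is_a_string"))
```

Restricting the conversion to a named list of items keeps the reader
tolerant of old files without treating the two strings as equivalent
everywhere.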
More information about the comcifs mailing list