CIF Infoset
Dr P. Murray-Rust pm286 at cam.ac.uk
Wed Aug 18 09:42:15 BST 2004
On Aug 17 2004, Herbert J. Bernstein wrote:

> Peter asks some interesting questions. I do not propose to answer
> them in detail here. However, I should point out that interpretation
> of a given CIF may require 4 sets of documents:
>
> 1. The CIF itself.
> 2. The dictionary or dictionaries defining the tags
>    used in the CIF
> 3. The relevant DDLs
> 4. The CIF specification:
>    http://www.iucr.org/iucr-top/cif/spec/version1.1/index.html

I agree with this. A little while ago I was invited to work with Syd and Nick and spent 2 pleasant weeks looking at whether this could be managed in a self-consistent system. In theory, yes. In practice it was questionable whether it was worthwhile and would be used. It is almost isomorphic with the XML schema hierarchy:

DDL -validates-> DDL -validates-> dictionary -validates-> CIF

i.e. the DDL is self-validating. The problem was that *any* changes to the DDL have repercussions down the line which multiply. In XMLSchema we have

SchemaSchema -validates-> XSDSchema -validates-> instance

The construction of self-consistent schemas in XML has been anything but trivial and has caused much argument. It is unlikely that CIF will benefit from a rerun.

So I have taken the pragmatic view that we have DDL2 and DDL1 as currently accepted and used. As my own interests are currently in DDL1 I have restricted my questions and concerns to CIF (i.e. not STAR) and built software for this. My architecture should be sufficiently modular that if/when CIF extends to fuller STAR it can be enhanced.

> Many of Peter's questions are answered in the specification.

The lexical questions are. I have used the syntax and semantics documents as reference. I have assumed these are formal abstractions of the original published article(s). If they are not, then it would be useful to abstract additional rules - I think that implementers need to know exactly what documents apply and what the rules are.
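To make the point about the validation chain (DDL validates DDL validates dictionary validates CIF) concrete, here is a minimal toy sketch in Python. It is not how any real CIF or DDL software works: the Level class, the validates and check_chain functions, and the reduction of "validation" to a check that each document only uses tags declared by the level above are all assumptions made for illustration. The tag names are merely examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    """Hypothetical stand-in for one document in the chain."""
    name: str
    uses: frozenset      # tags this document itself is written with
    defines: frozenset   # tags it makes available to the level below


def validates(upper: Level, lower: Level) -> bool:
    """Toy rule: `lower` may only use tags that `upper` defines."""
    return lower.uses <= upper.defines


def check_chain(ddl: Level, dictionary: Level, cif: Level) -> list:
    """Return the names of the links in the chain that fail."""
    links = [(ddl, ddl), (ddl, dictionary), (dictionary, cif)]
    return [f"{u.name} -> {l.name}" for u, l in links if not validates(u, l)]


# Illustrative content only; a real DDL or dictionary is far richer.
ddl = Level("DDL",
            uses=frozenset({"_name", "_type"}),
            defines=frozenset({"_name", "_type"}))
dictionary = Level("dictionary",
                   uses=frozenset({"_name", "_type"}),
                   defines=frozenset({"_cell_length_a", "_cell_length_b"}))
cif = Level("CIF",
            uses=frozenset({"_cell_length_a"}),
            defines=frozenset())

ok = check_chain(ddl, dictionary, cif)

# Rename one tag in the DDL. The DDL still validates itself, but the
# unchanged dictionary no longer validates against it.
changed_ddl = Level("DDL",
                    uses=frozenset({"_name", "_type_code"}),
                    defines=frozenset({"_name", "_type_code"}))
broken = check_chain(changed_ddl, dictionary, cif)

print(ok)
print(broken)
```

Even in this toy model, a change at the top leaves the top level self-consistent while invalidating every dictionary written against the old version, which is the kind of repercussion described above.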
>
> The infoset concept is useful, but be warned that the appropriate
> handling of information depends on the context within which you are
> working, regardless of whether you are using CIF or using XML or
> the PDB format. For an application intended to just get at the data,
> comments may be discarded, while for an application intended to reformat
> the presentation of the data, comments are highly significant
> information. Similarly, the particular form of quoting, the
> distinction between "." and "?", etc. may or may not be
> significant. If the application in question is, say, a
> refinement program that just needs to read CIFs to extract
> expected crystallographic data, then construction of the "infoset"
> from a CIF is particularly simple. More demanding applications,
> e.g. in CIF validation and publication suites, may need to deal
> with more subtle data and metadata questions.
>

I am afraid I disagree! If the interpretation of a CIF depends on what program is to be used to process it then it is (IMO) not an abstract archive and transfer format.

Peter M-R