COMCIFS approval of proposal to flag dataname redefinition
kaduk at polycrystallography.com
kaduk at polycrystallography.com
Mon Mar 27 16:44:28 BST 2017
This seems reasonable to me.
Jim Kaduk
On 2017-03-22 20:38, James Hester wrote:
> Dear COMCIFS,
>
> As per the my email of last December (see
> http://www.iucr.org/__data/iucr/lists/comcifs-l/msg00736.html) a
> simple mechanism has now been developed to clearly flag the presence
> of redefined datanames in a datafile. Neither the ddlm-group nor the
> core CIF DMG have raised objections. The full proposal is reproduced
> below, and can be read in a more nicely formatted form at
> https://github.com/COMCIFS/comcifs.github.io/blob/master/audit.formalism_proposal.md
>
> All comments are welcome. If no objections are received within 3
> weeks, this proposal will be considered approved.
>
> James Hester (chair).
>
> ==========================================
>
> # Proposal for new dataname and attribute to cover differing models
>
> ## Introduction
>
> The following proposal implements part of a solution to incorporating
> multiple models into CIF, discussed
> [here](changing_meanings_discussion_paper.md [1]). It should be read
> in
> conjunction with that document.
>
> ## New datanames: `_audit.formalism` and `_audit.formalism_version`
>
> Each value of `_audit.formalism` corresponds to a particular way of
> deriving some set of CIF datanames from other datanames defined in the
> same, or imported, dictionaries. For better interoperability, we
> stipulate that datanames may only be redefined by dictionaries if the
> redefined datanames take values with the same units and domain as the
> original datanames. So, for example, `_refln.F_complex` for magnetic
> structures is calculated differently to `_refln.F.complex` for core
> CIF, but has the same units and domain (positive real numbers), so it
> is acceptable for a magnetic dictionary to redefine
> `_refln.F_complex`.
>
> `_audit.formalism_version` is provided to allow secondary parameters
> to be added to the model without changing the overall formalism. As a
> guide, parameters are considered secondary if they do not require the
> addition of new columns to any category, and do not significantly
> change the final calculated values in "typical" cases.
>
> All CIF datablocks should include these new datanames when they take
> non-default values; the default values correspond to the
> single-crystal model described in core CIF. Any CIF reading programs
> that perform calculations should check `_audit.formalism` and
> `_audit.formalism_version`
> datanames in order to avoid miscalculating derived values.
>
> The choice of the word `formalism` is purely to avoid clashing with
> the widespread use of `model` in core CIF to refer to the particular
> arrangement of atoms. There may be a better word. See the appendix
> for formal DDLm definitions.
>
> ## New DDLm attributes: `_dictionary.formalism` and
> `_dictionary.formalism_version`
>
> These attributes associate a dictionary with a particular formalism.
>
> ## Treatment of current dictionaries
>
> ### Modulated structures
>
> The modulated structures dictionary is assigned formalism `modulated`
> and redefines `_refln.F_complex`, `_refln.sin_theta_over_lambda` and
> `_refln.symmetry_multiplicity`.
>
> ### Magnetism
>
> The magnetism dictionary builds on the modulated structures
> dictionary.
> It is assigned formalism `magnetic` and redefines `_refln.F_complex`
> only.
>
> ### Powder
>
> The powder dictionary calculates structure factors from information
> that may be held in a different datablock. It therefore redefines
> `_refln.F_complex`. `_refln.F_meas` is also redefined as the
> determination of this from the powder observations is markedly
> different to the way in which it is derived from single-crystal spots,
> not least because of pervasive overlap.
>
> Separate formalisms are necessary for each possible combination of
> powder with other formalisms, for example `_audit.formalism` can
> take values `powder-magnetic`, `powder-modulated` and
> `powder-multipole`.
>
> ### Electron density
>
> The electron density dictionary allows parameterisation of the
> electron density around each atom in terms of multipoles.
> `_refln.F_complex` is redefined, and a formalism of `multipole` is
> assigned.
>
> ### Constraints and restraints
>
> This dictionary relates only to the method of determination of the
> final parameters and therefore does not affect the definitions of
> the final datanames.
>
> ### Twinning
>
> Twinning does not change the structural model, but it may change the
> way `_refln.F_meas` is calculated from the observations. A formalism
> of `twinning` is assigned, and as for powder separate formalisms need
> to be assigned for each distinct structural model.
>
> ### Image CIF
>
> ImgCIF relates only to raw data and is not affected by these changes.
>
> ### mmCIF
>
> mmCIF is based on the core CIF model and is therefore unaffected by
> these changes.
>
> ## Treatment of other techniques
>
> ### Laue
>
> A Laue experiment measures distinct spots, but each spot is produced
> by a distinct wavelength, and spots sometimes overlap.
> `_refln.wavelength` therefore becomes an additional key column in
> `refln`. This change by itself is easily covered by defining a
> different `_audit.schema`. However, a Laue dictionary must also
> redefine `_refln.F_meas` as the extraction of notional observed
> intensities will depend on the model for wavelength distribution, and
> so we must assign a separate `_audit.formalism`. As for powder and
> twinning, there will be a separate `formalism` for each distinct
> structural model.
>
> ## Discussion
>
> ### Mixing and matching not possible
>
> It is tempting to define something like `_audit.technique` to cover
> the technique-based differences, so that `_audit.technique` and
> `_audit.formalism` could correspond to different dictionaries that
> could be mixed and matched. So, instead of a `powder-magnetism`
> formalism, there would simply be a `powder` technique combined with a
> `magnetism` formalism, with both dictionaries being separately
> imported
> and notionally orthogonal to one another.
>
> However, any `formalism` that adds keys to the `refln` category will
> also require the `technique` to be aware of those keys in order to
> explain how `_refln.F_meas` is determined. For example, a powder
> experiment on a modulated structure will calculate the `_refln.F_meas`
> value differently to a powder experiment on a non-modulated structure
> as the calculations of peak position require different numbers of
> indices. Therefore, it is not possible to generally separate the
> technique from the structural model, although it may be possible in
> particular cases.
>
> ### Just use `_audit_conform`?
>
> Core CIF has long provided the `_audit_conform_dict_*` tags to state
> which
> dictionary or dictionaries a datablock conforms to. This appears
> almost
> as simple as the proposed `_audit.formalism` tag, so the need for a
> separate tag may not be apparent.
>
> While the `_audit_conform` mechanism must remain the
> canonical source of information, the proposed dataname provides a
> simplified route to the same information. In order for a CIF reading
> program to confirm that none of the dictionaries listed in a CIF block
> change any of the definitions relied upon by that program, it must in
> general download the stated versions of the dictionary or dictionaries
> from the canonical IUCr site, parse and merge them, and then find any
> definitions that have (apparently) been replaced. Compared to this
> procedure, the `_audit.formalism` tag is a much simpler way for the
> datablock writer to specify to the datablock reader a particular set
> of dataname interpretations that may never change.
>
> Note that the `_audit_conform_*` mechanism is almost never used. As of
> May 28, 2016, there were 195 modulated structures in the
> Crystallographic Open Database (as determined by the presence of
> `Fourier_wave_vector` in a file). Of these, zero had an
> `_audit_conform` entry, which in theory would be required to explain
> the adjusted interpretation of the `refln` category and new datanames.
> We conclude that introduction of `_audit.formalism` should be
> accompanied by an education and outreach program as well.
>
> ### Interaction with `_audit.schema`
>
> `_audit.schema` essentially allows fixed parameters to vary. It
> is therefore orthogonal to `_audit.formalism`: a given formalism
> may have many possible schemas, and many schemas may apply to
> multiple formalisms if they share the same parameters.
>
> In other words, a suitably-written program can handle a variety of
> schemas for a single formalism without needing to change the way in
> which any dataname is calculated, whereas a program must change the
> way
> in which the redefined datanames are calculated if the formalism
> changes.
>
> # Appendix I: New core definitions
>
> ## _audit.formalism
>
> ```
> save_audit.formalism
>
> _definition.id [2] '_audit.formalism'
> _name.category_id audit
> _name.object_id formalism
> _description.text
> ;
>
> The CIF dictionaries listed in _audit.dictionary may redefine
> datanames. _audit.formalism is provided as an efficient
> alternative to parsing and checking those dictionaries. It
> identifies commonly-used sets of meanings for datanames. In
> general, each value taken by _audit.formalism is linked to a
> particular technique and/or structural approach. The
> dictionaries for the datablock (see _audit.dictionary) must be
> compatible with the value of _audit.formalism.
>
> ;
> _type.contents Text
> _type.purpose State
> _type.container Single
> _type.source Assigned
> loop_
> _enumeration_set.state
> _enumeration_set.detail
> Base 'Single crystal model from core CIF'
> Modulated 'Single crystal modulated structure'
> Magnetic 'Single crystal magnetic structure,
> potentially modulated'
> Powder 'Powder diffraction experiment'
> Twinned 'Twinned crystal using core CIF model'
> Multipole 'Single crystal model with multipole
> coefficients'
> Laue 'Laue experiment on single crystal'
> Powder-Modulated 'Powder experiment on a modulated structure'
> Powder-Magnetic 'Powder experiment on a modulated magnetic
> structure'
> Powder-Multipole 'Powder experiment modelled with multipoles'
> Laue-Magnetic 'Laue experiment on magnetic structure'
> Laue-Modulated 'Laue experiment on modulated non-magnetic
> structure'
> Laue-Multipole 'Laue experiment modelled with multipoles'
> Twinned-Magnetic 'Twinned magnetic single crystal structure'
> Twinned-Modulated 'Twinned modulated single crystal structure'
> Laue-Twinned 'Laue experiment on twinned single crystal'
> Laue-Twinned-Modulated 'Laue experiment on twinned modulated
> structure'
> Custom 'Examine dictionaries provided in
> _audit_conform'
> Local 'Locally modified dictionaries. These
> datafiles should not be distributed'
> _enumeration.default Base
> save_
> ```
>
> ## _audit.formalism_version
>
> ```
> save_audit.formalism_version
>
> _definition.id [2] '_audit.formalism_version'
> _name.category_id audit
> _name.object_id formalism_version
> _description.text
> ;
>
> The version of the given formalism (see `_audit.formalism`). The
> version
> number of a formalism is incremented when new model parameters
> are
> added that do not significantly affect the model values in
> typical cases.
>
> ;
> _type.contents Text
> _type.purpose State
> _type.container Single
> _type.source Assigned
> _enumeration.default 1.0
> save_
> ```
>
> ## _dictionary.formalism
>
> ```
> save_dictionary.formalism
>
> _definition.id [2] '_dictionary.formalism'
> _definition.class Attribute
> _definition.update 2016-12-17
> _description.text
> ;
>
> The value of this attribute is associated with the set of
> dataname meanings contained in this dictionary.
>
> ;
> _name.category_id dictionary
> _name.object_id formalism
> _type.purpose Audit
> _type.source Assigned
> _type.container Single
> _type.contents Text
>
> save_
> ```
>
> # Appendix II: a full hybrid dictionary
>
> A complete dictionary using the above mechanisms is presented below.
>
> ```
> #\#CIF_2.0
> ####################################################
> # #
> # Dictionary for modulated powder diffraction #
> # #
> ####################################################
>
> data_MODPOW
>
> _dictionary.title MODPOW
> _dictionary.formalism Powder-Modulated
> _dictionary.class Instance
> _dictionary.version 1.0
> _dictionary.date 2016-12-19
> _dictionary.ddl_conformance 3.12
> _dictionary.namespace MODPOW
> _description.text
> ;
>
> The modulated powder diffraction dictionary redefines datanames
> for use when presenting the results of a powder diffraction
> experiment using a modulated structure model. The remainder of
> the
> relevant definitions are found in the modulated structures
> dictionary and the powder diffraction dictionary.
>
> ;
>
> save_MODPOW_GROUP
>
> _definition.id [2] MODPOW_GROUP
> _definition.scope Category
> _definition.class Head
> _definition.update 2016-12-19
> _description.text
> ;
> This category is the parent category for all definitions
> in the MODPOW dictionary
> ;
>
> _name.category_id MODPOW
> _name.object_id MODPOW_GROUP
>
> # The following import reads in and reparents all powder and
> # modulated structure definitions to the MODPOW_GROUP category. As
> cif_ms is
> # read second, the refln category will have the extra modulation
> indices
> # defined.
>
> _import.get [{"file":"cif_pow.dic" "save":"PD_GROUP"
> "mode":"Full"}
> {"file":"cif_ms.dic" "save":"MS_GROUP"
> "mode":"Full"}]
>
> save_
>
> save__refln.F_complex
>
> _definition.id [2] '_refln.F_complex'
> loop_
> _alias.definition_id
> '_refln.F_complex'
> '_refln_F_complex'
> _definition.update 2016-12-19
> _description.text
> ;
> The structure factor vector for the reflection calculated from
> the modulated structure given in the datablock identified by
> _refln.phase_id
> ;
> _name.category_id refln
> _name.object_id F_complex
> _type.purpose Measurand
> _type.source Derived
> _type.container Single
> _type.contents Complex
> _enumeration.default 0.
>
> #
> # A complete dREL expression for F_complex can be provided here,
> using all of the
> # parameters provided in the powder and modulated structure
> dictionaries.
> #
> save_
>
> save__refln.F_meas
>
> _definition.id [2] '_refln.F_meas'
> loop_
> _alias.definition_id
> '_refln.F_meas'
> '_refln_F_meas'
> _definition.update 2016-12-19
> _description.text
> ;
> The structure factor amplitude for the modulated reflection based
> on
> partitioning of each observed powder diffraction intensity
> between
> contributing reflections in proportion to the model reflection
> contributions.
> ;
> _name.category_id refln
> _name.object_id F_meas
> _type.purpose Measurand
> _type.source Derived
> _type.container Single
> _type.contents Real
> _enumeration.default 0.
> #
> # A complete dREL expression for calculating F_meas from an observed
> powder diffractogram can be given here.
> #
> save_
>
> --
> T +61 (02) 9717 9907
> F +61 (02) 9717 3145
> M +61 (04) 0249 4148
>
> Links:
> ------
> [1] http://changing_meanings_discussion_paper.md
> [2] http://definition.id
> _______________________________________________
> comcifs mailing list
> comcifs at iucr.org
> http://mailman.iucr.org/cgi-bin/mailman/listinfo/comcifs
More information about the comcifs
mailing list