Please advise regarding a design of CIF dictionaries for material properties

Herbert J. Bernstein yaya at bernstein-plus-sons.com
Wed Sep 28 15:54:37 BST 2011


Dear Colleagues,

   The example CIF itself looks intuitive and clear, so
the difficulty, if any, seems to lie in expressing this
structure in the dictionary.  As a DDL2 dictionary,
it would be easy: just change the last underscore
in each tag to a period, define a category for
the part to the left of the period, and treat the
part to the right as a column name.  There would
be one data block and a lot of save frames, and
there is a past practice in DDL2 dictionaries of
gathering information about a lot of related tags
into a common save frame for some master parent
tag and just putting minimal information on the
individual tags in their own save frames.
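
   To make the renaming rule concrete, here is a minimal
sketch in Python.  The function name ddl1_to_ddl2 is
hypothetical and is not part of any CIF library; it only
illustrates splitting a tag at its last underscore into
a category and a column name, and assumes that column
names themselves contain no underscores.

```python
def ddl1_to_ddl2(tag):
    """Split a DDL1-style tag at its last underscore.

    Returns a tuple (ddl2_tag, category, column).
    Hypothetical helper for illustration only.
    """
    if not tag.startswith("_"):
        raise ValueError(f"not a CIF data name: {tag!r}")
    # Drop the leading underscore, then split at the LAST underscore.
    category, sep, column = tag[1:].rpartition("_")
    if not sep or not category or not column:
        raise ValueError(f"cannot split into category and column: {tag!r}")
    return f"_{category}.{column}", category, column


# Tags taken from the example CIF quoted in this message.
for tag in ("_prop_elastic_stiffness_temperature",
            "_prop_piezoelectric_d15"):
    print(ddl1_to_ddl2(tag))
```

   With this rule every tag of one loop in the example
lands in the same category (prop_elastic_stiffness or
prop_piezoelectric), so each loop maps onto one category.
A column name that contains an underscore would need
separate handling.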

   I highly recommend taking a look at the mmCIF
dictionary or the pdbx dictionary.  If that is
followed as a model, then when we do the move of
those to DDLm this new dictionary should move forward
with minimal fuss.  If we start retooling DDL1 for
what is really a DDL2 issue, I fear we would be
borrowing trouble.

   Regards,
     Herbert

=====================================================
  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

                  +1-631-244-3035
                  yaya at dowling.edu
=====================================================

On Wed, 28 Sep 2011, Saulius Grazulis wrote:

> Dear COMCIFS members,
>
> I have a question about the design of domain-specific CIF dictionaries
> and would like to ask for your advice (and please accept my apologies
> and let me know if there is a better mailing list to ask for such
> questions).
>
> I am currently participating in the design of a CIF dictionary for the
> Material Properties Open Database (MPOD) that intends to store all
> published experimentally measured crystal properties, such as elasticity
> tensors, dielectric permeability and so forth. All in all there should
> be about 50 different tensors.
>
> Each tensor can be measured at different temperatures or pressures. To
> present data conveniently, for both humans and computers, we currently
> plan to put each tensor's measurements into a separate loop. Since tag
> names may not be repeated in the same data block, we will have to
> define similar measurement condition tags for each tensor:
>
> _prop_elastic_stiffness_temperature
> _prop_piezoelectric_temperature
>
> (_prop_ is a prefix registered for MPOD in the IUCr prefix list).
>
> Now, although this is only a small overhead in CIFs, it would be
> overkill to specify all these tags separately in a dictionary. Thus, I
> would like to "contract" the definition of all
> _prop_<property>_temperature tags into one dictionary datablock:
>
> data_prop_temperature
> loop_
> _name '_prop_elastic_stiffness_temperature'
>      '_prop_piezoelectric_temperature'
>      # Other names will follow and may be added in the future releases
>      # of the dictionary
> _type             numb
> _type_conditions  esd
> _category         prop # or prop_temperature ? or prop_elastic?
> _list             both
> _description
> ;
>   Specifies measurement temperature of a property in Kelvins.
> ;
> _example
> ;
>   Please see below in this mail...
> ;
>
> Now, my questions are -- is there a problem if:
>
> a) tags of the same property are split into several loops in data CIFs?
>
> b) one dictionary data block describes names that are potentially in
> different categories (but otherwise have the same characteristics)? For
> example, would the dictionary entry above be considered correct if we
> declare _prop_elastic_stiffness_temperature to be in
> 'prop_elastic_stiffness' category, and _prop_piezoelectric_temperature
> to be in 'prop_piezoelectric' category, and still have one dictionary
> datablock to specify their properties?
>
> b') or the category is so inclusive that it describes data spanning
> several loops (like '_prop_' category in the above example)?
>
> c) data_... block name in the dictionary no longer matches tag name. I
> guess this should not be a problem... Is it?
>
> d) would it break anything if category name is not the prefix of the tag
> (e.g. declaring _prop_piezoelectric_temperature to have category
> _prop_temperature, to describe all temperature tags in one data block)?
>
> e) Any other anticipated problems?
>
> Sincerely yours,
> Saulius
>
> PS. We have toyed with two other representations, one putting all
> tensors into one loop, but they seem much worse (would require lots of
> '.' fields and would result in severely denormalised relational tables).
>
> PPS: data examples with the proposed tags:
>
>> The CIF would look like
>>
>> loop_
>> _prop_elastic_stiffness_label
>> _prop_elastic_stiffness_temperature
>> _prop_elastic_stiffness_c11
>> _prop_elastic_stiffness_c12
>> _prop_elastic_stiffness_c13
>> _prop_elastic_stiffness_c22
>> _prop_elastic_stiffness_c23
>> _prop_elastic_stiffness_c33
>> _prop_elastic_stiffness_c44
>> _prop_elastic_stiffness_c55
>> _prop_elastic_stiffness_c66
>> Copper  273  375.1  -48.5  -48.5  375.1   -48.5  375.1  101.4   101.4 101.4
>> Copper  293  375.1  -48.5  -48.5  375.1   -48.5  375.1  101.4   101.4 101.4
>> Copper  313  375.1  -48.5  -48.5  375.1   -48.5  375.1  101.4   101.4 101.4
>>
>> loop_
>> _prop_piezoelectric_label
>> _prop_piezoelectric_temperature
>> _prop_piezoelectric_frequency
>> _prop_piezoelectric_d15
>> _prop_piezoelectric_d16
>> _prop_piezoelectric_d21
>> PIN-PMN-PT 100.0 ? 2190 1022 511
>> PIN-PMN-PT 100.0 ? 2190 1022 511
>> PIN-PMN-PT 100.0 ? 2190 1022 511
>>
>> and so on.
>
> S.G.
>
> -- 
> Dr. Saulius Gražulis
> Institute of Biotechnology, Graiciuno 8
> LT-02241 Vilnius, Lietuva (Lithuania)
> fax: (+370-5)-2602116 / phone (office): (+370-5)-2602556
> mobile: (+370-684)-49802, (+370-614)-36366
> _______________________________________________
> comcifs mailing list
> comcifs at iucr.org
> http://scripts.iucr.org/mailman/listinfo/comcifs
>

