Updating list of _audit.schema

James Hester jamesrhester at gmail.com
Thu Jan 7 01:21:40 GMT 2021


Herbert - are you arguing that imgCIF and mmCIF should not be assigned
different schema names? If your comments are not about that, feel free to
ignore the following.

If you scrutinize the definition in mmCIF of _entry.id (
https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Items/_entry.id.html),
you will see that it "identifies the data block", so it is restricted
to a single value in a single data block. It follows that all the child
data items of _entry.id are restricted to single values, so where these
child items are the sole keys of their categories, those categories become
single-row categories. Such categories are functionally equivalent
to DDLm Set categories, so it would be possible to list which Set
categories in core CIF are multi-row in mmCIF, satisfying the criteria for
a schema. Frankly I was a bit too lazy to write the code to determine this,
but from memory it is only diffrn and exptl_crystal. If there are
objections to the label "macromolecular" we can change it to "multi-crystal
multi-wavelength" to avoid any implications or restrictions on mmCIF.
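
The check described above could be sketched roughly as follows. This is
only an illustration in Python: the key lists and parent links are a small
hand-written extract rather than something parsed from the real
dictionaries, the list of core Set categories is likewise assumed, and the
function names are made up for the example.

```python
# Illustrative extract only (assumed, not parsed from the real dictionaries):
# category -> list of key data names in mmCIF.
MMCIF_KEYS = {
    "entry": ["_entry.id"],
    "cell": ["_cell.entry_id"],
    "exptl": ["_exptl.entry_id", "_exptl.method"],
    "diffrn": ["_diffrn.id"],
    "exptl_crystal": ["_exptl_crystal.id"],
}

# Illustrative parent links: child data name -> parent data name.
PARENT_OF = {
    "_cell.entry_id": "_entry.id",
    "_exptl.entry_id": "_entry.id",
}

# Assumed subset of categories that are Set categories in core CIF.
CORE_SET_CATEGORIES = {"cell", "diffrn", "exptl_crystal"}


def single_row_categories(category_keys, parent_of, block_id="_entry.id"):
    """Categories whose sole key is block_id or a descendant of it."""

    def descends_from_block_id(name):
        seen = set()  # guards against accidental cycles in the parent links
        while name is not None and name not in seen:
            if name == block_id:
                return True
            seen.add(name)
            name = parent_of.get(name)
        return False

    return {
        cat
        for cat, keys in category_keys.items()
        if len(keys) == 1 and descends_from_block_id(keys[0])
    }


def multi_row_in_mmcif(core_set_categories, category_keys, parent_of):
    """Core Set categories that are not single-row in the mmCIF extract."""
    single = single_row_categories(category_keys, parent_of)
    return sorted(
        cat
        for cat in core_set_categories
        if cat in category_keys and cat not in single
    )


print(multi_row_in_mmcif(CORE_SET_CATEGORIES, MMCIF_KEYS, PARENT_OF))
# prints ['diffrn', 'exptl_crystal']
```

With this toy extract the result matches what I recalled from memory, but
that only reflects the hand-picked input; a real answer needs the full
dictionaries.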

Although imgCIF does not have any categories that have child data names of
_entry.id (so every imgCIF category can have multiple rows), it does add new
key data names to a few mmCIF categories, thereby creating a distinct
"_audit.schema". I don't think that is a controversial statement. For
example, the Diffrn_Detector category in imgCIF has key data names
"_diffrn_detector.diffrn_id" as well as "_diffrn_detector.id", whereas
mmCIF has only the former (as per the text at bottom of p203 of Vol G).
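
The comparison behind that statement can be sketched as a simple
difference of key lists. Again this is only an illustration in Python:
the two mappings contain just the Diffrn_Detector example quoted above,
and `categories_with_added_keys` is a made-up helper name, not part of
any CIF library.

```python
# Key data names per category, restricted to the one example quoted above.
MMCIF_KEYS = {
    "diffrn_detector": ["_diffrn_detector.diffrn_id"],
}
IMGCIF_KEYS = {
    "diffrn_detector": ["_diffrn_detector.diffrn_id", "_diffrn_detector.id"],
}


def categories_with_added_keys(base_keys, extended_keys):
    """Map each shared category to the key data names added by the extension."""
    added = {}
    for category, keys in extended_keys.items():
        if category not in base_keys:
            continue  # a brand-new category, not an altered one
        extra = sorted(set(keys) - set(base_keys[category]))
        if extra:
            added[category] = extra
    return added


print(categories_with_added_keys(MMCIF_KEYS, IMGCIF_KEYS))
# prints {'diffrn_detector': ['_diffrn_detector.id']}
```

Any category that shows up in the result has a different key structure
in the two dictionaries, which is what would justify a separate
_audit.schema value.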

all the best,
James.


On Thu, 7 Jan 2021 at 10:01, Herbert J. Bernstein <yayahjb at gmail.com> wrote:

> I believe both imgCIF and mmCIF only use loop categories and any set
> categories picked up for inclusion with their datasets will need to have
> keys added and be mapped into loop categories.  That is certainly the case
> for imgCIF -- Herbert
>
> On Wed, Jan 6, 2021 at 5:06 PM James Hester <jamesrhester at gmail.com>
> wrote:
>
>> Apologies for the lax terminology. By "looped" I mean "able to have more
>> than one row in a loop". Perhaps the explanations should be rewritten to
>> use 'Loop category' and 'Set category' rigorously?
>>
>> On Thu, 7 Jan 2021 at 03:07, Herbert J. Bernstein <yayahjb at gmail.com>
>> wrote:
>>
>>>  In imgCIF (as with mmCIF) any and all categories may be looped -- it's
>>> how you put information into database tables.  - Herbert
>>>
>>> On Wed, Jan 6, 2021 at 1:35 AM James Hester via comcifs <
>>> comcifs at iucr.org> wrote:
>>>
>>>> Dear COMCIFS,
>>>>
>>>> First of all, Happy New Year to you all. I hope you've all been keeping
>>>> well.
>>>>
>>>> I am writing to propose updating the list of _audit.schema in the core
>>>> dictionary. Normally this would be core DMG business, but as it concerns
>>>> most dictionaries covered by COMCIFS I believe this is the more appropriate
>>>> forum. This has been prompted by reviewing the DDLm dictionary chapters for
>>>> the next edition of Volume G. Please examine the list below and discuss any
>>>> changes you would like to see.  The formal changes to the dictionary can be
>>>> viewed as a diff at this link:
>>>> https://github.com/COMCIFS/cif_core/pull/190/commits/5e3b84e6f84997f9822f704a9f380ff500e0410e
>>>>
>>>> As a reminder, the _audit.schema dataname indicates that one or more
>>>> categories have become looped relative to the core CIF dictionary. For
>>>> example, where multiple crystals are used in a measurement, the
>>>> exptl_crystal category becomes looped. Ideally software will check this
>>>> dataname and exit if the dataname has an incompatible value.
>>>>
>>>> best wishes,
>>>> James.
>>>>
>>>> =====================================================
>>>> loop_
>>>> _enumeration_set.state
>>>> _enumeration_set.detail
>>>>     Base                'Original Core CIF schema'
>>>>    'Space group tables' 'space_group category is looped'
>>>>     Entry
>>>> ;
>>>>     entry category is defined and looped: multiple experiments
>>>>     with results may be present
>>>> ;
>>>>     Powder              'Multiple compounds (phases) may be present'
>>>>     Modulated           'Multiple subsystems may be present'
>>>>     Experiments
>>>> ;
>>>>     diffrn and exptl_crystal categories are looped: multiple
>>>>     diffraction measurements on multiple samples may be present
>>>> ;
>>>>     Macromolecular
>>>> ;
>>>>     mmCIF equivalent. Only single-key mmCIF categories containing
>>>>     children of _entry.id are Set categories
>>>> ;
>>>>     Raw
>>>> ;
>>>>     imgCIF equivalent. As for Macromolecular, with the addition of
>>>>     multiple detectors.
>>>> ;
>>>>     Laue
>>>> ;
>>>>     diffrn_radiation is looped: Multiple wavelengths are used.
>>>> ;
>>>>     Custom              'Examine dictionaries provided in _audit_conform'
>>>>     Local               'Locally modified dictionaries. Datafile not for distribution'
>>>> _enumeration.default    Base
>>>> =======================
>>>> --
>>>> T +61 (02) 9717 9907
>>>> F +61 (02) 9717 3145
>>>> M +61 (04) 0249 4148
>>>> _______________________________________________
>>>> comcifs mailing list
>>>> comcifs at iucr.org
>>>> http://mailman.iucr.org/cgi-bin/mailman/listinfo/comcifs
>>>>
>>>
>>
>
