Adding datanames covering database information

Frances C. Bernstein fcb at bernstein-plus-sons.com
Thu Jun 28 10:14:12 BST 2018


Hi James,

      I only looked at a small part of this but in the line for RCSB
the word 'STructural' should be 'Structural'.

      Hope all is well.  Will we see you in Toronto?

                       Frances

=====================================================
****                Bernstein + Sons
*   *       Information Systems Consultants
****    5 Brewster Lane, Bellport, NY 11713-2803
*   * ***
**** *            Frances C. Bernstein
   *   ***      fcb at bernstein-plus-sons.com
  ***     *
   *   *** 1-631-286-1339    FAX: 1-631-286-1999
=====================================================

On Thu, 28 Jun 2018, James Hester wrote:

> Please see below some draft definitions for a new database_related category, as foreshadowed in
> my email of April 12th. Feel free to comment. If any databases have been left off the initial
> list below, feel free to suggest additions.
> 
> Note that I have chosen not to make these datanames aliases of the DATABASE_2 datanames in
> mmCIF, as the new category has a different key.
> 
> James.
> =============================================================
> #
> # Draft definitions for a new DATABASE_RELATED category
> #
> 
> save_DATABASE_RELATED
> _definition.id          DATABASE_RELATED
> _definition.class       Loop
> _definition.scope       Category
> _definition.update      2018-06-29
> _description.text
> ;
> 
>     A category of items recording entries in databases that describe
>     the same or related data. Databases wishing to insert their own
>     canonical codes when archiving and delivering data blocks should
>     use items from the DATABASE category.
>
> ;
> _name.category_id       PUBLICATION
> _name.object_id         DATABASE_RELATED
> _category_key.name      '_database_related.id'
> save_
> 
> save_database_related.id
> _definition.id          '_database_related.id'
> _definition.update      2018-06-29
> _description.text
> ;
>        An identifier for this database reference
> ;
> _name.category_id       database_related
> _name.object_id         id
> _type.purpose           Key
> _type.source            Recorded
> _type.container         Single
> _type.contents          Text
> save_
> 
> save_database_related.database_id
> _definition.id          '_database_related.database_id'
> _definition.update      2018-06-29
> _description.text
> ;
>        An identifier for the database that contains the
>        related dataset.
> ;
> _name.category_id       database_related
> _name.object_id         database_id
> _type.purpose           State
> _type.source            Recorded
> _type.container         Single
> _type.contents          Text
> _import.get [{'save':database_list 'file':templ_enum.cif}]
> save_
> 
> save_database_related.database_code
> _definition.id          '_database_related.database_code'
> _definition.update      2018-06-29
> _description.text
> ;
>        The code used by the database referred to in
>        _database_related.database_id to identify the
>        related dataset.
> ;
> _name.category_id       database_related
> _name.object_id         database_code
> _type.purpose           Encode
> _type.source            Recorded
> _type.container         Single
> _type.contents          Text
> 
> save_
> 
> save_database_related.relation
> _definition.id          '_database_related.relation'
> _definition.update      2018-06-29
> _description.text
> ;
>        The general relationship of the data in the data block
>        to the dataset referred to in the database.
> ;
> _name.category_id       database_related
> _name.object_id         relation
> _type.purpose           State
> _type.source            Recorded
> _type.container         Single
> _type.contents          Text
> loop_
>    _enumeration_set.state
>    _enumeration_set.detail
>    Identical           'The dataset contents are identical'
>    Subset              'The dataset contents are a proper subset of the contents of the data block'
>    Superset            'The dataset contents include the contents of the data block'
>    Derived             'The dataset contents are derivable from the contents of the data block'
>    Common              'The dataset contents share a common source'
> save_
> 
> save_database_related.special_details
> _definition.id          '_database_related.special_details'
> _definition.update      2018-06-29
> _description.text
> ;
>     Information about the external dataset and relationship not encoded
>     elsewhere.
> ;
> _name.category_id       database_related
> _name.object_id         special_details
> _type.purpose           Describe
> _type.source            Recorded
> _type.container         Single
> _type.contents          Text
> 
> save_
> 
> 
> #
> # Contents to be added to templ_enum.cif listing database codes
> #
> 
> 
> save_database_list
> loop_
>     _enumeration_set.state
>     _enumeration_set.detail
>     CAS          'Chemical Abstracts'
>     COD          'Crystallographic Open Database'
>     CSD          'Cambridge Structural Database'
>     ICSD         'Inorganic Crystal Structure Database'
>     MDF          'Metals Data File'
>     NDB          'Nucleic Acid Database'
>     PDB          'Protein Data Bank'
>     PDF          'Powder Diffraction File (JCPDS/ICDD)'
>     RCSB         'Research Collaboratory for STructural Bioinformatics'
>     EBI          'European Bioinformatics Institute'
> save_
> 
> 
> On 12 April 2018 at 15:59, James Hester <jamesrhester at gmail.com> wrote:
>       Dear Core CIF users and experts,
> 
> The current core CIF provides the DATABASE and DATABASE_CODE categories for identifying a
> database entry corresponding to the structure contained in the data block, for a variety
> of pre-determined databases. These are both Set categories, that is, their datanames can
> only take a single value in a single data block. This restriction is reasonable if the
> database content for that entry is seen as coincident with the data block contents, as
> has been the case for structural databases.
> 
> However, it is possible for multiple entries from a single database to be more broadly
> relevant to the contents of a data block. For example, multiple structures may correspond
> to a single topology. So I would like you to consider the creation of a (looped)
> DATABASE_RELATED category that would simply list entry codes for databases in the same
> way as CITATION simply lists literature references. Other categories in other dictionaries
> may then reference these entries for their own uses. This is not intended to replace the
> current DATABASE categories, which would still be preferred for use by structural
> databases upon deposition and delivery of CIF files. The new category would instead align
> with the mmCIF DATABASE_2 category.
> 
> The proposed data names are as follows, with short summaries of their meanings:
> 
> _database_related.id
>     'An arbitrary identifier for this entry'
> _database_related.database_id
>     'An identifier for the database from an enumerated list
>     (e.g. CCDC, PDB, ICSD, COD ...)'
> _database_related.reference
>     'A code used by the database given in _database_related.database_id'
> _database_related.relation
>     'The way in which the database entry is related to the contents
>     of the data block, from an enumerated list. Initial suggestions
>     include "identical", "component", "derived", "common source"'
> _database_related.special_details
>     'Optional free-form description of the relationship between this
>     entry and the data block contents'
>
> An example of use in a data file would then be:
> 
> loop_
> _database_related.id
> _database_related.database_id
> _database_related.reference
> _database_related.relation
> _database_related.special_details
> 1    COD     1234       identical          'As deposited structure'
> 2    COD     6789       'common source'    'Curated version of this structure'
> 3    CCDC    qrst-12    'common source'    'Curated version of this structure'
> 4    ICSD    lll-ppp    .                  'An earlier version of the structure with missing H atoms'
> 
> Please provide your thoughts on this general scheme, and any further data names that you
> think might be useful in this context. If there are no objections, I will prepare formal
> definitions and advise this group when they are ready for inclusion.
> 
> best wishes,
> James Hester.
> --
> T +61 (02) 9717 9907
> F +61 (02) 9717 3145
> M +61 (04) 0249 4148
>
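
For anyone who wants to try the draft out, here is a minimal Python sketch
that checks the rows of a DATABASE_RELATED loop against the two enumerations
in the quoted draft (database_list and _database_related.relation). It is not
part of the draft or of any CIF library: the function name check_rows, the
dict-per-row layout, the exact case-sensitive matching and the acceptance of
the CIF null values "." and "?" for the relation are all assumptions made for
illustration.

```python
# Hypothetical helper, not part of any CIF library. It checks rows of a
# DATABASE_RELATED loop against the enumerated states in the quoted draft.

# States copied from the draft database_list and _database_related.relation.
DATABASE_IDS = {"CAS", "COD", "CSD", "ICSD", "MDF", "NDB", "PDB", "PDF", "RCSB", "EBI"}
RELATIONS = {"Identical", "Subset", "Superset", "Derived", "Common"}

# Assumption of this sketch: the CIF null values are allowed for the relation.
NULL_VALUES = {".", "?"}


def check_rows(rows):
    """Return a list of problem descriptions for the given loop rows.

    Each row is a dict keyed by the object ids of the draft datanames:
    id, database_id, database_code and relation. Matching is exact and
    case-sensitive, which is an assumption rather than a rule of the draft.
    """
    problems = []
    seen_keys = set()
    for row in rows:
        key = row["id"]
        # _database_related.id is the category key, so it must be unique.
        if key in seen_keys:
            problems.append(f"duplicate id {key!r}")
        seen_keys.add(key)
        if row["database_id"] not in DATABASE_IDS:
            problems.append(f"row {key!r}: unknown database_id {row['database_id']!r}")
        if row["relation"] not in RELATIONS | NULL_VALUES:
            problems.append(f"row {key!r}: unknown relation {row['relation']!r}")
    return problems


if __name__ == "__main__":
    example = [
        {"id": "1", "database_id": "COD", "database_code": "1234", "relation": "Identical"},
        {"id": "2", "database_id": "CCDC", "database_code": "qrst-12", "relation": "common source"},
    ]
    for problem in check_rows(example):
        print(problem)
```

Under these assumptions, a row such as "3 CCDC qrst-12 'common source'" from
the April example would be reported twice, once for the database and once for
the relation, because neither CCDC nor 'common source' appears in the June
draft lists.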

