Theorem proving / reasoning using CIF dictionaries

James Hester jamesrhester at
Fri Dec 14 02:09:45 GMT 2018

If Joe's interest is more towards logical proofs based on 'has a' 'is a'
'contains' etc., then I'm not sure we have much to offer.  These concepts
are rather orthogonal to our approach, because we fundamentally assume the
involvement of human programmers in order to make the initial assignments
of data names based on human-readable definitions. Relationships like 'has
a', 'is a' are embedded in the discipline knowledge of the programmer and
are of little further practical use for typical scientific computational
tasks and so we haven't provided support for them.

In case I have misunderstood Joe's interests, there are some interesting
mathematical approaches related to DDLm and dREL, as follows. Assuming that
any data file has a relational representation, there is a direct one-to-one
correspondence between the relational schema describing the file and a
category (in the sense of mathematical category theory). This one-to-one
correspondence is shown and leveraged by Spivak in several publications
(e.g. Spivak, D. I. "Functorial data migration", Information and
Computation 217 (2012) 31-51). A given data file or collection of data
files represent an instance of the category.   Functors between different
ontologies (i.e. categories) can be set up to enable automatic data
migration between files whose schema are not identical. I gave a talk on
this at the recent AsCA meeting in New Zealand.

Because DDLm and dREL work within a relational model of data, DDLm
dictionaries and dREL can be used to describe and manipulate all forms of
data.  This seems not to be generally appreciated, perhaps because we do
lack the tools and visibility. Note that I have asserted above that any
data file has a relational representation, without providing a strict
mathematical proof, although I think one would be easy starting from the
point that a data file is a mapping from file position to byte. Meanwhile,
I have demonstrated the universality of DDLm and dREL by using DDLm
dictionaries and dREL to manipulate data expressed in HDF5 format
(hierarchical, binary). The code is available from Zenodo, and can be found
via the publication at

Hope that helps in some way!

On Thu, 13 Dec 2018 at 20:22, Peter Murray-Rust <pm286 at> wrote:

> Many thanks Nick,
> CIF and DRel are ahead of their time. I find this all the time with CML.
> noone gets the point.
> It's so obvious to you and us and me that published knowledge is
> computable. But so few people get it. I didn't realise when started withe
> early dictionaries (what was the precursor of CIF? ) that this was anything
> unusual. What's amazing is how few other disciplines have adopted the
> semantic dictionary idea.
> I am now starting this with Wikidata, which I think is the future of
> scientific metadata and can be computed (it's in RDF/SPARQL).
> Is there any activity in getting CIF definitions into Wikidata? All the
> PDB codes are already in.
> I think CIF.DRel is a good playground for developing theorem proving
> On Thu, Dec 13, 2018 at 3:57 AM Nick Spadaccini <
> nick.spadaccini.uwa at> wrote:
>> In our original design considerations of how to structure a dictionary
>> and define dREL we explicitly discussed the concept of a knowledge engine
>> trawling the dictionary to find hitherto undefined relations from the
>> definitions - we were blue sky, spit balling in essence but we were
>> confident eventually such a process could be manufactured.
>> The realities of mapping ideas around dREL into concrete systems means
>> you make compromises. While we tried to engineer dREL around "relations" it
>> is quite imperative in nature. Not to sure what knowledge engines consume
>> as information these days and whether it has moved on from is_a, has_a etc
>> etc semantic networks or whether programming code is trawlable, but good
>> luck - I would love to see DDL and dREL move to the next level.
>> cheers
>> Nick
>> -------------------------------
>> Perth,  WA  6009 AUSTRALIA
>> e: Nick.Spadaccini at
>> On Fri, Dec 7, 2018 at 8:53 PM Peter Murray-Rust <pm286 at> wrote:
>>> Joe Cornell is a mathematician who is interested in theorem proving
>>> starting from semantic definitions. The CIF dictionaries could be an
>>> interesting starting point where definitions and executable code are
>>> coupled. Do you know of efforts to use CIF to calculate properties or
>>> relations that are not explicitly defined in the dictionaries - possible
>>> with constraints.
>>> Hope this makes some sense!
>>> P.
>>> --
>>> Peter Murray-Rust
>>> Reader Emeritus in Molecular Informatics
>>> Unilever Centre, Dept. Of Chemistry
>>> University of Cambridge
>>> CB2 1EW, UK
>>> +44-1223-763069
> --
> Peter Murray-Rust
> Reader Emeritus in Molecular Informatics
> Unilever Centre, Dept. Of Chemistry
> University of Cambridge
> CB2 1EW, UK
> +44-1223-763069
> _______________________________________________
> comcifs mailing list
> comcifs at

T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <>

More information about the comcifs mailing list