Theorem proving / reasoning using CIF dictionaries

Peter Murray-Rust pm286 at
Fri Dec 14 09:09:29 GMT 2018

Thanks NIck and James - very useful . It was me who suggested CIF/DRel
might be a good model - perhaps it may be too data-oriented for Joe.


On Fri, Dec 14, 2018 at 2:10 AM James Hester <jamesrhester at> wrote:

> If Joe's interest is more towards logical proofs based on 'has a' 'is a'
> 'contains' etc., then I'm not sure we have much to offer.  These concepts
> are rather orthogonal to our approach, because we fundamentally assume the
> involvement of human programmers in order to make the initial assignments
> of data names based on human-readable definitions. Relationships like 'has
> a', 'is a' are embedded in the discipline knowledge of the programmer and
> are of little further practical use for typical scientific computational
> tasks and so we haven't provided support for them.
> In case I have misunderstood Joe's interests, there are some interesting
> mathematical approaches related to DDLm and dREL, as follows. Assuming that
> any data file has a relational representation, there is a direct one-to-one
> correspondence between the relational schema describing the file and a
> category (in the sense of mathematical category theory). This one-to-one
> correspondence is shown and leveraged by Spivak in several publications
> (e.g. Spivak, D. I. "Functorial data migration", Information and
> Computation 217 (2012) 31-51). A given data file or collection of data
> files represent an instance of the category.   Functors between different
> ontologies (i.e. categories) can be set up to enable automatic data
> migration between files whose schema are not identical. I gave a talk on
> this at the recent AsCA meeting in New Zealand.
> Because DDLm and dREL work within a relational model of data, DDLm
> dictionaries and dREL can be used to describe and manipulate all forms of
> data.  This seems not to be generally appreciated, perhaps because we do
> lack the tools and visibility. Note that I have asserted above that any
> data file has a relational representation, without providing a strict
> mathematical proof, although I think one would be easy starting from the
> point that a data file is a mapping from file position to byte. Meanwhile,
> I have demonstrated the universality of DDLm and dREL by using DDLm
> dictionaries and dREL to manipulate data expressed in HDF5 format
> (hierarchical, binary). The code is available from Zenodo, and can be found
> via the publication at
> Hope that helps in some way!
> James.
> On Thu, 13 Dec 2018 at 20:22, Peter Murray-Rust <pm286 at> wrote:
>> Many thanks Nick,
>> CIF and DRel are ahead of their time. I find this all the time with CML.
>> noone gets the point.
>> It's so obvious to you and us and me that published knowledge is
>> computable. But so few people get it. I didn't realise when started withe
>> early dictionaries (what was the precursor of CIF? ) that this was anything
>> unusual. What's amazing is how few other disciplines have adopted the
>> semantic dictionary idea.
>> I am now starting this with Wikidata, which I think is the future of
>> scientific metadata and can be computed (it's in RDF/SPARQL).
>> Is there any activity in getting CIF definitions into Wikidata? All the
>> PDB codes are already in.
>> I think CIF.DRel is a good playground for developing theorem proving
>> On Thu, Dec 13, 2018 at 3:57 AM Nick Spadaccini <
>> nick.spadaccini.uwa at> wrote:
>>> In our original design considerations of how to structure a dictionary
>>> and define dREL we explicitly discussed the concept of a knowledge engine
>>> trawling the dictionary to find hitherto undefined relations from the
>>> definitions - we were blue sky, spit balling in essence but we were
>>> confident eventually such a process could be manufactured.
>>> The realities of mapping ideas around dREL into concrete systems means
>>> you make compromises. While we tried to engineer dREL around "relations" it
>>> is quite imperative in nature. Not to sure what knowledge engines consume
>>> as information these days and whether it has moved on from is_a, has_a etc
>>> etc semantic networks or whether programming code is trawlable, but good
>>> luck - I would love to see DDL and dREL move to the next level.
>>> cheers
>>> Nick
>>> -------------------------------
>>> Perth,  WA  6009 AUSTRALIA
>>> e: Nick.Spadaccini at
>>> On Fri, Dec 7, 2018 at 8:53 PM Peter Murray-Rust <pm286 at>
>>> wrote:
>>>> Joe Cornell is a mathematician who is interested in theorem proving
>>>> starting from semantic definitions. The CIF dictionaries could be an
>>>> interesting starting point where definitions and executable code are
>>>> coupled. Do you know of efforts to use CIF to calculate properties or
>>>> relations that are not explicitly defined in the dictionaries - possible
>>>> with constraints.
>>>> Hope this makes some sense!
>>>> P.
>>>> --
>>>> Peter Murray-Rust
>>>> Reader Emeritus in Molecular Informatics
>>>> Unilever Centre, Dept. Of Chemistry
>>>> University of Cambridge
>>>> CB2 1EW, UK
>>>> +44-1223-763069
>> --
>> Peter Murray-Rust
>> Reader Emeritus in Molecular Informatics
>> Unilever Centre, Dept. Of Chemistry
>> University of Cambridge
>> CB2 1EW, UK
>> +44-1223-763069
>> _______________________________________________
>> comcifs mailing list
>> comcifs at
> --
> T +61 (02) 9717 9907
> F +61 (02) 9717 3145
> M +61 (04) 0249 4148
> _______________________________________________
> comcifs mailing list
> comcifs at

Peter Murray-Rust
Reader Emeritus in Molecular Informatics
Unilever Centre, Dept. Of Chemistry
University of Cambridge
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <>

More information about the comcifs mailing list