DDLm, dREL, images and NeXus

Nick Spadaccini nick at csse.uwa.edu.au
Tue Jan 20 01:07:50 GMT 2009


Here is something I sent last month regarding the discussions on NeXus. I
suspect it was blocked at the server.


On 16/12/08 5:04 PM, "Nick Spadaccini" <nick at csse.uwa.edu.au> wrote:

> I am tracking this discussion but don't have time at the moment for a long
> and considered response. I am slowly getting something together, though. I
> can see a way of doing much of what Herb suggests without making too great a
> change to the current form of DDLm/dREL, and certainly avoiding the need to
> extend DDLm to deal with various alien attributes. It has to do with making
> the use of methods in a dictionary context sensitive (really just exploiting
> the import mechanism).
> 
> What I am thinking is that, for instance, _cell_volume has an evaluation
> method which will generate its value from _cell_vector_a etc. What is
> important is all the definition information associated with _cell_volume; that
> has to be consistent. But I can import all this into another dictionary,
> where I have an overwrite of the method. In this dictionary a request for
> _cell_volume executes its method, which pokes into the DOM representation of
> an imported NeXus file and extracts its value, if it is there. It is OK if it
> isn't there, because I can take what I find back to the original dictionary and
> the method there will calculate the _cell_volume for me. I can have a method
> that takes a CIF formalised data item and injects it into a DOM representation
> ready for export OUT to NeXus. The essence is that I use imports to bring in
> the method I want, "fit for purpose". The neatness of this approach is that
> most of the dictionary is constant, consistent and correct; ONLY the methods
> change as needed.
> 
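
[Editorial sketch] The import-and-override idea in the paragraph above could
look roughly like the following in Python. This is only an illustration under
stated assumptions, not the dREL implementation: the method tables, the
function names and the NeXus element name "cell_volume" are all hypothetical,
and the fallback assumes the cell vectors are given as Cartesian triples.

```python
from xml.etree import ElementTree as ET


def cell_volume_from_vectors(data):
    """Stand-in for the standard dictionary's evaluation method (hypothetical).

    Computes |a . (b x c)| from three Cartesian cell vectors.
    """
    a = data["_cell_vector_a"]
    b = data["_cell_vector_b"]
    c = data["_cell_vector_c"]
    cross = (
        b[1] * c[2] - b[2] * c[1],
        b[2] * c[0] - b[0] * c[2],
        b[0] * c[1] - b[1] * c[0],
    )
    return abs(sum(x * y for x, y in zip(a, cross)))


# Method table of the original dictionary (assumed structure).
STANDARD_METHODS = {"_cell_volume": cell_volume_from_vectors}


def make_nexus_methods(dom_root, fallback_methods):
    """Import the standard methods, then overwrite the _cell_volume method."""

    def cell_volume(data):
        # Hypothetical element name; a real NeXus file may use another one.
        node = dom_root.find(".//cell_volume")
        if node is not None and node.text and node.text.strip():
            return float(node.text)
        # Value not in the NeXus DOM: hand back to the original method.
        return fallback_methods["_cell_volume"](data)

    methods = dict(fallback_methods)  # everything else is unchanged
    methods["_cell_volume"] = cell_volume
    return methods


# Example data (made up for illustration).
cif_data = {
    "_cell_vector_a": (2.0, 0.0, 0.0),
    "_cell_vector_b": (0.0, 3.0, 0.0),
    "_cell_vector_c": (0.0, 0.0, 4.0),
}
dom_with_value = ET.fromstring(
    "<NXentry><NXsample><cell_volume>25.5</cell_volume></NXsample></NXentry>"
)
dom_without_value = ET.fromstring("<NXentry><NXsample/></NXentry>")

found = make_nexus_methods(dom_with_value, STANDARD_METHODS)["_cell_volume"](cif_data)
computed = make_nexus_methods(dom_without_value, STANDARD_METHODS)["_cell_volume"](cif_data)
```

With the example data, `found` is taken from the DOM (25.5) and `computed`
falls back to the vector calculation (24.0). Only the method table differs
between the two dictionaries; the definitions themselves are shared.
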
> The problem now is the API. The guts of the dREL parser will do most of what
> you want. We will need to develop an extension that takes a NeXus file and
> reads it into its DOM formalism. This can be generalized and much of the Java
> (and Python) library already exists. But what about the complications of
> extending the API? Well, in the newest form of DDLm we created a new category
> called _function where all the "functions" to be used in dREL are defined.
> Since we can access all of Python in our current implementation we should be
> able to build functions that connect dREL to a DOM trawler relatively easily
> (says the man who hasn't had time to look at dREL in the last 6 months).
> 
> These are my initial thoughts. I will go and mull them over to see if I am
> making sense.
> 
> 
> On 16/12/08 11:37 AM, "James Hester" <jamesrhester at gmail.com> wrote:
> 
>> Before responding to Doug, I might comment that, although we are thinking
>> about NeXuS in particular here, we should make sure that whatever scheme we
>> come up with is generic enough to allow translations to be implemented from
>> (and to) other data description schemes (e.g. data repositories).
>> 
>> On Mon, Dec 15, 2008 at 3:22 PM, Doug <doug.duboulay at gmail.com> wrote:
>>> 
>>> On Fri, 12 Dec 2008, James Hester wrote:
>>>> > Let me flesh out a proposal in some detail, so that holes can be picked
>>>> > in it.
>>> 
>>> The fact that the NeXus data model effectively supports infinite recursion
>>> on some elements, and also that NXdata can hold many things that will not
>>> have CIF equivalents, both suggest that NeXus -> CIF conversion could be
>>> lossy.
>>> 
>>>> > First, an overall view.
>>>> >
>>>> > At the moment, a DDLm/dREL engine is initialised with a set of DDLm
>>>> > dictionaries.
>>> 
>>> To elaborate a little bit, a dictionary is compiled to jython/java byte
>>> code as a set of classes, one for each category and containing
>>> methods for get/set and evaluate.  Although DDLm goes to some
>>> effort to express a hierarchy of categories, at least in the dREL prototype
>>> engine, at the implementation level, those categories were flattened to the
>>> two-level CIF model.
>> 
>> As an aside, there are currently two alternative implementations for dealing
>> with dREL and DDLm.  One has been produced by Doug, Nick, Syd and Ian, which
>> I would characterise as 'static': a DDLm dictionary is actually converted to
>> executable code at compile time, allowing distribution to end users of an
>> executable dictionary.   One alternative approach which I have been pursuing
>> is to load the DDLm dictionary into memory at runtime and execute the dREL
>> code as needed.  In either case I think my abstract description above gives
>> the essential gist of what happens.
>> 
>> The nice part about the CIF + DDL way of working is that no particular
>> implementation is mandated, but the correct behaviour is specified.  I think
>> this is why Herbert would prefer to see as much of the NeXuS to CIF
>> conversion logic in a DDLm/dREL form.
>>> 
>>>> > When passed a CIF instance, it will return values of any
>>>> > datanames that are contained in the CIF instance or that it is capable of
>>>> > calculating from those datanames that are already in the instance.
>>> 
>>> An instance of the 2-level dictionary object is created and then populated
>>> with the raw CIF data. Any items for which a "?" was recorded against them
>>> are subsequently evaluated where possible.
>>> 
>>> Thereafter, to print the CIF, the 2-level dictionary object is
>>> walked/visited and CIF tag/values are written to some output device.
>>> To generate hierarchical NeXus from CIF, the dREL engine would have to be
>>> reworked, if it hasn't been already.
>> 
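
[Editorial sketch] The populate / evaluate "?" / walk-and-print cycle Doug
describes might be pictured as below. This is a simplification under stated
assumptions: a flat Python mapping stands in for the two-level category/item
object, the `methods` table and all function names are hypothetical, and the
volume method shown is only valid for an orthogonal cell.

```python
def evaluate(name, items, methods, _active=None):
    """Return the value of `name`, computing it if it is missing or '?'."""
    _active = set() if _active is None else _active
    value = items.get(name, "?")
    if value != "?":
        return value
    if name not in methods or name in _active:
        return "?"  # no method available, or a circular dependency
    _active.add(name)
    dependencies, func = methods[name]
    args = [evaluate(dep, items, methods, _active) for dep in dependencies]
    _active.discard(name)
    if any(isinstance(arg, str) and arg == "?" for arg in args):
        return "?"  # a dependency could not be resolved
    items[name] = func(*args)
    return items[name]


def fill_unknowns(items, methods):
    """Evaluate, where possible, every item recorded as '?'."""
    for name in [n for n, v in items.items() if v == "?"]:
        evaluate(name, items, methods)
    return items


def write_cif(items, block="example"):
    """Walk the populated object and emit CIF tag/value lines."""
    lines = ["data_" + block]
    lines += ["%s %s" % (name, value) for name, value in items.items()]
    return "\n".join(lines) + "\n"


# Raw CIF data as read from a file (made up for illustration).
raw_items = {
    "_cell.length_a": 2.0,
    "_cell.length_b": 3.0,
    "_cell.length_c": 4.0,
    "_cell.volume": "?",
    "_exptl_crystal.density_diffrn": "?",
}
# Hypothetical method table: dataname -> (dependencies, function).
example_methods = {
    "_cell.volume": (
        ("_cell.length_a", "_cell.length_b", "_cell.length_c"),
        lambda a, b, c: a * b * c,  # orthogonal cell only
    ),
}

filled = fill_unknowns(dict(raw_items), example_methods)
cif_text = write_cif(filled)
```

Here `_cell.volume` is filled in, while `_exptl_crystal.density_diffrn` stays
"?" because no method is available for it.
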
>> To be honest, I was tackling only the 'from NeXuS to CIF' issues at this
>> stage, as they are the most difficult.
>>> 
>>>> > Now,
>>>> > what I envision as a 'translating' DDLm engine is initialised as before
>>>> > with the standard DDLm dictionaries, but also with two further
>>>> > dictionaries: a 'NeXuS dictionary' and a 'translation dictionary'
>>>> > (contents of these explained later).
>>> 
>>> Those two dictionaries would currently be precompiled and created as above.
>>> I suspect the current dREL interpreter cannot understand more than one
>>> dictionary simultaneously. Concatenation of dictionaries at the compilation
>>> stage might be possible, but probably isn't what you want, because that
>>> would likely embed NeXus names and values in the resulting CIF.
>> 
>> My understanding is that the implementation of which you speak only fills in
>> the question marks in the supplied CIF file: so presumably any NeXuS-specific
>> names would not be output.
>> 
>>> > Finally, it requires a 'NeXuS plugin'. Now,
>>> > when passed a CIF instance the DDLm engine works as before.  When passed a
>>> > NeXuS instance, it returns values of CIF datanames that it can calculate.
>>> >
>>> > Now for an explanation of these various extra bits.
>>> >
>>> > 1.  The 'NeXuS' dictionary is just another DDLm dictionary.  It contains
>>> > definitions for datanames using a CIF namespace: e.g. _nexus.slit_height.
>>> > The linkage to a NeXuS file is accomplished using a set of new DDLm
>>> > attributes, which work like the current 'xref' attributes: in the header
>>> > section of this 'NeXuS' dictionary file the various versions of the NeXuS
>>> > standard are assigned a short code in a loop.  Each of the definitions in
>>> > the body of the dictionary then contains two new DDLm attributes:
>>> > _alien.code (referencing the version of the standard in the header) and
>>> > _alien.location (where to find the dataname).  The syntax of the value of
>>> > _alien.location might be borrowed from, for example, XPath in the case of
>>> > NeXuS.
>> 
>>> XPath can provide a mechanism to locate items in an XML document tree,
>>> but it doesn't provide a mechanism to specify/generate the structure of that
>>> tree.  e.g. //NXdata/@name  might get a nodeset corresponding to a list of
>>> name attribute nodes for potential use as CIF tags, but says nothing about
>>> the location of the NXdata elements.
>>> i.e. this is helpful for NeXus -> CIF, but not for CIF -> NeXus
>> 
>> Yes, I was only aiming to solve the NeXuS -> CIF problem.
>>> 
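
[Editorial sketch] To make the proposed `_alien.code` / `_alien.location`
attributes concrete, here is one way a plugin might resolve them against a
NeXus XML tree using only the Python standard library. Everything named here
is an assumption for illustration: the version code "nx1", the datanames other
than `_nexus.slit_height`, the XML layout, and the `fetch_alien` helper. Note
that `xml.etree.ElementTree` supports only a subset of XPath and cannot select
attribute nodes directly, so a trailing `/@name` step is split off by hand.

```python
from xml.etree import ElementTree as ET

# Header loop of the 'NeXuS' dictionary: short codes for versions of the
# standard (hypothetical label).
ALIEN_STANDARDS = {"nx1": "NeXus (hypothetical version label)"}

# Body of the dictionary: each definition carries the two proposed attributes.
NEXUS_DEFINITIONS = {
    "_nexus.slit_height": {
        "_alien.code": "nx1",
        "_alien.location": ".//NXslit/height",
    },
    "_nexus.data_name": {
        "_alien.code": "nx1",
        "_alien.location": ".//NXdata/@name",
    },
}


def fetch_alien(dataname, root):
    """Return the list of values found at the dataname's _alien.location."""
    definition = NEXUS_DEFINITIONS[dataname]
    if definition["_alien.code"] not in ALIEN_STANDARDS:
        raise KeyError("unknown standard code for " + dataname)
    # Split a trailing attribute step, which ElementTree cannot evaluate.
    path, _, attribute = definition["_alien.location"].partition("/@")
    nodes = root.findall(path)
    if attribute:
        return [n.get(attribute) for n in nodes if n.get(attribute) is not None]
    return [n.text.strip() for n in nodes if n.text and n.text.strip()]


# A tiny made-up NeXus-like XML document.
nexus_root = ET.fromstring(
    "<NXroot><NXentry name='entry1'>"
    "<NXinstrument name='inst'><NXslit name='s1'><height>0.5</height></NXslit>"
    "</NXinstrument>"
    "<NXdata name='data1'/><NXdata name='data2'/>"
    "</NXentry></NXroot>"
)

slit_heights = fetch_alien("_nexus.slit_height", nexus_root)
data_names = fetch_alien("_nexus.data_name", nexus_root)
```

As Doug points out, a location expression of this kind only says where to
read a value from; it carries no information about how to build the tree when
going from CIF to NeXus.
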
>>>> > The data definitions containing _alien.location attributes could be
>>>> > considered 'raw' NeXuS data, which may not map easily onto CIF datanames.
>>>> > Therefore, this dictionary could contain further DDLm definitions of
>>>> > dataitems (still in the CIF 'nexus' namespace) which contained dREL
>>>> > methods for manipulating the raw datanames into something that mapped
>>>> > more directly into CIF.  This is where one might foresee adding a few
>>>> > more builtin functions to dREL to ease e.g. image processing.
>>> 
>>> I get the feeling that somewhere there will need to be a list that says
>>> something like:
>>> nexus4:some_cat.some_item1   ?
>>> nexus4:some_cat.some_item2   ?
>>> ...
>>> - in order to trigger the evaluations. Though maybe they would be deduced
>>> by a CIF full of "?" on the request side.
>> 
>> The idea would be to trigger all the evaluations as usual, and because you
>> have loaded in the 'translate' DDLm dictionary over the top of the normal
>> dictionary, at some stage the evaluation chain will access NeXuS-derived
>> values instead of primitive values.
>> 
>> [example from previous email deleted]
>>> 
>>> 
>>> Just as an alternative:
>>> 
>>> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>>> <xsl:output method="text"/>
>>> <xsl:template match="NXuser">
>>>    <xsl:if test="position()= 1">  <!-- if multiple NXuser elements -->
>>>      <xsl:text>loop_&#xA;</xsl:text>   <!--  append newline char in hex -->
>>>      <xsl:text>          _audit_author.name&#xA;</xsl:text>
>>>      <xsl:text>          _audit_author.affiliation&#xA;</xsl:text>
>>>      <xsl:text>          _audit_author.phone&#xA;</xsl:text>
>>>      <xsl:text>          _audit_author.fax&#xA;</xsl:text>
>>>      <xsl:text>          _audit_author.email&#xA;</xsl:text>
>>>    </xsl:if>
>>> 
>>>    <xsl:for-each select="./name">
>>>       <xsl:variable name="audit_author" select="parent::node()"/>
>>>         <xsl:call-template name="dumpItem">
>>>            <xsl:with-param name="item" select="."/><!--i.e. name -->
>>>         </xsl:call-template>
>>>         <xsl:call-template name="dumpItem">
>>>            <xsl:with-param name="item" select="$audit_author/affiliation"/>
>>>         </xsl:call-template>
>>>         <xsl:call-template name="dumpItem">
>>>            <xsl:with-param name="item" select="$audit_author/telephone_number"/>
>>>         </xsl:call-template>
>>>         <xsl:call-template name="dumpItem">
>>>            <xsl:with-param name="item" select="$audit_author/fax_number"/>
>>>         </xsl:call-template>
>>>         <xsl:call-template name="dumpItem">
>>>            <xsl:with-param name="item" select="$audit_author/email"/>
>>>         </xsl:call-template>
>>>       <xsl:text>&#xA;</xsl:text>
>>>    </xsl:for-each>
>>> </xsl:template>
>>> 
>>> <xsl:template name="dumpItem">
>>>   <xsl:param name="item"/>
>>>   <xsl:text> </xsl:text>
>>>   <xsl:choose>
>>>     <xsl:when test="$item !=''">
>>>         <!-- add space and parentheses checks here -->
>>>        <xsl:value-of select="$item"/>
>>>     </xsl:when>
>>>     <xsl:otherwise>
>>>        <xsl:text>.</xsl:text>
>>>     </xsl:otherwise>
>>>   </xsl:choose>
>>> </xsl:template>
>>> </xsl:stylesheet>
>>> 
>>> 
>>> - a simple (untested) XSLT stylesheet, usable by a significant number of
>>> current XSLT processing engines that could transform NeXus/NXuser data in
>>> XML format directly into CIF. Some XSLT engines provide extension options
>>> for doing more complicated transformations when needed. NeXus HDF would need
>>> transformation to XML first. A separate stylesheet would need to be defined
>>> to do the reverse transformation, assuming that the CIF was first converted
>>> to some XML format.
>>> 
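
[Editorial sketch] For readers without an XSLT processor to hand, roughly the
same NXuser-to-CIF mapping can be written with the Python standard library.
This is an illustrative equivalent, not Doug's code: the function names are
hypothetical, the child element names are taken from the stylesheet above,
CIF datanames are written with their leading underscore, and the quoting rule
is deliberately minimal (it does not handle every CIF special case).

```python
from xml.etree import ElementTree as ET

# (CIF dataname, NXuser child element) pairs, following the stylesheet above.
FIELDS = [
    ("_audit_author.name", "name"),
    ("_audit_author.affiliation", "affiliation"),
    ("_audit_author.phone", "telephone_number"),
    ("_audit_author.fax", "fax_number"),
    ("_audit_author.email", "email"),
]


def cif_value(text):
    """Format one value: '.' if absent, quoted if it contains whitespace."""
    if text is None or not text.strip():
        return "."
    text = text.strip()
    if any(ch.isspace() for ch in text):
        quote = '"' if "'" in text else "'"
        return quote + text + quote
    return text


def nxuser_to_cif(root):
    """Emit one CIF loop with a row per NXuser element found under `root`."""
    users = list(root.iter("NXuser"))
    if not users:
        return ""
    lines = ["loop_"] + ["  " + tag for tag, _ in FIELDS]
    for user in users:
        lines.append(" ".join(cif_value(user.findtext(child)) for _, child in FIELDS))
    return "\n".join(lines) + "\n"


# Made-up input for illustration.
example_root = ET.fromstring(
    "<NXroot><NXentry>"
    "<NXuser><name>A. Author</name><affiliation>UWA</affiliation>"
    "<email>a@example.org</email></NXuser>"
    "<NXuser><name>B</name></NXuser>"
    "</NXentry></NXroot>"
)
author_loop = nxuser_to_cif(example_root)
```

Missing children come out as "." exactly as in the `dumpItem` template; any
manipulation of the values themselves would still need to happen elsewhere.
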
>>> (not that XSLT is really what I would be looking for in a "mapping" file,
>>> but it's good to be aware of other possibilities - but maybe there is already
>>> a CIF->CML->NeXus converter and vice versa?)
>> 
>> This is an intriguing example and I think if the actual values themselves
>> don't need manipulation it would do a good job.  Perhaps the initial
>> transformation to what I called previously a 'raw NeXuS' CIF could be best
>> done by XSLT, using the conventions of that program to do the
>> renormalisation.   Manipulations of data values could then be done by dREL
>> routines in a 'translate' dictionary.  There is however an important
>> practical limitation of this scheme, which is that trying to deal with XML
>> files that have images in them is ridiculously slow even with current desktop
>> processing power (that is our experience at the Bragg, anyway).
>> 
>> Also, Nick S. tells me that back in the late 90s he produced an XSLT-based
>> transformation from CIF and DDL to XML, and was able to use standard XML
>> tools to validate the CIF-derived XML file against the DDL-derived XML
>> schema.  Maybe the time has come for this tool to be dusted off.
>> 
>> James.
> 
> cheers
> 
> Nick
> 
> --------------------------------
> Dr N. Spadaccini
> School of Computer Science & Software Engineering
> 
> The University of Western Australia    t: +(61 8) 6488 3452
> 35 Stirling Highway                    f: +(61 8) 6488 1089
> CRAWLEY, Perth,  WA  6009 AUSTRALIA   w3: www.csse.uwa.edu.au/~nick
> MBDP  M002
> 
> CRICOS Provider Code: 00126G
> 
> e: Nick.Spadaccini at uwa.edu.au
> 
> 
