DDLm, dREL, images and NeXus
Nick Spadaccini
nick at csse.uwa.edu.au
Tue Jan 20 01:07:50 GMT 2009
Here is something I sent last month regarding the discussions on NeXus. I
suspect it was blocked at the server.
On 16/12/08 5:04 PM, "Nick Spadaccini" <nick at csse.uwa.edu.au> wrote:
> I am tracking this discussion but don't have time at the moment for a long
> and considered response. I am slowly getting something together though. I
> can see a way of doing much of what Herb suggests without making too great a
> change to the current form of DDLm/dREL and certainly avoiding the need to
> extend DDLm to deal with various alien attributes. It has to do with making
> the use of methods in a dictionary context sensitive (really just exploiting
> the import mechanism).
>
> What I am thinking is that, for instance, _cell_volume has an evaluation
> method which will generate its value from _cell_vector_a etc. What is
> important is that all the definition information associated with
> _cell_volume stays consistent. But I can import all this into another
> dictionary, where I overwrite the method. In this dictionary a request for
> _cell_volume executes its method, which pokes into the DOM representation of
> an imported NeXus file and extracts its value, if it is there. It is OK if it
> isn't there, because I can take what I find back to the original dictionary
> and the method there will calculate the _cell_volume for me. I can have a
> method that takes a CIF formalised data item and injects it into a DOM
> representation ready for export OUT to NeXus. The essence is that I use
> imports to bring in the method I want, "fit for purpose". The neatness of
> this approach is that most of the dictionary is constant, consistent and
> correct; ONLY the methods change as needed.
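
A minimal Python sketch of the import-and-override idea in the paragraph above, under stated assumptions: the class names, the cell_volume method and the cell/volume element path are hypothetical and are not taken from DDLm, dREL or the NeXus standard. The base class stands in for the original dictionary and calculates _cell_volume from the cell vectors. The importing class changes only the method: it looks in a DOM first and hands back to the original method when the value is absent.

```python
import xml.etree.ElementTree as ET


class CoreDictionary:
    """Stands in for the original dictionary: definitions plus methods."""

    def __init__(self, data):
        self.data = data  # dataname -> value, as read from a CIF

    def cell_volume(self):
        # Scalar triple product a . (b x c) of the three cell vectors.
        a = self.data["_cell_vector_a"]
        b = self.data["_cell_vector_b"]
        c = self.data["_cell_vector_c"]
        cross = (
            b[1] * c[2] - b[2] * c[1],
            b[2] * c[0] - b[0] * c[2],
            b[0] * c[1] - b[1] * c[0],
        )
        return abs(sum(x * y for x, y in zip(a, cross)))


class NexusDictionary(CoreDictionary):
    """Imports everything from the core dictionary; only the method changes."""

    def __init__(self, data, nexus_root):
        super().__init__(data)
        self.nexus_root = nexus_root

    def cell_volume(self):
        node = self.nexus_root.find("./cell/volume")  # hypothetical path
        if node is not None and node.text:
            return float(node.text)
        # Not in the NeXus tree: hand back to the original method.
        return super().cell_volume()


cif_data = {
    "_cell_vector_a": (2.0, 0.0, 0.0),
    "_cell_vector_b": (0.0, 3.0, 0.0),
    "_cell_vector_c": (0.0, 0.0, 4.0),
}
# Invented XML fragments standing in for an imported NeXus file.
with_volume = ET.fromstring("<entry><cell><volume>25.5</volume></cell></entry>")
without_volume = ET.fromstring("<entry><cell/></entry>")

volume_from_nexus = NexusDictionary(cif_data, with_volume).cell_volume()
volume_calculated = NexusDictionary(cif_data, without_volume).cell_volume()
```

With the first fragment the value comes straight from the DOM; with the second the inherited method computes it from the vectors, so the definitions never have to be duplicated.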
>
> The problem now is the API. The guts of the dREL parser will do most of what
> you want. We will need to develop an extension that takes a NeXus file and
> reads it into its DOM formalism. This can be generalized and much of the Java
> (and Python) library already exists. But what about the complications of
> extending the API? Well, in the newest form of DDLm we created a new category
> called _function where all the "functions" to be used in dREL are defined.
> Since we can access all of Python in our current implementation we should be
> able to build functions that connect dREL to a DOM trawler relatively easily
> (says the man who hasn't had time to look at dREL in the last 6 months).
>
> These are my initial thoughts. I will go and mull them over to see if I am
> making sense.
>
>
> On 16/12/08 11:37 AM, "James Hester" <jamesrhester at gmail.com> wrote:
>
>> Before responding to Doug, I might comment that, although we are thinking
>> about NeXuS in particular here, we should make sure that whatever scheme we
>> come up with is generic enough to allow translations to be implemented from
>> (and to) other data description schemes (e.g. data repositories).
>>
>> On Mon, Dec 15, 2008 at 3:22 PM, Doug <doug.duboulay at gmail.com> wrote:
>>>
>>> On Fri, 12 Dec 2008, James Hester wrote:
>>>> > Let me flesh out a proposal in some detail, so that holes can be picked
>>>> > in it.
>>>
>>> The fact that the NeXus data model effectively supports infinite recursion
>>> on some elements and also that NXdata can hold many things that will not
>>> have CIF equivalents both suggest that NeXus -> CIF conversion could be
>>> lossy.
>>>
>>>> > First, an overall view.
>>>> >
>>>> > At the moment, a DDLm/dREL engine is initialised with a set of DDLm
>>>> > dictionaries.
>>>
>>> To elaborate a little bit, a dictionary is compiled to jython/java byte
>>> code as a set of classes, one for each category and containing
>>> methods for get/set and evaluate. Although DDLm goes to some
>>> effort to express a hierarchy of categories, at least in the dREL prototype
>>> engine, at the implementation level, those categories were flattened to the
>>> two-level CIF model.
>>
>> As an aside, there are currently two alternative implementations for dealing
>> with dREL and DDLm. One has been produced by Doug, Nick, Syd and Ian, which
>> I would characterise as 'static': a DDLm dictionary is actually converted to
>> executable code at compile time, allowing distribution to end users of an
>> executable dictionary. One alternative approach which I have been pursuing
>> is to load the DDLm dictionary into memory at runtime and execute the dREL
>> code as needed. In either case I think my abstract description above gives
>> the essential gist of what happens.
>>
>> The nice part about the CIF + DDL way of working is that no particular
>> implementation is mandated, but the correct behaviour is specified. I think
>> this is why Herbert would prefer to see as much of the NeXuS to CIF
>> conversion logic in a DDLm/dREL form.
>>>
>>>> > When passed a CIF instance, it will return values of any
>>>> > datanames that are contained in the CIF instance or that it is capable of
>>>> > calculating from those datanames that are already in the instance.
>>>
>>> An instance of the 2-level dictionary object is created and then populated
>>> with the raw CIF data. Any items for which a "?" was recorded against them
>>> are subsequently evaluated where possible.
>>>
>>> Thereafter, to print the CIF, the 2-level dictionary object is
>>> walked/visited and CIF tag/values are written to some output device.
>>> To generate hierarchical NeXus from CIF, the dREL engine would have to be
>>> reworked, if it hasn't been already.
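
The populate, evaluate and walk cycle described above can be sketched in a few lines of Python. This is an illustration only, not the prototype engine: the function names and the method table are hypothetical, the volume method assumes an orthogonal cell, and a single pass is made, so chains of dependent "?" items are not resolved.

```python
def evaluate_unknowns(items, methods):
    """Fill in every item recorded as "?" for which a method can run."""
    for name, value in list(items.items()):
        if value == "?" and name in methods:
            try:
                items[name] = methods[name](items)
            except (KeyError, ValueError):
                pass  # inputs missing or unusable: leave the "?"
    return items


def write_cif(items, block_name="example"):
    """Walk the flat dataname/value table and emit CIF tag/value lines."""
    lines = ["data_" + block_name]
    for name, value in items.items():
        lines.append(f"{name} {value}")
    return "\n".join(lines)


def orthogonal_cell_volume(d):
    # Hypothetical evaluation method; valid for orthogonal cells only.
    return (
        float(d["_cell.length_a"])
        * float(d["_cell.length_b"])
        * float(d["_cell.length_c"])
    )


# Hypothetical method table: one evaluation method per dataname.
methods = {"_cell.volume": orthogonal_cell_volume}

raw = {
    "_cell.length_a": "2.0",
    "_cell.length_b": "3.0",
    "_cell.length_c": "4.0",
    "_cell.volume": "?",
    "_cell.formula_units_Z": "?",
}
result = evaluate_unknowns(dict(raw), methods)
cif_text = write_cif(result)
```

Items with no method, such as _cell.formula_units_Z here, keep their "?" and are written out unchanged.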
>>
>> To be honest, I was tackling only the 'from NeXuS to CIF' issues at this
>> stage, as they are the most difficult.
>>>
>>>> > Now,
>>>> > what I envision as a 'translating' DDLm engine is initialised as before
>>>> > with the standard DDLm dictionaries, but also with two further
>>>> > dictionaries: a 'NeXuS dictionary' and a 'translation dictionary'
>>>> > (contents of these explained later).
>>>
>>> Those two dictionaries would currently be precompiled and created as above.
>>> I suspect the current dREL interpreter cannot understand more than one
>>> dictionary simultaneously. Concatenation of dictionaries at the compilation
>>> stage might be possible, but probably isn't what you want, because that
>>> would likely embed NeXus names and values in the resulting CIF.
>>
>> My understanding is that the implementation of which you speak only fills in
>> the question marks in the supplied CIF file: so presumably any NeXuS-specific
>> names would not be output.
>>
>>>> > Finally, it requires a 'NeXuS plugin'. Now,
>>>> > when passed a CIF instance the DDLm engine works as before. When passed a
>>>> > NeXuS instance, it returns values of CIF datanames that it can calculate.
>>>> >
>>>> > Now for an explanation of these various extra bits.
>>>> >
>>>> > 1. The 'NeXuS' dictionary is just another DDLm dictionary. It contains
>>>> > definitions for datanames using a CIF namespace: e.g. _nexus.slit_height.
>>>> > The linkage to a NeXuS file is accomplished using a set of new DDLm
>>>> > attributes, which work like the current 'xref' attributes: in the header
>>>> > section of this 'NeXuS' dictionary file the various versions of the NeXuS
>>>> > standard are assigned a short code in a loop. Each of the definitions in
>>>> > the body of the dictionary then contains two new DDLm attributes:
>>>> > _alien.code (referencing the version of the standard in the header) and
>>>> > _alien.location (where to find the dataname). The syntax of the value of
>>>> > _alien.location might be borrowed from, for example, XPath in the case of
>>>> > NeXuS.
>>>
>>> XPath can provide a mechanism to locate items in an XML document tree,
>>> but it doesn't provide a mechanism to specify/generate the structure of that
>>> tree. e.g. //NXdata/@name might get a nodeset corresponding to a list of
>>> name attribute nodes for potential use as CIF tags, but says nothing about
>>> the location of the NXdata elements.
>>> i.e. this is helpful for NeXus -> CIF, but not for CIF -> NeXus
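
To make the XPath point concrete, here is a small sketch using Python's standard xml.etree.ElementTree, which supports only a subset of XPath. The XML fragment is invented for illustration and is not a real NeXus layout. The lookup finds the name attributes wherever NXdata occurs, but the position of each NXdata element has to be recovered separately by walking the tree.

```python
import xml.etree.ElementTree as ET

# Invented stand-in for a NeXus file in XML form; not a real NeXus layout.
doc = ET.fromstring(
    "<NXroot>"
    "<NXentry name='entry1'>"
    "<NXdata name='detector_counts'/>"
    "<NXdata name='monitor_counts'/>"
    "</NXentry>"
    "</NXroot>"
)

# ElementTree has no attribute step, so the @name part of //NXdata/@name
# is done in Python rather than in the path expression.
names = [node.get("name") for node in doc.findall(".//NXdata")]

# The path says nothing about where NXdata sits. That information has to be
# read from the tree itself, or supplied separately when writing NeXus.
parents = {
    child.get("name"): parent.tag
    for parent in doc.iter()
    for child in parent
    if child.tag == "NXdata"
}
```

The names list is enough to build CIF tags, while the parents mapping is the kind of extra structural information a CIF -> NeXus writer would need from somewhere else.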
>>
>> Yes, I was only aiming to solve the NeXuS -> CIF problem.
>>>
>>>> > The data definitions containing _alien.location attributes could be
>>>> > considered 'raw' NeXuS data, which may not map easily onto CIF datanames.
>>>> > Therefore, this dictionary could contain further DDLm definitions of
>>>> > dataitems (still in the CIF 'nexus' namespace) which contained dREL
>>>> > methods for manipulating the raw datanames into something that mapped
>>>> > more directly into CIF. This is where one might foresee adding a few more
>>>> > builtin functions to dREL to ease e.g. image processing.
>>>
>>> I get the feeling that somewhere there will need to be a list that says
>>> something like:
>>> nexus4:some_cat.some_item1 ?
>>> nexus4:some_cat.some_item2 ?
>>> ...
>>> - in order to trigger the evaluations. Though maybe they would be deduced
>>> by a CIF full of "?" on the request side.
>>
>> The idea would be to trigger all the evaluations as usual, and because you
>> have loaded in the 'translate' DDLm dictionary over the top of the normal
>> dictionary, at some stage the evaluation chain will access NeXuS-derived
>> values instead of primitive values.
>>
>> [example from previous email deleted]
>>>
>>>
>>> Just as an alternative:
>>>
>>> <xsl:stylesheet version="1.0"
>>>     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>>> <xsl:output method="text"/>
>>> <xsl:template match="NXuser">
>>> <xsl:if test="position()= 1"> <!-- if multiple NXuser elements -->
>>> <xsl:text>loop_&#xA;</xsl:text> <!-- append newline char in hex -->
>>> <xsl:text> _audit_author.name&#xA;</xsl:text>
>>> <xsl:text> _audit_author.affiliation&#xA;</xsl:text>
>>> <xsl:text> _audit_author.phone&#xA;</xsl:text>
>>> <xsl:text> _audit_author.fax&#xA;</xsl:text>
>>> <xsl:text> _audit_author.email&#xA;</xsl:text>
>>> </xsl:if>
>>>
>>> <xsl:for-each select="./name">
>>> <xsl:variable name="audit_author" select="parent::node()"/>
>>> <xsl:call-template name="dumpItem">
>>> <xsl:with-param name="item" select="."/><!--i.e. name -->
>>> </xsl:call-template>
>>> <xsl:call-template name="dumpItem">
>>> <xsl:with-param name="item" select="$audit_author/affiliation"/>
>>> </xsl:call-template>
>>> <xsl:call-template name="dumpItem">
>>> <xsl:with-param name="item"
>>> select="$audit_author/telephone_number"/>
>>> </xsl:call-template>
>>> <xsl:call-template name="dumpItem">
>>> <xsl:with-param name="item" select="$audit_author/fax_number"/>
>>> </xsl:call-template>
>>> <xsl:call-template name="dumpItem">
>>> <xsl:with-param name="item" select="$audit_author/email"/>
>>> </xsl:call-template>
>>> <xsl:text>&#xA;</xsl:text>
>>> </xsl:for-each>
>>> </xsl:template>
>>>
>>> <xsl:template name="dumpItem">
>>> <xsl:param name="item"/>
>>> <xsl:text> </xsl:text>
>>> <xsl:choose>
>>> <xsl:when test="$item !=''">
>>> <!-- add space and parentheses checks here -->
>>> <xsl:value-of select="$item"/>
>>> </xsl:when>
>>> <xsl:otherwise>
>>> <xsl:text>.</xsl:text>
>>> </xsl:otherwise>
>>> </xsl:choose>
>>> </xsl:template>
>>> </xsl:stylesheet>
>>>
>>>
>>> - a simple (untested) XSLT stylesheet, usable by a significant number of
>>> current XSLT processing engines that could transform NeXus/NXuser data in
>>> XML format directly into CIF. Some XSLT engines provide extension options
>>> for doing more complicated transformations when needed. NeXus HDF would need
>>> transformation to XML first. A separate stylesheet would need to be defined
>>> to do the reverse transformation, assuming that the CIF was first converted
>>> to some XML format.
>>>
>>> (not that XSLT is really what I would be looking for in a "mapping" file,
>>> but it's good to be aware of other possibilities - but maybe there is already
>>> a CIF->CML->NeXus converter and vice versa?)
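
For comparison, the same NXuser to audit_author mapping can be sketched in Python with the standard xml.etree.ElementTree module. This is a rough equivalent under stated assumptions, not a tested converter: the child element names follow the stylesheet above, the leading underscore on each tag is supplied here because CIF datanames require it, the function names are hypothetical, and the quoting rule is deliberately simplistic.

```python
import xml.etree.ElementTree as ET

# CIF tag -> child element of NXuser, following the stylesheet above.
FIELDS = [
    ("_audit_author.name", "name"),
    ("_audit_author.affiliation", "affiliation"),
    ("_audit_author.phone", "telephone_number"),
    ("_audit_author.fax", "fax_number"),
    ("_audit_author.email", "email"),
]


def cif_value(text):
    """Return a CIF value: '.' if absent, quoted if it contains whitespace."""
    if text is None or not text.strip():
        return "."
    text = text.strip()
    if any(ch.isspace() for ch in text):
        return "'" + text + "'"  # simplistic: embedded quotes not handled
    return text


def nxuser_to_cif(root):
    """Build a CIF loop from every NXuser element below root."""
    users = list(root.iter("NXuser"))
    if not users:
        return ""  # no loop header without at least one row
    lines = ["loop_"] + [" " + tag for tag, _ in FIELDS]
    for user in users:
        row = [cif_value(user.findtext(child)) for _, child in FIELDS]
        lines.append(" " + " ".join(row))
    return "\n".join(lines)


# Invented sample input; a real NeXus XML file would carry more structure.
sample = ET.fromstring(
    "<NXentry>"
    "<NXuser><name>A. Person</name><email>a@example.org</email></NXuser>"
    "</NXentry>"
)
cif_loop = nxuser_to_cif(sample)
```

Missing children come out as ".", matching the xsl:otherwise branch of the dumpItem template.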
>>
>> This is an intriguing example and I think if the actual values themselves
>> don't need manipulation it would do a good job. Perhaps the initial
>> transformation to what I called previously a 'raw NeXuS' CIF could be best
>> done by XSLT, using the conventions of that program to do the
>> renormalisation. Manipulations of data values could then be done by dREL
>> routines in a 'translate' dictionary. There is however an important
>> practical limitation of this scheme, which is that trying to deal with XML
>> files that have images in them is ridiculously slow even with current desktop
>> processing power (that is our experience at the Bragg, anyway).
>>
>> Also, Nick S. tells me that back in the late 90s he produced an XSLT-based
>> transformation from CIF and DDL to XML, and was able to use standard XML
>> tools to validate the CIF-derived XML file against the DDL-derived XML
>> schema. Maybe the time has come for this tool to be dusted off.
>>
>> James.
>
> cheers
>
> Nick
>
> --------------------------------
> Dr N. Spadaccini
> School of Computer Science & Software Engineering
>
> The University of Western Australia t: +(61 8) 6488 3452
> 35 Stirling Highway f: +(61 8) 6488 1089
> CRAWLEY, Perth, WA 6009 AUSTRALIA w3: www.csse.uwa.edu.au/~nick
> MBDP M002
>
> CRICOS Provider Code: 00126G
>
> e: Nick.Spadaccini at uwa.edu.au
>
>
cheers
Nick