_database.dataset_doi - any problems if this might be a DOI for raw data?

Horst Puschmann horst.puschmann at gmail.com
Fri May 27 13:59:22 BST 2022


Hello James,

I think providing a DOI in the CIF pointing at large amounts of data would
be fantastic. I must say that I don't quite understand the wording here --
what does the CSD or PDB have to do with it? Surely all hkl data is now
included in the CIF as standard -- but of course it would also be
excellent if the hkl could be handled via an 'hkl DOI'.
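
A minimal sketch of how software reading a CIF might pick up this item,
assuming a plain regex scan is good enough for illustration (a real program
would use a proper CIF parser). The function names and the DOI in the
example are hypothetical placeholders, not part of any existing tool or
deposit.

```python
import re

# Hypothetical CIF fragment; the DOI is a placeholder, not a real deposit.
EXAMPLE_CIF = """\
data_example
_database.dataset_doi   '10.5555/example-raw-frames'
_cell.length_a          10.123
"""

# Tag followed by a single-quoted, double-quoted or bare value.
_DOI_ITEM = re.compile(
    r"^\s*_database\.dataset_doi\s+(?:'([^']*)'|\"([^\"]*)\"|(\S+))",
    re.MULTILINE | re.IGNORECASE,
)


def find_dataset_doi(cif_text):
    """Return the DOI string of the first matching item, or None."""
    match = _DOI_ITEM.search(cif_text)
    if match is None:
        return None
    quoted = match.group(1) if match.group(1) is not None else match.group(2)
    if quoted is not None:
        return quoted
    bare = match.group(3)
    # A bare '.' or '?' is the CIF placeholder for inapplicable / unknown.
    return None if bare in (".", "?") else bare


def doi_to_url(doi):
    """Build the resolver URL for a DOI; nothing is fetched here."""
    return "https://doi.org/" + doi
```

Note that the reader only gets a string back: whether the DOI resolves to an
hkl listing, a set of frames or a database entry cannot be told from the
item itself.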

I take 'raw data' to mean the diffraction images (frames) -- possibly
together with a recipe for how they were processed. Having easy access to
frames 'by default' would be fantastic.

But there are other kinds of files, and some might be required to repeat
the actual refinement. I am thinking of our own NoSpherA2 refinement, which
requires a '.tsc' file -- a file which is clearly too large to be embedded
in the CIF. A DOI for this kind of file would be really useful (right now,
we just include the hkl and the exact parameters to repeat the generation
of the '.tsc' file, which is clearly not ideal).

And then there might be a third kind of DOI -- one where people can deposit
'random' files, like videos, images, descriptions of special setups,
scripts etc.

I am not sure whether this is the sort of feedback you were looking for,
but there you are.

Greetings
Horst

On Fri, 27 May 2022 at 04:09, James H via coreDMG <coredmg at iucr.org> wrote:

> (cross-posted to cif-developers and core DMG, apologies for cross-posting)
>
> Dear CIF Developers and core DMG,
>
> IUCr Journals are looking at using _database.dataset_doi to indicate the
> DOI of a raw data set associated with a data block. The meaning of
> "dataset" is not clear here, for example, it might have been intended to
> refer to hkl listings.
>
> So, please give feedback on any problems your software/database might
> encounter if this DOI might resolve to a raw dataset.
>
> The current definition:
>
> " The digital object identifier (DOI) registered to identify
>     a data set publication associated with the structure
>     described in the current data block. This should be used
>     for a dataset obtained from a curated database such as
>     CSD or PDB. "
>
> thanks,
> James.
> --
> T +61 (02) 9717 9907
> F +61 (02) 9717 3145
> M +61 (04) 0249 4148