From yaya at bernstein-plus-sons.com Tue Mar 6 19:53:49 2007
From: yaya at bernstein-plus-sons.com (Herbert J. Bernstein)
Date: Tue, 6 Mar 2007 14:53:49 -0500
Subject: [medsbio-l] NSF proposal declined
Message-ID:

Dear Colleagues,

I am sorry to report that the US National Science Foundation has
declined to fund the MEDSBIO proposal. The web site and mailing list
will remain in operation and I will explore alternate avenues for
funding.

Regards,
Herbert

From yaya at bernstein-plus-sons.com Tue Mar 13 12:47:10 2007
From: yaya at bernstein-plus-sons.com (Herbert J. Bernstein)
Date: Tue, 13 Mar 2007 08:47:10 -0400
Subject: [medsbio-l] imgCIF workshop at BNL on 24 May 2007
Message-ID:

Second imgCIF workshop (new series) at BNL after NSLS/CFN meeting:

Synchrotron Image-Data Format Workshop

Herbert J. Bernstein, yaya at dowling.edu
Robert M. Sweet, sweet at bnl.gov

Sponsored by DOE under grant ER64212-1027708-0011962, NSF under grant
DBI-0610407. NIH support pending.

There will be a workshop on data formats for synchrotron image data
after the NSLS/CFN meeting on 24 May 2007 at BNL in the Biology Dept
Conference Room, Bldg 463, starting at 9 am. Topics to be discussed
include proposed extensions to imgCIF, the use of NeXus, progress on
software and the status of imgCIF at Diamond and at SLS. Space is
limited, so please contact Herbert J. Bernstein and Bob Sweet at
BNL_imgCIF_May07 at medsbio.org to reserve a place.

  * Review of imgCIF and CBFlib
  * Proposed extensions to the imgCIF dictionary
  * Status of imgCIF adoption at SLS, Diamond, ...
  * Future directions
  * Discussion

From yaya at bernstein-plus-sons.com Mon Apr 16 16:16:44 2007
From: yaya at bernstein-plus-sons.com (Herbert J. Bernstein)
Date: Mon, 16 Apr 2007 11:16:44 -0400
Subject: [medsbio-l] imgCIF workshop (new series) at BSR 2007
Message-ID:

Third imgCIF workshop (new series) at BSR 2007 in Manchester and at
Diamond:

The Management of Synchrotron Image Data: Changes to the imgCIF
dictionary and software, interaction with NeXus

Sponsored by DOE under grant ER64212-1027708-0011962, NSF under grant
DBI-0610407.

You are cordially invited to a CBF/imgCIF workshop in two lunch
sessions at BSR 2007 in Manchester and at Diamond.

There will be a working lunch on 17 August as a breakout session to the
BSR2007 meeting during a visit to Diamond Light Source. The meeting
will be held at 12:30 in room 1.17 of Diamond House. The lunch is open
to those not attending the main BSR2007 meeting, though places are
limited. The major topics discussed at the Diamond session will be
recent changes in the imgCIF dictionary and software and the
interaction with NeXus.

For those who cannot make it to the Diamond session or who want to get
an introduction to the subject, there will also be a working lunch
during the BSR meeting in Manchester. The time and room will be
announced on the MEDSBIO.org web site and at the meeting.

For further information and to register please contact Herbert J.
Bernstein and Alun Ashton at bsr_imgcif_aug07 at medsbio.org

--
=====================================================
Herbert J. Bernstein, Professor of Computer Science
Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

+1-631-244-3035
yaya at dowling.edu
=====================================================

From yaya at bernstein-plus-sons.com Sat May 12 16:03:02 2007
From: yaya at bernstein-plus-sons.com (Herbert J. Bernstein)
Date: Sat, 12 May 2007 11:03:02 -0400
Subject: [medsbio-l] Submission of Revised MEDSBIO proposal
Message-ID:

Dear Colleagues,

I just got the reviews on the MEDSBIO proposal we submitted last year.
In view of the reviews themselves and recent events, it seems
worthwhile to submit a revised proposal for the 25 June 2007 proposal
cycle. The purpose of this message is to see who wishes to be a
collaborator in that revised proposal.

Let me briefly address the reviews. They ranged from very good down to
good. The issues raised against the proposal are appended below, but
the real issues we need to address are from recent events -- the
growing use of imgCIF and the change in attitudes towards preservation
of raw experimental data. Clearly, within a few years, we, as a
community, will have to have an agreed, organized approach to the
archiving and exchange of raw data in structural biology.

I suggest the following revised goals for MEDSBIO

The goals of the MEDSBIO consortium are to

1. Create a collaborative environment in which to resolve the
   interface issues among multiple structural biology data management
   protocols, including imgCIF, NeXus, vendor data formats, instrument
   control and signaling protocols, local and remote experiment control
   protocols, etc. with the objective of making the collection,
   transfer and archiving of data for experiments in structural biology
   as efficient as practicable; and

2. In cooperation with the major existing archives in structural
   biology, the major journal publishers in structural biology, the
   relevant government and international organizations and the major
   vendors in structural biology, to contribute to the evolving effort
   to define a mechanism to preserve all raw experimental data in
   structural biology for future reference; and

3. To maintain an archive of documentation on standards and proposals
   for ontologies, software, hardware specifications, web templates and
   other documentation related to such protocols; and

4. To maintain an archive of open source software and links to closed
   source software related to such protocols; and

5. To maintain an archive of samples and test cases related to such
   protocols; run annual workshops on issues relating to such
   protocols; contribute open source software to fill gaps in the
   infrastructure related to such protocols; gather and where necessary
   create curricular material to assist in training experimenters in
   issues related to such protocols.

The MEDSBIO activities for which funding is needed include funds to
organize and run 1-2 workshops per year, funds for student staff to
acquire, organize and disseminate data formats and software and funds
to develop software to fill gaps in this infrastructure, especially in
creating open source interface and translation software among formats.

These efforts are primarily focused on the fine details of data
acquisition, of managing raw data in hardware and software in ways that
conserve resources, of providing the fully elaborated data format
specifications and robust interchange software that will enable
archiving and interchange. These are issues that users of this data
often gloss over or do not consider at all. For the users, data
derived from the raw data, e.g. structure factors derived from
pixel-by-pixel photon counts, are the primary data, to be provided by
"black-box" systems. For an archive the messy details of subtle
differences among substantially similar data representations may be
serious inconveniences hindering worthwhile efforts at making data
indices and feeding databases. MEDSBIO is concerned with issues in the
innards of those black boxes and the valid scientific reasons for these
subtle differences. The MEDSB

Comments please.

Regards,
Herbert

"...The promise of developing new algorithms for analyzing data was
nice, but there were no specifics as to how this would be done.

"Although the proposal mentioned many kinds of data, in the end the
proposal was for X-ray crystallographic data. This is not a proposal
for a depository for raw data.
A similar group is already established with 3 workshops already
planned. There were no plans for implementation or carrying through
the ideas from the workshops to practice.

...

"Taken together, the observations that the group was already
interacting, the fact that the data collection (in whatever form it
would take) was just that, not a way to foster new research, and the
limited general interest all dampened the panel's enthusiasm."

...

"Why is there no contribution toward support from vendors?"

...

"It is hard to tell exactly what science will be done as a result of
the proposal, it seems more like some workshops will be held and people
will be encouraged to talk with one another, but that is already being
done in other formats."

...

"The proposal is timely, but there is no real detail on how the goals
of the project will be achieved. There is a lot of information on what
is being done elsewhere by the individual collaborators and other
investigators, and it has to be assumed that the planned workshops and
meetings will help the consortium consolidate and focus its efforts as
it relates to meeting the needs of the community. In addition, little
effort is focused on the current formats being used for NMR,
cryo-electron microscopy (cryo-EM) and image reconstructed data, or
muon spin research and how this information will be documented or
interfaced, as was the premise at the beginning of the proposal. If
this consortium really wants to "serve" the structural biology field as
a whole, there really should also be discussions on how to archive
documentation on data acquisition and storage protocols for these and
also solution scattering techniques. Clearly this would be an even
larger undertaking than the crystallographic data that became the main
focus of this proposal.
However, even after having said the above, these groups of
collaborators are the ideal investigators to take on this task of
creating the forum such as the MEDSBIO since they are the most
experienced with the issues involved.

From yaya at bernstein-plus-sons.com Mon May 14 03:02:39 2007
From: yaya at bernstein-plus-sons.com (Herbert J. Bernstein)
Date: Sun, 13 May 2007 22:02:39 -0400
Subject: [medsbio-l] Agenda for BNL imgCIF workshop
Message-ID:

The agenda for the Synchrotron Image-Data Format Workshop on Thursday,
24 May 2007 at Brookhaven National Laboratory has been posted at:

http://www.medsbio.org/meetings/BNL_May07_imgCIF_Workshop.html

There will be a light breakfast at 8:30, and the talks will start at
9 am. Lunch will be provided. Space is limited so ...

Important: Even if you are already registered for the CFN/NSLS meeting
or are a local BNL person, it is important to contact us if you wish to
attend so we can be sure to have enough chairs and food. Please send
email to Herbert J. Bernstein and Bob Sweet at
BNL_imgCIF_May07 at medsbio.org no later than 17 May 2007.

From yaya at bernstein-plus-sons.com Wed Jul 11 15:47:34 2007
From: yaya at bernstein-plus-sons.com (Herbert J. Bernstein)
Date: Wed, 11 Jul 2007 14:47:34 -0000
Subject: [medsbio-l] imgCIF workshops, 14 and 17 Aug 2007
Message-ID:

The Management of Synchrotron Image Data: Changes to the imgCIF
dictionary and software, interaction with NeXus

Sponsored by DOE under grant ER64212-1027708-0011962, NSF under grant
DBI-0610407 and NIH under grant 1R13RR023192-01A1

You are cordially invited to a CBF/imgCIF workshop in two lunch
sessions at BSR 2007 in Manchester and at Diamond.

The first session will be in Manchester on Tuesday, 14 August from
12:45 to 13:45. It will provide an introduction to imgCIF and NeXus
and a brief review of current progress. The room will be announced on
the meeting web site and on the MEDSBIO.org web site. This lunch
meeting is open, but advance registration would be appreciated.

The second session will be at Diamond on Friday, 17 August at 12:30 to
discuss recent changes in the imgCIF dictionary and software and the
interaction with NeXus. It will be held in room 1.17 of Diamond House
as a working lunch. This lunch is open both to BSR attendees and
others, though places are limited and advance registration is required.
Lunch will be provided.

There has been a great deal of progress in the past year. There is a
lot to report and a lot to discuss. If you work with raw experimental
data in structural biology, these workshops may prove interesting for
you.

Thanks to funding from DOE, NSF and NIH we have some funds to help with
travel to these workshops. If you need assistance, please contact us.

For further information and to register please contact Herbert J.
Bernstein and Alun Ashton at bsr_imgcif_aug07 at medsbio.org

From yaya at bernstein-plus-sons.com Tue Jul 24 05:29:20 2007
From: yaya at bernstein-plus-sons.com (Herbert J. Bernstein)
Date: Tue, 24 Jul 2007 00:29:20 -0400
Subject: [medsbio-l] updated workshop announcement
Message-ID:

Please see the updated announcement for the imgCIF workshops in
conjunction with BSR 2007 on 14 August 2007 in Manchester and 17 August
2007 at Diamond:

http://www.medsbio.org/meetings/BSR_2007_imgCIF_Workshop.html

An interesting collection of participants is coming together. The
detailed agenda for each session will be added to this page in early
August.

From yaya at bernstein-plus-sons.com Wed Aug 1 19:01:57 2007
From: yaya at bernstein-plus-sons.com (Herbert J. Bernstein)
Date: Wed, 1 Aug 2007 14:01:57 -0400
Subject: [medsbio-l] registering for the imgCIF workshops at BSR 2007
Message-ID:

Dear Colleagues,

The BSR 2007 meeting organizers have asked me to urge anyone who is
planning to attend either the Tuesday, 14 August 2007 imgCIF workshop
in Manchester or the Friday, 17 August 2007 imgCIF workshop at Diamond
to please register by sending an email message to:

>>>>> bsr_imgcif_aug07 at medsbio.org <<<<<<<<<<

giving the following information

  your name
  if you will be joining us for lunch on Tuesday, 14 August 2007
  if you will be joining us for lunch on Friday, 17 August 2007

Please send this message no later than 12:00 GMT on Friday, 3 August
2007. This will help us to ensure that we have enough food for
everybody.
Please send this message to this email address even if you have already
told me you will be coming, even if you are presenting. We certainly
want to have food for our presenters.

You can learn more about the workshops at:

http://www.medsbio.org/meetings/BSR_2007_imgCIF_Workshop.html

You may find the recently added preliminary agenda interesting.

Thank you for your cooperation.

Regards,
Herbert J. Bernstein

From yaya at bernstein-plus-sons.com Wed Aug 8 14:51:28 2007
From: yaya at bernstein-plus-sons.com (Herbert J. Bernstein)
Date: Wed, 8 Aug 2007 09:51:28 -0400
Subject: [medsbio-l] participants in the imgCIF workshops
Message-ID:

Dear Colleagues,

Appended is the current list of participants for the imgCIF workshop at
BSR 2007 in Manchester on Tuesday 14 August 2007 from 12:45 to 13:45
and at Diamond on Friday, 17 August 2007 from 12:30 to 17:00.

http://www.medsbio.org/meetings/BSR_2007_imgCIF_Workshop/

I was recently informed by a participant that an earlier attempt to
send a registration message did not get through. If you wish to
participate and are not on this list, or if you have corrections,
please send me an email directly at:

yaya at bernstein-plus-sons.com

no later than 12:00 noon GMT on Thursday, 9 August 2007.

Regards,
Herbert

14-Aug 17-Aug Name
Attend Freddie Akeroyd
Present Alun W. Ashton
Attend Mark Basham
Present Present Herbert J. Bernstein
Attend Frances C. Bernstein
Attend Ian Clifton
Attend Attend Georgi Darakev
Present Matt Dougherty
Attend Attend Elizabeth M. Duke
Attend Attend Judy Flippen-Anderson
Present Mike Folk
Attend Attend John Jemilawon
Attend Louise Jones
Attend Laurent Lerusse
Attend Brian McMahon
Attend Present Chris Nielsen
Attend Harry Powell
Attend Francois Remacle
Attend Stuart Robinson
Present Clemens Schulze-Briese
Present Graeme Winter
Attend Jonathan Wright

From yaya at bernstein-plus-sons.com Fri Aug 24 22:24:21 2007
From: yaya at bernstein-plus-sons.com (Herbert J. Bernstein)
Date: Fri, 24 Aug 2007 17:24:21 -0400
Subject: [medsbio-l] Draft Report on Third imgCIF workshop
Message-ID:

You will find a draft of the report on the third imgCIF workshop held
at BSR 2007 in Manchester on 14 August 2007 and at Diamond Light Source
in Chilton on 17 August 2007 at:

http://www.medsbio.org/meetings/BSR_2007_imgCIF_Workshop/

Comments, corrections and suggestions would be appreciated.

Regards,
Herbert J. Bernstein

From yaya at bernstein-plus-sons.com Tue Sep 11 01:05:22 2007
From: yaya at bernstein-plus-sons.com (Herbert J. Bernstein)
Date: Mon, 10 Sep 2007 20:05:22 -0400
Subject: [medsbio-l] Fwd: RE: HDF image format
Message-ID:

Matt Dougherty has asked that the following emails be posted to the
medsbio list.
-- HJB

>Delivered-To: yaya-bernstein-plus-sons:com-yaya at bernstein-plus-sons.com
>X-Virus-Check-By: mailwash16.pair.com
>Subject: RE: HDF image format
>Date: Mon, 10 Sep 2007 18:17:09 -0500
>Thread-Topic: HDF image format
>Thread-Index: AcfwuScYxuroJ4TrRkyr0l1W6nLn3gDRYo0O
>From: "Dougherty, Matthew T."
>To: "Thomas Goddard" ,
> "Ludtke, Steven J." ,
> "Pawel Penczek"
>Cc: , , ,
> "Chiu, Wah"
>X-Status:
>X-Keywords:
>
>Hi Tom,
>
>To answer your primary question, yes it is very important to have a
>unified image data format.
>
>The EMAN HDF format is based on a straw-man prototype I drafted a
>few years ago.
>This prototype has revealed some performance/design problems in HDF.
>Also, as I thought more deeply about this layout, I realized
>deficiencies in my format design requiring a new prototype.
>
>The real opportunity here is that HDF has not been adopted in the
>biological community; all of the prototypes have been created by our
>labs, so there is not an installed base of HDF EM formats resisting
>change.
>
>As the EM images get larger, existing EM formats will fail and the
>need for a capable format design to replace them is critical; a
>unified approach is best.
>
>
>Recently I spoke at the Consortium for Management of Experimental
>Data in Structural Biology Third imgCIF workshop at 9th
>International Conference on Biology and Synchrotron Radiation, led
>by Herbert Bernstein. My talk on the "Status of Data Formats in
>Cryo EM" is located at
>http://medsbio.org/meetings/BSR_2007_imgCIF_Workshop/
>
>Mike Folk, director of HDF, and I are recent members of MEDSBIO core
>committee. Mike and I had time to discuss at length the issues
>brought up in the December 7, 2006 teleconference on developing an
>EM image format.
>
>Another opportunity is that MEDSBIO is the center point for the
>integration of the NEXUS & imgCIF formats within the beamline
>community; devising an EM/HDF format that is interoperable with
>these efforts would be strategic.
>
>Getting the experimental data, viz, data storage, repository,
>standards and archival communities working in concert is most
>desirable in terms of legacy, but will require a concerted effort of
>all parties. A well designed EM format could be appropriated by
>other biological imaging communities.
>
>
>
>
>Regarding the four capabilities noted in your email, all of them are
>definitely needed, but I differ on the approach.
>1) the generation of sub sampled density maps has different
>possibilities (i.e. skipping pixels, use of median filters,
>generation of datasets directly from inverse space, use of JPEG2000
>part 10). A mechanism to manage/track this is needed.
>2) users prefer a variety of coordinate transformations (Euler angle
>variations, quaternions, cosine matrices). How best to manage this?
>3) alternate disk layouts can be transparently accomplished by a
>common EM format API, or by the viz, et al, softwares directly
>manipulating the HDF API. What method provides best long term
>performance and simplicity?
>4) one comprehensive strategy to deal with transformations,
>symmetry, and cell angles would be preferred. This also has
>implications regarding item #3.
>
>
>
>Instead of one long email, I will be sending you eight more emails
>over the next few days that address issues I have identified in the
>implementation of HDF for biological imaging applications:
>
>1) definition of an image core
>2) metadata, hdf attributes & pytables
>3) archiving, provenance, and the role of METS & OAIS
>4) management of transformations and symmetry
>5) integration of LSID
>6) needed enhancements to HDF
>7) design goals for digital masters and derivatives
>8) proposed collaboration roadmap
>
>
>If there are no objections, I would like to ask Herbert Bernstein to
>post these emails on the MEDSBIO website.
>
>
>
>Matthew
>
>
>
>
>-----Original Message-----
>From: Thomas Goddard
>[mailto:goddard at cgl.ucsf.edu]
>Sent: Thu 9/6/2007 1:59 PM
>To: Ludtke, Steven J.; Pawel Penczek; Dougherty, Matthew T.
>Subject: HDF map format
>
>Hi Steve, Pawel, Matthew,
>
> Do you think it is important that Chimera and EMAN use the same HDF
>representation for density maps?
>
> I have been experimenting with HDF5 file format for EM density maps
>in Chimera for some months. I've been trying 4 new capabilities, mostly
>with EM tomography in mind:
>
>1) Put subsampled maps in file for fast loading and display of large
>data sets (> 100 Mbytes).
>
>2) Put coordinate rotation in file for bricks of data extracted from
>another map that are not aligned with the original map's axes. Commonly
>needed in tomography.
>
>3) Allow disk layout with alternate, possibly more than one chunk shape,
>for fast disk reads of xy planes, xz planes, and yz planes, and sub-regions.
>
>4) Include symmetry matrices for single-particle reconstructions in hdf
>map header. Used in fitting of monomers into map.
>
> None of these added capabilities are part of the EMAN HDF format as
>far as I know, so my HDF maps have a different format than EMAN HDF maps.
> Chimera reads both, but EMAN won't read the HDF maps written by
>Chimera. That is not ideal. Below is an example of the current Chimera
>HDF.
>
> Tom
>
>
>
># Example HDF5 format written by Chimera.
>#
># /image
>#   chimera_version "1.2422"
>#   step (1.2, 1.2, 1.2)
>#   origin (-123.4, -522, 34.5)
>#   cell_angles (90.0, 90.0, 90.0)
>#   rotation_axis (0.0, 0.0, 1.0)
>#   rotation_angle 45.0
>#   data (3d array of uint8 (123,542,82))
>#   data_acs (3d array of uint8 (123,542,82), alternate chunk shape)
>#   data_2 (3d array of uint8 (61,271,41))
>#     subsample_spacing (2, 2, 2)
>#   (more subsampled or alternate chunkshape versions of same data)
>#
># Names "chimera_version", "step", "origin", "cell_angles",
># "rotation_axis", "rotation_angle", "subsample_spacing" are fixed
># while "image", "data", "data_acs" and "data_2" can be any name.
>#
># In the example "image" is an HDF group, "chimera_version", "step",
># "origin", "cell_angles", "rotation_axis", "rotation_angle",
># are group attributes, "data", "data_acs" and "data_2" are
># hdf datasets (arrays), and "subsample_spacing" is a dataset attribute.
>#
># All data sets within the group represent the same data, such as optional
># subsampled arrays or alternate chunk shape for efficient disk access.
>#
># Cell angles need not be included if they are 90,90,90. They are
># included for handling crystallographic density maps. An identity
># rotation need not be included. The rotation angle is in degrees.
>#
># The file is saved with the Python PyTables module, which includes
># additional attributes "VERSION", "CLASS", "TITLE",
># "PYTABLES_FORMAT_VERSION".
>#
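
The numbers in the Chimera example above can be checked with a short
sketch. It rests on assumptions that are not stated in the messages:
that the subsampled array shape is the full shape divided by
"subsample_spacing" with integer (floor) division, which is consistent
with (123,542,82) and (61,271,41) but says nothing about whether
voxels are skipped or averaged, and that "rotation_axis" and
"rotation_angle" describe an ordinary right-handed axis-angle rotation.
The function names are hypothetical and are not part of Chimera, EMAN
or PyTables. The conversions illustrate how one stored rotation can be
presented as a cosine matrix or as a quaternion, two of the
representations mentioned in the reply.

```python
import math


def subsampled_shape(shape, spacing):
    """Shape of a subsampled array, ASSUMING floor division per axis."""
    return tuple(n // s for n, s in zip(shape, spacing))


def _unit(axis):
    """Return the axis scaled to unit length."""
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("rotation axis must be non-zero")
    return x / norm, y / norm, z / norm


def rotation_matrix(axis, angle_degrees):
    """3x3 direction-cosine matrix from axis and angle (Rodrigues' formula)."""
    x, y, z = _unit(axis)
    a = math.radians(angle_degrees)
    c, s = math.cos(a), math.sin(a)
    t = 1.0 - c
    return (
        (t * x * x + c, t * x * y - s * z, t * x * z + s * y),
        (t * x * y + s * z, t * y * y + c, t * y * z - s * x),
        (t * x * z - s * y, t * y * z + s * x, t * z * z + c),
    )


def axis_angle_to_quaternion(axis, angle_degrees):
    """Unit quaternion (w, x, y, z) for the same rotation."""
    x, y, z = _unit(axis)
    half = math.radians(angle_degrees) / 2.0
    s = math.sin(half)
    return (math.cos(half), x * s, y * s, z * s)


if __name__ == "__main__":
    # Values taken from the example layout quoted above.
    print(subsampled_shape((123, 542, 82), (2, 2, 2)))  # (61, 271, 41)
    for row in rotation_matrix((0.0, 0.0, 1.0), 45.0):
        print(row)
    print(axis_angle_to_quaternion((0.0, 0.0, 1.0), 45.0))
```

Storing a single canonical form such as axis and angle and converting
on read is only one possible answer to the question of how to manage
the variety of transformation conventions; the messages leave that
question open.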