[Cif2-encoding] How we wrap this up

SIMON WESTRIP simonwestrip at btinternet.com
Sun Sep 26 21:40:54 BST 2010


Dear all

While reviewing my hypothetical 'to do' list for implementing CIF2 in current
software, I realized that the issue of current support for elided character
codes hasn't really been addressed in the context of CIF2. My 'to do' list
contains notes that software could treat them as keyboard shortcuts, and that
their use could be defined in the dictionary. However, that was based on a
distinct difference between CIF1 and CIF2, while the current arguments for
'as for CIF1...' suggest that the distinction between CIF1 and CIF2 should be
almost imperceptible.

How is this issue to be addressed in the specification? 

Cheers

Simon
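
To make the 'keyboard shortcut' idea above concrete, here is a minimal sketch
of how software might expand such codes into Unicode characters. It assumes
the codes in question are CIF1-style backslash accent codes (for example \'e
for e-acute); the table is a small illustrative subset, and the names
COMBINING and expand_codes are hypothetical, not part of any CIF library or
of the CIF2 specification.

```python
import re
import unicodedata

# Assumed, illustrative subset of CIF1-style accent codes, mapped to the
# corresponding Unicode combining marks.
COMBINING = {
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "^": "\u0302",  # circumflex
    '"': "\u0308",  # diaeresis (umlaut)
    "~": "\u0303",  # tilde
    ",": "\u0327",  # cedilla
}

# A backslash, one accent character, then the letter it modifies.
_CODE = re.compile(r"""\\(['`^"~,])([A-Za-z])""")


def expand_codes(text):
    """Replace each recognised accent code with a precomposed character."""
    def _replace(match):
        accent, letter = match.group(1), match.group(2)
        # Append the combining mark, then compose with NFC where possible.
        return unicodedata.normalize("NFC", letter + COMBINING[accent])
    return _CODE.sub(_replace, text)
```

Under these assumptions, expand_codes(r"Schr\"odinger") yields the name with
a precomposed o-umlaut, while text containing no codes passes through
unchanged. Whether CIF2 software should perform such expansion at all is
exactly the open question raised above.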




________________________________
From: Herbert J. Bernstein <yaya at bernstein-plus-sons.com>
To: Group for discussing encoding and content validation schemes for CIF2 
<cif2-encoding at iucr.org>
Sent: Saturday, 25 September, 2010 20:37:46
Subject: Re: [Cif2-encoding] How we wrap this up

Thank you for your cooperation. -- Herbert

=====================================================
Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
        Idle Hour Blvd, Oakdale, NY, 11769

                 +1-631-244-3035
                yaya at dowling.edu
=====================================================

On Sat, 25 Sep 2010, SIMON WESTRIP wrote:

> OK - as promised, I won't pursue the matter :-)
> 
> 
> ____________________________________________________________________________
> From: Herbert J. Bernstein <yaya at bernstein-plus-sons.com>
> To: Group for discussing encoding and content validation schemes for CIF2
> <cif2-encoding at iucr.org>
> Sent: Saturday, 25 September, 2010 19:18:54
> Subject: Re: [Cif2-encoding] How we wrap this up
> 
> Dear Simon,
> 
>   Unfortunately, that is likely to take us back into our infinite loop or
> into a diverging spiral.  Right now, we would have UTF8 as no more or less a
> default for CIF2 than ASCII is for CIF1 -- i.e. a not too bad first guess as
> the likely default encoding for any given CIF, but not a formal constraint. 
> I would suggest we leave the wording in that imprecise state, get CIF2 out
> and accepted and then work further on the encoding issue.
> 
>   Regards,
>     Herbert
> 
> 
> On Sat, 25 Sep 2010, SIMON WESTRIP wrote:
> 
> > Dear all
> >
> > In the event that CIF2 adopts the 'any encoding' approach, would there be
> > any objections to explicitly defining a default encoding in the
> > specification, to be defaulted to when there are no indications to the
> > contrary? At worst this would give CIF2 service providers an excuse to
> > interpret CIFs as e.g. UTF8 if they couldn't determine the encoding by
> > other means - but such intolerant service providers would soon find that
> > their service is not successful - while at best this might raise awareness
> > of the issues regarding encoding once non-ASCII is used in a CIF.
> > Essentially, it does not require users to change their working practices,
> > which is one of the main arguments for 'any encoding'.
> >
> > So, CIF2 would remain 'any encoding', and specifications in terms of e.g.
> > "Herbert's as for CIF1..."
> > might only require a single sentence to define the default after stating
> > what the 'preferred' encoding was;
> > the proposal might be phrased as "Herbert's as for CIF1..." + "explicit
> > default encoding"?
> >
> > I do not wish to prolong this debate - if there are objections I will not
> > launch into an endless round of exchanges
> > that cover the same ground that has led us this far.
> >
> > Cheers
> >
> > Simon
> >
> >
> >
> >
> >
> >
> > ________________________________________________________________________
> > From: SIMON WESTRIP <simonwestrip at btinternet.com>
> > To: Group for discussing encoding and content validation schemes for CIF2
> > <cif2-encoding at iucr.org>
> > Sent: Friday, 24 September, 2010 20:10:13
> > Subject: Re: [Cif2-encoding] How we wrap this up
> >
> > Dear James
> >
> > As you may have gathered I have been reconsidering my position on this
> > issue. Please forgive me, but I would like to change my vote if that is
> > OK, in favour of the 'any encoding' camp. This apparent U-turn is not a
> > response to recent contributions; rather it is the outcome of a meeting I
> > had this morning where I demonstrated some new software to the Managing
> > Editor of IUCr journals.
> >
> > By way of explanation:
> >
> > I have been developing a new docx template which the IUCr editorial office
> > is shortly to release for use by authors. The template will be packaged
> > with some tools to extract data from CIFs and tabulate them in the Word
> > document, e.g. open an mmCIF, click a button, and standard tables
> > populated with data from the CIF will be included in the document, acting
> > as table templates for the author to edit as appropriate for their
> > manuscript.
> >
> > Inclusion of the mmCIF tools is part of an unofficial policy to 'coax'
> > biologists to start using/accepting mmCIF as a useful medium, rather than
> > as a product of their deposition to the PDB, and to encourage them to
> > become comfortable with passing mmCIFs between applications, and even to
> > edit the things (in the same way as the core-CIF community treats CIFs).
> > For example, our perception is that there is no reason why an author
> > should not feel free to take an mmCIF that has been created by e.g.
> > pdb_extract and populate it using third-party software before uploading to
> > the PDB for deposition.
> >
> > This cause would not be furthered by effectively invalidating an mmCIF if
> > it were not encoded in one of the specified encodings.
> >
> > So although I am uneasy about a specification that propagates uncertainty,
> > I'm also uneasy about alienating users, especially when we are struggling
> > to change their mindset, as in the case of the biological community (my
> > perception of the biological community's attitude to mmCIF is based on
> > feedback from authors/coeditors to IUCr journals).
> >
> > Granted this may not be the most compelling argument in favour of 'any
> > encoding', but recognizing the hurdles that
> > may have to be overcome once we move beyond ASCII whatever the CIF2
> > specification, I support 'any encoding'
> > as 'a means to an end'.
> >
> > I will not provide my preferences in terms of the numbered options until
> > you say so; after all, I have already voted and all this has to be signed
> > off by COMCIFS in any case.
> >
> > Cheers
> >
> > Simon
> >
> >
> >
> >
> > ________________________________________________________________________
> > From: "Bollinger, John C" <John.Bollinger at STJUDE.ORG>
> > To: Group for discussing encoding and content validation schemes for CIF2
> > <cif2-encoding at iucr.org>
> > Sent: Friday, 24 September, 2010 14:50:57
> > Subject: Re: [Cif2-encoding] How we wrap this up
> >
> > Dear Simon,
> >
> > It is exactly this sort of issue that drove me to support more permissive
> > encoding rules and ultimately to devise the UTF-8 + UTF-16 + local
> > proposal.
> >
> > Do please think about the considerations Herb raised.  As you reconsider
> > your votes, I urge you also to ask yourself what, *precisely*, a "text
> > file" is, and to consider whether your answer is functionally different
> > from my "local".  If you decide not, then please consider what that answer
> > implies about CIF2 support of UTF-8 and UTF-16 (which evidently you favor)
> > under each option on the table, especially for CIFs containing non-ASCII
> > characters.  Whatever you decide about the meaning of "text file", please
> > consider whether reasonable people might reach a different conclusion, as
> > I assert they might do, and to what extent the standard needs to address
> > that.
> >
> >
> > Regards,
> >
> > John
> > --
> > John C. Bollinger, Ph.D.
> > Department of Structural Biology
> > St. Jude Children's Research Hospital
> >
> >
> > >From: cif2-encoding-bounces at iucr.org [mailto:cif2-encoding-bounces at iucr.org] On Behalf Of SIMON WESTRIP
> > >Sent: Friday, September 24, 2010 7:53 AM
> > >To: Group for discussing encoding and content validation schemes for CIF2
> > >Subject: Re: [Cif2-encoding] How we wrap this up. .
> > >
> > >Dear Herbert
> > >
> > >Not for the first time, I find your argument persuasive. Brian's vote and
> > >explanation have also raised some questions that I would like to look
> > >into.
> > >
> > >I will confirm or otherwise my vote as soon as possible, assuming that is
> > >OK with James and assuming that this round of votes might wrap this up.
> > >
> > >Cheers
> > >
> > >Simon
> > >
> > >________________________________________
> > >From: Herbert J. Bernstein <yaya at bernstein-plus-sons.com>
> > >To: Group for discussing encoding and content validation schemes for CIF2 <cif2-encoding at iucr.org>
> > >Sent: Friday, 24 September, 2010 13:17:14
> > >Subject: Re: [Cif2-encoding] How we wrap this up
> > >
> > >If he ignores the standard, in most cases all he has to do to comply with
> > >CIF2 is to run whatever applications he currently runs to produce CIF1
> > >and, perhaps, in some cases, run a minor edit pass at the end, to convert
> > >for the minor syntactic differences and/or changed tags required to
> > >comply with CIF2 and the new dictionaries, but he is unlikely to have to
> > >do anything to deal with the messy business of whether his encoding is
> > >really a proper UTF8 encoding or not.
> >
> > >The punishment if he tries to comply is that he has to totally uproot and
> > >reconfigure the environment in which he produces CIFs from whatever he is
> > >currently doing to create an environment in which he can reliably create
> > >and, more importantly, transmit compliant UTF8 files.  This can be very
> > >tricky if he does only a partial job, say fudging in one special
> > >application (yet to be written), because if he stays with his old system,
> > >all kinds of tools will keep trying to transcode whatever he has produced
> > >back to whatever his system considers a standard. Those of us who have
> > >files, applications and tools that have lived through several generations
> > >of macs are living proof of the problem. Macs now have excellent UTF8/16
> > >unicode support, but every once in a while in working with a unicode file
> > >I find it has been strangely and unexpectedly converted to something else,
> > >and it can be really tricky to spot when the unaccented roman text part
> > >has been left untouched but just a few accented letters have gotten
> > >different accents.
> >
> > >Mandating UTF8 is simply trying to shift a serious software problem from
> > >the central handlers of CIF (IUCr, PDB, etc.) to the external users. Most
> > >users will probably have the good sense to simply ignore the demand and
> > >leave the burden just where it is now.  A few sophisticated users will
> > >probably adapt with no trouble, but the punishment for those users who
> > >blindly follow orders, if we mandate UTF8 before we have a complete
> > >multiplatform supporting infrastructure in place, is severe, expensive and
> > >undeserved.  Until and unless we have developed solid support, we will
> > >just be alienating people from CIF.  I will continue to oppose such a
> > >move.
> >
> > [...]
> >
> >
> >
> >
> 
> 