[Cif2-encoding] How we wrap this up

Herbert J. Bernstein yaya at bernstein-plus-sons.com
Fri Sep 24 14:22:01 BST 2010


Dear Simon,

   Thank you very much for being willing to reconsider.  I will be
praying.

   Regards,
     Herbert

=====================================================
  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

                  +1-631-244-3035
                  yaya at dowling.edu
=====================================================

On Fri, 24 Sep 2010, SIMON WESTRIP wrote:

> Dear Herbert
> 
> Not for the first time, I find your argument persuasive. Brian's vote and
> explanation have also raised some questions that I would like to look into.
> 
> I will confirm or otherwise my vote as soon as possible, assuming that is OK
> with James and assuming that this round of votes might wrap this up.
> 
> Cheers
> 
> Simon
> 
> _______________________________________________________________________________________
> From: Herbert J. Bernstein <yaya at bernstein-plus-sons.com>
> To: Group for discussing encoding and content validation schemes for CIF2
> <cif2-encoding at iucr.org>
> Sent: Friday, 24 September, 2010 13:17:14
> Subject: Re: [Cif2-encoding] How we wrap this up
> 
> If he ignores the standard, in most cases all he has to do to comply with CIF2 is to
> run whatever applications he currently runs to produce CIF1 and, perhaps, in some
> cases, run a minor edit pass at the end, to convert for the minor syntactic
> differences and/or changed tags required to comply with CIF2 and the new dictionaries,
> but he is unlikely to have to do anything to deal with the messy business of whether
> his encoding is really a proper UTF8 encoding or not.
> 
> The punishment if he tries to comply, is that he has to totally uproot and reconfigure
> the environment in which he produces CIFs from whatever he is currently doing to create
> an environment in which he can reliably create and, more importantly, transmit compliant
> UTF8 files.  This can be very tricky if he does only a partial job, say fudging in one
> special application (yet to be written), because if he stays with his old system, all
> kinds of tools will keep trying to transcode whatever he has produced back to whatever
> his system considers a standard. Those of us who have files, applications and tools
> that have lived through several generations of macs are living proof of the problem.
> Macs now have excellent UTF8/16 Unicode support, but every once in a while in working
> with a Unicode file I find it has been strangely and unexpectedly converted to
> something else, and it can be really tricky to spot when the unaccented roman text part
> has been left untouched but just a few accented letters have gotten different accents.
> 
> Mandating UTF8 is simply trying to shift a serious software problem from the central
> handlers of CIF (IUCr, PDB, etc.) to the external users. Most users will probably have
> the good sense to simply ignore the demand and leave the burden just where it is now. 
> A few sophisticated users will probably adapt with no trouble, but the punishment for
> those users who blindly follow orders before we have a complete multiplatform
> supporting infrastructure in place by mandating UTF8 is severe, expensive and
> undeserved.  Until and unless we have developed solid support, we will just be
> alienating people from CIF.  I will continue to oppose such a move.
> 
> Simon, I beg you to change your vote.  Once we have the rest of CIF2 in
> place and supported, I will be happy to cooperate in trying to develop
> the software support we would need to make UTF8/UTF16 work well for
> users on Mac, Linux and Windows, but it is a big job that I do not
> believe can be done soon enough and well enough for options 3 through
> 5 to make sense right now.  Please do not "make the perfect the
> enemy of the good".
> 
> On Fri, 24 Sep 2010, SIMON WESTRIP wrote:
> 
> > I do not understand why a user who adheres to a CIF2 standard
> > that specifies an encoding will be 'punished'?
> > What worries me about a specification that allows any encoding
> > is that users who ignore any recommendations regarding
> > a preferred encoding might experience difficulties when e.g.
> > submitting their CIF to a journal/archive, even though they
> > have adhered to the standard (unjustly punished).
> >
> > With regard to the lack of CIF2 software support, surely CIF2
> > in general is of little use to users, not just its encoding requirements.
> > But perhaps you already have CIF2 software that can be dropped into existing
> > workflows save for the fact that it would require modification to work
> > with 'specified encodings'?
> >
> >
> >
> > ____________________________________________________________________________
> > From: Herbert J. Bernstein <yaya at bernstein-plus-sons.com>
> > To: Group for discussing encoding and content validation schemes for CIF2
> > <cif2-encoding at iucr.org>
> > Sent: Friday, 24 September, 2010 2:03:50
> > Subject: Re: [Cif2-encoding] How we wrap this up
> >
> > I see no point in a final specification that users will
> > ignore, and that will actually punish users who
> > pay attention to it.  That is not a useful standard,
> > and very damaging to the CIF brand.  We should be
> > promulgating reasonable standards that we expect will
> > in fact be adhered to, not ignored.  In the present
> > state of lack of software support and clear guidance,
> > all the prescriptive UTF8 recommendations are unhelpful
> > to users who read and pay attention to what the standard
> > says.
> >
> > On Thu, 23 Sep 2010, SIMON WESTRIP wrote:
> >
> > > I agree to some extent with what you say, Herbert, but I'm a bit more
> > > optimistic (for once) that the IUCr at least will be able to adapt to
> > > a 'specified encoding' system relatively quickly, and in the interim
> > > certainly not reject non-UTFx CIFs. I'm not convinced that whatever
> > > appears in the specification will have any influence on user practice,
> > > especially in the non-IUCr world; rather I think the success (or
> > > otherwise) of CIF2 will result from the software that implements it
> > > (as you suggest). I don't share your pessimism about the potential
> > > confusion of specifying UTF8 etc., and certainly don't think that a
> > > restricted encoding will be any more confusing than 'any encoding',
> > > given, as you say, "people may not understand what they are doing..."
> > >
> > > I suppose much of the difference in our views lies in our perception of
> > > user interest - I suspect there may even be overlap in this respect -
> > > but I'm perhaps less inclined to think that the final specification
> > > will have a marked influence on users: "they can keep doing whatever
> > > they are currently doing that is currently working for them"
> > >
> > > Anyway, it's not me you have to convince :-), and it's time I went to bed!
> > >
> > > Cheers
> > >
> > > Simon
> > >
> > > > ___________________________________________________________________________
> > > From: Herbert J. Bernstein <yaya at bernstein-plus-sons.com>
> > > To: Group for discussing encoding and content validation schemes for CIF2
> > > <cif2-encoding at iucr.org>
> > > Sent: Thursday, 23 September, 2010 22:39:24
> > > Subject: Re: [Cif2-encoding] How we wrap this up
> > >
> > > Dear Simon,
> > >
> > > >   That is precisely the point -- there is a serious and growing
> > > > problem with encodings.  The strict UTF8 proposal then makes
> > > > it a universal problem for everybody using CIF, and we do _not_
> > > > have a coherent means set up to deal with it.  The substitution
> > > > of UTF8 for ASCII in the CIF1 spec does not, in and of itself,
> > > > make anything worse for anybody currently receiving 128 character
> > > > ASCII -- it is identical, and it does not force users working
> > > > in other systems that the IUCr journals are currently coping
> > > > with to jump into the boiling water; they can keep doing whatever
> > > > they are currently doing that is currently working for them
> > > > and the IUCr.  All the journals have to do until something that
> > > > actually supports not-lower-128-ASCII is ready is to tell people that
> > > > for the journals they will still have to use Brian's reverse solidus
> > > > escape codes for anything else -- nothing major changes for most
> > > people.  If and when there really is a coherent scheme to support
> > > more native Unicode code points for journal submission with tested
> > > software, then we can do something more.  Right now, proposals
> > > 3,4 and 5 will make things worse for large numbers of users
> > > and not really make anything better for the IUCr.  It is too
> > > early in the UTF8 conversion process.
> > >
> > > =====================================================
> > > Herbert J. Bernstein, Professor of Computer Science
> > >   Dowling College, Kramer Science Center, KSC 121
> > >         Idle Hour Blvd, Oakdale, NY, 11769
> > >
> > >                 +1-631-244-3035
> > >                 yaya at dowling.edu
> > > =====================================================
> > >
> > > On Thu, 23 Sep 2010, SIMON WESTRIP wrote:
> > >
> > > > > Just because I'm still at my desk - and despite the fact that I told
> > > > > myself I would not contribute further beyond my vote - it might be
> > > > > worth mentioning that the IUCr are already experiencing problems
> > > > > related to encoding issues (in their web services), and the
> > > > > occurrence of such problems is most likely to increase when CIFs can
> > > > > contain non-ASCII text.
> > > >
> > > > Cheers
> > > >
> > > > Simon
> > > >
> > > > > __________________________________________________________________________
> > > > > From: Herbert J. Bernstein <yaya at bernstein-plus-sons.com>
> > > > > To: Group for discussing encoding and content validation schemes for CIF2
> > > > > <cif2-encoding at iucr.org>
> > > > Sent: Thursday, 23 September, 2010 21:31:24
> > > > Subject: Re: [Cif2-encoding] How we wrap this up
> > > >
> > > > Votes:
> > > >
> > > > > In terms of the requested preference voting, I vote in declining order
> > > > > of preference
> > > >
> > > > 1, then 2, then (big gap) 5, then 4, then 3.
> > > >
> > > > On absolute voting up or down in COMCIFS, I will accept 1 or 2, but will
> > > > lobby against and vote strongly against 3, 4, and 5.
> > > >
> > > > Explanation:
> > > >
> > > > > I am not opposed to Brian's recommendations.  The only reason I would vote
> > > > for 1 over 2 is that I fear Brian's recommendation would generate yet
> > > > more debate over the precise details and I believe we have more than
> > > > run out of time to get something concrete ready for the IUCr meeting.
> > > >
> > > > I am very strongly opposed to 3, 4 and 5 because I believe they will
> > > > cause confusion and delay in adoption of CIF2, while choices
> > > > 1 and 2 keep the practices the community and the IUCr have lived
> > > > > with successfully for many years, simply applying them to UTF8
> > > > instead of ASCII.  People may not understand what they are doing
> > > > in that mode, but they manage to successfully submit CIFs to the
> > > > IUCr that way, and we don't have software ready to support anything
> > > > else.
> > > >
> > > >   -- Herbert
> > > >
> > > > At 8:13 PM +0000 9/23/10, SIMON WESTRIP wrote:
> > > > >Faced with the options:
> > > > >
> > > > >1. Herbert's 'as for CIF1 proposal with UTF8 in place of ASCII'
> > > > >recently posted here and to COMCIFS.
> > > > >2. Herbert's 'as for CIF1 proposal with UTF8 in place of ASCII',
> > > > >together with Brian's *recommendations*
> > > > >3. UTF8-only as in the original draft
> > > > >4. UTF8 + UTF16
> > > > >5. UTF8, UTF16 + "local"
> > > > >
> > > > >I have to vote for (4).
> > > > >
> > > > > >When it comes down to it, I believe that the specification of a
> > > > > >'standard' should not be based on uncertainty,
> > > > > >and as 'any encoding' presents uncertainty, it should not be in the
> > > > > >standard.
> > > > > >
> > > > > >I might be accused of changing my position (I have recently
> > > > > >expressed support for flexibility and even a qualified
> > > > > >acceptance of the 'as for CIF1 proposal with UTF8 in place of
> > > > > >ASCII'), but part of the value of these discussions
> > > > > >is to question your own views in the light of others' perspectives.
> > > > >Indeed, I have found these discussions
> > > > >extremely informative and am now in a far better position to handle
> > > > >the realities of introducing non-ASCII CIFs,
> > > > >whatever the final COMCIFS decision.
> > > > >
> > > > >Cheers
> > > > >
> > > > >Simon
> > > > >
> > > > >
> > > > >
> > > > >From: "Bollinger, John C" <John.Bollinger at STJUDE.ORG>
> > > > >To: Group for discussing encoding and content validation schemes for
> > > > >CIF2 <cif2-encoding at iucr.org>
> > > > >Sent: Thursday, 23 September, 2010 15:02:25
> > > > >Subject: Re: [Cif2-encoding] How we wrap this up
> > > > >
> > > > >On Thursday, September 23, 2010 5:46 AM, SIMON WESTRIP wrote:
> > > > >
> > > > >>1. Herbert's 'as for CIF1 proposal with UTF8 in place of ASCII'
> > > > >>recently posted here and to COMCIFS.
> > > > >>2. Herbert's 'as for CIF1 proposal with UTF8 in place of ASCII',
> > > > >>together with Brian's *recommendations*
> > > > >>3. UTF8-only as in the original draft
> > > > >>4. UTF8 + UTF16
> > > > >>5. UTF8, UTF16 + "local"
> > > > >>
> > > > >>These can be broken down to:
> > > > >>
> > > > >>'any encoding' (1, 2, and 5)
> > > > >>
> > > > >>'specified encoding' (3 and 4)
> > > > >>
> > > > > >>Note I put 5 in the 'any encoding' category as I think 'local'
> > > > > >>could be interpreted as any encoding.
> > > > >
> > > > >I agree that 'local' could be interpreted as "any encoding", but I
> > > > >choose to view it as "context-dependent".  Thus a file that is
> > > > >CIF-conformant on one computer might not be CIF-conformant on
> > > > >another.  Some will find that unsatisfactory.  In my view, however,
> > > > >it is the best interpretation of CIF1's provisions; its purpose is
> > > > >thus to ensure that *all* well-formed CIF1 files are also
> > > > >well-formed CIF2 files (a context-dependent question).  Lest I
> > > > >appear to overstate the case, I acknowledge that the UTF8-only and
> > > > >UTF-8 + UTF-16 proposals would have the result that a large majority
> > > > >of well-formed CIF1 files are also well-formed CIF2 files.  The
> > > > >variations of Herb's proposal probably also make all well-formed
> > > > >CIF1 files well-formed CIF2 files, but I disfavor them on different
> > > > >grounds (mostly that they are too open to differing interpretations).
> > > > >
> > > > >[...]
> > > > >
> > > > >>In either case, a degree of work will be required to accommodate
> > > > >>user practice and the legacy of CIF1.
> > > > >
> > > > >I think the entire question reduces to which accommodations for the
> > > > >CIF1 legacy are assured by CIF2 vs. which will constitute
> > > > >non-standard extensions.  I don't think that individual responses,
> > > > >from Chester for example, are likely to depend much on which option
> > > > >is adopted, but I do think the overall consistency of responses will
> > > > >be affected.  Thus I favor precision of the specification and
> > > > >coverage of the likely uses, in hope of achieving the greatest
> > > > >consistency of response.
> > > > >
> > > > >I doubt this has swayed anyone's opinion, so please consider it an
> > > > >advance explanation for my upcoming vote (inasmuch as I rely on
> > > > >James's previous assurance that voting rights in this context are
> > > > >not restricted to COMCIFS members).
> > > > >
> > > > >
> > > > >Best Regards,
> > > > >
> > > > >John
> > > > >--
> > > > >John C. Bollinger, Ph.D.
> > > > >Department of Structural Biology
> > > > >St. Jude Children's Research Hospital
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > =====================================================
> > > >   Herbert J. Bernstein, Professor of Computer Science
> > > >     Dowling College, Kramer Science Center, KSC 121
> > > >         Idle Hour Blvd, Oakdale, NY, 11769
> > > >
> > > >                   +1-631-244-3035
> > > >                   yaya at dowling.edu
> > > > =====================================================
> > > >
> > > >
> > >
> > >
> >
> >
> 
>

