[Cif2-encoding] How we wrap this up
Herbert J. Bernstein
yaya at bernstein-plus-sons.com
Mon Sep 27 17:23:32 BST 2010
Dear Simon,
We do not seem to be communicating effectively. Do you have a
Skype account? We really need a meeting.
Regards,
Herbert
At 3:27 PM +0000 9/27/10, SIMON WESTRIP wrote:
>I see nothing wrong with a strategy to introduce CIF2 if necessary.
>My initial thoughts are that the current 'as for CIF1...' description
>is not best suited as base specification on which to build full
>unicode support, should such a strategy be pursued.
>
>However, I will reflect on this along with recent contributions from
>James and John...
>
>Cheers
>
>Simon
>
>
>
>From: Herbert J. Bernstein <yaya at bernstein-plus-sons.com>
>To: Group for discussing encoding and content validation schemes for
>CIF2 <cif2-encoding at iucr.org>
>Sent: Monday, 27 September, 2010 14:45:16
>Subject: Re: [Cif2-encoding] How we wrap this up
>
>The problem is that options 3, 4 and 5 specifically prescribe the
>use of Unicode characters (that is the entire point of those
>options -- and that is the point in dispute -- whether we should
>be prescribing UTF8 or using it as we now use ASCII, as a way to
>be clear what we are talking about as in CIF1) and we simply are not
>ready to deal with such a requirement yet.
>
>I take the blame for starting this discussion many years ago when
>I simply asked for just what my motion says, that we start using
>UTF8 in the same way we had been using ASCII. Unfortunately
>this discussion has turned into a strong push to focus CIF on
>that particular encoding, stop using Brian's elides, etc. With
>the current weak state of software support for CIF and the large
>investment at the IUCr and at the PDB in current workflows, I
>think it would be a very disruptive and expensive change to make
>right now. God and the Devil are in the details.
>
>Note that I am _not_ basing this argument on imgCIF. At this point
>it appears, unfortunately, that CIF2 and imgCIF will have to diverge.
>If we have enough face-to-face discussions, perhaps we can bring
>them together again, as we did in 1998, but that is an even more
>difficult discussion than the one we need to have on encodings.
>What I think we should do is to go at this in incremental stages:
>
>1. Make the transition from CIF1 to CIF2 using new dictionaries
>but allowing most data files to remain unchanged, and providing
>simple algorithmic transformations for the rest, but keeping
>most of the current semantic extensions that we have in CIF1,
>focusing our energy on getting the new dictionaries used and
>making use of dREL;
>
>2. Work on a CIF2.1 that, by creative and well-supported use
>of Unicode, allows for a well organized transition from Brian's
>elides to use of Unicode characters;
>
>3. Then working in that context, whatever it turns out to be,
>work on having imgCIF make the transition to CIF2 in some
>reasonably compatible way.
>
>I see how to do item 1 for next summer. I don't see how to do 2 and
>3 in that time frame, though I am sure we could make a dent in
>them if we could meet face to face. email tends to stiffen too
>many positions.
>
>Regards,
> Herbert
>
>=====================================================
>Herbert J. Bernstein, Professor of Computer Science
> Dowling College, Kramer Science Center, KSC 121
> Idle Hour Blvd, Oakdale, NY, 11769
>
> +1-631-244-3035
> yaya at dowling.edu
>=====================================================
>
>On Mon, 27 Sep 2010, SIMON WESTRIP wrote:
>
>> Dear Herbert
>>
>> I do not understand why it is *only* options 3, 4 or 5 that allow users to
>> start using
>> unicode characters?
>>
>> More generally, are you suggesting that the use of anything but ASCII in a
>> data value is only allowed if
>> e.g. the dictionary definition of the data item permits, or even only if the
>> IUCr says that's OK?
>>
>> Fundamentally, I'm starting to infer that the purpose of the 'as for
>> CIF1...' approach to encoding is
>> to open the door to full unicode support, but not actually let anyone cross
>> the threshold?
>>
>>
>> Cheers
>>
>> Simon
>>
>> ____________________________________________________________________________
>> From: Herbert J. Bernstein <yaya at bernstein-plus-sons.com>
>> To: Group for discussing encoding and content validation schemes for CIF2
>> <cif2-encoding at iucr.org>
>> Sent: Monday, 27 September, 2010 11:48:49
>> Subject: Re: [Cif2-encoding] How we wrap this up
>>
>> Dear Simon,
>>
>> Under the CIF2 specification with UTF8 in place of ASCII there is
>> _no_ change in the use of elided ASCII sequences to represent non-ASCII
>> characters until and unless the IUCr publications office decides that,
>> for that particular application, they are ready to accept something
>> new.
>>
>> It is _only_ if you go forward with options 3, 4 or 5 that you
>> are giving the green light to users to do precisely what you are
>> concerned about -- using the unicode characters instead
>> in possibly strange admixtures that nobody is ready to process.
>>
>> Remember, under the CIF2 specification as now written, it is
>> _not_ part of the CIF2 specification to determine the handling
>> of the characters in quoted strings other than to ensure that
>> those strings do not contain illegal characters from the point
>> of view of CIF2. Dealing with the validity of particular character
>> sequences in strings users provide is, just as in CIF1, the
>> responsibility of the application (i.e. the IUCr journal flows
>> or the PDB archiving flows).
>>
>> My apologies to James, who I know is trying to do what he believes
>> to be right, but I believe James has things backwards -- the "deep
>> breath" is provided by my proposal -- taking the time to properly engineer
>> the use of the extra characters UTF8 allows us to discuss clearly,
>> while James' push for an immediate prescriptive use of UTF8 with
>> prescriptions that differ drastically from what has been adopted
>> by all other frameworks (HTML, XML, python, etc.) in ways that
>> are untested and unsupported by most existing software is
>> the untimely rush to judgement.
>>
>> I beg you to support options 1 and/or 2 to allow CIF2 to go forward
>> in all other respects while we all take a deep breath and deal
>> with the tricky issue you raised slowly and carefully without the
>> pressure of trying to have CIF2 itself ready for next summer.
>>
>> Regards,
>> Herbert
>>
>> At 9:34 AM +0000 9/27/10, SIMON WESTRIP wrote:
>> >I was not so concerned about invalidating existing CIFs, or even the
>> >likelihood
>> >that users will continue to write e.g. 'f\'oo' - this is a syntax
>> >error in CIF2 that is readily recoverable.
>> >
>> >Rather there is a large group of CIF1 users that are in the habit of
>> >using elided ASCII sequences to
>> >represent non-ASCII characters. With CIF2 these users will be able
>> >to use the unicode character itself.
>> >So we might end up with a mixture of escaped sequences and unicode
>> >characters (e.g. a user may have a keyboard shortcut
>> >for an accented character that forms part of their name, but might
>> >still resort to \a for alpha, under the assumption that \a is still
>> >valid because CIF2 is basically the same as CIF1, and, rightly or
>> >wrongly, they perceive the eliding mechanism as part of
>> >CIF syntax).
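
The mixture described above can be made concrete with a small sketch. Only the use of \a for alpha comes from this thread; the accent codes are an assumption taken from the CIF1 text conventions, and the function name `expand_elides` and both lookup tables are hypothetical and deliberately incomplete.

```python
import unicodedata

# Hypothetical, incomplete tables; only \a for alpha is mentioned in the thread.
GREEK = {"a": "\u03b1", "b": "\u03b2", "g": "\u03b3"}
# Accent code -> Unicode combining mark (assumed from CIF1 text conventions).
ACCENTS = {"'": "\u0301", "`": "\u0300", "^": "\u0302"}


def expand_elides(value):
    """Replace recognised backslash codes; pass everything else through."""
    out = []
    i = 0
    while i < len(value):
        if value[i] == "\\" and i + 1 < len(value):
            code = value[i + 1]
            if code in ACCENTS and i + 2 < len(value):
                # Compose base letter + combining accent into one character.
                composed = unicodedata.normalize("NFC", value[i + 2] + ACCENTS[code])
                out.append(composed)
                i += 3
                continue
            if code in GREEK:
                out.append(GREEK[code])
                i += 2
                continue
        out.append(value[i])
        i += 1
    return "".join(out)


# A value mixing a typed accented character with elided codes.
mixed = "Ren\u00e9 and Ren\\'e measured \\a-quartz"
print(expand_elides(mixed))
```

A value that happens to contain a backslash for other reasons, such as a file path, would be altered by the same pass, so an application would need to know which data items are meant to carry these codes.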
>> >
>> >I think this is an issue where we can't afford to take an 'as for
>> >CIF1...' approach, especially as the CIF1 specification
>> >isn't entirely satisfactory (e.g. there's an example in the
>> >line-folding protocol that uses elides in a file path to make a
>> >point,
>> >but actually these elides may easily be interpreted as escape
>> >sequences), and as the encoding issue is very much concerned with
>> >user practice, the large group of users that currently use elided
>> >character codes need to be aware of what the situation is in
>> >CIF2.
>> >
>> >I'm not convinced this issue should be left for discussion later;
>> >it is relevant when considering how the move beyond ASCII is specified.
>> >
>> >Cheers
>> >
>> >Simon
>> >
>> >
>> >
>> >
>> >From: Herbert J. Bernstein <yaya at bernstein-plus-sons.com>
>> >To: Group for discussing encoding and content validation schemes for
>> >CIF2 <cif2-encoding at iucr.org>
>> >Sent: Sunday, 26 September, 2010 23:14:55
>> >Subject: Re: [Cif2-encoding] How we wrap this up
>> >
>> >Dear Simon,
>> >
>> > The current CIF2 spec, with or without the changes I have suggested
>> >to temporarily resolve the encoding issue is at best vague and
>> >confusing on the elide character issue. The interacting issue on
>> >which the CIF2 spec
>> >is clear is that we are changing the handling of quoted strings so
>> >that they end on the first occurrence of the quoting character, leaving
>> >the handling of elides to the calling application.
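
A minimal sketch of the change in quoted-string termination described above. It assumes the CIF1 rule that a closing quote only counts when followed by whitespace or the end of input, and models the CIF2 rule as "stop at the first occurrence of the quoting character". The function names are hypothetical and line-length and multi-line rules are ignored.

```python
WHITESPACE = " \t\r\n"


def read_quoted_cif1(text, start=0):
    """CIF1-style: the closing quote must be followed by whitespace or end of input."""
    quote = text[start]
    i = start + 1
    while i < len(text):
        at_end = i + 1 == len(text)
        if text[i] == quote and (at_end or text[i + 1] in WHITESPACE):
            return text[start + 1:i], i + 1
        i += 1
    raise ValueError("unterminated quoted string")


def read_quoted_cif2(text, start=0):
    """CIF2-style as described in this thread: stop at the first quote character."""
    quote = text[start]
    end = text.find(quote, start + 1)
    if end == -1:
        raise ValueError("unterminated quoted string")
    return text[start + 1:end], end + 1


sample = "'f\\'oo'"  # the characters 'f\'oo' as they would appear in a file
print(read_quoted_cif1(sample))  # whole value is read, backslash and quote included
print(read_quoted_cif2(sample))  # value stops early, leaving oo' unparsed
```

Both readers return the value together with the index just past the closing quote, so the text left over after the CIF2-style read is what a parser would report as a syntax error.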
>> >
>> > This will be a problem -- the change from CIF1 in the termination of
>> >quoted strings along with the absence of a way of eliding the quotes
>> >will invalidate a significant number of existing CIFs without any simple
>> >mechanism to recover. Rather than reopen another endless discussion,
>> >I would suggest we simply add the Python string concatenation character
>> >"+" to ensure we can map all current CIF1 files and use Brian's common
>> >semantic features for the moment. We can then deal with the full elides
>> >discussion at a future date.
>> >
>> > Regards,
>> > Herbert
>> >
>> >
>> >
>> >
>> >
>> >At 1:40 PM -0700 9/26/10, SIMON WESTRIP wrote:
>> >>Dear all
>> >>
>> >>While reviewing my hypothetical 'to do' list for implementing CIF2
>> >>in current software, I realized that
>> >>the issue of current support for elided character codes hasn't really
>> >>been addressed in the context of CIF2.
>> >>My 'to do' list contains notes that software could treat them as
>> >>keyboard shortcuts, and their use could be
>> >>defined in the dictionary. However, that was based on a distinct
>> >>difference between CIF1 and CIF2,
>> >>while the current arguments for 'as for CIF1...' suggest that the
>> >>distinction between CIF1 and CIF2
>> >>should almost be imperceptible.
>> >>
>> >>How is this issue to be addressed in the specification?
>> >>
>> >>Cheers
>> >>
>> >>Simon
>> >>
>> >>
>> >>
>> >>From: Herbert J. Bernstein
>> >><yaya at bernstein-plus-sons.com>
>> >>To: Group for discussing encoding and content validation schemes for
>> >>CIF2 <cif2-encoding at iucr.org>
>> >>Sent: Saturday, 25 September, 2010 20:37:46
>> >>Subject: Re: [Cif2-encoding] How we wrap this up
>> >>
>> >>Thank you for your cooperation. -- Herbert
>> >>
>> >>=====================================================
>> >>Herbert J. Bernstein, Professor of Computer Science
>> >> Dowling College, Kramer Science Center, KSC 121
>> >> Idle Hour Blvd, Oakdale, NY, 11769
>> >>
>> >> +1-631-244-3035
>> >> yaya at dowling.edu
>> >>=====================================================
>> >>
>> >>On Sat, 25 Sep 2010, SIMON WESTRIP wrote:
>> >>
>> >>> OK - as promised, I wont pursue the matter :-)
>> >>>
>> >>>
>> >>>
>> >>>____________________________________________________________________________
>> >>> From: Herbert J. Bernstein <yaya at bernstein-plus-sons.com>
>> >>> To: Group for discussing encoding and content validation schemes for CIF2
>> >>> <cif2-encoding at iucr.org>
>> >>> Sent: Saturday, 25 September, 2010 19:18:54
>> >>> Subject: Re: [Cif2-encoding] How we wrap this up
>> >>>
>> >>> Dear Simon,
>> >>>
>> >>> Unfortunately, that is likely to take us back into our infinite loop or
>> >>> into a diverging spiral. Right now, we would have UTF8 as no more or less a
>> >>> default for CIF2 than ASCII is for CIF1 -- i.e. a not too bad first guess as
>> >>> the likely default encoding for any given CIF, but not a formal constraint.
>> >>> I would suggest we leave the wording in that imprecise state, get CIF2 out
>> >>> and accepted and then work further on the encoding issue.
>> >>>
>> >>> Regards,
>> >>> Herbert
>> >>>
>> >>> =====================================================
>> >>> Herbert J. Bernstein, Professor of Computer Science
>> >>> Dowling College, Kramer Science Center, KSC 121
>> >>> Idle Hour Blvd, Oakdale, NY, 11769
>> >>>
>> >>> +1-631-244-3035
>> >>> yaya at dowling.edu
>> >>> =====================================================
>> >>>
>> >>> On Sat, 25 Sep 2010, SIMON WESTRIP wrote:
>> >>>
>> >>> > Dear all
>> >>> >
>> >>> > In the event that CIF2 adopts the 'any encoding' approach, would there be
>> >>> > any objections to explicitly defining a default encoding in the
>> >>> > specification, to be defaulted to when there were no indications to the
>> >>> > contrary? At worst this would give CIF2 service providers an excuse to
>> >>> > interpret CIFs as e.g. UTF8 if they couldn't determine the encoding by
>> >>> > other means - but such intolerant service providers would soon find that
>> >>> > their service is not successful - while at best this might raise awareness
>> >>> > of the issues regarding encoding once non-ASCII is used in a CIF.
>> >>> > Essentially, it does not require users to change their working practices,
>> >>> > which is one of the main arguments for 'any encoding'.
>> >>> >
>> >>> > So, CIF2 would remain 'any encoding', and specifications in terms of e.g.
>> >>> > "Herbert's as for CIF1..." might only require a single sentence to define
>> >>> > the default after stating what the 'preferred' encoding was; the proposal
>> >>> > might be phrased as "Herbert's as for CIF1..." + "explicit default
>> >>> > encoding"?
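
The "explicit default encoding" idea above might look like the following in a reading application. This is only a sketch under stated assumptions: the default is taken to be UTF-8 as suggested in the message, the `declared` argument stands in for whatever "indications to the contrary" an application has, and the names `DEFAULT_ENCODING` and `decode_cif` are hypothetical.

```python
DEFAULT_ENCODING = "utf-8"  # assumed default, used only when nothing else is known


def decode_cif(raw, declared=None):
    """Decode CIF bytes, falling back to the default when no encoding is declared.

    Returns the decoded text and the encoding that was actually used.
    """
    encoding = declared or DEFAULT_ENCODING
    try:
        return raw.decode(encoding), encoding
    except UnicodeDecodeError as exc:
        # The file is not valid in the chosen encoding; the caller has to
        # determine the encoding by other means rather than guess silently.
        raise ValueError(f"CIF content is not valid {encoding}") from exc


# Pure ASCII content decodes the same way under the UTF-8 default.
print(decode_cif(b"data_example\n"))

# A Latin-1 accented character is rejected under the default ...
latin1_bytes = "_author 'Ren\u00e9'\n".encode("latin-1")
try:
    decode_cif(latin1_bytes)
except ValueError as err:
    print(err)

# ... but accepted when the encoding is indicated.
print(decode_cif(latin1_bytes, declared="latin-1"))
```

Strict decoding is used on purpose: a lenient fallback would hide exactly the kind of silent character substitution that the thread is concerned about.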
>> >>> >
>> >>> > I do not wish to prolong this debate - if there are objections I will not
>> >>> > launch into an endless round of exchanges that cover the same ground that
>> >>> > has led us this far.
>> >>> >
>> >>> > Cheers
>> >>> >
>> >>> > Simon
>> >>> >
>> >>> >
>> >>> >
>> >>> >
>> >>> >
>> >>> >
>> >>> >____________________________________________________________________________
>> >>> > From: SIMON WESTRIP <simonwestrip at btinternet.com>
>> >>> > To: Group for discussing encoding and content validation schemes for CIF2
>> >>> > <cif2-encoding at iucr.org>
>> >>> > Sent: Friday, 24 September, 2010 20:10:13
>> >>> > Subject: Re: [Cif2-encoding] How we wrap this up
>> >>> >
>> >>> > Dear James
>> >>> >
>> >>> > As you may have gathered I have been reconsidering my position on this
>> >>> > issue.
>> >>> > Please forgive me, but I would like to change my vote if that is OK, in
>> >>> > favour of the 'any encoding' camp.
>> >>> > This apparent U-turn is not a response to recent contributions; rather it
>> >>> > is the outcome of a meeting I had this morning where I demonstrated some
>> >>> > new software to the Managing Editor of IUCr journals.
>> >>> >
>> >>> > By way of explanation:
>> >>> >
>> >>> > I have been developing a new docx template which the IUCr editorial office
>> >>> > is shortly to release for use by authors. The template will be packaged
>> >>> > with some tools to extract data from CIFs and tabulate them in the Word
>> >>> > document, e.g. open an mmCIF, click a button, and standard tables
>> >>> > populated with data from the CIF will be included in the document, acting
>> >>> > as table templates for the author to edit as appropriate for their
>> >>> > manuscript.
>> >>> >
>> >>> > Inclusion of the mmCIF tools is part of an unofficial policy to 'coax'
>> >>> > biologists to start using/accepting mmCIF as a useful medium, rather than
>> >>> > as a product of their deposition to the PDB, and to encourage them to
>> >>> > become comfortable with passing mmCIFs between applications, and even to
>> >>> > edit the things (in the same way as the core-CIF community treats CIFs).
>> >>> > For example, our perception is that there is no reason why an author
>> >>> > should not feel free to take an mmCIF that has been created by e.g.
>> >>> > pdb_extract and populate it using third-party software before uploading
>> >>> > to the PDB for deposition.
>> >>> >
>> >>> > This cause would not be furthered by effectively invalidating an mmCIF if
>> >>> > it were not to be encoded in one of the specified encodings.
>> >>> >
>> >>> > So although I am uneasy about a specification that propagates uncertainty,
>> >>> > I'm also uneasy about alienating users, especially when we are struggling
>> >>> > to change their mindset as in the case of the biological community (my
>> >>> > perception of the biological community's attitude to mmCIF is based on
>> >>> > feedback from authors/coeditors to IUCr journals).
>> >>> >
>> >>> > Granted this may not be the most compelling argument in favour of 'any
>> >>> > encoding', but recognizing the hurdles that may have to be overcome once
>> >>> > we move beyond ASCII whatever the CIF2 specification, I support 'any
>> >>> > encoding' as 'a means to an end'.
>> >>> >
>> >>> > I will not provide my preferences in terms of the numbered options until
>> >>> > you say so; after all, I have already voted and all this has to be signed
>> >>> > off by COMCIFs in any case.
>> >>> >
>> >>> > Cheers
>> >>> >
>> >>> > Simon
>> >>> >
>> >>> >
>> >>> >
>> >>> >
>> >>> >____________________________________________________________________________
>> >>> > From: "Bollinger, John C" <John.Bollinger at STJUDE.ORG>
>> >>> > To: Group for discussing encoding and content validation schemes for CIF2
>> >>> > <cif2-encoding at iucr.org>
>> >>> > Sent: Friday, 24 September, 2010 14:50:57
>> >>> > Subject: Re: [Cif2-encoding] How we wrap this up
>> >>> >
>> >>> > Dear Simon,
>> >>> >
>> >>> > It is exactly this sort of issue that drove me to support more permissive
>> >>> > encoding rules and ultimately to devise the UTF-8 + UTF-16 + local
>> >>> > proposal.
>> >>> >
>> >>> > Do please think about the considerations Herb raised. As you reconsider
>> >>> > your votes, I urge you also to ask yourself what, *precisely*, a "text
>> >>> > file" is, and to consider whether your answer is functionally different
>> >>> > from my "local". If you decide not, then please consider what that answer
>> >>> > implies about CIF2 support of UTF-8 and UTF-16 (which evidently you favor)
>> >>> > under each option on the table, especially for CIFs containing non-ASCII
>> >>> > characters. Whatever you decide about the meaning of "text file", please
>> >>> > consider whether reasonable people might reach a different conclusion, as
>> >>> > I assert they might do, and to what extent the standard needs to address
>> >>> > that.
>> >>> >
>> >>> >
>> >>> > Regards,
>> >>> >
>> >>> > John
>> >>> > --
>> >>> > John C. Bollinger, Ph.D.
>> >>> > Department of Structural Biology
>> >>> > St. Jude Children's Research Hospital
>> >>> >
>> >>> >
>> >>> > >From: cif2-encoding-bounces at iucr.org
>> >>> > >[mailto:cif2-encoding-bounces at iucr.org] On Behalf Of SIMON WESTRIP
>> >>> > >Sent: Friday, September 24, 2010 7:53 AM
>> >>> > >To: Group for discussing encoding and content validation schemes for CIF2
>> >>> > >Subject: Re: [Cif2-encoding] How we wrap this up. .
>> >>> > >
>> >>> > >Dear Herbert
>> >>> > >
>> >>> > >Not for the first time, I find your argument persuasive. Brian's vote and
>> >>> > >explanation have also raised some questions that I would like to look
>> >>> > >into.
>> >>> > >
>> >>> > >I will confirm or otherwise my vote as soon as possible, assuming that is
>> >>> > >OK with James and assuming that this round of votes might wrap this up.
>> >>> > >
>> >>> > >Cheers
>> >>> > >
>> >>> > >Simon
>> >>> > >
>> >>> > >________________________________________
>> >>> > >From: Herbert J. Bernstein <yaya at bernstein-plus-sons.com>
>> >>> > >To: Group for discussing encoding and content validation schemes for CIF2
>> >>> > ><cif2-encoding at iucr.org>
>> >>> > >Sent: Friday, 24 September, 2010 13:17:14
>> >>> > >Subject: Re: [Cif2-encoding] How we wrap this up
>> >>> > >
>> >>> > >If he ignores the standard, in most cases all he has to do to comply with
>> >>> > >CIF2 is to run whatever applications he currently runs to produce CIF1
>> >>> > >and, perhaps, in some cases, run a minor edit pass at the end, to convert
>> >>> > >for the minor syntactic differences and/or changed tags required to
>> >>> > >comply with CIF2 and the new dictionaries, but he is unlikely to have to
>> >>> > >do anything to deal with the messy business of whether his encoding is
>> >>> > >really a proper UTF8 encoding or not.
>> >>> > >
>> >>> > >The punishment if he tries to comply, is that he has to totally uproot
>> >>> > >and reconfigure the environment in which he produces CIFs from whatever
>> >>> > >he is currently doing to create an environment in which he can reliably
>> >>> > >create and, more importantly, transmit compliant UTF8 files. This can be
>> >>> > >very tricky if he does only a partial job, say fudging in one special
>> >>> > >application (yet to be written), because if he stays with his old system,
>> >>> > >all kinds of tools will keep trying to transcode whatever he has produced
>> >>> > >back to whatever his system considers a standard. Those of us who have
>> >>> > >files, applications and tools that have lived through several generations
>> >>> > >of Macs are living proof of the problem. Macs now have excellent UTF8/16
>> >>> > >unicode support, but every once in a while in working with a unicode file
>> >>> > >I find it has been strangely and unexpectedly converted to something
>> >>> > >else, and it can be really tricky to spot when the unaccented roman text
>> >>> > >part has been left untouched but just a few accented letters have gotten
>> >>> > >different accents.
>> >>> > >
>> >>> > >Mandating UTF8 is simply trying to shift a serious software problem from
>> >>> > >the central handlers of CIF (IUCr, PDB, etc.) to the external users. Most
>> >>> > >users will probably have the good sense to simply ignore the demand and
>> >>> > >leave the burden just where it is now. A few sophisticated users will
>> >>> > >probably adapt with no trouble, but the punishment for those users who
>> >>> > >blindly follow orders before we have a complete multiplatform supporting
>> >>> > >infrastructure in place by mandating UTF8 is severe, expensive and
>> >>> > >undeserved. Until and unless we have developed solid support, we will
>> >>> > >just be alienating people from CIF. I will continue to oppose such a move.
>> >>> >
>> >>> > [...]
>> >>> >
>> >>> >
>> >>> > Email Disclaimer: www.stjude.org/emaildisclaimer
>> >>> >
>> >>> > _______________________________________________
>> >>> > cif2-encoding mailing list
>> >>> > cif2-encoding at iucr.org
>> >>> > http://scripts.iucr.org/mailman/listinfo/cif2-encoding
>> >>> >
>> >>> >
>> >>>
>> >>>
>> >>
>> >
>> >
>> >--
>> >=====================================================
>> > Herbert J. Bernstein, Professor of Computer Science
>> > Dowling College, Kramer Science Center, KSC 121
>> > Idle Hour Blvd, Oakdale, NY, 11769
>> >
>> > +1-631-244-3035
>> >
>> > yaya at dowling.edu
>> >=====================================================
>> >
>> >
>>
>>
>> --
>> =====================================================
>> Herbert J. Bernstein, Professor of Computer Science
>> Dowling College, Kramer Science Center, KSC 121
>> Idle Hour Blvd, Oakdale, NY, 11769
>>
>> +1-631-244-3035
>> yaya at dowling.edu
>> =====================================================
>>
>>
>
>_______________________________________________
>cif2-encoding mailing list
>cif2-encoding at iucr.org
>http://scripts.iucr.org/mailman/listinfo/cif2-encoding
--
=====================================================
Herbert J. Bernstein, Professor of Computer Science
Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769
+1-631-244-3035
yaya at dowling.edu
=====================================================