New accent modifier types?

Fri Mar 16 17:07:34 GMT 2007

Joe

On Wed, Mar 07, 2007 at 12:40:05PM -0500, Joe Krahn wrote:
> In terms of simple CIF markup, I think that a few extra accent types are
> OK, but the basic syntax should be simple. That is why I ignored
> Cyrillic, Arabic, etc., because they are just too different from plain
> ASCII.
> 
> In fact, my hope is to actually simplify the syntax by converting the
> ones with no backslash to a more uniform format. I think it is also a
> good idea to stop using the <B> and <I> html tags. Of course, having a
> means to do proper HTML as an alternative makes this more reasonable.

I've now had a chance to discuss these ideas with Simon, our
publcif developer. He has agreed to look at your proposal for
extended textual markup and report on the practicality of
implementing it within publcif.

We also chatted about the use of MIME headers within text fields
to allow multipart content and/or to delegate complex formatting
to external handlers. The problem that we immediately run into
is what the return value from the handler should be. The input
to the handler is simple - it's just a string comprising the
portion of the text field delegated to it; so you pass a chunk of TeX
to a TeX handler. The TeX handler may need to wrap this in
a preamble and postamble (e.g. "\end"); but what does it return
to the invoking application? TeX itself will create a
device-independent (.dvi) file external to the application,
or it could write PostScript or PDF - again, external
representations of a rendered document. What we typically need
is a transformed string that can be reinserted into the output
stream from the invoking application.
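
To make the string-in, string-out requirement concrete, here is a minimal
Python sketch of such a handler interface. Everything named in it is an
assumption for illustration: the "text/x-tex" type label, the HANDLERS
registry, and the functions are hypothetical and are not part of publcif
or of any CIF specification. The TeX handler here does no typesetting; it
only shows the postamble wrapping and returns an escaped string in place
of a real conversion.

```python
import html
from typing import Callable, Dict

# A handler takes the delegated portion of a text field and returns a
# string that the invoking application can reinsert into its output.
Handler = Callable[[str], str]


def wrap_for_tex(fragment: str) -> str:
    """Append the postamble a plain TeX run would need (no preamble assumed)."""
    return fragment.rstrip("\n") + "\n\\end\n"


def tex_handler(fragment: str) -> str:
    """Hypothetical TeX handler: returns a string, not an external file."""
    source = wrap_for_tex(fragment)
    # Placeholder for a real TeX-to-HTML conversion step; the wrapped
    # source is only HTML-escaped so that it can be embedded safely.
    return '<pre class="tex">' + html.escape(source) + "</pre>"


# Hypothetical registry mapping a content type to its handler.
HANDLERS: Dict[str, Handler] = {"text/x-tex": tex_handler}


def render(content_type: str, fragment: str) -> str:
    """Dispatch a fragment to its handler; escape it if none is registered."""
    handler = HANDLERS.get(content_type)
    if handler is None:
        return html.escape(fragment)
    return handler(fragment)
```

The point of the design is only that every handler has the same signature,
so the application never needs to know whether the string came from TeX or
from any other external formatter.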

Simon suggested that the TeX handler just return an image - a GIF,
say, of the rendered page, but that won't do for all sorts of
reasons. You lose the ability to embed hyperlinks, to index
content, to convert to SGML/XML; and you may end up with images
that are larger than the available location within the document
page that you are constructing.

In the case of publcif, which creates as the first stage of its
output an HTML document, we shall look at tth (a TeX-to-HTML
converter) as the TeX handler. However, we already know from
using tth in the journal production workflow that its capabilities
are limited (HTML itself has rather poor facilities for rendering
maths), so our expectations are not high.

This isn't an academic exercise for us. Until very recently,
CIF-based articles in Acta C and E used TeX as the typesetting
engine and an unofficial hack allowed embedded TeX to be
integrated in the typesetting process. (If you're interested,
the hack consisted of putting %T as the first non-blank characters
of the text field. It would be easy to detect this and convert it
to a MIME-delimited form if we do decide to go that way. The hack
was rather quietly documented on page 4 of A Guide to CIF For Authors
http://www.iucr.org/iucr-top/cif/doc/cifguide.pdf)
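
As a rough illustration of how simple that detection and conversion could
be, here is a Python sketch. The MIME-delimited form it emits (a single
Content-Type header, a blank line, then the body) and the "text/x-tex"
label are assumptions; no such form has been agreed. The function names
are hypothetical as well.

```python
def is_tex_field(text: str) -> bool:
    """True if %T is the first non-blank content of the text field."""
    return text.lstrip().startswith("%T")


def to_mime_form(text: str, content_type: str = "text/x-tex") -> str:
    """Rewrite a %T-marked field into an assumed MIME-style form.

    Fields without the marker are returned unchanged.
    """
    if not is_tex_field(text):
        return text
    # Drop the two-character marker and any whitespace that follows it.
    body = text.lstrip()[2:].lstrip()
    return f"Content-Type: {content_type}\n\n{body}"
```

Anything else on the %T line would be kept as part of the body in this
sketch; a real converter would have to decide how to treat it.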

There are many good reasons for us to convert from ciftex to
publcif as our composition engine, but I fear that it will be a
retrograde step if we lose the potential for rich markup with
TeX that we previously had.

Brian
