Advice on COMCIFS policy regarding compatibility of CIF syntax with other domains.
Bollinger, John C
John.Bollinger at STJUDE.ORG
Fri Mar 4 17:04:41 GMT 2011
On Friday, March 04, 2011 8:21 AM, Herbert J. Bernstein wrote:
>Before turning to the lower numbered principles, I would suggest we
>discuss item 3, because I believe it conflicts with item 1, which
>discusses features in terms of being "cumbersome," "significant,"
>being "simpler," "desired behavior," "reasonable ease," and "rough
>consensus," all of which have strong psychological and other
>human factors components that may be difficult to quantify
>and make into "technical issues."
That's a very reasonable objection. Although I do think that some of the ideas in item 1 are fundamentally technical and can be expressed in more explicitly technical terms, I'm not at all sure that all of them are in that category. Certainly some of them at least involve subjective technical evaluation.
>Precisely because we don't fully understand the non-technical issues
>in the design of information systems (and, I suspect, in almost all
>systems), one of the accepted principles of software engineering
>is to clearly identify all stakeholders, bring them into the discussion
>and work to achieve their "buy-in". Therefore I propose that we
>replace principle 3 with
>3. The stakeholders impacted by any change should be clearly
>identified and the proposed changes should be fully and openly
>discussed with them in an effort to achieve their buy-in to the change,
>and the change should not go forward in absence of such buy-in
>absent pressing technical reasons for making the change over
I find that attractive in principle (no pun intended), but it is unclear to me how it should be applied in practice. At what points and at which level(s) should stakeholder buy-in be sought or required? Is explicit affirmation required, or simply lack of objection? How much objection should be tolerated before a proposition is rejected? How should conflicting stakeholder opinions be handled?
In our case, the DDLm group is open to all interested parties, and records of its proceedings are available to everyone. I would be delighted for the group to be advertised to all CIF stakeholders, but to the extent that any stakeholder or group of stakeholders knowingly declines to participate, it is inappropriate to insist on having their affirmation of details of the group's work.
Inasmuch as COMCIFS is a closed group, there is a better argument for seeking external stakeholder buy-in for decisions made at this level. Those decisions might include the CIF 2 / DDLm principles now under discussion, the general design imperatives COMCIFS may choose for CIF 2 and DDLm (such as the requirement that CIF 2 syntax be able to express all data values), and the final syntax and DDLm specifications. A common model for seeking such buy-in is to post documents for a period of public review and comment prior to making a decision. I would endorse COMCIFS doing so with the CIF 2 syntax specification and the DDLm specifications.
To the extent that COMCIFS may adopt a public review policy for CIF 2 and DDLm, however, it should be prepared for the likelihood that some of the commentary it receives will mirror points previously discussed by COMCIFS or the DDLm group. It should moreover recognize that critical comments are more likely than affirmative ones, so the proportion of critical comments on any particular issue is not necessarily a good measure of general stakeholder opinion. Furthermore, if there is significant community participation, then it is unlikely that any suitable specifications can be adopted without any objections.
> In principle 2, we reference CIF1. I believe that should be CIF1.1.
Yes. I suspect that's what James meant, but why leave room for uncertainty?
>Now I would like to turn to principle 1.(ii): the feature provides
>significant new functionality that is widely applicable to most
>This principle would prevent CIF from having features which
>support any one scientific domain. Under this principle, we never
>would have had DDL2 and mmCIF, nor imgCIF. I would suggest changing
>this principle to:
>1.(ii). the feature provides significant new functionality for
>some scientific application domain, and does not interfere with
>the use of CIF in other scientific application domains.
Do note that principle 1 is expressly about CIF *syntax*. DDLs and dictionaries are outside its scope, so 1.(ii) is not directly relevant to DDL2 / mmCIF / imgCIF or their kindred.
Nevertheless, the point is well taken. If there were a syntactic feature that was important only for one particular domain, then it might still be reasonable to include it. However this principle is phrased, though, I suggest that it be taken only as a necessary condition for including a feature, not automatically as a sufficient one.
>Finally, let us consider principle 1.(i): implementation or use of
>equivalent behaviour at dictionary level is either significantly more
>cumbersome or not possible;
>Depending on how we interpret the non-technical word "cumbersome", this
>may create the impression that we will require all uses of CIF
>to require use of dictionaries. I would suggest instead:
>1.(i): If it is feasible to implement the desired behavior by
>specification of changes to dictionaries rather than to CIF syntax,
>that alternative should be seriously considered and balanced against
>the human-readability of the resulting CIFs without reference to
Inasmuch as the proposed principles' audience is primarily COMCIFS and the DDLm group, I don't think the language need be excessively lawyered. At the same time, I do agree that its focus on dictionaries may miss the mark. The point, I think, is separating the language syntax from domain-specific semantics, and putting each proposed feature in the right bin. The fact that semantics are documented in dictionaries is largely ancillary.
In some cases there is indeed a tension between whether a feature should be syntactic or semantic, and I hope the principles adopted will help resolve such tensions. I am not convinced, however, that the key concern is the extent to which users must reference dictionaries, or the relative difficulty of implementing a feature on one side or the other. To me, the question is "For whom and in what contexts is the feature intended?". If the answer is not "everyone, everywhere" then the feature is semantic. After some consideration, I think Herbert's proposed language expresses roughly the same idea.
Sadly, that question may not admit a consensus answer. I take James's intent to be that in such situations the decision would be biased toward the semantic choice, absent a technical imperative to the contrary. I favor that position.
John C. Bollinger, Ph.D.
Department of Structural Biology
St. Jude Children's Research Hospital
Email Disclaimer: www.stjude.org/emaildisclaimer