We introduce a three-phase, nine-step methodology for the specification of clinical guidelines (GLs) by expert physicians, clinical editors, and knowledge engineers, and for the quantitative evaluation of the specification's quality. We applied this methodology to a particular framework for incremental GL structuring (mark-up) and to GLs in three clinical domains. A gold-standard mark-up was created, including 196 plans and subplans and 326 instances of ontological knowledge roles (KRs). A completeness measure of the acquired knowledge revealed that 97% of the plans and 91% of the KR instances of the GLs were recreated by the clinical editors. A correctness measure often revealed high variability within the pairs of clinical editors structuring each GL, but for all GLs and clinical editors the specification quality was significantly higher than random (p < 0.01). Procedural KRs were more difficult to mark up than declarative KRs. We conclude that, given an ontology-specific consensus, clinical editors with mark-up training can structure GL knowledge with high completeness, whereas the main requirement for correct structuring is training in the ontology's semantics.