ATHENA-HTN is a clinical decision support system (CDSS) that delivers guideline-based, patient-specific recommendations about hypertension management at the time of clinical decision-making. ATHENA-HTN's knowledge is stored in a knowledge base (KB), which must be updated whenever best-practice recommendations change. We describe a method of offline testing to evaluate the accuracy of recommendations generated from the KB. A physician reviewed 100 test cases and made drug recommendations based on guidelines and the "Rules" (descriptions of the encoded knowledge). These drug recommendations were compared with those generated by ATHENA-HTN. Nineteen drug-recommendation discrepancies were identified, with three causes: ATHENA-HTN was more complete in generating recommendations (15); ambiguities in the Rules misled the physician (3); and content in the Rules was not encoded (1). Three new boundaries were identified. Three updates were made to the KB based on the results. The offline testing method was successful in identifying areas for KB improvement and led to improved accuracy of guideline-based recommendations.
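
The comparison step can be illustrated with a minimal sketch. It assumes that each test case's recommendations can be represented as a set of drug names keyed by a case identifier. That representation, the function name `find_discrepancies`, the `Discrepancy` record, and the sample cases are all hypothetical and are not taken from ATHENA-HTN. The sketch only flags cases where the two sources differ; assigning a cause to each discrepancy would still require manual review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Discrepancy:
    """One test case where the two recommendation sets differ (hypothetical record)."""
    case_id: str
    only_physician: frozenset  # drugs recommended by the physician only
    only_system: frozenset     # drugs recommended by the system only


def find_discrepancies(physician, system):
    """Compare per-case drug recommendations from two sources.

    Both arguments map a case identifier to a collection of drug names.
    A case missing from one source is treated as having no recommendations.
    """
    discrepancies = []
    for case_id in sorted(physician.keys() | system.keys()):
        p = frozenset(physician.get(case_id, ()))
        s = frozenset(system.get(case_id, ()))
        if p != s:
            discrepancies.append(Discrepancy(case_id, p - s, s - p))
    return discrepancies


# Hypothetical sample data, not drawn from the study's test cases.
physician = {
    "case-001": {"thiazide diuretic"},
    "case-002": {"ACE inhibitor"},
}
system = {
    "case-001": {"thiazide diuretic"},
    "case-002": {"ACE inhibitor", "beta blocker"},
}

result = find_discrepancies(physician, system)
for d in result:
    print(d.case_id, sorted(d.only_physician), sorted(d.only_system))
```

In this sample, only the second case is flagged, because the system's set contains a drug the physician's set does not. This mirrors the most common situation reported above, in which the system was more complete than the reviewer.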