includes the corresponding HAM-D6 items. A major pitfall in a microanalysis of the HAM-D is the use of factor analysis to test Faravelli's assumptions. A comprehensive review by Bagby et al7 has shown that factor analysis, as used from 1980 to 2003 in many psychometric analyses of the HAM-D, has identified quite different factor scores. As discussed elsewhere,32 the clinimetric analysis of a rating scale should indicate to what extent the total score is a sufficient statistic by considering both the individual items of the scale and the population under examination. When trying to define the antidepressant effect of a drug, Prien and Levine33 concluded that a greater improvement in total HAM-D scores does not necessarily indicate antidepressant action ("… assume that a group treated with an experimental drug shows significantly more improvement than a group treated with placebo on the factors of anxiety, somatization or sleep disturbances and no significant change on other factors. These changes, by themselves, should not qualify the drug as an antidepressant…"33).

Another major pitfall to be considered is the use of several depression scales in the same trial without clearly indicating a priori which of them has been determined to be the primary measure of antidepressant effect. To avoid this problem, a researcher should always use the specific items of depression, eg, the HAM-D6 or the MADRS6, as the primary efficacy measure.

When determining clinically significant antidepressant effect, it is recommended to use standardized effect size statistics.34 These statistics express the reduction in mean rating scale scores from baseline to end point for the active drug, relative to that for placebo, in units of the pooled standard deviation of the two treatments. Thus, if the baseline score is 24 for both treatments, the change score is 14 for the active drug and 10 for the placebo, and the pooled standard deviation is 8, then the effect size is (14 - 10)/8 = 4/8, or 0.50. In clinical trials with antidepressants an effect size of 0.40
or higher is considered a clinically significant response criterion.35 This corresponds to a 20% advantage of the active drug over placebo when response is defined either by a global impression score of very much and much response36 or by a 50% reduction in baseline rating scores on the HAM-D.23

Illustrating antidepressant effect, as shown in Figure 1, is yet another difficult area. Because both groups of patients, ie, those on active drug treatment as well as those on placebo treatment, exceed 100 subjects, even a small difference will be found to be statistically significant. In the example illustrated in Figure 2, it is obvious that the effect of escitalopram is of clinical significance (effect size >0.40) in depressed patients after only 4 weeks.

Figure 1. A typical illustration from a placebo-controlled trial with a new potential antidepressant.
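
The effect size arithmetic described above (change scores of 14 and 10 with a pooled standard deviation of 8, judged against the 0.40 criterion) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the pooled standard deviation formula shown is the conventional two-group formula, which is assumed here because the text does not specify how the pooled value is obtained.

```python
import math


def pooled_sd(sd_active, n_active, sd_placebo, n_placebo):
    """Conventional pooled standard deviation of two groups.

    Assumption: the text does not state which pooling formula is used;
    this is the usual variance-weighted version.
    """
    numerator = (n_active - 1) * sd_active ** 2 + (n_placebo - 1) * sd_placebo ** 2
    return math.sqrt(numerator / (n_active + n_placebo - 2))


def effect_size(change_active, change_placebo, sd_pooled):
    """Difference in mean change scores in units of the pooled SD."""
    if sd_pooled <= 0:
        raise ValueError("pooled standard deviation must be positive")
    return (change_active - change_placebo) / sd_pooled


def is_clinically_significant(es, threshold=0.40):
    """Apply the 0.40-or-higher criterion quoted in the text."""
    return es >= threshold


# Worked example from the text: change scores 14 (active) and 10 (placebo),
# pooled standard deviation 8.
es = effect_size(14, 10, 8)
print(f"effect size = {es:.2f}")  # prints "effect size = 0.50"
print("meets 0.40 criterion:", is_clinically_significant(es))
```

Keeping the pooling step separate from the effect size calculation mirrors the text, which quotes the pooled standard deviation as a given value rather than deriving it.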