11 and occurs at a test-minus-control value of 0.64. Applying these threshold values to Fig. 1 gives 291 positive test wells and 63 pseudo-positive control wells for haemagglutinin. The corresponding numbers for neuraminidase are much closer (222 and 204), suggesting that reliable discrimination is not possible for neuraminidase. By quartile of the transformed mean, the proportions positive for haemagglutinin are 0, 68, 13 and 15%, and for neuraminidase are 22, 50, 12 and 11%. The maximum difference between the two
ECDFs is also used by the Kolmogorov–Smirnov test for differences between distributions. A large p value from this test would again suggest that reliable identification of positive samples is not possible, although the converse is not necessarily true. In other words, the p value being less than 5%, for example, does not imply that reliable identification will
be possible. Rather, the hypothesis test screens out examples for which no reliable identification can be expected (Armitage et al., 2001, page 472). Over all 20 pools, the p values ranged from 2 × 10⁻¹⁶ to 0.67, those for haemagglutinin and neuraminidase being 2 × 10⁻⁹ and 0.02 respectively. Hence for some pools there is no tendency for test to exceed control, as opposed to the other way round, and in such cases trying to assign a threshold would be futile. This analysis can be expressed in terms of the probability of correctly identifying which pool is test and which is control, when this status is unknown. Suppose we have i) one person’s test and control results x and y (possibly on a transformed scale), x being the larger, but without knowing whether x or y is test, and ii) the
distribution of previous test-minus-control values (with the experimental conditions known). We expect larger values to result from the test condition, so suppose our rule is to conclude that x is from the test condition if it exceeds y by more than a value k. The conditional probability that x is the test sample, given that x − y > k, is

$$
\operatorname{Prob}(x \text{ is test} \mid x - y > k)
= \frac{\operatorname{Prob}(x \text{ is test} \;\&\; x - y > k)}{\operatorname{Prob}(x - y > k)}
= \frac{\operatorname{Prob}(x \text{ is test} \;\&\; x - y > k)}{\operatorname{Prob}(x \text{ is test} \;\&\; x - y > k) + \operatorname{Prob}(x \text{ is control} \;\&\; x - y > k)}
$$

This last expression is the area of the upper tail of the distribution (above a test-minus-control value of k) divided by the sum of the upper and lower tails (above k or below −k). If the control value rarely exceeds the test by k, then this probability will be high. This argument is applied to the cohort data in Fig. 3. For haemagglutinin, the test value is likely to exceed control for a wide range of threshold values. For neuraminidase, however, the control value is about as likely to exceed test as the other way round. Results from simulated data confirm that the proportion of samples identified as positive increases with the excess of the test mean over the control mean (see Supplementary Material). These results also suggest that the current approach may be conservative in identifying positives.
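The tail-ratio argument above (upper tail divided by the sum of the upper and lower tails) can be sketched empirically. The following Python fragment is a minimal illustration only, not the authors' implementation: the function name `prob_x_is_test` and the list `example_diffs` are hypothetical, and the numbers are invented for illustration rather than taken from the cohort. It assumes the previous test-minus-control values are available as a plain list and estimates each tail area by a simple count.

```python
def prob_x_is_test(diffs, k):
    """Estimate Prob(x is test | x - y > k) from past test-minus-control values.

    diffs: previous test-minus-control differences (conditions known).
    k:     non-negative threshold by which the larger value must exceed the smaller.
    Returns None when no past pair differs by more than k in either direction.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    # Upper tail: test exceeded control by more than k.
    upper = sum(1 for d in diffs if d > k)
    # Lower tail: control exceeded test by more than k.
    lower = sum(1 for d in diffs if d < -k)
    total = upper + lower
    if total == 0:
        return None
    return upper / total


# Hypothetical differences, for illustration only (not data from the study).
example_diffs = [1.2, 0.9, 0.7, 0.4, 0.1, -0.1, -0.3, 0.8, -0.9, 1.5]

for k in (0.0, 0.5, 1.0):
    print(k, prob_x_is_test(example_diffs, k))
```

A value near 1 over a wide range of k corresponds to the pattern described for haemagglutinin, whereas a value near 0.5 corresponds to the pattern described for neuraminidase, where control is about as likely to exceed test as the reverse.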