Generalized Error Exponents For Small Sample Universal Hypothesis Testing

Dayu Huang and Sean Meyn
IEEE Transactions on Information Theory, 2013
See also the Illinois archive [Link]
The small sample universal hypothesis testing problem is investigated: the number of samples n is smaller than the number of possible outcomes m. The goal of this work is to find an appropriate criterion to analyze statistical tests in this setting.
A suitable model for analysis is the high-dimensional model, in which both n and m increase to infinity with n = o(m).
A new performance criterion based on large deviations analysis is proposed; it generalizes the classical error exponent applicable to large sample problems (in which m = O(n)). This generalized error exponent criterion provides insights that are not available from asymptotic consistency or central limit theorem analysis. The following results are established for the uniform null distribution:
  • The best achievable probability of error decays exponentially in the ratio n^2/m. The optimal error exponent is also identified.
  • A class of tests based on separable statistics, including the coincidence-based test, attains the optimal error exponent.
  • Pearson's chi-square test has a zero generalized error exponent.
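The coincidence-based test mentioned above can be illustrated with a minimal sketch. Here the statistic is taken to be the number of "coincidences" in the sample, i.e. n minus the number of distinct symbols observed; under the uniform null with n = o(m), the birthday-problem heuristic puts its expectation near n(n-1)/(2m), so a test thresholds the statistic around that value. This is an illustrative simplification, not the paper's exact construction; the function names and the threshold rule are assumptions for the example.

```python
import numpy as np

def coincidence_statistic(samples):
    """Number of coincidences: n minus the count of distinct symbols."""
    n = len(samples)
    return n - len(set(samples))

# Illustrative use under the uniform null on an alphabet of size m,
# with n = o(m) (sparse sample regime).
rng = np.random.default_rng(0)
m, n = 10_000, 200
samples = rng.integers(0, m, size=n)

stat = coincidence_statistic(samples)
expected = n * (n - 1) / (2 * m)  # birthday-problem approximation

# A coincidence-based test would reject the uniform null when `stat`
# deviates sufficiently from `expected`; the paper's analysis shows
# such tests achieve the optimal generalized error exponent in n^2/m.
```

The key point the example makes concrete is the n^2/m scaling: the expected number of coincidences is of order n^2/m, which is exactly the rate at which the best achievable error probability decays.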

This and related references (bibtex)

@article{huamey13,
Title = {Generalized Error Exponents For Small Sample Universal Hypothesis Testing},
Author = {Huang, D. and Meyn, S.},
Journal = {IEEE Transactions on Information Theory},
Year = {2013}}
@inproceedings{huamey12e,
Title = {Classification with high-dimensional sparse samples},
Abstract = {The task of the binary classification problem is to determine which of two distributions has generated a length-n test sequence. The two distributions are unknown; two training sequences of length N, one from each distribution, are observed. The distributions share an alphabet of size m, which is significantly larger than n and N. How do N, n, m affect the probability of classification error? We characterize the achievable error rate in a high-dimensional setting in which N, n, m all tend to infinity, under the assumption that the probability of any symbol is O(1/m). The results are: 1) There exists an asymptotically consistent classifier if and only if m = o(min{N^2, Nn}). This extends the previous consistency result in [1] to the case N ≠ n. 2) For the sparse sample case where max{n, N} = o(m), finer results are obtained: The best achievable probability of error decays as -log(Pe) = J min{N^2, Nn}(1 + o(1))/m with J > 0. 3) A weighted coincidence-based classifier has non-zero generalized error exponent J. 4) The l2-norm based classifier has J = 0.},
Author = {Huang, Dayu and Meyn, S.},
Booktitle = {Information Theory Proceedings (ISIT), 2012 IEEE International Symposium on},
Doi = {10.1109/ISIT.2012.6283985},
Issn = {2157-8095},
Keywords = {error statistics;signal classification;signal sampling; l2-norm based classifier;asymptotically consistent classifier;binary classification;classification error;error decays;generalized error exponent;high dimensional sparse samples;length-n test sequence;training sequences;Approximation methods;Information theory;Random variables;Testing;Training;Vectors;Vocabulary;classification;generalized error exponent;high-dimensional model;large deviations;sparse sample},
Pages = {2586-2590},
Year = {2012}}
@inproceedings{huamey12d,
Title = {Optimality of coincidence-based goodness of fit test for sparse sample problems},
Abstract = {We consider the sparse sample goodness of fit problem, where the number of samples n is smaller than the size of the alphabet m. The generalized error exponent based on large deviation analysis was proposed to characterize the performance of tests, using the high-dimensional model in which both n and m tend to infinity and n = o(m). In previous work, the best achievable probability of error is shown to decay as -log(Pe) = (n^2/m)(1 + o(1))J with upper and lower bounds on J. However, there is a significant gap between the two bounds. In this paper, we close the gap by proving a tight upper-bound, which matches the lower-bound over the entire region of generalized error exponents of false alarm and missed detection, achieved by the coincidence-based test. This implies that the coincidence-based test is asymptotically optimal.},
Author = {Huang, Dayu and Meyn, S.},
Booktitle = {Information Theory and Applications Workshop (ITA), 2012},
Doi = {10.1109/ITA.2012.6181779},
Keywords = {error statistics;sparse matrices;statistical testing;asymptotically optimal;best achievable error probability;coincidence-based goodness;coincidence-based test;false alarm;fit test;generalized error exponents;high-dimensional model;large deviation analysis;missed detection;optimality;sparse sample goodness;sparse sample problems;test performance;tight upper-bound;Analytical models;Biological system modeling;Convergence;Indexes;Materials;Probability;Speech processing;chi-square test;goodness of fit;high-dimensional model;large deviations;optimal test},
Pages = {344-346},
Year = {2012}}