Milan Hladík's Publications:

The complexity of computation and approximation of the t-ratio over one-dimensional interval data

Michal Černý and Milan Hladík. The complexity of computation and approximation of the t-ratio over one-dimensional interval data. Comput. Stat. Data Anal., 80(0):26–43, 2014.

Download

[PDF] [gzipped postscript] [postscript] [HTML] 

Abstract

The main question is how to compute the upper and lower limits of the range of possible values of a given statistic, when the data range over given intervals. Initially some well-known statistics, such as sample mean, sample variance or F-ratio, are considered in order to illustrate that in some cases the limits can be computed efficiently, while in some cases their computation is NP-hard. Subsequently, the t-ratio (variation coefficient) is considered. It is investigated when the limits for t-ratio are computable in polynomial time and a new efficient algorithm is designed for this case. Conversely, complementary NP-hardness results are proved, demonstrating the cases when the computation cannot be done efficiently. Subsequently, the NP-hardness results are strengthened: it is shown that under certain assumptions, even an approximate evaluation with an arbitrary absolute error is NP-hard. Finally, it is shown that the situation can also be (in some sense) regarded positively: a new pseudopolynomial algorithm is developed. The algorithm is of practical importance, especially when the dataset to be processed is large and does not contain excessively large numbers.

BibTeX

@article{CerHla2014a,
 author = "Michal {\v{C}}ern\'{y} and Milan Hlad\'{\i}k",
 title = "The complexity of computation and approximation of the t-ratio over one-dimensional interval data",
 journal = "Comput. Stat. Data Anal.",
 fjournal = "Computational Statistics and Data Analysis",
 volume = "80",
 number = "0",
 pages = "26-43",
 year = "2014",
 doi = "10.1016/j.csda.2014.06.007",
 issn = "0167-9473",
 url = "https://doi.org/10.1016/j.csda.2014.06.007",
 bib2html_dl_html = "http://www.sciencedirect.com/science/article/pii/S0167947314001789",
 abstract = "The main question is how to compute the upper and lower limits of the range of possible values of a given statistic, when the data range over given intervals. Initially some well-known statistics, such as sample mean, sample variance or F-ratio, are considered in order to illustrate that in some cases the limits can be computed efficiently, while in some cases their computation is NP-hard. Subsequently, the t-ratio (variation coefficient) is considered. It is investigated when the limits for t-ratio are computable in polynomial time and a new efficient algorithm is designed for this case. Conversely, complementary NP-hardness results are proved, demonstrating the cases when the computation cannot be done efficiently. Subsequently, the NP-hardness results are strengthened: it is shown that under certain assumptions, even an approximate evaluation with an arbitrary absolute error is NP-hard. Finally, it is shown that the situation can also be (in some sense) regarded positively: a new pseudopolynomial algorithm is developed. The algorithm is of practical importance, especially when the dataset to be processed is large and does not contain excessively large numbers.",
 keywords = "Interval data, t-ratio, Computational complexity, NP-hardness, Inapproximability, Pseudopolynomial algorithms",
}

Generated by bib2html.pl (written by Patrick Riley ) on Mon Apr 15, 2024 08:26:42