Milan Hladík's Publications:

Sparse least-squares Universum twin bounded support vector machine with adaptive $L_p$-norms and feature selection

Hossein Moosaei, Fatemeh Bazikar, Milan Hladík, and Panos M. Pardalos. Sparse least-squares Universum twin bounded support vector machine with adaptive $L_p$-norms and feature selection. Expert Syst. Appl., 248:123378:1–23, August 2024.


Abstract

In data analysis, when attempting to solve classification problems, we may encounter a large number of features. However, not all features are relevant for the current classification, and including irrelevant features can occasionally degrade learning performance. As a result, selecting the most relevant features is critical, especially for high-dimensional data sets in classification problems. Feature selection is an effective method for resolving this issue. It attempts to represent the original data by extracting relevant features containing useful information. In this research, our aim is to propose a p-norm least-squares Universum twin bounded support vector machine (LSp-UTBSVM) to perform classification and feature selection at the same time. Indeed, the proposed method, which outperforms the traditional least-squares Universum twin bounded support vector machine, can achieve good classification accuracy in a reasonable amount of time while also providing a sparse solution. The model we propose is an adaptive learning procedure with p-norm (0 < p < 1), where the parameter p can be automatically selected by the data set. The algorithm we use to find the approximate solution of this model involves solving systems of linear equations. Furthermore, we obtain new bounds for the absolute values of non-zero components of a local optimal solution. These bounds allow us to remove the zero components from an arbitrary numerical solution. Setting the parameter p, LSp-UTBSVM improves classification accuracy and selects the relevant features. Numerical experiments on a handwritten digit recognition, University of California Irvine (UCI) benchmark, Normally Distributed Clusters (NDC) and high dimensional data sets confirm the superiority of the proposed method in the accuracy of classification and the selection of relevant features in comparison with some popular methods.

BibTeX

@article{MooBaz2024b,
 author = "Hossein Moosaei and Fatemeh Bazikar and Milan Hlad\'{\i}k and Panos M. Pardalos",
 title = "Sparse least-squares {Universum} twin bounded support vector machine with adaptive $L_p$-norms and feature selection",
 webtitle = "Sparse least-squares {Universum} twin bounded support vector machine with adaptive L<sub>p</sub>-norms and feature selection",
 journal = "Expert Syst. Appl.",
 fjournal = "Expert Systems with Applications",
 volume = "248",
 month = "August",
 pages = "123378:1--23",
 year = "2024",
 doi = "10.1016/j.eswa.2024.123378",
 issn = "0957-4174",
 url = "https://www.sciencedirect.com/science/article/pii/S0957417424002434",
 bib2html_dl_html = "https://doi.org/10.1016/j.eswa.2024.123378",
 abstract = "In data analysis, when attempting to solve classification problems, we may encounter a large number of features. However, not all features are relevant for the current classification, and including irrelevant features can occasionally degrade learning performance. As a result, selecting the most relevant features is critical, especially for high-dimensional data sets in classification problems. Feature selection is an effective method for resolving this issue. It attempts to represent the original data by extracting relevant features containing useful information. In this research, our aim is to propose a p-norm least-squares Universum twin bounded support vector machine (LSp-UTBSVM) to perform classification and feature selection at the same time. Indeed, the proposed method, which outperforms the traditional least-squares Universum twin bounded support vector machine, can achieve good classification accuracy in a reasonable amount of time while also providing a sparse solution. The model we propose is an adaptive learning procedure with p-norm (0 < p < 1), where the parameter p can be automatically selected by the data set. The algorithm we use to find the approximate solution of this model involves solving systems of linear equations. Furthermore, we obtain new bounds for the absolute values of non-zero components of a local optimal solution. These bounds allow us to remove the zero components from an arbitrary numerical solution. Setting the parameter p, LSp-UTBSVM improves classification accuracy and selects the relevant features. Numerical experiments on a handwritten digit recognition, University of California Irvine (UCI) benchmark, Normally Distributed Clusters (NDC) and high dimensional data sets confirm the superiority of the proposed method in the accuracy of classification and the selection of relevant features in comparison with some popular methods.",
 keywords = "Universum; Twin bounded support vector machine; Least-squares twin bounded support vector machine with Universum; p-norm; Feature selection",
}
