Instance-based machine learning through data reduction techniques for non-metric spaces (Master thesis)

Φωτιάδης, Γεώργιος


In an era of abundant information and rapid internet growth, managing and processing large volumes of training data is becoming increasingly important. Classification algorithms cannot feasibly handle such large datasets because of their high computational cost and memory requirements. The data is therefore pre-processed with Data Reduction Techniques, which lower both the computational cost and the storage needed. Most data reduction techniques proposed in the literature focus primarily on the k-Nearest Neighbor (k-NN) classifier, the simplest instance-based learning method in machine learning. In most practical data science applications, datasets contain categorical variables. The k-NN classifier, however, cannot handle categorical data directly, so a preprocessing step is needed to convert categorical values into numerical ones. Various conversion methods can be found in the literature, and this work presents the most important of them. Applying an additional preprocessing step is nevertheless a drawback, because it adds computational cost, and this drawback motivates the thesis. Its purpose is to classify data containing categorical features effectively without any additional preprocessing step for their conversion. The methodology consists of developing new variations of the CNN-rule (Condensed Nearest Neighbor rule) algorithm that use distance measures for non-metric spaces. In experiments on eight datasets, the three variations of the CNN-rule algorithm were compared with the k-NN algorithm without data reduction, in terms of accuracy and reduction rate. The experimental results demonstrate remarkable performance for all three variations of the CNN-rule algorithm.
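As an illustration of the general idea in the abstract, the following is a minimal sketch of the classic Condensed Nearest Neighbor rule applied directly to categorical features. It is not one of the thesis's three variations, which the abstract does not describe. It assumes the Hamming distance (listed among the keywords) as the dissimilarity measure, 1-NN classification, and a tiny made-up dataset; the names `hamming`, `nearest_label`, and `condense` are hypothetical.

```python
from typing import Hashable, List, Sequence, Tuple

# One training instance: (categorical feature values, class label).
Instance = Tuple[Sequence[Hashable], Hashable]


def hamming(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Count the positions at which two equal-length feature vectors differ."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def nearest_label(query: Sequence[Hashable], prototypes: List[Instance]) -> Hashable:
    """Return the label of the closest prototype (1-NN; ties go to the earliest one)."""
    return min(prototypes, key=lambda p: hamming(query, p[0]))[1]


def condense(training: List[Instance]) -> List[Instance]:
    """Classic CNN rule: keep only instances that the current subset misclassifies."""
    if not training:
        return []
    condensed = [training[0]]          # seed the condensing set with the first instance
    kept = [False] * len(training)
    kept[0] = True
    changed = True
    while changed:                     # repeat until a full pass adds nothing
        changed = False
        for i, (features, label) in enumerate(training):
            if kept[i]:
                continue
            if nearest_label(features, condensed) != label:
                condensed.append(training[i])
                kept[i] = True
                changed = True
    return condensed


# Made-up categorical data, for illustration only (no numeric encoding is applied).
training = [
    (("red", "small", "round"), "apple"),
    (("red", "small", "oval"), "apple"),
    (("green", "small", "round"), "apple"),
    (("yellow", "large", "long"), "banana"),
    (("yellow", "large", "curved"), "banana"),
    (("green", "large", "long"), "banana"),
]

condensed = condense(training)
reduction_rate = 1 - len(condensed) / len(training)
print(f"kept {len(condensed)} of {len(training)} instances, "
      f"reduction rate {reduction_rate:.2f}")
```

Because the distance compares category values position by position, no categorical-to-numerical conversion step is needed, which is the property the thesis aims for. A k-NN classifier would then search only `condensed` instead of the full training set.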
Institution and School/Department of submitter: School of Engineering - Department of Informatics and Electrical Systems Engineering
Keywords: Data Reduction Techniques;population reduction techniques;nearest neighbor classifier;Instance-Based Learning;metric spaces;non-metric spaces;Hamming distance
Description: Master's thesis - School of Engineering - Department of Informatics and Electrical Systems Engineering, 2023 (serial no. 14115)
URI: http://195.251.240.227/jspui/handle/123456789/16791
Appears in Collections: Master's Theses

Files in This Item:
File: Master_Thesis_IHU_Fotiadis_Georgios.pdf
Description: Master's thesis
Size: 1.43 MB
Format: Adobe PDF




Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.