Μείωση δεδομένων με διαμερισμό του χώρου χρησιμοποιώντας αλγόριθμους Convex Hull για την εύρεση των πιο απομακρυσμένων αντικειμένων (Bachelor thesis)

Γκιοργκίνης, Θωμάς


In the ever evolving scientific field of data categorization, the nearest neighbors algorithm is a stable, efficient method. However, it has sev eral weaknesses that deem it inappropriate in some cases of data sets. Its main drawbacks are: high cost of categorizing each object due to multiple calculations, high storage requirements, and dependence of the accuracy of the results on the quality of the training data. In or der to address its weaknesses, several data reduction algorithms have been implemented, aiming at minimizing the processing burden of the categorizer without affecting its accuracy. The Reduction by Space Partitioning algorithm is one of the most well-known prototype gen eration algorithms for accelerating instance based categorists. RSP3 is based on a repetitive separation process of the original training set. As the criterion for the most subset will be divided first, RSP3 adopts the larger diameter criterion. That is, the subset with the largest diameter defined by the two most distant objects is divided first. The process continues until the resulting subsets are homoge neous, that is, they include objects of the same class. The search for the two most remote objects of each subset is a costly process, since it requires the calculation of all the distances between the objects of each sub-set. Thus, RSP3 has a high computational pre-processing cost. In the framework of the thesis, computational geometry algo rithms will be studied to find the Convex Hull of each subset. Convex Hull is composed by the objects of the data set that define its con tour. The motivation for studying these algorithms stems from the following observation: If these objects that define Convex Hull are found, searching for the most remote objects in each subset will be done quickly since it will not be necessary to calculate all possible distances to the subset but only all possible distances between the objects of Convex Hull. In the experimental part of the thesis, two Convex Hull algorithms will be implemented and incorporated into RSP3. It will also be investigated whether it is possible to use Con vex Hull algorithms in multi-dimensional datasets. The new variants of RSP3 will be tested in 15 sets of real data categorization data and their speed will be compared to that of the “conventional” RSP3.
Institution and School/Department of submitter: Σχολή Τεχνολογικών Εφαρμογών - Τμήμα Μηχανικών Πληροφορικής
Subject classification: Αλγόριθμοι
Algorithms
Μείωση δεδομένων
Data reduction
Keywords: Convex Hull;απομακρυσμένα αντικείμενα;remotely objects;RSP3;αλγόριθμος των εγγύτερων γειτόνων;nearest neighbor algorithm;δεδομένα;data;Τεχνικές μείωσης δεδομένων;Data reduction techniques
Description: Πτυχιακή εργασία - Σχολή Τεχνολογικών Εφαρμογών - Τμήμα Μηχανικών Πληροφορικής, 2019 (α/α 11044)
URI: http://195.251.240.227/jspui/handle/123456789/15059
Appears in Collections:Πτυχιακές Εργασίες

Files in This Item:
File Description SizeFormat 
GKIORKINIS.pdf1.32 MBAdobe PDFView/Open



 Please use this identifier to cite or link to this item:
http://195.251.240.227/jspui/handle/123456789/15059
  This item is a favorite for 0 people.

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.