UMP Institutional Repository

Enhancement of new smooth support vector machines for classification problems

Santi Wulan, Purnami (2011) Enhancement of new smooth support vector machines for classification problems. PhD thesis, Universiti Malaysia Pahang.

Research on the Smooth Support Vector Machine (SSVM) for classification problems is an active field in data mining. SSVM is a reformulation of the standard Support Vector Machine (SVM). In SSVM, a smoothing technique must be applied when the constrained optimization problem is converted to an unconstrained one, since the objective function of the unconstrained problem is not twice differentiable. A smooth function replaces the plus function to obtain the SSVM. To improve accuracy, the Multiple Knot Spline SSVM (MKS-SSVM) is proposed. MKS-SSVM is a new SSVM that uses a multiple knot spline function to approximate the plus function, instead of the integral of the sigmoid function used in SSVM. To obtain optimal accuracy, the Uniform Design method is used for parameter selection. The performance of the method is evaluated using 10-fold cross-validation accuracy, the confusion matrix, sensitivity, and specificity. To evaluate the effectiveness of the method, experiments are carried out on four medical datasets: Pima Indian diabetes, heart disease, breast cancer prognosis, and breast cancer diagnosis. The results show that MKS-SSVM is effective for diagnosis on these medical datasets, and the results are promising compared with those previously reported. SSVM algorithms are developed for binary classification. However, in many real problems data points are divided into multiple categories. Hence, MKS-SSVM is extended to multiclass classification. Two popular multiclass methods, One-against-All (OAA) and One-against-One (OAO), are used to extend MKS-SSVM. Numerical experiments show that the classification accuracies of the OAA and OAO methods are competitive with each other, with no clear superiority of one over the other. In terms of computation time, the OAO method is faster than the OAA method on three datasets, which indicates that OAO is usually more efficient than OAA.
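As an illustration of the smoothing step described above, the following is a minimal sketch of the integral-of-sigmoid approximation to the plus function used in the original SSVM, p(x, alpha) = x + (1/alpha) log(1 + exp(-alpha x)). The function names and the value of the smoothing parameter alpha are assumptions chosen for illustration; the multiple knot spline function proposed in the thesis is not reproduced here.

```python
import numpy as np

def plus(x):
    """Plus function (x)_+ = max(x, 0); not differentiable at 0."""
    return np.maximum(x, 0.0)

def smooth_plus(x, alpha=5.0):
    """Integral-of-sigmoid smoothing of the plus function.

    Computes x + (1/alpha) * log(1 + exp(-alpha * x)).
    np.logaddexp(0, t) evaluates log(1 + exp(t)) without overflow.
    alpha is a hypothetical default; larger alpha gives a tighter fit.
    """
    x = np.asarray(x, dtype=float)
    return x + np.logaddexp(0.0, -alpha * x) / alpha

if __name__ == "__main__":
    xs = np.linspace(-2.0, 2.0, 9)
    for alpha in (1.0, 5.0, 50.0):
        # The largest gap occurs at x = 0 and equals log(2) / alpha.
        gap = np.max(smooth_plus(xs, alpha) - plus(xs))
        print(f"alpha={alpha:5.1f}  max gap on grid={gap:.6f}")
```

The smooth function lies above the plus function everywhere and differs from it by at most log(2)/alpha, so increasing alpha tightens the approximation while keeping the objective differentiable.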
In the final part, the reduced support vector machine (RSVM), which was proposed to overcome the computational difficulties of SSVM on large datasets, is considered. To generate a representative reduced set for RSVM, the clustering reduced support vector machine (CRSVM) has been proposed. However, CRSVM is restricted to classification problems on large datasets with numeric attributes. In this research, two alternative algorithms are proposed: k-mode RSVM (KMo-RSVM), which combines RSVM with the k-mode clustering technique to handle classification problems on large categorical datasets, and k-prototype RSVM (KPro-RSVM), which combines k-prototype clustering with RSVM to classify large datasets with mixed attributes. In the experiments, the effectiveness of KMo-RSVM is tested on four publicly available datasets. KMo-RSVM runs significantly faster than SSVM while still achieving high accuracy. Compared with RSVM, KMo-RSVM is faster, produces a smaller reduced set, and achieves comparable testing accuracy. Experiments on three public datasets also show that KPro-RSVM greatly reduces computation time and can handle classification of large mixed datasets even where SSVM runs out of memory (as on the census dataset). Compared with RSVM, KPro-RSVM requires less computation time, at the cost of slightly lower testing accuracy.
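The k-mode clustering step mentioned above can be sketched as follows. This is a minimal, generic k-modes routine (Hamming dissimilarity, per-column mode update) and not the thesis's implementation. The function names `k_modes` and `_assign`, the integer coding of the categorical attributes, and the idea of taking the returned cluster modes as the candidate reduced set are all assumptions made for illustration.

```python
import numpy as np

def _assign(X, modes):
    """Assign each row to the mode with the fewest mismatched attributes."""
    # Hamming dissimilarity between every row and every mode: shape (n, k).
    dist = (X[:, None, :] != modes[None, :, :]).sum(axis=2)
    return dist.argmin(axis=1)

def k_modes(X, k, n_iter=20, seed=0):
    """Cluster integer-coded categorical rows of X into k groups.

    Returns (modes, labels). In a reduced-set setting, the k modes
    could stand in for the full dataset (an assumption, see lead-in).
    """
    rng = np.random.default_rng(seed)
    # Initialise with k rows drawn at random without replacement.
    modes = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = _assign(X, modes)
        new_modes = modes.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) == 0:
                continue  # empty cluster: keep its previous mode
            for col in range(X.shape[1]):
                # Most frequent category in this column of the cluster.
                values, counts = np.unique(members[:, col], return_counts=True)
                new_modes[j, col] = values[counts.argmax()]
        if np.array_equal(new_modes, modes):
            break  # converged: no mode changed
        modes = new_modes
    return modes, _assign(X, modes)

if __name__ == "__main__":
    # Toy categorical data: 3 attributes, each coded as a small integer.
    X = np.array([[0, 1, 2], [0, 1, 2], [0, 1, 1],
                  [3, 0, 0], [3, 0, 0], [3, 2, 0]])
    modes, labels = k_modes(X, k=2)
    print("modes:\n", modes)
    print("labels:", labels)
```

Because the initial modes are random rows, different seeds can yield different clusterings; a practical implementation would typically restart several times and keep the lowest-cost result.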

Item Type: Thesis (PhD)
Additional Information: Thesis (Doctor of Philosophy in Computer Science) -- Universiti Malaysia Pahang - 2011, SV: PROF. DR. JASNI MOHAMAD ZAIN, NO. CD: 6042
Uncontrolled Keywords: Support vector machines; Data mining
Subjects: Q Science > Q Science (General)
Faculty/Division: Faculty of Computer System And Software Engineering
Depositing User: Mrs. Sufarini Mohd Sudin
Date Deposited: 19 Sep 2018 01:37
Last Modified: 19 Sep 2018 01:37