Train a classification model on a labeled dataset using the svm-train command and predict labels on new data with svm-predict.
Run the easy.py Python script to scale features and search for good SVM parameters automatically, giving beginners an end-to-end training pipeline.
Use LIBSVM's Python interface to integrate SVM training directly into a Python data science workflow alongside NumPy or pandas.
Apply one-class SVM to detect anomalies or outliers by training only on normal examples and flagging data points that don't resemble them.
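The train-then-predict workflow in the use cases above can be sketched as a small Python wrapper that builds the command lines. This is a minimal sketch, not part of LIBSVM itself: the helper names (train_command, predict_command, run_if_available) and the file names in the usage note are hypothetical, and it assumes the svm-train and svm-predict programs are on the PATH. The flags used are -s (SVM type, 0 for C-SVC and 2 for one-class), -c (cost) and -n (nu).

```python
import shutil
import subprocess


def train_command(train_file, model_file, svm_type=0, cost=1.0, nu=0.5):
    """Build the argument list for svm-train (hypothetical helper).

    svm_type 0 selects C-SVC (classification); svm_type 2 selects one-class SVM.
    """
    cmd = ["svm-train", "-s", str(svm_type)]
    if svm_type == 0:
        cmd += ["-c", str(cost)]  # C: penalty applied to training errors
    elif svm_type == 2:
        cmd += ["-n", str(nu)]    # nu: parameter of the one-class formulation
    return cmd + [train_file, model_file]


def predict_command(test_file, model_file, output_file):
    """Build the argument list for svm-predict (hypothetical helper)."""
    return ["svm-predict", test_file, model_file, output_file]


def run_if_available(cmd):
    """Run the command only if its program is found on the PATH."""
    if shutil.which(cmd[0]) is None:
        return False  # assumption: a missing binary is skipped, not an error
    subprocess.run(cmd, check=True)
    return True
```

For example, train_command("normal.txt", "oc.model", svm_type=2, nu=0.1) builds a one-class training command for a file of normal examples, and predict_command("new.txt", "oc.model", "out.txt") builds the matching prediction command. Building the argument lists separately from running them makes the commands easy to inspect before anything is executed.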
The core library requires compiling from C source. Python and Java interfaces are included, and pre-built Windows binaries are provided for those who want to skip compilation.
LIBSVM is a software library for training and using Support Vector Machines, a family of mathematical models used in machine learning for classification and regression tasks. Given a dataset with labeled examples, a support vector machine learns a boundary that separates the categories and can then classify new, unseen data points. LIBSVM is one of the most widely used implementations of this technique and has been a standard reference in academic and applied machine learning for decades.

The library covers several variations of the SVM approach: C-SVC and nu-SVC for classifying data into categories, epsilon-SVR and nu-SVR for predicting continuous numeric values, and one-class SVM for detecting whether new data resembles the training set. These variants differ in how they handle the tradeoffs between fitting the training data and tolerating errors.

Users interact with LIBSVM through three command-line programs. The svm-train program reads a data file, fits a model, and writes it to disk. The svm-predict program loads a saved model and produces predictions on new data. The svm-scale program rescales input features to a consistent range, which the documentation says improves results in practice. A Python script called easy.py automates the full pipeline, including scaling and searching for good model parameters, making it accessible to people new to the technique.

Data files use a plain-text format where each line represents one example: a label followed by index-value pairs for each feature. This sparse format is efficient for datasets where many features are zero.

Interfaces are available for Java, Python, and MATLAB/Octave, in addition to the core C implementation. A simple graphical toy program lets users draw data points on screen and visualize how the model separates them. Pre-built Windows binaries are included.

The library is distributed with a copyright notice that permits free use for research and commercial purposes with attribution.
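The data format and feature scaling described above can be illustrated with a short pure-Python sketch. It is not LIBSVM code: the function names are hypothetical, it assumes feature indices start at 1 and appear in ascending order, and it rescales each feature linearly to the range [-1, 1]. Mapping a constant feature to 0.0 is a choice made for this sketch only.

```python
def to_libsvm_line(label, features):
    """Format one example as 'label index:value ...', omitting zero features.

    `features` is a dense list; output indices start at 1 (assumption).
    """
    pairs = [f"{i}:{v:g}" for i, v in enumerate(features, start=1) if v != 0]
    return " ".join([f"{label:g}"] + pairs)


def parse_libsvm_line(line):
    """Parse one line back into (label, {index: value})."""
    parts = line.split()
    label = float(parts[0])
    features = {}
    for pair in parts[1:]:
        index, value = pair.split(":")
        features[int(index)] = float(value)
    return label, features


def scale_columns(rows, lower=-1.0, upper=1.0):
    """Linearly rescale each column of a dense matrix to [lower, upper]."""
    scaled_columns = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        if hi == lo:
            # Constant feature: mapped to 0.0 here (a choice made for this sketch).
            scaled_columns.append([0.0] * len(col))
        else:
            span = (upper - lower) / (hi - lo)
            scaled_columns.append([lower + span * (v - lo) for v in col])
    # Transpose back so each inner list is one example again.
    return [list(row) for row in zip(*scaled_columns)]
```

As a usage example, to_libsvm_line(1, [0.5, 0, 0, 2]) produces the line "1 1:0.5 4:2", in which the two zero features are simply left out. That omission is why the format stays compact for data where many features are zero.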
Verify against the repo before relying on details.