Support Vector Machine (SVM) for Anomaly Detection

This post was originally published by Mahbubul Alam at Towards Data Science

Step 1: Import libraries

# import libraries
import pandas as pd
from sklearn.svm import OneClassSVM
import matplotlib.pyplot as plt
from numpy import where

Step 2: Prepare data

# import data
data = pd.read_csv("https://raw.githubusercontent.com/uiuc-cse/data-fa14/gh-pages/data/iris.csv")

# input data: keep only the two sepal measurements as features
df = data[["sepal_length", "sepal_width"]]

Step 3: Modeling

# model specification
model = OneClassSVM(kernel = 'rbf', gamma = 0.001, nu = 0.03).fit(df)
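
The nu parameter is documented in scikit-learn as an upper bound on the fraction of training errors and a lower bound on the fraction of support vectors, so raising it generally flags more points as anomalies. The following is a minimal sketch of that effect, not part of the original walkthrough: it assumes scikit-learn's bundled iris dataset (load_iris) as a stand-in for the CSV, and the nu values other than 0.03 are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.svm import OneClassSVM

# Assumption: the bundled iris data replaces the CSV download;
# its first two columns are sepal length and sepal width.
X = load_iris().data[:, :2]

# Fit one model per nu value and count the points labelled -1 (outliers).
# 0.03 matches the model above; 0.1 and 0.3 are illustrative only.
outlier_counts = {}
for nu in (0.03, 0.1, 0.3):
    labels = OneClassSVM(kernel="rbf", gamma=0.001, nu=nu).fit(X).predict(X)
    outlier_counts[nu] = int((labels == -1).sum())

print(outlier_counts)
```

Comparing the counts side by side is one way to choose nu when a rough estimate of the expected share of anomalies is available.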

Step 4: Prediction

# prediction
y_pred = model.predict(df)
y_pred
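
OneClassSVM.predict returns +1 for points the model treats as inliers and -1 for points it treats as outliers. A quick way to read the raw array is to count each label. This is a small sketch under the assumption that scikit-learn's bundled iris dataset can stand in for the CSV used above; the name label_counts is introduced here only for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import OneClassSVM

# Assumption: bundled iris data instead of the CSV; columns 0 and 1
# are sepal length and sepal width.
X = load_iris().data[:, :2]

# Same model settings as in the modeling step.
model = OneClassSVM(kernel="rbf", gamma=0.001, nu=0.03).fit(X)
y_pred = model.predict(X)

# Count how many points received each label (+1 inlier, -1 outlier).
labels, counts = np.unique(y_pred, return_counts=True)
label_counts = dict(zip(labels.tolist(), counts.tolist()))
print(label_counts)
```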

Step 5: Filtering Anomalies

# filter outlier index: where() returns a tuple, so take its first element
outlier_index = where(y_pred == -1)[0]

# filter outlier values
outlier_values = df.iloc[outlier_index]
outlier_values
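
Filtering on the -1 label gives a yes/no answer. If a ranking is more useful, decision_function returns a signed score per point, which scikit-learn documents as positive for inliers and negative for outliers. The sketch below sorts points from most to least anomalous; it assumes the bundled iris dataset in place of the CSV, and the names scores, ranking, and top5 are illustrative rather than taken from the original post.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import OneClassSVM

# Assumption: bundled iris data instead of the CSV; columns 0 and 1
# are sepal length and sepal width.
X = load_iris().data[:, :2]
model = OneClassSVM(kernel="rbf", gamma=0.001, nu=0.03).fit(X)

# Signed score per sample: lower values mean more anomalous.
scores = model.decision_function(X)

# Row positions sorted from lowest (most anomalous) to highest score.
ranking = np.argsort(scores)

# Inspect the five lowest-scoring rows and their scores.
top5 = ranking[:5]
print(X[top5])
print(scores[top5])
```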

Step 6: Visualizing anomalies

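# visualize outputs: all points first, then the outliers on top in red
plt.scatter(df["sepal_length"], df["sepal_width"])
plt.scatter(outlier_values["sepal_length"], outlier_values["sepal_width"], c = "r")
plt.xlabel("sepal_length")
plt.ylabel("sepal_width")
plt.show()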

The data points in red are the outliers, that is, the rows the model labelled -1.

Summary
