Top 4 Python libraries for interpretable Machine Learning

This post was originally published by Mikhail Raevskiy at Medium [AI]

With the growing attention to bias in artificial intelligence, organizations increasingly need to explain both the predictions their models make and how those models work.

Fortunately, the Python ecosystem offers a growing number of libraries that address this problem. Below is a quick guide to four popular libraries for interpreting and explaining machine learning models. All of them can be installed with pip, come with detailed documentation, and emphasize visual interpretation.

Yellowbrick is a Python library that extends the scikit-learn package. It provides some useful and attractive visualizations for machine learning models. The visualizer objects, the library's main interface, are scikit-learn estimators, so if you are used to working with scikit-learn, the workflow will feel familiar.

The visualizations provided cover model selection, feature importance, and model performance analysis. Let's walk through a few quick examples.

The library is installed with pip:

pip install yellowbrick

To illustrate a couple of capabilities, we will use a scikit-learn dataset called wine recognition. This dataset, with 13 features and 3 target classes, is loaded directly from the scikit-learn library. In the code below, we import the dataset and convert it to a DataFrame object. A classifier can use this data without any preprocessing.
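A minimal sketch of this step, using the standard scikit-learn and pandas APIs (the variable names wine_data and df are our own choices):

import pandas as pd
from sklearn.datasets import load_wine

# Load the wine recognition dataset (13 features, 3 target classes)
wine_data = load_wine()

# Wrap the feature matrix in a DataFrame and attach the target column
df = pd.DataFrame(wine_data.data, columns=wine_data.feature_names)
df['target'] = wine_data.target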

Use scikit-learn to further split the dataset into training and validation sets:
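A sketch with scikit-learn's train_test_split (the 80/20 split and the random_state value are illustrative choices):

from sklearn.model_selection import train_test_split

# Separate the features from the target and hold out 20% for validation
X = df[wine_data.feature_names]
y = df['target']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)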

In the next step, use Yellowbrick's Rank2D visualizer to view the correlations between features in the dataset.
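A sketch, assuming the Pearson ranking algorithm:

from yellowbrick.features import Rank2D

# Rank pairs of features by Pearson correlation and draw the grid
visualizer = Rank2D(algorithm='pearson')
visualizer.fit_transform(X_train)
visualizer.show()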

[Figure: Yellowbrick Rank2D feature correlation visualization]

Now let's fit a RandomForestClassifier and evaluate its performance using another visualizer, ClassificationReport:
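A sketch of this step (the random forest hyperparameters are illustrative):

from sklearn.ensemble import RandomForestClassifier
from yellowbrick.classifier import ClassificationReport

# Fit a random forest and render per-class precision, recall, and F1
model = RandomForestClassifier(n_estimators=100, random_state=42)
visualizer = ClassificationReport(model, classes=wine_data.target_names)
visualizer.fit(X_train, y_train)
visualizer.score(X_test, y_test)
visualizer.show()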

[Figure: Yellowbrick ClassificationReport for the random forest model]

ELI5 is another visualization library that comes in handy for debugging machine learning models and explaining their predictions. It works with the most common Python machine learning tools, including scikit-learn, XGBoost, and Keras.

Let's use ELI5 to inspect the feature importances of the model discussed above:
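A sketch with eli5.show_weights, which renders its output in a Jupyter notebook (feature_names is optional and just makes the table readable):

import eli5

# Display the model's feature importances alongside their names
eli5.show_weights(model, feature_names=wine_data.feature_names)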

By default, the show_weights method uses gain to calculate the weights; when other types are needed, add the importance_type argument.

You can also use show_prediction to examine the basis of individual predictions.
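For example (the sample index is arbitrary; show_feature_values adds each feature's value to the table):

# Explain which features drove the prediction for a single sample
eli5.show_prediction(model, X_train.iloc[1],
                     feature_names=wine_data.feature_names,
                     show_feature_values=True)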

LIME stands for Local Interpretable Model-agnostic Explanations. It interprets the predictions made by machine learning algorithms. Lime supports explaining individual predictions from a wide range of classifiers, and it works with scikit-learn out of the box.

Let’s use Lime to interpret the predictions of the model we trained earlier.

Install the library via pip:

pip install lime

First, let's create an explainer. To do this, we pass it the training dataset as an array, the names of the features used in the model, and the names of the classes in the target variable.
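A sketch using the LimeTabularExplainer API and the variables defined above:

import lime.lime_tabular

# Build an explainer from the training data, feature names, and class names
explainer = lime.lime_tabular.LimeTabularExplainer(
    training_data=X_train.values,
    feature_names=wine_data.feature_names,
    class_names=wine_data.target_names,
    mode='classification')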

Then we create a lambda function that uses the model to predict on a sample of data. This line was taken from the detailed Lime tutorial.
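In our case it might look like this (predict_fn is our own name for it):

# Wrap the model's probability output so Lime can query it
predict_fn = lambda x: model.predict_proba(x)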

Use the explainer to explain the prediction for a selected sample. You will see the result below: Lime creates a visualization that shows how the features contributed to this particular prediction.
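A sketch (the sample index and num_features are illustrative):

# Explain a single test sample and render the explanation in a notebook
exp = explainer.explain_instance(X_test.values[0], predict_fn, num_features=6)
exp.show_in_notebook(show_table=True)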

In the MLxtend library, you will find a host of helper functions for machine learning. It covers stacking and voting classifiers, model evaluation, feature extraction and engineering, and plotting. In addition to the documentation, we recommend reading the in-depth material on this Python library.

Let's turn to MLxtend to compare the decision boundaries of a voting classifier and its component classifiers.

Again, you will need pip for the installation:

pip install mlxtend

The imports used are shown below.
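A sketch of the imports for this example, assuming mlxtend's plotting and classifier modules:

import itertools
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
from mlxtend.plotting import plot_decision_regions
from mlxtend.classifier import EnsembleVoteClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier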

The following visualization works with only two features at a time, so first we create an array with the features proline and color_intensity. We chose these features because they had the greatest weights in the ELI5 test above.
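For example (the name X_highest is our own):

# Keep only the two features with the highest ELI5 weights;
# plot_decision_regions expects plain NumPy arrays
X_highest = df[['proline', 'color_intensity']].values
y = df['target'].values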

Then we create the classifiers, fit them to the training data, and visualize the decision boundaries using MLxtend. The result is shown below the code.
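A sketch modeled on the example in the MLxtend documentation (the labels and grid layout are our choices):

# Three base classifiers and a voting ensemble over them
clf1 = LogisticRegression(random_state=1)
clf2 = RandomForestClassifier(random_state=1)
clf3 = GaussianNB()
eclf = EnsembleVoteClassifier(clfs=[clf1, clf2, clf3], weights=[1, 1, 1])

# Fit each classifier on the two selected features and plot its decision regions
gs = gridspec.GridSpec(2, 2)
fig = plt.figure(figsize=(10, 8))
labels = ['Logistic Regression', 'Random Forest', 'Naive Bayes', 'Ensemble']
for clf, lab, grd in zip([clf1, clf2, clf3, eclf],
                         labels,
                         itertools.product([0, 1], repeat=2)):
    clf.fit(X_highest, y)
    plt.subplot(gs[grd[0], grd[1]])
    plot_decision_regions(X=X_highest, y=y, clf=clf)
    plt.title(lab)
plt.show()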

The list of libraries for interpreting, explaining, and visualizing machine learning models available to a Python developer does not end here. Try other useful tools from the long list as well.
