LinkedIn open-sources GDMix, a framework for training AI personalization models

This post was originally published by Kyle Wiggers at VentureBeat.

LinkedIn recently open-sourced GDMix, a framework that makes training AI personalization models ostensibly more efficient and less time-consuming. The Microsoft-owned company says it’s an improvement over LinkedIn’s previous release in the space — Photon ML — because it supports deep learning models.

GDMix trains fixed effect and random effect models, two kinds of models used in search personalization and recommender systems. They’re normally challenging to train in isolation, but GDMix accelerates the process by breaking down large models into a global model (fixed effect) and many small models (random effects) and then solving them individually. This divide-and-conquer approach allows for swifter training of models with commodity hardware, according to LinkedIn, thus eliminating the need for specialized processors, memory, and networking equipment.
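To make the divide-and-conquer idea concrete, here is a minimal sketch in plain Python. It is not GDMix code and does not use the GDMix API. The helper names (`fit_logistic`, `train_mixed`, `predict`), the toy data, and the choice of a single training pass with light L2 regularization are all assumptions made for illustration. It fits one global logistic regression on all records, then fits a small per-entity model on each entity's own records, using the global score as a fixed offset.

```python
import math
from collections import defaultdict


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def fit_logistic(rows, offsets, n_features, lr=0.5, epochs=200, l2=0.01):
    """Gradient descent for logistic regression (hypothetical helper).

    rows: list of (features, label); offsets: fixed score added to each row's logit.
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        grad = [0.0] * n_features
        for (x, y), off in zip(rows, offsets):
            err = sigmoid(off + dot(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wi - lr * (g / len(rows) + l2 * wi) for wi, g in zip(w, grad)]
    return w


def train_mixed(records, n_features):
    """records: list of (entity_id, features, label)."""
    # Step 1: the global (fixed effect) model sees every record.
    rows = [(x, y) for _, x, y in records]
    global_w = fit_logistic(rows, [0.0] * len(rows), n_features)

    # Step 2: one small (random effect) model per entity, trained only on
    # that entity's records, with the global score held fixed as an offset.
    by_entity = defaultdict(list)
    for entity, x, y in records:
        by_entity[entity].append((x, y))
    entity_w = {}
    for entity, e_rows in by_entity.items():
        offsets = [dot(global_w, x) for x, _ in e_rows]
        entity_w[entity] = fit_logistic(e_rows, offsets, n_features)
    return global_w, entity_w


def predict(global_w, entity_w, entity, x):
    # Unseen entities fall back to the global model alone.
    w_e = entity_w.get(entity, [0.0] * len(x))
    return sigmoid(dot(global_w, x) + dot(w_e, x))


# Toy data (assumed): members "a" and "b" react in opposite ways to feature 1.
records = [
    ("a", [1.0, 1.0], 1), ("a", [1.0, 0.0], 0),
    ("b", [1.0, 1.0], 0), ("b", [1.0, 0.0], 1),
] * 5
global_w, entity_w = train_mixed(records, n_features=2)
print(predict(global_w, entity_w, "a", [1.0, 1.0]))
print(predict(global_w, entity_w, "b", [1.0, 1.0]))
```

Because each per-entity model touches only its own slice of the data, the second step parallelizes naturally across machines, which is the property the article attributes to the approach.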

GDMix taps TensorFlow for data reading and gradient computation, which LinkedIn says led to a 10% to 40% training speed improvement on various datasets compared with Photon ML. The framework trains and evaluates models automatically and can handle on the order of hundreds of millions of models.

DeText, a toolkit for ranking with an emphasis on textual features, can be used within GDMix to train natively as a global fixed effect model. (DeText itself can be applied to a range of tasks, including search and recommendation ranking, multi-class classification, and query understanding.) It leverages semantic matching, using deep neural networks to understand member intents in search and recommender systems. Users can specify a fixed effect model type, and DeText and GDMix will train and evaluate it automatically, connecting the model to the subsequent random effect models. Currently, GDMix supports logistic regression models and the deep neural models that DeText supports, as well as arbitrary models users design and train outside of GDMix.

The open-sourcing of GDMix comes after LinkedIn released a toolkit to measure AI model fairness: LinkedIn Fairness Toolkit (LiFT). LiFT can be deployed during training to measure biases in corpora and evaluate notions of fairness for models while detecting differences in their performance across subgroups. LinkedIn says it has applied LiFT internally to measure the fairness metrics of training datasets for models prior to their training.
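As a rough illustration of what "differences in performance across subgroups" can mean in practice, the sketch below computes accuracy and positive-prediction rate per group and reports the largest gap between groups. It is plain Python, not the LiFT API. The function names `subgroup_report` and `max_gap`, the two metrics chosen, and the sample data are assumptions for illustration only.

```python
from collections import defaultdict


def subgroup_report(groups, labels, preds):
    """Per-group accuracy and positive-prediction rate (hypothetical helper)."""
    stats = defaultdict(lambda: {"n": 0, "correct": 0, "positive": 0})
    for g, y, p in zip(groups, labels, preds):
        s = stats[g]
        s["n"] += 1
        s["correct"] += int(y == p)
        s["positive"] += int(p == 1)
    return {
        g: {
            "accuracy": s["correct"] / s["n"],
            "positive_rate": s["positive"] / s["n"],
        }
        for g, s in stats.items()
    }


def max_gap(report, metric):
    # Largest difference in a metric between any two groups.
    values = [m[metric] for m in report.values()]
    return max(values) - min(values)


# Toy data (assumed): binary labels and predictions for two groups.
groups = ["x", "x", "y", "y"]
labels = [1, 0, 1, 0]
preds = [1, 0, 0, 0]
report = subgroup_report(groups, labels, preds)
print(report)
print(max_gap(report, "accuracy"), max_gap(report, "positive_rate"))
```

A large gap on either metric would be a prompt for closer inspection rather than a verdict, since which fairness notion applies depends on the application.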
