Infrastructure

MLOps

The practice of streamlining the ML lifecycle — from experimentation to production deployment and ongoing monitoring.

Definition

MLOps (Machine Learning Operations) applies DevOps principles to ML systems, addressing the unique challenges of deploying and maintaining models in production. While software code is deterministic, ML models are probabilistic and degrade over time: predictions worsen when the input distribution shifts ("data drift") or when the relationship between inputs and targets changes ("concept drift").
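The drift monitoring described above can be reduced to a statistical check comparing live data against a reference sample. A minimal sketch, assuming a mean-shift z-test on a single numeric feature (the window sizes and threshold here are illustrative, not prescriptive; production systems typically use richer tests and per-feature monitoring):

```python
import random
import statistics

def mean_shift_drift(reference, live, threshold=3.0):
    """Flag drift when the live window's mean deviates from the
    reference mean by more than `threshold` standard errors."""
    ref_mean = statistics.fmean(reference)
    ref_std = statistics.stdev(reference)
    standard_error = ref_std / len(live) ** 0.5
    z = abs(statistics.fmean(live) - ref_mean) / standard_error
    return z > threshold

random.seed(0)
# Reference window from training time; live windows from serving.
reference = [random.gauss(0.0, 1.0) for _ in range(1000)]
stable = [random.gauss(0.0, 1.0) for _ in range(200)]
shifted = [random.gauss(0.8, 1.0) for _ in range(200)]  # drifted feature

print(mean_shift_drift(reference, stable))
print(mean_shift_drift(reference, shifted))
```

A check like this, run on a schedule, is what typically backs the "automated retraining trigger": when drift is flagged, the pipeline re-trains and re-validates the model.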

MLOps encompasses: experiment tracking (logging hyperparameters, metrics, artefacts), model versioning and registry, CI/CD pipelines for model training and deployment, feature stores, automated retraining triggers, performance monitoring, and A/B testing.
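The experiment-tracking component above can be sketched as a small in-memory run log. This is a hypothetical, simplified stand-in for tools like MLflow or Weights & Biases, which persist the same information (parameters, metrics, artefacts) to a tracking server rather than process memory:

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class Run:
    """One training run: hyperparameters, metrics, and artefact paths."""
    run_id: str = field(default_factory=lambda: uuid.uuid4().hex[:8])
    params: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)
    artifacts: list = field(default_factory=list)

class Tracker:
    """Minimal in-memory tracker; real platforms back this with a server."""
    def __init__(self):
        self.runs = []

    def start_run(self):
        run = Run()
        self.runs.append(run)
        return run

    def best_run(self, metric, maximize=True):
        scored = [r for r in self.runs if metric in r.metrics]
        key = lambda r: r.metrics[metric]
        return max(scored, key=key) if maximize else min(scored, key=key)

tracker = Tracker()
for lr in (0.1, 0.01):
    run = tracker.start_run()
    run.params["learning_rate"] = lr
    # Stand-in for a real training loop producing an evaluation metric.
    run.metrics["accuracy"] = 0.9 if lr == 0.01 else 0.8

print(tracker.best_run("accuracy").params)
```

Querying the best run by metric is the same operation a model registry performs when promoting a candidate model to production.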

Mature MLOps platforms (Weights & Biases, MLflow, Kubeflow, Vertex AI) become essential for organisations running many models in production. The often-quoted claim that "85% of ML projects never reach production" reflects the gap between experimentation and reliable deployment — the gap MLOps exists to close.

Examples

  • Weights & Biases
  • MLflow
  • Kubeflow
  • Amazon SageMaker
  • Databricks MLflow