How to Build a Machine Learning Pipeline in Rust

When Python gets in the way

You built a prototype in Python because the libraries were ready. Now you need to ship the model inside a Rust application that handles real-time sensor data, powers a high-throughput API, or runs on an embedded device. The Python interpreter is too heavy. The GIL blocks your threads. You want the model to run alongside your core logic with zero overhead, memory safety, and predictable performance.

Rust can do this. The ecosystem isn't as vast as Python's, but the core pieces fit together cleanly. You define your data structures, load the data, train a model, and evaluate results using ndarray for numerical computation and linfa for machine learning algorithms. The compiler checks element types and array dimensionality at every step. Mismatched axis lengths are reported as errors at runtime instead of silently producing wrong numbers.

The pipeline as an assembly line

A machine learning pipeline is an assembly line for data. Raw numbers enter at one end. They get cleaned, normalized, and fed into a mathematical model. Predictions exit at the other end. In Python, you often glue this together with loose functions and hope the shapes match. In Rust, the types are the blueprint. If a function expects a two-dimensional array of f64 and you hand it anything else, the compiler rejects the code.

The two main crates you'll reach for are ndarray and linfa. ndarray provides multi-dimensional arrays whose element type and number of dimensions are part of the type, stored in cache-friendly memory layouts. linfa provides the algorithms, from logistic regression to clustering, all built on top of ndarray. The linfa ecosystem splits algorithms into separate crates. You add linfa-logistic for logistic regression, linfa-clustering for k-means, and so on. You compile only the algorithms you use, which keeps binary size and compile times down.

Trust the types for dimensionality and element type. Axis lengths are checked at runtime, so handle those errors instead of unwrapping them away.

Minimal example

Start with a simple classification task. You have a matrix of features and a vector of labels. You want to train a logistic regression model and see the predictions.

Add these dependencies to your Cargo.toml:

[dependencies]
ndarray = "0.15"
linfa = "0.8"
linfa-logistic = "0.8"

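The linfa crate provides the core traits and types. linfa-logistic provides the logistic regression algorithm. ndarray provides the array structures. Check crates.io for the current linfa release and pin ndarray to the version that release depends on. If the two disagree, linfa will not accept your arrays and the compiler reports confusing trait errors.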

use ndarray::{Array1, Array2};
use linfa::prelude::*;
use linfa_logistic::LogisticRegression;

fn main() {
    // Create a 4x3 matrix of features.
    // Shape is (rows, columns). Here: 4 samples, 3 features each.
    // from_shape_vec consumes the vector and returns Err if the
    // element count does not match the shape.
    let data = Array2::from_shape_vec(
        (4, 3),
        vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0],
    ).unwrap();

    // Labels for each sample, one per row of data.
    // The two smallest samples are class 0 and the two largest class 1,
    // so a linear boundary can separate them.
    // from_vec consumes the vector.
    let labels = Array1::from_vec(vec![0, 0, 1, 1]);

    // Bundle data and labels into a Dataset.
    // Dataset::new returns the dataset directly (no Result) and takes
    // ownership of both arrays.
    let dataset = Dataset::new(data, labels);

    // Train a logistic regression model.
    // ::default() returns a builder for hyperparameters.
    // .fit() runs the training algorithm and returns a Result with the trained model.
    let model = LogisticRegression::default().fit(&dataset).unwrap();

    // Generate predictions on the same data.
    // data was moved into the dataset, so predict on the dataset itself.
    // predict returns an Array1 of predicted labels.
    let predictions = model.predict(&dataset);
    println!("Predictions: {:?}", predictions);
}

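When you write your own functions that read array data, take ArrayView2 as the parameter instead of an owned Array2. A view is a zero-copy borrow that callers create with .view(). The ndarray convention is views for inputs and owned arrays for outputs. This avoids unnecessary allocations and lets callers pass slices of larger arrays.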

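Run the code and compare the predictions with the labels. A linear model can only reproduce them when the classes are linearly separable, so keep that in mind when you pick toy data.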

How the pieces fit together

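The Dataset type holds the feature matrix (the records) and the response vector (the targets). Dataset is an alias for the DatasetBase struct with owned arrays, and DatasetBase is what the algorithms accept. In current linfa releases Dataset::new simply stores both arrays, so do not count on it to verify that the number of rows in your feature matrix matches the length of your label vector. That mismatch typically surfaces later, when fit validates its input and returns an error. The type system still catches the coarser mistakes, such as passing a one-dimensional array where a matrix is expected.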

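linfa configures algorithms through parameter builders, a common Rust convention for configuration. For logistic regression, LogisticRegression::default() gives you the defaults, and you chain methods like .alpha() for the L2 regularization strength or .max_iterations() to set hyperparameters. The parameters are validated when you call fit, which returns an error rather than a model if they are invalid. The fit method runs the training algorithm. Logistic regression has no closed-form solution, so training is iterative: linfa-logistic minimizes the regularized loss with a gradient-based solver (L-BFGS from the argmin crate, as far as the current implementation goes). The model learns weights that map features to labels. The trained model struct holds these weights.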

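The predict method applies the weights to new data. ndarray stores arrays in contiguous memory by default, so the matrix operations are cache-friendly. Parts of the linfa ecosystem can also link against a BLAS/LAPACK library for faster linear algebra. This is controlled through Cargo features rather than a separate setting: crates that support it expose a blas feature, and the linfa crate selects the backend with features such as openblas-system or intel-mkl-static. Check the README of the crate you use for the exact feature names.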

Profile the hot path. Matrix multiplication is where the time goes.
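
What predict does with the learned weights can be sketched in plain Rust. This is an illustration of binary logistic prediction, not linfa's internal code; the helper names and the weights are made up.

```rust
// Logistic function: squashes any real score into (0, 1).
fn sigmoid(z: f64) -> f64 {
    1.0 / (1.0 + (-z).exp())
}

// Hypothetical helper: classify one sample as 1 if P(class 1) >= 0.5.
fn predict_one(weights: &[f64], intercept: f64, features: &[f64]) -> u8 {
    assert_eq!(weights.len(), features.len(), "one weight per feature");
    // Linear score: dot product of weights and features, plus the intercept.
    let score: f64 = weights.iter().zip(features).map(|(w, x)| w * x).sum::<f64>() + intercept;
    if sigmoid(score) >= 0.5 { 1 } else { 0 }
}

fn main() {
    // Made-up weights, not the output of a real training run.
    let weights = [0.8, -0.4];
    let intercept = -1.0;
    println!("{}", predict_one(&weights, intercept, &[3.0, 1.0]));
    println!("{}", predict_one(&weights, intercept, &[0.5, 2.0]));
}
```

For a whole matrix of samples the same computation becomes one matrix-vector product, which is why the linear algebra backend dominates the runtime.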

Realistic pipeline with preprocessing

Real pipelines don't start with hardcoded vectors. You load data, split it, normalize it, train, and evaluate. Many models, logistic regression included, train better when features are on similar scales, so normalization is usually worth the extra step.

// Extra crates assumed in Cargo.toml, at versions matching your linfa release:
// linfa-preprocessing, linfa-datasets (with the "iris" feature), and rand.
use linfa::prelude::*;
use linfa_logistic::MultiLogisticRegression;
use linfa_preprocessing::linear_scaling::LinearScaler;

/// Train a multinomial logistic regression model on the Iris dataset and report accuracy.
fn run_pipeline() -> Result<(), Box<dyn std::error::Error>> {
    // Load the Iris dataset and shuffle it.
    // The rows are ordered by species, so an unshuffled split would leave
    // the test set with a single class.
    // thread_rng is the rand 0.8 name; use the rand version linfa depends on.
    let mut rng = rand::thread_rng();
    let dataset = linfa_datasets::iris().shuffle(&mut rng);

    // Split into training and test sets.
    // 80% train, 20% test.
    // split_with_ratio returns two Datasets.
    let (train, test) = dataset.split_with_ratio(0.8);

    // Normalize features.
    // fit on train computes the per-feature mean and standard deviation.
    // transform applies that same scaling to train and test without
    // recomputing stats. This prevents data leakage.
    // transform takes the dataset by value and returns the scaled dataset.
    let scaler = LinearScaler::standard().fit(&train)?;
    let train_scaled = scaler.transform(train);
    let test_scaled = scaler.transform(test);

    // Train the model on scaled data.
    // Iris has three classes, so use the multinomial variant;
    // LogisticRegression handles two classes only.
    let model = MultiLogisticRegression::default().fit(&train_scaled)?;

    // Predict on test set.
    let predictions = model.predict(&test_scaled);

    // Evaluate accuracy.
    // The confusion matrix compares predictions to the true labels
    // stored in the test dataset.
    let cm = predictions.confusion_matrix(&test_scaled)?;
    println!("Accuracy: {:.2}%", cm.accuracy() * 100.0);

    Ok(())
}

fn main() {
    run_pipeline().unwrap();
}
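
The accuracy figure printed by the pipeline is just the share of test samples whose predicted label equals the true label. A plain-Rust sketch of that arithmetic, with a hypothetical helper name and made-up labels:

```rust
// Hypothetical helper: share of predictions equal to the true label.
// Returns None for empty input or mismatched lengths.
fn accuracy(predicted: &[usize], actual: &[usize]) -> Option<f64> {
    if predicted.len() != actual.len() || predicted.is_empty() {
        return None;
    }
    let correct = predicted.iter().zip(actual).filter(|(p, a)| p == a).count();
    Some(correct as f64 / predicted.len() as f64)
}

fn main() {
    // Made-up labels for three classes; three of four predictions are correct.
    let predicted = [0, 1, 2, 2];
    let actual = [0, 1, 1, 2];
    match accuracy(&predicted, &actual) {
        Some(acc) => println!("Accuracy: {:.2}%", acc * 100.0),
        None => println!("Cannot compute accuracy"),
    }
}
```

Accuracy alone hides which classes get confused with each other, which is one reason to look at a full confusion matrix as well.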

Standard scaling transforms features to have zero mean and unit variance. In the linfa ecosystem it lives in the linfa-preprocessing crate as LinearScaler::standard(). Scaling helps the optimizer converge faster and prevents features with large ranges from dominating the model. The fit method computes the mean and standard deviation from the training data. The transform method applies the scaling. You must call fit on the training data only, then transform both the training and the test data with the fitted scaler. Never fit the scaler on the test set. That leaks information from the test set into the preprocessing, which inflates your accuracy metrics.

Fit the scaler on training data only. Transform the test data. Never let the test set influence the preprocessing.
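
The fit/transform split is easy to see for a single feature in plain Rust. This is a sketch of the idea with hypothetical names, using the population standard deviation; it is not the linfa-preprocessing implementation.

```rust
// Mean and standard deviation learned from the training values of one feature.
struct ScalerStats {
    mean: f64,
    std_dev: f64,
}

// Hypothetical helper: compute statistics from training data only.
fn fit_scaler(train: &[f64]) -> Option<ScalerStats> {
    if train.is_empty() {
        return None;
    }
    let n = train.len() as f64;
    let mean = train.iter().sum::<f64>() / n;
    // Population variance (divide by n).
    let variance = train.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / n;
    Some(ScalerStats { mean, std_dev: variance.sqrt() })
}

// Apply already-fitted statistics. Nothing is recomputed here.
fn transform(stats: &ScalerStats, values: &[f64]) -> Vec<f64> {
    // A constant feature has zero spread; divide by 1 to avoid NaN.
    let scale = if stats.std_dev > 0.0 { stats.std_dev } else { 1.0 };
    values.iter().map(|x| (x - stats.mean) / scale).collect()
}

fn main() {
    let train = [1.0, 3.0];
    let test = [2.0, 5.0];
    let stats = fit_scaler(&train).expect("training data must not be empty");
    // The test values are scaled with the training mean and deviation.
    println!("train scaled: {:?}", transform(&stats, &train));
    println!("test scaled: {:?}", transform(&stats, &test));
}
```

Keeping the statistics in their own struct makes the leak hard to write by accident: the only way to scale the test set is with numbers that came from the training set.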

Pitfalls and compiler errors

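Shape mismatches are the enemy. If you pass a vector where a matrix is expected, the compiler rejects you with E0308 (mismatched types). Mismatched lengths within the right dimensionality, such as 100 rows of features and 99 labels, are not visible to the compiler and surface as runtime errors. If you try to use a type that doesn't implement linfa's traits, you'll see E0277 (trait bound not satisfied). For example, linfa algorithms require feature values that implement linfa's Float trait, which covers f32 and f64. If you use i32 instead of f64, you'll get a trait error.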

Data leakage is a silent killer. If you normalize the entire dataset before splitting, the test set influences the scaling. Your accuracy will look good during development but drop in production. Always split first, then preprocess.

Memory allocation in tight loops can hurt performance. ndarray allocates on the heap. If you create arrays inside a loop, you trigger allocations on every iteration. Reuse buffers or use ArrayView to avoid copies.

Check your shapes before you call fit. A panic at runtime is easier to debug than a silent wrong prediction.

Choosing your tools

Use linfa for classical machine learning tasks like logistic regression, k-means, or PCA when you want a pure Rust stack with predictable performance and no C dependencies.

Use candle or tch-rs for deep learning and neural networks when you need GPU support or want to load models trained in Python frameworks.

Use ndarray as your data structure for tabular data and tensors when you need strict shape enforcement and cache-friendly memory access.

Reach for polars or the Arrow crates (arrow2 is no longer maintained, so prefer the arrow crate for new code) when your pipeline involves heavy data manipulation, filtering, or joining before the modeling step; these crates optimize for dataframe operations rather than numerical computation.

Pick the tool that matches the model, not the hype.

Where to go next

You now have the full assembly line: load data into ndarray, bundle it into a Dataset, split, scale, fit, and evaluate. From here, swap the logistic regression for another linfa crate such as linfa-clustering, move the data preparation into polars if it outgrows ndarray, or reach for candle or tch-rs when the model becomes a neural network.