We need a principled way of managing state in real-time ML pipelines.
As more models are deployed in real-world pipelines, the recurring lesson is that data and data featurization matters above all else. The last generation of big data systems scaled ML to real-world datasets, and now feature stores are quickly emerging as a new frontier for connecting models to real-time data .
Keeping features up-to-date is critical for model accuracy, but expensive and hard to scale.
Feature stores, as the name implies, store features derived from raw data and serve them to downstream models for training and inference. For…