An Accelerated Human-in-the-Loop Machine Learning System

Why Helix?

Machine learning (ML) research in the past few years has gifted us with numerous effective tools and techniques to power applications with data. Even so, ML applications still require a great deal of customization, and ML practitioners spend an inordinate amount of time discovering, through trial and error, the precise recipe for their applications. From raw data to a learned model, perfecting a machine learning pipeline involves many iterations with incremental modifications to both the data preparation and the model training components.
Helix offers an integrated framework for effortlessly developing and iterating over end-to-end machine learning pipelines. Through intelligent materialization of intermediate results and fine-grained data provenance across data preparation and model learning, Helix significantly reduces the manual overhead for incremental modification and shortens the iteration cycle. The declarative programming interface allows the data scientist to focus on data intelligence rather than system details.
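To make the idea of reusing intermediate results concrete, here is a minimal Python sketch. It is not Helix's interface or its materialization strategy: the class `CachingPipeline`, the `(name, version, function)` step format, and the toy text-processing steps are all hypothetical names chosen for illustration. The sketch also stores every intermediate result, whereas Helix is described above as choosing intelligently what to materialize.

```python
import hashlib
import pickle


class CachingPipeline:
    """Toy pipeline that reuses stored step outputs across iterations.

    Hypothetical illustration only; this is not Helix's API.
    """

    def __init__(self):
        self.store = {}     # step signature -> materialized output
        self.executed = []  # names of the steps actually run in the last call

    def run(self, steps, raw_input):
        """Apply steps in order; each step is a (name, version, function) tuple."""
        self.executed = []
        data = raw_input
        # The chain of signatures starts from a hash of the raw input.
        signature = hashlib.sha256(pickle.dumps(raw_input)).hexdigest()
        for name, version, func in steps:
            # A step's signature covers its own identity and everything upstream,
            # so changing a step invalidates that step and all steps after it.
            signature = hashlib.sha256(
                f"{signature}|{name}|{version}".encode()
            ).hexdigest()
            if signature in self.store:
                data = self.store[signature]  # reuse the materialized result
            else:
                data = func(data)             # recompute and materialize
                self.store[signature] = data
                self.executed.append(name)
        return data


def tokenize(rows):
    return [row.lower().split() for row in rows]


def count(tokens):
    return [len(t) for t in tokens]


def total(counts):
    return sum(counts)


def average(counts):
    return sum(counts) / len(counts)


pipeline = CachingPipeline()
rows = ["Helix speeds up iteration", "Reuse saves time"]

# Iteration 1: nothing is stored yet, so every step runs.
first = pipeline.run(
    [("tokenize", 1, tokenize), ("count", 1, count), ("aggregate", 1, total)], rows
)
first_executed = list(pipeline.executed)

# Iteration 2: only the last step changed, so the earlier outputs are reused.
second = pipeline.run(
    [("tokenize", 1, tokenize), ("count", 1, count), ("aggregate", 2, average)], rows
)
second_executed = list(pipeline.executed)
```

In this sketch, the second iteration recomputes only the modified final step. A real system would additionally have to weigh storage cost against recomputation cost when deciding what to keep.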


Recent Releases

Currently in private beta. Public beta to be announced.


The Helix development team includes Doris Xin (dorx0), Stephen Macke (smacke), Jialin Liu (jialin2), Doris Lee, and Angie Lee, under the guidance of Prof. Aditya Parameswaran (adityagp) at the University of Illinois at Urbana-Champaign.