
Offline and Online Feature Store for Faster and Consistent Machine Learning Modelling in Wellness Domain

EasyChair Preprint 7179

3 pages. Date: December 7, 2021

Abstract

Machine Learning (ML) projects in industry must draw from various data sources, which often complicates the data analysis process. Dealing with messy data, converting it to usable formats, and extracting and engineering features take ~70% of the development time. Ensuring feature consistency is typically challenging because of disparities between development and production infrastructures, and reusing these features across projects and teams is difficult. To address these challenges, Feature Stores are being developed to make curated features readily available to developers while also ensuring consistency across time and environments.

We report on the development of a proprietary Feature Store to deal with 20 years’ worth of data in the Wellness industry. We present an Offline Feature Store for the model development cycle, built using Snowflake and DBT, and an Online Feature Store for serving features in real time, built using Amazon DynamoDB. We further demonstrate one of our projects developed using the Feature Store, Lead Scoring, which was completed in one quarter, well ahead of the two quarters allocated by the business. The Feature Store reduced the data processing time of our developers by at least 50%, expediting the development process.
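
The split between an offline and an online store can be illustrated with a minimal Python sketch. It is not the implementation reported in the preprint: plain dictionaries stand in for the Snowflake/DBT tables and the DynamoDB table, and the `Visit` record, the `visits_last_30d` feature, and both builder functions are hypothetical names chosen for illustration. The point it shows is that one shared feature definition feeds both stores, so the value used for training matches the value served online.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Visit:
    """Hypothetical raw event: one member visit on a given day."""
    member_id: str
    day: date


def visits_last_30d(visits: List[Visit], member_id: str, as_of: date) -> int:
    """Hypothetical feature: visits in the 30 days up to and including as_of.

    This single definition is reused by both stores below.
    """
    return sum(
        1
        for v in visits
        if v.member_id == member_id and 0 <= (as_of - v.day).days < 30
    )


def build_offline_store(
    visits: List[Visit], member_ids: List[str], snapshot_days: List[date]
) -> Dict[Tuple[str, date], int]:
    """Offline stand-in: point-in-time rows keyed by (member, snapshot day)."""
    return {
        (m, d): visits_last_30d(visits, m, d)
        for m in member_ids
        for d in snapshot_days
    }


def build_online_store(
    visits: List[Visit], member_ids: List[str], today: date
) -> Dict[str, int]:
    """Online stand-in: only the latest value per member, for key lookups."""
    return {m: visits_last_30d(visits, m, today) for m in member_ids}


# Made-up example data, for illustration only.
visits = [
    Visit("m1", date(2021, 11, 1)),
    Visit("m1", date(2021, 11, 20)),
    Visit("m2", date(2021, 10, 1)),
]
members = ["m1", "m2"]
today = date(2021, 12, 1)

offline = build_offline_store(visits, members, [date(2021, 11, 15), today])
online = build_online_store(visits, members, today)

# The served value equals the training value for the same day.
for m in members:
    assert offline[(m, today)] == online[m]
```

In this sketch the offline store keeps one row per member and snapshot day, which suits building training sets, while the online store keeps a single current value per member, which suits low-latency lookups at serving time.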

Keyphrases: Feature Store, Offline Online Database, feature consistency, feature engineering, fitness, wellness

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:7179,
  author    = {Sukant Kumar and Ramya Velaga and Saishradha Mohanty and Prasad Saripalli},
  title     = {Offline and Online Feature Store for Faster and Consistent Machine Learning Modelling in Wellness Domain},
  howpublished = {EasyChair Preprint 7179},
  year      = {EasyChair, 2021}}