hip-data-ml-utils

This Python library bundles the common utilities that data and ML projects use.

hip-data-ml-utils has a few utilities that we try to generalise across projects.

Why are we packaging this into a Python library?

Several utilities are copied and pasted across different repositories (and projects); packaging them lets each project replace those copies with a single import.

This also saves time, since we write fewer tests.

Additionally, it aligns how analysts query Athena through Python.

Getting Started

You can install hip-data-ml-utils using pip:

$ pip install hip-data-ml-utils --upgrade

Pyathena

We try to keep each function call as simple as a one-liner. This covers:

  • query Athena tables and return the results as a pandas DataFrame

  • drop Athena tables from the offline feature store

  • create Athena tables from a schema, and update tables with missing partitions

See Pyathena client for more details.
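
As a rough illustration of the operations listed in this section, the sketch below uses the open-source pyathena package directly rather than this library's own wrappers, whose names are not shown here. The function names, the staging_dir and region parameters, and the use of MSCK REPAIR TABLE to pick up missing partitions are all assumptions made for illustration.

```python
import re

# Only plain identifiers are accepted, so the names can be interpolated safely.
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def _qualified(database: str, table: str) -> str:
    """Return database.table, rejecting names that are not plain identifiers."""
    for name in (database, table):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return f"{database}.{table}"


def drop_table_sql(database: str, table: str) -> str:
    """Build the statement that drops a table if it exists."""
    return f"DROP TABLE IF EXISTS {_qualified(database, table)}"


def repair_table_sql(database: str, table: str) -> str:
    """Build the statement that registers partitions missing from the catalog."""
    return f"MSCK REPAIR TABLE {_qualified(database, table)}"


def query_to_dataframe(sql: str, staging_dir: str, region: str):
    """Run a query on Athena and return the result as a pandas DataFrame."""
    # Imported lazily so the SQL helpers above work without pyathena installed.
    from pyathena import connect
    from pyathena.pandas.cursor import PandasCursor

    cursor = connect(
        s3_staging_dir=staging_dir,
        region_name=region,
        cursor_class=PandasCursor,
    ).cursor()
    return cursor.execute(sql).as_pandas()
```

Validating identifiers before building the statement keeps the helpers safe to call with names that come from configuration files.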

MLflow tracker

We try to keep each function call as simple as a one-liner. This covers:

  • log artifact

  • log and register a model

  • log params

  • log metrics

See MLflow tracker utils for more details.
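
To give a sense of what a one-liner for tracking might wrap, here is a minimal sketch built on MLflow's public logging functions. The log_run name and its tracker parameter are hypothetical and are not this library's API; tracker exists only so that a stand-in object can replace the mlflow module in tests.

```python
from typing import Any, Mapping, Optional


def log_run(
    params: Mapping[str, Any],
    metrics: Mapping[str, float],
    artifact_path: Optional[str] = None,
    tracker: Any = None,
) -> str:
    """Log params, metrics and an optional artifact in one run; return its run_id."""
    if tracker is None:
        # Default to the real MLflow module; tests can pass a stand-in object.
        import mlflow as tracker

    with tracker.start_run() as run:
        tracker.log_params(dict(params))
        tracker.log_metrics(dict(metrics))
        if artifact_path is not None:
            tracker.log_artifact(artifact_path)
        return run.info.run_id
```

Logging and registering a model would typically add a flavor-specific call inside the same run, for example mlflow.sklearn.log_model with its registered_model_name argument.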

MLflow utils

We try to keep each function call as simple as a one-liner. This covers:

  • load model

  • load artifact

  • get MLflow model evaluation metrics

  • get registered model run info and MLflow run_id

  • promote MLflow model

See MLflow utils for more details.
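
The sketch below shows how loading a registered model and reading a run's metrics could look with MLflow's own client. The helper names and the client parameter are assumptions for illustration, not this library's functions.

```python
from typing import Any, Dict, Union


def registered_model_uri(name: str, version_or_stage: Union[int, str]) -> str:
    """Build a Model Registry URI such as models:/churn/3."""
    return f"models:/{name}/{version_or_stage}"


def load_registered_model(name: str, version_or_stage: Union[int, str]) -> Any:
    """Load a registered model as a generic pyfunc model."""
    # Imported lazily: only needed when a model is actually loaded.
    import mlflow.pyfunc

    return mlflow.pyfunc.load_model(registered_model_uri(name, version_or_stage))


def get_run_metrics(run_id: str, client: Any = None) -> Dict[str, float]:
    """Return the latest value of each metric logged for a run."""
    if client is None:
        from mlflow.tracking import MlflowClient

        client = MlflowClient()
    return dict(client.get_run(run_id).data.metrics)
```

How a model is promoted depends on the MLflow version in use: older registries move a version between stages with MlflowClient.transition_model_version_stage, while newer ones assign aliases with MlflowClient.set_registered_model_alias.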

MLflow serve

We try to keep each function call as simple as a one-liner. This covers:

  • enable model endpoint

  • get endpoint status

  • get endpoint state status

  • update Databricks model endpoint compute config

See MLflow serve for more details.
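
As an illustration of an endpoint status check, the sketch below targets the Databricks serving-endpoints REST API using only the standard library. Whether this library calls that API or an older one is an assumption, as are the function names and the host and token parameters.

```python
import json
import urllib.request
from typing import Any, Dict


def endpoint_status_request(host: str, token: str, endpoint_name: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a serving endpoint's description."""
    url = f"{host.rstrip('/')}/api/2.0/serving-endpoints/{endpoint_name}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def fetch_endpoint(host: str, token: str, endpoint_name: str) -> Dict[str, Any]:
    """Send the request and decode the JSON description of the endpoint."""
    request = endpoint_status_request(host, token, endpoint_name)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def is_ready(endpoint: Dict[str, Any]) -> bool:
    """True when the endpoint is serving and no config update is in progress."""
    state = endpoint.get("state", {})
    return state.get("ready") == "READY" and state.get("config_update") == "NOT_UPDATING"
```

Separating request construction from sending keeps the URL and header logic testable without network access.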

MLflow prediction requests

We try to keep each function call as simple as a one-liner. This covers:

  • verify that the predictions returned for requests match the expected values

  • post requests for integration tests

See MLflow prediction requests for more details.
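
For integration tests of this kind, the two pieces that usually matter are the request body and the comparison against expected values. The sketch below assumes the endpoint accepts MLflow's dataframe_records input format and returns numeric predictions; the function names and tolerances are hypothetical.

```python
import json
import math
from typing import Any, Dict, List, Sequence


def build_payload(records: List[Dict[str, Any]]) -> bytes:
    """Encode input rows in MLflow's dataframe_records scoring format."""
    return json.dumps({"dataframe_records": records}).encode("utf-8")


def predictions_match(
    actual: Sequence[float],
    expected: Sequence[float],
    rel_tol: float = 1e-6,
    abs_tol: float = 1e-9,
) -> bool:
    """Compare predictions element-wise, allowing for floating-point noise."""
    if len(actual) != len(expected):
        return False
    return all(
        math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, e in zip(actual, expected)
    )
```

Comparing with a tolerance rather than strict equality avoids spurious failures when the served model and the reference run on different hardware or library versions.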