hip-data-ml-utils¶
This Python library packages the common utilities that data/ML projects use.
hip-data-ml-utils collects a few utilities that we try to generalise across projects.
Why are we packaging this into a Python library?¶
Several utilities are copied and pasted across different repositories (and projects); packaging them lets each project replace those copies with a single import.
It also saves time, because the shared code is tested once in this library and each project has fewer tests to write.
Additionally, it aligns how analysts query Athena through Python.
Getting Started¶
You can install hip-data-ml-utils using pip:
$ pip install hip-data-ml-utils --upgrade
Pyathena¶
We try to keep each function call as simple as a one-liner. This covers:
query Athena tables and return the results as a pandas DataFrame
drop Athena tables from the offline feature store
create Athena tables from a schema, and update tables with missing partitions
See Pyathena client for more details.
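As a rough illustration of what such one-liners wrap, here is a minimal sketch built directly on the PyAthena package. The function names (`query_to_dataframe`, `drop_table_sql`, `repair_partitions_sql`) are hypothetical and are not the hip-data-ml-utils API; the staging directory and region are placeholders you would supply yourself.

```python
from typing import Any


def query_to_dataframe(sql: str, s3_staging_dir: str, region_name: str) -> Any:
    """Run `sql` on Athena and return the result as a pandas DataFrame."""
    # Imported here so the SQL helpers below work without pyathena installed.
    from pyathena import connect
    from pyathena.pandas.cursor import PandasCursor

    cursor = connect(
        s3_staging_dir=s3_staging_dir,  # S3 location where Athena writes results
        region_name=region_name,
        cursor_class=PandasCursor,
    ).cursor()
    return cursor.execute(sql).as_pandas()


def drop_table_sql(table: str) -> str:
    """Build the DDL that drops a table if it exists (hypothetical helper)."""
    return f"DROP TABLE IF EXISTS `{table}`"


def repair_partitions_sql(table: str) -> str:
    """Build the DDL that registers partitions found in S3 but missing from the catalog."""
    return f"MSCK REPAIR TABLE `{table}`"
```

Note that `MSCK REPAIR TABLE` only discovers Hive-style partitions (`key=value` prefixes in S3); tables laid out differently would need `ALTER TABLE ... ADD PARTITION` instead.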
MLflow tracker¶
We try to keep each function call as simple as a one-liner. This covers:
log artifact
log and register a model
log params
log metrics
See MLflow tracker utils for more details.
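For orientation, the four logging steps above map onto standard MLflow calls roughly as follows. This is a sketch under assumptions: `log_run` is a hypothetical name rather than the function this library exposes, and the model is assumed to be a scikit-learn estimator.

```python
from typing import Any, Dict, Optional


def log_run(
    params: Dict[str, Any],
    metrics: Dict[str, float],
    artifact_path: Optional[str] = None,
    model: Any = None,
    registered_model_name: Optional[str] = None,
) -> str:
    """Log params, metrics, an optional file and an optional model in one run."""
    import mlflow  # imported lazily; requires the mlflow package

    with mlflow.start_run() as run:
        mlflow.log_params(params)
        mlflow.log_metrics(metrics)
        if artifact_path is not None:
            # Uploads a local file to the run's artifact store.
            mlflow.log_artifact(artifact_path)
        if model is not None:
            # Passing registered_model_name also registers the logged model.
            mlflow.sklearn.log_model(
                model, "model", registered_model_name=registered_model_name
            )
        return run.info.run_id
```

The returned run ID is what you would later pass to the model-loading and metrics helpers.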
MLflow utils¶
We try to keep each function call as simple as a one-liner. This covers:
load model
load artifact
get MLflow model evaluation metrics
get registered model run info and the MLflow run_id
promote an MLflow model
See MLflow utils for more details.
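The sketch below shows how a few of these operations could look with plain MLflow. The helper names are hypothetical, and promotion is shown here as setting a registry alias (`champion` is an assumed alias name); the library itself may promote models differently, for example through registry stages.

```python
from typing import Any, Dict


def model_uri(name: str, version: int) -> str:
    """Build a registry URI of the form models:/<name>/<version>."""
    return f"models:/{name}/{version}"


def load_registered_model(name: str, version: int) -> Any:
    """Load a registered model version as a generic pyfunc model."""
    import mlflow  # imported lazily; requires the mlflow package

    return mlflow.pyfunc.load_model(model_uri(name, version))


def get_run_metrics(run_id: str) -> Dict[str, float]:
    """Return the latest value of every metric logged on a run."""
    import mlflow

    return dict(mlflow.tracking.MlflowClient().get_run(run_id).data.metrics)


def promote_model(name: str, version: int, alias: str = "champion") -> None:
    """Point a registry alias at the given model version (assumed promotion scheme)."""
    import mlflow

    mlflow.tracking.MlflowClient().set_registered_model_alias(
        name, alias, str(version)
    )
```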
MLflow serve¶
We try to keep each function call as simple as a one-liner. This covers:
enable model endpoint
get endpoint status
get endpoint state status
update the Databricks model endpoint compute config
See MLflow serve for more details.
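To make the status checks concrete, here is a minimal sketch that reads an endpoint's state through the Databricks serving-endpoints REST API using only the standard library. It assumes that API is the one in use (the library may target a different one), and `endpoint_url`, `get_endpoint_state` and `is_ready` are hypothetical names.

```python
import json
import urllib.request
from typing import Any, Dict


def endpoint_url(host: str, name: str) -> str:
    """Build the REST URL that describes a serving endpoint."""
    return f"{host.rstrip('/')}/api/2.0/serving-endpoints/{name}"


def get_endpoint_state(host: str, token: str, name: str) -> Dict[str, Any]:
    """Fetch the endpoint description and return its `state` object."""
    request = urllib.request.Request(
        endpoint_url(host, name),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload.get("state", {})


def is_ready(state: Dict[str, Any]) -> bool:
    """True when the endpoint is serving and no config update is in flight."""
    return (
        state.get("ready") == "READY"
        and state.get("config_update") == "NOT_UPDATING"
    )
```

A deployment script would typically poll `get_endpoint_state` until `is_ready` returns true before sending traffic.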
MLflow prediction requests¶
We try to keep each function call as simple as a one-liner. This covers:
verify that predictions returned for requests match the expected values
post requests for integration tests
See MLflow prediction requests for more details.
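An integration test along these lines might look like the sketch below. It assumes the endpoint accepts the MLflow 2.x `dataframe_records` input format and returns a `predictions` list of numbers; the function names and the tolerance are hypothetical choices, not the library's own.

```python
import json
import math
import urllib.request
from typing import Any, Dict, List, Sequence


def build_payload(records: Sequence[Dict[str, Any]]) -> Dict[str, Any]:
    """Wrap row dictionaries in the `dataframe_records` scoring format."""
    return {"dataframe_records": list(records)}


def post_prediction(
    url: str, token: str, records: Sequence[Dict[str, Any]]
) -> List[float]:
    """POST the records to the invocations URL and return the predictions."""
    body = json.dumps(build_payload(records)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["predictions"]


def predictions_match(
    actual: Sequence[float], expected: Sequence[float], abs_tol: float = 1e-6
) -> bool:
    """Compare predictions element-wise within an absolute tolerance."""
    return len(actual) == len(expected) and all(
        math.isclose(a, e, abs_tol=abs_tol) for a, e in zip(actual, expected)
    )
```

Comparing with a tolerance rather than strict equality avoids spurious failures from floating-point differences between the training and serving environments.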