Pyathena API Specs

data_ml_utils.pyathena_client.client.PyAthenaClient

class data_ml_utils.pyathena_client.client.PyAthenaClient
Initialises a pyathena connection.

The PyAthenaClient class takes no parameters on initialisation.

Methods

  • __init__() – initialise self

  • _connect() – create pyathena connection

  • create_msck_repair_table(create_raw_query, repair_raw_query, yaml_schema_file_path) – create and repair table through defined schema

  • drop_table(table_name, database) – drop table

  • query_as_pandas(final_query) – query athena tables and return as pandas dataframe

_connect

_connect()
create a pyathena connection with pandas cursor

Parameters: none

Returns:

pyathena connection engine

Return type:

pyathena.connection.Connection
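
For orientation, a pyathena connection with a pandas cursor can be opened roughly as sketched below. This is a minimal sketch and not necessarily how _connect is implemented: the function name open_connection and both of its parameters are hypothetical, and the imports are deferred so the snippet loads even where pyathena is not installed.

```python
def open_connection(s3_staging_dir: str, region_name: str):
    """Open a pyathena connection whose cursors return pandas dataframes.

    Hypothetical helper: the documented _connect takes no arguments, and
    where it reads its configuration from is not described here.
    """
    # Deferred imports: pyathena (with pandas support) is needed only when called.
    from pyathena import connect
    from pyathena.pandas.cursor import PandasCursor

    # s3_staging_dir is the S3 location where Athena writes query results.
    return connect(
        s3_staging_dir=s3_staging_dir,
        region_name=region_name,
        cursor_class=PandasCursor,
    )
```

Calling open_connection requires pyathena to be installed; running queries on the returned connection additionally requires AWS credentials.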

create_msck_repair_table

create_msck_repair_table(create_raw_query: str, repair_raw_query: str, yaml_schema_file_path: str)
create a table in athena and run msck repair table on it with the pyathena connection
Parameters:
  • create_raw_query (str) – create table raw sql query

  • repair_raw_query (str) – repair table raw sql query

  • yaml_schema_file_path (str) – file path to yaml schema

Returns:

a non-exit function value, returned if the call succeeds

Return type:

int
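
The three arguments can be assembled as in the sketch below. It is a minimal illustration under stated assumptions: register_table and RecordingClient are hypothetical, the DDL, table name, S3 location and YAML path are placeholders, and the stub's return value of 0 is assumed. How the YAML schema is combined with the create query is not described above, so the path is simply passed through.

```python
class RecordingClient:
    """Stand-in for PyAthenaClient that records arguments instead of calling Athena."""

    def __init__(self):
        self.calls = []

    def create_msck_repair_table(
        self, create_raw_query, repair_raw_query, yaml_schema_file_path
    ):
        self.calls.append(
            {
                "create_raw_query": create_raw_query,
                "repair_raw_query": repair_raw_query,
                "yaml_schema_file_path": yaml_schema_file_path,
            }
        )
        return 0  # assumed success value; the real value is not specified above


def register_table(client, table, schema_path):
    """Build the two raw queries and hand them to the client (hypothetical helper)."""
    # Placeholder DDL: columns, partition key and location are illustrative only.
    create_raw_query = (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (id string) "
        "PARTITIONED BY (dt string) "
        "LOCATION 's3://example-bucket/example-prefix/'"
    )
    # MSCK REPAIR TABLE asks Athena to register partitions found under the location.
    repair_raw_query = f"MSCK REPAIR TABLE {table}"
    return client.create_msck_repair_table(
        create_raw_query=create_raw_query,
        repair_raw_query=repair_raw_query,
        yaml_schema_file_path=schema_path,
    )


client = RecordingClient()  # with the real library: client = PyAthenaClient()
status = register_table(client, "example_db.events", "schemas/events.yaml")
```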

drop_table

drop_table(table_name: str, database: str)
drop table in athena with pyathena connection
Parameters:
  • table_name (str) – table name

  • database (str) – database name

Returns:

a non-exit function value, returned if the call succeeds

Return type:

int
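
A call to drop_table might look like the sketch below, which drops several tables in one database. The helper drop_tables_in and the StubClient stand-in are hypothetical, the database and table names are placeholders, and the stub's return value of 0 is an assumption.

```python
class StubClient:
    """Stand-in for PyAthenaClient so the sketch runs without AWS access."""

    def __init__(self):
        self.dropped = []

    def drop_table(self, table_name, database):
        self.dropped.append((database, table_name))
        return 0  # assumed success value; the real value is not specified above


def drop_tables_in(client, database, table_names):
    """Drop each table in turn and map its name to the returned value."""
    return {
        name: client.drop_table(table_name=name, database=database)
        for name in table_names
    }


client = StubClient()  # with the real library: client = PyAthenaClient()
results = drop_tables_in(client, "example_db", ["events_tmp", "users_tmp"])
```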

query_as_pandas

query_as_pandas(final_query: str)
run an athena sql query with the pyathena connection and load the result into pandas
Parameters:
  • final_query (str) – sql query to run

Returns:

query result as a pandas dataframe

Return type:

pd.DataFrame
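
The call pattern for query_as_pandas can be sketched as follows. The helper preview_table and the StubClient stand-in are hypothetical and exist only so the snippet runs without AWS access; the table name is a placeholder. With the real class, the returned value would be a pandas dataframe rather than None.

```python
class StubClient:
    """Stand-in for PyAthenaClient that records the query it receives."""

    def __init__(self):
        self.queries = []

    def query_as_pandas(self, final_query):
        # The real method returns a pd.DataFrame; the stub only records the query.
        self.queries.append(final_query)
        return None


def preview_table(client, table, limit=10):
    """Fetch the first `limit` rows of `table` (hypothetical helper)."""
    final_query = f"SELECT * FROM {table} LIMIT {int(limit)}"
    return client.query_as_pandas(final_query=final_query)


client = StubClient()  # with the real library: client = PyAthenaClient()
preview_table(client, "example_db.events", limit=5)
print(client.queries[0])  # SELECT * FROM example_db.events LIMIT 5
```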