Published on

Using Snowflake with Python

Authors

Before we begin, DO NOT PIP INSTALL SNOWFLAKE!

Intro:

Note, this entry just covers intstalling and doing a basic read of data. I have (at least) two other posts that cover:

  1. Creating role permissions in python for Snowflake
  2. Uploading a parquet to snowflake from local that infers type of columns

Install snowflake package

Don't be tempted to just pip install "snowflake"! Instead install the "snowflake-connector-python" package using your favorite package manager. If you install the former, you will be in for a sad time of fixing the installion as it is incompatible with the actual snowflake package you need.

mamba install snowflake-connector-python

Copy your account from the dashboard

Some snowflake documentation is either old or incompatible with trial accounts. It recommends you use the url, but new urls / trial urls do not have the necessary information. Instead go to the account view in the bottom right of the UI and you will be able to see your account id. Note: in the UI, it has a period that you need to change from a period to a dash

It should look something like:

ABCDEFG-HIJ12345

for your account number. Make sure to put these into a .env and include the .env in a gitignore file so that you don't submit your credentials to git.

Read data from Snowflake

A good first step would be to do an initial read. You can do this of data from the snowflake trial account that is preloaded or data you loaded into the UI. You just need to know the database, schema and table you want to pull.

# Connecting to Snowflake
conn = snowflake.connector.connect(
    user=USER,
    password=PASSWORD,
    account=ACCOUNT,
)

start = time.time()

# connect
cur = conn.cursor()

# pick the database you want to use
database_text = "USE DATABASE database_name"
print(database_text)
conn.cursor().execute(database_text)

# read from a database 
sql = "select * from database_name.schema_nmae.table_name"
print(sql)
cur.execute(sql)

print(request_text)
conn.cursor().execute(request_text)

middle = time.time()
print("query took ", middle-start, " seconds")

df = cur.fetch_pandas_all()

end = time.time()
print("unwrap took", end-start, " seconds")

This is a intro into a command that gets you to reading data. From here, you can use any command that you would use within the UI to pull data. Additionally, you can run any command (i.e. role permissions) from these steps as well.

I will make a couple additional posts on specific commands as well, such as uploading a parquet with column type inference as well as creating role permissions.