Determing Average Storage Costs via Azure with Python
Enabling you to say “I saved us $3,000 per subscription!” with ease.
About a month ago, a photographer (and filmmaker in the making) friend approached me about hosting in Azure a copy of his media for safekeeping, and wanted to also understand an average cost over time as they’d add more files to the Storage Account. Funnily enough, this is a small application script that I had written before for the green office, along with a script that I had integrated into one of my monthly to-be-automated tasks here in the red office. I figured I’s share the Simple Python script, seeing that despite some of the excellent documentation provided by Microsoft, there are multiple ways to approach the solution which can easily be mangled and confused with other solutions and recommendations. It took me two days of work to get it all working together nicely, having scoured Stackoverflow and documentation sites all pointing to their “solutions” without specifying SDK version etc. So, let’s go over a coherent working method that I provided my friend for them to utilize as they traverse and leverage Azure in their media-creation process.
Getting Setup
Requirements:
- Python3.7
- Pip
- Virtualenv
With the above requirements installed, use the following commands to setup the dependencies that we’ll need for this:
# Default to using python3.7, modify path to your local py3.7 install
virtualenv -p /usr/local/bin/python3.7 venv
# Activate the virtual environment
source ./venv/bin/activate
# Install requirements
pip install 'azure-storage-blob==12.2.0'
pip install 'azure-mgmt-resource==2.2.0'
pip install 'azure-mgmt-storage==2.0.0'
pip install 'azure-common==1.1.24'
The Script:
So, let’s break down what the script is doing to achieve our analytic requirements. I will note now, that I haven’t tested this script against a valid subscription since rewriting it to accommodate this post, so there may be a typo-friendo along the way. We’ll ignore and fix those as they come up!
# Script AverageCostPerMonth.py
# Version: 1.1
# Usage: python3.7 AverageCostPerMonth.py
from azure.common.credentials import ServicePrincipalCredentials
from azure.mgmt.resource import ResourceManagementClient
from azure.mgmt.storage import StorageManagementClient
from azure.storage.blob import BlobServiceClient, ContainerClient
# Globals (REPLACE WITH YOURS)
TENANT = ""
CLIENT_ID = ""
SECRET = ""
SUBSCRIPTION_ID = ""
RESOURCE_GROUP_NAME = ""
STORAGE_ACCOUNT_NAME = ""
CONTAINER_NAME = ""
FILE_FILE_EXTENSION = ".raw"
# Functional Helpers
def pretty_size(bytes):
units = [
(1<<50, ' PB'),
(1<<40, ' TB'),
(1<<30, ' GB'),
(1<<20, ' MB'),
(1<<10, ' KB'),
(1, (' byte', ' bytes'))
]
for factor, suffix in units:
if bytes >= factor:
break
amount = int(bytes/factor)
if isinstance(suffix, tuple):
singular, multiple = suffix
if amount == 1:
suffix = singular
else:
suffix = multiple
return str(amount) + suffix
def average_cost(bytes, cost_average):
return (bytes / (1 << 30 )) * cost_average
def format_row():
return " {:80} {:25} \n".format
if __name__ == '__main__':
# Authenticate against ServicePrincipal
credentials = ServicePrincipalCredentials(tenant=TENANT, client_id=CLIENT_ID, secret=SECRET)
# Create Storage Client
storage_client = StorageManagementClient(credentials, SUBSCRIPTION_ID)
# Retrieve Storage Account Keys
storage_keys = storage_client.storage_accounts.list_keys(RESOURCE_GROUP_NAME, STORAGE_ACCOUNT_NAME)
storage_keys = { v.key_name: v.value for v in storage_keys.keys }
# Create Container Client, Grant it 20 second lease
blob_service_client = BlobServiceClient("https://{}.blob.core.windows.net".format(STORAGE_ACCOUNT_NAME), credentials=storage_keys["key1"])
container_service = blob_service_client.get_container_client(CONTAINER_NAME)
container_service.acquire_lease(20)
# Retrieve Blobs
blobs = container_service.list_blobs()
archive = [blob for blob in blobs]
archive_size = sum(blob.size for blob in archive)
# Print Stats
print(format_row("Name", "Size"))
for blob in archive:
print(format_row(blob.name, pretty_size(blob.size)))
print(f"\n\nTotal Size: {pretty_size(archive_size)}")
print(f"\nAverage Monthly Cost: ${average_cost(archive_size, 0.015)}")
Let’s go over some of the functional-style helpers that I wrote to help keep the script clean:
- For this version, I opted for brevity to make all the dynamic variables that would be useful to have as proper command-line arguments as global constant variables. These are used throughout the script and should never be committed to a repository (public or private).
- The
pretty_size
function was taken from ccpizza’s Stackoverflow answer, and allows for us to convert bytes to petabytes, with all the magic in-between. - the
average_cost
function is a simple GB * cost_average calculation, utilizing the bytes to GB formula to match Azure’s $0.015/GB pricing scheme. - the
format_row
function is a neat little trick I learned on Stackoverflow (cannot find the link, will go through work-computer when I have the chance) to print out tables without having to worry about multiple format calls etc. You can see me calling it by the end of the script for final output of the blobs found in the Storage Account.