I wrote this script a few days ago to find all folders taking up more than a certain threshold of space on the hard drive. A hard drive on one of my computers had just 6 GB free, so I needed to clean some space up. I used the script to identify bulky folders and find out where the problem was.
Is there anything I could have done better? I'm particularly displeased with the exception handling in the get_folder_size function, since I ended up handling the same exception twice, just at different levels. I couldn't think of a way around it offhand, though. Efficiency was also an issue with this script: get_folder_size recurses through the whole subtree for every folder that os.walk visits, so deeply nested folders get scanned many times. I know I could have built some framework to keep track of the folders already scanned, but that seemed like too monumental an effort for such a trivial script. The recursion was much easier to implement.
Any and all suggestions are welcome!
import os

BYTES_IN_GIGABYTE = 1073741824
GIGABYTE_DISPLAY_LIMIT = 2
ACCESS_ERROR_CODE = 13


def get_folder_size(filepath):
    size = os.path.getsize(filepath)
    try:
        for item in os.listdir(filepath):
            item_path = os.path.join(filepath, item)
            try:
                if os.path.isfile(item_path):
                    size += os.path.getsize(item_path)
                elif os.path.isdir(item_path):
                    size += get_folder_size(item_path)
            except OSError as err:
                if err.errno == ACCESS_ERROR_CODE:
                    print('Unable to access ' + item_path)
                    continue
                else:
                    raise
    except OSError as err:
        if err.errno != ACCESS_ERROR_CODE:
            raise
        else:
            print('Unable to access ' + filepath)
    return size


def get_all_folder_sizes(root_filepath):
    folders = []
    for item in os.walk(root_filepath):
        if os.path.isdir(os.path.join(root_filepath, item[0])):
            folders.append([item[0], get_folder_size(item[0])])
    return folders


def convert_bytes_to_gigabytes(bytes):
    return bytes / BYTES_IN_GIGABYTE


def main():
    folder_sizes = get_all_folder_sizes('C:\\')
    folder_sizes.sort()
    for folder in folder_sizes:
        gigabytes = convert_bytes_to_gigabytes(folder[1])
        if gigabytes > GIGABYTE_DISPLAY_LIMIT:
            print(folder[0] + ' = ' + format(gigabytes, '.2f') + ' GB')


if __name__ == '__main__':
    main()
BYTES_IN_GIGABYTE = 2**30. – Josay May 16 '14 at 15:05
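For what it's worth, the "keep track of folders already scanned" idea from the question may need less machinery than feared. Below is a minimal sketch, not the original author's method: it walks the tree once with os.walk(topdown=False), so each folder's subfolders are totalled before the folder itself. The names folder_sizes_single_pass and note_error are made up for illustration. Two behaviours differ from the posted script and are assumptions on my part: any OSError is recorded and skipped (not only errno 13), and the directory entry's own size is not counted.

```python
import os


def folder_sizes_single_pass(root):
    """Return (totals, skipped): bytes per folder, and paths that failed.

    Hypothetical helper: each folder is listed exactly once.
    """
    skipped = []  # paths that could not be read

    def note_error(err):
        # os.walk hands over the OSError raised while listing a directory.
        skipped.append(err.filename)

    totals = {}
    # topdown=False yields children before their parents, so every
    # subfolder total is already in `totals` when the parent is handled.
    for dirpath, dirnames, filenames in os.walk(
            root, topdown=False, onerror=note_error):
        size = 0
        for name in filenames:
            file_path = os.path.join(dirpath, name)
            try:
                size += os.path.getsize(file_path)
            except OSError:
                # Assumption: skip any unreadable file, not just errno 13.
                skipped.append(file_path)
        for name in dirnames:
            # Unreadable or symlinked subfolders were never walked,
            # so they contribute 0 here.
            size += totals.get(os.path.join(dirpath, name), 0)
        totals[dirpath] = size
    return totals, skipped
```

Filtering then becomes a single pass over the dictionary, for example keeping entries where the size exceeds GIGABYTE_DISPLAY_LIMIT * 2**30. Error handling also ends up in one place per kind of failure (directory listing versus file stat) instead of being repeated at two nesting levels.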