Analyzing a directory’s metadata#
The generate_analytics class method returns a dictionary that stores each file type and its number of files and total file size (in MBs).
The report_file is an optional path to the output CSV file. If not specified, then no CSV will be generated the the method will simply return a dictionary containing the analytics.
A filter can be applied to include/exclude file types and directories. See Filters for more information.
from file_processing import Directory
directory = Directory('./tests/resources/directory_test_files/')
directory.generate_analytics(report_file='./report.csv')
{
'size (MB)': {
'.csv': 5.384414,
'.docx': 0.019456,
'.html': 0.168865,
'.msg': 0.0768,
'.pdf': 0.443368,
'.png': 0.004125,
'.pptx': 100.248831,
'.rtf': 0.103257,
'.txt': 0.039357,
'.xlsx': 0.011885,
'.xml': 0.004548,
'.zip': 0.064254
},
'count': {
'.csv': 1,
'.docx': 1,
'.html': 1,
'.msg': 1,
'.pdf': 2,
'.png': 1,
'.pptx': 3,
'.rtf': 1,
'.txt': 1,
'.xlsx': 1,
'.xml': 1,
'.zip': 1
}
}