I would like to present for review my (much) revised batch templating utility, which had its humble beginnings in a previous post. As I mentioned there, this program is my entry into Python programming. I am trying to grow the simple script from my previous post into a more robust utility.
Questions I am hoping to answer with this post:
- Is the overall structure sound?
- Is my use of exceptions correct?
- Is my documentation OK? This was my first introduction to docstrings, and I have worked hard to make them as complete as possible.
- One part that bugs me is the `try` block in the `main()` function.
  - First, this whole block should probably be a separate function? I left it in `main()` since it's the meat of the program.
  - Secondly, there seems to be a lot of code between the `try:` and the `except:`, and I know this should be minimized, but I couldn't come up with a better method.
Program Inputs
The program takes as inputs two required files, a CSV data file and a template file, plus an optional appended file. The output of the program is a set of rendered files, one for each data row in the CSV file. The CSV data file contains a header row which is used as a set of keys mapped to tags inside the template file. For every row in the data file, each data item is substituted for the tag in the template file that matches its key, the appended file is added (with some added tags for *.js files), and the rendered file is written to disk. Pretty straightforward, I think. The main docstring illustrates a quick example.
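To make that flow concrete, here is a minimal sketch of the per-row rendering step on its own. It is not the program below: the in-memory CSV text, the template text, and the `render_rows` name are made up for illustration, and nothing is written to disk.

```python
import csv
import io
import string

# Hypothetical in-memory stand-ins for the CSV data file and the template file.
CSV_TEXT = "stockID,color\n340,Blue\n275,brown\n"
TEMPLATE_TEXT = "<h1>Stock ID: ${stockID}</h1><p>${color}</p>"


def render_rows(csv_text, template_text):
    """Return one rendered string per data row in csv_text."""
    template = string.Template(template_text)
    reader = csv.DictReader(io.StringIO(csv_text))
    # Each row is a dict keyed by the header row, so it maps directly
    # onto the ${...} tags in the template.
    return [template.substitute(row) for row in reader]


rendered = render_rows(CSV_TEXT, TEMPLATE_TEXT)
for text in rendered:
    print(text)
```

The real program does the same thing per row, then appends the optional file contents and writes each result to its own output file.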
Template Syntax
The program uses Python's string.Template() string substitution method, which utilizes the `$` replacement syntax, with the added requirement that the `{` and `}` curly braces (optional as far as the method is concerned) are mandatory. So, for a particular `Key` from the data file header row, the template tag would be `${Key}`.
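As a small sketch of that syntax in isolation (the sample template text and row are invented, and the `\w+` pattern is a simplification I am assuming here, not the pattern used in the program):

```python
import re
import string

# Hypothetical template text and data row, for illustration only.
template_text = "<li>${color}</li><li>${material}</li>"
row = {'color': 'Blue', 'material': '100% Cotton'}

# Collect only the braced tags; a bare $color would not be picked up here.
tags = set(re.findall(r'\$\{(\w+)\}', template_text))

# substitute() replaces every ${Key} with the matching value from the row.
rendered = string.Template(template_text).substitute(row)
print(sorted(tags))
print(rendered)

# substitute() raises KeyError when a tag has no matching key.
try:
    string.Template("${missing}").substitute(row)
except KeyError as err:
    missing_key = err.args[0]
    print('No data for tag:', missing_key)
```

That `KeyError` is the reason the program compares the tag set against the key set before it starts rendering.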
Wall of Code
I think the docstrings explain pretty well what all is going on...
"""
A simple batch templating utility for Python.
Initially conceived as a patch to quickly generate small HTML files from
catalog data. The program takes as inputs two (2) required files, a CSV
data file and a template file (see below) and the option to append a
third file. Output of the program is a set of rendered files, one for
each data row in the CSV data file.
USAGE:
Current rendition of program uses a simple guided prompt interface
to walk user through process.
**SPECIAL WARNING**
This program copies template file and appended file to strings which
means they will both be loaded fully into memory. Common sense
should be exercised when dealing with extremely large files.
CSV DATA FILE:
Data File shall contain a header row. Header row contains the keys
that will be used to render the output files. Keys shall not
contain spaces. There shall be a corresponding tag in the template
file for each key in the CSV Data File.
File can contain any (reasonable) number of data rows and columns.
Each item in a row is swapped out with the tag in the template file
which corresponds to appropriate key from the header row. There
will be one output file generated for each row in the data file.
TEMPLATE FILE:
The template file is basically a copy of the desired output file
with tags placed wherever a particular piece of data from the CSV
Data File should be placed in the output.
Syntax:
The program uses Python's string.Template() string substitution
method which utilizes the `$` replacement syntax. The program
further restricts the syntax requiring the use of the optional `{`
and `}` curly braces surrounding tags. So, for a particular 'Key'
from the data file header row, the template tag would be ${Key}.
APPENDED FILE:
The appended file is strictly copied _verbatim_ to the end of the
rendered output file. There is really no restriction on the
appended file other than special warning above.
Special Feature:
If the appended file is a Javascript file (detected using the *.js
file extension), the program will add appropriate opening and
closing HTML tags.
QUICK EXAMPLE:
Assume CSV Data File: <some_file.csv>
stockID,color,material,url
340,Blue,80% Wool / 20% Acrylic,http://placehold.it/400
275,brown,100% Cotton,http://placehold.it/600
Assume Template File: <another_file.html>
<h1>Stock ID: ${stockID}</h1>
<ul>
<li>${color}</li>
<li>${material}</li>
</ul>
<img src='${url}'>
Assume ...Appended File? --> No
Output file 1 = 'Listing_340.html'
<h1>Stock ID: 340</h1>
<ul>
<li>Blue</li>
<li>80% Wool / 20% Acrylic</li>
</ul>
<img src='http://placehold.it/400'>
Output file 2 = 'Listing_275.html'
<h1>Stock ID: 275</h1>
<ul>
<li>brown</li>
<li>100% Cotton</li>
</ul>
<img src='http://placehold.it/600'>
Author: Chris E. Pearson (christoper.e.pearson.1 at gmail dot com)
Copyright (c) Chris E. Pearson, 2015
License: TBD
"""
import os
import re
import csv
import string

def main():
    """
    A simple batch templating utility for Python.
    See main docstring for details.
    """
    # Collect input file names and contents for text files.
    fname_data = prompt_filename('Data File')
    fname_template = prompt_filename('Template File')
    fcontents_template = get_contents(fname_template)
    fname_appended, fcontents_appended = get_appended()
    # Validate the inputs
    # Tags look like ${Key}; collect the names between the braces.
    tag_set = set(re.findall(r'\$\{(\S+)\}', fcontents_template))
    primary_key, key_set = get_keys(fname_data)
    validate_inputs(tag_set, key_set)
    validated_template = string.Template(fcontents_template)
    # Generate the output
    try:
        # This seems like a lot to put in a try statement...?
        with open(fname_data) as f:
            reader = csv.DictReader(f)
            f_count = 0
            for row in reader:
                # Create output filename
                output_filename = ('Listing_{}.html'.format(row[primary_key]))
                f_count += 1
                print('File #{}: {}'.format(f_count, output_filename))
                # Prep string
                output_main = validated_template.substitute(row)
                write_string = '{}{}'.format(output_main, fcontents_appended)
                # Write File
                with open(output_filename, 'w') as f_out:
                    f_out.write(write_string)
    except OSError:
        print('No such file {!r}. Check file name and path and try again.'
              .format(fname_data))
        raise
    else:
        print('{} of {} files created'.format(str(f_count),
                                              str(reader.line_num-1)))

def prompt_filename(fclass):
    """
    Prompt user for a filename for given file classification.
    Args:
        fclass (string):
            A descriptive string describing the type of file for which the
            filename is requested. _e.g._ 'Template File'
    Returns:
        filename (string)
    """
    while True:
        filename = input('Enter {0} --> '.format(fclass))
        if os.path.isfile(filename):
            return filename
        else:
            print('No such file: {!r}.'.format(filename))
            print('Please enter a valid file name')
            continue

def get_contents(fname):
    """
    Return contents of file `fname` as a string if file exists.
    Args:
        fname (string):
            Name of the file to be opened and returned as a string.
    Returns:
        text_file (string):
            The entire contents of `fname` read in as a string.
    Exceptions:
        OSError: informs user that fname is invalid.
    """
    try:
        with open(fname) as f:
            text_file = f.read()
    except OSError:
        print('No such file {!r}. Check file name and path and try again.'
              .format(fname))
        raise
    else:
        return text_file

def get_appended():
    """
    Ask user if appended file and prompt filename if so.
    Returns:
        fname_appended (string)
            Filename for appended file.
        fcontents_appended (string)
            The entire contents of `fname_appended` as a string.
    Exceptions:
        OSError: Raised by function prompt_filename informs user that
            fname is invalid.
    See Also:
        Function: prompt_filename
        Function: get_contents
    """
    prompt_for_appended = input('Is there an appended file? --> ')
    if prompt_for_appended.lower().startswith('y'):
        fname_appended = prompt_filename('Appended File')
        fcontents_appended = get_contents(fname_appended)
        if fname_appended.lower().endswith('.js'):
            open_tag = '<script type="text/javascript">'
            close_tag = '</script>'
            fcontents_appended = '\n{0}\n{1}\n{2}'.format(open_tag,
                                                          fcontents_appended,
                                                          close_tag)
    else:
        fname_appended = None
        fcontents_appended = ''
    return fname_appended, fcontents_appended

def get_keys(fname):
    """
    Get key set as header row of given CSV file and get primary key.
    Given a CSV data file `fname`, return the header row from file
    as a set of "keys". Also return the primary key for the data file.
    The primary key is simply the header for the first column.
    Args:
        fname (string):
            Name of the CSV file for which the keys are needed.
    Returns:
        primary_key (string)
            Header value of first column in given CSV file.
        key_set (set of strings)
            A set comprised of all header row values for given CSV file.
    Exceptions:
        OSError: informs user that fname is invalid.
    """
    try:
        with open(fname) as f:
            key_list = f.readline().strip().split(',')
    except OSError:
        print('No such file {!r}. Check file name and path and try again.'
              .format(fname))
        raise
    else:
        primary_key = key_list[0]
        key_set = set(key_list)
        return primary_key, key_set

def validate_spaces(item_set):
    """
    Read through a set of strings and checks for spaces.
    The function takes a set of strings and searches through each string
    looking for spaces. If a space is found, string is appended to a
    list. Once all strings are searched, if any spaces found, print
    error with generated list and terminate program.
    Args:
        item_set (set of strings)
    Returns:
        None
    Exceptions:
        A `KeyingError` is raised if any spaces are detected in the data
        file key set.
    """
    bad_items = []
    for item in item_set:
        if ' ' in item:
            bad_items.append(item)
    if bad_items:
        try:
            raise KeyingError('Keys cannot contain spaces.')
        except KeyingError as e:
            print(e)
            print('Please correct these keys:\n', bad_items)
            # quit()
            raise

def validate_inputs(tag_set, key_set):
    """
    Validate template tag_set against data file key_set.
    Validates the key_set from a given data file against the tag_set
    from the corresponding template file, first checking the key set for
    lack of spaces and then checking if the two sets are equivalent. If
    either condition is not met, an exception will be raised and the
    program will terminate.
    Args:
        tag_set (set of strings)
        key_set (set of strings)
    Returns:
        None
    Exceptions:
        A `KeyingError` is raised by function `validate_spaces` if any
        spaces are detected in the data file key set.
        A `MisMatchError` is raised if the two input sets are not
        equivalent.
    See also:
        Function: validate_spaces
    """
    try:
        validate_spaces(key_set)
    except KeyingError as e:
        print('Goodbye')
        quit()
    if key_set != tag_set:
        try:
            raise MisMatchError('Tags and keys do not match')
        except MisMatchError as e:
            print(e)
            if tag_set - key_set == set():
                print('missing tags for key(s):', key_set - tag_set)
                print('(or tag(s) contains spaces)')
            else:
                print('Check template file tags for key(s):',
                      key_set - tag_set)
                print('Template shows:', tag_set - key_set)
            print('Goodbye')
            quit()

class KeyingError(Exception):
    def __init__(self, arg):
        self.arg = arg


class MisMatchError(Exception):
    def __init__(self, arg):
        self.arg = arg


if __name__ == '__main__':
    main()