Python Programming/Files
Files, specifically file handles, are an important example of resources, and thus should generally be managed using the with statement; see the context managers section.
In rare cases, namely when a file is used in more than a single block of code, it is necessary to do manual resource management using File.close(), but this is error-prone and requires great care to be exception safe. In interactive use, explicit open() and File.close() calls are evaluated immediately, line by line, whereas a with statement is evaluated only once the whole block has been entered.
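The difference between the two styles can be sketched as follows. This is a minimal illustration, not a recommended pattern for everyday code; the scratch directory and the file name example.txt are assumptions made only so the sketch is self-contained.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch_dir:
    # Hypothetical file, created here so the sketch has something to read.
    path = os.path.join(scratch_dir, 'example.txt')
    with open(path, 'w') as out:
        out.write('hello')

    # Manual management: close() sits in a finally clause so that it
    # still runs if read() raises an exception.
    f = open(path)
    try:
        manual_text = f.read()
    finally:
        f.close()

    # Equivalent with statement: the file is closed automatically
    # when execution leaves the block, whether normally or by exception.
    with open(path) as g:
        with_text = g.read()
```

Both variables end up holding the same text; the with form simply makes the cleanup impossible to forget.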
File I/O
Read entire file:
    with open('testit.txt') as f:
        inputFileText = f.read()
    print(inputFileText)
Notes:
- The with statement ensures that the file is closed when execution exits the with clause.
- Files are opened in read-only text mode by default, so no mode argument is necessary. Read-only mode can be specified explicitly with the 'r' argument, text mode with 't', or both combined with 'rt'.
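As a quick check of the notes above, the following sketch opens the same file with the default mode, with 'r', and with 'rt', and then inspects the closed attribute after the with block. The scratch directory and the file name are assumptions used only to keep the example self-contained.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch_dir:
    path = os.path.join(scratch_dir, 'testit.txt')  # hypothetical file
    with open(path, 'w') as out:
        out.write('some text')

    with open(path) as f:          # default: read-only text mode
        default_text = f.read()
    with open(path, 'r') as f:     # read-only, stated explicitly
        r_text = f.read()
    with open(path, 'rt') as f:    # read-only and text, stated explicitly
        rt_text = f.read()

    # The with statement has already closed the last file object.
    closed_after_with = f.closed
```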
Read a certain number of characters from a file (in binary mode, the argument counts bytes instead):
    with open('testit.txt') as f:
        inputFileText = f.read(123)
    print(inputFileText)
When a file is opened, reading starts at the beginning of the file. For more random access, seek() changes the current position in the file and tell() reports the current position. This is illustrated in the following example, using manual open() and File.close() to get immediate evaluation:
    >>> f = open('/proc/cpuinfo')
    >>> f.tell()
    0
    >>> f.read(10)
    'processor\t'
    >>> f.read(10)
    ': 0\nvendor'
    >>> f.tell()
    20
    >>> f.seek(10)
    10
    >>> f.tell()
    10
    >>> f.read(10)
    ': 0\nvendor'
    >>> f.close()
    >>> f.closed
    True
Here a file is opened and ten characters are read twice; tell() shows that the current offset is at position 20. Then seek() is used to go back to position 10 (the same position where the second read started), and ten characters are read and printed again. When no more operations on the file are needed, the close() method is used to close the file we opened.
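The same sequence of reads can be reproduced on any system with a small scratch file. The sketch below opens the file in binary mode, where offsets are exact byte counts; the file name and its contents are assumptions chosen for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch_dir:
    path = os.path.join(scratch_dir, 'offsets.bin')  # hypothetical file
    with open(path, 'wb') as out:
        out.write(b'0123456789abcdefghijklmnopqrst')

    with open(path, 'rb') as f:
        start = f.tell()            # position right after opening
        first = f.read(10)          # bytes 0..9
        second = f.read(10)         # bytes 10..19
        after_two_reads = f.tell()  # position after reading 20 bytes
        f.seek(10)                  # jump back to where the second read began
        again = f.read(10)          # the same ten bytes as `second`
```

In text mode the values returned by tell() should be treated as opaque positions to hand back to seek(); binary mode is the safer choice when byte arithmetic matters.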
Read one line at a time:
    with open('testit.txt') as f:
        for line in f:
            print(line)
Iterating over the file object yields one line at a time. Alternatively, readlines() returns a list containing the individual lines of the file as list entries, and a single line can be read using the readline() method, which returns the current line as a string. This example outputs an additional blank line between the individual lines of the file, because each line read from the file keeps its trailing newline and print adds another.
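The following sketch shows readline() and readlines() side by side, and two common ways of avoiding the doubled newline: stripping it from each line, or telling print not to add its own. The file name and contents are assumptions made for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch_dir:
    path = os.path.join(scratch_dir, 'lines.txt')  # hypothetical file
    with open(path, 'w') as out:
        out.write('first\nsecond\nthird\n')

    with open(path) as f:
        first_line = f.readline()   # one line, trailing newline included
        remaining = f.readlines()   # list of the lines not yet read

    with open(path) as f:
        # Strip the newline that each line carries from the file.
        stripped = [line.rstrip('\n') for line in f]

    with open(path) as f:
        for line in f:
            # end='' stops print from adding a second newline.
            print(line, end='')
```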
Writing to a file requires the second argument of open() to be 'w'. If the file already exists, opening it this way overwrites its existing contents:
    output_file_text = "Here's some text to save in a file"
    with open('testit.txt', 'w') as f:
        f.write(output_file_text)
Appending to a file requires the second argument of open() to be 'a' (from append):
    output_file_text = "Here's some text to add to the existing file."
    with open('testit.txt', 'a') as f:
        f.write(output_file_text)
Note that this does not add a line break between the existing file content and the string to be added.
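A short sketch of this behaviour: the first append runs straight on from the existing text, while the second one supplies its own line break. The file name and the strings are assumptions used only for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch_dir:
    path = os.path.join(scratch_dir, 'append.txt')  # hypothetical file
    with open(path, 'w') as f:
        f.write('first line')

    # Appending without a newline glues the new text to the old text.
    with open(path, 'a') as f:
        f.write('second line')
    with open(path) as f:
        joined = f.read()

    # Writing the newline explicitly starts a fresh line.
    with open(path, 'a') as f:
        f.write('\nthird line')
    with open(path) as f:
        lines = f.read().splitlines()
```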
Testing Files
Determine whether a path exists:
    import os
    os.path.exists('<path string>')
On systems such as Microsoft Windows, the backslash used as directory separator is also the escape character in Python string literals. To get around this, double each backslash:
    import os
    os.path.exists('C:\\windows\\example\\path')
A better way, however, is to use a raw string, marked with the r prefix:
    import os
    os.path.exists(r'C:\windows\example\path')
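That the two spellings denote the same string can be checked directly. The sketch below only compares string literals and does not touch the file system; the variable names are illustrative.

```python
# The same path written with doubled backslashes and as a raw string.
escaped = 'C:\\windows\\example\\path'
raw = r'C:\windows\example\path'

same = (escaped == raw)              # both literals produce identical strings
backslash_count = raw.count('\\')    # each separator is a single backslash
```

One limitation worth knowing: a raw string literal cannot end in a single backslash, so a path such as the root of a drive still has to be written with a doubled backslash.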
There are other convenient functions in os.path. Where os.path.exists() only confirms whether or not a path exists, other functions report whether the path is a file, a directory, a mount point or a symlink. There is even a function, os.path.realpath(), which reveals the true destination of a symlink:
    >>> import os
    >>> os.path.isfile('/')
    False
    >>> os.path.isfile('/proc/cpuinfo')
    True
    >>> os.path.isdir('/')
    True
    >>> os.path.isdir('/proc/cpuinfo')
    False
    >>> os.path.ismount("/")
    True
    >>> os.path.islink('/')
    False
    >>> os.path.islink('/vmlinuz')
    True
    >>> os.path.realpath('/vmlinuz')
    '/boot/vmlinuz-2.6.24-21-generic'
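Since the paths in the session above exist only on some Linux systems, here is a portable sketch of the same tests run against a scratch directory. The directory, the file name data.txt and the name missing.txt are assumptions; symlinks and mount points are left out because creating them depends on the platform and on permissions.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch_dir:
    file_path = os.path.join(scratch_dir, 'data.txt')      # hypothetical file
    missing_path = os.path.join(scratch_dir, 'missing.txt')  # never created
    with open(file_path, 'w') as f:
        f.write('x')

    checks = {
        'dir_exists': os.path.exists(scratch_dir),
        'dir_isdir': os.path.isdir(scratch_dir),
        'dir_isfile': os.path.isfile(scratch_dir),
        'file_isfile': os.path.isfile(file_path),
        'file_isdir': os.path.isdir(file_path),
        'file_islink': os.path.islink(file_path),
        'missing_exists': os.path.exists(missing_path),
    }
```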
Common File Operations
To copy or move a file, use the shutil library:
    import shutil
    shutil.move('originallocation.txt', 'newlocation.txt')
    shutil.copy('original.txt', 'copy.txt')
To perform a recursive copy, use copytree(); to perform a recursive remove, use rmtree():
    import shutil
    shutil.copytree('dir1', 'dir2')
    shutil.rmtree('dir1')
To remove an individual file, use the remove() function in the os module:
    import os
    os.remove('file.txt')
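The operations above can be tried out safely inside a scratch directory. The sketch below is one possible walk-through; all file and directory names are assumptions made for the example. Note that copytree() expects the destination directory not to exist yet.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as base:
    original = os.path.join(base, 'original.txt')
    copy_path = os.path.join(base, 'copy.txt')
    moved_path = os.path.join(base, 'moved.txt')
    with open(original, 'w') as f:
        f.write('payload')

    shutil.copy(original, copy_path)     # original.txt and copy.txt both exist
    shutil.move(copy_path, moved_path)   # copy.txt is now moved.txt
    os.remove(original)                  # delete a single file
    files_left = sorted(os.listdir(base))

    # Recursive copy and recursive remove of a small directory tree.
    dir1 = os.path.join(base, 'dir1')
    dir2 = os.path.join(base, 'dir2')
    os.mkdir(dir1)
    with open(os.path.join(dir1, 'inner.txt'), 'w') as f:
        f.write('inner')
    shutil.copytree(dir1, dir2)          # dir2 must not exist beforehand
    shutil.rmtree(dir1)                  # removes dir1 and everything in it
    dir1_gone = not os.path.exists(dir1)
    copied_inner = os.path.isfile(os.path.join(dir2, 'inner.txt'))
```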
Finding Files
Files can be found using glob.glob:
    import glob

    glob.glob('*.txt')        # Finds files in the current directory ending in '.txt'
    glob.glob(r'*\*.txt')     # Finds files in any of the direct subdirectories
                              # of the current directory ending in '.txt'
    glob.glob(r'C:\Windows\*.exe')
    for file_name in glob.glob(r'C:\Windows\*.exe'):
        print(file_name)
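A platform-independent variant of the patterns above can be built with os.path.join, which inserts the right separator for the current system. The directory layout created here is an assumption used only to give glob something to match.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    # Hypothetical layout: two text files and a log file at the top level,
    # plus one text file in a direct subdirectory.
    os.mkdir(os.path.join(base, 'sub'))
    for rel in ('a.txt', 'b.txt', 'c.log', os.path.join('sub', 'd.txt')):
        with open(os.path.join(base, rel), 'w') as f:
            f.write('x')

    # '*.txt' directly inside base.
    top_level = sorted(
        os.path.basename(p) for p in glob.glob(os.path.join(base, '*.txt')))

    # '*.txt' inside any direct subdirectory of base.
    one_level_down = sorted(
        os.path.basename(p)
        for p in glob.glob(os.path.join(base, '*', '*.txt')))
```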
The content of a directory can be listed using os.listdir:
    import os

    for item in os.listdir('.'):
        if os.path.isfile(item) and item.endswith('.txt'):
            print('Text file: ', item)
        if os.path.isdir(item):
            print('Directory: ', item)
Getting a list of all items in a directory, including the nested ones:
    import os

    for root, directories, files in os.walk('/user/Joe Hoe'):
        print('Root: ', root)
        for directory in directories:
            print('Directory: ', directory)
        for file in files:
            print('File: ', file)
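The names that os.walk yields are relative to the root of each step, so they usually need to be joined with it to get a usable path. The sketch below collects every file in a small tree this way; the directory layout is an assumption created just for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    # Hypothetical tree: base/top.txt, base/a/mid.txt, base/a/b/deep.txt
    os.makedirs(os.path.join(base, 'a', 'b'))
    for rel in ('top.txt',
                os.path.join('a', 'mid.txt'),
                os.path.join('a', 'b', 'deep.txt')):
        with open(os.path.join(base, rel), 'w') as f:
            f.write('x')

    found = []
    for root, directories, files in os.walk(base):
        for name in files:
            # Join root and name, then express the path relative to base.
            full_path = os.path.join(root, name)
            found.append(os.path.relpath(full_path, base))
    found.sort()
```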
Current Directory
Getting the current working directory:
    os.getcwd()
Changing the current working directory:
    os.chdir('C:\\')
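Because the working directory is global to the whole process, it is often wise to remember the old value and restore it afterwards. The following is a minimal sketch of that pattern, using a temporary directory as an assumed target.

```python
import os
import tempfile

original_dir = os.getcwd()

with tempfile.TemporaryDirectory() as target:
    os.chdir(target)
    try:
        inside = os.getcwd()
        # samefile() compares the underlying directories, so it is not
        # confused by symlinks in the temporary path.
        moved_into_target = os.path.samefile(inside, target)
    finally:
        # Always switch back, even if the code above raised an exception.
        os.chdir(original_dir)

restored = (os.getcwd() == original_dir)
```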
External Links
- os — Miscellaneous operating system interfaces in Python documentation
- glob — Unix style pathname pattern expansion in Python documentation
- shutil — High-level file operations in Python documentation
- Brief Tour of the Standard Library in The Python Tutorial