I'm not sure there is a solution using NumPy alone: loadtxt and genfromtxt raise errors and warnings respectively if the number of columns changes, so you'll probably have to write your own method.
Edit: The following was edited slightly to reflect DSM's comment.
You could use the built-in csv module:
import csv

arr = []
with open('test.txt', 'r') as fh:
    reader = csv.reader(fh)
    for row in reader:
        if row:  # skip blank lines
            arr.extend(row)  # flatten every row into one list
The csv approach has the advantage that it strips newlines, which is not the case if you just read the file using fileobj = open(...) and for line in fileobj.
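To illustrate the difference, here is a minimal sketch comparing the two ways of reading. The two-line sample is hypothetical and only stands in for test.txt; it assumes values containing commas are quoted in the file, which is what the output below suggests.

```python
import csv
import io

# Hypothetical sample standing in for the contents of test.txt.
sample = '1.00,"2,000.50"\n#EANF#,3.00\n'

# Plain iteration over a file-like object keeps the trailing newline.
raw_lines = [line for line in io.StringIO(sample)]

# csv.reader splits each line into fields and drops the line terminator.
rows = [row for row in csv.reader(io.StringIO(sample)) if row]

print(raw_lines)
print(rows)
```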
At this point you should have
>>> arr
['883825.00', '373395.00', '0.00', '20,080.84', '2012500.00', '#EANF#',
 '121449.39', '0.00', '0.00', '0.00', '38,849.10', '0.00', '#EANF#', '0.00',
 '0.00', '0.00', '0.00', '83,167.42', '1640625.00', '#EANF#', '0.00',
 '#EANF#', '#EANF#', '#EANF#', '#EANF#', '#EANF#', '#EANF#', '#EANF#',
 '-1,202,600.00', '-0.00', '#EANF#', '2267168', '0.00', '#EANF#',
 '-173,710.66', '-125.60', '#EANF#', '17,459.68', '#EANF#.']
You then have to convert the strings to floats and replace the #EANF# values with, say, numpy.nan. We also have to take care of the commas in some of the values. The commas are easily handled with
float(str(float_string).replace(',', ''))
For the #EANF# values we can just check whether an item starts with this string (rather than equals it, since the last item in the list has a trailing .). Combining these two conversions into a function convert and wrapping it in a list comprehension, we have:
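import numpy

def convert(v):
    try:
        # Plain numbers such as '883825.00' convert directly.
        return float(v)
    except ValueError:
        if v.startswith('#EANF#'):
            # Missing-value marker (possibly with a trailing '.').
            return numpy.nan
        else:
            # Strip thousands separators, e.g. '20,080.84'.
            return float(str(v).replace(',', ''))

arr = numpy.asarray([convert(a) for a in arr])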
The function convert could be generalised to take a second, optional argument which defines which values should be mapped to numpy.nan.
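One possible shape for that generalisation is sketched below. The parameter name missing and its default are assumptions for illustration, not part of the code above, and math.nan is used so the snippet runs without NumPy (it can be swapped for numpy.nan).

```python
import math

def convert(v, missing=('#EANF#',)):
    """Convert a string to float, mapping missing-value markers to NaN.

    `missing` is a hypothetical parameter: any value starting with one
    of these prefixes is treated as missing.
    """
    # str.startswith accepts a tuple of prefixes.
    if v.startswith(tuple(missing)):
        return math.nan
    # Remove thousands separators before converting.
    return float(v.replace(',', ''))

values = [convert(a) for a in ['20,080.84', '#EANF#.', '-0.00']]
also_na = convert('N/A', missing=('#EANF#', 'N/A'))
```

Checking the prefixes first, rather than inside an except block, keeps the two cases separate and makes it easy to add further markers.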
The final result of this is
>>> arr
[883825.0, 373395.0, 0.0, 20080.84, 2012500.0, nan, 121449.39, 0.0, 0.0, 0.0,
 38849.1, 0.0, nan, 0.0, 0.0, 0.0, 0.0, 83167.42, 1640625.0, nan, 0.0, nan,
 nan, nan, nan, nan, nan, nan, -1202600.0, -0.0, nan, 2267168.0, 0.0, nan,
 -173710.66, -125.6, nan, 17459.68, nan]
Note: this answer assumes that you are happy with a one-dimensional list as the result. If you want a different shape for the result, you should say so in the question.