The function signature for `pandas.read_csv` gives, among others, the following options:
read_csv(filepath_or_buffer, low_memory=True, memory_map=False, iterator=False, chunksize=None, ...)
I couldn't find any documentation for either the `low_memory` or the `memory_map` flag. I am confused about whether these features are implemented yet and, if so, how they work.
Specifically,
- `memory_map`: If implemented, does it use `np.memmap`, and if so, does it store the individual columns as memmap, or the rows?
- `low_memory`: Does it specify something like `cache` to store in memory?
- Can we convert an existing `DataFrame` to a memmapped `DataFrame`?
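
To make the last question concrete, here is a rough sketch of what I mean by memmapped storage. It uses plain `np.memmap` on a frame's values and is not a pandas feature as far as I know; the sample data and the file name `frame.dat` are made up for illustration, and the data is copied back out of the mapping, so the resulting `DataFrame` is an ordinary in-memory one.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical sample frame with a single float64 dtype.
df = pd.DataFrame({"x": [1.0, 2.0, 3.0], "y": [4.0, 5.0, 6.0]})

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "frame.dat")  # hypothetical file name

    # Copy the frame's values into a disk-backed array.
    mm = np.memmap(path, dtype="float64", mode="w+", shape=df.shape)
    mm[:] = df.values
    mm.flush()

    # Reopen read-only and copy the data out into a regular DataFrame.
    ro = np.memmap(path, dtype="float64", mode="r", shape=df.shape)
    restored = pd.DataFrame(np.array(ro), columns=df.columns)
    roundtrip_ok = restored.equals(df)

    # Release the mappings before the temporary directory is removed.
    del mm, ro

print(roundtrip_ok)
```

Whether pandas could keep such a mapping alive behind a `DataFrame` instead of copying is exactly what I am unsure about.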
P.S. : versions of relevant modules
pandas==0.14.0
scipy==0.14.0
numpy==1.8.1
`low_memory` should prob be documented (though it is an older option that doesn't really do much). `memory_map` is not documented because it's not implemented (nor does it do anything). So the answer to your questions are all no. – Jeff Jun 16 '14 at 18:21

`memory_map` is technically defined and tested. Never seen it used. Give it a try and report back. (It doesn't use `np.memmap`, but just holds a limited amount of data in-memory.) But I think this is an older / deprecated option anyhow. – Jeff Jun 16 '14 at 18:26

`help(pd.read_csv)` to get the docstrings. Thanks for the github reference. – goofd Jun 16 '14 at 18:52
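
For anyone who wants to give it a try as suggested, a minimal sketch might look like the following. It only checks that the flags from the signature are accepted and that the parsed result is the same; it does not show what they do internally. The sample data and the file name `sample.csv` are made up for illustration, and the behavior may differ between pandas versions.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, written to a temporary file so that
# memory_map has a real file on disk to work with.
csv_text = "a,b\n1,2\n3,4\n5,6\n7,8\n"

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample.csv")  # hypothetical file name
    with open(path, "w") as fh:
        fh.write(csv_text)

    # Plain read with default options.
    baseline = pd.read_csv(path)

    # Same read with the two flags from the question set explicitly.
    with_flags = pd.read_csv(path, low_memory=False, memory_map=True)

    # Chunked read: chunksize makes read_csv return an iterator of frames.
    chunks = list(pd.read_csv(path, chunksize=2))
    from_chunks = pd.concat(chunks, ignore_index=True)

print(baseline.equals(with_flags))
print(baseline.equals(from_chunks))
print(len(chunks))
```

If memory use is the real concern, `chunksize` is the option whose effect is easiest to observe, since only one chunk needs to be held at a time when the iterator is consumed lazily rather than collected into a list as done here.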