
The function signature for pandas.read_csv includes, among others, the following options:

read_csv(filepath_or_buffer, low_memory=True, memory_map=False, iterator=False, chunksize=None, ...)

I couldn't find any documentation for either the low_memory or the memory_map flag. I am not sure whether these features are implemented yet and, if so, how they work.

Specifically,

  1. memory_map: If implemented, does it use np.memmap, and if so, does it store the individual columns or the rows as memmaps?
  2. low_memory: Does it specify something like a cache to keep in memory?
  3. Can an existing DataFrame be converted to a memory-mapped DataFrame?
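For the third point, a minimal sketch of one possible workaround, not a built-in pandas feature: copy the values of an all-numeric DataFrame into an np.memmap file and wrap the reopened array in a new DataFrame. The sample frame, the file name, and the single float64 dtype are assumptions made for illustration, and pandas may still copy the data into ordinary memory when the frame is constructed.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical all-float64 frame; mixed dtypes would need one memmap per column.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})

# Write the values into a disk-backed array (the file name is arbitrary).
path = os.path.join(tempfile.mkdtemp(), "frame.dat")
on_disk = np.memmap(path, dtype="float64", mode="w+", shape=df.shape)
on_disk[:] = df.values
on_disk.flush()

# Reopen read-only and wrap it; pandas may or may not copy at this point.
reopened = np.memmap(path, dtype="float64", mode="r", shape=df.shape)
restored = pd.DataFrame(reopened, columns=df.columns, index=df.index)
```

The shape and dtype are not stored in the file, so they have to be kept somewhere else in order to reopen it later.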

P.S.: Versions of the relevant modules:

pandas==0.14.0
scipy==0.14.0
numpy==1.8.1
    
low_memory should probably be documented (though it is an older option that doesn't really do much). memory_map is not documented because it's not implemented (nor does it do anything). So the answer to all of your questions is no. –  Jeff Jun 16 '14 at 18:21
github.com/pydata/pandas/issues/5888 –  Jeff Jun 16 '14 at 18:21
    
FYI, these are not in the public doc-strings either, so not sure where you are looking. –  Jeff Jun 16 '14 at 18:22
    
I will revise slightly: memory_map is technically defined and tested, but I have never seen it used. Give it a try and report back. (It doesn't use np.memmap; it just holds a limited amount of data in memory.) I think this is an older / deprecated option anyhow. –  Jeff Jun 16 '14 at 18:26
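A minimal sketch of what "give it a try" could look like: read the same file with the default settings, with memory_map=True, and with low_memory=False, then compare the results. The sample CSV is hypothetical and written only to keep the example self-contained. The comparison checks that the parsed data is identical; it does not measure memory use.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample file, written only so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "w") as fh:
    fh.write("a,b\n1,x\n2,y\n3,z\n")

default = pd.read_csv(path)
mapped = pd.read_csv(path, memory_map=True)       # flag from the question
whole_file = pd.read_csv(path, low_memory=False)  # flag from the question

print(default.equals(mapped), default.equals(whole_file))
```

On a file this small no difference in memory or speed would be visible; a realistic test would need a large file and an external memory profiler.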
    
Thanks @Jeff! I did a help(pd.read_csv) to get the docstrings. Thanks for the github reference. –  goofd Jun 16 '14 at 18:52
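As an alternative to reading the help(pd.read_csv) output by eye, a small sketch that checks programmatically whether the two flags appear in the installed version's signature. It assumes a Python version that provides inspect.signature; the result depends on the pandas version installed.

```python
import inspect

import pandas as pd

# Parameters accepted by read_csv in whichever pandas version is installed.
params = inspect.signature(pd.read_csv).parameters

for name in ("low_memory", "memory_map"):
    if name in params:
        print(name, "default:", params[name].default)
    else:
        print(name, "is not in this version's signature")
```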
