
I have a tab-delimited file that I need to be in an array (so I can look for values in it). I had thought that I could iterate through the lines and just push the exploded values into an array, like so:

error_reporting(E_ALL);
ini_set("memory_limit","20M");

$holder = array();
$csv = array();
$lines = file('Path\\To\\File\\2014_04_05_08_48.txt');

for ($a = 0; $a < count($lines); $a++) {
    $csv = explode("\t", $lines[$a]);
    array_splice($csv, 10); // keep only the first 10 fields
    array_push($holder, $csv);
}

print_r($holder);

This file is about 11000 lines, 170 fields/line (output of current inventory from a POS system). This file is less than 6MB. I kept getting errors about memory limits being exceeded. Looking here on SO I found that increasing the memory limit was usually a way of covering a memory leak/poor memory usage.

I finally got it to work by setting the memory limit to 20M and stripping each array after the 10th value.

My question is why do I need such a high memory limit to get the file contents into an array? More than 3x the size seems like a lot. Am I doing something wrong?

Point #1 - Learn about fgetcsv() for loading CSV files – Mark Baker Apr 5 '14 at 15:13

I tried that first, same issue of running out of memory. Does it make a difference in the memory the array uses? – Josiah Apr 5 '14 at 15:16

It doesn't make a difference in memory usage, but it's easier to code. Your use of $lines makes a major difference, though, as you're loading the entire content of the file into memory, which uses 6MB before you even start your loop. While $holder is the big memory hog, that initial 6MB is never released, so it's still a sizeable chunk of your 20MB limit. – Mark Baker Apr 5 '14 at 15:20

If you coded using fgetcsv() instead, then you're only loading one line at a time from your file, not all 11000 lines; so while it won't reduce the size of $holder, it does eliminate most of that 6MB grabbed by $lines. – Mark Baker Apr 5 '14 at 15:23

Note that 20MB isn't that large a limit for a PHP script; the standard default memory config in php.ini is 32MB. – Mark Baker Apr 5 '14 at 15:25
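A minimal sketch of the fgetcsv() approach suggested in the comments. Only one row is held in memory at a time, rather than the whole file; a small hypothetical sample file stands in here for the real 2014_04_05_08_48.txt, and the 10-field cutoff matches the question's array_splice() call.

```php
<?php
// Create a tiny tab-delimited stand-in for the real inventory export.
file_put_contents('sample.txt', "a\tb\tc\td\ne\tf\tg\th\n");

$holder = array();
$handle = fopen('sample.txt', 'r');
if ($handle !== false) {
    // fgetcsv() reads one line per call; length 0 means no line-length limit,
    // and "\t" sets the field delimiter for a tab-separated file.
    while (($row = fgetcsv($handle, 0, "\t")) !== false) {
        $holder[] = array_slice($row, 0, 10); // keep only the first 10 fields
    }
    fclose($handle);
}
unlink('sample.txt');

print_r($holder);
```

Unlike file(), this never allocates an 11000-element $lines array, so the ~6MB the comments describe is avoided; $holder itself still costs the same.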

1 Answer

Accepted answer (6 votes)

Each element in the array is a variable with potentially 48 bytes of overhead, and each element in $holder is itself an array, which has a further 76 bytes of overhead.

So your 170 fields per line have an overhead of 170 x 48 = 8,160 bytes; and you have 11000 lines, so that's 89,760,000 bytes total if you loaded every field into memory rather than keeping only the first 10 fields of each row. Then each line is an array, which has 76 bytes of overhead (11000 x 76 = 836,000 bytes), plus a further 76 bytes for the $holder array itself... and that doesn't even include the actual size of the data in your fields.

EDIT

You're only building $holder from 10 fields in your CSV line; but that still equates to 10 x 11000 x 48 = 5,280,000 bytes of overhead for the data elements (plus the data in each element), plus 11000 x 76 = 836,000 bytes for the line arrays, plus the $holder array itself at 76 bytes, giving a total of 6,116,076 bytes (about 6MB)... and with the 6MB from loading the entirety of $lines into memory, that leaves only 8MB of your 20MB.

Add the fact that PHP itself takes memory, plus the code of your script, plus the other variables in your script, and you're rapidly approaching your 20MB limit.
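A hypothetical sketch of how to observe this overhead directly with memory_get_usage(). The exact figures depend heavily on PHP version and 32- vs 64-bit builds (the constants in this answer are for 32-bit PHP), but the shape of the result is the same: thousands of small sub-arrays cost far more than the raw data they hold.

```php
<?php
// Measure memory before and after building a structure shaped like the
// question's $holder: 11000 rows of 10 short string fields each.
$before = memory_get_usage();

$holder = array();
for ($i = 0; $i < 11000; $i++) {
    // 10 one-character fields per row, mirroring the spliced CSV lines.
    // The raw data is ~110KB, yet the structure uses megabytes.
    $holder[] = array_fill(0, 10, 'x');
}

$used = memory_get_usage() - $before;
echo "11000 rows x 10 fields used about $used bytes\n";
```

Running this on different PHP versions shows how much of the footprint is per-element and per-array overhead rather than data.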

    
Figures based on 32-bit PHP.... see nikic.github.io/2011/12/12/… for details of the math, and for equivalents for 64-bit PHP –  Mark Baker Apr 5 '14 at 15:26
