I have an Apache web server on which I have set up a website using Flask and mod_wsgi. I am having a couple of issues which may or may not be related.
With every call to a certain page (which runs a function performing heavy computation that takes over 2 seconds), memory use increases by about 20 MB. The server starts out with about 350 MB consumed by everything on the machine, out of a total of 3,620 MB shown in htop. After I reload this page many times, the total memory used eventually tops out around 2,400 MB and mostly stops increasing. Once it reaches that level, I haven't been able to push it into swap even after hundreds of page reloads. Is this by design in Flask, Apache, or Python? If some kind of caching mechanism were at work, I wouldn't expect memory to keep accumulating when the same URL is requested every time. If I restart Apache, the memory is released.
Sometimes calls to this page result in the called functions erroring out, even though they are all read-only calls (nothing is written to disk) and the query string is identical on every request.
I have another page (calling another function that does much less computation) which, when requested concurrently with other pages on the web server, randomly errors out or returns an unexpected result (an image).
Could issues 2 and 3 be related to issue 1? Could issues 2 and 3 be caused by bad programming on my part, or by bad memory in the machine? I can reproduce the randomness by loading the same URL in about 40 Firefox tabs and then choosing the "reload all tabs" option, or programmatically as shown below.
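A throwaway script along these lines (the URL is a placeholder for the real page) should exercise the same concurrent access as the 40 browser tabs:

import threading
import urllib2

URL = 'http://localhost/mypage?param=1'  # placeholder for the real page

def Fetch():
    # each thread requests the same URL, like one reloading browser tab
    try:
        print len(urllib2.urlopen(URL).read())
    except Exception as e:
        print 'error:', e

ThreadList = [threading.Thread(target=Fetch) for i in range(40)]
for t in ThreadList:
    t.start()
for t in ThreadList:
    t.join()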
What more information should be provided to get a better answer?
I have tried placing
import gc
gc.collect()
into my code.
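For context, here is a minimal sketch of how such a call can be wired in (collect_garbage is just an illustrative name, and app is assumed to be the Flask application object):

import gc
from flask import Flask

app = Flask(__name__)

@app.teardown_request
def collect_garbage(exception):
    # force a garbage collection pass after every request; this only
    # helps if the leaked objects are actually unreachable by then
    gc.collect()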
I do have
WSGIDaemonProcess website user=www-data group=www-data processes=2 threads=2 home=/var/www/website
WSGIScriptAlias / /var/www/website/website.wsgi
<Directory /var/www/website>
WSGIProcessGroup website
WSGIScriptReloading On
WSGIApplicationGroup %{GLOBAL}
Order deny,allow
Allow from all
</Directory>
in my /etc/apache2/sites-available/default file. It doesn't seem like the memory should grow that much if only a total of 4 threads are being created, should it?
UPDATE
If I set processes=1 threads=4, the seemingly random issues occur every time two requests arrive at once. If I set processes=4 threads=1, the seemingly random issues don't happen. The rise in memory still occurs, though, and it will now climb all the way to the system's maximum RAM and start swapping.
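For what it's worth, mod_wsgi's maximum-requests option should recycle each daemon process after a fixed number of requests, which would at least bound the growth even if it doesn't explain it (the value 1000 here is arbitrary):

WSGIDaemonProcess website user=www-data group=www-data processes=4 threads=1 maximum-requests=1000 home=/var/www/website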
UPDATE
Although I haven't gotten this runaway RAM consumption issue resolved, I didn't have problems for several months with my previous application. Apparently it wasn't very popular, and after several days or so Apache may have been clearing out the RAM automatically or something.
Now I've made another application, which is fairly unrelated to the previous one. The previous application generated roughly 1-megapixel images using matplotlib. The new application generates both 20-megapixel and 1-megapixel images using matplotlib, and the problem is monumentally larger when the 20-megapixel images are generated. After the entire swap space fills up, something seems to get killed, and things run at a decent speed for a while as long as some RAM and swap space remain available, but everything is much slower once the RAM is consumed. Here are the processes running; I don't think there are any extra zombie processes.
$ ps -ef|grep apache
root 3753 1 0 03:45 ? 00:00:02 /usr/sbin/apache2 -k start
www-data 3756 3753 0 03:45 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 3759 3753 0 03:45 ? 00:02:06 /usr/sbin/apache2 -k start
www-data 3762 3753 0 03:45 ? 00:00:01 /usr/sbin/apache2 -k start
www-data 3763 3753 0 03:45 ? 00:00:01 /usr/sbin/apache2 -k start
test 4644 4591 0 12:27 pts/1 00:00:00 tail -f /var/log/apache2/access.log
www-data 4894 3753 0 21:34 ? 00:00:37 /usr/sbin/apache2 -k start
www-data 4917 3753 2 22:33 ? 00:00:36 /usr/sbin/apache2 -k start
www-data 4980 3753 1 22:46 ? 00:00:12 /usr/sbin/apache2 -k start
I am a little confused, though, when I look at htop, because it shows a lot more processes than top or ps do.
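If I understand correctly, htop lists each thread on its own line by default, so the extra entries are probably the daemon threads rather than additional processes; something like this should confirm it by listing threads explicitly:

$ ps -eLf | grep apache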
UPDATE
I have figured out that the memory leak is due to matplotlib (or the way I am using it), not Flask or Apache, so problems 2 and 3 from my original post are indeed a separate issue from problem 1. Below is a basic function I wrote to eliminate/reproduce the problem interactively in IPython.
def BigComputation():
    import cStringIO
    import matplotlib
    matplotlib.use('Agg')
    import matplotlib.pyplot as plt
    # A larger figure size causes more RAM to be used when savefig is run.
    # This function also uses some RAM that is never released automatically
    # if plt.close('all') is never run, but it is a small amount,
    # so it is hard to tell unless BigComputation is run thousands of times.
    TheFigure = plt.figure(figsize=(250, 8))
    file_output = cStringIO.StringIO()
    # Causes lots of RAM to be used, and it is never released automatically.
    TheFigure.savefig(file_output)
    # Releases all the RAM that is never released automatically.
    plt.close('all')
    return None
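To watch the leak, I call the function in a loop and print the process's peak resident set size via the standard resource module (ru_maxrss is reported in kilobytes on Linux); with plt.close('all') commented out, the number climbs steadily:

import resource

def PeakRSS():
    # peak resident set size of this process (kilobytes on Linux)
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

for i in range(100):
    BigComputation()
    print i, PeakRSS()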
The trick to getting rid of the RAM leak is to run
plt.close('all')
within BigComputation(); otherwise, BigComputation() just keeps accumulating RAM every time it is called. I don't know whether I am using matplotlib inappropriately or just have bad coding technique, but I would really expect that once BigComputation() returns, all its memory would be released except for global objects and whatever it returns. It seems like pyplot must be holding references to the figures in some global state, because I have no idea what those objects are named.
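One workaround that should sidestep the question entirely is matplotlib's object-oriented API, which creates the figure without registering it in pyplot's global figure registry, so there is nothing left to close (BigComputationOO is just my name for this hypothetical variant; I haven't benchmarked it):

def BigComputationOO():
    import cStringIO
    from matplotlib.figure import Figure
    from matplotlib.backends.backend_agg import FigureCanvasAgg
    # The figure is created directly, not through plt.figure(), so
    # pyplot's global figure registry never holds a reference to it.
    TheFigure = Figure(figsize=(250, 8))
    FigureCanvasAgg(TheFigure)  # attaches an Agg canvas to the figure
    file_output = cStringIO.StringIO()
    TheFigure.savefig(file_output)
    return None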
I guess where my question stands now is: why do I need plt.close('all')? I also need to try Graham Dumpleton's suggestions to further diagnose my Apache configuration and see why I need to set threads=1 to make the random errors go away.