
We are running ASP.NET Web API on 3 servers behind HAProxy. HAProxy distributes requests randomly among these 3 instances.

These instances connect to MongoDB, Redis and some Windows services.

Normally, w3wp.exe uses about 30% CPU on each API server.

From time to time (a few times an hour), one of the API servers starts using a high amount of CPU. In correlation with this behavior, we see response times increase. The numbers keep rising until HAProxy sees 10000 ms response times and routes requests to the other two servers. All of this happens within 10-20 seconds. After a while, the server goes back to its normal state and starts taking requests again. A few minutes later, another server does exactly the same thing. This keeps going on and on.

We are using New Relic, but since the application is a Web API application, we do not get any useful info from it. We monitor all our servers (Redis, MongoDB and Windows services) for CPU usage, memory usage, network traffic and I/O, but we do not see any significant load during these outages.

How can we detect the cause behind this application behavior?


3 Answers

A good option would be to take a mini-dump using something like Process Explorer and then inspect it with WinDbg or a similar tool to see what the threads are doing. I have a good blog post about how to do it here:

http://www.haneycodes.net/but-it-didnt-happen-in-dev-or-qa/
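If you capture per-thread CPU times at two moments during a spike (for example by running WinDbg's `!runaway` against two dumps), comparing them can point at the threads doing the work. Here is a minimal Python sketch of that comparison. The function name, thread ids and numbers are all hypothetical, and you would have to copy the values out of the debugger output yourself.

```python
def busiest_threads(first, second, top=3):
    """Return (thread_id, cpu_seconds_used) pairs, largest first.

    first and second map a thread id to its cumulative CPU seconds
    at two points in time, e.g. read off two dumps taken during a spike.
    """
    deltas = {}
    for thread_id, later in second.items():
        # A thread missing from the first snapshot started in between.
        earlier = first.get(thread_id, 0.0)
        deltas[thread_id] = later - earlier
    ranked = sorted(deltas.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top]


# Hypothetical numbers, not taken from the question.
snapshot_a = {"0x1a4": 12.0, "0x2b8": 3.5, "0x3c0": 40.25}
snapshot_b = {"0x1a4": 12.5, "0x2b8": 11.5, "0x3c0": 40.25, "0x4d4": 1.0}

for thread_id, used in busiest_threads(snapshot_a, snapshot_b):
    print(f"{thread_id}: {used:.2f}s")
```

The thread at the top of the list is the one whose stack is worth reading first in the dump.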


As DavidH has said, getting a memory dump is a really important step. If you want, I can offer help to read the dump.

Another useful tool is CPU Analyser, which is free: http://samsaffron.com/archive/2009/11/11/Diagnosing+runaway+CPU+in+a+Net+production+application

Another option is to use PerfView.

Yet another option is to use JetBrains dotTrace and attach to the w3wp.exe process.


One thing shared between .NET and Java EE is the garbage collector. So, if your application uses large amounts of memory, the periods of high CPU could be the garbage collector kicking in. I had this problem with .NET 3.5 on IIS 7, running an application that consistently used over a gigabyte per process. The garbage collector basically stops everything while it is recovering memory for your application. You can tweak the garbage collector and even call it from your code when it makes sense. There are a lot of little strategies you can use. Another problem will come up with the GC if you are doing a lot of string manipulation, for example parsing character strings coming through a RESTful web service. This causes a lot of memory fragmentation and can make the GC spend a lot more time and CPU recovering memory.

It's easy to see this happening if this is indeed what is going on. You can use Task Manager to watch the memory usage and CPU of the process. Look at the memory used when the CPU goes up and after it goes down again.
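If you would rather log the numbers than watch them live, the same check can be done afterwards on recorded samples. Below is a rough Python sketch, assuming you already have (seconds, CPU %, memory MB) readings for the w3wp.exe process from some monitoring tool. The 80% threshold, the function names and the sample values are made up for illustration.

```python
def _describe(samples, first, last):
    """Summarise one spike using the samples just before and just after it."""
    before = samples[max(first - 1, 0)][2]
    after = samples[min(last + 1, len(samples) - 1)][2]
    return {
        "start": samples[first][0],
        "end": samples[last][0],
        "memory_before_mb": before,
        "memory_after_mb": after,
        "memory_change_mb": after - before,
    }


def find_cpu_spikes(samples, cpu_threshold=80.0):
    """Group consecutive high-CPU samples and report memory around each group.

    samples is a list of (seconds, cpu_percent, memory_mb) tuples in time order.
    """
    spikes = []
    start = None
    for i, (_, cpu, _) in enumerate(samples):
        if cpu >= cpu_threshold and start is None:
            start = i  # spike begins
        elif cpu < cpu_threshold and start is not None:
            spikes.append(_describe(samples, start, i - 1))
            start = None  # spike ended on the previous sample
    if start is not None:
        # Recording stopped while CPU was still high.
        spikes.append(_describe(samples, start, len(samples) - 1))
    return spikes


# Hypothetical readings, not measurements from the question.
samples = [
    (0, 30, 900), (5, 32, 1000), (10, 95, 1100),
    (15, 97, 1080), (20, 35, 400), (25, 30, 450),
]

for spike in find_cpu_spikes(samples):
    print(spike)
```

A large negative memory change across a spike would fit the garbage collection theory. Flat memory across a spike would suggest looking elsewhere.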

The application uses very little memory since it does not store any session and the only thing taking up memory space is locally created instances. – Serhat Özgel 2 days ago
