I want to develop an application that generates user profiles by analyzing web server access logs. The profiles will be based on attributes such as IP address and user-agent string.
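To make the first step concrete, here is a minimal sketch of how I imagine loading the log into a DataFrame. It assumes the server writes the Apache/Nginx "combined" log format and that the authenticated user name appears in the third field; both are assumptions on my part, and `LOG_PATTERN` and `parse_log_lines` are just illustrative names.

```python
import re
import pandas as pd

# Assumed layout (combined log format):
# ip ident user [time] "request" status size "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

COLUMNS = ["ip", "user", "time", "request", "status", "size", "referer", "user_agent"]


def parse_log_lines(lines):
    """Turn raw log lines into a DataFrame, silently skipping lines that do not match."""
    rows = [m.groupdict() for m in map(LOG_PATTERN.match, lines) if m]
    df = pd.DataFrame(rows, columns=COLUMNS)
    # Timestamps look like 10/Oct/2023:13:55:36 +0000
    df["time"] = pd.to_datetime(df["time"], format="%d/%b/%Y:%H:%M:%S %z")
    df["status"] = df["status"].astype(int)
    return df


# Made-up sample data, only to show the call.
sample = [
    '203.0.113.7 - alice [10/Oct/2023:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
    'this line is not in the expected format',
]
logs = parse_log_lines(sample)
```

For a real file I would pass an open file handle instead of the `sample` list, since the function only needs an iterable of lines.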
Once a profile has been generated, the next time the user accesses the website the system should send an alert if the access pattern does not match the profile. (This does not need to happen in real time.)
Systems of this type are called "anomaly-based detection" in the literature.
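As a rough illustration of the kind of check I have in mind, the sketch below treats a "profile" as nothing more than the set of (user, IP, user-agent) combinations already seen in historical logs, and flags any later access that falls outside that set. This exact-match rule is a deliberately crude stand-in for a real anomaly model, and the column names, the `find_anomalies` helper, and the sample rows are all hypothetical.

```python
import pandas as pd

PROFILE_KEYS = ["user", "ip", "user_agent"]


def find_anomalies(history, new_logs):
    """Return rows of new_logs whose (user, ip, user_agent) never occurs in history."""
    profile = history[PROFILE_KEYS].drop_duplicates()
    # indicator=True adds a "_merge" column; "left_only" means no profile match.
    merged = new_logs.merge(profile, on=PROFILE_KEYS, how="left", indicator=True)
    return merged[merged["_merge"] == "left_only"].drop(columns="_merge")


# Made-up data for illustration only.
history = pd.DataFrame({
    "user": ["alice", "bob"],
    "ip": ["203.0.113.7", "198.51.100.4"],
    "user_agent": ["Firefox", "Chrome"],
})
new_logs = pd.DataFrame({
    "user": ["alice", "alice", "bob"],
    "ip": ["203.0.113.7", "192.0.2.99", "198.51.100.4"],
    "user_agent": ["Firefox", "curl", "Chrome"],
})

alerts = find_anomalies(history, new_logs)
print(alerts)
```

Note that a user who never appears in `history` would also be flagged under this rule. Since alerts are not needed in real time, a check like this could run as a periodic batch job over each new log file.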
I have been using Pandas DataFrames for the past few months and have some experience working with MySQL databases. However, I'm currently lost and don't have a clear idea of how to handle web log data in this particular problem domain. Do I need to use a MySQL database? Any guidance is appreciated.