How do I write a search to compare bot traffic in the past hour to yesterday's, to detect a possible rogue bot?

I have a site that was recently hit by a bot that ended up basically DDoSing the site for a few hours as it crawled it. We noticed an increase in connections to the web server, but the numbers were not far enough outside the normal range to set off alarm bells right away (they rose from around 1500 to 3000 per time period over a few hours, coinciding with the start of business).

However, after some deep dives into the logs, we learned that nearly all of that increase was due to a single bot hitting the site (38% of all site traffic that day came from this one bot).

I wanted to write a search that would allow me to categorize all bot traffic into one category and then create a running total for each time period, and then set up an alert to go off if the current period is at or above the previous day's numbers.

1 Solution

I consulted with the guys on splunk-usergroups, and we decided to calculate the median hourly bot traffic for the previous day, compare this hour's bot traffic to that median, and alert if this hour's value is higher. With their help and a little of my own magic, I was able to figure out the queries I wanted.

First query, to populate the lookup table:

index=foo sourcetype="ms:iis:default" sc_status!=403
| eval useragent = if(match(cs_User_Agent_, ".*bot.*"), "BOT", "USER")
| timechart span=1h count by useragent
| streamstats median(BOT) as thembots, median(USER) as themusers
| outputlookup foo_botHistory.csv

Second query, to compare this hour to the lookup table and alert when bot traffic exceeds 1.5 times the stored value:

index=foo sourcetype="ms:iis:default" sc_status!=403
| eval useragent = if(match(cs_User_Agent_, ".*bot.*"), "BOT", "USER")
| timechart span=1h count by useragent
| stats max(BOT) as nowBots
| inputlookup append=true foo_botHistory.csv
| stats max(thembots) as normalbots, max(themusers) as normalusers, max(nowBots) as nowBots
| where nowBots > normalbots * 1.5
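
One possible refinement, offered as an untested sketch rather than a definitive fix: streamstats computes a running median, so each hourly row in the lookup carries the median of the rows up to that point, and the later max(thembots) picks the largest of those running values rather than the median of the whole day. Replacing streamstats with stats writes a single row holding the full-day median. The explicit earliest/latest time ranges and the case-insensitive "(?i)bot" pattern are assumptions added for illustration; they are not part of the original searches.

```spl
index=foo sourcetype="ms:iis:default" sc_status!=403 earliest=-1d@d latest=@d
| eval useragent = if(match(cs_User_Agent_, "(?i)bot"), "BOT", "USER")
| timechart span=1h count by useragent
| stats median(BOT) as thembots, median(USER) as themusers
| outputlookup foo_botHistory.csv
```

The alert search can then count bot events for the last complete hour directly and compare that count to the stored median:

```spl
index=foo sourcetype="ms:iis:default" sc_status!=403 earliest=-1h@h latest=@h
| eval useragent = if(match(cs_User_Agent_, "(?i)bot"), "BOT", "USER")
| stats count(eval(useragent="BOT")) as nowBots
| inputlookup append=true foo_botHistory.csv
| stats max(thembots) as normalbots, max(nowBots) as nowBots
| where nowBots > normalbots * 1.5
```

Under these assumptions the baseline search would be scheduled once a day and the alert search once an hour, with the alert triggering whenever the second search returns a result.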

Any advice on improving this search or its alerts would be much appreciated!
