Alerting

How do I write a search to compare bot traffic in the past hour to yesterday to possibly detect a rogue bot?

dmcgeearke
Explorer

I have a site that was recently hit by a bot that effectively DDoSed it for a few hours while crawling it. We noticed an increase in connections to the web server, but the numbers were not far enough outside the normal range to raise alarms right away (an increase from around 1500 to 3000 per time period over a few hours, coinciding with the start of business).

However, after some deep dives into the logs, we learned that nearly all of that increase was due to a bot hitting the site (38% of all site traffic that day came from this one bot).

I wanted to write a search that would allow me to categorize all bot traffic into one category and then create a running total for each time period, and then set up an alert to go off if the current period is at or above the previous day's numbers.

1 Solution

dmcgeearke
Explorer

I consulted with the guys on splunk-usergroups, and we decided on calculating the median for the previous day, comparing this hour to that median, and alerting if this hour's bot traffic is higher. With the help of those guys and a little of my own magic, I was able to figure out the queries I wanted.

First query, to populate the lookup table:

index=foo sourcetype="ms:iis:default" sc_status!=403
| eval useragent = if(match(cs_User_Agent_, ".*bot.*"), "BOT", "USER")
| timechart span=1h count by useragent
| streamstats median(BOT) as thembots, median(USER) as themusers
| outputlookup foo_botHistory.csv
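
One thing worth checking about the classification step: match() is case-sensitive, so a user agent that spells it "Bot" lands in USER rather than BOT. Here is a small self-contained sketch using makeresults to compare the original pattern with a case-insensitive one. The three user-agent strings and the field names n, original, and caseless are made up for illustration and are not taken from the logs in question.

```spl
| makeresults count=3
| streamstats count as n
| eval cs_User_Agent_ = case(n==1, "Mozilla/5.0+(compatible;+Googlebot/2.1)", n==2, "Mozilla/5.0+(compatible;+AhrefsBot/7.0)", n==3, "Mozilla/5.0+(Windows+NT+10.0)")
| eval original = if(match(cs_User_Agent_, ".*bot.*"), "BOT", "USER")
| eval caseless = if(match(cs_User_Agent_, "(?i)bot"), "BOT", "USER")
| table cs_User_Agent_ original caseless
```

On the second sample row, original should come out as USER and caseless as BOT, because the string contains "Bot" with a capital B. The leading and trailing .* are not needed either way, since match() already matches anywhere in the string.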

Second query, to compare this hour to the table and alert when bot traffic is more than 1.5 times the stored baseline:

index=foo sourcetype="ms:iis:default" sc_status!=403
| eval useragent = if(match(cs_User_Agent_, ".*bot.*"), "BOT", "USER")
| timechart span=1h count by useragent
| stats max(BOT) as nowBots
| inputlookup append=true foo_botHistory.csv
| stats max(thembots) as normalbots, max(themusers) as normalusers, max(nowBots) as nowBots
| where nowBots > normalbots * 1.5
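
A caveat on the two queries as written: streamstats produces a running median, so each row of foo_botHistory.csv holds the median of the hours up to that row, and max(thembots) then picks the largest of those values rather than the median of the whole period. If the goal is "this hour versus yesterday's median", a single-search variant along these lines might be closer. This is only a sketch under assumptions: the earliest/latest window of 25 complete hours and the helper fields hour_no, last_hour, and baselineBots are illustrative and not part of the original solution.

```spl
index=foo sourcetype="ms:iis:default" sc_status!=403 earliest=-25h@h latest=@h
| eval useragent = if(match(cs_User_Agent_, ".*bot.*"), "BOT", "USER")
| timechart span=1h count by useragent
| streamstats count as hour_no
| eventstats max(hour_no) as last_hour
| eval baselineBots = if(hour_no < last_hour, BOT, null())
| eventstats median(baselineBots) as normalbots
| where hour_no == last_hour AND BOT > normalbots * 1.5
```

Here the most recent complete hour is the last timechart row, and the preceding 24 rows form the baseline, so the current hour does not inflate its own median. The search returns a row only when the threshold is crossed, which means the alert can trigger on "number of results greater than 0" without a lookup file to maintain.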

Any advice on improving this search or its alerts would be much appreciated!

pdakshanadi
Loves-to-Learn Lots

May I know the reason for using cs_User_Agent_? I mean, why is there an underscore after Agent? Thanks!
