Splunk Search

Search Query for checking the Uptime of Website

sathish2k8
Explorer

Hi Folks, 

 

I want to find out at what time a URL came back up. The URL is already added in website monitoring. For example, if the URL went down at 12 PM and was brought back up at 1 PM, the dashboard panel should indicate that the URL came up at 1 PM. I want to monitor multiple URLs for this scenario. I'd appreciate your expert advice.


Richfez
SplunkTrust

This might be possible with the stats command, transaction command, or in a variety of other ways.

What does your data look like?  Can you provide a few rows of that raw data as you see it in Splunk?


sathish2k8
Explorer

Firstly, thanks. Here are the sample events:

 

 


 
 
total_time=18.15 content_md5=2922faf0859c07df6e2364140f6eee9b proxy_server="" proxy_type=http timeout=30 content_sha224=3ab22d7e15f71cc057bbe37b3947ce1e6f8c6458d7fd359dc9a61104 url= content_size=8524 title=clust1 proxy_port="" request_time=18.15 response_code=200 timed_out=False
 
 
total_time=23.58 content_md5=2922faf0859c07df6e2364140f6eee9b proxy_server="" proxy_type=http timeout=30 content_sha224=3ab22d7e15f71cc057bbe37b3947ce1e6f8c6458d7fd359dc9a61104 url= content_size=8524 title=clust2 proxy_port="" request_time=23.58 response_code=200 timed_out=False
 
 
total_time=18.86 content_md5=2922faf0859c07df6e2364140f6eee9b proxy_server="" proxy_type=http timeout=30 content_sha224=3ab22d7e15f71cc057bbe37b3947ce1e6f8c6458d7fd359dc9a61104 url= :6801/ content_size=8524 title=Clust2 proxy_port="" request_time=18.86 response_code=200 timed_out=False
 
 
total_time=16.54 content_md5=2922faf0859c07df6e2364140f6eee9b proxy_server="" proxy_type=http timeout=30 content_sha224=3ab22d7e15f71cc057bbe37b3947ce1e6f8c6458d7fd359dc9a61104 url= content_size=8524 title=Clust4 proxy_port="" request_time=16.54 response_code=200 timed_out=False

 


Richfez
SplunkTrust

Well, that doesn't actually have "website" in it as a field.

Still.  If you have that data ingested, and the fields that appear like they should be extracted are (total_time, content_md5, etc...), then ...

OK, so I'm looking even closer at this.  How would you, as a regular person using plain English, describe how you would manually use these 4 events to know if the site/page was up at the time?

Because what I see is that it might be more effective to count events where the field timed_out=True (assuming that's its value when it's not False).

Or where response_code is 400 or higher (assuming these are HTTP status codes, or similar, and that 300-level ones are redirects).  And if this is server code, my guess is that the problems will show up in the error codes at 500 and above.

Either way... for a count of "not timed out" vs. "timed out"

 

<your base search here>
| timechart span=1h count by timed_out

 

Or maybe you ONLY want the ones that timed out. That way you can reserve the "by" clause in the timechart for "by title" to split it based on ... well, title.  You wanted server or web page, but I don't see that directly, so this is my proxy for it.

 

<your base search here> timed_out=True
| timechart span=1h count by title

 

Or maybe only where the status codes are 400 or higher?

<your base search here> response_code>=400
| timechart span=1h count by response_code

There are quite a few options.
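
To get closer to the original question (at what time did each URL come back up), one option is to compare each event with the previous event for the same site using streamstats. This is only a sketch under assumptions, not a tested solution for your data: it assumes timed_out and response_code are extracted as in the sample events, that "up" means timed_out is False and response_code is below 400, and that title is a good enough stand-in for the URL. The field names is_up, prev_up and came_up_at are ones I made up for illustration.

```spl
<your base search here>
| eval is_up=if(timed_out="False" AND response_code<400, 1, 0)
| sort 0 title _time
| streamstats current=f window=1 global=f last(is_up) as prev_up by title
| where is_up=1 AND prev_up=0
| eval came_up_at=strftime(_time, "%Y-%m-%d %H:%M:%S")
| table title came_up_at response_code
```

Each row that survives the where clause is an event where the site was up and the previous check for the same title was down, so came_up_at is the first successful check after an outage. The time you get is only as precise as the polling interval of the website monitoring input.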

A lot of the options are pretty simple ones, leading me to suggest that you take Splunk Fundamentals 1. It's a free 6-10 hour on-line course from Splunk that covers a lot of fairly simple use cases like that, and a lot more. 

Just search for it in ... oh they've changed things in Splunk education recently.  Try here:

https://education.splunk.com/single-subject-courses

and look at the ones that offer "free e-learning".  Many of those map to Splunk Fundamentals Part 1.

 

 
