Hi Folks,
I want to find out at what time a URL came back up. The URL is already added in website monitoring. For example, if the URL went down at 12 PM and was brought back up at 1 PM, the dashboard panel should indicate that the URL came up at 1 PM. I want to monitor multiple URLs for this scenario. I'd appreciate your expert advice.
This might be possible with the stats command, transaction command, or in a variety of other ways.
What does your data look like? Can you provide a few rows of that raw data as you see it in Splunk?
First of all, thanks. Here are the sample events:
Event
total_time=18.15 content_md5=2922faf0859c07df6e2364140f6eee9b proxy_server="" proxy_type=http timeout=30 content_sha224=3ab22d7e15f71cc057bbe37b3947ce1e6f8c6458d7fd359dc9a61104 url= content_size=8524 title=clust1 proxy_port="" request_time=18.15 response_code=200 timed_out=False
total_time=23.58 content_md5=2922faf0859c07df6e2364140f6eee9b proxy_server="" proxy_type=http timeout=30 content_sha224=3ab22d7e15f71cc057bbe37b3947ce1e6f8c6458d7fd359dc9a61104 url= content_size=8524 title=clust2 proxy_port="" request_time=23.58 response_code=200 timed_out=False
total_time=18.86 content_md5=2922faf0859c07df6e2364140f6eee9b proxy_server="" proxy_type=http timeout=30 content_sha224=3ab22d7e15f71cc057bbe37b3947ce1e6f8c6458d7fd359dc9a61104 url= :6801/ content_size=8524 title=Clust2 proxy_port="" request_time=18.86 response_code=200 timed_out=False
total_time=16.54 content_md5=2922faf0859c07df6e2364140f6eee9b proxy_server="" proxy_type=http timeout=30 content_sha224=3ab22d7e15f71cc057bbe37b3947ce1e6f8c6458d7fd359dc9a61104 url= content_size=8524 title=Clust4 proxy_port="" request_time=16.54 response_code=200 timed_out=False
Well, that doesn't actually have "website" in it as a field.
Still. If you have that data ingested, and the fields that look like they should be extracted (total_time, content_md5, etc.) actually are, then ...
OK, so I'm looking even closer at this. How would you, as a regular person using plain English, describe how you would manually use these 4 events to know if the site/page was up at the time?
Because what I see is that it might be more effective to count events where the field timed_out=True (assuming that's its value when it's not False).
Or where response_code is 400 or higher (assuming these are HTTP status codes, or similar, and that 300-level ones are redirects). And if this is server code, my guess is that the problems will show up as error codes at 500 and above.
Either way... for a count of "not timed out" vs. "timed out"
<your base search here>
| timechart span=1h count by timed_out
Or maybe you ONLY want the ones that timed out. That way you can reserve the "by" clause in the timechart for "by title" to split it based on ... well, title. You wanted server or web page, but I don't see that directly, so this is my proxy for it.
<your base search here> timed_out=True
| timechart span=1h count by title
Or maybe only where the status codes are 400+?
<your base search here> response_code>=400
| timechart span=1h count by response_code
There are quite a few options.
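If what you're really after is the moment each URL came back up, here is a rough sketch of one way to do it with streamstats. It assumes "down" means timed_out=True or a response_code of 400 or higher, which is my guess and not something your sample confirms, and that title is a good enough stand-in for the site. The state, prev_state, and recovered_at field names are just ones I made up for the example.

```spl
<your base search here>
| eval state=if(timed_out=="True" OR response_code>=400, "down", "up")
| sort 0 _time
| streamstats current=f window=1 last(state) as prev_state by title
| where state="up" AND prev_state="down"
| eval recovered_at=strftime(_time, "%Y-%m-%d %H:%M:%S")
| table recovered_at title url response_code
```

The sort puts events oldest first so that streamstats can carry the previous check's state forward for each title. The where clause then keeps only the events where a site goes from "down" to "up", so each remaining row is a recovery time you can show in a dashboard table.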
A lot of the options are pretty simple ones, leading me to suggest that you take Splunk Fundamentals 1. It's a free 6-10 hour on-line course from Splunk that covers a lot of fairly simple use cases like that, and a lot more.
Just search for it in ... oh they've changed things in Splunk education recently. Try here:
https://education.splunk.com/single-subject-courses
and look at the ones that offer "free e-learning". Many of those map to Splunk Fundamentals Part 1.