Splunk Search

How can I create an output in such a way that columns will be "Month", "Host", "Count"?

splunkuser1
Loves-to-Learn

This question was asked in the interview. Index is splunk's _internal, fields are host and date_month. I want to create an output in such a way that columns will be "Month", "Host", "Count". In month column each row will represent each day of month, in host column name of the 1st host will appear for all days of that month and count in column count. After the last day of month, in month column I want to display month name, in host column name of the 2nd host for all the days of month and count. How to write SPL query for this?

Labels (2)
0 Karma

woodcock
Esteemed Legend

This is a stupid question on many levels.  You should NEVER use the "date_*" fields for anything because they are pre-TZ-normalized values and do not take into account your user's personal TZ value and so almost always will be incorrect.  You should instead calculate your own "date_*" values, if you even need them (which in this case, you do not).  The question is phrased EXTREMELY poorly and is very confusing to understand.  I spent 5 times as much time reading and re-reading the question as I did coming up with the answer.  I suspect that part of the answer is that you should know (and explain in your answer) that by default, only the last 30 days are stored in the internal index data so this question is even dumber because you will never have a complete month in your data.  In any case, if I understand it correctly, here is one solution:

|tstats count AS Count
WHERE index="_internal" AND date_month="*" earliest=0 latest=now
BY host _time span=1d
| rename _time AS Month, host AS Host
| eval _mon=strftime(Month, "%Y%B")
| streamstats current=f last(_mon) AS prevmon BY Host
| eval neweventhere = if(_mon!=prevmon, mvappend("FIXME","KEEPME"), "NO")
| fields - prevmon
| mvexpand neweventhere
| rename neweventhere AS _neweventhere
| eval Month = if(_neweventhere="FIXME", (Month - 1), Month)
| eval _mon=strftime(Month, "%Y%B")
| eventstats sum(Count) AS _TotalCount BY _mon Host
| eval Count = if(_neweventhere="FIXME", _TotalCount, Count)
| fieldformat Month = if(_neweventhere="FIXME", strftime(Month, "%B"), strftime(Month, "%B %d"))
| table Month Host Count _*

yeahnah
Motivator

Hi @splunkuser1 

The interview question is poorly worded and a little hard to understand.  If I break it down there looks to be  5 key requirements

  1. Index is Splunk's _internal, fields are host and date_month
  2. I want to create an output in such a way that columns will be "Month", "Host", "Count"
  3. In month column each row will represent each day of month
  4. In host column name of the 1st host, will appear for all days of that month and count in column count
  5. After the last day of month, in month column I want to display month name, in host column name of the 2nd host for all the days of month and count

Requirement 4 and 5 are hardest to make sense of but as an exercise in writing SPL, I guess it has merit as it would test your knowledge.

Anyway, based on the above, this is how I've interpreted the question and what I believe would provide a method to produce a result set with the answer...

 

index=_internal date_month=*
| bin span=1d _time
| stats 
    count
  BY _time date_month host
| eval day=strftime(_time, "%d")
| streamstats
    count AS order
    first(host) AS first_host
  BY date_month
| appendpipe [ where order=2 | eval second_host=host ]
| eventstats
    max(order) as last_month_day
    values(*_host) AS *_host
  BY date_month
| eval host=if(order=last_month_day, second_host, first_host)
      ,day=if(order=last_month_day, date_month, day)
| sort _time order
| rename day AS Month host AS Host count AS Count
| table Month Host Count

 

 Hint: if testing, then I suggest using a high event sampling ratio to make the query fast and efficient.

yeahnah_0-1681967572329.png

Would be interested in how others in the Splunk Community would use SPL to answer this problem too

Hope it helps

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

What have you tried so far?

0 Karma

splunkuser1
Loves-to-Learn

index=_internal, |stats count by date_month, host| rename date_month AS Month

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

If you want counts by day, you could try by date_day

Can you provide a mock-up of what you want the result to look like?

0 Karma

splunkuser1
Loves-to-Learn
MonthHostCount
1xyz3
2xyz7
3xyz7
Aprilxyz17
4abc4
5abc6
6abc8
7abc3
Aprilabc21
0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Try something like this

index=_internal
| bin _time span=1d
| stats count by _time, host
| eval month=strftime(_time,"%b")
| eventstats sum(count) as total max(_time) as lastday by host month
| eval row=if(_time==lastday,mvrange(0,2),0)
| mvexpand row
| eval Day=if(row==1,Month,strftime(_time,"%d"))
| eval count=if(row==1,total,count)
| table Day, host, count
0 Karma

splunkuser1
Loves-to-Learn

Your query gave me almost the correct answer, but as I mentioned in the above table, the month name is not displayed in the answer

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Is it just that you are getting the abbreviated month and not the full month - if so, try this change

| eval month=strftime(_time,"%B")

If not, please provide more information about what is and is not working

0 Karma
Get Updates on the Splunk Community!

Troubleshooting the OpenTelemetry Collector

  In this tech talk, you’ll learn how to troubleshoot the OpenTelemetry collector - from checking the ...

Adoption of Infrastructure Monitoring at Splunk

  Splunk's Growth Engineering team showcases one of their first Splunk product adoption-Splunk Infrastructure ...

Modern way of developing distributed application using OTel

Recently, I had the opportunity to work on a complex microservice using Spring boot and Quarkus to develop a ...