Monitoring Splunk

Join time from a CSV file, and an index summary

adale25
Engager

How do I join a time field from a CSV lookup file, where it has a different field name, with the _time field in a summary index?


Vijeta
Influencer

You will have to convert both _time and the lookup's date field into Splunk's internal epoch format using strptime. Try the code below:

index=summary *avg_Total_Capacity*  | rename avg_Total_Capacity AS Total_GB
      | rename avg_Used_GB AS Used_GB| eval _time=strptime(_time,"%Y-%m-%d")
      | append

      [| inputlookup ServerName_G | fields - Disk
      | eval _time=strptime(DateLogged,"%m/%d/%Y")
      | rename DiskFreeSpaceMB as Free_GB 
      | eval Free_GB=(Free_GB/1024) 
      | rename DiskSizeMB as Total_GB
      | eval Total_GB=(Free_GB/PercentFree)*100
      | eval Used_GB=(Total_GB-Free_GB)]
      | sort - _time

     | apply "RFRCapacity_v2"  
     | timechart span=1d avg(Predict_UsedGB_Future) as Predict_UsedGB_Future
     | eval residual = 'Used_GB' - 'predicted(Used_GB)'
     | eval baseline=0
     | eval Predict_UsedGB_Future='Predict_UsedGB_Future'
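The key step above is normalizing both time fields to epoch seconds with strptime so the appended rows sort on one axis. A minimal sketch of that idea outside Splunk (Python's `datetime.strptime` mirrors SPL's `strptime`; the sample values are taken from this thread):

```python
from datetime import datetime

def to_epoch(value, fmt):
    """Parse a date string with the given format and return epoch seconds."""
    return datetime.strptime(value, fmt).timestamp()

# Summary-index _time renders as "%Y-%m-%d" (e.g. 2018-11-02).
summary_epoch = to_epoch("2018-11-02", "%Y-%m-%d")

# The lookup's DateLogged uses "%m/%d/%Y" (e.g. 5/9/2016).
lookup_epoch = to_epoch("5/9/2016", "%m/%d/%Y")

# Once both are epoch seconds they compare and sort correctly together.
print(summary_epoch > lookup_epoch)  # True: 2018 comes after 2016
```

The same date expressed in either format converts to the same epoch value, which is what lets `sort - _time` interleave the two sources.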

somesoni2
Revered Legend

You can rename the time field to match the time field name in the index summary within the join subsearch.

That said, joins are expensive and can often be replaced by alternative commands. Could you provide more details on what your current search is and what you're trying to accomplish, possibly with sample data/output?


adale25
Engager

Hey somesoni2,
I'm trying to build a predictive model using RandomForest in Splunk. I'm having issues with the inputlookup subsearch, so I'm trying to change it to use lookup instead and join the DateLogged field from the CSV with _time from the index summary, to see if the model will then work as expected and push the prediction into the future. Currently it's not doing that when I run the apply step on the model. If you have any suggestions for me, I'm glad to hear them! Thanks!

Fit ->

index=summary *avg_Total_Capacity*  | rename avg_Total_Capacity AS Total_GB
 | rename avg_Used_GB AS Used_GB
 | append

 [| inputlookup ServerName_G | fields - Disk
 | convert timeformat="%m/%d/%Y" mktime(DateLogged) as _time 
 | rename DiskFreeSpaceMB as Free_GB 
 | eval Free_GB=(Free_GB/1024) 
 | rename DiskSizeMB as Total_GB
 | eval Total_GB=(Free_GB/PercentFree)*100
 | eval Used_GB=(Total_GB-Free_GB)]

 | sort - _time
 | timechart span=1d avg(Total_GB) AS Total_GB avg(Used_GB) AS Used_GB 
 | eval Total_GB=round(Total_GB,2)
 | eval Used_GB=round(Used_GB,2)

 | reverse
 | streamstats window=60 current=f first(Used_GB) as Predict_UsedGB_Future
 | reverse

 | fit RandomForestRegressor "Predict_UsedGB_Future" from "_time" into "RFRCapacity_v2"
 | eval DesiredCapacity_70PercUsed=(Predict_UsedGB_Future/.7)
 | eval Space_To_Add_GB=(DesiredCapacity_70PercUsed-Total_GB)
 | eval Predicted_PercUsed=(Predict_UsedGB_Future/Total_GB)*100
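The `reverse | streamstats window=60 current=f first(Used_GB) | reverse` sequence in the fit search pairs each day with the Used_GB value observed roughly 60 rows later, which becomes the Predict_UsedGB_Future training label. A rough Python sketch of that shift (the data here is made up for illustration):

```python
def future_target(values, horizon=60):
    """For each index i, return values[i + horizon], or None when fewer
    than `horizon` future rows remain (mirroring the empty window at the
    end of the streamstats trick)."""
    return [values[i + horizon] if i + horizon < len(values) else None
            for i in range(len(values))]

used_gb = list(range(100))          # pretend daily Used_GB readings
labels = future_target(used_gb, 60)

print(labels[0])   # 60: the value from 60 rows ahead
print(labels[40])  # None: fewer than 60 rows of future data remain
```

Rows near the end of the series get no label, which is why the model can only learn from days that have a full horizon of future data behind them.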

Apply ->

 index=summary *avg_Total_Capacity*  | rename avg_Total_Capacity AS Total_GB
     | rename avg_Used_GB AS Used_GB
     | append

     [| inputlookup ServerName_G | fields - Disk
     | convert timeformat="%m/%d/%Y" mktime(DateLogged) as _time 
     | rename DiskFreeSpaceMB as Free_GB 
     | eval Free_GB=(Free_GB/1024) 
     | rename DiskSizeMB as Total_GB
     | eval Total_GB=(Free_GB/PercentFree)*100
     | eval Used_GB=(Total_GB-Free_GB)]
     | sort - _time

    | apply "RFRCapacity_v2"  
    | timechart span=1d avg(Predict_UsedGB_Future) as Predict_UsedGB_Future
    | eval residual = 'Used_GB' - 'predicted(Used_GB)'
    | eval baseline=0
    | eval Predict_UsedGB_Future='Predict_UsedGB_Future'

Vijeta
Influencer

What is the time/date format in your lookup table?


adale25
Engager

Format for the csv file ->

    DateLogged
1   5/9/2016
2   5/9/2016
3   5/10/2016
4   5/10/2016
5   5/11/2016

Format from the summary index ->

    _time
1   2018-11-02
2   2018-07-22
3   2018-07-20
4   2018-07-13
5   2018-07-09
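Given these samples, the two sources need different format strings: "%m/%d/%Y" for the CSV's non-zero-padded DateLogged and "%Y-%m-%d" for the summary index's _time. A quick check using Python's `datetime.strptime` (which Splunk's strptime mirrors), with the sample values above:

```python
from datetime import datetime

# DateLogged samples from the CSV: month/day/year, no zero padding.
csv_dates = ["5/9/2016", "5/10/2016", "5/11/2016"]
parsed_csv = [datetime.strptime(d, "%m/%d/%Y").date() for d in csv_dates]

# _time samples from the summary index: ISO year-month-day.
idx_dates = ["2018-11-02", "2018-07-22", "2018-07-09"]
parsed_idx = [datetime.strptime(d, "%Y-%m-%d").date() for d in idx_dates]

print(parsed_csv[0])  # 2016-05-09
print(parsed_idx[0])  # 2018-11-02
```

Both lists parse to comparable date objects, so once each side is converted with its own format string the append-and-sort in the searches above lines up correctly.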