Splunk Search

How to union two time ranges

marshalll3302
Explorer

I'm trying to UNION two different tables containing info on foreign traffic - the first table is a log with time range earliest=-24h latest=-1h. The second are logs of those same systems for the full 24 hours (earliest=-24h latest=now()). My search:

| union [ search index=<index1> src_ip IN (<srcvalues>) AND dest_ip!=<ipvalues> NOT dest_location IN ("<locvalues>") earliest=-24h latest=-1h
| eval dest_loc_ip1=dest_location. "-" .dest_ip
| stats DC(dest_loc_ip1) as oldconnections by src_ip] [ search index=<index1> src_ip IN (<srcvalues>) AND dest_ip!=<ipvalues> NOT dest_location IN ("<locvalues>") earliest=-24h latest=now()
| eval dest_loc_ip2=dest_location. "-" .dest_ip
| stats DC(dest_loc_ip2) as allconnections by src_ip] | fields src_ip oldconnections allconnections

I am trying to compare the values of oldconnections vs allconnections for only the original systems (basically a left join), but for some reason, the allconnections shows all null values. I get a similar issue when trying to left join - the allconnections values are not consistent to the values when I run the search by itself. I can run the two searches separately with the expected result, so I'm guessing there's an error in my UNION syntax and ordering.

Thanks for the help!
-also open to other ways to solve this 🙂

Labels (2)
0 Karma
1 Solution

yuanliu
SplunkTrust
SplunkTrust

The first dc() needs  eval() as the main function.

index=<index1> src_ip IN (<srcvalues>) AND dest_ip!=<ipvalues> NOT dest_location IN ("<locvalues>") earliest=-24h latest=now()
| stats dc(eval(if(_time < relative_time(now(), "-1h"), dest_location. "-" .dest_ip, null())) as oldconnections
 dc(eval(dest_location. "-" .dest_ip)) as allconnections
 by src_ip

 For some reason Splunk doesn't print warning about such, perhaps assuming that all terms are strings unless it begins with eval.

View solution in original post

yuanliu
SplunkTrust
SplunkTrust

Splunk commandment #3: Whenever you have the urge to join, purge that thought and restate the problem in clear terms.

Based on the mock codes, your two subsearches contain the exact same terms except one is one hour shorter than the other; the stats is also exactly the same except field name of the output.  I sense that the problem you are trying to solve is this: count unique dest_location-dest_ip combinations by src_ip in the two time intervals.  Is this correct?

The following is a transliteration of the requirement.

 

index=<index1> src_ip IN (<srcvalues>) AND dest_ip!=<ipvalues> NOT dest_location IN ("<locvalues>") earliest=-24h ```latest=now```
| stats dc(if(_time < relative_time(now(), "-1h"), dest_location . "-" . dest_ip, null())) as oldconnections
 dc(eval(dest_location . "-" . dest_ip)) as allconnections
 by src_ip

 

It only performs one index search covering one time interval.  This is a lot more efficient than union on two largely overlapping subsearches.

Tags (1)

marshalll3302
Explorer

I tried the the following and all values for oldconnection field are coming up as 0, which I'm assuming is due to the if statement returning null for each event. 

index=<index1> src_ip IN (<srcvalues>) AND dest_ip!=<ipvalues> NOT dest_location IN ("<locvalues>") earliest=-24h latest=now()
| stats dc(if(_time < relative_time(now(), "-1h"), eval(dest_location. "-" .dest_ip), null())) as oldconnections
 dc(eval(dest_location. "-" .dest_ip)) as allconnections
 by src_ip

 

0 Karma

yuanliu
SplunkTrust
SplunkTrust

The first dc() needs  eval() as the main function.

index=<index1> src_ip IN (<srcvalues>) AND dest_ip!=<ipvalues> NOT dest_location IN ("<locvalues>") earliest=-24h latest=now()
| stats dc(eval(if(_time < relative_time(now(), "-1h"), dest_location. "-" .dest_ip, null())) as oldconnections
 dc(eval(dest_location. "-" .dest_ip)) as allconnections
 by src_ip

 For some reason Splunk doesn't print warning about such, perhaps assuming that all terms are strings unless it begins with eval.

marshalll3302
Explorer

This did it. Thank you for all your help @Anonymous !!

0 Karma

marshalll3302
Explorer

Thanks and noted. All your assumptions were correct except:

the stats is also exactly the same except field name of the output

My target output fields will look like this:

  • Src_ip of systems of the first 23 hours (-24h to -1h)
  • count distinct number of dest_location-dest_ip combinations (-24h to -1h)
  • count distinct number of dest_location-dest_ip combinations in full last 24hrs (-24h to now())

I see the logic of your code and am trying to tweak it so that it will match my target output.

0 Karma
Get Updates on the Splunk Community!

How to Monitor Google Kubernetes Engine (GKE)

We’ve looked at how to integrate Kubernetes environments with Splunk Observability Cloud, but what about ...

Index This | How can you make 45 using only 4?

October 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with this ...

Splunk Education Goes to Washington | Splunk GovSummit 2024

If you’re in the Washington, D.C. area, this is your opportunity to take your career and Splunk skills to the ...