Splunk Search

Different search time in subsearch/join

KellyP
Splunk Employee
Splunk Employee

Hi

I have two sets of data, one is proxy logs (index=netproxy) and the other is an extract of LTE Logs which is logs every time the device joins. I'd like to cross reference the proxy logs with the LTE data so I can extract the IMEI number but the IMEI number could exist in logs outside of the search time window.

The below search works but only if the timeframe is big enough that it includes the device in the proxy logs. Is there a way I can maybe extend the earliest time for 24 hours prior to the search time window? I don't want to do "all time" on the subsearch because the IP Address allocations will change over time and then be matched against the wrong IMEI.

index=netproxymobility sourcetype="zscalernss-web"
| fields transactionsize responsesize requestsize urlcategory serverip ClientIP hostname appname appclass urlclass
type=left ClientIP 
         [ search index=netlte
         | dedup ClientIP
         | fields ClientIP IMEI 
         ]



thanks

Labels (1)
0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @KellyP ,

in the search you shared you forgot the join command, but anyway avoid to use join, and possible forget this command because it's very slow and resource consuming: Splunk isn't a relational DB. t's a search engine.

So you can correlate events in a different way usng stats:

(index=netproxymobility sourcetype="zscalernss-web") OR index=netlte
| stats 
   values(transactionsize) AS transactionsize
   values(responsesize) AS responsesize
   values(requestsize) AS requestsize
   values(urlcategory) AS urlcategory
   values(serverip)serverip
   values(ClientIP) ASClientIP
   values(hostname) AS hostname
   values(appname) AS appname
   values(appclass) AS appclass
   values(urlclass) AS urlclass
   values(IMEI) AS IMEI
   BY ClientIP

if you want onlythe events in both the indexes, you can add an additional clause:

(index=netproxymobility sourcetype="zscalernss-web") OR index=netlte
| stats 
   values(transactionsize) AS transactionsize
   values(responsesize) AS responsesize
   values(requestsize) AS requestsize
   values(urlcategory) AS urlcategory
   values(serverip)serverip
   values(ClientIP) ASClientIP
   values(hostname) AS hostname
   values(appname) AS appname
   values(appclass) AS appclass
   values(urlclass) AS urlclass
   values(IMEI) AS IMEI
   dc(index) AS index_count
   BY ClientIP
| where index_count=2
| fields - index_count

Ciao.

Giuseppe

0 Karma

yuanliu
SplunkTrust
SplunkTrust

There is a recent question about doing this in dashboard.  Using a time selector token would be the cleanest.

If you are not doing it in dashboard, but this is a singular use case, I can think of an ugly map, like this

 

| makeresults
| addinfo
| eval start_24h_earlier = relative_time(info_min_time, "-24h")
| map start_24h_earlier search="search index=netlte earliest=$start_24h_earlier
         | dedup ClientIP
         | fields ClientIP IMEI"
| join ClientIP
    [index=netproxymobility sourcetype="zscalernss-web"
    | fields transactionsize responsesize requestsize urlcategory serverip ClientIP hostname appname appclass urlclass]

 

Here is a proof of concept:

 

| makeresults
| addinfo
| eval int_start = relative_time(info_min_time, "-24h"), int_end = relative_time(info_max_time, "-4h")
| map info_min_time info_max_time search="search index=_audit earliest=$int_start$ latest=$int_end$
  | stats min(_time) as in_begin max(_time) as in_end by action"
| join action
    [search index = _audit
    | stats min(_time) as out_begin max(_time) as out_end by action]
| fieldformat in_begin = strftime(in_begin, "%F %T")
| fieldformat in_end = strftime(in_end, "%F %T")
| fieldformat out_begin = strftime(out_begin, "%F %T")
| fieldformat out_end = strftime(out_end, "%F %T")

 

My output looks like

actionin_beginin_endout_beginout_end
expired_session_token2024-03-24 23:23:202024-03-25 12:25:332024-03-25 22:33:512024-03-25 23:19:51
login attempt2024-03-24 22:23:122024-03-25 04:25:112024-03-25 21:33:262024-03-25 21:33:26
quota2024-03-24 22:23:162024-03-25 04:25:172024-03-25 21:41:522024-03-25 23:21:45
read_session_token2024-03-24 19:21:022024-03-25 19:18:582024-03-25 20:17:302024-03-25 23:21:49
search2024-03-24 19:37:092024-03-25 11:29:402024-03-25 21:15:352024-03-25 23:21:05
update2024-03-24 19:51:572024-03-25 09:15:172024-03-25 21:13:232024-03-25 23:09:53
validate_token2024-03-24 19:21:022024-03-25 19:18:582024-03-25 20:17:302024-03-25 23:21:49
Tags (1)
0 Karma

KendallW
Path Finder

Would adding "earliest=<24 hours prior to the search time window>" in the subsearch fix this?

0 Karma

KellyP
Splunk Employee
Splunk Employee

that's what I was wondering but not sure how to reference the earliest values from within the join

0 Karma
Get Updates on the Splunk Community!

Index This | Forward, I’m heavy; backward, I’m not. What am I?

April 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

A Guide To Cloud Migration Success

As enterprises’ rapid expansion to the cloud continues, IT leaders are continuously looking for ways to focus ...

Join Us for Splunk University and Get Your Bootcamp Game On!

If you know, you know! Splunk University is the vibe this summer so register today for bootcamps galore ...