Hi
I have two sets of data, one is proxy logs (index=netproxy) and the other is an extract of LTE Logs which is logs every time the device joins. I'd like to cross reference the proxy logs with the LTE data so I can extract the IMEI number but the IMEI number could exist in logs outside of the search time window.
The below search works but only if the timeframe is big enough that it includes the device in the proxy logs. Is there a way I can maybe extend the earliest time for 24 hours prior to the search time window? I don't want to do "all time" on the subsearch because the IP Address allocations will change over time and then be matched against the wrong IMEI.
index=netproxymobility sourcetype="zscalernss-web"
| fields transactionsize responsesize requestsize urlcategory serverip ClientIP hostname appname appclass urlclass
type=left ClientIP
[ search index=netlte
| dedup ClientIP
| fields ClientIP IMEI
]
thanks
Hi @KellyP ,
in the search you shared you forgot the join command, but anyway avoid to use join, and possible forget this command because it's very slow and resource consuming: Splunk isn't a relational DB. t's a search engine.
So you can correlate events in a different way usng stats:
(index=netproxymobility sourcetype="zscalernss-web") OR index=netlte
| stats
values(transactionsize) AS transactionsize
values(responsesize) AS responsesize
values(requestsize) AS requestsize
values(urlcategory) AS urlcategory
values(serverip)serverip
values(ClientIP) ASClientIP
values(hostname) AS hostname
values(appname) AS appname
values(appclass) AS appclass
values(urlclass) AS urlclass
values(IMEI) AS IMEI
BY ClientIP
if you want onlythe events in both the indexes, you can add an additional clause:
(index=netproxymobility sourcetype="zscalernss-web") OR index=netlte
| stats
values(transactionsize) AS transactionsize
values(responsesize) AS responsesize
values(requestsize) AS requestsize
values(urlcategory) AS urlcategory
values(serverip)serverip
values(ClientIP) ASClientIP
values(hostname) AS hostname
values(appname) AS appname
values(appclass) AS appclass
values(urlclass) AS urlclass
values(IMEI) AS IMEI
dc(index) AS index_count
BY ClientIP
| where index_count=2
| fields - index_count
Ciao.
Giuseppe
There is a recent question about doing this in dashboard. Using a time selector token would be the cleanest.
If you are not doing it in dashboard, but this is a singular use case, I can think of an ugly map, like this
| makeresults
| addinfo
| eval start_24h_earlier = relative_time(info_min_time, "-24h")
| map start_24h_earlier search="search index=netlte earliest=$start_24h_earlier
| dedup ClientIP
| fields ClientIP IMEI"
| join ClientIP
[index=netproxymobility sourcetype="zscalernss-web"
| fields transactionsize responsesize requestsize urlcategory serverip ClientIP hostname appname appclass urlclass]
Here is a proof of concept:
| makeresults
| addinfo
| eval int_start = relative_time(info_min_time, "-24h"), int_end = relative_time(info_max_time, "-4h")
| map info_min_time info_max_time search="search index=_audit earliest=$int_start$ latest=$int_end$
| stats min(_time) as in_begin max(_time) as in_end by action"
| join action
[search index = _audit
| stats min(_time) as out_begin max(_time) as out_end by action]
| fieldformat in_begin = strftime(in_begin, "%F %T")
| fieldformat in_end = strftime(in_end, "%F %T")
| fieldformat out_begin = strftime(out_begin, "%F %T")
| fieldformat out_end = strftime(out_end, "%F %T")
My output looks like
action | in_begin | in_end | out_begin | out_end |
expired_session_token | 2024-03-24 23:23:20 | 2024-03-25 12:25:33 | 2024-03-25 22:33:51 | 2024-03-25 23:19:51 |
login attempt | 2024-03-24 22:23:12 | 2024-03-25 04:25:11 | 2024-03-25 21:33:26 | 2024-03-25 21:33:26 |
quota | 2024-03-24 22:23:16 | 2024-03-25 04:25:17 | 2024-03-25 21:41:52 | 2024-03-25 23:21:45 |
read_session_token | 2024-03-24 19:21:02 | 2024-03-25 19:18:58 | 2024-03-25 20:17:30 | 2024-03-25 23:21:49 |
search | 2024-03-24 19:37:09 | 2024-03-25 11:29:40 | 2024-03-25 21:15:35 | 2024-03-25 23:21:05 |
update | 2024-03-24 19:51:57 | 2024-03-25 09:15:17 | 2024-03-25 21:13:23 | 2024-03-25 23:09:53 |
validate_token | 2024-03-24 19:21:02 | 2024-03-25 19:18:58 | 2024-03-25 20:17:30 | 2024-03-25 23:21:49 |
Would adding "earliest=<24 hours prior to the search time window>" in the subsearch fix this?
that's what I was wondering but not sure how to reference the earliest values from within the join