Dear All,
Kindly suggest how to sort the data in stats command output by event time.
Example:
Requirement: VPN login details per source user for the last one hour.
SPL Query:
index="network" sourcetype=vpn
| eval "Login_Time" = strftime(_time, "%m/%d/%Y %I:%M:%S %p")
| stats values(SourceUser) as User values(Login_Time) as VPN_Login_Time count by _time host
Date | Host | User | VPN Login Time | Count
1/7/2025 0:00 | 10.10.8.45 | Amar Rajesh Zainab | 01/07/2025 06:01:25 AM 01/07/2025 06:30:21 AM 01/07/2025 06:50:49 AM | 3
The challenge in the above example output: Amar logged in at 01/07/2025 06:30:21 AM and Zainab logged in at 01/07/2025 06:01:25 AM, but in the result the users are sorted alphabetically while the login times are sorted in ascending order, so the two multivalue fields no longer line up with each other. And the User field cannot simply be added as a third field in the by clause.
If you want to end up with several multivalue fields that are correlated with each other, you can't use stats values() as the output from a values() aggregation is always in sorted order.
There are a number of options:
1. Use stats list(), which records the item for EVERY event and preserves event order. Of course, if the same user has duplicate logins at the same _time, you will get multiple entries. Note that list() has a maximum list length of 100 items.
2. Make a combination field of the items you want to end up with, use stats values(new_field), and then split them out again, e.g. like this:
...
| eval _tmp=SourceUser."###".Login_Time
| stats values(_tmp) as _tmp count by _time host
| rex field=_tmp max_match=0 "(?<User>.*)###(?<VPN_Login_Time>.*)"
| fields - _tmp
3. Do this to handle potential duplicate logins at the same _time for the same user:
...
| stats values(Login_Time) as VPN_Login_Time count by _time host SourceUser
| stats list(*) as * sum(count) as count by _time host
i.e. include SourceUser in the by clause of the first stats, then collapse with stats list() in the second.
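For completeness, option 1 can be sketched against the original search like this (field names taken from the query above; this is a sketch, not a tested query):
...
| eval Login_Time = strftime(_time, "%m/%d/%Y %I:%M:%S %p")
| stats list(SourceUser) as User list(Login_Time) as VPN_Login_Time count by _time host
Because list() keeps values in event order rather than sorting them, User and VPN_Login_Time stay aligned row for row, at the cost of possible duplicates and the 100-item cap.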
Hope this helps