- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I've got some logs I need to join and put on the same row.
I've tried a few different ways and searched the community but I can't seem to get exactly what I need.
There's a log every 10 minutes for each host and each drive on said hosts (there are a lot of hosts and drives). Each log has 2 events for the same time and drive letter. One for free MB and one for percent. Basically I need to join together each set of these two separate events based on the time, host and drive letter of the log. Is this possible?
base query:
index=perfmon host=host1 Category="PERFORMANCE" collection="WIN_PERF" object="LogicalDisk" counter="% Free Space" OR counter="Free Megabytes"
Drive letter is extracted as "instance"
percent and MB are both extracted as "Value"
Returns these logs:
"09/02/2021 21:48:49","host1","PERFORMANCE","WIN_PERF","LogicalDisk","Free Megabytes","d:","36092.00"
"09/02/2021 21:48:49","host1","PERFORMANCE","WIN_PERF","LogicalDisk","% Free Space","d:","41.47"
"09/02/2021 22:08:49","host1","PERFORMANCE","WIN_PERF","LogicalDisk","% Free Space","C:","19.30"
"09/02/2021 22:08:49","host1","PERFORMANCE","WIN_PERF","LogicalDisk","Free Megabytes","C:","19767.00"
Desired output:
Time Host Drive FreePercent FreeGB
09/02/2021 21:48:49 host1 d: 41.47 36092.0
09/02/2021 22:08:49 host1 C: 19.30 19767.00
Any help would be appreciated.
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Hi
here is one option to do it. I just use your "Returns these logs:" here as input, but you should replace that part with your original query with correct field names.
| makeresults
| eval _raw = "_time, host, type, stype, etype, item, Drive, value
09/02/2021 21:48:49,host1,PERFORMANCE,WIN_PERF,LogicalDisk,Free Megabytes,d:,36092.00
09/02/2021 21:48:49,host1,PERFORMANCE,WIN_PERF,LogicalDisk,% Free Space,d:,41.47
09/02/2021 22:08:49,host1,PERFORMANCE,WIN_PERF,LogicalDisk,% Free Space,C:,19.30
09/02/2021 22:08:49,host1,PERFORMANCE,WIN_PERF,LogicalDisk,Free Megabytes,C:,19767.00"
| multikv forceheader=1
``` Above generates sample data ```
| eval FM = if (item = "Free Megabytes", value,null()), FS = if (item="% Free Space", value,null())
| stats values(FM) as FreeGB values(FS) as FreePercent by _time, host Drive
| table _time host Drive FreePercent FreeGB
r. Ismo
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Thanks for all the help guys.
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Hi
here is one option to do it. I just use your "Returns these logs:" here as input, but you should replace that part with your original query with correct field names.
| makeresults
| eval _raw = "_time, host, type, stype, etype, item, Drive, value
09/02/2021 21:48:49,host1,PERFORMANCE,WIN_PERF,LogicalDisk,Free Megabytes,d:,36092.00
09/02/2021 21:48:49,host1,PERFORMANCE,WIN_PERF,LogicalDisk,% Free Space,d:,41.47
09/02/2021 22:08:49,host1,PERFORMANCE,WIN_PERF,LogicalDisk,% Free Space,C:,19.30
09/02/2021 22:08:49,host1,PERFORMANCE,WIN_PERF,LogicalDisk,Free Megabytes,C:,19767.00"
| multikv forceheader=1
``` Above generates sample data ```
| eval FM = if (item = "Free Megabytes", value,null()), FS = if (item="% Free Space", value,null())
| stats values(FM) as FreeGB values(FS) as FreePercent by _time, host Drive
| table _time host Drive FreePercent FreeGB
r. Ismo
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Yep, values() is another way to do it. Thought about it myself.
I'm wondering however which way - this or selfjoin is more effective in terms of CPU/memory usage and which has less limitations.
The selfjoin method is - IMHO - more obvious to construct (especially for a person with RDBMS background) but the results of selfjoin can be a bit confusing sometimes (see the example for selfjoin on splunk docs page).
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Almost every time stats is better than join. Of course there could be situation where it's another way. Best way to check it is just create both queries and then check with Job inspector which one is better. There are many presentations about Job Inspector and replacing join with stats.
- https://conf.splunk.com/files/2020/slides/TRU1143C.pdf
- https://conf.splunk.com/files/2019/slides/FNC2751.pdf
- https://conf.splunk.com/files/2020/slides/TRU1761C.pdf
Above is couple of those, but you could found easily more.
r. Ismo
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Thanks! Good to know.
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Firstly, lose the unneeded fields, then do a selfjoin on host, drive and _time fields.
