Splunk Search

How do you merge multisearch results?

gmasca
Explorer

Hi,

I tried many alternatives but no good results. Please help if possible.

I have a multi search with two sets of results:

place
location:
31/12 9:00 cat sofa red
31/12 9:00 dog bed blue
30/12 9:00 dog bed red
29/12 9:00 dog door orange
29/12 9:00 cow sofa blue

Event
31/12 10:00 dog
31/12 3:00 cat
31/12 1:00 cow
30/12 11:00 dog
30/12 2:00 dog
30/12 1:00 cow
29/12 4:00 cat
29/12 11:00 cat
29:00 5:00 dog

I need to use multisearch because both tables can be huge. I need to merge the results following:

Where was the animal with which color on each event.

The place should look for the last 24 hours info only from the location table.

The events result without match stay. The locations info without match need to be excluded.

Your help is much appreciated,

Happy New Year.

0 Karma

woodcock
Esteemed Legend

I would use one search to construct a time-based lookup table (https://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Configureatime-boundedlookup) from one dataset. Then use this lookup in another search.

0 Karma

DalJeanis
Legend

Okay, as far as logic is concerned, multisearch is exactly the same as an OR, so I'll pseudocode this with an OR. Try it both ways and see what actual performance is on your systems.

I'm going to assume that color is a detail not directly modifying the animal or the location. That is, dog bed red is not a red dog in the bed or a dog in a red bed. Let's say it's the color of a balloon that happens to be in the location. If you want to differentiate between the location of a red dog and a blue dog, then we'll need more discussion.

Given your 24 hour stricture, and the fact that your location info seem to exist once a day, I'm going to use streamstats to copy the most recent location data to the animal, subject to a 24 hour window. I'm also going to sort to make sure the location data appears before the animal data in the sort order for a given hour. You'll have to revisit that assumption for your real, confidential use case.

I pseudocode this query thus:

( your search that gets hour animal location color )  OR
( your search that gets hour and animal )
| fields _time animal location color  
| rename COMMENT as "the above selects the records and limits to only data we need" 

| rename COMMENT as "roll the location data (within 24 hours) over to the animal record" 
| eval sortlocationfirst=if(isnotnull(location), "1","2")
| sort 0 _time sortlocationfirst

| rename COMMENT as "roll the location data (within 24 hours) over to the animal record" 
| streamstats time_window=24h last(location) as location last(color) as color by animal

| rename COMMENT as "Now throw away ALL location records, since the data we needed has been copied. " 
| where (it is an animal record)

If there is some other data you need to retain on the location records, then there would be various strategies to identify which ones have been used or not use. Let us know, and we can help you with that mod.

0 Karma

gmasca
Explorer

Hi,

Thanks a lot it worked very well but I am having some issues.
Is there any replacement for the command time_window? This command is limited to 10000 events but I have a bigger number.

Thank you,

0 Karma

richgalloway
SplunkTrust
SplunkTrust

Please show your existing search query and what the results should look like.

---
If this reply helps you, Karma would be appreciated.
0 Karma

woodcock
Esteemed Legend

What did you end up doing, @gmasca?

0 Karma
Get Updates on the Splunk Community!

Webinar Recap | Revolutionizing IT Operations: The Transformative Power of AI and ML ...

The Transformative Power of AI and ML in Enhancing Observability   In the realm of IT operations, the ...

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...