Hello,
I am trying to join two searches for see, same hash exists on the other index as well. Below is my search, the issue is every time I run a search for the same timelimit, I see different results. WHY?
Basesearch: I've tried to combine results of three different hash fields into one
(index=a sourcetype="a" (hash1=* OR hash2=* OR hash3=*))
| fields hash1, hash2, hash3
| table hash1, hash2, hash3
| eval hash=mvzip(mvzip('hash1','hash2',"|"),'hash3',"|")
| fields hash
| makemv hash delim="|"
| mvexpand hash
From here, I've joined two indexes and both indexes have same field for hash files, so I'm attempting to join hash as the focus. Search seems to work fine
join type=left hash
[| search (index=b sourcetype=b hashfile=*) OR (index=c sourcetype=c hashfile=*)
| fields hashfile, filename,index
| eval hash=hashfile]
Both the search on running individually returns 2k+ results, whereas on combining it, I could see only 1 result in the stats table and on hitting run for the same time limit every time I see different file name WHYYY?
Any help would be appreciated, thanks!
Hi @Woodpecker,
two questions:
did you used a closed time range (e.g. last hour or yesterday) or did you used a time period containing "latest=now"?
Because in the second case, you cannot compare results, you have to choose a closed period (e.g. last hour or yesterday).
if you run the joined search, in the same time period of the main, have you more or less than 50,000 results?
remember that subsearches have the limit of 50,000 results and this influeces the final result.
this is one of the reasons (the second are the prestactions) to use join only as last option when you haven't any other choice!
Then I don't understand why you created the first search in this way, if you could have three fields as hash and you want only one, you could use coalesce, something like this:
(index=a sourcetype="a" (hash1=* OR hash2=* OR hash3=*)) OR index=b sourcetype=b hashfile=*) OR (index=c sourcetype=c hashfile=*)
| eval hash=if(index=a,coalesce("hash1","hash2","hash3",hash)
| stats values(filename) As filename values(index) AS index dc(index) AS dc_index BY hash
| where dc_index=2
| table hash filename
I don't know if this search is running because I cannot test it, but I'd like to show you the approach to replace join with stats.
Ciao.
Giuseppe
Hi
just check those limits and it's even lower. Only 10k events and/or 60s and TTL 300s.
If needed and you cannot user @gcusello 's suggestion to replace sub search e.g. with stats you can increase those values in limits.conf.
r. Ismo
If you are getting varying results for the same time interval and are using join or some of the other commands with limits, then it could be you are hitting limits.
An alternative search could be this - remove join - and use stats instead
(index=a sourcetype="a" (hash1=* OR hash2=* OR hash3=*)) OR (index=b sourcetype=b hashfile=*) OR (index=c sourcetype=c hashfile=*)
| eval hash=if(index=a,mvzip(mvzip('hash1','hash2',"|"),'hash3',"|"), hashfile)
| fields hash filename index
| mvexpand hash
| stats values(*) as * by hash
That makes hash either the combined hash from your first search, or hashfile from your second. It then expands hash as you do, but includes the fields from the second search.
It then uses stats to aggregate the fields again based on hash value.