I have the below code. I know that values exist under the subsearch which are not returning when I run the below query. However, when I uncomment the "where clause" in the sub search the values appear. I don't know what I have done incorrectly for my results to not show. I've also commented out the |search and it still doesn't show that these values exist in the subsearch.
Any help would be appreciated.
index=customer name IN (gate-green, gate-blue) msg="*First time: *"
| rex field=msg "First time: (?<UserId>\d+)"
| eval FirstRequest = 1
| join type=left UserId
[search index=customer name IN (cust-blue, cust-green) msg="*COMPLETED *"
| rex field=msg "Message\|[^\t\{]*(?<json>{[^\t]+})"
| spath input=json path=infoId output=UserId
| eval Completed = 1
```| where UserId IN (125,999,418,208)```]
| table UserId, Completed
| search UserId IN (125,999,418,208)
Hi @MrJohn230 ,
at first, if possible try to avoid to use join command!
I understand that all of us arrive from SQL, but Splunk isn't a database so join command should be avoided all the times it's possible and replaced e.g. with the stats command, because it's a very slow and resource eater command.
e.g. try something like this (obviously I cannot check it):
index=customer ((name IN (gate-green, gate-blue) msg="*First time: *") OR name IN (cust-blue, cust-green) msg="*COMPLETED *")
| rex field=msg "First time: (?<UserId>\d+)"
| rex field=msg "Message\|[^\t\{]*(?<json>{[^\t]+})"
| spath input=json path=infoId output=UserId
| eval status=if(name IN (gate-green, gate-blue) AND msg="*First time: *","FirstRequest","Completed")
| stats dc(status) AS status_count values(status) AS status BY UserId
| eval status=if(status_count=2,"both",status)
| table UserId status
| search UserId IN (125,999,418,208)
Then you can define if to maintain all the UserIds or only the ones with both the statuses.
About your search, try to use quotes in the IN values.
Ciao.
Giuseppe
One additional remark about your searches (both the outer one and the subsearch) - don't use wildcards at the beginning of your search term if you can avoid it.
Your where clause is wrong - it does not support the IN construct, like search. You could do it with
| where in(UserId,125,999,418,208)
or using search IN. As @gcusello says, using join and subsearches is not a good habit, using stats can normall do the same and does not have limitations that join/subsearch has, e.g. this is an example of using stats.
index=customer (name IN (gate-green, gate-blue) msg="*First time: *") OR (name IN (cust-blue, cust-green) msg="*COMPLETED *")
| rex field=msg "First time: (?<UserId>\d+)"
| eval FirstRequest = if(isnotnull(UserId),1,null())
| rex field=msg "Message\|[^\t\{]*(?<json>{[^\t]+})"
| spath input=json path=infoId output=CompletedUserId
| eval Completed = if(isnotnull(CompletedUserId), 1, null())
| eval UserId=coalesce(UserId, CompletedUserId)
| stats values(Completed) as Completed by UserId
| search UserId IN (125,999,418,208)
One more vote to not to use join or union. It usually have more issues than it solves.
Here is one old post how to replace join with stats with different join types https://community.splunk.com/t5/Splunk-Search/What-is-the-relation-between-the-Splunk-inner-left-joi...
There are also many .conf presentations why you should use stats instead of join like https://conf.splunk.com/watch/conf-online.html?search=join%20without%20join#/
r. Ismo
Hello @gcusello I managed to not use the join command, the above feedback helped.
I decided to go with Union this seems to be working, what do you think?
index=customer ((name IN (gate-green, gate-blue) msg="*First time: *")
| rex field=msg "First time: (?<UserId>\d+)"
| eval FirtRequest = 1
|union
[ search index name IN (cust-blue, cust-green) msg="*COMPLETED *")
| rex field=msg "Message\|[^\t\{]*(?<json>{[^\t]+})"
| spath input=json path=infoId output=UserId
| eval Completed = 1]
| stats max(FirstRequest) AS FirstRequest max(Completed) BY UserId
Hi @MrJohn230 ,
I continue to prefer the solution I hinted because union is very similar to join and maintain the ame limit of 50,000 results like all the subsearches.
If it solves your requirement, use it, but I hint to tale practice using stats that's better and faster.
did you solved the initial issue of the eval?
Ciao.
Giuseppe
Hi @MrJohn230 ,
at first, if possible try to avoid to use join command!
I understand that all of us arrive from SQL, but Splunk isn't a database so join command should be avoided all the times it's possible and replaced e.g. with the stats command, because it's a very slow and resource eater command.
e.g. try something like this (obviously I cannot check it):
index=customer ((name IN (gate-green, gate-blue) msg="*First time: *") OR name IN (cust-blue, cust-green) msg="*COMPLETED *")
| rex field=msg "First time: (?<UserId>\d+)"
| rex field=msg "Message\|[^\t\{]*(?<json>{[^\t]+})"
| spath input=json path=infoId output=UserId
| eval status=if(name IN (gate-green, gate-blue) AND msg="*First time: *","FirstRequest","Completed")
| stats dc(status) AS status_count values(status) AS status BY UserId
| eval status=if(status_count=2,"both",status)
| table UserId status
| search UserId IN (125,999,418,208)
Then you can define if to maintain all the UserIds or only the ones with both the statuses.
About your search, try to use quotes in the IN values.
Ciao.
Giuseppe