I have a index named Events
Example events:
AccountCreated
{
"AccountId": 1234,
"EventName": "AccountCreated",
"SomeOtherProperty": "Some value"
}
FavoriteCreated
{
"AccountId": 1234,
"EventName": "FavoritesCreated
}
Let's say that I have a bunch of these events, like millions.
Now, I want to create a query that returns the AccountCreated event IF 1 (or more) FavoriteCreated event exists with the same AccountId.
I've tried the following query and it works
index=events EventName=AccountCreated
[search index=events EventName=FavoriteCreated | dedup AccountId | fields AccountId]
| table AccountId, SomeOtherProperty
The only problem with that one is that it's using a subsearch and im hitting the 10000 results limit.
So then I tried using a JOIN instead (this also works)
index=events EventName=AccountCreated AccountId=*
| stats count by AccountId, EventName
| fields - count
| join AccountId
[ | search index=events EventName=FavoriteCreated AccountId=*
| stats count by AccountId ]
| fields - count
| table AccountId, EventName
...but now im facing the 50K events limit (JOIN subsearch).
So, I need to be able to write the query without using JOIN and/or subsearch, do you guys have any tips?
Managed to solve it like this:
index=events AccountId=* (EventName=AccountCreated OR EventName=FavoriteCreated)
| stats
values(Brand) as Brand,
values(DeviceType) as DeviceType,
values(Email) as Email,
values(EventName) as EventName
values(EventTime) as EventTime,
values(Locale) as Locale,
values(ClientIp) as ClientIp
by AccountId
| where mvcount(EventName)>1
| eval EventName=mvindex(EventName,0)
| eval EventTime=mvindex(EventTime,0)
| eval ClientIp=mvindex(ClientIp,0)
| eval DeviceType=mvindex(DeviceType,0)
Managed to solve it like this:
index=events AccountId=* (EventName=AccountCreated OR EventName=FavoriteCreated)
| stats
values(Brand) as Brand,
values(DeviceType) as DeviceType,
values(Email) as Email,
values(EventName) as EventName
values(EventTime) as EventTime,
values(Locale) as Locale,
values(ClientIp) as ClientIp
by AccountId
| where mvcount(EventName)>1
| eval EventName=mvindex(EventName,0)
| eval EventTime=mvindex(EventTime,0)
| eval ClientIp=mvindex(ClientIp,0)
| eval DeviceType=mvindex(DeviceType,0)
Hi @joseftw,
Thanks for asking a question! If you were able to resolve this issue, please click “Accept” directly below the answer to resolve the post.
index=events (EventName=AccountCreated OR EventName=FavoriteCreated) AccountId=*
| stats estdc(EventName) as flag values(*) as * by AccountId
| where flag > 1
please modify *
to appropriate field.
Hi joseftw,
try something like this:
index=events (EventName=AccountCreated OR EventName=FavoriteCreated) AccountId=*
| eval EventName=case(isnotnull(EventName), EventName, 1=1, "unknown")
| stats count by AccountId, EventName
| fields - count
| table AccountId, EventName
If this does not work please modify it 😉 But you will get a starting point and should you need more help please read this post https://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-jo...
cheers, MuS
Hello!
Thank you for your answer! I tried it but it does not work for me. It returns BOTH AccountCreated and FavoriteCreated events, I only want the query to return AccountCreated events. This is my first day with Splunk so please bare with me... 🙂
If you have time, please look at my two examples I provided (they work but are using subsearch/join). I basically want to translate one of those queries to something that does not use join/subsearch.
Thank you for your help!
Basically "Return this AccountEvent IF 1 or more FavoriteCreated event exists with the same AccountId"