Suppose I have events of user purchases:
<pre>
eventName=purchase userId=1 time=1000 item=food price=100
eventName=purchase userId=1 time=1002 item=cloth price=200
eventName=purchase userId=1 time=1010 item=cloth price=150
eventName=purchase userId=99 time=1050 item=book price=200
</pre>
I would like to set up a real-time alert that tells me when a user's FIRST EVER purchase has price >= 200.
With a normal search, I can just run this query over "All time":
<pre>
eventName=purchase | sort 0 time | dedup userId | where price >= 200
</pre>
However, how do I efficiently implement the same thing using a real-time search? Something like:
<pre>
eventName=purchase | ONLY PROCEED WHEN THIS IS THE FIRST ONE FOR THE USER | where price >= 200
</pre>
Thanks
You will need to maintain a lookup file of all the users who have ever made a purchase. Anyone making a purchase who is not in the lookup file must be a first-time purchaser. The lookup file (I call it 'purchasers.csv') will contain userId and at least one other field, perhaps 'time'.
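<pre>
eventName=purchase | where price >= 200 | lookup purchasers.csv userId OUTPUT time AS firstPurchaseTime | where isnull(firstPurchaseTime)
</pre>
The lookup's 'time' is renamed on output so it does not collide with the event's own 'time' field. A null value means the user is not in the lookup yet, i.e. this is a first purchase.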
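One possible way to keep the lookup current, as a sketch only, is a scheduled search that merges newly seen purchasers into the file. It assumes purchasers.csv has the columns userId and time, that 'time' is numeric so min() gives the earliest purchase, and that you pick the schedule and time range yourself:

<pre>
eventName=purchase
| stats min(time) AS time BY userId
| inputlookup append=true purchasers.csv
| stats min(time) AS time BY userId
| outputlookup purchasers.csv
</pre>

The second stats collapses new and existing rows so each userId keeps only its earliest time. Note that a user's second purchase made before the next scheduled run would still look like a first purchase, so how often this runs matters.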
Consider whether this really needs to be a real-time search. Each RT search ties up a CPU on your search head and on every indexer for as long as it runs. They should be reserved for when an event must be responded to instantly and automatically.
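If a few minutes of delay is acceptable, a scheduled search over a short window might do instead. This is a rough sketch, assuming it runs every 5 minutes and that purchasers.csv is only updated after this search has finished for the same window (otherwise the new users would already be in the lookup). The name firstPurchaseTime is just illustrative:

<pre>
eventName=purchase earliest=-5m@m latest=@m
| lookup purchasers.csv userId OUTPUT time AS firstPurchaseTime
| where isnull(firstPurchaseTime)
| sort 0 time
| dedup userId
| where price >= 200
</pre>

Filtering on price comes last so that a first purchase under 200 is not skipped in favour of a later, larger one within the same window.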
Thanks @richgalloway
But now the question becomes: how do I maintain such a lookup using only Splunk?
My idea is to run a script on or before indexing that somehow writes into Splunk's key value store. However, I couldn't find any information on how to have a custom script called on or before indexing. Any ideas? Thanks
A related question is here: https://answers.splunk.com/answers/718386/run-a-python-script-on-or-before-index.html