Hi,
I followed instructions here:
https://answers.splunk.com/answers/132016/process-flow-tracing-point-to-point-latency-calculation-vi...
to analyse users stuck on steps of my business process.
My current search is:
host="sp.dentfix.ro" | stats values(event) as step values(eval(strftime(_time, "%Y-%d-%m %H:%M"))) as times by id | mvcombine step | stats count by step
I have a number of issues here:
1. the values(event) is listing values in alphabetical order, but I would like to see values in event time order (that is first event of every type).
2. Even more important, I have some events sp_alias where users get identified. The event contains fields id (as the new id) and oldId as former id of the user. How can I consolidate all oldId s with new id s and group by real users?
OK, so I managed to solve the question 2.
Steps:
1. I created this search event=sp_alias | where oldId != id | stats values(id) as myid by oldId | rename oldId as id | outputlookup sp_aliases.csv
that makes a table of distinct oldId that should be replaced with identified id into a new field via lookup
2. I created the main search host="sp.dentfix.ro" | where id!="" | lookup sp_aliases id OUTPUT myid | eval theId=if(myid!="",myid,id) | stats list(event) as steps list(eval(strftime(_time, "%Y-%d-%m %H:%M"))) as times by theId | mvcombine steps | where like(steps,"%Payment%") | stats count as num by steps | sort num desc
where I created a new field, theId that unifies all events that happened via various ids in the past to the identified user id. I added a filter to display only steps that contain a Payment event.
OK, so I managed to solve the question 2.
Steps:
1. I created this search event=sp_alias | where oldId != id | stats values(id) as myid by oldId | rename oldId as id | outputlookup sp_aliases.csv
that makes a table of distinct oldId that should be replaced with identified id into a new field via lookup
2. I created the main search host="sp.dentfix.ro" | where id!="" | lookup sp_aliases id OUTPUT myid | eval theId=if(myid!="",myid,id) | stats list(event) as steps list(eval(strftime(_time, "%Y-%d-%m %H:%M"))) as times by theId | mvcombine steps | where like(steps,"%Payment%") | stats count as num by steps | sort num desc
where I created a new field, theId that unifies all events that happened via various ids in the past to the identified user id. I added a filter to display only steps that contain a Payment event.
There is a catch, sometimes the user can identify itself multiple times between several id. The lookup table is not ideal made to take care of this situation, perhaps a better approach is to collect all ids ever related into a single field then select one id to represent the whole group. Needs improvement but the impact is not big
Since the free license does not seem to include report scheduler, I had to setup a crontab entry such as
0 0 * * * root /opt/splunk/bin/splunk search '|savedsearch "Job to update sp_aliases lookup"' >/dev/null 2>&1
I'm not following question 2, but question 1 should use list() instead of values(), which will show in received order and non-dedup'd.
I made a table of aliases as a multivalue field named myid
* | stats values(oldId) as myid by id | eval myid=mvdedup(mvappend(myid,id)) | table myid
Somehow I need to combine the stats in the top comment to group by any value found in myid. Still having no idea how to do it.
So using the above search I get a table like a lookup table. I would calculate a new field theId in main search based on id, if id is inside myid field in the lookup table I would return the identified id.
then I could group events by the new theId field.
However implementing such a lookup apparently required exporting the table to a csv file? Then I need to define a job to do this from time to time to update the lookup file.
So I got this step, by creating a lookup file from the search
event=sp_alias | where oldId != id | stats values(oldId) as myid by id | outputlookup sp_aliases.csv
Now I have to tune the lookup file to include everything I need and find a way to use it in the main search. Also I need to automate the above search through a job I suppose.
yes, thanks for the tip, changing both values with list helped to display the steps more clearly, even if the stats table increased.
New search have a sort too to see better the mass behavior:
host="sp.dentfix.ro" | stats list(event) as step list(eval(strftime(_time, "%Y-%d-%m %H:%M"))) as times by id | mvcombine step | stats count as num by step | sort num desc