Creating a query for Sankey diagram from log data

rschuetzler
Explorer

I have a set of log data that is basically in this format:

Event    timestamp          user
6        10/14/2019 1:29    User1
33       10/14/2019 1:30    User1
3        10/14/2019 1:30    User1
6        10/14/2019 2:12    User2
36       10/14/2019 2:12    User2
31       10/14/2019 2:27    User2

I don't have the required source and target columns in the logs, so I'm not quite sure how to approach creating a Sankey diagram. I have a "login" event that I can use as the starting point to figure out what users are doing. How can I get from the log files I have now to a format that the Sankey diagram can use? Do I need to edit the files to create a source and target field?

For example, I'd like to take the example data from above, and turn it into the following Sankey-friendly format:

User    Source     Dest
User1     6         33
User1    33          3
User2     6         36
User2    36         31

arjunpkishore5
Motivator

Based on the data provided, a Sankey diagram might not be the best fit. Sankey is more useful when there is a well-defined hierarchy between events.

For example, calculating the amount of data or traffic flowing from an app to a backend service to individual actions.

If you can establish a hierarchy between user actions, such as in an app where the flow of events is predefined, it makes more sense to use a Sankey and see how many users drop off at each step. If there is no proper hierarchy between events, your Sankey will be a mess. You might want to look at the timeline chart, which will probably depict this better by calculating an aggregate order for the user actions.

For Sankey, you can start with:

| stats count by User, Event

This will start an individual flow per user in the Sankey diagram.
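
If you want the Source/Dest pairs from your example instead, you should not need to edit the log files. A rough sketch, assuming the fields are already extracted as user and Event (field names are case-sensitive, so adjust to match yours) and that _time is parsed from your timestamp column, is to carry each user's previous event forward with streamstats:

```
| sort 0 user _time
| streamstats current=f last(Event) as Source by user
| where isnotnull(Source)
| rename Event as Dest
| stats count by Source Dest
```

Here current=f leaves the current event out of the calculation, so last(Event) returns the same user's preceding event. The where clause drops each user's first event, which has nothing before it. Note that events sharing the same timestamp (such as 33 and 3 at 1:30 in your sample) may not keep their log order unless the timestamp has finer resolution. As far as I know, the Sankey visualization reads the result columns as source, target, and value, which is why the final stats emits Source, Dest, and count in that order.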

Cheers

rschuetzler
Explorer

Any suggestions for how to get started on the timeline chart? I'm trying to look into it, as I'm happy to consider alternative visualizations.

arjunpkishore5
Motivator

What exactly is the end goal? I understand that you want to use the Sankey Diagram, but what are you trying to visualize with it? Have you evaluated whether the Sankey diagram is the best way to tell the story with your data?

If yes, please provide samples of the different data sources that would contribute to this.

rschuetzler
Explorer

I've added some sample data. I'm looking to produce a Sankey diagram showing how users flow through the system. I have an event that represents login (6 in the example above), then others to represent different actions they can take in the system. I want to see what users are doing after they log in.