I have a set of log data that is basically in this format:
Event timestamp user
6 10/14/2019 1:29 User1
33 10/14/2019 1:30 User1
3 10/14/2019 1:30 User1
6 10/14/2019 2:12 User2
36 10/14/2019 2:12 User2
31 10/14/2019 2:27 User2
I don't have the required source and target columns in the logs, so I'm not quite sure how to approach creating a Sankey diagram. I have a "login" event that I can use as the starting point to figure out what users are doing. How can I get from the log files I have now to a format that the Sankey diagram can use? Do I need to edit the files to create a source and target field?
For example, I'd like to take the example data from above, and turn it into the following Sankey-friendly format:
User Source Dest
User1 6 33
User1 33 3
User2 6 36
User2 36 31
Based on the data provided, a Sankey diagram might not be the best fit. Sankey is more useful when there is a clearly defined hierarchy between events.
For example, calculating the amount of data or traffic flowing from an app to a backend service to actions.
If you can establish a hierarchy between user actions, such as in an app where the flow of events is predefined, it makes more sense to use a Sankey diagram and see how many users drop off at each step. If there is no proper hierarchy between events, your Sankey will be a mess. You might want to look at the timeline chart, which will probably depict this better by calculating an aggregate order for the user actions.
For Sankey, you can start with:
| stats count by User, Event
This will start an individual flow per user in the Sankey diagram, with User as the source and Event as the target.
Cheers
Any suggestions for how to get started on the timeline chart? I'm trying to look into it, as I'm happy to consider alternative visualizations.
What exactly is the end goal? I understand that you want to use the Sankey Diagram, but what are you trying to visualize with it? Have you evaluated whether the Sankey diagram is the best way to tell the story with your data?
If yes, please provide samples of the different data sources that would contribute to this.
I've added some sample data. I'm looking to produce a Sankey diagram showing how users flow through the system. I have an event that represents login (6 in the example above), then others to represent different actions they can take in the system. I want to see what users are doing after they log in.
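To make the source/target idea concrete, here is a minimal sketch in Python of the transformation asked about in the question: sort each user's events by time, then pair every event with the one that follows it. This is only an illustration of the logic, not a Splunk search; the function name `to_transitions` and the `rows` list are made up for the example, and it assumes that events sharing the same minute should keep the order they have in the log.

```python
from collections import Counter
from datetime import datetime
from itertools import groupby

# Sample rows copied from the question: (event, timestamp, user).
rows = [
    ("6", "10/14/2019 1:29", "User1"),
    ("33", "10/14/2019 1:30", "User1"),
    ("3", "10/14/2019 1:30", "User1"),
    ("6", "10/14/2019 2:12", "User2"),
    ("36", "10/14/2019 2:12", "User2"),
    ("31", "10/14/2019 2:27", "User2"),
]


def to_transitions(rows):
    """Pair each event with the next event from the same user (hypothetical helper)."""
    parsed = [
        (user, datetime.strptime(ts, "%m/%d/%Y %H:%M"), event)
        for event, ts, user in rows
    ]
    # sorted() is stable, so events with identical timestamps keep their log order
    # (an assumption about how ties should be broken).
    ordered = sorted(parsed, key=lambda r: (r[0], r[1]))
    transitions = []
    for user, group in groupby(ordered, key=lambda r: r[0]):
        events = [event for _, _, event in group]
        # zip(events, events[1:]) yields consecutive (source, dest) pairs.
        transitions.extend((user, src, dst) for src, dst in zip(events, events[1:]))
    return transitions


transitions = to_transitions(rows)
for user, src, dst in transitions:
    print(user, src, dst)

# A Sankey diagram generally wants one row per link with a weight,
# so count how often each (source, dest) pair occurs across all users.
link_counts = Counter((src, dst) for _, src, dst in transitions)
```

On the sample data this yields the four User/Source/Dest rows listed in the question. The per-link counts in `link_counts` are what would feed the diagram, since the visualization needs a value for each source/target pair.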