Splunk Search

How to create a multistage Sankey diagram with a single search without needing to "append"?

Path Finder

I have a dataset where each event summarizes a workflow, using the fields Foo->Bar->Baz, and I'm looking to create a Sankey diagram to visualize the flow. The only way I've come up with to get the output I want is to run one search, do a stats call, and then append the same query with a different stats call, like:

index=myIndex | stats count BY Foo, Bar | rename Foo AS source, Bar AS target | append [search index=myIndex | stats count BY Bar, Baz | rename Bar AS source, Baz AS target]

This works, but it's incredibly inefficient, and MUCH slower than it needs to be. Is there a way to get the output I'm looking for with a single search that I'm missing?

The output table would look something like:

source | target | count
foo1   | bar1   | 3
foo1   | bar2   | 12
bar1   | baz1   | 1
bar1   | baz2   | 2
bar2   | baz1   | 12
1 Solution

Splunk Employee

If you can count by all three fields, maybe using appendpipe would be less resource intensive than using append:

sourcetype="access_combined" 
| stats count by host categoryId product_name
| appendpipe [stats count by host categoryId | rename host as source, categoryId as target]
| appendpipe [stats count by categoryId product_name | rename categoryId as source, product_name as target]
| search source=*
| fields source target count

gives me

[screenshot of the results]


Communicator

Hi aljohnson. I want to thank you very much for this solution. I applied it on my problem and it worked very well. Well done.

0 Karma

Path Finder

Hmm - I tried to post your comment as the answer, but Splunk is saying I can't make more than 2 posts per day until I hit 40 points. Pretty sure I've only made one post today, but...

/shrug

If you paste that same thing as the answer, I'll mark it solved 🙂

Explorer

Hi aljohnson,

Thanks for your answer, it would greatly help to have it integrated in the documentation...

Below is a small amendment that helps size the lines correctly:

sourcetype="access_combined"
| table host categoryId product_name
| appendpipe [stats count by host categoryId | rename host as source, categoryId as target]
| appendpipe [stats count by categoryId product_name | rename categoryId as source, product_name as target]
| search source=*
| fields source target count
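
Why the line sizes differ: in the accepted answer, each appendpipe runs "stats count" over rows that the first three-field stats has already aggregated, so it counts distinct field combinations rather than events. Swapping the first stats for table, as above, fixes that by passing the raw events through. If you would rather keep the initial stats so that fewer rows flow down the pipeline, summing the precomputed counts should give the same totals. A rough sketch using the index and field names from the original question (not run against real data, so treat it as a starting point):

```
index=myIndex
| stats count BY Foo, Bar, Baz
| appendpipe [stats sum(count) AS count BY Foo, Bar | rename Foo AS source, Bar AS target]
| appendpipe [stats sum(count) AS count BY Bar, Baz | rename Bar AS source, Baz AS target]
| search source=*
| fields source target count
```

The second appendpipe only sees the three-field rows, because the rows appended by the first one no longer have Bar or Baz and stats drops rows that are missing a BY field. The final search then keeps only the appended source/target rows.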

0 Karma

Splunk Employee

Glad it worked. Converted 🙂

0 Karma

Path Finder

Yes! Perfect!

Didn't realize appendpipe was a thing. Thanks for your help!

Path Finder

...I have no idea why a random "5." is showing up in the middle of the table...

0 Karma

Splunk Employee

Cool question @doweaver. How many distinct values are there of foo bar and baz? As a solution for dc(foo) = 2 might be a lot simpler than all of those distinct values being an unknown variable.

Path Finder

There are probably ~5 distinct values for each.

I'm not sure I understand what you're getting at here:

As a solution for dc(foo) = 2 might be a lot simpler than all of those distinct values being an unknown variable.

0 Karma

Splunk Employee

Sorry, that wasn't well worded. I just meant that if there is a smaller number of distinct values, you might be able to get a simpler answer (I'm more thinking out loud haha, sorry).

So obviously foo and bar occur together, and bar and baz occur together, but do foo and baz NOT occur together, that is, is there a reason you can't search

index=myIndex | stats count by foo bar baz

Path Finder

No worries 🙂

Unfortunately, they all three occur in a single event 😞 Technically, it's a transaction that links A -> B, with A containing Foo, and B containing Bar and Baz. I don't THINK there's a way to split things up in a way that will make that work... but I'll keep thinking about that.

0 Karma

Community Manager

Hi @doweaver

That's just the automatic line numbering applied to anything in a code block. It lets people point to the exact line where they've spotted a syntax error when someone shares multiple lines of sample data or code.

Path Finder

Oh, that makes sense 🙂 That was the best way I could figure out to put in a table (HTML table markup didn't seem to work).

0 Karma

Community Manager

heh yeah, that's the best way to display a table format on here. you're doin it right 🙂