Solved: Sankey d3 chart missing "middle" nodes

0YAoNnmRmKDg · 04-20-2015

Hi Guys,

Longtime lurker, first time poster...

So after many hours of work and rework I surrender - I can't get Sankey to display all the nodes I want it to.

The flow I am looking for in Sankey is

host > service > customer

This is to show that if host x dies, it kills service y and affects customer z.

I started by taking the 6.x samples app apart and replacing its search with mine, but the "services" nodes never show; I just get links from host to customer. The sample data in the 6.x app is CSV, and all they do is reference 2 columns and the magic happens. What am I missing?

I feel it is something fundamental but simple I'm not getting here...

Many thanks in advance for any help!

Original 6.x samples app search

            <search id="sankey_search">
                <query><![CDATA[
                    index=_internal sourcetype=splunk_web_access NOT uri_path=*/static/* uri_path=*/app/* OR uri_path=*/manager/* 
                    | rex field=referer "https?://.+?/.+?(?<referer_path>/[^\\?]+)" 
                    | rex field=uri_path "/.+?(?<path>/.+)" 
                    | rename referer_path as from path as to 
                    | stats count by from to | head 50
                ]]></query>

My new search

      <search id="sankey_search">
        <query>
          <![CDATA[
                    |inputlookup hosts.csv | search host="*" service=* customer=* | head 100 | stats count by host, customer
                ]]>
        </query>

Sample csv

host,service,customer
ABC123431,Service1,Customer1
ABC123300,Service2,Customer2
ABC123321,Service3,Customer3
ABC123332,Service4,Customer4
ABC123940,Service5,Customer5
ABC123334,Service6,Customer6
ABC123702,Service7,Customer7
ABC123341,Service8,Customer8
ABC123740,Service9,Customer9
ABC123431,Service1,Customer1
ABC123300,Service2,Customer2
ABC123321,Service3,Customer3
ABC123332,Service4,Customer4
ABC123940,Service5,Customer5
ABC123334,Service6,Customer6
ABC123702,Service7,Customer7
ABC123341,Service8,Customer8
ABC123740,Service9,Customer9
ABC123431,Service1,Customer1
ABC123300,Service2,Customer2
ABC123321,Service3,Customer3
ABC123332,Service4,Customer4
ABC123940,Service5,Customer5
ABC123334,Service6,Customer6
ABC123702,Service7,Customer7
ABC123341,Service8,Customer8
ABC123740,Service9,Customer9
ABC123431,Service1,Customer1
ABC123300,Service2,Customer2
ABC123321,Service3,Customer3
ABC123332,Service4,Customer4
ABC123940,Service5,Customer5
ABC123334,Service6,Customer6
ABC123702,Service7,Customer7
ABC123341,Service8,Customer8
ABC123740,Service9,Customer9
ABC123431,Service1,Customer1
ABC123300,Service2,Customer2
ABC123321,Service3,Customer3
ABC123332,Service4,Customer4
ABC123940,Service5,Customer5
ABC123334,Service6,Customer6
ABC123702,Service7,Customer7
ABC123341,Service8,Customer8
ABC123740,Service9,Customer9
ABC123431,Service1,Customer1
ABC123300,Service2,Customer2
ABC123321,Service3,Customer3
ABC123332,Service4,Customer4
ABC123940,Service5,Customer5
ABC123334,Service6,Customer6
ABC123702,Service7,Customer7
ABC123341,Service8,Customer8
ABC123740,Service9,Customer9
ABC123740,Service1,Customer5
ABC123640,Service2,Customer6
ABC123433,Service3,Customer7
ABC123710,Service4,Customer8
ABC123722,Service5,Customer9
ABC123330,Service6,Customer10
ABC123603,Service7,Customer1
ABC123801,Service8,Customer2
ABC123513,Service9,Customer3
ABC123800,Service1,Customer4
ABC123312,Service2,Customer5

jeffland · 04-20-2015

I think the fundamental but simple thing you're missing is that the sankey diagram as it is can only work with two columns ("to" and "from" in the original example), and that "middle" segments in the diagram are simply entries that appear in both "to" and "from" and thus form a chain.
So this is an initial attempt:

| inputlookup threecolumns.csv | search host="*" service=* customer=* | head 100
| table host service | rename host as from | rename service as to 
| append [|inputlookup threecolumns.csv | search host="*" service=* customer=* | head 100
| table service customer | rename service as from | rename customer as to] | stats count by from, to

This is satisfactory in what it does, but I am unhappy with how it works (two searches). I'm pretty sure there is a better way to rearrange the data. Maybe someone smart would like to propose a way to achieve the same behavior, only with formatting the data from a single search?
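
To make the reshaping concrete outside Splunk, here is a minimal Python sketch of the same idea done in one pass: each (host, service, customer) row is split into two from/to links, and identical links are counted, which is roughly what `stats count by from, to` does. The function name `to_sankey_links` and the three sample rows are illustrative assumptions, not part of the samples app or of the search above.

```python
from collections import Counter

def to_sankey_links(rows):
    """Split each (host, service, customer) row into two (from, to) links and count them."""
    links = Counter()
    for host, service, customer in rows:
        links[(host, service)] += 1      # first hop: host -> service
        links[(service, customer)] += 1  # second hop: service -> customer
    return links

# Three rows taken from the sample csv in the question.
rows = [
    ("ABC123431", "Service1", "Customer1"),
    ("ABC123300", "Service2", "Customer2"),
    ("ABC123740", "Service1", "Customer5"),
]

links = to_sankey_links(rows)

# "Middle" nodes are exactly the values that occur on both sides of a link.
sources = {src for src, _ in links}
targets = {dst for _, dst in links}
middle_nodes = sources & targets

for (src, dst), count in sorted(links.items()):
    print(src, "->", dst, count)
```

Because the services show up in both the "from" and the "to" column, they end up in `middle_nodes`, which is why they appear as the middle layer of the chart.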

0YAoNnmRmKDg · 04-20-2015

That's great, works nicely - sorted what I needed for a smoke and mirrors demo - thank you very much for taking the time to answer!