Re: Flattening Hierarchical Relationships without ...

ErikaE · ‎06-18-2018

I'm looking for an efficient way to build multiple parent child event pairs into a flat string of parent,children,grandchildren etc. I'd prefer not to use join if possible.

I have a set of data representing parent child relationships for process steps completed. The source data does not (consistently) maintain a lineage for any particular iteration, and it's possible to fork a single parent into many children, or for two parents to converge to one child.

For example:

Parent   Child
AM       BU
AK       BT 
BU       CY
BT       CX 
CY       D0
CY       DZ
CX       D9
CX       D8
CX       D1
CX       D0
CX       D2

Each row in that table represents a simplified event in my summary index, which contains 2M+ event records.

The ideal output would be a mv or delimited field with one entry per 'step' down the hierarchy, for ex:

AM, BU, CY, D0 
AM, BU, CY, DZ 
AM, BT, CX, D0 
AK, BT, CX, D8

Note that sometimes two processes can also converge to a final step, i.e. in the example of D0. In this case there are two separate processes for D0 that share an identifier but have slightly different properties.

The heirarchy can have different levels based on what fields are in it and there's not a consistent nomenclature for the parent child ids.

There is a field which summarizes the steps taken, i.e. something link ABCD, but not every processing step generates a new parent child pairing. Building a dictionary for that field nomenclature would be possible via a lookup - but then would have to be maintained.

I had done something that mostly works with iterative joins but it's messy and slow. Transaction is also slow and seems to be dropping events. I'd also like something than can continuously update itself to the summary index (via report/alert) as new parent/child events come in.

So for ex, if a new event came in with Parent=D0 and Child=E9
The two associated hierarchies would go :

AM, BU, CY, D0, E9
AM, BT, CX, D0, E9

niketn · ‎06-19-2018

@ErikaE, once you have the required relation in the table what is the next thing you would want to do?

Have you check out Force Directed Visualization app which would directly plot the relation from your source table in the question. Following is a query that generates dummy data as per the question:

| makeresults
| eval data="AM,BU;AK,BT;BU,CY;BT,CX;CY,D0;CY,DZ;CX,D9;CX,D8;CX,D1;CX,D0;CX,D2"
| makemv data delim=";"
| mvexpand data
| makemv data delim=","
| eval Parent=mvindex(data,0),Child=mvindex(data,1)
| table Parent Child

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

ErikaE · ‎06-19-2018

@niketninlay

Is there a way to take that visual and output a set of strings describing all the unique paths? Or be able to pull info through nodes? I'll definitely check the app out!

For each entry in the table above there is metadata or other field-value pairs that add additional context. Some of that data doesn't get propagated as the items move through the process. If I can propagate the context info through all the parent child pairs I can classify outcomes to answer the business question I've been working on.

Hope that made sense. The example data I provided was just one set of steps for one complete process, there would be thousands in a 24 hour period so it'd be difficult to use graphical representation to see all the data.

Flattening Hierarchical Relationships without transaction or join

How to Monitor Google Kubernetes Engine (GKE)

Index This | How can you make 45 using only 4?

Splunk Education Goes to Washington | Splunk GovSummit 2024