Re: Filter data from MultiValue field

Thulasinathan_M · ‎12-09-2024

Hi Splunk Experts,

I'v been trying to apply three condition, but I'm bit complicating. So would like to have some inputs.
I have a runtime search which will produce three fields Category, Data, Percent and I join/ append some data from lookup using User. The lookup has multi-value fields which are prefixed with Lookup.

User	Category	Data	Percent	LookupCategory	LookupData	LookupPercent	LookupND1	LookupND2
User094	103	2064	3.44	101 102 104	7865 4268 1976	7.10 3.21 3.56	4.90 2.11 3.10	2.20 1.10 0.46
User871	102	5108	5.58	103	3897	7.31	5.23	2.08
User131	104	664	0.71	103 104 105	2287 1576 438	0.22 0.30 0.82	0.11 0.08 0.50	0.11 0.02 0.32
User755	104	1241	1.23	102 104	4493 975	0.97 1.12	0.42 1.01	0.55 0.11

My conditions are as follow:
1. Use Precedence Category if it's greater than current Category.
For Ex below dataset: The Category is 103, I have to check which is the max(LookupPercent) between 101 to 103 and use it if the value in (101 or 102) is greater than 103.

User094

103

2064

3.44

101
102
104

7865
4268
1976

7.10
3.21
3.56

4.90
2.11
3.10

2.20
1.10
0.46

2. Ignore if the LookupCategory has no CategoryValue equal to or greater than

In below case Category is 102, but the lookup has only 103, but no data between 101 to 102. So ignore.

User871

102

5108

5.58

103

3897

7.31

5.23

2.08

3. If the Lookup Current Category Percent is lesser than immediate following category, then find abs difference of Current Category with lookup Category and immediate following Category using Data field and if immediate following is near then use immediate following category.
LookupCategory 104's Percent 0.30 is less than 105's 0.82. So as further step abs(664 - 1576) and abs(664 - 438), as (664 - 438) is less than (664 - 1576), the 105's row data should be filtered/ used.

User131

104

664

0.71

103
104
105

2287
1576
438

0.22
0.30
0.82

0.11
0.08
0.50

0.11
0.02
0.32

4. Straight forward, none of above condition matches Same lookupCatagory 104's row should be used for Category 104.

User755

104

1241

1.23

102
104

4493
975

0.97
1.12

0.42
1.01

0.55
0.11

PickleRick · ‎12-10-2024

OK. Back up a little.

Where does this data come from? You seem to have multiple multivalued fields. That might be a problem because with Splunk there is no implied relationship between those fields whatsoever so values in one multivalued field do not have to be connected with values in another multivalued field. And their order doesn't need to match the order in another multivalued field.

Take this for example:

| makeresults format=csv data="a,b,c
a,,c
,b,c
a,b"

It will give you this:

a	b	c
a		c
	b	c
a	b

But if you try to "squeeze" it into multivalued fields by doing

| stats list(*) as *

You'll get

a	b	c
a a	b b	c c

These don't match the "layout" of the input data for the stats command.

So be extremely cautious when handling multivalued fields because you might get completely different values from what you expect.

bowesmana · ‎12-09-2024

I don't quite follow your logic, but your solution will probably require mv eval functions and/or foreach.

e.g. you can find the Category index into your LookupCategory something like this

| eval c=0
| foreach mode=multivalue LookupCategory [ eval mv_match=case(Category=<<ITEM>>, c, Category><<ITEM>>, -c, true(), mv_match), c=c+1 ]

i.e a positive result of mv_match means the MV index of an exact match (offsets from 0). A negative mv_match result indicates the last LookupCategory that Category was > than and an empty result means Category was never greater than any LookupCategory.

Then with that knowledge you can mvindex() the other MV values based on your needs, e.g. abs(mv_match)

dural_yyz · ‎12-10-2024

I hope I can find this trick in the future if I ever need this.

bowesmana · ‎12-10-2024

Note @PickleRick response though about relationship between MV values in different fields. You can easily find yourself in a world of MV pain if you're not careful. You need to KNOW your data well to start correlating MV fields using an index/offset.

If there is a null() or empty value in any of the MV fields then the index offset will be out of alignment.

dural_yyz · ‎12-09-2024

Ok that's way too much logic for me to follow on a Monday morning before I have even had coffee. I would split the fields into mv unique options. Then start evaluating a new field based upon your logic flow. Anything with a TRUE outcome can be your final results.

Filter data from MultiValue field

lookup

Dashboards: Hiding charts while search is being executed and other uses for tokens

Splunk Observability Cloud's AI Assistant in Action Series: Explaining Metrics and ...

Brains, Bytes, and Boston: Learn from the Best at .conf25

Are you a member of the Splunk Community?

Filter data from MultiValue field

lookup

Dashboards: Hiding charts while search is being executed and other uses for tokens

Splunk Observability Cloud's AI Assistant in Action Series: Explaining Metrics and ...

Brains, Bytes, and Boston: Learn from the Best at .conf25