Getting Data In

Lookup command distinct column values

matthewb4
Path Finder

I am trying to copy the counts of field X (already in logs) into a new field Y (from a lookup csv) so that they have the exact same counts but field Y has better named values.

The problem I am running into is that my csv file has multiple of the same X values in it so instead of field Y ending up with the same counts as field X, it is becoming (X * # of matched duplicates in csv). Is there any way for me to only get distinct rows from a lookup csv based on the matching field value (X in this case).

I am currently just using this... "lookup test.csv X AS X OUTPUTNEW Y AS Y"

And my csv file look something like this

Y      X      Z

y1  |  x1  |  z1
y1  |  x1  |  z2
y1  |  x1  |  z3
y2  |  x2  |  z1
y2  |  x2  |  z2
y3  |  x3  |  z1
y3  |  x3  |  z2
y3  |  x3  |  z3
y3  |  x3  |  z4
y3  |  x3  |  z5
y3  |  x3  |  z6

Given the previous csv file, if my counts for x1 in splunk logs were 1000, my new y1 value in field Y would get the value 3000. This is unwanted behavior for me, I would like for them to be the same. Any help on how I can modify my query is greatly appreciated.

0 Karma
1 Solution

woodcock
Esteemed Legend

Like this:

... | lookup test.csv X OUTPUTNEW Y | eval Y = mvindex(Y, 0) | ...

View solution in original post

woodcock
Esteemed Legend

Like this:

... | lookup test.csv X OUTPUTNEW Y | eval Y = mvindex(Y, 0) | ...

matthewb4
Path Finder

Great! This worked, do you mind explaining what mvindex is doing exactly? Each time outputnew is run for a specific Y value is it appending it to a multivalue list? I thought that it was simply adding to the current count, not creating a new index every time.

0 Karma

woodcock
Esteemed Legend

Don't forget to click Accept!

0 Karma

matthewb4
Path Finder

Thank you for the followup, I just have one more question. How was I to know that this field would become multivalued. If I omit the mvindex command, I do not see a comma delimited list or anything like that. I don't even see something like "y1, y1, y1".

0 Karma

woodcock
Esteemed Legend

A single value of X has multiple values of Y in your lookup file. How else could this be sensibly handled other than to create a multivalued output for Y. I guess the answer is "common sense" but maybe not totally obvious. I have requested Splunk update the documentation here:

http://docs.splunk.com/Documentation/Splunk/6.6.0/SearchReference/Lookup

0 Karma

woodcock
Esteemed Legend

The lookup is creating a multivalued field so we just take the first value and throw away the rest. Probably an even better solution would be to use | eval Y = mvdedup(Y) which would keep every distinct value of Y (which, if your lookup is correct should be only a single value, but if the lookup is broken, this version of the solution would give you some hope of discovering that).

0 Karma

matthewb4
Path Finder

Would there be any way for me to see this multivalued list as a sanity check. I only see the combined chunk count for each new Y value, which is limiting my comprehension a bit. I think why this is most troubling to me is that when you said " single value of X has multiple values of Y in your lookup file. How else could this be sensibly handled other than to create a multivalued output for Y?" In my case the multiple values of Y are actually all the exact same string for the same X, I'm guessing this is why it's not actually showing up as a list but being treated as one?

0 Karma

woodcock
Esteemed Legend

Sure, do this:

... lookup test.csv X OUTPUTNEW Y | nomv Y ...

This will flatten the list into a single value.

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

Index This | What travels the world but is also stuck in place?

April 2026 Edition  Hayyy Splunk Education Enthusiasts and the Eternally Curious!   We’re back with this ...

Discover New Use Cases: Unlock Greater Value from Your Existing Splunk Data

Realizing the full potential of your Splunk investment requires more than just understanding current usage; it ...

Continue Your Journey: Join Session 2 of the Data Management and Federation Bootcamp ...

As data volumes continue to grow and environments become more distributed, managing and optimizing data ...