Re: Alternatives to Lookup

aamirs291 · ‎05-09-2018

Everyone,

The events on splunk for me have data in the following format :

ticket_num,actual_start_time,finish_time,assigned_to.

For Example :

A particular ticket number IN1234 has a start time of "January 1 2018" and finish time of "January 5 2018" along with whom the ticket was assigned to, for example, "A". This particular ticket may have been worked by "A" and also by "B" and "C". "A" might have charged 5 hours to the ticket, "B" - 3 hours and "C" - 2 hours.

The file consisting of the hours charged by "A","B" and "C" is in the format of :

"Resource Name","Date Charged in mm/dd/yyyy" ,"Hours Charged","Ticket Number"

"A",01/02/2018,5,IN1234
"B",01/04/2018,3,IN1234
"C",01/05/2018,2,IN1234

The current approach I am following to utilize the hours charged values is to :

1) Since IN1234 is only going to be present once in the indexed data (one event); I use the ticket_num to lookup with the file mentioned above.
2) I get a multivalued field like below :

| table ticket_num name date_charged effort

IN1234 "A" 01/02/2018 5
"B" 01/04/2018 3
"C" 01/05/2018 2

3) I do an mvzip --> | eval Test = mvzip(name,date_charged) -->
"A",01/02/2018
"B",01/04/2018
"C",01/05/2018

4) I do another mvzip -- | eval Test = mvzip(Test,effort) -->
"A",01/02/2018,5
"B",01/04/2018,3
"C",01/05/2018,2

5) I do a mvexpand on Test, so now I have 3 events like the following

|table ticket_num Test
IN1234 "A",01/02/2018,5
IN1234 "B",01/04/2018,3
IN1234 "C",01/05/2018,2

6) I use Split on Test and use mvindex to assign values
| eval Split = split(Test,",")
| eval name = mvindex(Split,0)
| eval date_charged = mvindex(Split,1)
| eval effort = mvindex(Split,2)

Using the above I can now use the data I retrieved from the lookup.

I wanted to know if there was a better alternative for the "Lookup" approach used above as there are many restrictions to this method, slower searches with an increase in tickets being one of them.

Let me know.

somesoni2 · ‎05-09-2018

Assuming that file that contains hours charged is a lookup table file and you're using | lookup yourlookupname.csv.. to do the lookup, that should be fastest way.

aamirs291 · ‎05-10-2018

Thank you somesoni2. Yes the hours charged are in a lookup table.

Just to clarify I wanted to know if there was any other way to accomplish what I am doing above, but without using lookups. If there isnt then I will stick to this approach.

woodcock · ‎05-09-2018

There is a more efficient way to breakout data after the mvexpand but you have the right ideas. I would do it like this:

|makeresults | eval _raw="01/02/2018 01/04/2018 01/05/2018,A B C,5 3 2"
| eval tikcet_num="IN1234"
| rex "^(?<date_changed>[^,]+),(?<name>[^,]+),(?<effort>[^,]+)$"
| makemv name
| makemv date_changed
| makemv effort
| fields - _*

| rename COMMENT AS "Everything above generates sample event data;everything below is your solution)"

| eval raw=mvzip(mvzip(date_changed, name), effort)
| table tikcet_num raw
| mvexpand raw
| rename raw AS _raw
| rex "^(?<date_changed>[^,]+),(?<name>[^,]+),(?<effort>[^,]+)$"

aamirs291 · ‎05-10-2018

Thank you woodcock.

There seems to be slight improvement in speed when I use rex instead of split. I think I will use rex since you would need to write lesser code.

As mentioned in my comment to somesoni2, for the scenario mentioned above is retrieving data from the lookup table the fastest way ? Let me know.

woodcock · ‎05-10-2018

You could index the events and pull from a search, but if your data is correct in the lookups, I'd keep it that way.

Alternatives to Lookup

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Casting Call: Compete in Cyber Games

Data Management Digest – June 2026

Think Like an Architect: Introducing the Splunk Certified Cybersecurity Defense ...

Join the Conversation