Everyone,
The events on splunk for me have data in the following format :
ticket_num,actual_start_time,finish_time,assigned_to.
For Example :
A particular ticket number IN1234 has a start time of "January 1 2018" and finish time of "January 5 2018" along with whom the ticket was assigned to, for example, "A". This particular ticket may have been worked by "A" and also by "B" and "C". "A" might have charged 5 hours to the ticket, "B" - 3 hours and "C" - 2 hours.
The file consisting of the hours charged by "A","B" and "C" is in the format of :
"Resource Name","Date Charged in mm/dd/yyyy" ,"Hours Charged","Ticket Number"
"A",01/02/2018,5,IN1234
"B",01/04/2018,3,IN1234
"C",01/05/2018,2,IN1234
The current approach I am following to utilize the hours charged values is to :
1) Since IN1234 is only going to be present once in the indexed data (one event); I use the ticket_num to lookup with the file mentioned above.
2) I get a multivalued field like below :
| table ticket_num name date_charged effort
IN1234 "A" 01/02/2018 5
"B" 01/04/2018 3
"C" 01/05/2018 2
3) I do an mvzip --> | eval Test = mvzip(name,date_charged) -->
"A",01/02/2018
"B",01/04/2018
"C",01/05/2018
4) I do another mvzip -- | eval Test = mvzip(Test,effort) -->
"A",01/02/2018,5
"B",01/04/2018,3
"C",01/05/2018,2
5) I do a mvexpand on Test, so now I have 3 events like the following
|table ticket_num Test
IN1234 "A",01/02/2018,5
IN1234 "B",01/04/2018,3
IN1234 "C",01/05/2018,2
6) I use Split on Test and use mvindex to assign values
| eval Split = split(Test,",")
| eval name = mvindex(Split,0)
| eval date_charged = mvindex(Split,1)
| eval effort = mvindex(Split,2)
Using the above I can now use the data I retrieved from the lookup.
I wanted to know if there was a better alternative for the "Lookup" approach used above as there are many restrictions to this method, slower searches with an increase in tickets being one of them.
Let me know.
Assuming that file that contains hours charged is a lookup table file and you're using | lookup yourlookupname.csv..
to do the lookup, that should be fastest way.
Thank you somesoni2. Yes the hours charged are in a lookup table.
Just to clarify I wanted to know if there was any other way to accomplish what I am doing above, but without using lookups. If there isnt then I will stick to this approach.
There is a more efficient way to breakout data after the mvexpand
but you have the right ideas. I would do it like this:
|makeresults | eval _raw="01/02/2018 01/04/2018 01/05/2018,A B C,5 3 2"
| eval tikcet_num="IN1234"
| rex "^(?<date_changed>[^,]+),(?<name>[^,]+),(?<effort>[^,]+)$"
| makemv name
| makemv date_changed
| makemv effort
| fields - _*
| rename COMMENT AS "Everything above generates sample event data;everything below is your solution)"
| eval raw=mvzip(mvzip(date_changed, name), effort)
| table tikcet_num raw
| mvexpand raw
| rename raw AS _raw
| rex "^(?<date_changed>[^,]+),(?<name>[^,]+),(?<effort>[^,]+)$"
Thank you woodcock.
There seems to be slight improvement in speed when I use rex instead of split. I think I will use rex since you would need to write lesser code.
As mentioned in my comment to somesoni2, for the scenario mentioned above is retrieving data from the lookup table the fastest way ? Let me know.
You could index the events and pull from a search, but if your data is correct in the lookups, I'd keep it that way.