Hello my gorgeous people from this amazing community,
I've been trying to solve this problem that I have but I couldn't find a way...
I have to create a category (a new field named "PROMO") that will have only two values "YES" ,"NO" I work for a hotel chain and I want to know if a customer in the hotel has called the company hotline at least one time within the last month prior to the reservation date. I have the reservation dates and ID by customers and I can also generate the calls on the hotline but I have no work around as to how to classify this new field 😞
To illustrate if this is the data for the HOTLINE
ID_CX | DATE_CALL |
BETHANY | 2020-04-15T09:05:49-04:00 |
ALEX | 2020-04-21T07:25:15-04:00 |
ID_CX | DATE_BOOKING | BOOKING_REF |
BETHANY | 2020-04-26T09:05:49-04:00 | TY873 |
ALEX | 2020-03-28T07:25:15-04:00 | UJU4 |
Then my expected results will be:
ID_CX | BOOKING_REF | PROMO |
BETHANY | TY873 | YES |
ALEX | UJU4 | NO |
This is because Bethany in fact called on the line 9 days prior to her booking (TY873) but Alex's call was actually AFTER his booking thats why he gets classify as "NO"
I feel like my main challenge is how to properly write my code so that Splunk looks for dates (if there are any) BEFORE the booking date so that splunk can decide if there is a 30 days length of difference between them... and also I have the challenge to write the code to classify them.. Since this data is unstructured (as the information comes from different events and a cx may or many have not called on the line) I dont know what would be the proper way to count for this information..
please dont judge... but I have been trying out this code:
|multisearch
[| search index ="hotline" | fields CX_ID CALL_DATE]
[| search index ="bookings" | fields CX_ID DATE_BOOKING CX_ID]
| stats values(CALL_DATE) as HOTLINE values(DATE_BOOKING ) as DATE_BOOKING by ID_CX DATE_BOOKING
| eval PROMO=if(...)
so bacsically I have not been able to classify because when I ask splunk to subtract the times I get nothing... And I don't know how to code that it has to look for the dates prior to the date of bookings...
Thank you so much to all the people that can help me out on this one I am truly grateful for that...
kindly,
Cindy
Search both indexes, parse the date fields into the _time field, sort and copy down the latest call time, then compare.
index="hotline" OR index="bookings"
| eval _time=if(isnotnull(DATE_CALL),strptime(DATE_CALL,"%Y-%m-%dT%H:%M:%S%:z"),if(isnotnull(DATE_BOOKING),strptime(DATE_BOOKING,"%Y-%m-%dT%H:%M:%S%:z"),_time))
| sort ID_CX _time
| streamstats latest(DATE_CALL) as last_call by ID_CX
| eval promo=if(isnotnull(BOOKING_REF),if(_time-strptime(last_call,"%Y-%m-%dT%H:%M:%S%:z")<30*24*60*60,"YES","NO"),null)
@ITWhisperer 1) Why you always gotta do everythig right? 2) Thanks A LOT my dear gentelman.... 3) Why are we using streamstats instead of stats.. this is the first time I heard of that command.. Thank you so so much this solved my issue! thanks a lot...
To answer question 3 - the way splunk processes the data can be thought of as a queue or stream of events being passed along a chain of processes; so, the sort (particularly by _time in this instance - strictly speaking, the sort need only be done by _time because the streamstats is grouping by id) gathers all the events together and, once it has determined the order, sends them down the chain (pipeline) to the next process. The streamstats processes the stream of events as they are passing through, so the latest function will return the latest value (for the current id) it has seen so far and add it to each event as it passes. This combination makes sure that when the booking event comes along, it will have the last call time, allowing you to determine whether the promo can be applied.
omg! Your explanation was everything I needed to undertand this challenge and the code you gave me, thank you so much @ITWhisperer like trully... thanks I feel like I own you at least a thousand hugs! THANKS A LOTTT INDEED
Search both indexes, parse the date fields into the _time field, sort and copy down the latest call time, then compare.
index="hotline" OR index="bookings"
| eval _time=if(isnotnull(DATE_CALL),strptime(DATE_CALL,"%Y-%m-%dT%H:%M:%S%:z"),if(isnotnull(DATE_BOOKING),strptime(DATE_BOOKING,"%Y-%m-%dT%H:%M:%S%:z"),_time))
| sort ID_CX _time
| streamstats latest(DATE_CALL) as last_call by ID_CX
| eval promo=if(isnotnull(BOOKING_REF),if(_time-strptime(last_call,"%Y-%m-%dT%H:%M:%S%:z")<30*24*60*60,"YES","NO"),null)