Solved: Dates and Datetime: How to correlate or reference ...

cindygibbs_08 · ‎05-19-2021

Hello guys I hope everyone is doing well

I would like to know is there is a way to code in Splunk how to show the earliest and latest event based on a datetime reference. In the Hotel chain where I work we deal with a lot of people who fraudulently create new accounts and use other people's credit cards or names in order to make a reservation and in our system we have a way to set up a "fraud alarm" what I'd like to do is find a way to create a table where I can see the date of the fraud alarm and the very call after and before that fraud alarm, please allow me to illustrate I can have a table like this one:

FRAUD_DATE	ID	RISK
2021-01-14 15:30:12	1	AGT
2021-02-03 18:46:59	1	JUI
2021-02-05 17:16:23	1	JUI
2021-11-26 21:15:14	3	AGT

and also a table like this:

ID	RESERVATION	DATE
1	UJS01	2021-01-13 15:30:12
1	UJS02	2021-01-14 16:30:12
2	UJS03	2021-02-03 18:46:59
1	UJS04	2021-02-06 14:00:23
2	UJS05	2021-02-03 18:46:59

This will be the table I will be looking for:
all IDs that have made a reservation and also have a fraud alarm and I have the date of the fraud alarm (related to that ID) with the date of the reservation right before the fraud alarm and the date of the reservation right after the fraud alarm

ID	RESERVATION_CODE_BEFORE	DATE_REV_BEFORE	FRAUD_ALARM	DATE_REV_AFTER	RESERVATION_CODE_AFTER
1	UJS01	2021-01-13 15:30:12	2021-01-14 15:30:12	2021-01-14 16:30:12	UJS02
1	UJS02	2021-01-14 16:30:12	2021-02-03 18:46:59	2021-02-06 14:00:23	UJS04
1	UJS02	2021-01-14 16:30:12	2021-02-05 17:16:23	2021-02-06 14:00:23	UJS04

If there is no prior or post date them N.A will be the best chose to tabulate, thank you so much to anyone that can help me out.

These are the codes that I can use to extract the information:

Fraud Alarms:
index="fraud"
| search "fraud_alarms"=*
| fields ID, DATE_FRAUD, RISK

Reservations
index="cx_reserv"
| search "pay_ok"=*
| fields ID, date_reser, code_rev

Kindly,

Cindy

PS: I was wondering if perhaps one could group all the reservation dates by ID and fraud date and then subtract all reservation dates to the fraud date and create a field with the result named "value" and after that select as the "very before reservation date" the one with the greatest negative value and as the very next reservation date the one with the lowest positive value.. but I dont know is splunk has its own way to do this is a faster way perhaps thank you so much everyone

ITWhisperer · ‎05-19-2021

| makeresults
| eval _raw="FRAUD_DATE	ID	RISK
2021-01-14 15:30:12	1	AGT
2021-02-03 18:46:59 	1	JUI
2021-02-05 17:16:23 	1	JUI
2021-11-26  21:15:14	3	AGT"
| multikv forceheader=1
| fields - _* linecount
| append 
    [| makeresults 
    | eval _raw="ID	RESERVATION	DATE
1	UJS01	2021-01-13 15:30:12
1	UJS02	2021-01-14 16:30:12
2	UJS03	2021-02-03 18:46:59 
1	UJS04	2021-02-06 14:00:23 
2	UJS05	2021-02-03 18:46:59"
    | multikv forceheader=1 
    | fields - _* linecount ]
| eval _time=coalesce(FRAUD_DATE,DATE)
| sort 0 - _time
| eval DATE_REV_AFTER=DATE
| eval RESERVATION_CODE_AFTER=RESERVATION
| streamstats last(DATE_REV_AFTER) as DATE_REV_AFTER last(RESERVATION_CODE_AFTER) as RESERVATION_CODE_AFTER by ID
| sort 0 _time
| eval DATE_REV_BEFORE=DATE
| eval RESERVATION_CODE_BEFORE=RESERVATION
| streamstats last(DATE_REV_BEFORE) as DATE_REV_BEFORE last(RESERVATION_CODE_BEFORE) as RESERVATION_CODE_BEFORE by ID
| where isnotnull(FRAUD_DATE)
| fillnull value="N/A" DATE_REV_BEFORE DATE_REV_AFTER
| table ID RESERVATION_CODE_BEFORE DATE_REV_BEFORE FRAUD_DATE DATE_REV_AFTER RESERVATION_CODE_AFTER

View solution in original post

ITWhisperer · ‎05-19-2021

| makeresults
| eval _raw="FRAUD_DATE	ID	RISK
2021-01-14 15:30:12	1	AGT
2021-02-03 18:46:59 	1	JUI
2021-02-05 17:16:23 	1	JUI
2021-11-26  21:15:14	3	AGT"
| multikv forceheader=1
| fields - _* linecount
| append 
    [| makeresults 
    | eval _raw="ID	RESERVATION	DATE
1	UJS01	2021-01-13 15:30:12
1	UJS02	2021-01-14 16:30:12
2	UJS03	2021-02-03 18:46:59 
1	UJS04	2021-02-06 14:00:23 
2	UJS05	2021-02-03 18:46:59"
    | multikv forceheader=1 
    | fields - _* linecount ]
| eval _time=coalesce(FRAUD_DATE,DATE)
| sort 0 - _time
| eval DATE_REV_AFTER=DATE
| eval RESERVATION_CODE_AFTER=RESERVATION
| streamstats last(DATE_REV_AFTER) as DATE_REV_AFTER last(RESERVATION_CODE_AFTER) as RESERVATION_CODE_AFTER by ID
| sort 0 _time
| eval DATE_REV_BEFORE=DATE
| eval RESERVATION_CODE_BEFORE=RESERVATION
| streamstats last(DATE_REV_BEFORE) as DATE_REV_BEFORE last(RESERVATION_CODE_BEFORE) as RESERVATION_CODE_BEFORE by ID
| where isnotnull(FRAUD_DATE)
| fillnull value="N/A" DATE_REV_BEFORE DATE_REV_AFTER
| table ID RESERVATION_CODE_BEFORE DATE_REV_BEFORE FRAUD_DATE DATE_REV_AFTER RESERVATION_CODE_AFTER

cindygibbs_08 · ‎05-19-2021

Amazing thank you so much my beloved gentleman @ITWhisperer I have a question and I am sorry if I'm taking up your time, is it possible to add a | stats command if I want to later on add some other variable like a comment from the fraud review team? Or what I should do is also append the search for this comments of the review team and then uses the streamstats command? Thank you so much my dear @ITWhisperer from the bottom of my heart I love you

ITWhisperer · ‎05-19-2021

Hi @cindygibbs_08

Essentially what is happening here is we are copying the previous date and code and next date and code from the reservation events to the fraud event and then just keeping the fraud events, so any additional fields in the fraud events would be preserved (you would just need to include the fields you wanted). If the comments are from different events in a different index, you would need to be able to correlate the comments with the fraud event and either use append and stats in a similar way or use a join. I hope that makes sense.

cindygibbs_08 · ‎05-20-2021

@ITWhisperer Got you! Thank you so much

cindygibbs_08 · ‎05-21-2021

Hello @ITWhisperer I hope you are doing well, I decided to make some modifications and check how to apply it to my data but I realized that when I modified to only show me the very next reservation made after the each fraud date is actually not working for me (I must be doing something wrong because your code works like a charm) I intentionally create a reservation one hour after each fraud date and only use one ID which is (1) but the results are not correct, I use this code:

| makeresults
| eval _raw="FRAUD_DATE	ID	RISK
2021-01-10 15:30:12	    1	AGT
2021-02-03 18:46:59 	1	JUI
2021-02-05 17:16:23 	1	JUI
2021-11-26  21:15:14	1	AGT"
| multikv forceheader=1
| fields - _* linecount
| append 
    [| makeresults 
    | eval _raw="ID	RESERVATION	DATE
1	UJSXX   2021-01-10 16:30:12	
1	UJS02	2021-02-03 19:46:59
1	UJS03	2021-02-05 18:16:23
1	UJS04	2021-11-26 22:15:14
1	UJS05	2021-02-03 18:46:59"
    | multikv forceheader=1 
    | fields - _* linecount ]
    | eval time=coalesce(FRAUD_DATE,DATE)
| eval time=strptime(time,"%Y-%m-%d %H:%M:%S")
| eval time=strftime(time,"%Y-%m-%d %H:%M:%S")
| eval _time=time
| sort 0 - _time
| eval DATE_REV_AFTER=DATE
| eval RESERVATION_CODE_AFTER=RESERVATION
| streamstats last(DATE_REV_AFTER) as DATE_REV_AFTER last(RESERVATION_CODE_AFTER) as RESERVATION_CODE_AFTER by ID
| where isnotnull(FRAUD_DATE) AND isnotnull(DATE_REV_AFTER)
| table ID FRAUD_DATE RISK DATE_REV_AFTER RESERVATION_CODE_AFTER
| sort FRAUD_DATE

yields the following:

ID	FRAUD_DATE	RISK	 DATE_REV_AFTER	    RESERVATION_CODE_AFTER
1	2021-02-03 18:46:59 	JUI	     2021-02-03 19:46:59	UJS02
1	2021-02-05 17:16:23 	JUI	     2021-02-05 18:16:23	UJS03
1	2021-11-26  21:15:14	AGT	     2021-11-26 22:15:14	UJS04

The very first fraud date is

2021-01-10 15:30:12

and this ID made a reservation one hour later with rev code UJSXX, I will be so thankful if you could let me know what would be the proper way to adapt the code to properly show only the very next call after every single one fraud alarm.. if you have time of course, I apologize if I am being too annoying by now..
kindly,

Cindy G.

ITWhisperer · ‎05-23-2021

The "problem" is with the way the data is set up. Sometimes multikv doesn't interpret the data correctly, especially when space/tab delimiters are inferred. Changing the delimiters to commas gives the missing result

| makeresults
| eval _raw="FRAUD_DATE,ID,RISK
2021-01-10 15:30:12,1,AGT
2021-02-03 18:46:59,1,JUI
2021-02-05 17:16:23,1,JUI
2021-11-26 21:15:14,1,AGT"
| multikv forceheader=1
| fields - _* linecount
| append 
    [| makeresults 
    | eval _raw="ID,RESERVATION,DATE
1,UJSXX,2021-01-10 16:30:12	
1,UJS02,2021-02-03 19:46:59
1,UJS03,2021-02-05 18:16:23
1,UJS04,2021-11-26 22:15:14
1,UJS05,2021-02-03 18:46:59"
    | multikv forceheader=1 
    | fields - _* linecount ]
| eval time=coalesce(FRAUD_DATE,DATE)
| eval time=strptime(time,"%Y-%m-%d %H:%M:%S")
| eval _time=time
| eval time=strftime(time,"%Y-%m-%d %H:%M:%S")
| sort 0 - _time
| eval DATE_REV_AFTER=DATE
| eval RESERVATION_CODE_AFTER=RESERVATION
| streamstats last(DATE_REV_AFTER) as DATE_REV_AFTER last(RESERVATION_CODE_AFTER) as RESERVATION_CODE_AFTER by ID
| where isnotnull(FRAUD_DATE) AND isnotnull(DATE_REV_AFTER)
| table ID FRAUD_DATE RISK DATE_REV_AFTER RESERVATION_CODE_AFTER
| sort FRAUD_DATE

Also, I moved the assignment to _time after the strptime and before the strftime to make it an epoch time value instead of a string. Given the format used, this doesn't actually matter since values in the string represent decreasing magnitudes of times e.g. years before months before days etc., but from a purist point of view, it would be better to sort using the epoch time value rather than the string.

cindygibbs_08 · ‎05-23-2021

Amazing thank you so much @ITWhisperer from the bottom of my heart

Dates and Datetime: How to correlate or reference events based on date time

summary indexing

Splunk Decoded: Service Maps vs Service Analyzer Tree View vs Flow Maps

What’s New in Splunk Observability – September 2025

Fun with Regular Expression - multiples of nine

Are you a member of the Splunk Community?

Dates and Datetime: How to correlate or reference events based on date time

summary indexing

Splunk Decoded: Service Maps vs Service Analyzer Tree View vs Flow Maps

What’s New in Splunk Observability – September 2025

Fun with Regular Expression - multiples of nine