Splunk Search

Grouping Data based on conditions on datetimes and counts over time!

andres91302
Communicator
Hello friends!
 
I am faced with a challenge I will be uploading two CSV files to splunk which represents two different table of information

Table A: Has the information of purchases in a clothing store with the variables: name of client, date of purchase, agent and product purchased during a period of time t.

NAME PRODUCT AGENT DATE_PURCHASE
KarenM_14X_18-25-20021 18:21:28
JeanM_78X_38-26-20021 18:11:06
JeanM_71X_48-26-20021 18:21:01
JeanM_64X_48-27-20021 20:21:59
KeithM_57X_48-27-20021 20:21:02
AlbaM_50X_18-28-20021 20:21:03
AlbaM_43X_38-29-20021 20:21:04
AlexM_36X_28-25-20021 20:21:05

Table B: Has the information of clients who have called the CX SERVICE line of the company during a period of time t and stores the variables name of client, date of call, and type of call.

NAME TYPE DATE_OF_CALL DATE_PURCHASE
KarenCOMPLAIN8-26-20021 18:21:288-25-20021 18:21:28
JeanCX_SERVICE8-27-20021 18:11:068-26-20021 18:11:06
JeanCOMPLAIN8-28-20021 18:21:018-26-20021 18:21:01
JeanCX_SERVICE8-29-20021 20:21:598-27-20021 20:21:59
KeithCX_SERVICE8-29-20021 20:21:028-27-20021 20:21:02
AlbaCOMPLAIN8-30-20021 20:21:038-28-20021 20:21:03
AlexCX_SERVICE8-25-20021 21:21:058-29-20021 20:21:04

 

I have to build a table in which It will be shown by NAME what was the very last product purchased by the customer prior to their very last call to the customer service line and it should include the variables: NAME ,LAST_PRODUCT_PURCHASED, AGENT, DATE_PURCHASE, TYPE, DATE_OF_CALL that table should look something like this:

RESULTS

NAME LAST_PRODUCT_PURCHASED AGENT DATE_PURCHASE TYPE DATE_OF_CALL
KarenM_14X_18-25-20021 18:21:28COMPLAIN8-26-20021 18:21:28
JeanM_64X_48-27-20021 20:21:59CX_SERVICE8-29-20021 20:21:59
KeithM_57X_48-27-20021 20:21:02CX_SERVICE8-29-20021 20:21:02
AlbaM_43X_38-29-20021 20:21:04COMPLAIN8-30-20021 20:21:03
AlexM_36X_28-25-20021 20:21:05CX_SERVICE8-25-20021 21:21:05

 

For example: The second raw shows the desired result as the very last product purchased by Jean was M-78 and her very last call on the line was a TYPE= CX_SERVICE with date 8-29-20021 20:21:59

I have been trying to come up with a solution but I finding myself needing of help. I kindly thank you for your assitance or tips, or reference to documentation that can help me achive my results. Thank you so much.

PD: What if we would try to add a column that counts how many time the custumer (NAME) has called prior to their most recent call on the line.

Thank you so MUCH! thanks a lot

 

Labels (1)
0 Karma
1 Solution

ITWhisperer
Ultra Champion
| makeresults 
| eval _raw="NAME,PRODUCT,AGENT,DATE_PURCHASE
Karen,M_14,X_1,8-25-2021 18:21:28
Jean,M_78,X_3,8-26-2021 18:11:06
Jean,M_71,X_4,8-26-2021 18:21:01
Jean,M_64,X_4,8-27-2021 20:21:59
Keith,M_57,X_4,8-27-2021 20:21:02
Alba,M_50,X_1,8-28-2021 20:21:03
Alba,M_43,X_3,8-29-2021 20:21:04
Alex,M_36,X_2,8-25-2021 20:21:05"
| multikv forceheader=1
| eval DATE_PURCHASE=strptime(DATE_PURCHASE,"%m-%d-%Y %H:%M:%S")
| append 
    [| makeresults 
    | eval _raw="NAME,TYPE,DATE_OF_CALL,DATE_PURCHASE
Karen,COMPLAIN,8-26-2021 18:21:28,8-25-2021 18:21:28
Jean,CX_SERVICE,8-27-2021 18:11:06,8-26-2021 18:11:06
Jean,COMPLAIN,8-28-2021 18:21:01,8-26-2021 18:21:01
Jean,CX_SERVICE,8-29-2021 20:21:59,8-27-2021 20:21:59
Keith,CX_SERVICE,8-29-2021 20:21:02,8-27-2021 20:21:02
Alba,COMPLAIN,8-30-2021 20:21:03,8-28-2021 20:21:03
Alex,CX_SERVICE,8-25-2021 21:21:05,8-29-2021 20:21:04"
    | multikv forceheader=1 
    | eval DATE_PURCHASE=strptime(DATE_PURCHASE,"%m-%d-%Y %H:%M:%S")
    | eval DATE_OF_CALL=strptime(DATE_OF_CALL,"%m-%d-%Y %H:%M:%S")
    ]
| fieldformat DATE_PURCHASE=strftime(DATE_PURCHASE,"%Y-%m-%d %H:%M:%S")
| fieldformat DATE_OF_CALL=strftime(DATE_OF_CALL,"%Y-%m-%d %H:%M:%S")
| fields - _* linecount
| eval _time=coalesce(DATE_OF_CALL,DATE_PURCHASE)
| sort _time
| streamstats last(PRODUCT) as PRODUCT last(AGENT) as AGENT by NAME
| where isnotnull(DATE_OF_CALL)
| stats latest(*) as * by NAME
| fields NAME PRODUCT AGENT DATE_PURCHASE TYPE DATE_OF_CALL

View solution in original post

ITWhisperer
Ultra Champion
| makeresults 
| eval _raw="NAME,PRODUCT,AGENT,DATE_PURCHASE
Karen,M_14,X_1,8-25-2021 18:21:28
Jean,M_78,X_3,8-26-2021 18:11:06
Jean,M_71,X_4,8-26-2021 18:21:01
Jean,M_64,X_4,8-27-2021 20:21:59
Keith,M_57,X_4,8-27-2021 20:21:02
Alba,M_50,X_1,8-28-2021 20:21:03
Alba,M_43,X_3,8-29-2021 20:21:04
Alex,M_36,X_2,8-25-2021 20:21:05"
| multikv forceheader=1
| eval DATE_PURCHASE=strptime(DATE_PURCHASE,"%m-%d-%Y %H:%M:%S")
| append 
    [| makeresults 
    | eval _raw="NAME,TYPE,DATE_OF_CALL,DATE_PURCHASE
Karen,COMPLAIN,8-26-2021 18:21:28,8-25-2021 18:21:28
Jean,CX_SERVICE,8-27-2021 18:11:06,8-26-2021 18:11:06
Jean,COMPLAIN,8-28-2021 18:21:01,8-26-2021 18:21:01
Jean,CX_SERVICE,8-29-2021 20:21:59,8-27-2021 20:21:59
Keith,CX_SERVICE,8-29-2021 20:21:02,8-27-2021 20:21:02
Alba,COMPLAIN,8-30-2021 20:21:03,8-28-2021 20:21:03
Alex,CX_SERVICE,8-25-2021 21:21:05,8-29-2021 20:21:04"
    | multikv forceheader=1 
    | eval DATE_PURCHASE=strptime(DATE_PURCHASE,"%m-%d-%Y %H:%M:%S")
    | eval DATE_OF_CALL=strptime(DATE_OF_CALL,"%m-%d-%Y %H:%M:%S")
    ]
| fieldformat DATE_PURCHASE=strftime(DATE_PURCHASE,"%Y-%m-%d %H:%M:%S")
| fieldformat DATE_OF_CALL=strftime(DATE_OF_CALL,"%Y-%m-%d %H:%M:%S")
| fields - _* linecount
| eval _time=coalesce(DATE_OF_CALL,DATE_PURCHASE)
| sort _time
| streamstats last(PRODUCT) as PRODUCT last(AGENT) as AGENT by NAME
| where isnotnull(DATE_OF_CALL)
| stats latest(*) as * by NAME
| fields NAME PRODUCT AGENT DATE_PURCHASE TYPE DATE_OF_CALL

View solution in original post

andres91302
Communicator

Thank you so much for your time and efford,  I cannot put into words how much your help means to me....I'd like to kindly ask you.... how would I have to change the code IF I were to create the tables A and B as the result of a prior search in splunk to later on build the desired result table...thanks a million again.

Lets the that Table A comes from the search

Index="A" | table NAME PRODUCT AGENT DATE_PURCHASE 

and B comes from
Index="B" | table NAME  TYPE  DATE_OF_CALL

0 Karma

ITWhisperer
Ultra Champion

Your table B search didn't table DATE_PURCHASE so I added that. I have assumed that the date fields in your data are formatted strings so included the parsing to convert to epoch times, but if your data already has epoch times, these can be omitted. Try this:

index="A" | table NAME PRODUCT AGENT DATE_PURCHASE 
| eval DATE_PURCHASE=strptime(DATE_PURCHASE,"%m-%d-%Y %H:%M:%S")
| append 
    [search index="B" | table NAME  TYPE  DATE_OF_CALL DATE_PURCHASE
    | eval DATE_PURCHASE=strptime(DATE_PURCHASE,"%m-%d-%Y %H:%M:%S")
    | eval DATE_OF_CALL=strptime(DATE_OF_CALL,"%m-%d-%Y %H:%M:%S")
    ]
| fieldformat DATE_PURCHASE=strftime(DATE_PURCHASE,"%Y-%m-%d %H:%M:%S")
| fieldformat DATE_OF_CALL=strftime(DATE_OF_CALL,"%Y-%m-%d %H:%M:%S")
| eval _time=coalesce(DATE_OF_CALL,DATE_PURCHASE)
| sort _time
| streamstats last(PRODUCT) as PRODUCT last(AGENT) as AGENT by NAME
| where isnotnull(DATE_OF_CALL)
| stats latest(*) as * by NAME
| fields NAME PRODUCT AGENT DATE_PURCHASE TYPE DATE_OF_CALL
0 Karma

andres91302
Communicator

Excellent! Thank so much you have helped me in ways you cant imagine thank you so so so much @ITWhisperer 

0 Karma

diogofgm
SplunkTrust
SplunkTrust

Hi,

Give this a try:

| inputlookup file1.csv
| inputlookup append=true file2.csv 
| stats last(*) AS * by NAME

Depending on the file events order you might need to include either a sort command to sort by dates or replace the last(*) with first(*)

------------
Hope I was able to help you. If so, an upvote would be appreciated.

andres91302
Communicator

hello @diogofgm  thank you for helping me , Im so sorry I am very new to your code what does the line: 

| stats last(*) AS * by NAME

is doing Thank you so much  man I hope you have a grat rets of the week, I have also been looking  for some of your replies on other posts to learn more from you! 

0 Karma

diogofgm
SplunkTrust
SplunkTrust

It will used the stats last to all available fields and rename it to the field name

if you have a table with “name, field1,field2” and you use that stats command you end up with a table with same fields but showing only the last  value of each field per name.

the other option would be to do:

|stats last(field1) AS field1 last(field2) AS field2 by name

 

------------
Hope I was able to help you. If so, an upvote would be appreciated.
0 Karma
.conf21 CFS Extended through 5/20!

Don't miss your chance
to share your Splunk
wisdom in-person or
virtually at .conf21!

Call for Speakers has
been extended through
Thursday, 5/20!