Splunk Search

How to make a Search NOT append results from previous row if record already exists ?

neerajs_81
Builder

Hi All,

We have a saved search (snippet below) which populates a CSV lookup file.  The search is scheduled to run daily.  

 

index=xyz
...
| stats latest(_time) as _time, latest(legacyUsername), latest(title), latest(email), latest(endDate), latest(firstname), latest(lastname),  by identity
| rename latest(*) as *
| eval identity = identity+"|"+email+"|"+legacyUsername

 


What's happening is,  every time the search  runs it creates duplicates as in each subsequent row appends the  results from the previous row to it for the "Identity" column as shown below.  
I get that my "eval identity" command is making it do it.  But how can i make it not create new records if the "identity" already exists once in the table?   Hope i am clear.   I tried "Dedup identity" that didn't work.  

 Result: 

IdentitytimelegacyuserNametitleemailenddatafirstlastname
1001|karen.woo@xyz.com|karen_wooxxxxxxxxxxxxxxKarenWoo
1001|karen.woo@xyz.com|karen_woo |karen.woo@xyz.com|karen_woo xxxxxxxxxxxxxxKarenWoo
1001|karen.woo@xyz.com|karen_woo |karen.woo@xyz.com|karen_woo |karen.woo@xyz.com|karen_woo xxxxxxxxxxxxxxKarenWoo
1001|karen.woo@xyz.com|karen_woo|karen.woo@xyz.com|karen_woo |karen.woo@xyz.com|karen_woo|karen.woo@xyz.com|karen_woo |karen.woo@xyz.com|karen_woo    xxKarenWoo
Labels (3)
Tags (1)
0 Karma
1 Solution

ITWhisperer
SplunkTrust
SplunkTrust

The issue is that you are regenerating the 1001|karen.woo@xyz.com from the original as well as the previously generated version, so try repeating the stats and rename to remove the duplicate

index=xyz
...
| stats latest(_time) as _time, latest(legacyUsername), latest(title), latest(email), latest(endDate), latest(firstname), latest(lastname),  by identity
| rename latest(*) as *
| eval identity=mvindex(split(identity,"|"),0)+"|"+email+"|"+legacyUsername
| stats latest(_time) as _time, latest(legacyUsername), latest(title), latest(email), latest(endDate), latest(firstname), latest(lastname),  by identity
| rename latest(*) as *

View solution in original post

ITWhisperer
SplunkTrust
SplunkTrust
| eval identity=mvindex(split(identity,"|"),0)+"|"+email+"|"+legacyUsername
0 Karma

neerajs_81
Builder

@ITWhisperer   Tried that,  it appears to have stop appending to previous record  but the result is still showing duplicates as follows:  Basically it resolved the appending piece but is still creating a new record for the same Identity even though the identity exists.  How to make it not create duplicates further? 

IdentitytimelegacyuserNametitleemailenddatafirstlastname
1001|karen.woo@xyz.com|karen_wooxxxxxxxxxxxxxxKarenWoo
1001|karen.woo@xyz.com|karen_woo xxxxxxxxxxxxxxKarenWoo
1001|karen.woo@xyz.com|karen_wooxxxxxxxxxxxxxxKarenWoo
0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

The issue is that you are regenerating the 1001|karen.woo@xyz.com from the original as well as the previously generated version, so try repeating the stats and rename to remove the duplicate

index=xyz
...
| stats latest(_time) as _time, latest(legacyUsername), latest(title), latest(email), latest(endDate), latest(firstname), latest(lastname),  by identity
| rename latest(*) as *
| eval identity=mvindex(split(identity,"|"),0)+"|"+email+"|"+legacyUsername
| stats latest(_time) as _time, latest(legacyUsername), latest(title), latest(email), latest(endDate), latest(firstname), latest(lastname),  by identity
| rename latest(*) as *

neerajs_81
Builder

Thank you very much.  Can you pls explain what does
 eval identity=mvindex(split(identity,"|"),0)+"|"+email+"|"+legacyUsername   do ?

I did lookup online on mvindex command but what exactly made you suggest split(identity,"|"),0) ?  





0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

split breaks the string up using the delimiter "|" into a multi-value field, mvindex takes one of the values, in this case, the first one since indexing begins at zero for this function. So, specifically, mvindex(split("1001|karen.woo@xyz.com|karen_woo","|"),0) equates to "1001". The line then appends the email and legacyUsername back again to recreate the identity.

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @neerajs_81,

the only thing I can suppose is that you have more values in the latest fields, so try to use also email and legacyUsername as key in the BY cluase, something like this:

index=xyz
...
| stats 
   latest(_time) as _time
   latest(title) AS title
   latest(endDate) AS endDate
   latest(firstname) AS firstname
   latest(lastname) AS lastname
   by identity email legacyUsername
| eval identity = identity."|".email."|".legacyUsername

Ciao.

Giuseppe

neerajs_81
Builder

Hi @gcusello ,  Thank you for responding. Tried what you suggested in the stats by clause,  but it didn't make any difference to the results.

0 Karma

neerajs_81
Builder

Adding @ITWhisperer .
Thanks

0 Karma

isoutamo
SplunkTrust
SplunkTrust
Can you just check if identity in not null/“” (empty) and then add those to it?
Get Updates on the Splunk Community!

Splunk Observability Cloud | Unified Identity - Now Available for Existing Splunk ...

Raise your hand if you’ve already forgotten your username or password when logging into an account. (We can’t ...

Index This | How many sides does a circle have?

February 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

Registration for Splunk University is Now Open!

Are you ready for an adventure in learning?   Brace yourselves because Splunk University is back, and it's ...