Splunk Search

How to make a Search NOT append results from previous row if record already exists ?

neerajs_81
Contributor

Hi All,

We have a saved search (snippet below) which populates a CSV lookup file.  The search is scheduled to run daily.  

 

index=xyz
...
| stats latest(_time) as _time, latest(legacyUsername), latest(title), latest(email), latest(endDate), latest(firstname), latest(lastname),  by identity
| rename latest(*) as *
| eval identity = identity+"|"+email+"|"+legacyUsername

 


What's happening is,  every time the search  runs it creates duplicates as in each subsequent row appends the  results from the previous row to it for the "Identity" column as shown below.  
I get that my "eval identity" command is making it do it.  But how can i make it not create new records if the "identity" already exists once in the table?   Hope i am clear.   I tried "Dedup identity" that didn't work.  

 Result: 

IdentitytimelegacyuserNametitleemailenddatafirstlastname
1001|karen.woo@xyz.com|karen_wooxxxxxxxxxxxxxxKarenWoo
1001|karen.woo@xyz.com|karen_woo |karen.woo@xyz.com|karen_woo xxxxxxxxxxxxxxKarenWoo
1001|karen.woo@xyz.com|karen_woo |karen.woo@xyz.com|karen_woo |karen.woo@xyz.com|karen_woo xxxxxxxxxxxxxxKarenWoo
1001|karen.woo@xyz.com|karen_woo|karen.woo@xyz.com|karen_woo |karen.woo@xyz.com|karen_woo|karen.woo@xyz.com|karen_woo |karen.woo@xyz.com|karen_woo    xxKarenWoo
Labels (3)
Tags (1)
0 Karma
1 Solution

ITWhisperer
SplunkTrust
SplunkTrust

The issue is that you are regenerating the 1001|karen.woo@xyz.com from the original as well as the previously generated version, so try repeating the stats and rename to remove the duplicate

index=xyz
...
| stats latest(_time) as _time, latest(legacyUsername), latest(title), latest(email), latest(endDate), latest(firstname), latest(lastname),  by identity
| rename latest(*) as *
| eval identity=mvindex(split(identity,"|"),0)+"|"+email+"|"+legacyUsername
| stats latest(_time) as _time, latest(legacyUsername), latest(title), latest(email), latest(endDate), latest(firstname), latest(lastname),  by identity
| rename latest(*) as *

View solution in original post

ITWhisperer
SplunkTrust
SplunkTrust
| eval identity=mvindex(split(identity,"|"),0)+"|"+email+"|"+legacyUsername
0 Karma

neerajs_81
Contributor

@ITWhisperer   Tried that,  it appears to have stop appending to previous record  but the result is still showing duplicates as follows:  Basically it resolved the appending piece but is still creating a new record for the same Identity even though the identity exists.  How to make it not create duplicates further? 

IdentitytimelegacyuserNametitleemailenddatafirstlastname
1001|karen.woo@xyz.com|karen_wooxxxxxxxxxxxxxxKarenWoo
1001|karen.woo@xyz.com|karen_woo xxxxxxxxxxxxxxKarenWoo
1001|karen.woo@xyz.com|karen_wooxxxxxxxxxxxxxxKarenWoo
0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

The issue is that you are regenerating the 1001|karen.woo@xyz.com from the original as well as the previously generated version, so try repeating the stats and rename to remove the duplicate

index=xyz
...
| stats latest(_time) as _time, latest(legacyUsername), latest(title), latest(email), latest(endDate), latest(firstname), latest(lastname),  by identity
| rename latest(*) as *
| eval identity=mvindex(split(identity,"|"),0)+"|"+email+"|"+legacyUsername
| stats latest(_time) as _time, latest(legacyUsername), latest(title), latest(email), latest(endDate), latest(firstname), latest(lastname),  by identity
| rename latest(*) as *

neerajs_81
Contributor

Thank you very much.  Can you pls explain what does
 eval identity=mvindex(split(identity,"|"),0)+"|"+email+"|"+legacyUsername   do ?

I did lookup online on mvindex command but what exactly made you suggest split(identity,"|"),0) ?  





0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

split breaks the string up using the delimiter "|" into a multi-value field, mvindex takes one of the values, in this case, the first one since indexing begins at zero for this function. So, specifically, mvindex(split("1001|karen.woo@xyz.com|karen_woo","|"),0) equates to "1001". The line then appends the email and legacyUsername back again to recreate the identity.

0 Karma

gcusello
Legend

Hi @neerajs_81,

the only thing I can suppose is that you have more values in the latest fields, so try to use also email and legacyUsername as key in the BY cluase, something like this:

index=xyz
...
| stats 
   latest(_time) as _time
   latest(title) AS title
   latest(endDate) AS endDate
   latest(firstname) AS firstname
   latest(lastname) AS lastname
   by identity email legacyUsername
| eval identity = identity."|".email."|".legacyUsername

Ciao.

Giuseppe

neerajs_81
Contributor

Hi @gcusello ,  Thank you for responding. Tried what you suggested in the stats by clause,  but it didn't make any difference to the results.

0 Karma

neerajs_81
Contributor

Adding @ITWhisperer .
Thanks

0 Karma

isoutamo
SplunkTrust
SplunkTrust
Can you just check if identity in not null/“” (empty) and then add those to it?
Get Updates on the Splunk Community!

Improve Your Security Posture

Watch NowImprove Your Security PostureCustomers are at the center of everything we do at Splunk and security ...

Maximize the Value from Microsoft Defender with Splunk

 Watch NowJoin Splunk and Sens Consulting for this Security Edition Tech TalkWho should attend:  Security ...

This Week's Community Digest - Splunk Community Happenings [6.27.22]

Get the latest news and updates from the Splunk Community here! News From Splunk Answers ✍️ Splunk Answers is ...