Splunk Search

How to make a Search NOT append results from previous row if record already exists ?

neerajs_81
Builder

Hi All,

We have a saved search (snippet below) which populates a CSV lookup file.  The search is scheduled to run daily.  

 

index=xyz
...
| stats latest(_time) as _time, latest(legacyUsername), latest(title), latest(email), latest(endDate), latest(firstname), latest(lastname),  by identity
| rename latest(*) as *
| eval identity = identity+"|"+email+"|"+legacyUsername

 


What's happening is,  every time the search  runs it creates duplicates as in each subsequent row appends the  results from the previous row to it for the "Identity" column as shown below.  
I get that my "eval identity" command is making it do it.  But how can i make it not create new records if the "identity" already exists once in the table?   Hope i am clear.   I tried "Dedup identity" that didn't work.  

 Result: 

IdentitytimelegacyuserNametitleemailenddatafirstlastname
1001|karen.woo@xyz.com|karen_wooxxxxxxxxxxxxxxKarenWoo
1001|karen.woo@xyz.com|karen_woo |karen.woo@xyz.com|karen_woo xxxxxxxxxxxxxxKarenWoo
1001|karen.woo@xyz.com|karen_woo |karen.woo@xyz.com|karen_woo |karen.woo@xyz.com|karen_woo xxxxxxxxxxxxxxKarenWoo
1001|karen.woo@xyz.com|karen_woo|karen.woo@xyz.com|karen_woo |karen.woo@xyz.com|karen_woo|karen.woo@xyz.com|karen_woo |karen.woo@xyz.com|karen_woo    xxKarenWoo
Labels (3)
Tags (1)
0 Karma
1 Solution

ITWhisperer
SplunkTrust
SplunkTrust

The issue is that you are regenerating the 1001|karen.woo@xyz.com from the original as well as the previously generated version, so try repeating the stats and rename to remove the duplicate

index=xyz
...
| stats latest(_time) as _time, latest(legacyUsername), latest(title), latest(email), latest(endDate), latest(firstname), latest(lastname),  by identity
| rename latest(*) as *
| eval identity=mvindex(split(identity,"|"),0)+"|"+email+"|"+legacyUsername
| stats latest(_time) as _time, latest(legacyUsername), latest(title), latest(email), latest(endDate), latest(firstname), latest(lastname),  by identity
| rename latest(*) as *

View solution in original post

ITWhisperer
SplunkTrust
SplunkTrust
| eval identity=mvindex(split(identity,"|"),0)+"|"+email+"|"+legacyUsername
0 Karma

neerajs_81
Builder

@ITWhisperer   Tried that,  it appears to have stop appending to previous record  but the result is still showing duplicates as follows:  Basically it resolved the appending piece but is still creating a new record for the same Identity even though the identity exists.  How to make it not create duplicates further? 

IdentitytimelegacyuserNametitleemailenddatafirstlastname
1001|karen.woo@xyz.com|karen_wooxxxxxxxxxxxxxxKarenWoo
1001|karen.woo@xyz.com|karen_woo xxxxxxxxxxxxxxKarenWoo
1001|karen.woo@xyz.com|karen_wooxxxxxxxxxxxxxxKarenWoo
0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

The issue is that you are regenerating the 1001|karen.woo@xyz.com from the original as well as the previously generated version, so try repeating the stats and rename to remove the duplicate

index=xyz
...
| stats latest(_time) as _time, latest(legacyUsername), latest(title), latest(email), latest(endDate), latest(firstname), latest(lastname),  by identity
| rename latest(*) as *
| eval identity=mvindex(split(identity,"|"),0)+"|"+email+"|"+legacyUsername
| stats latest(_time) as _time, latest(legacyUsername), latest(title), latest(email), latest(endDate), latest(firstname), latest(lastname),  by identity
| rename latest(*) as *

neerajs_81
Builder

Thank you very much.  Can you pls explain what does
 eval identity=mvindex(split(identity,"|"),0)+"|"+email+"|"+legacyUsername   do ?

I did lookup online on mvindex command but what exactly made you suggest split(identity,"|"),0) ?  





0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

split breaks the string up using the delimiter "|" into a multi-value field, mvindex takes one of the values, in this case, the first one since indexing begins at zero for this function. So, specifically, mvindex(split("1001|karen.woo@xyz.com|karen_woo","|"),0) equates to "1001". The line then appends the email and legacyUsername back again to recreate the identity.

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @neerajs_81,

the only thing I can suppose is that you have more values in the latest fields, so try to use also email and legacyUsername as key in the BY cluase, something like this:

index=xyz
...
| stats 
   latest(_time) as _time
   latest(title) AS title
   latest(endDate) AS endDate
   latest(firstname) AS firstname
   latest(lastname) AS lastname
   by identity email legacyUsername
| eval identity = identity."|".email."|".legacyUsername

Ciao.

Giuseppe

neerajs_81
Builder

Hi @gcusello ,  Thank you for responding. Tried what you suggested in the stats by clause,  but it didn't make any difference to the results.

0 Karma

neerajs_81
Builder

Adding @ITWhisperer .
Thanks

0 Karma

isoutamo
SplunkTrust
SplunkTrust
Can you just check if identity in not null/“” (empty) and then add those to it?
Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...