Splunk Search

How to dedup in lookup specifying the "_time" field?

denissotoacc
Path Finder

Well, my question is not that intuitive, but I will deep dive here:

Let's suppose I have this lookup:

NameProductSell_Date
DenisBread2022-02-21
MariaBeer2022-02-23
DenisWater2022-01-27
DenisCheese2022-03-05
MariaBeer2021-12-12


I need to get the last "Sell_Date" grouping by "Name". In this case:

NameProductSell_Date
DenisCheese2022-03-05
MariaBeer2022-02-23


I know there is "dedup" command, but it's not working because "Sell_Date" is not being considered as "_time" field because this is a lookup and not an Index. I'm getting the wrong row as dedup result.

How can I get a custom dedup, specifying the field that should work as "_time"? 

Labels (1)
0 Karma
1 Solution

denissotoacc
Path Finder

It was easier than I thought. The dedup would get the last event. As the source is a lookup and the field is not the default "_time", I just needed to sort by Sell_Date before. In my case, the sort of this field works properly (it was detected as a proper datetime field) so I don't need to strptime before sorting.

| inputlookup mylookup
| sort -Sell_Date
| dedup Name

 

View solution in original post

denissotoacc
Path Finder

It was easier than I thought. The dedup would get the last event. As the source is a lookup and the field is not the default "_time", I just needed to sort by Sell_Date before. In my case, the sort of this field works properly (it was detected as a proper datetime field) so I don't need to strptime before sorting.

| inputlookup mylookup
| sort -Sell_Date
| dedup Name

 

yuanliu
SplunkTrust
SplunkTrust

If you take the thought away from dedup, you will see the solution clearer. (Dedup is so often used to seek out the latest events, I can see the trap.)  The requirement itself is really about sorting and retrieving.

| inputlookup sell_date
| stats values(eval(Sell_Date . "|" . Product)) as date_product by Name ``` values() sorts by default ```
| eval date_product = split(mvindex(date_product, -1), "|") ``` acccess the last joined record, then split back ```
| eval Sell_Date = mvindex(date_product, 0), Product = mvindex(date_product, 1)
| fields - date_product

 

Tags (3)

PickleRick
SplunkTrust
SplunkTrust

I don't quite follow what you need. The stats summarizing commands work pretty much the same regardless of whether the data comes from the search across indexes or from the inputlookup.

If your concern is that you can't use latest(whatever) due to lack of _time field, just make your own _time field. Since your Sell_Date is probably a string, you can't simply rename it to _time (well, you can but it won't work properly). You'll have to use eval with strptime to parse the timestamp from the string.

0 Karma
Get Updates on the Splunk Community!

Splunk Search APIを使えば調査過程が残せます

   このゲストブログは、JCOM株式会社の情報セキュリティ本部・専任部長である渡辺慎太郎氏によって執筆されました。 Note: This article is published in both Japanese ...

Integrating Splunk Search API and Quarto to Create Reproducible Investigation ...

 Splunk is More Than Just the Web Console For Digital Forensics and Incident Response (DFIR) practitioners, ...

Congratulations to the 2025-2026 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...