Splunk Search

How to dedup in lookup specifying the "_time" field?

denissotoacc
Path Finder

Well, my question is not that intuitive, but I will deep dive here:

Let's suppose I have this lookup:

NameProductSell_Date
DenisBread2022-02-21
MariaBeer2022-02-23
DenisWater2022-01-27
DenisCheese2022-03-05
MariaBeer2021-12-12


I need to get the last "Sell_Date" grouping by "Name". In this case:

NameProductSell_Date
DenisCheese2022-03-05
MariaBeer2022-02-23


I know there is "dedup" command, but it's not working because "Sell_Date" is not being considered as "_time" field because this is a lookup and not an Index. I'm getting the wrong row as dedup result.

How can I get a custom dedup, specifying the field that should work as "_time"? 

Labels (1)
0 Karma
1 Solution

denissotoacc
Path Finder

It was easier than I thought. The dedup would get the last event. As the source is a lookup and the field is not the default "_time", I just needed to sort by Sell_Date before. In my case, the sort of this field works properly (it was detected as a proper datetime field) so I don't need to strptime before sorting.

| inputlookup mylookup
| sort -Sell_Date
| dedup Name

 

View solution in original post

denissotoacc
Path Finder

It was easier than I thought. The dedup would get the last event. As the source is a lookup and the field is not the default "_time", I just needed to sort by Sell_Date before. In my case, the sort of this field works properly (it was detected as a proper datetime field) so I don't need to strptime before sorting.

| inputlookup mylookup
| sort -Sell_Date
| dedup Name

 

yuanliu
SplunkTrust
SplunkTrust

If you take the thought away from dedup, you will see the solution clearer. (Dedup is so often used to seek out the latest events, I can see the trap.)  The requirement itself is really about sorting and retrieving.

| inputlookup sell_date
| stats values(eval(Sell_Date . "|" . Product)) as date_product by Name ``` values() sorts by default ```
| eval date_product = split(mvindex(date_product, -1), "|") ``` acccess the last joined record, then split back ```
| eval Sell_Date = mvindex(date_product, 0), Product = mvindex(date_product, 1)
| fields - date_product

 

Tags (3)

PickleRick
SplunkTrust
SplunkTrust

I don't quite follow what you need. The stats summarizing commands work pretty much the same regardless of whether the data comes from the search across indexes or from the inputlookup.

If your concern is that you can't use latest(whatever) due to lack of _time field, just make your own _time field. Since your Sell_Date is probably a string, you can't simply rename it to _time (well, you can but it won't work properly). You'll have to use eval with strptime to parse the timestamp from the string.

0 Karma
Get Updates on the Splunk Community!

Upcoming Webinar: Unmasking Insider Threats with Slunk Enterprise Security’s UEBA

Join us on Wed, Dec 10. at 10AM PST / 1PM EST for a live webinar and demo with Splunk experts! Discover how ...

.conf25 technical session recap of Observability for Gen AI: Monitoring LLM ...

If you’re unfamiliar, .conf is Splunk’s premier event where the Splunk community, customers, partners, and ...

A Season of Skills: New Splunk Courses to Light Up Your Learning Journey

There’s something special about this time of year—maybe it’s the glow of the holidays, maybe it’s the ...