Well, my question is not that intuitive, but I will deep dive here:
Let's suppose I have this lookup:
Name | Product | Sell_Date |
Denis | Bread | 2022-02-21 |
Maria | Beer | 2022-02-23 |
Denis | Water | 2022-01-27 |
Denis | Cheese | 2022-03-05 |
Maria | Beer | 2021-12-12 |
I need to get the last "Sell_Date" grouping by "Name". In this case:
Name | Product | Sell_Date |
Denis | Cheese | 2022-03-05 |
Maria | Beer | 2022-02-23 |
I know there is "dedup" command, but it's not working because "Sell_Date" is not being considered as "_time" field because this is a lookup and not an Index. I'm getting the wrong row as dedup result.
How can I get a custom dedup, specifying the field that should work as "_time"?
It was easier than I thought. The dedup would get the last event. As the source is a lookup and the field is not the default "_time", I just needed to sort by Sell_Date before. In my case, the sort of this field works properly (it was detected as a proper datetime field) so I don't need to strptime before sorting.
| inputlookup mylookup
| sort -Sell_Date
| dedup Name
It was easier than I thought. The dedup would get the last event. As the source is a lookup and the field is not the default "_time", I just needed to sort by Sell_Date before. In my case, the sort of this field works properly (it was detected as a proper datetime field) so I don't need to strptime before sorting.
| inputlookup mylookup
| sort -Sell_Date
| dedup Name
If you take the thought away from dedup, you will see the solution clearer. (Dedup is so often used to seek out the latest events, I can see the trap.) The requirement itself is really about sorting and retrieving.
| inputlookup sell_date
| stats values(eval(Sell_Date . "|" . Product)) as date_product by Name ``` values() sorts by default ```
| eval date_product = split(mvindex(date_product, -1), "|") ``` acccess the last joined record, then split back ```
| eval Sell_Date = mvindex(date_product, 0), Product = mvindex(date_product, 1)
| fields - date_product
I don't quite follow what you need. The stats summarizing commands work pretty much the same regardless of whether the data comes from the search across indexes or from the inputlookup.
If your concern is that you can't use latest(whatever) due to lack of _time field, just make your own _time field. Since your Sell_Date is probably a string, you can't simply rename it to _time (well, you can but it won't work properly). You'll have to use eval with strptime to parse the timestamp from the string.