Dedup Sortby not working?

Tedesco1
Path Finder

I am trying to exclude duplicate events. First, I want to include only the most recent event for each combination of values of "cwo" and "clock_number", so I'm using "| dedup clock_number cwo".

This part works fine. Next, for each "clock_number" I want to keep only the event with the highest "start" value, so I'm adding a second dedup after the first one. (The "start" values are UNIX epoch times; in the search I convert them to a human-readable format after the dedup so they are easier to read.)

I've gotten this to work by using "| sort -start | dedup". I've read that "dedup sortby -start" is more efficient than "sort -start | dedup" and should give the same results, but in this case it doesn't. In screenshot "search 2" I've used "sort -start | dedup clock_number". In screenshot "search 3" I've used "dedup clock_number sortby -start". I also tried "sortby start" to test the opposite sort order, but it gave me the same result.

What's going on here? Why are these different? Is there a better way for me to do this?

niketn
Legend

@Tedesco1, your start field seems to be a string time, not epoch. Based on the screenshot, it is not in YYYY-mm-dd HH:MM:SS format, so it will not sort correctly as a string.

You should add the following after your base search:

<yourBaseSearch>
| eval _time=strptime(start,"%m-%d-%Y %H:%M:%S")
| dedup clock_number sortby -_time

You can also try reverse before dedup, which should also be fast:

<yourBaseSearch>
| eval _time=strptime(start,"%m-%d-%Y %H:%M:%S")
| reverse
| dedup clock_number
____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

Tedesco1
Path Finder

It is in epoch time. It is showing in string format only because I used "| eval start=strftime(start, "%m-%d-%Y %H:%M:%S")" at the end of my search. That's shown in the screenshots above.

Reverse would not work. I can't rely on the events being consistently in reverse order; I need to select the highest value of start.

niketn
Legend

Your search does not show strptime() on start, so it must be a string time:

| eval start=strptime(start,"%m-%d-%Y %H:%M:%S")
| dedup clock_number sortby -start

Also, to display start in a human-readable format, you can add a fieldformat in the last pipe instead of eval, so that the start field remains epoch time while the displayed value is a string time.

 <yourBaseSearch>
 | eval start=strptime(start,"%m-%d-%Y %H:%M:%S")
 | dedup clock_number sortby -start
  ...
 | fieldformat start=strftime(start,"%m-%d-%Y %H:%M:%S")
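
The difference between eval and fieldformat described above can be checked with a small stand-alone sketch. It assumes nothing about the original data: makeresults generates one dummy event, the epoch value is taken from the event posted in this thread, and start_as_text, start_type, and start_as_text_type are hypothetical field names used only for illustration.

```spl
| makeresults
| eval start=1528749692
| eval start_as_text=strftime(start, "%m-%d-%Y %H:%M:%S")
| eval start_type=typeof(start), start_as_text_type=typeof(start_as_text)
| fieldformat start=strftime(start, "%m-%d-%Y %H:%M:%S")
```

Here start should still be reported as a number even though it is displayed as a formatted time, while start_as_text, produced by eval with strftime(), should be reported as a string.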

Tedesco1
Path Finder

Start is in epoch time. The last command in each of the searches in the screenshots above is: 'eval start=strftime(start, "%m-%d-%Y %H:%M:%S")'. This command converts epoch time to a human-readable string time. Am I missing something?

niketn
Legend

@Tedesco1, I am sorry, I might be missing something then. For the two results, can you add the raw events after masking/anonymizing any sensitive data?

Tedesco1
Path Finder

Here is the event I was expecting:

{"clock_number": REDACTED,"cwo": REDACTED,"pwo": REDACTED,"op_sequence": "020","start": "1528749692","elapsed_seconds": "0","work_center": REDACTED,"trigger_timing": "after","command": "update","data_type": "punch" }

Here is the other event:

{"clock_number": REDACTED,"cwo": "NPL","pwo": "","op_sequence": "","start": "1528749000","elapsed_seconds": "648","work_center": "","trigger_timing": "after","command": "update","data_type": "punch" }
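
To compare the two approaches outside the real index, the two events above can be approximated with makeresults. This is only a sketch: the clock_number and the first cwo value are redacted in the thread, so "EXAMPLE-1001" and "EXAMPLE-CWO" are made-up placeholders, row is a hypothetical helper field, and only the start values and the "NPL" cwo come from the posted events. The start values are kept as strings, as in the raw JSON, and num() asks sort to compare them numerically.

```spl
| makeresults count=2
| streamstats count AS row
| eval clock_number="EXAMPLE-1001"
| eval cwo=if(row=1, "NPL", "EXAMPLE-CWO")
| eval start=if(row=1, "1528749000", "1528749692")
| sort 0 - num(start)
| dedup clock_number
| fieldformat start=strftime(tonumber(start), "%m-%d-%Y %H:%M:%S")
```

With the explicit sort, dedup sees the event with start 1528749692 first and should keep it. Replacing the sort and dedup lines with "| dedup clock_number sortby -start" gives a like-for-like test of the other variant on the same two rows.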

HiroshiSatoh
Champion

It is conceivable that the data is removed by the first dedup.
Also, is there a difference in the results of the first dedup?

What happens when you set the time range to yesterday and search?

Tedesco1
Path Finder

When I ran the first dedup by itself, it returned only the two results shown in the screenshots: the one with timestamp 3:41 and the one with timestamp 4:06.
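
For the original goal (one event per clock_number, the one with the highest start), a variant that does not depend on the order in which dedup sees the events can be sketched with eventstats. This is a sketch under assumptions, not a confirmed fix from this thread: <yourBaseSearch> is the same placeholder used in the replies above, latest_start is a hypothetical helper field, and start is assumed to hold epoch seconds that tonumber() can parse.

```spl
<yourBaseSearch>
| dedup clock_number cwo
| eval start=tonumber(start)
| eventstats max(start) AS latest_start BY clock_number
| where start=latest_start
| fields - latest_start
| fieldformat start=strftime(start, "%m-%d-%Y %H:%M:%S")
```

One design consequence to be aware of: if two events for the same clock_number share the same highest start, both are kept, so a final "| dedup clock_number" may still be wanted.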
