Well, "correct" depends on what you want to achieve. Often dedup is not needed at all if you're going to run stats on the data right after it. It's also a tricky command that is often misunderstood and misused. It filters out of your results all subsequent events with a particular value of a given field (or a particular combination of values if you use it on several fields), regardless of what the rest of those events contains. So if you had, for example, logs containing the fields criticality (one of INFO, WARN, DEBUG, ERR or CRIT) and message, then after | dedup criticality you'd get only one INFO, one DEBUG and so on: the first one Splunk encountered in your data. You'd lose all subsequent INFOs, DEBUGs and so on, even though they had different message values. So you'd know, for example, that there was a CPU usage spike, but you wouldn't know that your system was also out of disk space and over the temperature threshold. Dedup is rarely useful. For me it works only as an "extended head".
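To make the "first one wins" behaviour concrete, here is a rough Python sketch of what dedup on a single field does. It is only a toy model, not Splunk's implementation: the dedup function, the sample events and their messages are all made up for illustration, and it ignores options and events that lack the field.

```python
from collections import Counter

# Hypothetical sample events, in the order they would be returned.
events = [
    {"criticality": "WARN", "message": "CPU usage spike"},
    {"criticality": "INFO", "message": "Service started"},
    {"criticality": "WARN", "message": "Out of disk space"},
    {"criticality": "WARN", "message": "Temperature over threshold"},
    {"criticality": "INFO", "message": "User logged in"},
]


def dedup(events, field):
    """Toy model of `| dedup <field>`: keep the first event per value."""
    seen = set()
    kept = []
    for event in events:
        value = event[field]
        if value in seen:
            # Dropped no matter what the other fields contain.
            continue
        seen.add(value)
        kept.append(event)
    return kept


kept = dedup(events, "criticality")
kept_messages = [e["message"] for e in kept]

# Counting per criticality (roughly what a stats count would give you)
# works on the full event list; no dedup is needed beforehand.
counts = Counter(e["criticality"] for e in events)

print(kept_messages)
print(dict(counts))
```

In this sketch only the CPU usage spike survives among the WARN events; the disk space and temperature warnings are gone, while the per-criticality counts still see all five events.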