Splunk Search

Retrieving unique values of an indexed field

NancyCunningham
Engager

Is there a quick way to retrieve the list of all unique values of an indexed field?

I know I could search for the field and pipe to uniq, but I'm hoping there might be something faster.

ghendrey_splunk
Splunk Employee
| tstats values(<indexed_field_name>) where index=<index_name>

will avoid going over any events at all. It gets its answer from the metadata in the .tsidx files, so there is no performance hit from scanning events. It is orders of magnitude faster than piping a search to stats.
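
If a count per value is useful as well, the same tstats approach can be grouped by the field. This is a sketch using the same placeholders as above; it assumes the field really is indexed (an index-time field), since tstats cannot see search-time extractions:

```
| tstats count where index=<index_name> by <indexed_field_name>
```

Each output row is one unique value of the field together with its event count.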

0 Karma

esachs
Splunk Employee

Actually, we were hoping that, because it is an indexed field, there is some kind of metadata or list that is persisted that we could access quickly, without running a search over all our events. I guess the simplest case would be source, sourcetype, or host - is there any quick way to find the list of all indexed hosts without going through stats or some other search? It seems like there must be, because the summary view displays those. We'd like to pull that type of summary information for any indexed field to get a list of all possible field values.

0 Karma

gkanapathy
Splunk Employee

For host, source, and sourcetype specifically, you can use the | metadata search command.
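
As an illustration, listing all hosts in one index might look like the following. The index name is a placeholder; type can also be sources or sourcetypes:

```
| metadata type=hosts index=<index_name>
```

The result has one row per host, along with summary columns such as totalCount, firstTime, and lastTime.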

0 Karma

esachs
Splunk Employee

For some reason, I don't see an "add comment" field on Nick's answer. Is there some other way to do that?

0 Karma

piebob
Splunk Employee

Can you add this as a comment to Nick's answer, and not as a new answer?

0 Karma

sideview
SplunkTrust

Absolutely. There are several ways to do this. Let's assume your field is called 'foo'.

The most straightforward way is to use the stats command:

<your search> | stats count by foo

Using stats also opens the door to collecting other statistics by those unique values. For example:

<your search> | stats count avg(duration) dc(username) by foo

This takes the average of a field called duration and the distinct count of values of username, with each statistic computed separately for each value of foo.

http://www.splunk.com/base/Documentation/latest/SearchReference/Stats

Another way worth mentioning is to just use top:

<your search> | top foo limit=10000

gkanapathy
Splunk Employee

For host, source, and sourcetype specifically, you can use the | metadata search command, which can certainly be much faster. If you need this a lot, run a scheduled search over recent data that updates a lookup table (... | append [ | inputlookup mytable ] | dedup myfield1, myfield2 | outputlookup mytable); that is, you generate and maintain the metadata yourself periodically.
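
Spelled out, the scheduled search could look roughly like this. It is only a sketch: the index placeholder, the 24-hour window, and the stats step that reduces events to field combinations are assumptions, and the lookup mytable is assumed to exist already (on the very first run, drop the append line to create it):

```
index=<index_name> earliest=-24h
| stats count by myfield1, myfield2
| fields myfield1, myfield2
| append [| inputlookup mytable]
| dedup myfield1, myfield2
| outputlookup mytable
```

Note the leading pipe inside the subsearch; without it, inputlookup would be treated as a search term rather than a command. Afterwards, | inputlookup mytable returns the accumulated list of values without touching any events.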

0 Karma