Splunk Search

Getting a count of the number of fields associated with a sourcetype

Runals
Motivator

I've done a little looking and poking around but haven't seen an answer to this - hopefully I haven't overlooked something obvious. I'm trying to build a query that counts the number of fields associated with a sourcetype (edit: number of fields associated with the result set based on a query that is looking at a particular sourcetype). While not an exact measurement I believe that will help me hone in on sourcetypes that possibly need additional field extraction/definition work (low numbers = spend some time).

Tags (1)
0 Karma
1 Solution

cschmidt0121
Path Finder

Both answers so far work great, but to do this one sourcetype at a time would take me forever. I solved this by writing a shell script the uses the CLI to do two things:

1) Enumerate all sourcetypes.
2) Count the number of fields for each sourcetype.

It looks like this:

#!/bin/bash

# Clear fields.txt
> fields.txt

echo "Building sourcetype list."
splunk search "| metadata type=sourcetypes index=* | table sourcetype" -uri https://hostname:8089 -maxout 10000 -earliest_time '-7d'| tail +3 > sourcetypes.txt

echo "Done."

while read st; do
    echo "Counting fields for $st."
    echo -n $st >> fields.txt
    echo -n ": " >> fields.txt

    # The search returns with some weird leading whitespace, so I use sed to
    # get rid of it.
    splunk search "sourcetype=$st | head 250 | fieldsummary | stats count" -uri https://hostname:8089 | tail -1 | sed 's/^[ \t]*//' >> fields.txt
done < sourcetypes.txt

View solution in original post

wrighke
Explorer

I'm ten years late, so I hope it's not urgent 🤣
But here's a solution you can run inline:

#base search here

| stats last(*) AS * by sourcetype 
| foreach * 
    [ eval <<FIELD>>=if("<<FIELD>>" == "sourcetype", sourcetype, 1)] 
| addtotals 
| fields Total sourcetype

 

bowesmana
SplunkTrust
SplunkTrust

That's unlikely to give you a good representation of fields that actually are part of the data associated with a sourcetype, for example if you just run this on index=_audit

index=_audit
| stats last(*) AS * by sourcetype 
| foreach * 
    [ eval <<FIELD>>=if("<<FIELD>>" == "sourcetype", sourcetype, 1), fields=mvappend(fields, "<<FIELD>>") ] 
| addtotals 
| fields Total sourcetype fields

you will get a load of fields associated with audittrail - on 3 different search heads, I get from 149 to 323 fields, most of which are just pulled in due to TAs installed on the search heads.

 

0 Karma

wrighke
Explorer

For my particular use case, I want to compare the difference between the count of fields extracted in windows event logs before versus after I install the Splunk TA for Windows on my search head. 
It fits my use case, but it might not for others if you have inconsistent configurations across your search head peers, for example.

0 Karma

Runals
Motivator

With 6x you can use the foreach command (maybe was there in 5 but not documented?). At any rate here is the search I'm about to start using.

<scope limiting search> | dedup 20 sourcetype punct | table * | fields - _raw date_* index linecount punct eventtype time*pos splunk_server  | foreach * [eval <<FIELD>> = '<<FIELD>>'."##".sourcetype."##||" ] | stats max(*) as * |  transpose | rename "row 1" as sourcetype column as field | rex field=sourcetype "##(?<sourcetype>[^#]+)" | eval fieldsort = lower(field) | dedup sourcetype field | sort sourcetype fieldsort | table sourcetype field
0 Karma

cschmidt0121
Path Finder

Both answers so far work great, but to do this one sourcetype at a time would take me forever. I solved this by writing a shell script the uses the CLI to do two things:

1) Enumerate all sourcetypes.
2) Count the number of fields for each sourcetype.

It looks like this:

#!/bin/bash

# Clear fields.txt
> fields.txt

echo "Building sourcetype list."
splunk search "| metadata type=sourcetypes index=* | table sourcetype" -uri https://hostname:8089 -maxout 10000 -earliest_time '-7d'| tail +3 > sourcetypes.txt

echo "Done."

while read st; do
    echo "Counting fields for $st."
    echo -n $st >> fields.txt
    echo -n ": " >> fields.txt

    # The search returns with some weird leading whitespace, so I use sed to
    # get rid of it.
    splunk search "sourcetype=$st | head 250 | fieldsummary | stats count" -uri https://hostname:8089 | tail -1 | sed 's/^[ \t]*//' >> fields.txt
done < sourcetypes.txt

Runals
Motivator

As cschmidt0121 mentioned the other two answers work well but I have to give it to this poster as he is a student worker in our shop 😃

0 Karma

lguinn2
Legend

BTW, great solution! I am going to use it into some of my work!

0 Karma

cschmidt0121
Path Finder

That didn't even occur to me! Thanks for the heads up.

0 Karma

lguinn2
Legend

Just be aware that the fields may vary based on the app context. For example, I can have a field that is defined in my Sales app, but which does not exist in the Search app context. So unless your fields are all global, you will probably miss some.

The CLI operates in the Search app context.

0 Karma

Gilberto_Castil
Splunk Employee
Splunk Employee

Assume that you want to get the details for sourcetype "WinEventLog:Security". You can get this by enumerating all fields for that singular sourcetype with this search:

sourcetype="WinEventLog:Security" | stats mode(*) AS * by sourcetype

This will give you a listing of all fields plus a single row with a sample of the data.

alt text

use the tranpose command to re-order the results and ignore the sampling of data.

| transpose | fields - "row 1" | rename column AS "FIELD_NAME"

This will provide a single column with all of the fields associated with the sourcetype:

alt text

And from there, the rest is up to you... for instance, to count the number of fields:

| stats count(FIELD_NAME) AS count

All together, the search looks like this:

sourcetype="WinEventLog:Security" | stats mode(*) AS * by sourcetype | transpose | fields - "row 1" | rename column AS "FIELD_NAME" | stats count(FIELD_NAME) AS count

gc

lguinn2
Legend

Very nice! I had not considered this!

0 Karma

yannK
Splunk Employee
Splunk Employee

There is no easy way, but you could get interesting data with a search like :

sourcetype=splunkd | fieldsummary | eval sourcetype="splunkd" | table field sourcetype

Then for each interesting sourcetype, save the result in a lookup:

sourcetype=splunkd | fieldsummary | eval sourcetype="splunkd" | table field sourcetype | outputlookup list_fields.csv append=true

and finally run a report on the whole lookup:

| inputlookup list_fields.csv | stats dc(field) by sourcetype

Get Updates on the Splunk Community!

Splunk Edge Processor | Popular Use Cases to Get Started with Edge Processor

Splunk Edge Processor offers more efficient, flexible data transformation – helping you reduce noise, control ...

Introducing New Splunkbase Governance!

Splunk apps are essential for maximizing the value of your Splunk Experience. Whether you’re using the default ...

3 Ways to Make OpenTelemetry Even Better

My role as an Observability Specialist at Splunk provides me with the opportunity to work with customers of ...