I've done a little looking and poking around but haven't seen an answer to this - hopefully I haven't overlooked something obvious. I'm trying to build a query that counts the number of fields associated with a sourcetype (edit: number of fields associated with the result set based on a query that is looking at a particular sourcetype). While not an exact measurement I believe that will help me hone in on sourcetypes that possibly need additional field extraction/definition work (low numbers = spend some time).
Both answers so far work great, but to do this one sourcetype at a time would take me forever. I solved this by writing a shell script the uses the CLI to do two things:
1) Enumerate all sourcetypes.
2) Count the number of fields for each sourcetype.
It looks like this:
#!/bin/bash
# Clear fields.txt
> fields.txt
echo "Building sourcetype list."
splunk search "| metadata type=sourcetypes index=* | table sourcetype" -uri https://hostname:8089 -maxout 10000 -earliest_time '-7d'| tail +3 > sourcetypes.txt
echo "Done."
while read st; do
echo "Counting fields for $st."
echo -n $st >> fields.txt
echo -n ": " >> fields.txt
# The search returns with some weird leading whitespace, so I use sed to
# get rid of it.
splunk search "sourcetype=$st | head 250 | fieldsummary | stats count" -uri https://hostname:8089 | tail -1 | sed 's/^[ \t]*//' >> fields.txt
done < sourcetypes.txt
I'm ten years late, so I hope it's not urgent 🤣
But here's a solution you can run inline:
#base search here
| stats last(*) AS * by sourcetype
| foreach *
[ eval <<FIELD>>=if("<<FIELD>>" == "sourcetype", sourcetype, 1)]
| addtotals
| fields Total sourcetype
That's unlikely to give you a good representation of fields that actually are part of the data associated with a sourcetype, for example if you just run this on index=_audit
index=_audit
| stats last(*) AS * by sourcetype
| foreach *
[ eval <<FIELD>>=if("<<FIELD>>" == "sourcetype", sourcetype, 1), fields=mvappend(fields, "<<FIELD>>") ]
| addtotals
| fields Total sourcetype fields
you will get a load of fields associated with audittrail - on 3 different search heads, I get from 149 to 323 fields, most of which are just pulled in due to TAs installed on the search heads.
For my particular use case, I want to compare the difference between the count of fields extracted in windows event logs before versus after I install the Splunk TA for Windows on my search head.
It fits my use case, but it might not for others if you have inconsistent configurations across your search head peers, for example.
With 6x you can use the foreach command (maybe was there in 5 but not documented?). At any rate here is the search I'm about to start using.
<scope limiting search> | dedup 20 sourcetype punct | table * | fields - _raw date_* index linecount punct eventtype time*pos splunk_server | foreach * [eval <<FIELD>> = '<<FIELD>>'."##".sourcetype."##||" ] | stats max(*) as * | transpose | rename "row 1" as sourcetype column as field | rex field=sourcetype "##(?<sourcetype>[^#]+)" | eval fieldsort = lower(field) | dedup sourcetype field | sort sourcetype fieldsort | table sourcetype field
Both answers so far work great, but to do this one sourcetype at a time would take me forever. I solved this by writing a shell script the uses the CLI to do two things:
1) Enumerate all sourcetypes.
2) Count the number of fields for each sourcetype.
It looks like this:
#!/bin/bash
# Clear fields.txt
> fields.txt
echo "Building sourcetype list."
splunk search "| metadata type=sourcetypes index=* | table sourcetype" -uri https://hostname:8089 -maxout 10000 -earliest_time '-7d'| tail +3 > sourcetypes.txt
echo "Done."
while read st; do
echo "Counting fields for $st."
echo -n $st >> fields.txt
echo -n ": " >> fields.txt
# The search returns with some weird leading whitespace, so I use sed to
# get rid of it.
splunk search "sourcetype=$st | head 250 | fieldsummary | stats count" -uri https://hostname:8089 | tail -1 | sed 's/^[ \t]*//' >> fields.txt
done < sourcetypes.txt
As cschmidt0121 mentioned the other two answers work well but I have to give it to this poster as he is a student worker in our shop 😃
BTW, great solution! I am going to use it into some of my work!
That didn't even occur to me! Thanks for the heads up.
Just be aware that the fields may vary based on the app context. For example, I can have a field that is defined in my Sales app, but which does not exist in the Search app context. So unless your fields are all global, you will probably miss some.
The CLI operates in the Search app context.
Assume that you want to get the details for sourcetype "WinEventLog:Security". You can get this by enumerating all fields for that singular sourcetype with this search:
sourcetype="WinEventLog:Security" | stats mode(*) AS * by sourcetype
This will give you a listing of all fields plus a single row with a sample of the data.
use the tranpose command to re-order the results and ignore the sampling of data.
| transpose | fields - "row 1" | rename column AS "FIELD_NAME"
This will provide a single column with all of the fields associated with the sourcetype:
And from there, the rest is up to you... for instance, to count the number of fields:
| stats count(FIELD_NAME) AS count
All together, the search looks like this:
sourcetype="WinEventLog:Security" | stats mode(*) AS * by sourcetype | transpose | fields - "row 1" | rename column AS "FIELD_NAME" | stats count(FIELD_NAME) AS count
gc
Very nice! I had not considered this!
There is no easy way, but you could get interesting data with a search like :
sourcetype=splunkd | fieldsummary | eval sourcetype="splunkd" | table field sourcetype
Then for each interesting sourcetype, save the result in a lookup:
sourcetype=splunkd | fieldsummary | eval sourcetype="splunkd" | table field sourcetype | outputlookup list_fields.csv append=true
and finally run a report on the whole lookup:
| inputlookup list_fields.csv | stats dc(field) by sourcetype