Splunk Enterprise

Why is tstats returning sourcetypes that do not exist?

HeavyHats
Explorer

I recently discovered that "tstats" is returning sourcetypes which do not exist. 

Query: 

| tstats values(sourcetype) where index=* by index

 This returns a list of sourcetypes grouped by index. While it appears to be mostly accurate, some sourcetypes which are returned for a given index do not exist. For example, the sourcetype "WinEventLog:System" is returned for myindex, but the following query produces zero results: 

index=myindex sourcetype="WinEventLog:System"

This is the case for multiple indexes.

If my understanding of "tstats" is correct, it works by only analyzing indexed fields which are stored in the tsidx files. If no events exist with a given sourcetype for a specific index, how could that value have possibly been saved in the tsidx files? 

0 Karma
1 Solution

VatsalJagani
SplunkTrust
SplunkTrust

@HeavyHats - The most likely reason is the "rename" attribute in props.conf

  • index=myindex | stats count by sourcetype
    • is looking at the sourcetype name after the rename attribute.
  • | tstats values(sourcetype) where index=myindex
    • is looking at the sourcetype name that does not include the rename attribute.
    • Why? -> Because rename is a search-time attribute, and tstats only looks at the indexed summaries (the tsidx files).

I know a lot of Windows-related data has the rename attribute, for example Sysmon data and Windows firewall logs from EventLogs. But this will be an issue anywhere the rename attribute is being used.

 

(Previously someone asked similar question - https://community.splunk.com/t5/Splunk-Search/Why-does-tstats-returns-events-by-sourcetype-but-searc...)

 

I hope this helps!!!


PickleRick
SplunkTrust
SplunkTrust

Are you sure no one fiddled with the TA_windows? Typically you'd see "XmlWinEventLog:System" as a source, not a sourcetype.

See my home Splunk instance:

[Screenshot: PickleRick_0-1657183391506.png]

 

0 Karma

HeavyHats
Explorer

Yes, that is part of the confusion here. "tstats" shows that "(Xml)WinEventLog:System" exists as a sourcetype, when it actually only exists as a source

0 Karma

VatsalJagani
SplunkTrust
SplunkTrust

The only reason I can see for tstats and a normal search showing different results is the rename attribute I mentioned in my answer.

VatsalJagani
SplunkTrust
SplunkTrust

Yeah, if the sourcetype is "WinEventLog:System", then you are using a very old version of the Add-on (< 5.0.0).

0 Karma

HeavyHats
Explorer

Where is a rename most likely to happen (Universal Forwarder, Heavy Forwarder, Indexer, etc.)? Our Universal Forwarders are not using rename in any props.conf files, and I've checked the heavy forwarder that these logs pass through: it does not contain a rename in any of its props.conf files either. I'm guessing this happens on the indexers?

0 Karma

VatsalJagani
SplunkTrust
SplunkTrust

rename is a search-time attribute, so it is applied on the search head.
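
One way to see which rename settings the search head actually loads is to query its props configuration over REST. This is only a sketch: it assumes your role is allowed to run the rest command and that the configs/conf-props endpoint exposes the rename attribute as a field, so treat the field names as assumptions to verify on your instance.

```spl
| rest /servicesNS/-/-/configs/conf-props splunk_server=local
| search rename=*
| table title rename eai:acl.app
```

Here title is the stanza name (the original sourcetype), rename is the name shown at search time, and eai:acl.app is the app that ships the stanza.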

 

HeavyHats
Explorer

Thank you for the insight. I discovered that version 8.5.0 of the Splunk Add-on for Microsoft Windows (Splunk_TA_windows) contains rename statements in Splunk_TA_windows/default/props.conf: 

## To provide backward compatibility for WinEventLog and XmlWinEventLog data
## These will be deprecated in future
[WinEventLog:Security]
rename = wineventlog

[WinEventLog:Application]
rename = wineventlog

[WinEventLog:System]
rename = wineventlog
...

This appears to be the source of this behavior. Marking your solution as accepted. 
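
Given those stanzas, events indexed as WinEventLog:System should show up under the renamed sourcetype at search time. The props.conf documentation describes a _sourcetype field for searching on the original name. The sketch below assumes that field behaves as documented on your version; myindex is the placeholder index from the question.

```spl
index=myindex _sourcetype="WinEventLog:System"
| stats count by sourcetype
```

If the rename is what is happening, this should list the events under sourcetype=wineventlog instead of returning zero results.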

0 Karma

VatsalJagani
SplunkTrust
SplunkTrust

Glad you caught it!!!

Keep an eye out because many Add-ons use this, unfortunately (this makes it inconsistent between tstats and normal search).

0 Karma

PickleRick
SplunkTrust
SplunkTrust

I'm not buying this explanation. Rename works only one way - it only lets you search for a given sourcetype using a different name. It doesn't modify the returned results.

In order for you to have one value stored in the index (returned by tstats) and another calculated at search time, you'd have to have some EVAL defined that would "cast" the value from source to sourcetype. Maybe someone did something like that when the Windows TA changed its behaviour, in order not to rework searches written for the old values.

0 Karma

bowesmana
SplunkTrust
SplunkTrust

Can you confirm if

| tstats values(sourcetype) where index=myindex

and

index=myindex 
| stats count by sourcetype

produce identical-looking sourcetypes for the same time range?

If they are giving different sourcetypes for that one 'myindex' example you gave that's odd. Could it be that the sourcetype 

sourcetype="WinEventLog:System"

has some leading or trailing characters, e.g. a space?

If you do the search with wildcards

index=myindex sourcetype="*WinEventLog:System*"

does that also give no results?
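
The two lists can also be compared in a single search, which makes the mismatches easier to spot than eyeballing two result sets. This is a rough sketch, assuming the same time range is used for both halves and that the subsearch stays within the default subsearch limits; origin and seen_by are made-up field names for illustration.

```spl
| tstats count where index=myindex by sourcetype
| eval origin="tstats"
| append
    [ search index=myindex
      | stats count by sourcetype
      | eval origin="search" ]
| stats values(origin) as seen_by by sourcetype
| where mvcount(seen_by)=1
```

Each remaining row is a sourcetype that only one of the two methods reports, and seen_by says which one.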

HeavyHats
Explorer

I can confirm that the first two queries do not produce identical lists. There is about 90% overlap, but each list contains entries which are absent from the other list. 

I can also confirm that there is no leading/trailing whitespace. The last query produces no results. 

0 Karma

bowesmana
SplunkTrust
SplunkTrust

Other than a permissions/security issue that is constraining what you can see in one variant as opposed to the other, I don't have any other suggestions 😞

 

0 Karma

Azeemering
Builder

I believe this is because the tstats command performs statistical queries on indexed fields in tsidx files.

Some time ago, in version 5.0.1, the Windows TA changed its WinEventLog sourcetypes.

See: Sourcetype changes for WinEventLog data 
This means that all the old sourcetypes that used to exist (and were indexed!) were named, for example, WinEventLog:System, WinEventLog:Application, or WinEventLog:Security. They have all been renamed to WinEventLog by the newer version of the Windows TA.

But since they were indexed in the past, they still exist in the metadata. And since tstats only looks at the indexed metadata, you see these old sourcetypes appear.
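
One way to test this theory is to ask tstats when each sourcetype was first and last indexed in the index. A minimal sketch, assuming it is run over a wide time range (for example All time) and that myindex stands in for the real index name; first_seen and last_seen are illustrative field names.

```spl
| tstats count min(_time) as first_seen max(_time) as last_seen where index=myindex by sourcetype
| convert ctime(first_seen) ctime(last_seen)
```

If last_seen for WinEventLog:System is recent, old data alone would not explain why the sourcetype appears.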

0 Karma

HeavyHats
Explorer

If I'm limiting my search to the past 24 hours though, shouldn't tstats respect the time limit and not evaluate older data? 

0 Karma

PickleRick
SplunkTrust
SplunkTrust

It doesn't work that way. A value of an indexed field is just a value. If you extract a different value of _time for each event, you don't expect the old ones to get "renamed", do you?

So that's not the cause.

Either there is some dynamic renaming happening at search time, as @VatsalJagani suggested, or the index file is simply corrupted and for some reason "overlaps" source with sourcetype (or vice versa).

0 Karma