TSTATS Sort by Indexed Time?

chrisboy68 · ‎09-23-2021

Hi, I have an issue where I can see something is consuming license ingestion for a specific sourcetype. Unfortunately, the host is blank in index=_internal source="*license_usage.log*"; however, I do know the index. I cannot find which host is sending data that is indexed today but carries event dates in the past. I have found events sent with dates in the past to be the issue before. The only time I have solved it, I had to put HEC into DEBUG, and I don't want to keep doing that.

I want to do something like this:

| tstats count where index=os AND sourcetype=ps groupby host | where <actual ingest time was yesterday but event time was days in the past; list event time and ingest time>

Is this possible? Thank you!

Chris

chrisboy68 · ‎09-24-2021

Update. So I had to put HEC into DEBUG mode to find my issue:

body_chunk="{"time":1630910039.599,"index":"os","host":"myhost","source":"ps","sourcetype":"ps",

Just a snippet of the event above. The "time" sent in is in the past: the timestamp HEC received is Sept 6, but the actual (ingest) time is Sept 24.

This is the problem I'm trying to find without having to place an indexer into DEBUG mode. To clarify the question: how would I find this problem in the future with SPL?
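As a quick check, the epoch value from the snippet can be converted to a readable date without touching any index. This is only a sketch: hec_time, event_time and lag_days are illustrative field names, and the rendered date depends on the time zone of the user running the search.

```
| makeresults
| eval hec_time=1630910039.599
| eval event_time=strftime(hec_time, "%Y-%m-%d %H:%M:%S")
| eval lag_days=round((now() - hec_time) / 86400, 1)
| table hec_time event_time lag_days
```

Here lag_days is simply the distance between the timestamp in the payload and the moment the search runs.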

Thanks

Chris

PickleRick · ‎09-24-2021

And are you able to find those events in indexes? What's their _time and _indextime?

chrisboy68 · ‎09-24-2021

I'm stumped. So using this debug data, I did a search specific to the time. The time picker was set to Sept 6 (same day).

index=os sourcetype=ps host=MyHost _time=1630910039.599
| eval indextime=strftime(_indextime, "%Y-%m-%d %H:%M:%S")
| table host _time indextime

It returns the event, showing the difference between event time and indexed time.

Yeah, so my hunch was right. Now I want to reverse engineer some SPL I can run now to find some issues. So let's just start with the _index_earliest time modifier and spot check before I do some time math.

index=os sourcetype=ps _index_earliest=-24h
| eval indextime=strftime(_indextime, "%Y-%m-%d")
| eval event_time=strftime(_time, "%Y-%m-%d")
| table host _time indextime event_time

The above does not pull back any data. If I remove the time modifier from the SPL and set the time picker to 24 hours, I do get data back.

Am I missing something?

PickleRick · ‎09-24-2021

Quoting the docs:

When using index-time based modifiers such as _index_earliest and _index_latest, your search must also have an event-time window which will retrieve the events. In other words, chunks of events might be ruled out based on the non index-time window as well as the index-time window. To be certain of retrieving every event based on index-time, you must run your search using All Time.
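A minimal sketch of what that means in practice: give the search an explicit, wide event-time window and a narrow index-time window in the same query. The 60-day lookback and the one-day lag threshold are arbitrary assumptions, and lag_days is an illustrative field name. Use All Time instead of earliest=-60d if the events could be older than that.

```
index=os sourcetype=ps earliest=-60d latest=now _index_earliest=-24h _index_latest=now
| eval lag_days=round((_indextime - _time) / 86400, 1)
| where lag_days > 1
| eval indextime=strftime(_indextime, "%Y-%m-%d %H:%M:%S")
| table host _time indextime lag_days
```

Inline earliest and latest take precedence over the time picker, so the event-time window no longer depends on what is selected in the UI.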

chrisboy68 · ‎09-24-2021

Ugh. Seems counterintuitive. I thought time modifiers in SPL overrode the time picker. Anyway, thanks. So I can just run some SPL over the last 30-60 days in the hope of finding my problem in the future. If someone is pushing events dated months or years in the past, it will be a heavy query...

PickleRick · ‎09-24-2021

It's not that counterintuitive if you think about it. _time is the primary way of limiting the buckets that Splunk searches. _indextime is just a field there. So effectively, limiting index time is just like adding an additional condition on a field.

PickleRick · ‎09-23-2021

Well, unfortunately, host is not a very reliable field on its own. It depends on how you're getting your data and on what is parsed from the events, and how.

But to check who is sending "late" data (remember that it might indeed be sent with a great delay, or you might simply have a badly misconfigured clock on the source system or a badly set timezone) you can do something like:

<<search across your indexes>> | eval delay=_indextime-_time | stats avg(delay) min(delay) max(delay) by source index host

If you have consistently low values, your ingestion process is going smoothly. If you see delays of several minutes, you have some bottlenecks (though that can be a normal state for some forms of ingestion, such as forwarding data via WEF and reading it from Forwarded Events with a UF). If you have negative values, you have problems with time sync. And so on.
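Since the original question was about tstats, here is a hedged sketch of the same idea with tstats, which reads only index-time metadata and should be lighter than a raw search. It assumes tstats honors _index_earliest in its where clause the same way search does; if it does not in your version, use the plain search form instead. The field names oldest_event, last_indexed and lag_days, the 60-day window and the one-day threshold are all illustrative.

```
| tstats count min(_time) as oldest_event max(_indextime) as last_indexed
    where index=os sourcetype=ps earliest=-60d _index_earliest=-24h
    by host
| eval lag_days=round((last_indexed - oldest_event) / 86400, 1)
| where lag_days > 1
| sort - lag_days
| eval oldest_event=strftime(oldest_event, "%Y-%m-%d %H:%M:%S")
| eval last_indexed=strftime(last_indexed, "%Y-%m-%d %H:%M:%S")
```

Looking at the largest per-host lag rather than the average should make a handful of backdated events stand out even when most of a host's data arrives on time.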

chrisboy68 · ‎09-23-2021

Thanks for the reply. That query did not tell me anything was wrong, all looked fine there.

codebuilder · ‎09-23-2021

Your best option is to enable Forwarder Monitoring on the Distributed Monitoring Console (DMC) or MC on the index master (if you don't have a DMC). That feature provides all types of detailed information on which forwarders are connected, their status, data throughput, and a lot more.

See the following documentation for more:

https://docs.splunk.com/Documentation/Splunk/8.2.2/Updating/Forwardermanagementoverview

The DMC (or MC on master) also provides license utilization information at:

Monitoring Console > Indexing > License Usage - Today or Historic License Usage (those are the two 'canned' options).

Worth noting if you have or try either option, you can hover over the graphs and click on "open in search" to see the search(es) that power the panels. Those can also help give you a good base for building upon/modifying to suit your specific needs.

----
An upvote would be appreciated and Accept Solution if it helps!

chrisboy68 · ‎09-23-2021

Thanks. Yeah looked at that, even have MetaWoot gathering metrics. None of those will show me ingest if the event date is in the past. Next stop debugging.

isoutamo · ‎09-23-2021

One option is to install e.g. Meta Woot and use it to figure out which source sends those logs.

https://splunkbase.splunk.com/app/2949/

r. Ismo
