TSTATS Sort by Indexed Time?

chrisboy68
Contributor

Hi, I have an issue where something is consuming license ingestion for a specific sourcetype. Unfortunately, the host is blank in index=_internal source="*license_usage.log*"; however, I do know the index. I cannot find which host is sending data that is indexed today but carries event dates in the past. I have found events sent with past dates to be the cause of this issue before. The only way I have solved it so far is to put HEC into DEBUG, and I don't want to keep doing that.

I want to do something like this:

| tstats count where index=os AND sourcetype=ps groupby host | where <actual ingest time was yesterday but event time was days in the past; list event time and ingest time>

 

Is this possible? Thank you!

Chris


chrisboy68
Contributor

Update: I had to put HEC into DEBUG mode to find my issue.

body_chunk="{"time":1630910039.599,"index":"os","host":"myhost","source":"ps","sourcetype":"ps",

Just a snippet of the event above. The "time" sent in is in the past: the time HEC received in the payload is Sept 6, but the actual time is Sept 24.
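
For reference, a minimal sketch of how the epoch value in the "time" key can be rendered as a readable date in SPL. The field names sent_epoch and sent_time are only illustrative. strftime renders in the time zone of the user running the search; in UTC this value corresponds to 2021-09-06 06:33:59.

```
| makeresults
| eval sent_epoch=1630910039.599
| eval sent_time=strftime(sent_epoch, "%Y-%m-%d %H:%M:%S")
| table sent_epoch sent_time
```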

This is the problem I'm trying to find without having to place an indexer into DEBUG mode. To clarify the question: how would I find this problem in the future with SPL?

 

Thanks

 

Chris


PickleRick
Champion

And are you able to find those events in indexes? What's their _time and _indextime?


chrisboy68
Contributor

I'm stumped. Using this debug data, I ran a search specific to that time. The time picker was set to Sept 6 (the same day as the event time).

index=os sourcetype=ps host=MyHost _time=1630910039.599  | eval indextime=strftime(_indextime,"%Y-%m-%d %H:%M:%S")
|  table host _time indextime

It returns the event time and the index time, which differ; see below.

chrisboy68_0-1632505666105.png

Yeah, so my hunch was right. Now I want to reverse engineer some SPL I can run now to find such issues. So let's just start by using the _index_earliest time modifier and spot check before I do some time math.

index=os sourcetype=ps _index_earliest=-24h 
| eval indextime=strftime(_indextime,"%Y-%m-%d") 
| eval event_time =strftime(_time, "%Y-%m-%d") 
| table host _time indextime event_time

The above does not pull back any data. If I remove the time modifier from the SPL and set the time picker to the last 24 hours, I do get data back.

chrisboy68_2-1632506135906.png

Am I missing something? 

 


PickleRick
Champion

 Quoting the docs:

When using index-time based modifiers such as _index_earliest and _index_latest, your search must also have an event-time window which will retrieve the events. In other words, chunks of events might be ruled out based on the non index-time window as well as the index-time window. To be certain of retrieving every event based on index-time, you must run your search using All Time.
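
Applied to the search above, a minimal sketch of what the quoted passage implies: set an explicit, wide event-time window in the search string and let _index_earliest restrict by index time. The earliest=-90d window is an assumption (it only catches events whose timestamps are at most 90 days old); to be certain, use All Time as the docs say.

```
index=os sourcetype=ps earliest=-90d latest=now _index_earliest=-24h
| eval indextime=strftime(_indextime, "%Y-%m-%d %H:%M:%S")
| eval event_time=strftime(_time, "%Y-%m-%d %H:%M:%S")
| table host event_time indextime
```

The inline earliest and latest take precedence over the time picker, so the result no longer depends on what is selected there.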

chrisboy68
Contributor

Ugh. Seems counterintuitive. I thought time modifiers in SPL overrode the time picker. Anyway, thanks. So I can just run some SPL over the last 30-60 days in hopes of finding my problem in the future. If someone is pushing events dated months or years in the past, it will be a heavy query...


PickleRick
Champion

It's not that counter-intuitive when you think about it. _time is the primary way of limiting the buckets that Splunk searches. _indextime is just a field there. So effectively, limiting by index time is just like adding an additional condition on a field.
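
Because _indextime is stored with each event, the comparison from the original question can probably be done with tstats instead of scanning raw events, which should keep a wide event-time window affordable. The following is an untested sketch: the index and sourcetype come from the question, while the 30-day window, the one-hour span, the 24-hour "recently indexed" cutoff, and the one-day (86400 seconds) lag threshold are all assumptions to adjust.

```
| tstats count max(_indextime) as last_indextime where index=os sourcetype=ps earliest=-30d latest=now by host _time span=1h
| eval lag_seconds=last_indextime-_time
| where last_indextime>=relative_time(now(), "-24h") AND lag_seconds>86400
| eval event_time=strftime(_time, "%Y-%m-%d %H:%M"), index_time=strftime(last_indextime, "%Y-%m-%d %H:%M:%S")
| table host event_time index_time lag_seconds count
| sort - lag_seconds
```

Each row is an hour of event time, per host, that received at least one event in the last 24 hours even though the hour itself is more than a day old. Note that _time here is the start of the hourly bucket, so lag_seconds can be overstated by up to an hour, and count covers every event in that bucket, not only the late ones.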


PickleRick
Champion

Well, unfortunately, host is not a very reliable field on its own. Depends on how you're getting your data and what and how is being parsed from the events.

But to check who is sending "late" data (remember that it might be indeed sent with a great delay or you might simply have highly misconfigured time on the source system or badly set timezone) you can do something like

<<search across your indexes>> | eval delay=_indextime-_time | stats avg(delay) min(delay) max(delay) by source index host

If you have consistently low values, your ingestion process is going smoothly. If you have a delay of several minutes, you have some bottlenecks (though that can be a normal state for some forms of ingestion, such as forwarding data via WEF and reading it from Forwarded Events with a UF). If you have negative values, you have problems with time sync. And so on.
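
A concrete version of the search above, as a sketch only: the index and sourcetype are taken from the question, and the 7-day window and the one-hour (3600 seconds) threshold are assumptions. The aliases avg_delay, min_delay, and max_delay are just illustrative names.

```
index=os sourcetype=ps earliest=-7d latest=now
| eval delay=_indextime-_time
| stats count avg(delay) as avg_delay min(delay) as min_delay max(delay) as max_delay by index sourcetype source host
| where max_delay>3600 OR min_delay<0
| sort - max_delay
```

Keep in mind that the search window is based on event time, so events stamped earlier than the window are never retrieved and cannot show up as a large max_delay. If the late events carry timestamps that are weeks old, the window has to reach back at least that far.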

 


chrisboy68
Contributor

Thanks for the reply. That query did not tell me anything was wrong, all looked fine there. 


codebuilder
SplunkTrust

Your best option is to enable Forwarder Monitoring on the Distributed Monitoring Console (DMC), or on the MC on the index master if you don't have a DMC. That feature provides detailed information on which forwarders are connected, their status, data throughput, and a lot more.

See the following documentation for more:

https://docs.splunk.com/Documentation/Splunk/8.2.2/Updating/Forwardermanagementoverview

The DMC (or MC on master) also provides license utilization information at:

Monitoring Console > Indexing > License Usage - Today or Historic License Usage (those are the two 'canned' options).

Worth noting: with either option, you can hover over the graphs and click "Open in Search" to see the search(es) that power the panels. Those can also give you a good base to build upon or modify to suit your specific needs.


chrisboy68
Contributor

Thanks. Yeah looked at that, even have MetaWoot gathering metrics. None of those will show me ingest if the event date is in the past. Next stop debugging. 


isoutamo
SplunkTrust

One option is to install, e.g., Meta Woot and use it to figure out which source sends those logs.

https://splunkbase.splunk.com/app/2949/

r. Ismo
