When I ran index=X sourcetype=Y cribl_pipe=Z over the last 24 hours and over the last week, the index and sourcetype fields showed 100%.
When I ran the same search over the last 2 weeks and over the last month, the index and sourcetype fields did not show 100%.
I'm searching a single index and a single sourcetype. Over 1 week the field value shows 100%, but over 2 weeks it does not. What could be the issue?
How can I identify raw events that are not indexed (source tcp:9997)?
Okay, I will explain it simply and briefly:
I'm searching index=abc over the last 2 weeks, and the index field shows 98% of events.
If I split the time range and search index=abc for this week and for last week separately, each search shows 100% of events.
So my question is: why does a search over the last 2 weeks show 98% of events when the same period searched in two parts shows 100%? Is it because of the data volume or the bucket policy over 2 weeks, or is there some other reason?
I was hoping to see the SPL you are using and the output of that SPL because I'm having difficulty understanding the English description of what is happening.
Even if there is a delay between _time and when events are processed, the issue here is this: if we search 10 days of raw data and the event count is below 10 lakh (1,000,000), the index field shows 100%. If the event count is above 10 lakh, the index field shows less than 100%. How can I fix this? Do we have to change limits.conf for data arriving on the tcp:9997 port, or is there another way?
Or is there a search command to verify timestamps? I tried index=x sourcetype=y host= source= | convert ctime(_indextime) AS indextime | delay =_indextime-_time | table _time indextime date_zone host source sourcetype _raw but it didn't work.
One of my pet peeves on this forum is postings that state "it didn't work" without explanation.
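The search fails because the delay calculation is missing the eval command, so Splunk treats "delay" as an unknown command. Note also that convert ctime() produces a text string, which cannot be used in arithmetic with an integer such as _time. _indextime is already an integer and so can be subtracted from _time directly. Like this: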
index=x sourcetype=y host=* source=* | eval indextime=_indextime | eval delay =indextime-_time | table _time indextime delay date_zone host source sourcetype _raw
That will show how long it took for an event to be indexed from the time it was generated (sort of - a number of factors can skew this number). It won't, however, "verify timestamps". Splunk does some of that for you and logs it in _internal. Use this search to find the messages.
index=_internal sourcetype=splunkd component=DateParserVerbose log_level=WARN | rex "Context:\s+source=(?<data_source>[^\|]+)\|host=(?<data_host>[^\|]+)\|(?<data_sourcetype>[^\|]+)" | stats count as Count values(data_source) values(data_host) dc(data_source) as "Source Count" dc(data_host) as "Host Count" BY data_sourcetype | sort 0 - Count | rename data_sourcetype as Sourcetype
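Returning to the delay calculation: listing the delay per event gets unwieldy over a 2-week range, so it may help to summarize it instead. The following is only a sketch, using the same placeholder index and sourcetype as above; the output field names (min_delay, avg_delay, max_delay) are made up for illustration.

```spl
index=x sourcetype=y
| eval delay=_indextime-_time
| stats count min(delay) AS min_delay avg(delay) AS avg_delay max(delay) AS max_delay BY host
| sort 0 - max_delay
```

A negative min_delay would suggest events whose timestamps were parsed as being in the future, and a very large max_delay would suggest events landing far from where you expect them in the time range.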
Sorry, I had already tried the search above before posting here and it didn't work.
For index=X source=tcp:9997 sourcetype=Y cribl_pipe=Z over the last 2 weeks, the fields are not showing 100%.
I'm searching a single index, source, sourcetype, and cribl_pipe, but I'm unable to find the raw events that are not indexed.
It's impossible to search for events that are not indexed. Splunk searches its indexes for data so anything not indexed cannot be sought.
Have you tried less-restrictive searches to see if the data is there, but with different attributes? Have you tried different time ranges (including future times) in case event timestamps were mis-interpreted?
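As one possible way to run such a less-restrictive check, the sketch below counts events per day by sourcetype and source over a window that extends into the future. The 30-day window on either side is an arbitrary assumption, and X is the same placeholder index used earlier in the thread.

```spl
| tstats count WHERE index=X earliest=-30d latest=+30d BY sourcetype, source, _time span=1d
```

If events show up under an unexpected sourcetype or source, or on days in the future, that would point to the data being indexed with different attributes or timestamps rather than being missing.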
Try this search to find the events that arrived via TCP without a sourcetype.
index=X source="TCP:9997" NOT sourcetype=*
Go to Settings->Data inputs->TCP to change each input to have a sourcetype.
Even better: stop sending events directly to a Splunk TCP/UDP port. Doing so will cause data loss each time the listening instance restarts. Use a dedicated syslog server or other intermediary process.