Delay in indexing data

agoyal
Builder

Hi,

We are monitoring a directory recursively, and sometimes indexing of the data in Splunk is delayed a lot (12+ hours).

Universal Forwarder
[monitor:///net/hp707srv/hp707srv1/apps/QCST_MIC_v3.1.44_MASTER/logs.../.log]
disabled = false
host = MIC_v44
index = mlc_live
sourcetype = GC11_RAW
crcSalt =
whitelist = .*gc.log$|.*gc.
.log$
blacklist=logs_|fixing_|tps-archives

In the example (screenshot), the log is written every hour, but the events were not indexed straight away: more than 12 hours of data was indexed at the same time, 12:01.

Throughput is set to unlimited because this forwarder is monitoring many other files.

limits.conf

  [thruput]
    maxKBps = 0

PavelP
Motivator

Hello @agoyal ,

To check whether the delay is caused by monitoring, i.e. the UF not detecting that the file has changed, run this command on the UF and compare the file size reported by ls with the file position and file size in the output. Also check percent (should be 100%) and type (should be "finished reading"). I can imagine such a delay if the log is located on a network share (NFS?) and file changes aren't being detected correctly.

/opt/splunk/bin/splunk _internal call /services/admin/inputstatus/TailingProcessor:FileStatus

sample output:

            <s:key name="/net/folder/file.log">
              <s:dict>
                <s:key name="file position">25008227</s:key>
                <s:key name="file size">25008227</s:key>
                <s:key name="percent">100.00</s:key>
                <s:key name="type">finished reading</s:key>
              </s:dict>
            </s:key>
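
If the output is long, the comparison can be scripted. This is only a minimal sketch, assuming the full XML response has been saved into a string. `unread_files` is a made-up helper name, the second entry in the sample is invented for illustration, and the namespace URI in the sample is a placeholder (the code matches elements by local name only, so the real URI does not matter):

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def unread_files(xml_text):
    """Return (path, position, size, type) for files that are not fully read."""
    root = ET.fromstring(xml_text)
    pending = []
    for entry in root.iter():
        if _local(entry.tag) != "key":
            continue
        # A file entry is a key whose direct child is a dict of status fields.
        dicts = [child for child in entry if _local(child.tag) == "dict"]
        if not dicts:
            continue
        fields = {
            child.get("name"): (child.text or "").strip()
            for child in dicts[0]
            if _local(child.tag) == "key"
        }
        if "file position" not in fields or "file size" not in fields:
            continue
        position = int(fields["file position"])
        size = int(fields["file size"])
        status = fields.get("type", "")
        if position < size or status != "finished reading":
            pending.append((entry.get("name"), position, size, status))
    return pending


# Hypothetical sample: first entry copied from the output above,
# second entry invented to show a file that is still behind.
SAMPLE = """<root xmlns:s="urn:placeholder">
  <s:key name="/net/folder/file.log">
    <s:dict>
      <s:key name="file position">25008227</s:key>
      <s:key name="file size">25008227</s:key>
      <s:key name="percent">100.00</s:key>
      <s:key name="type">finished reading</s:key>
    </s:dict>
  </s:key>
  <s:key name="/net/folder/other.log">
    <s:dict>
      <s:key name="file position">100</s:key>
      <s:key name="file size">500</s:key>
      <s:key name="percent">20.00</s:key>
      <s:key name="type">open file</s:key>
    </s:dict>
  </s:key>
</root>"""

print(unread_files(SAMPLE))
```

Any path it returns is worth comparing against the size that ls shows on the share at the same moment.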

agoyal
Builder

I tried your command on the forwarder but it's not working. Even though splunkd is up, it says it is down. Am I missing something here?

bash$ bin/splunk status
splunkd is running (PID: 18089).
splunk helpers are running (PIDs: 18090).
mx25089vm autoengine /data/apps/splunkforwarder_MIC_v44_ORCH1/
bash$ bin/splunk _internal call /services/admin/inputstatus/TailingProcessor:FileStatus
QUERYING: 'https://127.0.0.1:9089/services/admin/inputstatus/TailingProcessor:FileStatus'
This command [GET /services/admin/inputstatus/TailingProcessor:FileStatus] needs splunkd to be up, and splunkd is down.

PavelP
Motivator

It seems you have a custom Splunk forwarder install. Is it a Docker container or a custom tgz install? I see a non-default port, 9089.

Here is my output on a UF:

splunk@debian:~$ /opt/splunkforwarder/bin/splunk _internal call /services/admin/inputstatus/TailingProcessor:FileStatus
QUERYING: 'https://127.0.0.1:8089/services/admin/inputstatus/TailingProcessor:FileStatus'
Your session is invalid.  Please login.
Splunk username: admin
Password:
HTTP Status: 200.
Content:
.....

agoyal
Builder

@PavelP: We used to have a full Splunk install and a forwarder on the same box, so we may have changed the forwarder's management port to 9089 to avoid a conflict. The forwarder is the original Universal Forwarder. I tried changing the port back to 8089 but it is still not working 😞


PavelP
Motivator

Something is still wrong. Why does Splunk think the process isn't running?
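
One thing that might narrow it down is checking whether anything accepts TCP connections on the management port the CLI is querying. A rough sketch, assuming the ports are the 8089 and 9089 mentioned in this thread; `port_is_open` is just an illustrative helper, not a Splunk tool:

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports taken from this thread; adjust to whatever the forwarder uses.
    for port in (8089, 9089):
        print(port, port_is_open("127.0.0.1", port))
```

If neither port answers while `splunk status` reports a running PID, the CLI and the running splunkd probably disagree about the management port. That is only a guess on my side.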


skalliger
SplunkTrust

Can you show us your props.conf and transforms.conf stanzas for the corresponding sourcetype? The _time seems to be extracted correctly. What does the Monitoring Console say? Any filled-up queues in your environment? What's the load on the indexers? Any Heavy Forwarders in between?

Skalli

agoyal
Builder

@skalliger: thanks for the reply.

props.conf
Data is first ingested as GC11_RAW, and only the relevant events are transferred to the GC11 sourcetype.

[GC11]
disabled = false
SHOULD_LINEMERGE = false
TIME_PREFIX = ^\[
LINE_BREAKER = ([\r\n]+)
MAX_TIMESTAMP_LOOKAHEAD = 100
EXTRACT-PAUSE_FULL = (\[(?P\d*-\d*-\d\dT\d\d:\d\d:\d\d.\d*\+\d*)\])?(\[(?P\d+.\d+)s])?.*GC\((?P\d*)\) (?PPause Full)(.*) (?P[0-9]*)M->(?P[0-9]*)M\((?P[0-9]*)M\) (?P[0-9]*.[0-9]*)ms

[GC11_RAW]
LINE_BREAKER = ([\r\n]+)
SHOULD_LINEMERGE = false
NO_BINARY_CHECK = true
category = Custom
TRANSFORMS-sourcetye_routing = JAVA_GC11_sourcetye_routing

transforms.conf
[JAVA_GC11_sourcetye_routing]
DEST_KEY = MetaData:Sourcetype
REGEX = GC\(\d*\)|Using G1
FORMAT = sourcetype::GC11
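
To illustrate which events this routing rule picks up, here is a small sketch that applies the same REGEX with Python's `re` module. The sample log lines are invented for illustration and are not taken from our real logs, and `routed_sourcetype` is only a stand-in for what the transform does:

```python
import re

# Same pattern as REGEX in the transforms.conf stanza above.
ROUTING_REGEX = re.compile(r"GC\(\d*\)|Using G1")


def routed_sourcetype(event, default="GC11_RAW"):
    """Return 'GC11' if the pattern matches anywhere in the event, else the default."""
    return "GC11" if ROUTING_REGEX.search(event) else default


# Hypothetical events in the style of a unified JVM GC log.
SAMPLE_EVENTS = [
    "[2019-10-01T12:00:00.123+0000][1.234s] GC(5) Pause Full (System.gc()) 100M->50M(200M) 123.456ms",
    "[2019-10-01T12:00:00.001+0000][0.010s] Using G1",
    "[2019-10-01T12:00:01.000+0000][2.000s] Heap region size: 1M",
]

for event in SAMPLE_EVENTS:
    print(routed_sourcetype(event), "<-", event)
```

The first two lines would be rewritten to GC11 and the third would stay in GC11_RAW.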

There is no heavy forwarder in between, and there is no unusual activity on the Monitoring Console. The data volume is the same as what we usually get. _time is correct because it is extracted from the logs. Only indexing is lagging.
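
To put a number on the lag, the difference between index time and event time can be computed per event. A minimal sketch, assuming pairs of `_time` and `_indextime` (both epoch seconds) have been exported from a search; the sample values are made up to mirror the pattern described above (hourly events, all indexed at 12:01), and both function names are hypothetical:

```python
def indexing_lag_hours(event_time, index_time):
    """Lag between event timestamp and index time, in hours (inputs in epoch seconds)."""
    return (index_time - event_time) / 3600.0


def delayed_events(rows, threshold_hours=1.0):
    """Return (event_time, index_time, lag_hours) for rows lagging more than the threshold."""
    delayed = []
    for event_time, index_time in rows:
        lag = indexing_lag_hours(event_time, index_time)
        if lag > threshold_hours:
            delayed.append((event_time, index_time, lag))
    return delayed


# Hypothetical data: one event per hour from 00:00 to 11:00, all indexed at 12:01.
BASE = 1_570_000_000  # arbitrary epoch used as "midnight" for the example
ROWS = [(BASE + hour * 3600, BASE + 12 * 3600 + 60) for hour in range(12)]

for event_time, index_time, lag in delayed_events(ROWS):
    print(f"event at +{(event_time - BASE) // 3600:02d}h lagged {lag:.2f} h")
```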
