Getting Data In

fully reindexing a file every time the datestamp changes

mjones414
Contributor

I have several automated data pulls that write files into directories I want Splunk to index. These files are completely overwritten at least once a night, and sometimes more often depending on operational conditions outside my control. I need Splunk to reindex each file every time its datestamp (modification time) changes, and that does not appear to be working. Current props.conf configuration:

[source:: /data/ridiculi/all_group/ridiculi.*]
CHECK_METHOD = modtime

[ridiculi:group]
DATETIME_CONFIG = CURRENT
FIELD_DELIMITER = ":"
FIELD_NAMES = gid,status,gidnumber,members
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
category = Operating System
description = Ridiculi
disabled = false
pulldown_type = true

inputs.conf:

[monitor:///data/ridiculi/all_group/ridiculi.wanker]
disabled = false
sourcetype = ridiculi:group
index=ridiculi

1 Solution

pdaigle_splunk
Splunk Employee

This is effectively a log file rotation scenario, where you need the crcSalt setting and/or the initCrcLength attribute:

https://docs.splunk.com/Documentation/Splunk/7.2.5/Data/Howlogfilerotationishandled

The winning combination was CHECK_METHOD together with crcSalt set to something like REINDEX_ALWAYS. Because the overwritten file contains the same or nearly the same data, its CRC checksum and size are most likely unchanged, so Splunk skips the file even though its modification date and time have changed.
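
A minimal sketch of that combination, assuming the monitored path and sourcetype from the question. The crcSalt value is an arbitrary literal string, as suggested above; the only special value documented for crcSalt is <SOURCE>, which mixes the file's full path into the CRC. The initCrcLength line is optional and its value is purely illustrative. The source stanza is written without a space after source::, following the documented [source::<source>] form, which is an assumption about what was intended in the question.

```ini
# props.conf (on the instance that monitors the file)
[source::/data/ridiculi/all_group/ridiculi.*]
# Detect changes by file modification time rather than by checksum
CHECK_METHOD = modtime

# inputs.conf (same instance)
[monitor:///data/ridiculi/all_group/ridiculi.wanker]
disabled = false
sourcetype = ridiculi:group
index = ridiculi
# Arbitrary literal salt, per the suggestion above (illustrative value)
crcSalt = REINDEX_ALWAYS
# Optional: checksum more than the default 256 bytes at the start of the file
initCrcLength = 1024
```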

0 Karma

mjones414
Contributor

Thank you for posting this Paul!

0 Karma

pdaigle_splunk
Splunk Employee

You're welcome! Just sharing in case others in the community run across the same issue. 🙂

0 Karma

mjones414
Contributor

Just a bump on this to see if there are any more ideas. I'm about to open a case with Splunk support, as it seems what's here should be sufficient.

0 Karma

adonio
SplunkTrust

The answer is between the lines. Where is the props.conf that contains:
[source:: /data/ridiculi/all_group/ridiculi.*]
CHECK_METHOD = modtime
It is supposed to be on the instance that collects the data.

0 Karma

mjones414
Contributor

In my situation that is true. The /data directory is monitored by the Splunk search head, and the props.conf is also on the Splunk search head. There is no forwarder involved in this particular data input.

0 Karma

mjones414
Contributor

Permissions on all files are 644. Permissions on the directory are 2644. The filesystem is NFSv3.

0 Karma

harsmarvania57
SplunkTrust

Are you running a Universal Forwarder to read the /data/ridiculi/all_group/ridiculi.wanker file? If so, the props.conf below will not work on the Universal Forwarder.

[ridiculi:group]
DATETIME_CONFIG = CURRENT
FIELD_DELIMITER = ":"
FIELD_NAMES = gid,status,gidnumber,members
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
category = Operating System
description = Ridiculi
disabled = false
pulldown_type = true

You need to put the props.conf configuration above on the first Splunk Enterprise instance downstream of the Universal Forwarder, because parsing happens on a full Splunk instance, not on the UF.

0 Karma

mjones414
Contributor

I'm actually running this on the Splunk distributed search head in an app context. The props.conf should be in a distribution bundle that goes to the indexers. The search head sends its output to an output queue of indexers. Is there something else missing from that config that I need?

0 Karma

harsmarvania57
SplunkTrust

On which instance are you monitoring the /data/ridiculi/all_group/ridiculi.wanker file? The search head or a Universal Forwarder?

0 Karma

adonio
SplunkTrust

Where is your props.conf? IIRC, the top portion has to be on the forwarder:

[source:: /data/ridiculi/all_group/ridiculi.*]
CHECK_METHOD = modtime

the rest will be on the indexer
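
As a rough sketch, the split described here might look like the following in a forwarder-plus-indexer layout. That layout is an assumption made for illustration, and only two settings from the question's sourcetype stanza are shown.

```ini
# props.conf on the instance that reads the file (the forwarder in this reply)
[source::/data/ridiculi/all_group/ridiculi.*]
CHECK_METHOD = modtime

# props.conf on the instance that parses the data (the indexer in this reply)
[ridiculi:group]
DATETIME_CONFIG = CURRENT
SHOULD_LINEMERGE = false
# ...remaining settings as posted in the question
```

When a single instance both monitors the file and parses it, both stanzas would live in that instance's props.conf.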

0 Karma

mjones414
Contributor

They are in etc/apps/ridiculi/local/ as props.conf and inputs.conf, respectively, on the distributed search head. The /data path is an autofs mount point that the Splunk search head can read (other files under /data are being indexed both from this search head and from indexers as required).

0 Karma

somesoni2
Revered Legend

Could you also post the first few lines of the file?
Also, did you place the props.conf with the [source::...] stanza on the forwarder (the same host where your inputs.conf lives)?

0 Karma

mjones414
Contributor

Sure.

I'm actually running this on the Splunk distributed search head in an app context. The props.conf should be in a distribution bundle that goes to the indexers. The search head sends its output to an output queue of indexers. Is there something else missing from that config that I need?

First few lines from a file:
Admins:NISG:123123:jim,bob,joe
Users:NISG:456456:alpha,whiskey,tango

0 Karma