I would like to know how to set up Splunk to monitor a local input directory so that each new file added to it (the files contain multiple lines) is ingested as exactly one event containing all of the file's contents. I am able to manipulate the file data to add line breaks if that is the solution.
Just set the LINE_BREAKER for the sourcetype to something that will never match, such as (?!). You will probably also need to increase MAX_EVENTS (the default is only 256 lines; there isn't a hard limit that I know of) and set TRUNCATE to something larger than the biggest file size (or I think 0 is unlimited).
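As a rough sketch, the three settings named above might be combined in props.conf as follows. The stanza name my_whole_file_sourcetype and the MAX_EVENTS value are placeholders I picked for illustration, not values from the answer. The extra pair of parentheses around (?!) follows the later reply in this thread about Splunk 4.3.

```ini
# props.conf (sketch; the stanza name is a hypothetical sourcetype)
[my_whole_file_sourcetype]
# A pattern that can never match, so no event boundary is ever found.
# The outer parentheses supply a capture group.
LINE_BREAKER = ((?!))
# Placeholder value; pick something above the line count of your largest file.
MAX_EVENTS = 200000
# 0 turns off truncation; alternatively use a byte count larger than your biggest file.
TRUNCATE = 0
```

Treat this as a starting point and test it against a sample file before relying on it.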
I use a regular monitor stanza combined with a custom sourcetype to index full files of interest.
For example, I use the following monitor stanza to index changes to my Splunk configs (inputs.conf):
[monitor://C:\Program Files\Splunk\etc\...\*.conf]
followTail = False
sourcetype = splunk_config
index = my_custom_index
disabled = false
and define the splunk_config sourcetype in props.conf as such:
[splunk_config]
BREAK_ONLY_BEFORE=goblygook
MAX_EVENTS=200000
DATETIME_CONFIG = NONE
CHECK_METHOD = modtime
pulldown_type = true
LEARN_MODEL = false
This combination will index all files under Splunk\etc ending in .conf. BREAK_ONLY_BEFORE=goblygook basically tells Splunk not to break the event (in this case, the .conf file) until it encounters "goblygook", which shouldn't appear in any of your files.
Update: Check out this answer to the same question, "Each File as One Single Splunk Event":
[mysinglefilesourcetype]
SHOULD_LINEMERGE = false
LINE_BREAKER = ((*FAIL))
TRUNCATE = 99999999
I think this is newer information.
Regarding gcoles's finding that the first approach does not work with Splunk 4.3:
LINE_BREAKER = (?!)
This approach still works in Splunk 4.3 with a minor modification. The expression needs to be surrounded by an additional pair of parentheses:
LINE_BREAKER = ((?!))
I think this is because Splunk 4.3 requires the regular expression to contain at least one capture group, and earlier Splunk versions did not enforce this. The "(?!)" is merely a lookahead expression, which captures nothing; the additional pair of parentheses adds the capture group.
As a note to anyone else who may be using this page as a reference, I had been using the LINE_BREAKER directive to do this (as outlined by gkanapathy), but this stopped working when we upgraded our indexers to 4.3. I had to change our props.conf entries for these kinds of inputs to use the method shown by ftk. I verified that the first method fails whether using lightweight or heavy forwarders, as long as the machine that is processing the props.conf for the sourcetype is 4.3.
This approach (the monitor stanza with a custom sourcetype) is not enough if you forward data to the indexer. It is only useful for local file monitoring.
Do you want Splunk to create one event per file, or do you want it to create one event per line?
Is this still good in 2021?