Getting Data In

Entire file contents as a single event

keiche
Explorer

I would like to know how to set up Splunk to monitor a local input directory so that each new file added to it (the files contain multiple lines) is ingested as exactly one event containing all of the file's contents. I can manipulate the file data to add line breaks if that is the solution.

2 Solutions

gkanapathy
Splunk Employee

Just set the LINE_BREAKER for the sourcetype to something that will never match, such as (?!). You will probably also need to increase MAX_EVENTS (the default is only 500 lines; there isn't a hard limit that I know of) and set TRUNCATE to something larger than the biggest file size (or I think 0 is unlimited).
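
As a rough illustration, the settings described above might be combined in props.conf along these lines. This is only a sketch: the sourcetype name whole_file and the MAX_EVENTS value are placeholders invented for the example, and the pattern is written with the extra pair of parentheses that a later reply in this thread reports is needed on Splunk 4.3.

```
# props.conf
# "whole_file" is a hypothetical sourcetype name used only for this sketch.
[whole_file]
# A pattern that can never match, so no line break is ever found.
# The outer parentheses supply the capture group.
LINE_BREAKER = ((?!))
# 0 turns truncation off; alternatively use a byte count larger than the biggest file.
TRUNCATE = 0
# Raised well above the default, as the answer suggests; the exact value is arbitrary.
MAX_EVENTS = 200000
```

The monitor stanza in inputs.conf would then need to assign this sourcetype to the directory being watched.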


ftk
Motivator

I use a regular monitor stanza combined with a custom sourcetype to index full files of interest.

For example, I use the following monitor stanza (inputs.conf) to index changes to my Splunk configs:

[monitor://C:\Program Files\Splunk\etc\...\*.conf]
followTail = False
sourcetype = splunk_config
index = my_custom_index
disabled = false

and define the splunk_config sourcetype in props.conf as follows:

[splunk_config]
BREAK_ONLY_BEFORE=goblygook
MAX_EVENTS=200000
DATETIME_CONFIG = NONE
CHECK_METHOD = modtime
pulldown_type = true
LEARN_MODEL = false

This combination will index all files under splunk\etc ending in .conf. BREAK_ONLY_BEFORE=goblygook tells Splunk not to break the event (in this case, the conf file) until it encounters "goblygook", which shouldn't appear in any of your files.


lguinn2
Legend

Update: Check out this answer to the same question: "Each File as One Single Splunk Event".

[mysinglefilesourcetype]
SHOULD_LINEMERGE = false
LINE_BREAKER = ((*FAIL))
TRUNCATE = 99999999

I think this is newer information.

amfranz
Engager

Regarding gcoles's findings about the first approach not working with Splunk 4.3:

LINE_BREAKER = (?!)

This approach still works in Splunk 4.3 with a minor modification. The expression needs to be surrounded by an additional pair of parentheses:

LINE_BREAKER = ((?!))

I think this is because Splunk 4.3 requires the regular expression to contain at least one capture group, and earlier Splunk versions did not enforce this. "(?!)" is merely a lookahead expression; the additional pair of parentheses adds the capture group.

gcoles
Communicator

As a note to anyone else who may be using this page as a reference, I had been using the LINE_BREAKER directive to do this (as outlined by gkanapathy), but this stopped working when we upgraded our indexers to 4.3. I had to change our props.conf entries for these kinds of inputs to use the method shown by ftk. I verified that the first method fails whether using lightweight or heavy forwarders, as long as the machine that is processing the props.conf for the sourcetype is 4.3.


Yashar_Shah
New Member

This isn't enough if you forward data to the indexer. It is only useful for local file monitoring.


lguinn2
Legend

Do you want Splunk to create one event per file, or do you want it to create one event per line?


adobrzeniecki
Path Finder

Is this still good in 2021?
