Getting Data In

Tailing a logfile written with a buffered writer

christopherutz
Path Finder

We have some logs being written by a buffered writer. This means that occasionally (depending on when the buffer is flushed) an incomplete event is written to the log and indexed by Splunk.

2010-09-16 00:00:02,874 INFO  Some full log message here
   I am a multi line message that needs
   to be accounted for
2010-09-16 00:00:03,874 INFO  Another full log message here
2010-09-16 00:00:04,874 INFO  Partial log mess

Splunk happily indexes all three messages as individual events. In actuality I want Splunk to index the two complete events and wait for the partial event to be completely written.

2010-09-16 00:00:02,874 INFO  Some full log message here
   I am a multi line message that needs
   to be accounted for
2010-09-16 00:00:03,874 INFO  Another full log message here
2010-09-16 00:00:04,874 INFO  Partial log message completed now and should be indexed
2010-09-16 00:00:05,874 INFO  Onto the next message that may be incomplete

I have been playing around with LINE_BREAKER but cannot get it to work as I desire. The partial message is always indexed as an event. I suspect this is because the partial message will always be at the end of the log file and Splunk breaks the event there. What are my options to make this work? Due to the amount of log traffic we generate, flushing after each log message creates a noticeable degradation in performance.

Genti
Splunk Employee

I was having a similar issue (see Log File Not Breaking Correctly)

This solved my problem:

time_before_close = <integer>    
* Modtime delta required before Splunk can close a file on EOF.   
* Tells the system not to close files that have been updated in past <integer> seconds.   
* Defaults to 3.   
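
For reference, a minimal inputs.conf sketch (the monitored path here is hypothetical) that raises this delta so Splunk keeps the file open longer after hitting EOF:

[monitor:///var/log/myapp/app.log]
time_before_close = 30

With this, Splunk waits until the file has gone 30 seconds without modification before closing it on EOF, which gives the buffered writer time to finish the partial event.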

jrodman
Splunk Employee

What versions are at play?

In 4.0.x, a transient EOF (that Splunk manages to reach) mid-event would generate a done key, which would shut down all the machinery dealing with the data stream, ensuring that the event was cut at that point. This wasn't ideal, but since this pattern of writes is uncommon, and since Splunk's tailing machinery often wasn't on top of all files at all times, it wasn't that large a problem.

In 4.1.x, our tailing machinery has much lower latency, so the strategy changed to waiting for the EOF to stay still for a few seconds before shutting down the machinery. Generally speaking, you should hit this problem less often.

However, if you are getting this problem, it's likely that part of your text is sitting in an application or libc buffer for a significant amount of time, possibly during a brief lull. These buffers can be of significant size (several kilobytes).

I think you just have to adopt the same behavior that most logging applications do: flushing at the end of your events. It's a good practice in general. If your app says "omg, huge bug happening, dying." and then crashes while that data is still in the buffers, you'll never see the message. Flushing should not have a significant cost: it just hands that memory buffer from your app to the OS, rather than waiting for the write to complete.
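
As a rough illustration (the class name and path handling are hypothetical, not from this thread), a minimal Java sketch of a writer that buffers freely within an event but always flushes at the event boundary:

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

public class EventLogWriter {
    private final BufferedWriter out;

    public EventLogWriter(String path) throws IOException {
        // Buffered so individual write() calls stay cheap.
        this.out = new BufferedWriter(new FileWriter(path, true));
    }

    public void logEvent(String event) throws IOException {
        out.write(event);
        out.newLine();
        // Flush at the event boundary: this hands the buffer to the OS,
        // so a tailing reader never sees a partial event. It does not
        // force the bytes to disk (no fsync), so it stays cheap.
        out.flush();
    }
}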

christopherutz
Path Finder

We are running build 4.1.3. It sounds like our application developers are going to work towards an appender that buffers and flushes at the event level rather than the byte level, as suggested by gkanapathy.

christopherutz
Path Finder

We are using the Light Forwarder. The standard forwarder can handle partial events? If I recall correctly, we were using the regular forwarder before but encountered a similar problem (I will need to test again to confirm). Flushing after N events may be an option. I will talk to our dev team and see if they are willing to write a custom log4j appender. Thanks for the tips.

gkanapathy
Splunk Employee

Are you using a Light Forwarder to read this file? This won't happen if you switch to a regular forwarder (and you shouldn't need to mess with LINE_BREAKER). Alternatively, you can try to make sure your buffered writer flushes only at the end of an event. It does not have to flush after every event, just not in the middle of an event. So, e.g., flush after every 200 events instead of every 100kb. This might complicate the code a bit, or else introduce a little wasted memory.
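
A sketch of that every-N-events approach (the class name and batch size are illustrative): accumulate whole events in memory and hand them to the OS only on event boundaries, so a tailing reader can never observe a partial event:

import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;

public class BatchingEventWriter {
    private final Writer out;
    private final StringBuilder batch = new StringBuilder();
    private final int flushEvery;
    private int pending = 0;

    public BatchingEventWriter(String path, int flushEvery) throws IOException {
        this.out = new FileWriter(path, true);
        this.flushEvery = flushEvery;
    }

    public synchronized void logEvent(String event) throws IOException {
        // Buffer by event count, not by bytes, so a flush can only ever
        // land on an event boundary.
        batch.append(event).append(System.lineSeparator());
        if (++pending >= flushEvery) {
            out.write(batch.toString());
            out.flush();
            batch.setLength(0);
            pending = 0;
        }
    }
}

The trade-off is the "little wasted memory" mentioned above: up to flushEvery - 1 events sit in the batch until the next flush (e.g. new BatchingEventWriter(path, 200)).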

christopherutz
Path Finder

The closest I got was:

LINE_BREAKER = [\r\n]+((?=\d+-\d+-\d+\s+\d+:\d+:\d+,\d+\s+[A-Z]+))

This seems to break my multiline events correctly, but the incomplete event that is the final line of the file is still indexed. It seems as if Splunk hits the EOF and terminates the event it's currently parsing.
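
For context, a props.conf stanza like the following (the sourcetype name is hypothetical) is how such a LINE_BREAKER is typically applied; SHOULD_LINEMERGE = false is the usual companion setting so that the regex alone controls event boundaries:

[buffered_app_log]
SHOULD_LINEMERGE = false
LINE_BREAKER = [\r\n]+((?=\d+-\d+-\d+\s+\d+:\d+:\d+,\d+\s+[A-Z]+))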

hulahoop
Splunk Employee

Would you post the linebreaking configs you've tried?
