Getting Data In

Index entire file content as one event

sdkp03
Path Finder

I am trying to index a file and I don't see why the events are being broken. I have tried defining the line breaker setting at both the indexer and forwarder level, as suggested in multiple articles, but with no luck. Is there a way to help me identify what is breaking the events? Or is there any configuration that would override all other settings and ensure that events are not broken? Any help would be much appreciated.

The log file being indexed has content like the below:
2020-03-10T11:20:27.456+1100: 687196.162: [Event1, 0.0207885 secs]
[Parallel Time: 19.8 ms, Workers: 4]
[Worker Start (ms): Min: 687196162.2, Avg: 687196162.3, Max: 687196162.3, Diff: 0.1]
[Ext Scanning (ms): Min: 0.9, Avg: 1.0, Max: 1.0, Diff: 0.1, Sum: 3.9]
[Update RS (ms): Min: 2.4, Avg: 2.4, Max: 2.6, Diff: 0.2, Sum: 9.7]
[Processed Buffers: Min: 3, Avg: 10.5, Max: 21, Diff: 18, Sum: 42]
[Scan RS (ms): Min: 6.8, Avg: 6.9, Max: 6.9, Diff: 0.1, Sum: 27.6]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Object Copy (ms): Min: 9.4, Avg: 9.4, Max: 9.5, Diff: 0.1, Sum: 37.7]
[Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Termination Attempts: Min: 1, Avg: 3.2, Max: 6, Diff: 5, Sum: 13]
[Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
[Worker Total (ms): Min: 19.7, Avg: 19.7, Max: 19.8, Diff: 0.1, Sum: 78.9]
[Worker End (ms): Min: 687196182.0, Avg: 687196182.0, Max: 687196182.0, Diff: 0.0]
[Code Root Fixup: 0.0 ms]
[Code Root Purge: 0.0 ms]
[Clear CT: 0.1 ms]
[Other: 0.8 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 0.2 ms]
[Ref Enq: 0.0 ms]
[Redirty Cards: 0.1 ms]
[Humongous Register: 0.1 ms]
[Humongous Reclaim: 0.0 ms]
[Free CSet: 0.2 ms]
[Eden: 44.0M(44.0M)->0.0B(44.0M) Survivors: 7168.0K->7168.0K Heap: 306.0M(1024.0M)->270.0M(1024.0M)]
[Times: user=0.08 sys=0.00, real=0.02 secs]

2020-03-10T11:20:38.710+1100: 687207.416: [Event2, 0.0204509 secs]

In the Splunk log file there are some warning messages like:
Failed to parse timestamp in first MAX_TIMESTAMP_LOOKAHEAD (128) characters of event. Defaulting to timestamp of previous event (Wed Mar 11 13:42:59 2020).

Expectation: Splunk treats the first field, in the format "2020-03-10T11:20:38.710+1100: 687207.416: ", as the date/timestamp and does not try to interpret other numbers in the event as a date/time.
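For reference, the leading timestamp in this format is exactly 28 characters long and parses cleanly as an ISO-8601-style time with a UTC offset. A minimal Python sanity check (the strptime format here is my assumed equivalent of Splunk's %Y-%m-%dT%H:%M:%S.%3N%z):

```python
from datetime import datetime

line = "2020-03-10T11:20:38.710+1100: 687207.416: [Event2, 0.0204509 secs]"
stamp = line[:28]  # "2020-03-10T11:20:38.710+1100"

# Python's %f (microseconds) accepts the 3-digit fraction that Splunk's
# %3N denotes; %z handles the +1100 offset.
parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%f%z")

print(len(stamp))  # 28 -- which is why MAX_TIMESTAMP_LOOKAHEAD = 28 fits
print(parsed.isoformat())
```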

As for the setup, we have a UF sending logs to an indexer.


FrankVl
Ultra Champion

Is this bit in the same file: 2020-03-10T11:20:38.710+1100: 687207.416: [Event2, 0.0204509 secs], and should that go into a second event? Meaning: you don't actually want to ingest the whole file as a single event?

Try this in props.conf for the relevant sourcetype on your indexers:

TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N%z
MAX_TIMESTAMP_LOOKAHEAD = 28
TRUNCATE = 0
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)\d+-\d+-\d+T
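Splunk breaks events at the first capture group of LINE_BREAKER, so the captured newlines are discarded and the date that follows starts the next event. A rough Python emulation of that behavior on the sample above, using a lookahead so the date is not consumed (my sketch, not how Splunk is actually implemented):

```python
import re

sample = (
    "2020-03-10T11:20:27.456+1100: 687196.162: [Event1, 0.0207885 secs]\n"
    "[Parallel Time: 19.8 ms, Workers: 4]\n"
    "[Times: user=0.08 sys=0.00, real=0.02 secs]\n"
    "\n"
    "2020-03-10T11:20:38.710+1100: 687207.416: [Event2, 0.0204509 secs]\n"
)

# LINE_BREAKER = ([\r\n]+)\d+-\d+-\d+T means: break on newlines that are
# immediately followed by a yyyy-mm-ddT date. Emulate with a lookahead so
# the date stays at the start of the next event.
events = re.split(r"[\r\n]+(?=\d+-\d+-\d+T)", sample.strip())

print(len(events))  # 2
```

The interior lines (which all start with "[") never match the date pattern after a newline, so they stay attached to the first event.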

woodcock
Esteemed Legend

Try these settings on your Indexer or Heavy Forwarder:

TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N%z
MAX_TIMESTAMP_LOOKAHEAD = 28
TRUNCATE = 0
SHOULD_LINEMERGE = false
LINE_BREAKER = (?!)
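The (?!) here is an empty negative lookahead that can never match, so LINE_BREAKER never fires and (together with SHOULD_LINEMERGE = false and TRUNCATE = 0) the whole file stays one event. A quick Python illustration of the regex itself:

```python
import re

# (?!) asserts "not followed by the empty string", which is always false,
# so this pattern matches nowhere in any input.
never = re.compile(r"(?!)")

print(never.search("2020-03-10T11:20:27.456+1100: [Event1]"))  # None
print(re.split(r"(?!)", "no break points here"))  # ['no break points here']
```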

FrankVl
Ultra Champion

Can you please share a sample of the log file, a screenshot or similar of how it shows up broken in Splunk, and your current config? Also some info on the setup: is it just a UF sending to an indexer, or is there a heavy forwarder involved?


sdkp03
Path Finder

Apologies, I had not shared enough details. I have added the required details to the question.


gcusello
Legend

Hi @sdkp03,
Without a sample of your logs, I can only guess that they contain some numbers or dates that Splunk parses as timestamps, so it splits your file into different events. Please try setting the TIME_PREFIX option in your props.conf.

If you could share a sample of your logs, I could help you better!

Ciao.
Giuseppe


sdkp03
Path Finder

Apologies, I have shared the log in the question now. Yes, what you are suspecting is true in my case. As for TIME_PREFIX, I am not sure what I should use, considering each event starts with a timestamp in the yyyy-mm-dd pattern.


gcusello
Legend

Hi @sdkp03,
in your props.conf set:

[your_sourcetype]
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N%z
SHOULD_LINEMERGE = true
MAX_TIMESTAMP_LOOKAHEAD = 28

Ciao.
Giuseppe


sdkp03
Path Finder

Thanks, this worked like magic 🙂
