Getting Data In

Adding custom data from file

Be_JAR
Path Finder

Hi all.

I am ingesting data into Splunk Enterprise from a file. This file contains a lot of information, and I would like Splunk to make each event start at ##start_string

and end at the next ##end_string line.
Within these blocks there are different fields of the form ##key = value.
Here is an example of the file:

 

…..
##start_string
##Field = 1
##Field2 = 12
##Field3 = 1
##Field4 =
##end_string
.......
##start_string
##Field = 22
##Field2 = 12
##Field3 = field_value
##Field4 =
##Field8 = 1
##Field7 = 12
##Field6 = 1
##Field5 =
##end_string
……

I have tried creating this sourcetype (with different regular expressions), but it creates only one event containing all the lines:

DATETIME_CONFIG =
LINE_BREAKER = ([\n\r]+)##start_string

##LINE_BREAKER = ([\n\r]+)##start_string\s+(?<block>.*?)\s+##end_string
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = true
category = Custom
description = Format custom logs

pulldown_type = 1
disabled = false

How should I approach this case?
Any ideas or help would be welcome.

Thanks in advance


marnall
Builder

I recommend setting SHOULD_LINEMERGE to false so that Splunk does not try to re-combine your events together.


Be_JAR
Path Finder

I tried it, but it didn't work.
Splunk does not create the events with the information between the delimiters:

## MONIT_DOC_START
....
.....
## MONIT_DOC_END

 

Any ideas?
I have also tried this (unsuccessfully):

[screenshot of the attempted configuration]

 

BR


marnall
Builder

It should work. Here is how I have it set up:

log sample: (at /tmp/hashlogs)

##start_string
##time = 1711292017
##Field2 = 12
##Field3 = field_value
##Field4 = somethingelse
##Field8 = 1
##Field7 = 12
##Field6 = 1
##Field5 =
##end_string
##start_string
##time = 1711291017
##Field2 = 12
##Field3 = field_value2
##Field4 = somethingelse3
##Field8 = 14
##Field7 = 12
##Field6 = 15
##Field5 =
##end_string
##start_string
##time = 1711282017
##Field2 = 12
##Field3 = asrsar
##Field4 = somethingelsec
##Field8 = 1
##Field7 = 12
##end_string

 

inputs.conf (on forwarder machine)

[monitor:///tmp/hashlogs]
index=main
sourcetype=hashlogs

props.conf (on indexer machine)

[hashlogs]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\n\r]+)##start_string
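To sanity-check how that LINE_BREAKER splits the sample outside Splunk, here is a rough Python approximation (my own sketch, not Splunk's actual pipeline; Splunk breaks events where the first capture group matches and consumes the captured newlines, so a lookahead split mimics that):

```python
import re

# Two of the sample blocks above, joined as raw input text.
sample = (
    "##start_string\n"
    "##time = 1711292017\n"
    "##end_string\n"
    "##start_string\n"
    "##time = 1711291017\n"
    "##end_string"
)

# LINE_BREAKER = ([\n\r]+)##start_string: Splunk breaks where the first
# capture group matches; "##start_string" itself stays with the next event.
events = re.split(r"[\n\r]+(?=##start_string)", sample)
for e in events:
    print(e)
    print("---")
```

Each ##start_string…##end_string block ends up as its own event.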

 

Result: (search is index=* sourcetype=hashlogs)

[screenshot of the search results]

 

 


Be_JAR
Path Finder

Thank you @marnall 

However, in my log (which has some lines between an event's ##end_string and the next ##start_string), something goes wrong:

Some info
extra info
##start_string
##time = 1711292017
##Field2 = 12
##Field3 = field_value
##Field4 = somethingelse
##Field8 = 1
##Field7 = 12
##Field6 = 1
##Field5 =
##end_string
Some info
more info
extra info
##start_string
##time = 1711291017
##Field2 = 12
##Field3 = field_value2
##Field4 = somethingelse3
##Field8 = 14
##Field7 = 12
##Field6 = 15
##Field5 =
##end_string
SOme info
more info
info
extra info
##start_string
##time = 1711282017
##Field2 = 12
##Field3 = asrsar
##Field4 = somethingelsec
##Field8 = 1
##Field7 = 12
##end_string
Some info
extra info

 

 

Any idea how to delimit events between the markers?
##start_string
##end_string

 

BR
JAR

 


isoutamo
SplunkTrust

It would be nice to get the real log format in the first place, not after a first version has already been resolved!

Do all valid log rows start with ##? If so, you should add a transforms.conf stanza that drops the other lines. If there is no way to recognise valid lines without tracking ##start_string and ##end_string, then you probably need to write some preprocessing or your own modular input. Splunk's normal input processing handles lines one by one; it cannot keep track of surrounding lines or of whether something is happening between them.


Be_JAR
Path Finder

Thank you very much for the clarification.
Yes, valid rows start with ##, and each event is what is inside a ##start_string / ##end_string block.

From the UI, is there any way to do this first step and remove the rows that do not start with ##?


BR
JAR


isoutamo
SplunkTrust

Then I propose you use transforms.conf and send those lines to the null queue. There are quite a few examples on the community site and in the docs. See e.g. https://community.splunk.com/t5/Getting-Data-In/sending-specific-events-to-nullqueue-using-props-amp... and just replace the REGEX to match your lines (or the beginning of your lines).
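As a rough sketch of that approach (untested; the sourcetype and stanza names are placeholders, and note that nullQueue discards whole events, so this only works if the junk lines arrive as their own events):

```
# props.conf
[your_sourcetype]
TRANSFORMS-drop_nonhash = drop_nonhash_lines

# transforms.conf
[drop_nonhash_lines]
REGEX = ^(?!##)
DEST_KEY = queue
FORMAT = nullQueue
```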

SEDCMD does almost the same thing, but it only clears the line's content instead of removing it, so your events are left with "empty" lines rather than having those lines removed.


PickleRick
SplunkTrust

The difference is that with SEDCMD you can "blank" part of a multiline event. If you send to nullQueue, you'll discard the whole event.


isoutamo
SplunkTrust
Exactly. So you must work out which of the two cases applies and, based on that, choose SEDCMD or transforms.

Be_JAR
Path Finder

Hi .

Trying with:

Field transformations:

 

[screenshot of the field transformation settings]

 

 

And adding them to sourcetype:

 

[screenshot of the sourcetype settings]

 

But it does not work:

[screenshot of the result]

Is there anything wrong?

 

Thank you all!!

 

BR


PickleRick
SplunkTrust

You can use SEDCMD to remove all lines not beginning with two hashes.

Something like

SEDCMD-remove-unhashed = s/^([^#]|#[^#]).*$//

(Haven't tested it though, might need some tweaking).
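To check the pattern outside Splunk, here is a quick Python equivalent of that sed substitution (my own sketch; re.MULTILINE mirrors sed's per-line anchoring of ^ and $):

```python
import re

# Hypothetical event: a ##-block plus trailing junk lines that ended up
# in the same event after line breaking.
event = """##start_string
##Field2 = 12
##end_string
Some info
extra info"""

# Python equivalent of: SEDCMD-remove-unhashed = s/^([^#]|#[^#]).*$//
# Blanks any line that does not begin with "##"; the now-empty lines
# themselves remain, as noted earlier in the thread.
cleaned = re.sub(r"^([^#]|#[^#]).*$", "", event, flags=re.MULTILINE)
print(cleaned)
```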


isoutamo
SplunkTrust
Have you tried escaping the # characters, like \# ?