Re: Adding custom data from file

Be_JAR · ‎03-21-2024

Hi all.

I am ingesting data into Splunk Enterprise from a file. This file contains a lot of information, and I would like Splunk to make the events start on the ##start_string

and end on the line before the next occurrence ##end_string
Within these blocks there are different fields with the form-> ##key = value
Here is an example of the file:

…..
##start_string
##Field = 1
##Field2 = 12
##Field3 = 1
##Field4 =
##end_string
.......
##start_string
##Field = 22
##Field2 = 12
##Field3 = field_value
##Field4 =
##Field8 = 1
##Field7 = 12
##Field6 = 1
##Field5 =
##end_string
……

I have tried to create this sourcetype (with different regular expressions) but it creates only one event with all the lines:

DATETIME_CONFIG =
LINE_BREAKER = ([\n\r]+)##start_string

##LINE_BREAKER = ([\n\r]+##start_string\s+(?<block>.*?)\s+## end_string
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = true
category = Custom
description = Format custom logs

pulldown_type = 1
disabled = false

How should I approach this case?
Any ideas or help would be welcome

Thanks in advanced

marnall · ‎03-21-2024

I recommend setting SHOULD_LINEMERGE to false so that Splunk does not try to re-combine your events together.

Be_JAR · ‎03-22-2024

I tried it, but it didn't work.
splunk does not create the events with the information between the delimiters:

## MONIT_DOC_START
....
.....
## MONIT_DOC_END

Any ideas?
I have also tried this (unsuccessful) :

BR

marnall · ‎03-24-2024

It should work. Here is how I have it set up:

log sample: (at /tmp/hashlogs)

##start_string
##time = 1711292017
##Field2 = 12
##Field3 = field_value
##Field4 = somethingelse
##Field8 = 1
##Field7 = 12
##Field6 = 1
##Field5 =
##end_string
##start_string
##time = 1711291017
##Field2 = 12
##Field3 = field_value2
##Field4 = somethingelse3
##Field8 = 14
##Field7 = 12
##Field6 = 15
##Field5 =
##end_string
##start_string
##time = 1711282017
##Field2 = 12
##Field3 = asrsar
##Field4 = somethingelsec
##Field8 = 1
##Field7 = 12
##end_string

inputs.conf (on forwarder machine)

[monitor:///tmp/hashlogs]
index=main
sourcetype=hashlogs

props.conf (on indexer machine)

[hashlogs]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\n\r]+)##start_string

Result: (search is index=* sourcetype=hashlogs)

Be_JAR · ‎03-25-2024

Thank you @marnall

However, in my log (which has some line between event end markers and the next event start), something is wrong.

Some info
extra info
##start_string
##time = 1711292017
##Field2 = 12
##Field3 = field_value
##Field4 = somethingelse
##Field8 = 1
##Field7 = 12
##Field6 = 1
##Field5 =
##end_string
Some info
more info
extra info
##start_string
##time = 1711291017
##Field2 = 12
##Field3 = field_value2
##Field4 = somethingelse3
##Field8 = 14
##Field7 = 12
##Field6 = 15
##Field5 =
##end_string
SOme info
more info
info
extra info
##start_string
##time = 1711282017
##Field2 = 12
##Field3 = asrsar
##Field4 = somethingelsec
##Field8 = 1
##Field7 = 12
##end_string
Some info
extra info

Some idea to delimit events between the markers?
##start_string
##end_string

BR
JAR

isoutamo · ‎03-25-2024

It would be nice to get the real log format in the first phase not after 1st version has resolved!

Do all valid log rows starting with ##? If so then you should add transforms.conf which drop away other lines. If there is not any way to recognise those without looking ##start_string and ##end_string then you probably must write some preprocessing or your own modular input. Splunk's normal input processing handling those lines one by one and it cannot keep track other lines and is there happening something or not.

Be_JAR · ‎03-25-2024

Thank you very much for the clarification.
Yes, valid rows start with ##. And each event is what is inside each ##start_string and ##end_string block.

From UI, is there any way to do the first step and remove the rows that do not start with ## ?

BR
JAR

isoutamo · ‎03-25-2024

Then I propose you to use transforms.conf and send those lines into dev null. There are quite many examples on community and also on docs. See e.g. https://community.splunk.com/t5/Getting-Data-In/sending-specific-events-to-nullqueue-using-props-amp... just replace that REGEX to match your line or beginning of your line.

Basically SEDCMD do almost same. It just clears that line but it didn't remove it. Basically there are sill "empty" line on your log events, not removed line.

PickleRick · ‎03-25-2024

The difference is that with SEDCMD you can "blank" part of a multiline event. If you send to nullQueue, you'll discard whole event.

isoutamo · ‎03-25-2024

Exactly that way. So you must select which one those are and based on that select SEDCMD or transforms.

Be_JAR · ‎03-26-2024

Hi .

Trying with:

Field transformations:

And adding them to sourcetype:

But does not work

is there anything wrong?

Thank you all!!

BR

PickleRick · ‎03-25-2024

You can use SEDCMD to remove all lines not beginning with two hashes.

Something like

SEDCMD-remove-unhashsed = s/^([^#]|#[^#]).*$//

(Haven't tested it though, might need some tweaking).

isoutamo · ‎03-22-2024

Have you try to escape # characters like \# ?

Adding custom data from file

props.conf

sourcetype

transforms.conf

AppDynamics Summer Webinars

SOCin’ it to you at Splunk University

Credit Card Data Protection & PCI Compliance with Splunk Edge Processor

Are you a member of the Splunk Community?