Why does BREAK_ONLY_BEFORE work only for some events?

New Member

I have applied the regex below on the heavy forwarders, but it works for only a few events; many events are not being broken by the regex in BREAK_ONLY_BEFORE.

pulldown_type = 1
SEDCMD-backslash=s/\\//g
TRUNCATE = 0
BREAK_ONLY_BEFORE = {\"name\"
DATETIME_CONFIG = CURRENT
INDEXED_EXTRACTIONS = json
KV_MODE = json
category = Structured
SHOULD_LINEMERGE = false
NO_BINARY_CHECK = true

Sample logs as below.

{\"name\":\"\",\"\":,\"severity\":\"info\",\"time\":,\"host\":\"\",\"hostname\":\"\",\"\":\"\",\"\":\"UNKNOWN CORRELATION\",\"userId\":\"UNKNOWN USER\",\"moduleName\":\"\",\"\":\"a\",\"client\":\"AgentDesktop\",\"type\":\"application\",\"msg\":\"\",\"\":\"\"}{\"name\":\"\",\"level\":30,\"\":\"info\",\"time\":,\"host\":\"\",\"hostname\":\"\",\"\":\"\",\"clientCorrelationId\":\"\",\"userId\":\"UNKNOWN 

For some events the same stanza in heavy forwarder works, but for others, it does not work. Can someone let me know what could be wrong?

Motivator

Hi,

SHOULD_LINEMERGE must be set to true, because BREAK_ONLY_BEFORE is only applied when line merging is enabled. I also made a small adjustment to your regex. Try the following:

props.conf:

BREAK_ONLY_BEFORE = \{\W+name
SHOULD_LINEMERGE = true

New Member

Thanks! But why did my stanza work for one event and not for another? Why did it not work for all events with the same pattern? Also, with the regex you provided, I want to break only at name and the brace before it. Will this break the event at the name field?

Motivator

I am not sure how it worked for the first event. Your regex did not match the event. Tested here. The backslash before each quote must itself be escaped in the regex in order to match the literal \" in the event.

I updated my regex above. It looks for { before name.

New Member

Hi Surya

Thanks! I will try to implement it. Also, could you let me know what regex can be applied to the log sample below to break at the name field?

{\"name\":\"\",\"level\":,\"severity\":\"info\",\"time

Motivator

If events are multi-line, then try (?m)\{\W+name

(?m) - multi-line modifier
\{ - This will look for { literally.
\W+ - This will match any number of non-word characters. If you're sure about the number of characters between { and name, then make use of quantifiers, for example, \W{1,3} - this will look for minimum 1 and max 3 characters instead of looking for 1 and unlimited.
name - This will look for name literally case-sensitive.
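
To check these pieces against the escaped sample from the question, here is a minimal Python sketch. It assumes Python's re module behaves like Splunk's regex engine for these simple patterns, and the sample string is a shortened, hypothetical stand-in for the raw event.

```python
import re

# Shortened stand-in for the raw event as it arrives, with literal backslashes.
raw = r'{\"name\":\"\",\"level\":30,\"severity\":\"info\"}'

original = re.compile(r'{\"name\"')   # in a regex, \" simply means a plain "
suggested = re.compile(r'\{\W+name')  # { then one or more non-word chars, then name

# The original pattern expects {"name" with no backslash between { and ".
print("original matches:", bool(original.search(raw)))
# \W+ absorbs both the backslash and the quote that precede name.
print("suggested matches:", bool(suggested.search(raw)))
```

The same check can be repeated with any other line from the log by replacing the value of raw.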

Please refer to this page for more details.

If events are not multi-line:

I would suggest using LINE_BREAKER instead of BREAK_ONLY_BEFORE, because LINE_BREAKER improves processing speed. If you would like to use LINE_BREAKER, the configs are below:

LINE_BREAKER = ([\r\n]+)\{\W+name
SHOULD_LINEMERGE = false
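
As a rough illustration of how this LINE_BREAKER pattern behaves, the Python sketch below imitates the documented rule that an event boundary falls where the first capture group matches and that the captured text is discarded. The function name split_events and the sample data are hypothetical, and Python's re module stands in for Splunk's regex engine.

```python
import re


def split_events(raw: str, line_breaker: str) -> list[str]:
    """Toy imitation of LINE_BREAKER: break where group 1 matches, drop that text."""
    events = []
    start = 0
    for match in re.finditer(line_breaker, raw):
        events.append(raw[start:match.start(1)])  # text before the group ends an event
        start = match.end(1)                      # text after the group starts the next
    events.append(raw[start:])
    return [event for event in events if event]


# Hypothetical sample: the first record spans two lines, the second starts on a new line.
raw = r'{\"name\":\"a\",' + "\n" + r'\"msg\":\"x\"}' + "\n" + r'{\"name\":\"b\"}'

events = split_events(raw, r'([\r\n]+)\{\W+name')
for number, event in enumerate(events, start=1):
    print(number, repr(event))
```

In this model only a newline that is immediately followed by { and name produces a break, so the newline inside the first record is left alone.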

New Member

Hi Surya

We tried almost all of the suggestions you provided, but nothing seems to work. Only a few events are being parsed and most are not. The SED command I am applying works for all events, but the regex does not. I have not used LINE_BREAKER yet, though. Will it work?

Motivator

Okay, I see what you're doing. I will provide you two sets of configs: one for multi-line events and another for single-line events. Please apply these configs per your use case.

Multi line events (records with name starting in same line):

{\"name\":\"\",\"\":,\"severity\":\"info\",\"time\":,\"host\":\"\",\"hostname\":\"\",\"\":\"\",\"\":\"UNKNOWN CORRELATION\",\"userId\":\"UNKNOWN USER\",\"moduleName\":\"\",\"\":\"a\",\"client\":\"AgentDesktop\",\"type\":\"application\",\"msg\":\"\",\"\":\"\"}{\"name\":\"\",\"level\":30,\"\":\"info\",\"time\":,\"host\":\"\",\"hostname\":\"\",\"\":\"\",\"clientCorrelationId\":\"\",\"userId\":\"UNKNOWN 

props.conf:

[your_sourcetype]
    BREAK_ONLY_BEFORE = (?m)\{\W*name
    SHOULD_LINEMERGE = true
    SEDCMD-backslash=s/\\//g
    DATETIME_CONFIG = CURRENT
    KV_MODE = json
    category = Structured
    NO_BINARY_CHECK = true
    TRUNCATE = 0

Single line events (records with name starting in new line):

{\"name\":\"\",\"\":,\"severity\":\"info\",\"time\":,\"host\":\"\",\"hostname\":\"\",\"\":\"\",\"\":\"UNKNOWN CORRELATION\",\"userId\":\"UNKNOWN USER\",\"moduleName\":\"\",\"\":\"a\",\"client\":\"AgentDesktop\",\"type\":\"application\",\"msg\":\"\",\"\":\"\"}
{\"name\":\"\",\"level\":30,\"\":\"info\",\"time\":,\"host\":\"\",\"hostname\":\"\",\"\":\"\",\"clientCorrelationId\":\"\",\"userId\":\"UNKNOWN 

props.conf:

[your_sourcetype]
    LINE_BREAKER = ([\r\n]+)\{\W*name
    SHOULD_LINEMERGE = false
    SEDCMD-backslash=s/\\//g
    DATETIME_CONFIG = CURRENT
    KV_MODE = json
    category = Structured
    NO_BINARY_CHECK = true
    TRUNCATE = 0

You can test regex for both BREAK_ONLY_BEFORE and LINE_BREAKER with their respective data samples here.

Also, your configuration uses both INDEXED_EXTRACTIONS and KV_MODE to extract JSON fields. This is not recommended, because the fields are extracted twice, resulting in duplicate field values. Please have a look at the links below and use whichever one of the two settings suits your need.

https://answers.splunk.com/answers/556279/why-would-indexed-extractionsjson-in-propsconf-be.html

https://www.hurricanelabs.com/blog/splunk-case-study-indexed-extractions-vs-search-time-extractions

New Member

Hi Surya - The solution that you provided yesterday works only for events starting on a new line. For events that are merged on a single line, it does not work. Will the above stanza work for those merged events within a single line too?

Motivator

Yes. Use the first set of configs. I am not sure why it did not work the first time. Can you paste the full props.conf that you are using right now? Please use the "code generator" (the icon with 101010) for pasting content.

New Member
[empath_app_log]
pulldown_type = 1
SEDCMD-backslash=s/\\//g
TRUNCATE = 0
BREAK_ONLY_BEFORE = \{\W+name
DATETIME_CONFIG = CURRENT
INDEXED_EXTRACTIONS = json
KV_MODE = json
category = Structured
SHOULD_LINEMERGE = true
NO_BINARY_CHECK = true

New Member

This is what we deployed last night. Only the events starting on a new line are being parsed, while the events merged together on a single line are not.

New Member
{"name":"utterance.service logger","level":30,"severity":"info","time":"host":"","hostname":"","category":"application","clientCorrelationId":"","userId":"","moduleName":"DisplayUtterancesFsModule","source":"angular","client":"AgentDesktop","type":"application","msg":"utterance does not exist","logId":""}{"name":"utterance.service logger","level":30,"severity":"info","time":,"host":"","hostname":"","category":"application","clientCorrelationId":"","userId":"","moduleName":"","source":"angular","client":"AgentDesktop","type":"application","msg":"utterance does not exist","logId":""}

New Member

Above is a sample of the log that is not being parsed. I pulled it from the Splunk UI.
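
The symptom described here is what one would expect if BREAK_ONLY_BEFORE is tested once per line, which is how the props.conf documentation describes it (a new event is created only when a new line matches the regex). The toy Python model below rests on that assumption. The function merge_lines and the shortened sample are hypothetical, and this is not Splunk's actual implementation.

```python
import re


def merge_lines(raw: str, break_only_before: str) -> list[str]:
    """Toy model of line merging: start a new event at each line that matches."""
    pattern = re.compile(break_only_before)
    events: list[str] = []
    for line in raw.splitlines():
        if not events or pattern.search(line):
            events.append(line)          # matching line opens a new event
        else:
            events[-1] += "\n" + line    # anything else is merged into the previous one
    return events


# Hypothetical sample: one object on its own line, then two objects sharing a line.
raw = '{"name":"a"}\n{"name":"b"}{"name":"c"}'

plain = merge_lines(raw, r'\{\W+name')
multiline = merge_lines(raw, r'(?m)\{\W+name')
print(len(plain), len(multiline))
print(plain[-1])
```

In this model the two objects that share a line stay in one event whichever variant of the pattern is used, since the text has already been split into lines before the pattern is consulted.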

Motivator

Thanks for the information. Please add (?m), the multi-line modifier, before \{\W+name. This will make Splunk look at each line for the {"name string.

New Member

Oops! I applied that as well. Below is what is on the server, and it is still not working as I expected.

[empath_app_log]
pulldown_type = 1
SEDCMD-backslash=s/\\//g
TRUNCATE = 0
BREAK_ONLY_BEFORE = (?m)\{\W+name
DATETIME_CONFIG = CURRENT
INDEXED_EXTRACTIONS = json
KV_MODE = json
category = Structured
SHOULD_LINEMERGE = true
NO_BINARY_CHECK = true

Motivator

Hmm. Can you check whether any other setting is taking precedence by running this command: splunk btool props list --debug | grep 'empath_app_log'

Do you mind walking me through your architecture? Is the data flow UF --> HF --> Indexer?

New Member

The data flow is from the deployment server to the heavy forwarder to the indexers.

Motivator

Are you collecting logs from the deployment server? In that case, please place the same props.conf along with your inputs.conf on the DS as well. What was the output of the btool command? Did you notice any conflicts?

New Member

I am unable to run that command. I don't have that privilege.
