Log merging

pgelnar_hci
New Member

Hello, I am trying to merge two-line log entries into a single event, but I am having no luck.
I am running Splunk Enterprise 7.1.2.

Here is a sample:
{"log":"Apr 04, 2019 12:01:24 PM hudson.model.AsyncPeriodicWork$1 run\n", "stream":"stderr", "time":"2019-04-04T12:01:24.77173976Z", "kubernetes":{"pod_name":"jenkins-bdd89884d-4v6sd", "namespace_name":"001", "pod_id":"33c4a5bd-553a-11e9-8b8e-005056aea3a7", "labels":{"app":"jenkins", "pod-template-hash":"688454408"}, "host":"001", "container_name":"jenkins", "docker_id":"aa9ab26e108daf221b974d80ddf1e51d91b6b235698a4f4711a0313231649a10"}}
{"log":"INFO: Finished DockerContainerWatchdog Asynchronous Periodic Work. 2 ms\n", "stream":"stderr", "time":"2019-04-04T12:01:24.771743784Z", "kubernetes":{"pod_name":"jenkins-bdd89884d-4v6sd", "namespace_name":"001", "pod_id":"33c4a5bd-553a-11e9-8b8e-005056aea3a7", "labels":{"app":"jenkins", "pod-template-hash":"688454408"}, "host":"001", "container_name":"jenkins", "docker_id":"aa9ab26e108daf221b974d80ddf1e51d91b6b235698a4f4711a0313231649a10"}}

I have created a regex that works well with the sample log in Add Data, but not in the "real world". It matches {"log":" at the beginning of the log line, followed by the date.

This is my local props.conf:
[jsonCicd]
BREAK_ONLY_BEFORE = ^({\"log\":\")([A-Za-z]+)\s([0-9]+),\s([0-9]+)\s([0-9]+):([0-9]+):([0-9]+)\s([A,PM])
DATETIME_CONFIG =
NO_BINARY_CHECK = true
category = Structured
description = cicd logs merging
pulldown_type = true

The input in question has this sourcetype set.
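
As a quick sanity check outside Splunk, the BREAK_ONLY_BEFORE pattern above can be exercised against the two sample lines with Python's re module. This is only a sketch: Python's regex engine is not the PCRE engine Splunk uses, the sample lines are shortened, and the backslashes before the quotes are dropped because a Python raw string does not need them. It suggests that the pattern matches the first line and not the second, as intended. It also shows that the final group [A,PM] is a character class that matches a single character (A, comma, P or M), not the string AM or PM.

```python
import re

# Pattern from the props.conf stanza above; the opening brace is escaped for clarity.
BREAK_ONLY_BEFORE = re.compile(
    r'^(\{"log":")([A-Za-z]+)\s([0-9]+),\s([0-9]+)\s'
    r'([0-9]+):([0-9]+):([0-9]+)\s([A,PM])'
)

# Shortened versions of the two sample lines (the trailing fields are omitted).
first = '{"log":"Apr 04, 2019 12:01:24 PM hudson.model.AsyncPeriodicWork$1 run\\n", "stream":"stderr"}'
second = '{"log":"INFO: Finished DockerContainerWatchdog Asynchronous Periodic Work. 2 ms\\n", "stream":"stderr"}'

first_match = BREAK_ONLY_BEFORE.search(first)
second_match = BREAK_ONLY_BEFORE.search(second)

print(bool(first_match))     # does the date-prefixed line start a new event?
print(bool(second_match))    # does the INFO line start a new event?
print(first_match.group(8))  # what the last group actually captured
```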

Here is the debug log:

04-12-2019 08:46:25.086 +0000 DEBUG PropertiesMapConfig - Pattern 'jsonCicd' matches with priority 100
04-12-2019 08:46:25.086 +0000 DEBUG UTF8Processor - Done key received for: source::http:cicd|host::001:8088|jsonCicd|
04-12-2019 08:46:25.086 +0000 DEBUG UTF8Processor - Done key received for: source::http:cicd|host::001:8088|jsonCicd|
04-12-2019 08:46:25.086 +0000 DEBUG UTF8Processor - Done key received for: source::http:cicd|host::001:8088|jsonCicd|
04-12-2019 08:46:25.086 +0000 DEBUG UTF8Processor - Done key received for: source::http:cicd|host::001:8088|jsonCicd|
04-12-2019 08:46:25.086 +0000 INFO AggregatorMiningProcessor - Setting up line merging apparatus for: source::http:cicd|host::001:8088|jsonCicd|

This looks fine to me. I have tried multiple combinations, for example with a time format, but the result is still the same.
Any ideas why I still see two separate events in the Search app, please?

pgelnar_hci
New Member

It seems that I am stuck with HEC. Here is the answer that I got from official Splunk Support:

If you need to use HEC, events need to be contained in one request:
"Events must be contained within a single HTTP request. They cannot span multiple requests."
as explained in the official documentation:
https://docs.splunk.com/Documentation/Splunk/7.2.5/Data/FormateventsforHTTPEventCollector#Raw_event_...
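
If the lines cannot be merged after they arrive, one possible workaround is to merge them before they are sent, so that each logical entry travels as one event in one request. The following is a minimal sketch of that idea, not something Splunk Support suggested. The function name merge_records and the rule "a log value that starts with a date opens a new entry" are assumptions made for this example, and the sample lines are shortened.

```python
import json
import re

# Assumption: a "log" value that starts like "Apr 04, 2019 12:01:24 PM" opens a new entry.
STARTS_ENTRY = re.compile(r"[A-Za-z]+ \d+, \d+ \d+:\d+:\d+ [AP]M")

def merge_records(json_lines):
    """Fold continuation records into the preceding dated record."""
    merged = []
    for line in json_lines:
        record = json.loads(line)
        if merged and not STARTS_ENTRY.match(record["log"]):
            # Continuation line: append its text to the previous entry.
            merged[-1]["log"] += record["log"]
        else:
            merged.append(record)
    return merged

lines = [
    '{"log":"Apr 04, 2019 12:01:24 PM hudson.model.AsyncPeriodicWork$1 run\\n", "stream":"stderr"}',
    '{"log":"INFO: Finished DockerContainerWatchdog Asynchronous Periodic Work. 2 ms\\n", "stream":"stderr"}',
]
merged = merge_records(lines)

# One JSON object per merged entry, ready to be placed in a single request body.
payload = "\n".join(json.dumps(record) for record in merged)
print(payload)
```

The metadata of the continuation record (for example its own time field) is dropped in this sketch; whether that is acceptable depends on the data.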

I'll check whether it's possible to use a TCP input for these logs. I will leave this issue open for updates.

Thank you for your help

woodcock
Esteemed Legend

Try this:

[jsonCicd]
SHOULD_LINEMERGE = false
LINE_BREAKER = }}([\r\n\s]*){\"log\":\"
TIME_PREFIX = \"time\":\"
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%N%Z
MAX_TIMESTAMP_LOOKAHEAD = 32
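
To illustrate what TIME_PREFIX and MAX_TIMESTAMP_LOOKAHEAD are meant to do in the stanza above, here is a rough Python sketch. It is not Splunk's actual timestamp processor. It assumes the lookahead window starts at the end of the TIME_PREFIX match, and it trims the fractional seconds to six digits because Python's %f accepts no more than that. The function name extract_time is made up for this example, and the sample event is shortened.

```python
import re
from datetime import datetime, timezone

TIME_PREFIX = re.compile(r'"time":"')
MAX_TIMESTAMP_LOOKAHEAD = 32

def extract_time(event):
    """Find the timestamp that follows TIME_PREFIX within the lookahead window."""
    prefix = TIME_PREFIX.search(event)
    if prefix is None:
        raise ValueError("TIME_PREFIX not found")
    window = event[prefix.end():prefix.end() + MAX_TIMESTAMP_LOOKAHEAD]
    stamp = re.match(r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.(\d+)Z", window)
    if stamp is None:
        raise ValueError("no timestamp inside the lookahead window")
    seconds, fraction = stamp.groups()
    # Python handles microseconds only, so keep the first six fractional digits.
    micros = fraction[:6].ljust(6, "0")
    parsed = datetime.strptime(f"{seconds}.{micros}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)

sample = '{"log":"Apr 04, 2019 12:01:24 PM run\\n", "stream":"stderr", "time":"2019-04-04T12:01:24.77173976Z"}'
event_time = extract_time(sample)
print(event_time.isoformat())
```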

pgelnar_hci
New Member

Thank you both, but unfortunately neither of these is working. Any other ideas, please?
The DEBUG log still looks the same.

woodcock
Esteemed Legend

I suspect that you are wrong. You must do ALL of the following:
0: Ensure that jsonCicd is the ORIGINAL sourcetype of the events (if you did sourcetype override/overwrite, you must use the PREVIOUS value).
1: Deploy this configuration to the first FULL instance of Splunk that handles the events (usually Indexers but might be a Heavy Forwarder).
2: Restart All Splunk instances there.
3: Send new events into Splunk.
4: Search ONLY the new events (older events will stay broken); to ensure this, BE SURE to add _index_earliest=-5m to your search SPL string.

pgelnar_hci
New Member

Thank you, @woodcock
0 - I am not really sure what you mean by that. If you mean that this sourcetype has been used since the beginning of the Jenkins logging, then no, I used a different one. I also tried pointing the logs with the new sourcetype to another index.
1 - It is the only instance of Splunk; the indexer is not separated from the search head.
2 - Done after every attempt.
3 - Receiving.
4 - Looking at the newest events, still the same.

woodcock
Esteemed Legend

pgelnar_hci
New Member

Thank you. As I understood from the documentation, I tried this:

/system/local/transforms.conf
[jsonCicd]
REGEX = ^({\"log\":\")([A-Za-z]+)\s([0-9]+),\s([0-9]+)\s([0-9]+):([0-9]+):([0-9]+)\s([A,PM])
FORMAT = sourcetype::jsonCicd
DEST_KEY = MetaData:Sourcetype

and, following your advice:

/system/local/props.conf
[jsonCicd]
SHOULD_LINEMERGE = false
LINE_BREAKER = }}([\r\n\s]*){\"log\":\"
TIME_PREFIX = \"time\":\"
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%N%Z
MAX_TIMESTAMP_LOOKAHEAD = 32

Is that correct? I suppose not, because it is not working either...

woodcock
Esteemed Legend

No. You do not need the transforms.conf stuff at all.

somesoni2
Revered Legend

Give this a try:

[jsonCicd]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)(?=\{\"log\"\:\"\w+\s+\d+,\s+\d+)
TIME_PREFIX = \"time\":\"
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%N%Z
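
The effect of this LINE_BREAKER can be approximated outside Splunk. The sketch below follows the documented rule that the text matched by the first capture group is discarded and marks the boundary between two events. It is a simulation in Python, not Splunk's parser, and the helper split_events is a made-up name. The sample lines are shortened, and the third line is a hypothetical extra entry added to show where the next event would begin. A second pattern that breaks before every {"log":" prefix is included for contrast.

```python
import re

def split_events(raw, line_breaker):
    """Treat the first capture group of each match as a discarded delimiter."""
    events, start = [], 0
    for match in re.finditer(line_breaker, raw):
        events.append(raw[start:match.start(1)])
        start = match.end(1)
    events.append(raw[start:])
    return [event for event in events if event]

# A dated line, its continuation, and a hypothetical next dated line.
raw = "\n".join([
    '{"log":"Apr 04, 2019 12:01:24 PM hudson.model.AsyncPeriodicWork$1 run\\n", "kubernetes":{"host":"001"}}',
    '{"log":"INFO: Finished DockerContainerWatchdog Asynchronous Periodic Work. 2 ms\\n", "kubernetes":{"host":"001"}}',
    '{"log":"Apr 04, 2019 12:01:25 PM hudson.model.AsyncPeriodicWork$1 run\\n", "kubernetes":{"host":"001"}}',
])

# Breaks only before a line whose log text starts with a date.
DATE_ONLY = r'([\r\n]+)(?=\{\"log\"\:\"\w+\s+\d+,\s+\d+)'
# Breaks before every JSON object, shown for contrast.
EVERY_OBJECT = r'\}\}([\r\n\s]*)\{"log":"'

merged = split_events(raw, DATE_ONLY)
separate = split_events(raw, EVERY_OBJECT)
print(len(merged), len(separate))
```

In this simulation the date-aware pattern keeps the INFO line attached to the dated line before it, while the contrast pattern yields one event per JSON object. Whether line breaking is applied at all to data arriving through HEC is a separate question, discussed earlier in the thread.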