Log merging

pgelnar_hci
New Member

Hello, I am trying to merge two-line logs into a single event, but with no luck so far.
Splunk Enterprise 7.1.2

Here is a sample:
{"log":"Apr 04, 2019 12:01:24 PM hudson.model.AsyncPeriodicWork$1 run\n", "stream":"stderr", "time":"2019-04-04T12:01:24.77173976Z", "kubernetes":{"pod_name":"jenkins-bdd89884d-4v6sd", "namespace_name":"001", "pod_id":"33c4a5bd-553a-11e9-8b8e-005056aea3a7", "labels":{"app":"jenkins", "pod-template-hash":"688454408"}, "host":"001", "container_name":"jenkins", "docker_id":"aa9ab26e108daf221b974d80ddf1e51d91b6b235698a4f4711a0313231649a10"}}
{"log":"INFO: Finished DockerContainerWatchdog Asynchronous Periodic Work. 2 ms\n", "stream":"stderr", "time":"2019-04-04T12:01:24.771743784Z", "kubernetes":{"pod_name":"jenkins-bdd89884d-4v6sd", "namespace_name":"001", "pod_id":"33c4a5bd-553a-11e9-8b8e-005056aea3a7", "labels":{"app":"jenkins", "pod-template-hash":"688454408"}, "host":"001", "container_name":"jenkins", "docker_id":"aa9ab26e108daf221b974d80ddf1e51d91b6b235698a4f4711a0313231649a10"}}

I have created a regex that works well with the sample log in Add Data, but not in the "real world". It matches {"log":" at the beginning of the log, followed by the date.

This is my local props.conf:
[jsonCicd]
BREAK_ONLY_BEFORE = ^({\"log\":\")([A-Za-z]+)\s([0-9]+),\s([0-9]+)\s([0-9]+):([0-9]+):([0-9]+)\s([A,PM])
DATETIME_CONFIG =
NO_BINARY_CHECK = true
category = Structured
description = cicd logs merging
pulldown_type = true
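
As a quick sanity check outside Splunk, the BREAK_ONLY_BEFORE pattern above can be tried against the start of each sample record. This is only a minimal Python sketch: the records are abbreviated from the sample, and Python's `re` module is a stand-in for Splunk's regex engine, so it shows what the pattern matches, not how Splunk applies it to HEC input.

```python
import re

# Pattern copied from the BREAK_ONLY_BEFORE setting above.
# Note: [A,PM] is a character class matching ONE of "A", ",", "P", "M";
# something like [AP]M would be the stricter way to match AM/PM.
PATTERN = re.compile(
    r'^({\"log\":\")([A-Za-z]+)\s([0-9]+),\s([0-9]+)\s'
    r'([0-9]+):([0-9]+):([0-9]+)\s([A,PM])'
)

# Abbreviated versions of the two sample records (other fields omitted).
first = '{"log":"Apr 04, 2019 12:01:24 PM hudson.model.AsyncPeriodicWork$1 run\\n", "stream":"stderr"}'
second = '{"log":"INFO: Finished DockerContainerWatchdog Asynchronous Periodic Work. 2 ms\\n", "stream":"stderr"}'

for record in (first, second):
    # True means "a new event would start before this record".
    print(bool(PATTERN.match(record)), record[:40])
```

In this sketch the pattern matches the first record and not the second, which is the behaviour the merge needs, so the regex itself does not look like the reason the events stay separate.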

The input in question has this sourcetype set.

Here is the debug log:

04-12-2019 08:46:25.086 +0000 DEBUG PropertiesMapConfig - Pattern 'jsonCicd' matches with priority 100
04-12-2019 08:46:25.086 +0000 DEBUG UTF8Processor - Done key received for: source::http:cicd|host::001:8088|jsonCicd|
04-12-2019 08:46:25.086 +0000 DEBUG UTF8Processor - Done key received for: source::http:cicd|host::001:8088|jsonCicd|
04-12-2019 08:46:25.086 +0000 DEBUG UTF8Processor - Done key received for: source::http:cicd|host::001:8088|jsonCicd|
04-12-2019 08:46:25.086 +0000 DEBUG UTF8Processor - Done key received for: source::http:cicd|host::001:8088|jsonCicd|
04-12-2019 08:46:25.086 +0000 INFO AggregatorMiningProcessor - Setting up line merging apparatus for: source::http:cicd|host::001:8088|jsonCicd|

That looks fine to me. I have tried multiple combinations, for example with the time format, but the result is still the same.
Any ideas why I still see two events in the Search app, please?

pgelnar_hci
New Member

It seems that I am stuck with HEC. Here is the answer that I got from official Splunk Support:

If you need to use HEC, events need to be contained in one request:
"Events must be contained within a single HTTP request. They cannot span multiple requests."
as explained in the official documentation:
https://docs.splunk.com/Documentation/Splunk/7.2.5/Data/FormateventsforHTTPEventCollector#Raw_event_...
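
One way to read that requirement: if the two records have to arrive as one event, they could be merged before they are posted to HEC. The following is only a sketch under that assumption. `merge_records` and the date pattern are hypothetical names chosen for illustration and are not part of Splunk or HEC. It folds any record whose "log" text does not start with a Jenkins-style date into the previous record.

```python
import re

# Assumption: a record starts a new event when its "log" text begins with a
# date such as "Apr 04, 2019 12:01:24 PM"; anything else is a continuation.
STARTS_EVENT = re.compile(r"^[A-Za-z]+ \d+, \d+ \d+:\d+:\d+ [AP]M")

def merge_records(records):
    """Fold continuation records into the preceding record's "log" field."""
    merged = []
    for rec in records:
        if merged and not STARTS_EVENT.match(rec["log"]):
            merged[-1]["log"] += rec["log"]   # append to the previous event
        else:
            merged.append(dict(rec))          # shallow copy, start a new event
    return merged

# Abbreviated versions of the two sample records.
records = [
    {"log": "Apr 04, 2019 12:01:24 PM hudson.model.AsyncPeriodicWork$1 run\n",
     "stream": "stderr"},
    {"log": "INFO: Finished DockerContainerWatchdog Asynchronous Periodic Work. 2 ms\n",
     "stream": "stderr"},
]

merged = merge_records(records)
print(len(merged))
print(merged[0]["log"])
```

Where such a step would live (the log shipper, a small relay, and so on) depends on the setup and is not covered here.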

I'll check whether it's possible to use a TCP input for these logs. I will leave this issue open for updates.

Thank you for your help

woodcock
Esteemed Legend

Try this:

[jsonCicd]
SHOULD_LINEMERGE = false
LINE_BREAKER = }}([\r\n\s]*){\"log\":\"
TIME_PREFIX = \"time\":\"
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%N%Z
MAX_TIMESTAMP_LOOKAHEAD = 32
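
To see where the LINE_BREAKER above would cut, here is a rough Python sketch. It assumes the documented LINE_BREAKER behaviour, where the text matched by the first capture group is discarded and marks the event boundary. `split_events` is a hypothetical helper, not a Splunk API, and the records are abbreviated from the sample in the question.

```python
import re

# LINE_BREAKER value from the stanza above.
LINE_BREAKER = r'}}([\r\n\s]*){\"log\":\"'

def split_events(raw, line_breaker):
    """Hypothetical helper: cut raw text where capture group 1 matches."""
    events, pos = [], 0
    for m in re.finditer(line_breaker, raw):
        events.append(raw[pos:m.start(1)])  # text before the group ends an event
        pos = m.end(1)                      # text after the group starts the next
    events.append(raw[pos:])
    return events

# Two abbreviated sample records separated by a newline.
raw = (
    '{"log":"Apr 04, 2019 12:01:24 PM hudson.model.AsyncPeriodicWork$1 run\\n", "kubernetes":{"host":"001"}}\n'
    '{"log":"INFO: Finished DockerContainerWatchdog Asynchronous Periodic Work. 2 ms\\n", "kubernetes":{"host":"001"}}'
)

events = split_events(raw, LINE_BREAKER)
print(len(events))
for event in events:
    print(event[:40])
```

Under this emulation the pattern breaks before every {"log":" record, so the two sample records still come out as separate events rather than one merged event.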

pgelnar_hci
New Member

Thank you both, but unfortunately neither of these works. Any other ideas, please?
The DEBUG log still looks the same.

woodcock
Esteemed Legend

I suspect that you are wrong. You must do ALL of the following:
0: Ensure that jsonCicd is the ORIGINAL sourcetype of the events (if you did sourcetype override/overwrite, you must use the PREVIOUS value).
1: Deploy this configuration to the first FULL instance of Splunk that handles the events (usually Indexers but might be a Heavy Forwarder).
2: Restart all Splunk instances there.
3: Send new events into Splunk.
4: Search ONLY the new events (older events will stay broken); to ensure this, BE SURE to add _index_earliest=-5m to your search SPL string.

pgelnar_hci
New Member

Thank you, @woodcock
0 - I'm not really sure what you mean by that. If you mean that this sourcetype was used from the beginning of the Jenkins logging, then no, I used a different one. I also tried sending the logs with the new sourcetype to another index.
1 - It is the only Splunk instance; the indexer is not separated from the search head
2 - Done after every attempt
3 - Receiving
4 - Looking at the newest events, still the same

woodcock
Esteemed Legend

pgelnar_hci
New Member

Thank you. As I understood from the documentation, I tried this:

/system/local/transforms.conf
[jsonCicd]
REGEX = ^({\"log\":\")([A-Za-z]+)\s([0-9]+),\s([0-9]+)\s([0-9]+):([0-9]+):([0-9]+)\s([A,PM])
FORMAT = sourcetype::jsonCicd
DEST_KEY = MetaData:Sourcetype

and according to your advice:

/system/local/props.conf
[jsonCicd]
SHOULD_LINEMERGE = false
LINE_BREAKER = }}([\r\n\s]*){\"log\":\"
TIME_PREFIX = \"time\":\"
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%N%Z
MAX_TIMESTAMP_LOOKAHEAD = 32

Is that correct? I suppose not, because it is not working either...

woodcock
Esteemed Legend

No. You do not need the transforms.conf stuff at all.

somesoni2
Revered Legend

Give this a try

[jsonCicd]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)(?=\{\"log\"\:\"\w+\s+\d+,\s+\d+)
TIME_PREFIX = \"time\":\"
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%N%Z
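
The effect of the lookahead in this LINE_BREAKER can be checked with a small Python sketch. It assumes the documented LINE_BREAKER behaviour (the first capture group is discarded and marks the boundary). `split_events` is a hypothetical helper, not a Splunk API. The first two records are abbreviated from the sample in the question, and the third is an invented follow-up record added only to show a break.

```python
import re

# LINE_BREAKER value from the stanza above: break on newlines only when the
# next record's "log" text starts like a date ("Apr 04, 2019 ...").
LINE_BREAKER = r'([\r\n]+)(?=\{\"log\"\:\"\w+\s+\d+,\s+\d+)'

def split_events(raw, line_breaker):
    """Hypothetical helper: cut raw text where capture group 1 matches."""
    events, pos = [], 0
    for m in re.finditer(line_breaker, raw):
        events.append(raw[pos:m.start(1)])  # text before the group ends an event
        pos = m.end(1)                      # text after the group starts the next
    events.append(raw[pos:])
    return events

raw = "\n".join([
    '{"log":"Apr 04, 2019 12:01:24 PM hudson.model.AsyncPeriodicWork$1 run\\n", "stream":"stderr"}',
    '{"log":"INFO: Finished DockerContainerWatchdog Asynchronous Periodic Work. 2 ms\\n", "stream":"stderr"}',
    # Invented third record, for illustration only.
    '{"log":"Apr 04, 2019 12:01:25 PM example.Invented run\\n", "stream":"stderr"}',
])

events = split_events(raw, LINE_BREAKER)
print(len(events))
for event in events:
    print(repr(event[:60]))
```

In this emulation the "INFO:" record stays attached to the dated record before it, and a new event starts only at the next dated record. Whether HEC input goes through this line breaking at all is a separate question, as the Splunk Support answer earlier in the thread suggests.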