Slice Messages Before Search Time

DoubleAka
Observer

How can I cut some parts of my message prior to index time?
I tried to use both SEDCMD and transform on raw messages but I still get the full content each time.

Here is my current props configuration:

[ETW_SILK_JSON]
description = silk etw
LINE_BREAKER = ([\r\n]+"event":)
SHOULD_LINEMERGE = false
CHARSET = UTF-8
TRUNCATE = 0
# TRANSFORMS-cleanjson = strip_event_prefix
SEDCMD-strip_event = s/^"event":\{\s*//


And my message sample:
"event":{{"ProviderGuid":"7dd42a49-5329-4832-8dfd-43d979153a88","YaraMatch":[],"ProviderName":"Microsoft-Windows-Kernel-Network","EventName":"KERNEL_NETWORK_TASK_TCPIP/Datareceived.","Opcode":11,"OpcodeName":"Datareceived.","TimeStamp":"2024-07-22T14:29:27.6882177+03:00","ThreadID":10008,"ProcessID":1224,"ProcessName":"svchost","PointerSize":8,"EventDataLength":28,"XmlEventData":{"FormattedMessage":"TCPv4: 43 bytes received from 1,721,149,632:15,629 to -23,680,832:14,326. ","connid":"0","sport":"15,629","_PID":"820","seqnum":"0","MSec":"339.9806","saddr":"1,721,149,632","size":"43","PID":"1224","dport":"14,326","TID":"10008","ProviderName":"Microsoft-Windows-Kernel-Network","PName":"","EventName":"KERNEL_NETWORK_TASK_TCPIP/Datareceived.","daddr":"-23,680,832"}}}


I want to get rid of the "event" prefix, but none of the options seems to work.


PickleRick
SplunkTrust

1. Didn't we discuss this on Slack yesterday? (Or was I discussing it with someone else? The sourcetype was the same and the case was similar.)

2. Your LINE_BREAKER should already get rid of the "event": part: it is inside the capture group, so it is treated as the line breaker and stripped.

So apparently your settings are not being applied at all. I'd say you probably have your props set on the wrong component (index-time settings such as LINE_BREAKER and SEDCMD normally take effect on the indexer or heavy forwarder that parses the data, not on a universal forwarder).
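
The capture-group behaviour described in point 2 can be approximated outside Splunk. The following is only a rough Python simulation, not Splunk's actual parser: the `break_events` helper and the three-event stream are made up for illustration, and only the regex is taken from the question.

```python
import re

# Same pattern as the LINE_BREAKER in the question. The text matched by the
# capture group is what gets dropped between two events.
LINE_BREAKER = re.compile(r'([\r\n]+"event":)')


def break_events(stream: str) -> list[str]:
    """Split a raw stream where the pattern matches and discard the captured text."""
    parts = LINE_BREAKER.split(stream)
    # With one capture group, re.split returns [text, sep, text, sep, ...];
    # the even positions are the event bodies.
    return [p for p in parts[::2] if p]


# Hypothetical stream of three very short events.
stream = '"event":{"Opcode":11}\r\n"event":{"Opcode":12}\n"event":{"Opcode":13}'
events = break_events(stream)
for e in events:
    print(e)
```

In this simulation the second and third events come out without the prefix, while the first one keeps it because no line break precedes it for the pattern to match.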

DoubleAka
Observer

Yes, that was me. I am sorry for using two channels for the same question. After asking on Slack I searched the web again but could not find any previous questions about this issue, so I thought it would be better to ask here for future Splunk explorers. In the end I was able to resolve the issue by editing my third-party source code (not my Splunk UF) so that it produces validly formatted JSON messages. So the problem is solved, but not in a conventional way. For this reason I think the question should be deleted completely to avoid future confusion.

How can I remove this question completely?
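
For readers who land here with the same problem, fixing the output at the source might look roughly like the sketch below. The third-party tool's code is not shown in this thread, so the `to_json_line` helper and the event dictionary are hypothetical; only the field names are borrowed from the sample message.

```python
import json

# Hypothetical event; field names are borrowed from the sample message.
event = {
    "ProviderName": "Microsoft-Windows-Kernel-Network",
    "Opcode": 11,
    "XmlEventData": {"size": "43", "PID": "1224"},
}


def to_json_line(record: dict) -> str:
    """Serialize one event as a single-line JSON object with no prefix."""
    return json.dumps(record, separators=(",", ":"))


line = to_json_line(event)
print(line)

# The line parses back to the same structure, so it is valid JSON on its own.
roundtrip = json.loads(line)
```

Each event is then a complete JSON object on its own line, with nothing in front of it that has to be stripped later.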

PickleRick
SplunkTrust

Well, that's very good news 🙂 And IMHO it's a good solution for others to find in the future: get your data in order first 🙂 Just leave the thread be.

gcusello
SplunkTrust

Hi @DoubleAka ,

your message seems to be in JSON, so if you delete part of the message (for example the first part) you lose the formatting and can no longer use field extraction tools such as INDEXED_EXTRACTIONS or spath. Furthermore, you save very little by deleting just one word.
In any case, SEDCMD takes a substitution regex, and the one you used is wrong because quotes must be escaped and you missed the global parameter:

SEDCMD-strip_event = s/^\"event\":\{\s*//g

Ciao.

Giuseppe
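
The point about losing the formatting can be checked on the sample outside Splunk. This is a minimal sketch that assumes Python's `re.sub` is a fair stand-in for the sed-style substitution; the sample is shortened to a few fields but keeps the same brace structure, and `is_valid_json` is a helper made up for the example.

```python
import json
import re

# Shortened version of the sample message (same brace structure, fewer fields).
sample = '"event":{{"ProviderName":"Microsoft-Windows-Kernel-Network","Opcode":11,"XmlEventData":{"size":"43"}}}'

# Python stand-in for s/^\"event\":\{\s*//g (re.sub replaces every match by default).
stripped = re.sub(r'^\"event\":\{\s*', "", sample)
print(stripped)


def is_valid_json(text: str) -> bool:
    """Return True if the text parses as a single JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json(sample))
print(is_valid_json(stripped))
```

Neither string parses: the original has the prefix and a doubled opening brace, and after the substitution one closing brace too many is left at the end. That is consistent with the fix that was eventually made at the source.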

 
