Getting Data In

Slice Messages Before Search Time

DoubleAka
Observer

How can I cut some parts of my message prior to index time?
I tried both SEDCMD and a transform on the raw messages, but I still get the full content each time.

Here is my current props configuration:

[ETW_SILK_JSON]
description = silk etw
LINE_BREAKER = ([\r\n]+"event":)
SHOULD_LINEMERGE = false
CHARSET = UTF-8
TRUNCATE = 0
# TRANSFORMS-cleanjson = strip_event_prefix
SEDCMD-strip_event = s/^"event":\{\s*//


And my message sample:
"event":{{"ProviderGuid":"7dd42a49-5329-4832-8dfd-43d979153a88","YaraMatch":[],"ProviderName":"Microsoft-Windows-Kernel-Network","EventName":"KERNEL_NETWORK_TASK_TCPIP/Datareceived.","Opcode":11,"OpcodeName":"Datareceived.","TimeStamp":"2024-07-22T14:29:27.6882177+03:00","ThreadID":10008,"ProcessID":1224,"ProcessName":"svchost","PointerSize":8,"EventDataLength":28,"XmlEventData":{"FormattedMessage":"TCPv4: 43 bytes received from 1,721,149,632:15,629 to -23,680,832:14,326. ","connid":"0","sport":"15,629","_PID":"820","seqnum":"0","MSec":"339.9806","saddr":"1,721,149,632","size":"43","PID":"1224","dport":"14,326","TID":"10008","ProviderName":"Microsoft-Windows-Kernel-Network","PName":"","EventName":"KERNEL_NETWORK_TASK_TCPIP/Datareceived.","daddr":"-23,680,832"}}}


I want to get rid of the "event" prefix but none of the options seem to work.


PickleRick
SplunkTrust

1. Didn't we discuss this on Slack yesterday? (Or was I discussing it with someone else? The sourcetype was the same and the case was similar.)

2. Your LINE_BREAKER should already get rid of the "event": part (it's within the capture group, so it is treated as the line breaker and stripped).

So apparently your settings are not being applied at all. I'd say you probably have your props set on the wrong component.
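
To illustrate point 2, here is a minimal Python sketch that approximates how the first capture group of LINE_BREAKER is consumed as the event delimiter. It is not Splunk code: the break_events helper and the shortened input stream are hypothetical, and Python's re module only stands in for Splunk's regex engine.

```python
import re

# The LINE_BREAKER value from the props stanza in the question.
LINE_BREAKER = r'([\r\n]+"event":)'

def break_events(raw: str, line_breaker: str = LINE_BREAKER) -> list[str]:
    """Approximate LINE_BREAKER: break where group 1 matches and discard group 1."""
    events = []
    pos = 0
    for m in re.finditer(line_breaker, raw):
        events.append(raw[pos:m.start(1)])  # text before the delimiter
        pos = m.end(1)                      # skip the delimiter itself
    events.append(raw[pos:])                # whatever follows the last delimiter
    return [e for e in events if e]

# Shortened, hypothetical stand-in for the real stream of ETW messages.
raw = '"event":{{"Opcode":11}}\n"event":{{"Opcode":12}}\r\n"event":{{"Opcode":13}}'
events = break_events(raw)
for e in events:
    print(e)
```

In this approximation every event after the first loses its "event": prefix, while the very first one keeps it because nothing matching [\r\n]+ precedes it. If indexed events still show the prefix everywhere, that fits the suspicion that the props are not being applied.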


DoubleAka
Observer

Yes, that was me. I am sorry for using two channels for the same question. After asking on Slack I searched the web again but could not find any previous questions about the issue, so I thought it would be better to ask here for future Splunk explorers. In the end I was able to resolve the issue by editing my third-party source code (not my Splunk UF) to produce validly formatted JSON messages. So the problem is solved, but not in a conventional way. For this reason I think the question should be deleted completely to avoid future confusion.
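
For future readers: the sample as posted in the question is not valid JSON, and stripping the "event": prefix alone does not make it valid. The thread does not show what the corrected output looks like, so the sketch below only checks a shortened, hypothetical copy of the sample (same brace layout, fewer fields) with Python's json module.

```python
import json

# Shortened stand-in for the sample event; same brace layout as the original.
raw_event = '"event":{{"Opcode":11,"XmlEventData":{"size":"43"}}}'

def is_valid_json(text: str) -> bool:
    """Return True if the whole string parses as one JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

prefix = '"event":{'
stripped_prefix = raw_event[len(prefix):]  # drop only the leading '"event":{'
stripped_both = stripped_prefix[:-1]       # also drop the matching trailing '}'

print(is_valid_json(raw_event))        # raw message as emitted
print(is_valid_json(stripped_prefix))  # prefix removed, one '}' too many
print(is_valid_json(stripped_both))    # prefix and matching brace removed
```

Only the last form parses, which is why a prefix-only SEDCMD would still leave events that spath cannot read, and why fixing the producer is the cleaner route.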

How can I remove this question completely?


PickleRick
SplunkTrust

Well, that's very good news 🙂 And IMHO it's a good solution for others to find in the future: get your data in order first 🙂 Just leave the thread be.


gcusello
SplunkTrust

Hi @DoubleAka ,

Your message seems to be JSON, so if you delete part of it (for example the first part) you lose the formatting and can no longer use field extraction tools such as INDEXED_EXTRACTIONS or spath. Furthermore, you save very little by deleting just one word.
In any case, SEDCMD uses a substitution regex, and the one you used is wrong because quotes must be escaped and you missed the global flag:

SEDCMD-strip_event = s/^\"event\":\{\s*//g

Ciao.

Giuseppe
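
The two SEDCMD expressions in this thread can be compared on the start of the sample message. The sketch below uses Python's re module as a stand-in; it is an assumption that Splunk's regex engine treats these simple constructs the same way, so this is a sanity check rather than proof of SEDCMD behaviour.

```python
import re

# Start of the sample message from the question (note the double '{').
sample_start = '"event":{{"ProviderGuid":"7dd42a49-5329-4832-8dfd-43d979153a88"'

unescaped = r'^"event":\{\s*'    # pattern from the original props stanza
escaped = r'^\"event\":\{\s*'    # pattern from the suggested replacement

# count=1 mimics a substitution without the g flag; the default replaces all matches.
a = re.sub(unescaped, '', sample_start, count=1)
b = re.sub(escaped, '', sample_start)

print(a)
print(b)
print(a == b)
```

In Python both patterns give the same result, since the ^ anchor allows at most one match and an escaped quote matches a plain quote. Both also leave one '{' behind, because the sample starts with a double brace.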

 
