How can I cut some parts of my message prior to index time?
I tried to use both SEDCMD and transform on raw messages but I still get the full content each time.
Here is my current props configuration:
[ETW_SILK_JSON]
description = silk etw
LINE_BREAKER = ([\r\n]+"event":)
SHOULD_LINEMERGE = false
CHARSET = UTF-8
TRUNCATE = 0
# TRANSFORMS-cleanjson = strip_event_prefix
SEDCMD-strip_event = s/^"event":\{\s*//
And my message sample:
"event":{{"ProviderGuid":"7dd42a49-5329-4832-8dfd-43d979153a88","YaraMatch":[],"ProviderName":"Microsoft-Windows-Kernel-Network","EventName":"KERNEL_NETWORK_TASK_TCPIP/Datareceived.","Opcode":11,"OpcodeName":"Datareceived.","TimeStamp":"2024-07-22T14:29:27.6882177+03:00","ThreadID":10008,"ProcessID":1224,"ProcessName":"svchost","PointerSize":8,"EventDataLength":28,"XmlEventData":{"FormattedMessage":"TCPv4: 43 bytes received from 1,721,149,632:15,629 to -23,680,832:14,326. ","connid":"0","sport":"15,629","_PID":"820","seqnum":"0","MSec":"339.9806","saddr":"1,721,149,632","size":"43","PID":"1224","dport":"14,326","TID":"10008","ProviderName":"Microsoft-Windows-Kernel-Network","PName":"","EventName":"KERNEL_NETWORK_TASK_TCPIP/Datareceived.","daddr":"-23,680,832"}}}
I want to get rid of the "event" prefix, but none of the options seems to work.
1. Didn't we discuss this on Slack yesterday? (Or was I discussing it with another person? The sourcetype was the same and the case was similar.)
2. Your LINE_BREAKER should get rid of the "event": part already (it's within the capture group so it should be treated as line breaker and stripped).
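So apparently your settings are not applied at all. I'd say you probably have your props set on the wrong component: parse-time settings such as LINE_BREAKER, SEDCMD and TRANSFORMS take effect on the first heavy forwarder or indexer in the path, not on a universal forwarder.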
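To see what point 2 means in practice, here is a minimal Python sketch of the line-breaking rule, assuming the documented behaviour that the text matched by the first capture group is discarded and everything after it starts the next event. The helper name break_events and the shortened sample stream are made up for illustration; this is not Splunk code.

```python
import re

# Same pattern as the LINE_BREAKER in the props above.
LINE_BREAKER = re.compile(r'([\r\n]+"event":)')


def break_events(stream: str) -> list[str]:
    """Split a raw stream into events, dropping the text matched by group 1."""
    events = []
    start = 0
    for match in LINE_BREAKER.finditer(stream):
        events.append(stream[start:match.start(1)])  # text before the breaker
        start = match.end(1)                         # next event starts after it
    events.append(stream[start:])
    return [event for event in events if event]


# Shortened stand-in for three consecutive messages from the source.
raw = '"event":{{"a":1}}}\r\n"event":{{"a":2}}}\r\n"event":{{"a":3}}}'
events = break_events(raw)
for event in events:
    print(event)
```

In this sketch the second and third events come out without the "event": prefix, while the very first one keeps it because no line break precedes it. If every indexed event still shows the prefix, that points to the props not being applied where parsing happens.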
Yes, that was me. I am sorry for using two channels for the same question. After asking on Slack I searched the web again but could not find any previous questions about this issue, so I thought it would be better to ask here as well for future Splunk explorers. In the end I was able to resolve the issue by editing my third-party source code (not my Splunk UF) to produce validly formatted JSON messages. So the problem is solved, but not in a conventional way. For this reason I think the question should be deleted completely to avoid future confusion.
How can I remove this question completely?
Well, that's very good news 🙂 And IMHO it's a good solution for others to find in the future: get your data in order first 🙂 Just leave the thread be.
Hi @DoubleAka ,
Your message seems to be in JSON, so if you delete part of the message (for example the first part) you lose the formatting and can no longer use field extraction tools such as INDEXED_EXTRACTIONS or spath. Furthermore, you save very little by deleting just one word.
In any case, SEDCMD takes a sed-style substitution regex. Your expression looks syntactically fine as written; a variant with the quotes escaped and the global flag added is shown here, although with the ^ anchor neither change should affect what matches:
SEDCMD-strip_event = s/^\"event\":\{\s*//g
Ciao.
Giuseppe
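For anyone who wants to compare the two SEDCMD expressions in this thread outside Splunk, here is a rough Python sketch. It assumes Python's re module treats these particular constructs the same way as the regex engine behind SEDCMD, and it uses a shortened stand-in for the sample event with the same brace layout. The names sample, ORIGINAL, ESCAPED and is_valid_json are made up for illustration.

```python
import json
import re

# Shortened stand-in for the sample event (same "event":{{ ... }}} layout).
sample = (
    '"event":{{"ProviderName":"Microsoft-Windows-Kernel-Network",'
    '"XmlEventData":{"size":"43"}}}'
)

ORIGINAL = r'^"event":\{\s*'    # expression from the question
ESCAPED = r'^\"event\":\{\s*'   # expression with escaped quotes

# count=1 mimics a sed substitution without the g flag; no count mimics g.
stripped_original = re.sub(ORIGINAL, "", sample, count=1)
stripped_escaped = re.sub(ESCAPED, "", sample)


def is_valid_json(text: str) -> bool:
    """Return True if text parses as a single JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(stripped_original == stripped_escaped)
print(is_valid_json(sample), is_valid_json(stripped_original))
```

In this sketch both expressions strip exactly the same prefix, and neither the original message nor the stripped one parses as JSON, because one closing brace is left over at the end. That fits the outcome above: getting the source to emit valid JSON fixes it more cleanly than trimming at index time.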