Hi Everyone,
Apologies for posting here; I am unable to create a new question, so I am adding mine to this thread.
I am having a tough time filtering data out of my incoming XML on the heavy forwarder before it is sent to the indexer.
content is my XML tag, and the data between the content tags needs to be removed from the XML.
Below is the regex I am using in my transforms.conf file:
transforms.conf
[remove-content]
REGEX = s/(?s).*(?=<\/content>)<\/content>//
DEST_KEY = queue
FORMAT = nullQueue
props.conf
[test_transform]
pulldown_type = 1
TRANSFORMS-null = remove-content
DATETIME_CONFIG =
NO_BINARY_CHECK = true
category = Custom
disabled = false
I am still unable to get the expected XML indexed. I have googled many posts and tried what they suggest, but no luck.
I'm guessing you're trying to strip some content out of your events, not drop entire events. If so, you should be using SEDCMD in props.conf, as the transform you're trying will remove the whole event.
[test_transform]
pulldown_type = 1
SEDCMD-remove_content = s/(?s).*(?=<\/content>)<\/content>//
DATETIME_CONFIG =
NO_BINARY_CHECK = true
category = Custom
disabled = false
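
To see what that expression matches, here is a rough Python sketch. It rests on assumptions: Python's re module is close to, but not identical to, the PCRE engine Splunk uses, and the sample event is made up because the real XML is not shown in the thread. With (?s) and a greedy .*, the match starts at the beginning of the event and runs through the last closing content tag, so everything before that tag is removed too. The second pattern is an alternative worth testing, not a confirmed fix.

```python
import re

# Hypothetical sample event; the real XML layout is not shown in the thread.
event = "<entry><id>42</id><content>secret\npayload</content><title>t</title></entry>"

# Pattern from the SEDCMD above. A sed-style s/// without the g flag replaces
# only the first match, hence count=1.
greedy = re.sub(r"(?s).*(?=</content>)</content>", "", event, count=1)

# Alternative sketch (an assumption, not from the thread): keep both tags and
# drop only the text between them, for every <content> element in the event.
narrow = re.sub(r"(?s)(<content>).*?(</content>)", r"\1\2", event)

print(greedy)
print(narrow)
```

If the narrower behaviour is what you want, the equivalent SEDCMD would presumably look like s/(?s)(<content>).*?(<\/content>)/\1\2/g, but try it on sample data before relying on it.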
You can't do a substitution within the REGEX; that is not allowed. Also, you can send entire events to the nullQueue, but not parts of events.
[remove-content]
REGEX=(.*?\<content\>).*?(\</content\>.*)
DEST_KEY=_raw
FORMAT = $1$2
Above, the REGEX captures everything up to and including the <content> tag. Then it also captures everything from the </content> tag to the end of the event. It rewrites the raw event using only the captured pieces of the original event, omitting the characters in between the tags.
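
The capture-and-rewrite idea can be tried outside Splunk with a small Python sketch. The sample events and the helper name rewrite_raw are made up for illustration, and Python's re module is only an approximation of Splunk's PCRE engine, so read this as a way to reason about the pattern rather than as proof of what Splunk does.

```python
import re

# Hypothetical single-line sample event, made up for illustration.
event = "<entry><id>42</id><content>secret payload</content><title>t</title></entry>"

# Same pattern as the REGEX setting above.
pattern = re.compile(r"(.*?\<content\>).*?(\</content\>.*)")

def rewrite_raw(raw):
    """Mimic FORMAT = $1$2: rebuild the event from the two captured groups."""
    match = pattern.search(raw)
    # When the regex does not match, the event is returned unchanged.
    return match.group(1) + match.group(2) if match else raw

print(rewrite_raw(event))

# Without (?s), "." stops at newlines, so an event whose content spans
# several lines is not matched by this pattern in Python.
multiline = "<entry><content>line one\nline two</content></entry>"
print(rewrite_raw(multiline) == multiline)
```

The second check suggests that multi-line XML events may need (?s) at the start of the REGEX. I have not verified how Splunk treats newlines here, so test that against a real event.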