I have a lot of XML files, each larger than 900 KB, and I only want to index the first few tags in each file. When a file is smaller than 20 KB, the transforms below work. When a file is larger than 50 KB, they do not. Is there an alternative? The only one I can think of is writing a batch script that removes the body and content tags from the XML file before it goes through Splunk, but I don't really know how to write a batch script. Is there an easier suggestion? Thanks.
props.conf
[xmlFilter1]
KV_MODE = xml
BREAK_ONLY_BEFORE = <xml>
SHOULD_LINEMERGE = false
NO_BINARY_CHECK = 1
pulldown_type = 1
TRUNCATE = 1000000000000
MAX_EVENTS = 1000000000000
TRANSFORMS-test1 = body, content
transforms.conf
[body]
LOOKAHEAD = 1000000000000
SOURCE_KEY=_raw
REGEX=(.*?)\<body\>.*?\</body\>(.*)
DEST_KEY=_raw
FORMAT=$1<body>####</body>$2
[content]
LOOKAHEAD = 1000000000000
SOURCE_KEY=_raw
REGEX=(.*?)\<content\>.*?\</content\>(.*)
DEST_KEY=_raw
FORMAT=$1<content>*******</content>$2
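For reference, the pre-processing idea mentioned in the question could be sketched in Python rather than a batch script. This is only a rough sketch under assumptions: the function name `strip_large_tags`, the script name in the usage comment, and the UTF-8 encoding are hypothetical, and it assumes the `<body>` and `<content>` tags carry no attributes and are not nested. The placeholder text mirrors the FORMAT values in the transforms above.

```python
import re
import sys

# Placeholder text per tag, mirroring the FORMAT values in transforms.conf.
PLACEHOLDERS = {"body": "####", "content": "*******"}


def strip_large_tags(xml_text, placeholders=PLACEHOLDERS):
    """Replace the inner text of each listed tag with a short placeholder."""
    for tag, filler in placeholders.items():
        name = re.escape(tag)
        # re.DOTALL lets '.' match newlines, so multi-line tag bodies are covered.
        # The non-greedy '.*?' stops at the first matching closing tag.
        pattern = re.compile(r"<%s>.*?</%s>" % (name, name), re.DOTALL)
        replacement = "<%s>%s</%s>" % (tag, filler, tag)
        # A function replacement avoids any backslash interpretation in the filler.
        xml_text = pattern.sub(lambda match, r=replacement: r, xml_text)
    return xml_text


if __name__ == "__main__":
    # Hypothetical usage: python strip_tags.py input.xml output.xml
    if len(sys.argv) == 3:
        with open(sys.argv[1], encoding="utf-8") as src:
            cleaned = strip_large_tags(src.read())
        with open(sys.argv[2], "w", encoding="utf-8") as dst:
            dst.write(cleaned)
```

The cleaned copy would then be the file that Splunk monitors, so the large tag contents never reach the index-time transforms.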