Folks,
I have some NetFlow logs that go something like this:
<?xml version="1.0"?>
<flowrecord>
<delimiter>|</delimiter>
<key name="srcaddr" type="ipaddress"/>
<key name="destaddr" type="ipaddress"/>
<value name="inBytes" type="integer" usage="count"/>
<value name="outBytes" type="integer" usage="count"/>
</flowrecord>
1.2.3.4|2.3.4.5|100|34
2.3.4.5|4.5.6.7|20|12
8.9.1.1|4.3.2.1|123|1034
</xml>
We have a few things going on here. For the multi-line header we want SHOULD_LINEMERGE=true, and for the actual events we want SHOULD_LINEMERGE=false.
These are all sourcetyped as netflow, and I've had several unsuccessful attempts at getting Splunk to process them the way I'd like.
What's the best way to configure Splunk to have the XML as a single multi-line event, and the events broken up?
I'd obviously like to process the actual events, which is fairly easy with DELIMS/FIELDS in props.conf/transforms.conf, but dealing with this XML header has thrown a curveball into what should be a simple setup.
Would appreciate any thoughts.
Thanks.
You could use transforms.conf and nullQueue the lines that start with <. Those lines are discarded before indexing, so they don't count against your license, and all of the XML tags are removed. Set your linebreaking back to default, and the remaining records should just index as individual lines.
props.conf
[netflow]
TRANSFORMS-null = nullQueueNetFlow
transforms.conf
[nullQueueNetFlow]
REGEX = (?m)^\s*<
DEST_KEY = queue
FORMAT = nullQueue
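Once the XML lines are dropped, the remaining pipe-delimited records can be split into fields at search time with the DELIMS/FIELDS approach mentioned in the question. This is only a sketch: the stanza name netflow_pipe_fields is made up for illustration, and the field names are simply taken from the key/value names in the XML header above.

```
# props.conf (goes in the same [netflow] stanza as the TRANSFORMS line)
[netflow]
REPORT-netflow_fields = netflow_pipe_fields

# transforms.conf
# netflow_pipe_fields is a hypothetical stanza name; rename as you like
[netflow_pipe_fields]
DELIMS = "|"
FIELDS = "srcaddr", "destaddr", "inBytes", "outBytes"
```

The nullQueue transform runs at index time, while the REPORT extraction runs at search time, so the two settings should not interfere with each other.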