Routing data to different indexes using regex

sivaranjiniG
Communicator

I have a standalone Splunk server that receives network logs from several different network devices on the same port. I need to monitor this input and route the events to different indexes.

Sample data:

CEF:0|Palo Alto Networks|PAN-OS|10.2|TRAFFIC|allow|6|src=10.1.1.25 dst=172.16.20.10 spt=52415 dpt=443 proto=tcp act=allow deviceInboundInterface=ethernet1/1 deviceOutboundInterface=ethernet1/2 rule=Allow-Internet msg=SSL traffic to internal site
CEF:0|Palo Alto Networks|PAN-OS|10.2|THREAT|virus|5|src=10.1.1.45 dst=192.168.40.12 spt=52133 dpt=8080 proto=tcp act=block msg=Detected virus Win32.Trojan.Generics in HTTP traffic
CEF:0|Trend Micro|Deep Security Agent|12.0|IDS|Exploit Attempt Detected|8|src=192.168.1.77 dst=172.16.15.5 spt=51711 dpt=445 proto=tcp act=block msg=Detected network exploit attempt CVE-2021-1675
CEF:0|Trend Micro|Apex One|14.0|MALWARE|Malware Detected|5|src=10.10.5.22 dst=unknown spt=0 dpt=0 act=quarantine fileHash=9f83d03e8c7a12a15b9d msg=Detected TROJ_GEN.R002C0PJS22
CEF:0|FORCEPOINT|NGFW|6.8|network|connection allowed|3|src=10.20.1.14 dst=172.16.8.50 spt=52111 dpt=22 proto=tcp act=allow msg=SSH connection allowed by corporate policy
CEF:0|FORCEPOINT|Web Security|8.5|proxy|blocked|5|src=10.0.2.20 dst=104.16.132.229 spt=54521 dpt=80 cs1Label=URL cs1=http://malicious-domain.com msg=Blocked malicious website
CEF:0|VMware|Carbon Black EDR|7.8|process|Process Start|3|src=192.168.100.45 dst=localhost suser=Administrator fname=C:\Windows\System32\cmd.exe msg=New process started: cmd.exe /c whoami
CEF:0|VMware|Carbon Black EDR|7.8|alert|Malware Detected|10|src=192.168.100.45 suser=SYSTEM fileHash=3b9f0aabc4f98efc8d1f msg=Malware detected: Trojan.Win32.Emotet
CEF:0|VMware|Carbon Black EDR|7.8|process|Process Start|3|src=192.168.100.45 dst=localhost suser=Administrator fname=C:\Windows\System32\cmd.exe msg=New process started: cmd.exe /c whoami

inputs.conf
[udp://9156]
index=dummy_idx
sourcetype=dummy_srctype
connection_host = ip

 
props.conf

[dummy_srctype]
TRANSFORMS-routing = route_trendmicro, route_forcepoint

transforms.conf

[route_trendmicro]
DEST_KEY = _MetaData:Index
REGEX = Trend Micro
FORMAT = trendmicro_idx

[route_forcepoint]
DEST_KEY = _MetaData:Index
REGEX = FORCEPOINT
FORMAT = forcepoint_idx


It's not working. Could someone please help me here?

tscroggins
Champion

Hi @sivaranjiniG,

While the transforms should be working, timestamping and line breaking may not be. This could result in events that are too far in the past, too far in the future, or malformed.

Try a combination of the following:

# indexes.conf

[default]
# define a last chance index if you don't already have one
# lastChanceIndex = lastchance

# [lastchance]
# ...

# inputs.conf

[udp://9156]
index = this_index_does_not_exist
connection_host = ip

# props.conf

[source::udp:9156]
DATETIME_CONFIG = CURRENT
LINE_BREAKER = ([\r\n]+)CEF:
SHOULD_LINEMERGE = false
TRANSFORMS-type_and_route = type_forcepoint,route_forcepoint,type_trendmicro,route_trendmicro

# transforms.conf

[type_forcepoint]
DEST_KEY = MetaData:Sourcetype
REGEX = ^CEF:0\|FORCEPOINT\|
FORMAT = sourcetype::cef

[route_forcepoint]
DEST_KEY = _MetaData:Index
REGEX = ^CEF:0\|FORCEPOINT\|
FORMAT = forcepoint_idx

[type_trendmicro]
DEST_KEY = MetaData:Sourcetype
REGEX = ^CEF:0\|Trend Micro\|
FORMAT = sourcetype::cef

[route_trendmicro]
DEST_KEY = _MetaData:Index
REGEX = ^CEF:0\|Trend Micro\|
FORMAT = trendmicro_idx
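
The line breaker and routing regexes above can be checked offline before touching the Splunk configuration. The Python sketch below is not how Splunk implements index-time transforms; it only imitates the effect of LINE_BREAKER (break on newlines that are followed by CEF:) and applies the two routing regexes in order, letting a later match overwrite an earlier one. The function names break_events and route are hypothetical, and the events are shortened copies of sample events from the question.

```python
import re

# Routing rules copied from the transforms.conf stanzas above: (REGEX, FORMAT).
ROUTES = [
    (re.compile(r"^CEF:0\|FORCEPOINT\|"), "forcepoint_idx"),
    (re.compile(r"^CEF:0\|Trend Micro\|"), "trendmicro_idx"),
]

# Index from the [udp://9156] inputs stanza; events matching no rule keep it.
DEFAULT_INDEX = "this_index_does_not_exist"

# Shortened copies of three sample events, joined as they might arrive.
SAMPLE = (
    "CEF:0|Palo Alto Networks|PAN-OS|10.2|TRAFFIC|allow|6|src=10.1.1.25 dst=172.16.20.10\n"
    "CEF:0|Trend Micro|Apex One|14.0|MALWARE|Malware Detected|5|src=10.10.5.22 act=quarantine\r\n"
    "CEF:0|FORCEPOINT|NGFW|6.8|network|connection allowed|3|src=10.20.1.14 dpt=22\n"
)


def break_events(raw):
    # Imitates LINE_BREAKER = ([\r\n]+)CEF: by splitting on newlines that are
    # followed by "CEF:"; the newlines are dropped and "CEF:" stays in the event.
    return [e for e in re.split(r"[\r\n]+(?=CEF:)", raw.strip()) if e]


def route(event):
    # Apply every rule in order; a later matching rule overwrites the index.
    index = DEFAULT_INDEX
    for pattern, target in ROUTES:
        if pattern.search(event):
            index = target
    return index


if __name__ == "__main__":
    for event in break_events(SAMPLE):
        print(route(event), "<-", event[:40])
```

In this sketch the Palo Alto event matches neither rule and keeps the index from the inputs stanza, which is why a last chance index (or a real default index) is worth defining.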

You can also set sourcetype in the [udp://9156] inputs stanza or [source::udp:9156] props stanza if all events will have the same source type. Setting sourcetype in a transform gives you a little more flexibility.

If you can change it, use a non-CEF format, e.g., the source's native syslog format; this isn't ArcSight, although the CEF spec is still useful. 😉 You'll have broader access to off-the-shelf Splunkbase apps and add-ons using non-CEF formats, and you'll spare yourself the overhead of maintaining a functional CEF field extraction transform. Without a custom transform, you'll notice the msg field, for example, is truncated after the first space when using KV_MODE = auto.
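
To illustrate the truncation described above: a naive key=value split ends each value at the first space, while CEF extension values such as msg may contain spaces and run until the next " key=" token. The Python sketch below contrasts the two approaches on one sample event from the question. It only illustrates the parsing problem; it is not Splunk's KV_MODE implementation and not a complete CEF parser (escaped = and | characters are ignored), and the function names are hypothetical.

```python
import re

# One of the sample events from the question.
EVENT = (
    "CEF:0|FORCEPOINT|NGFW|6.8|network|connection allowed|3|"
    "src=10.20.1.14 dst=172.16.8.50 spt=52111 dpt=22 proto=tcp act=allow "
    "msg=SSH connection allowed by corporate policy"
)


def extension(event):
    # The CEF header has seven pipe-delimited fields; the rest is the extension.
    return event.split("|", 7)[7]


def naive_kv(ext):
    # Each value ends at the first whitespace character.
    return dict(re.findall(r"(\w+)=(\S*)", ext))


def cef_kv(ext):
    # Each value runs until the next " key=" token or the end of the string.
    return dict(re.findall(r"(\w+)=(.*?)(?=\s+\w+=|$)", ext))


if __name__ == "__main__":
    ext = extension(EVENT)
    print("naive:", naive_kv(ext)["msg"])
    print("cef:  ", cef_kv(ext)["msg"])
```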


sivaranjiniG
Communicator

Hello @tscroggins 

You are a genius and a life saver 😄

The transforms didn't work because I didn't do the parsing for the dummy_idx. I used the props you shared, and now it's routing properly.

Thank you so much.

 

sivaranjiniG
Communicator

Hello @tscroggins 
Thanks for the response, but I think I did not explain the problem clearly.

The transforms are not working. When I index the logs, they go to the last index defined in the transforms, which in this case is forcepoint_idx. Not only the Forcepoint logs; all the logs are going to forcepoint_idx.



tscroggins
Champion

Hi @sivaranjiniG,

There must be something in your inputs, props, or transforms that is different or missing from your original post. Check the output of the following for extra or incorrect settings:

$SPLUNK_HOME/bin/splunk btool inputs list udp://9156 --debug

$SPLUNK_HOME/bin/splunk btool props list dummy_srctype --debug

$SPLUNK_HOME/bin/splunk btool transforms list route_trendmicro --debug

$SPLUNK_HOME/bin/splunk btool transforms list route_forcepoint --debug

btool should show inherited default settings as well, but you can cross-reference the default stanzas in your conf files if needed.


