How do I set different source types on one data input?

Communicator

Hello,

I have a data input on TCP:10514 that receives logs from different devices: a Blue Coat proxy (192.168.3.217) and a firewall on 10.54.3.xxx.

I need the logs from the proxy to have the sourcetype bluecoat:proxysg:access:syslog and the logs from the firewall to have a different sourcetype.

How can I achieve this?

Thank you in advance.

1 Solution

SplunkTrust

Hi noybin,

You can do this on a heavy forwarder or an indexer by setting up props.conf for the proxy. First, set a default sourcetype for the input in inputs.conf:

[tcp://10514]
sourcetype = bluecoat

Next, use this sourcetype in props.conf to rewrite the sourcetype for the proxy:

[bluecoat] 
TRANSFORMS-001_bluecoat_rewrite = bluecoat_get_hostname,bluecoat_rewrite_sourcetype

and finally in transforms.conf set the regex to match the proxy IP:

[bluecoat_get_hostname] 
REGEX = "\s((?:\d+\.){3}\d+)\s
DEST_KEY = MetaData:Host 
FORMAT = host::$1 

[bluecoat_rewrite_sourcetype] 
SOURCE_KEY = MetaData:Host 
REGEX = 192\.168\.3\.217 
DEST_KEY = MetaData:Sourcetype 
FORMAT = sourcetype::bluecoat:proxysg:access:syslog 

This rewrites the sourcetype only for events whose host is this IP and leaves the sourcetype of all other events as bluecoat.
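For anyone who wants to sanity-check the logic outside Splunk, here is a rough Python sketch that mimics the order of the two transforms. It is not Splunk code: the function name `apply_transforms`, the shortened proxy event, and the firewall line are all made up for illustration. The one Splunk behaviour it relies on is that a transform without SOURCE_KEY runs its REGEX against the raw event, while the second transform only looks at the host value.

```python
import re

HOST_RE = re.compile(r'"\s((?:\d+\.){3}\d+)\s')  # REGEX of bluecoat_get_hostname
PROXY_RE = re.compile(r'192\.168\.3\.217')        # REGEX of bluecoat_rewrite_sourcetype


def apply_transforms(raw, host, sourcetype="bluecoat"):
    """Hypothetical helper: apply both transforms in the order listed in props.conf."""
    # Step 1: no SOURCE_KEY, so the regex is matched against the raw event text.
    match = HOST_RE.search(raw)
    if match:
        host = match.group(1)
    # Step 2: SOURCE_KEY = MetaData:Host, so only the host value is inspected.
    if PROXY_RE.search(host):
        sourcetype = "bluecoat:proxysg:access:syslog"
    return host, sourcetype


# Shortened, illustrative events (not real log lines).
proxy_event = '2016-02-15 20:50:12 405 10.54.3.51 noybin - "Mozilla/5.0" 192.168.3.217 4175 3940 - "none" "none"'
other_event = 'deny src=10.54.3.7 dst=192.168.3.217 port=443'

print(apply_transforms(proxy_event, host="unknown"))
# ('192.168.3.217', 'bluecoat:proxysg:access:syslog')
print(apply_transforms(other_event, host="10.54.3.1"))
# ('10.54.3.1', 'bluecoat')
```

The second event mentions the proxy IP in its text but keeps the default sourcetype, because the rewrite only checks the host.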

Hope this helps ...

cheers, MuS

Communicator

Could you tell me if this also works for setting the index?

SplunkTrust

Sure, but the transforms would look like this:

[stanza_name_goes_here] 
 SOURCE_KEY = MetaData:Host 
 REGEX = 192\.168\.3\.217 
 DEST_KEY = _MetaData:Index 
 FORMAT = SomeIndexNameGoesHere

Communicator

Thanks for your answer.

If I set the regex= 192\.168\.3\.217, then any message that comes from any device with the string "192.168.3.217" will be matched as bluecoat:proxysg:access:syslog.
Is that right?

I am using the Blue Coat add-on. Where can I find the regex used for host extraction for the sourcetype bluecoat:proxysg:access:syslog?

I was using the regex used for host extraction for syslog (splunk/etc/system/default/transforms.conf) but it doesn't work for these events:

REGEX = :\d\d\s+(?:\d+\s+|(?:user|daemon|local.?)\.\w+\s+)*\[?(192\.168\.3\.217)[\w\.\-]*\]?\s

Following are some events from my Blue Coat proxy:

2016-02-15 20:50:12 405 10.54.3.51 noybin LTT-AR\GRP%20Internet%20Base - OBSERVED "Web Ads/Analytics" http://www.diarioregistrado.com/  200 TCP_NC_MISS GET text/html;charset=UTF-8 http bcp.crwdcntrl.net 80 /5/c=6508/rand=111344262/pv=y/rt=ifr - - "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36" 192.168.3.217 4175 3940 - "none" "none"
2016-02-15 20:50:12 10828 10.54.3.51 noybin LTT-AR\GRP%20Internet%20Base - OBSERVED "News/Media" http://www.diarioregistrado.com/  200 TCP_NC_MISS GET image/gif http www.diarioregistrado.com 80 /files/banners/TVR2014-2.gif - gif "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36" 192.168.3.217 298625 886 - "none" "none"
2016-02-15 20:50:02 5434 10.54.3.51 noybin LTT-AR\GRP%20Internet%20Base - OBSERVED "News/Media" http://www.diarioregistrado.com/  200 TCP_NC_MISS GET image/jpeg http www.diarioregistrado.com 80 /upload/news/diarioregistrado/56c04c503aa8e.jpg - jpg "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36" 192.168.3.217 149894 907 - "none" "none"

Thank you very much.

SplunkTrust

Quote: If I set the regex= 192\.168\.3\.217, then any message that comes from any device with the string "192.168.3.217" will be matched as bluecoat:proxysg:access:syslog.
Is that right?

No, only events from host=192.168.3.217 will get the sourcetype bluecoat:proxysg:access:syslog. This is because the regex is only checked against SOURCE_KEY = MetaData:Host.
But the key here is to get the host extraction working first. Based on the examples you provided, this should work as the host regex: "\s((?:\d+\.){3}\d+)\s

Communicator

Thanks again!

So, as you say, I first need to get Splunk to extract the host correctly; then I can set the sourcetype with the REGEX and the SOURCE_KEY.

Then:
1. In the examples I copied, the IP address that should be extracted as the host is the second one in each event (192.168.3.217):

2016-02-15 20:50:12 405 10.54.3.51 noybin LTT-AR\GRP%20Internet%20Base - OBSERVED "Web Ads/Analytics" http://www.diarioregistrado.com/ 200 TCP_NC_MISS GET text/html;charset=UTF-8 http bcp.crwdcntrl.net 80 /5/c=6508/rand=111344262/pv=y/rt=ifr - - "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36" 192.168.3.217 4175 3940 - "none" "none"

Won't your regex "\s((?:\d+\.){3}\d+)\s match the first IP address?

2. Where should I write that regex? (Which file and key/stanza?)

Thank you very much.

SplunkTrust

No, it will not, because the regex requires "\s (a double quote followed by a whitespace character) in front of the digits, and this only occurs before the second IP, not before the first.
Take a look at the docs at http://docs.splunk.com/Documentation/Splunk/6.3.3/Knowledge/ExtractfieldsinteractivelywithIFX to learn about field extraction.
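The claim is easy to check outside Splunk. The following Python snippet is only a quick sanity check, not Splunk configuration: it runs the regex against the first sample event posted above, and then runs a variant without the leading quote (the name `unanchored` is made up here) to show what would happen without that anchor.

```python
import re

# First sample event from the thread, split across lines for readability.
event = (
    r'2016-02-15 20:50:12 405 10.54.3.51 noybin LTT-AR\GRP%20Internet%20Base - OBSERVED '
    r'"Web Ads/Analytics" http://www.diarioregistrado.com/  200 TCP_NC_MISS GET '
    r'text/html;charset=UTF-8 http bcp.crwdcntrl.net 80 /5/c=6508/rand=111344262/pv=y/rt=ifr - - '
    r'"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) '
    r'Chrome/48.0.2564.97 Safari/537.36" 192.168.3.217 4175 3940 - "none" "none"'
)

anchored = re.compile(r'"\s((?:\d+\.){3}\d+)\s')   # regex from the answer
unanchored = re.compile(r'\s((?:\d+\.){3}\d+)\s')  # same regex without the leading quote

print(anchored.search(event).group(1))    # 192.168.3.217
print(unanchored.search(event).group(1))  # 10.54.3.51
```

The only place in the event where a double quote is followed by whitespace and then an IP address is right after the user-agent string, which is why the first IP is skipped.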

Communicator

OK, you're right.
So where do I have to write that regex to extract the host correctly?

And in what stanza/key?

Thank you so much again!

SplunkTrust

Try something like this in $SPLUNK_HOME/etc/apps/YourAppName/local/props.conf:

[bluecoat] 
TRANSFORMS-001_rewrite_bluecoat_sourcetype = bluecoat_get_hostname,bluecoat_rewrite_sourcetype

and in $SPLUNK_HOME/etc/apps/YourAppName/local/transforms.conf set the regex to match the second IP:

[bluecoat_get_hostname]
REGEX = "\s((?:\d+\.){3}\d+)\s
DEST_KEY = MetaData:Host
FORMAT = host::$1

[bluecoat_rewrite_sourcetype]
SOURCE_KEY = MetaData:Host
REGEX = 192\.168\.3\.217
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::bluecoat:proxysg:access:syslog

This is untested, but it should work as long as you have it on either a heavy forwarder or an indexer. Don't forget to restart Splunk after the changes.

Communicator

Hi, applying your latest comment, now Splunk is extracting the host correctly but it isn't applying the source type correctly (it doesn't apply bluecoat:proxysg:access:syslog).

The host is being extracted as 192.168.3.217 (OK)
The sourcetype is being extracted as syslog (which is the default sourcetype for the data input) instead of bluecoat:proxysg:access:syslog.

I've set the following:

-- /sdm/splunk/etc/system/local/props.conf --

TRANSFORMS-changesourcetype = bluecoat_get_hostname,set_sourcetype_bluecoat_for_some_hosts

-- /sdm/splunk/etc/system/local/transforms.conf --

[bluecoat_get_hostname]
REGEX = ["|-]\s((?:\d+\.){3}\d+)\s
DEST_KEY = MetaData:Host
FORMAT = host::$1

[set_sourcetype_bluecoat_for_some_hosts]
SOURCE_KEY = MetaData::Host
REGEX = 192\.168\.3\.217
DEST_KEY = MetaData::Sourcetype
FORMAT = sourcetype::bluecoat:proxysg:access:syslog

I had to add ["|-] to the regex because of some special events that came in that format.
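A note on what that character class accepts: inside square brackets `|` is a literal pipe rather than alternation, and a trailing `-` is a literal hyphen, so `["|-]` matches a double quote, a pipe, or a hyphen. The Python check below is only a sketch. Both event tails are hypothetical, and the second one assumes the "special" events are ones where the user-agent field was logged as `-`.

```python
import re

host_re = re.compile(r'["|-]\s((?:\d+\.){3}\d+)\s')

# Hypothetical event tails, not taken from real logs.
with_agent = 'rt=ifr - - "Mozilla/5.0" 192.168.3.217 4175 3940 - "none" "none"'
without_agent = 'rt=ifr - - - 192.168.3.217 4175 3940 - "none" "none"'  # assumes user agent logged as "-"

for tail in (with_agent, without_agent):
    print(host_re.search(tail).group(1))
```

Under that assumption both tails yield the proxy IP. If a pipe never precedes the IP in the real data, `["-]` would likely be enough.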

Thank you!

SplunkTrust

Is your props.conf applied to [syslog] ?

Communicator

Thanks again.

I tried both [syslog] and [source::tcp:10514]. Both correctly apply the host transformation (bluecoat_get_hostname) but not the sourcetype transformation (set_sourcetype_bluecoat_for_some_hosts).

Thank you.

SplunkTrust

Hi, I updated the original answer and fixed all typos in it. So, feel free to accept it if this answers your question - thanks.

Communicator

Done!
Thank you very much for your help!

Champion

This can be accomplished through transforms and props on the indexers. You will need to use regex to identify which log lines are from the firewall. Read the article below.

http://docs.splunk.com/Documentation/Splunk/4.0/Knowledge/Overridesourcetypesonaper-eventbasis

Communicator

I've tried your solution and when I search I see the events with the correct sourcetype, but the host field is extracted with the original sourcetype which extracts it wrongly.

By host I mean the IP of the proxy that sends the logs.

I've set "syslog" as the sourcetype for the data input and "proxysg:access:syslog" for events where the regex matches.

The host is extracted with the syslog regex and not with the proxysg:access:syslog one. And it extracts the host field with the wrong value and I guess that I will have the same problem when I add more source types.

Is there a way to make the host be extracted with the correct source type?

Thank you very much for your help.