Getting Data In

How to transform my raw data into a readable format?

markb81
New Member

Hi,

I'm new to Splunk and hope I don't ask a question that's already been asked. I just don't know which terminology to use to search.

I configured my firewall to send syslog messages to a syslog server. That syslog server is indexed with Splunk. Splunk indexes the UTF-8 messages. So far so good. All the data I need to see is in the index.

However, I would now like to make it more readable. For instance, I would like to search for an IP address and port to see why they were blocked.

The issue is that all the required data is in the _raw field. It looks like this:

23-02-2017 10:30 PM,Info,2001:x:x:20::1,pfsensefw.x.local,5,16777216,,1000000103,igb1,match,block,in,4,0x0,,49,47728,0,none,6,tcp,4x,21x.2.108.2,195.3x.2xxxx,9719,22,0,S,3273904238,,16143,,

How can I transform this data into something readable?

Date, time, severity, firewall IP, host, a lot of data I don't need, action (block/accept), data I don't need, protocol, source IP, destination IP, source port, destination port

And if that's possible, can the fields also be re-ordered?

Hope somebody can help me with it.

Thanks!

Mark

DalJeanis
SplunkTrust

Here's one potential regex. I have no idea what names you wanted for anything, so the names are arbitrary.

^(?<a1>[-\d]+\s[\d\:]+\s[^,]+),(?<a2>[^,]+),(?<a3>[^,]+),(?<a4>[^,]+),(?<a5>[^,]+),([^,]*,){6}(?<a7>[^,]+),(?<a8>[^,]*,){9}(?<a9>[^,]+),(?<a10_IP1>[^,]+),(?<a11_IP2>[^,]+),(?<a12_Port1>[^,]+),(?<a13_port2>[^,]+),(?<remainder>.+)$

When you build one up from scratch from the left, using a facility like regex101.com, put something like (?<remainder>.+)$ at the end so that you will always have a match for the rest. It saves a lot of guesswork and fumbling.
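As a rough illustration of that left-to-right approach outside Splunk, here is a minimal Python sketch run against the sample event from the question. The group names stamp, severity and firewall_ip and the helpers to_python and run_steps are made up for the example. Python's re module spells named groups (?P<name>...) where rex accepts (?<name>...), so the sketch converts the syntax first.

```python
import re

# Sample event copied from the question.
SAMPLE = (
    "23-02-2017 10:30 PM,Info,2001:x:x:20::1,pfsensefw.x.local,5,16777216,,"
    "1000000103,igb1,match,block,in,4,0x0,,49,47728,0,none,6,tcp,4x,"
    "21x.2.108.2,195.3x.2xxxx,9719,22,0,S,3273904238,,16143,,"
)

# Grow the pattern one field at a time. The trailing (?<remainder>.+)$ keeps
# every intermediate version matching, so you can see what is still left.
STEPS = [
    r"^(?<stamp>[^,]+),(?<remainder>.+)$",
    r"^(?<stamp>[^,]+),(?<severity>[^,]+),(?<remainder>.+)$",
    r"^(?<stamp>[^,]+),(?<severity>[^,]+),(?<firewall_ip>[^,]+),(?<remainder>.+)$",
]


def to_python(pattern):
    """Convert (?<name>...) groups to Python's (?P<name>...) spelling."""
    return pattern.replace("(?<", "(?P<")


def run_steps(event):
    """Return the captured groups for each partial pattern, in order."""
    results = []
    for step in STEPS:
        match = re.match(to_python(step), event)
        results.append(match.groupdict() if match else None)
    return results


if __name__ == "__main__":
    for groups in run_steps(SAMPLE):
        print(groups)
```

Each printed dictionary shows the fields named so far plus whatever is still sitting in remainder, which is the next thing to carve a field out of.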

DalJeanis
SplunkTrust

Used the field names from somesoni2's answer to update and debug. Try this -

| rex "^(?<date>\S+)\s(?<time>[^,]+),(?<severity>[^,]+),(?<firewall_ip>[^,]+),(?<hostname>[^,]+),([^,]*,){6}(?<action>[^,]+),([^,]*,){9}(?<protocol>[^,]+),(?<a09>[^,]+),(?<src_ip>[^,]+),(?<dest_ip>[^,]+),(?<src_port>[^,]+),(?<dest_port>[^,]+),(?<remainder>.+)$"

Feel free to rename a09 if you know what it is.
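If you want to sanity-check the pattern outside Splunk before relying on it, a minimal Python sketch like the one below runs it against the sample event from the question. It assumes the only syntax difference that matters here is Python's (?P<name>...) spelling for named groups. The helper names to_python and extract, and the column order in COLUMNS, are just for illustration.

```python
import re

# Sample event copied from the question.
SAMPLE = (
    "23-02-2017 10:30 PM,Info,2001:x:x:20::1,pfsensefw.x.local,5,16777216,,"
    "1000000103,igb1,match,block,in,4,0x0,,49,47728,0,none,6,tcp,4x,"
    "21x.2.108.2,195.3x.2xxxx,9719,22,0,S,3273904238,,16143,,"
)

# The rex pattern from this post, unchanged.
REX = (
    r"^(?<date>\S+)\s(?<time>[^,]+),(?<severity>[^,]+),(?<firewall_ip>[^,]+),"
    r"(?<hostname>[^,]+),([^,]*,){6}(?<action>[^,]+),([^,]*,){9}"
    r"(?<protocol>[^,]+),(?<a09>[^,]+),(?<src_ip>[^,]+),(?<dest_ip>[^,]+),"
    r"(?<src_port>[^,]+),(?<dest_port>[^,]+),(?<remainder>.+)$"
)

# Once the fields are named, any display order works; this one is arbitrary.
COLUMNS = ["date", "time", "action", "protocol",
           "src_ip", "src_port", "dest_ip", "dest_port"]


def to_python(pattern):
    """Convert (?<name>...) groups to Python's (?P<name>...) spelling."""
    return pattern.replace("(?<", "(?P<")


def extract(event):
    """Return the named fields for one event, or None if it does not match."""
    match = re.match(to_python(REX), event)
    return match.groupdict() if match else None


if __name__ == "__main__":
    fields = extract(SAMPLE)
    print(", ".join(f"{name}={fields[name]}" for name in COLUMNS))
```

In Splunk itself, piping the rex output to the table command with the field names in the order you want gives the equivalent re-ordering.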

Richfez
SplunkTrust

markb81,

You have another option - there's an app called Home Monitor for that. 🙂 It takes a little bit to get set up right (read the directions carefully) but it'll do all your pfsense extractions and parsing for you, including lookups.

somesoni2
SplunkTrust

What you need is to extract fields from your raw data. All the required values seem to be comma-separated, so you can assign each comma-separated segment a name and then use that name in your search. For example (this is based on segment position, so you may need to adjust it to the number of comma-separated values you have):

your base search | rex "^(?<date>\S+)\s(?<time>[^,]+),(?<severity>[^,]+),(?<firewall_ip>[^,]+),(?<hostname>[^,]+),([^,]*,){6}(?<action>[^,]+),([^,]*,){9}(?<protocol>[^,]+),(?<src_ip>[^,]+),(?<dest_ip>[^,]+),(?<src_port>[^,]+),(?<dest_port>[^,]+)," 

See these links for more information on field extractions:
http://docs.splunk.com/Documentation/Splunk/6.5.2/Knowledge/WhenSplunkEnterpriseaddsfields#Field_ext...
http://docs.splunk.com/Documentation/Splunk/6.5.2/Knowledge/ExtractfieldsinteractivelywithIFX
http://docs.splunk.com/Documentation/Splunk/latest/Search/Extractfieldswithsearchcommands

DalJeanis
SplunkTrust

Looks like there's a field missing between protocol and src_ip.
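One way to see it is to split the sample event from the question on commas and print the positions around the protocol. This is a minimal Python sketch; the helper name positions is made up for the example.

```python
# Sample event copied from the question.
SAMPLE = (
    "23-02-2017 10:30 PM,Info,2001:x:x:20::1,pfsensefw.x.local,5,16777216,,"
    "1000000103,igb1,match,block,in,4,0x0,,49,47728,0,none,6,tcp,4x,"
    "21x.2.108.2,195.3x.2xxxx,9719,22,0,S,3273904238,,16143,,"
)


def positions(event, start, stop):
    """Return (index, value) pairs for comma-separated segments start..stop-1."""
    parts = event.split(",")
    return list(enumerate(parts))[start:stop]


if __name__ == "__main__":
    for index, value in positions(SAMPLE, 20, 26):
        print(index, repr(value))
    # 20 'tcp'
    # 21 '4x'
    # 22 '21x.2.108.2'
    # 23 '195.3x.2xxxx'
    # 24 '9719'
    # 25 '22'
```

Segment 21 sits between the protocol and the source IP, so a pattern that goes straight from protocol to src_ip shifts every later field by one position.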

markb81
New Member

Hi,

Thanks so much for taking the time to answer. I've been playing around a little with the regex you created. It's almost complete, but I cannot get it to work. The source and destination IPs and ports were wrong. It seems the raw data looks a little different than the indexed data.

The indexed data:

25-02-2017 04:24 PM,Info,2001:41f0:xx:20::1,pfsensefw.ax.local,5,16777216,,1000000103,lagg1_vlan151,match,block,in,4,0x10,,16,0,0,none,17,udp,201,192.x.150.3,255.255.255.255,7303,7303,181

I already created this regex which is almost complete:

rex "^(?\S+)\s(?[^,]+),(?[^,]+),(?[^,]+),(?[^,]+),([^,]*,){6}(?[^,]+),([^,]*,){9}(?[^,]+),(?[^,]+),(?[^,]+),(?[^,]+),(?[^,]+),"

The issue is that the dst_port is wrong. In my example the dst_port is 7303. I'm not sure which one 🙂 How can I get my regex working with dst_port 7303?

And once this is complete, how do I actually work with this regex? I already included it with a base search, selected verbose mode and chose the fields to be displayed. This works great. But is this something I need to do every time, or can I save this search and then call it as a shortcut in my search field?

So my search looks like: host="myhost" | regex

Can I turn it into: shortcut | src_ip="IP" action="block" to really search smart?

Kind regards and thanks very much for the help.

Mark

DalJeanis
SplunkTrust

Okay, when you post code in the forum, you need to make sure to highlight it and mark it with the code button. That's the little button with 101 010 on it.

The reason you have to do that is that, especially with regexes, the web interface deletes anything in angle brackets < >, as it would an HTML tag. So what you posted isn't what anyone sees, and no one can independently test your code, because they don't have it.
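To illustrate, here is a minimal Python sketch that mimics that stripping on the pattern from somesoni2's answer. The substitution <[^>]*> is only an assumed stand-in for whatever the forum software really does, and the function names strip_tags and compiles are made up for the example.

```python
import re

# Pattern from somesoni2's answer earlier in the thread.
POSTED = (
    r"^(?<date>\S+)\s(?<time>[^,]+),(?<severity>[^,]+),(?<firewall_ip>[^,]+),"
    r"(?<hostname>[^,]+),([^,]*,){6}(?<action>[^,]+),([^,]*,){9}"
    r"(?<protocol>[^,]+),(?<src_ip>[^,]+),(?<dest_ip>[^,]+),"
    r"(?<src_port>[^,]+),(?<dest_port>[^,]+),"
)


def strip_tags(text):
    """Drop anything that looks like an HTML tag (assumed forum behaviour)."""
    return re.sub(r"<[^>]*>", "", text)


def compiles(pattern):
    """Report whether Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


if __name__ == "__main__":
    displayed = strip_tags(POSTED)
    print(displayed)            # every (?<name>...) has become (?...)
    print(compiles(displayed))  # the stripped text is not a valid regex here
```

The stripped output has the same shape as the regex in the post above, with every group name gone, so there is nothing left for anyone to test.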
