Splunk Search

Field Extraction on Text File

garima_chauhan
Path Finder

Hi,

I have host firewall logs coming in as a text file. The data in the file is separated by spaces and is inconsistent: some rows have, say, 8 columns, some have fewer, and some have more. I want to perform field extraction on this data. How can this be achieved? I am familiar with CSV field extraction, but there the data is consistent, unlike in this text file. I am using Splunk v5.0.5.

Please help. It's quite urgent. Any help would be really appreciated.


gfuente
Motivator

Some characters were dropped from the regex when it was posted. Please see this updated version:

... | rex "^(?<field1>[^\s]+)?(\s)*(?<field2>[^\s]+)?(\s)*(?<field3>[^\s]+)?(\s)*(?<field4>[^\s]+)?(\s)*(?<field5>[^\s]+)?(\s)*(?<field6>[^\s]+)?(\s)*(?<field7>[^\s]+)?(\s)*(?<field8>[^\s]+)?(\s)*(?<field9>[^\s]+)?(\s)*(?<field10>[^\s]+)?(\s)*(?<field11>[^\s]+)?(\s)*(?<field12>[^\s]+)?(\s)*(?<field13>[^\s]+)?(\s)*(?<field14>[^\s]+)?(\s)*(?<field15>[^\s]+)?(\s)*" | ...

garima_chauhan
Path Finder

Hi,

It still didn't work. :(
I copied this exact regex.


gfuente
Motivator

Hello

You could use a rex like this one:

^(?<field1>[^\s]+)?(\s)?(?<field2>[^\s]+)?(\s)?(?<field3>[^\s]+)?(\s)?(?<field4>[^\s]+)?(\s)?(?<field5>[^\s]+)?(\s)?(?<field6>[^\s]+)?(\s)?(?<field7>[^\s]+)?(\s)?(?<field8>[^\s]+)?(\s)?(?<field9>[^\s]+)?(\s)?(?<field10>[^\s]+)?(\s)?

Add as many field groups as the maximum number of fields you could have in the log file.

Regards
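
As a rough sketch of how the pattern above might be used in a full search, here it is wrapped in a rex command and cut down to three fields for readability. The source name FirewallLogs is taken from the search quoted elsewhere in this thread. Note that (\s)? allows at most one whitespace character between values, so this form assumes single-space separators.

```
source=FirewallLogs
| rex "^(?<field1>[^\s]+)?(\s)?(?<field2>[^\s]+)?(\s)?(?<field3>[^\s]+)?(\s)?"
| table field1 field2 field3
```

Events with fewer values than groups still match, because every group is optional; the fields with no value are simply left empty in the table.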

garima_chauhan
Path Finder

Hi, I tried the following search:
source=FirewallLogs | rex "^(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)*" | table field1 field2

but nothing gets displayed.


gfuente
Motivator

Your sample lines have more than one space between some fields. That's different from what you explained in your original question. Try this:

| rex "^(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)*"

This works with the sample data you provided.
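
The same idea (values separated by one or more whitespace characters, with later values optional) can also be written by making each "separator plus value" pair a single optional group. This is only a sketch, shortened to four fields; it assumes the first value is always present and uses \S as shorthand for [^\s].

```
source=FirewallLogs
| rex "^\s*(?<field1>\S+)(?:\s+(?<field2>\S+))?(?:\s+(?<field3>\S+))?(?:\s+(?<field4>\S+))?"
| table field1 field2 field3 field4
```

Because \s+ consumes any run of spaces, the number of spaces between values does not matter. Repeat the (?:\s+(?<fieldN>\S+))? group once for each additional field you expect.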


garima_chauhan
Path Finder

Hi gfuente,

My log file looks like:

7 123456 1.1.1.1 sfgdfghdghgdh 25 6 2.2.2.2 5255225 3.3.3.3 80 1 1 0 sdgzdfsg
7 456789 1.1.1.1 fsdfgsfgsfgfv 52 6 3.3.3.3 4654646 5.5.5.5 4564 2 2 2 pathoffile ssdfgsfg
7 123456 1.1.1.1 sfgdfghdghgdh 25 6 2.2.2.2 5255225 3.3.3.3 80 1 1 0 pathoffilevzfsgfgjsdlgjlsggflkgj sdgzdfsg

I am guessing that the column number discrepancy is due to the fact that if one column value is blank, it is not left blank and is instead populated with the next column value.

In any case, I do not know how to tackle this. Please help.
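
In the three sample lines above, the first has 14 space-separated values and the other two have 15; the extra value (pathoffile...) sits in the second-to-last position. If that holds for the whole file, one possible approach is to anchor the regex at both ends and make only that one column optional, so the last value never shifts into the wrong field. This is a sketch based on those three lines only: the count of 13 leading columns and the field names file_path and last_field are assumptions, not something confirmed for the real data.

```
source=FirewallLogs
| rex "^\s*(?:\S+\s+){13}(?:(?<file_path>\S+)\s+)?(?<last_field>\S+)\s*$"
| table file_path last_field
```

Events that do not have exactly 14 or 15 values will not match this pattern at all, so it only helps if the missing column is always the same one.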
