Splunk Search

Field Extraction on Text File

garima_chauhan
Path Finder

Hi,

I have host firewall logs coming in as a text file. The data is separated by spaces but is inconsistent: some rows have 8 columns, some have fewer, and some have more. I want to perform field extraction on this data. How can this be achieved? I am familiar with CSV field extraction, but CSV data does not have this inconsistency. I am using Splunk v5.0.5.

Please help. It's quite urgent. Any help would be really appreciated.

gfuente
Motivator

There are missing characters, please see this update:

... | rex "^(?<field1>[^\s]+)?(\s)*(?<field2>[^\s]+)?(\s)*(?<field3>[^\s]+)?(\s)*(?<field4>[^\s]+)?(\s)*(?<field5>[^\s]+)?(\s)*(?<field6>[^\s]+)?(\s)*(?<field7>[^\s]+)?(\s)*(?<field8>[^\s]+)?(\s)*(?<field9>[^\s]+)?(\s)*(?<field10>[^\s]+)?(\s)*(?<field11>[^\s]+)?(\s)*(?<field12>[^\s]+)?(\s)*(?<field13>[^\s]+)?(\s)*(?<field14>[^\s]+)?(\s)*(?<field15>[^\s]+)?(\s)*" | ...

garima_chauhan
Path Finder

Hi,

Still didn't work. :(
I copied this exact regex.

gfuente
Motivator

Hello

You could use a rex like this one:

^(?<field1>[^\s]+)?(\s)?(?<field2>[^\s]+)?(\s)?(?<field3>[^\s]+)?(\s)?(?<field4>[^\s]+)?(\s)?(?<field5>[^\s]+)?(\s)?(?<field6>[^\s]+)?(\s)?(?<field7>[^\s]+)?(\s)?(?<field8>[^\s]+)?(\s)?(?<field9>[^\s]+)?(\s)?(?<field10>[^\s]+)?(\s)?

Add as many fields as the maximum number of fields you could have in the log file.
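
To find that maximum, one option is to count the space-separated tokens in each event. The search below is only a sketch: it assumes the events are in source=FirewallLogs (the source name used elsewhere in this thread), that columns never contain spaces themselves, and that makemv with its default allowempty=false skips the empty values produced by repeated spaces. The names cols and colcount are scratch field names chosen here for illustration.

```
source=FirewallLogs
| eval cols=_raw
| makemv delim=" " cols
| eval colcount=mvcount(cols)
| stats count by colcount
```

The largest colcount value in the result would be the number of fieldN groups the rex needs.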

Regards

garima_chauhan
Path Finder

Hi, I tried the following search:
source=FirewallLogs | rex "^(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)*" | table field1 field2

but nothing gets displayed.

gfuente
Motivator

Your sample lines have more than one space between some fields. That's different from what you explained in your original question. Try this:

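| rex "^(?<field1>[^\s]+)?(\s)*(?<field2>[^\s]+)?(\s)*(?<field3>[^\s]+)?(\s)*(?<field4>[^\s]+)?(\s)*(?<field5>[^\s]+)?(\s)*(?<field6>[^\s]+)?(\s)*(?<field7>[^\s]+)?(\s)*(?<field8>[^\s]+)?(\s)*(?<field9>[^\s]+)?(\s)*(?<field10>[^\s]+)?(\s)*(?<field11>[^\s]+)?(\s)*(?<field12>[^\s]+)?(\s)*(?<field13>[^\s]+)?(\s)*(?<field14>[^\s]+)?(\s)*(?<field15>[^\s]+)?(\s)*"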

This works with the sample data you had provided
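
A shorter variant of the same idea may be easier to copy without losing characters. It is a sketch that assumes fields are separated by one or more whitespace characters and never contain spaces themselves. Each field after the first sits in an optional non-capturing group, so rows with fewer columns still match and simply leave the later fields empty. The names field1 to field5 are placeholders, and only five are shown; the chain would need to be extended to the maximum column count.

```
source=FirewallLogs
| rex "^(?<field1>\S+)(?:\s+(?<field2>\S+))?(?:\s+(?<field3>\S+))?(?:\s+(?<field4>\S+))?(?:\s+(?<field5>\S+))?"
| table field1 field2 field3 field4 field5
```

Note that this assigns values purely by position. If a blank column is dropped from a row, as suspected for these logs, the values after it shift one field to the left, and no positional regex can detect that on its own.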

garima_chauhan
Path Finder

Hi gfuente,

My log file looks like:

7 123456 1.1.1.1 sfgdfghdghgdh 25 6 2.2.2.2 5255225 3.3.3.3 80 1 1 0 sdgzdfsg
7 456789 1.1.1.1 fsdfgsfgsfgfv 52 6 3.3.3.3 4654646 5.5.5.5 4564 2 2 2 pathoffile ssdfgsfg
7 123456 1.1.1.1 sfgdfghdghgdh 25 6 2.2.2.2 5255225 3.3.3.3 80 1 1 0 pathoffilevzfsgfgjsdlgjlsggflkgj sdgzdfsg

I am guessing that the column number discrepancy is due to the fact that if one column value is blank, it is not left blank and is instead populated with the next column value.

In any case, I do not know how to tackle this. Please help.
