Field Extraction on Text File

garima_chauhan · ‎02-05-2014

Hi,

I have host firewall logs coming in as a text file. The data in the file is separated by spaces and is inconsistent: some rows have 8 columns, some have fewer, and some have more. I want to perform field extraction on this data. How can this be achieved? I am familiar with CSV field extraction, but there the number of columns is consistent, which is not the case with this text file. I am using Splunk v5.0.5.

Please help. It's quite urgent. Any help would be really appreciated.

gfuente · ‎02-06-2014

There are missing characters, please see this update:

... | rex "^(?<field1>[^\s]+)?(\s)*(?<field2>[^\s]+)?(\s)*(?<field3>[^\s]+)?(\s)*(?<field4>[^\s]+)?(\s)*(?<field5>[^\s]+)?(\s)*(?<field6>[^\s]+)?(\s)*(?<field7>[^\s]+)?(\s)*(?<field8>[^\s]+)?(\s)*(?<field9>[^\s]+)?(\s)*(?<field10>[^\s]+)?(\s)*(?<field11>[^\s]+)?(\s)*(?<field12>[^\s]+)?(\s)*(?<field13>[^\s]+)?(\s)*(?<field14>[^\s]+)?(\s)*(?<field15>[^\s]+)?(\s)*" | ...
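To sanity-check the pattern outside Splunk, here is a minimal Python sketch. It assumes the intended pattern is `[^\s]+` for each field with `(\s)*` between fields, and it is not a Splunk feature. It rebuilds the expression for a chosen maximum number of fields and applies it to the first two sample lines posted in this thread. `MAX_FIELDS` and `extract` are hypothetical names used only for illustration. Note that Python writes named groups as `(?P<name>...)`, whereas `rex` uses `(?<name>...)`.

```python
import re

MAX_FIELDS = 15  # assumption: upper bound on columns in any log line

# Build "^(?P<field1>[^\s]+)?(\s)*(?P<field2>[^\s]+)?(\s)*..." programmatically.
# Each field is optional, and (\s)* tolerates zero, one, or many separators.
pattern = re.compile(
    "^" + "".join(rf"(?P<field{i}>[^\s]+)?(\s)*" for i in range(1, MAX_FIELDS + 1))
)

def extract(line):
    """Return only the fields that actually matched on this line."""
    match = pattern.match(line)  # always matches: every group is optional
    return {k: v for k, v in match.groupdict().items() if v is not None}

samples = [
    "7 123456 1.1.1.1 sfgdfghdghgdh 25 6 2.2.2.2 5255225 3.3.3.3 80 1 1 0 sdgzdfsg",
    "7 456789 1.1.1.1 fsdfgsfgsfgfv 52 6 3.3.3.3 4654646 5.5.5.5 4564 2 2 2 pathoffile ssdfgsfg",
]

for line in samples:
    fields = extract(line)
    print(len(fields), fields.get("field14"), fields.get("field15"))
```

Because every group is optional, shorter lines simply leave the trailing fields unset, which mirrors how `rex` creates no field for a group that did not match.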

garima_chauhan · ‎02-06-2014

Hi,

It still didn't work. :(
I copied this exact regex.

gfuente · ‎02-05-2014

Hello

You could use a rex like this one:

^(?<field1>[^\s]+)?(\s)?(?<field2>[^\s]+)?(\s)?(?<field3>[^\s]+)?(\s)?(?<field4>[^\s]+)?(\s)?(?<field5>[^\s]+)?(\s)?(?<field6>[^\s]+)?(\s)?(?<field7>[^\s]+)?(\s)?(?<field8>[^\s]+)?(\s)?(?<field9>[^\s]+)?(\s)?(?<field10>[^\s]+)?(\s)?

Add as many fields as the maximum number of fields you could have in the log file.

Regards

garima_chauhan · ‎02-05-2014

Hi, I tried the following search:
source=FirewallLogs | rex "^(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)(?[^s]+)?(s)*" | table field1 field2

but nothing gets displayed.

gfuente · ‎02-05-2014

Your sample lines have more than one space between some fields. That's different from what you explained in your original question. Try this:

| rex "^(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)(?[^\s]+)?(\s)*"

This works with the sample data you provided.

garima_chauhan · ‎02-05-2014

Hi gfuente,

My log file looks like:

7 123456 1.1.1.1 sfgdfghdghgdh 25 6 2.2.2.2 5255225 3.3.3.3 80 1 1 0 sdgzdfsg
7 456789 1.1.1.1 fsdfgsfgsfgfv 52 6 3.3.3.3 4654646 5.5.5.5 4564 2 2 2 pathoffile ssdfgsfg
7 123456 1.1.1.1 sfgdfghdghgdh 25 6 2.2.2.2 5255225 3.3.3.3 80 1 1 0 pathoffilevzfsgfgjsdlgjlsggflkgj sdgzdfsg

I am guessing that the column number discrepancy is due to the fact that if one column value is blank, it is not left blank and is instead populated with the next column value.

In any case, I do not know how to tackle this. Please help.
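The guess about shifted columns can be checked with a short Python sketch that counts the whitespace-separated values in each sample line and shows what lands in position 14. This is only an illustration on the three lines above. Reading position 14 as an optional file path is an assumption based on these samples, and `column_report` is a hypothetical helper name.

```python
samples = [
    "7 123456 1.1.1.1 sfgdfghdghgdh 25 6 2.2.2.2 5255225 3.3.3.3 80 1 1 0 sdgzdfsg",
    "7 456789 1.1.1.1 fsdfgsfgsfgfv 52 6 3.3.3.3 4654646 5.5.5.5 4564 2 2 2 pathoffile ssdfgsfg",
    "7 123456 1.1.1.1 sfgdfghdghgdh 25 6 2.2.2.2 5255225 3.3.3.3 80 1 1 0 pathoffilevzfsgfgjsdlgjlsggflkgj sdgzdfsg",
]

def column_report(lines):
    """Return (number of values, value in position 14) for each line."""
    report = []
    for line in lines:
        tokens = line.split()  # split() collapses runs of whitespace
        report.append((len(tokens), tokens[13]))
    return report

for count, fourteenth in column_report(samples):
    print(count, fourteenth)
# 14 sdgzdfsg
# 15 pathoffile
# 15 pathoffilevzfsgfgjsdlgjlsggflkgj
```

On these samples the first line has 14 values and the other two have 15, so a purely positional extraction puts a different kind of value into the 14th field depending on whether the path-like column is present.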
