Getting Data In

Issue in data parsing when extracting a field

Vigneshprasanna
Explorer

Hi Team,

I'm stuck parsing some data; please advise how to handle it.

In an application's log, one particular field contains a large amount of data, so its value wraps onto new lines. Splunk then treats every one of those lines as a new event.

Sample data :

Please have a look at the attached image for sample data; it contains tags, so I'm not able to post it here.

The data above is comma-separated, and its field names are:

AUD_SEQ_NO PACKAGE_NAME SERVICE_NAME AUDIT_TIME EVENT_NAME SHORT_TEXT LONG_TEXT AUDIT_DATA CONSUMER_ID MESSAGE_ID CONTEXT_ID USER_NAME USER_ID USER_CONTEXT COMPANY_ID VERSION SESSION_ID CHANNEL_ID BUSINESSUNIT_ID SERVER_NAME

I have added a snap for better understanding.

0 Karma
1 Solution

gbowden_pheaa
Path Finder

Generally speaking, you need to define the event boundaries (where an event begins and ends) and the date/time configuration in your props.conf file. Since this is not a single-line event, you may need to use the more expensive (in processing) line-merging settings such as BREAK_ONLY_BEFORE or MUST_BREAK_AFTER, along with the regex required to identify the event boundaries. You also need to define where the timestamp sits in your data (TIME_PREFIX and/or MAX_TIMESTAMP_LOOKAHEAD), and then its format (TIME_FORMAT). You may also need to raise the TRUNCATE or MAX_EVENTS defaults if the event can be large (the defaults are 10000 bytes for TRUNCATE and 256 lines for MAX_EVENTS).

Going through the props.conf specification in the Admin manual should help you along this process.

http://docs.splunk.com/Documentation/Splunk/7.1.1/Admin/Propsconf

This is just a starting point; not much more can be done without a data sample.
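
As a rough illustration of the settings named above, here is a minimal props.conf sketch. Everything specific in it is an assumption, since no sample was posted: the sourcetype name app_audit_csv is hypothetical, each event is assumed to start with a numeric AUD_SEQ_NO followed by a comma, and AUDIT_TIME is assumed to be the fourth field in a form like 2018-06-20 10:15:30. Adjust the regexes and the format string to the real data, and place the stanza on the instance that parses the data (indexer or heavy forwarder).

```ini
# props.conf (sketch only; stanza name, regexes, and time format are assumptions)
[app_audit_csv]
# Merge physical lines into multi-line events
SHOULD_LINEMERGE = true
# Start a new event only at a line that begins with digits and a comma
# (assumed shape of AUD_SEQ_NO)
BREAK_ONLY_BEFORE = ^\d+,
# Skip the first three comma-separated fields to reach AUDIT_TIME (assumed position)
TIME_PREFIX = ^(?:[^,]*,){3}
# Assumed timestamp layout: 2018-06-20 10:15:30
TIME_FORMAT = %Y-%m-%d %H:%M:%S
# Only look this many characters past TIME_PREFIX for the timestamp
MAX_TIMESTAMP_LOOKAHEAD = 25
# Raise the 10000-byte and 256-line defaults for large AUDIT_DATA values
TRUNCATE = 100000
MAX_EVENTS = 1000
```

If a continuation line inside AUDIT_DATA can itself begin with digits and a comma, this pattern would split the event there, so a stricter regex would be needed.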

0 Karma

richgalloway
SplunkTrust

You need to adjust the props.conf settings for that sourcetype so multiple lines are treated as a single event. Post the relevant props.conf stanza and we can help you make the necessary changes.

---
If this reply helps you, Karma would be appreciated.
0 Karma

Vigneshprasanna
Explorer

Hi Richgalloway,

Thanks for the effort 🙂 Can you please help me understand the changes I need to make in the props.conf file?

Regards,
Vigneshprasanna R

0 Karma

niketn
Legend

@Vigneshprasanna, as stated by @richgalloway, the configuration depends strictly on your data pattern (for example, which pattern identifies the timestamp, what the timestamp format is, and which pattern marks event breaks).

So, in order to assist you better, you would need to provide us with sample data (please mask out/anonymize any sensitive information before posting). You should also provide your existing props.conf.

The following resources should help you figure out the correct props.conf settings:
https://docs.splunk.com/Documentation/Splunk/latest/Data/Configuretimestamprecognition
http://docs.splunk.com/Documentation/Splunk/latest/Data/Configureeventlinebreaking

You should also add a sample file in Data Preview mode to apply and test the props.conf settings for timestamp recognition and event breaking before indexing the data. Once data is indexed incorrectly, it cannot be (easily) corrected.

https://docs.splunk.com/Documentation/Splunk/latest/Data/Setsourcetype

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

kvswathi
Path Finder

Hi,

You can define the line breaker when indexing data manually from a file. There you can use a regex to identify where a new event starts.
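
A minimal sketch of that idea in props.conf, using LINE_BREAKER with line merging turned off. The sourcetype name app_audit_csv is hypothetical, and the regex assumes every event starts with a numeric AUD_SEQ_NO followed by a comma; neither comes from the original post, so treat both as placeholders to adapt.

```ini
# props.conf (sketch only; stanza name and regex are assumptions)
[app_audit_csv]
# Break events with LINE_BREAKER alone, without the line-merging pass
SHOULD_LINEMERGE = false
# Break at newlines only when the next line starts with digits and a comma.
# The text matched by the first capture group (the newlines) is discarded;
# the lookahead leaves the sequence number at the start of the next event.
LINE_BREAKER = ([\r\n]+)(?=\d+,)
# Allow events larger than the 10000-byte default
TRUNCATE = 100000
```

Newlines inside the long field do not match the lookahead, so they stay within the same event.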

0 Karma