Getting Data In

How do I edit my props.conf for proper line breaking of my sample CSV log file?

dcascione
Explorer

I have a simple .csv log file that I'm trying to break with:

[software_summary]
LINE_BREAKER = ([\r\n]+)
SHOULD_LINEMERGE = false

Here is a sample of the log:

Back to Index,
HOST INFORMATION,
Software build-2718055,10
Software build-3116895,15
Software build-2583090,35
Software 5.5.0 build-1746974,22

The two comma-delimited fields I'm interested in are Software Build and Count. I'd like to see each line break out into its own event. Thanks!

0 Karma

ryanoconnor
Builder

The biggest issue I see with this file is that it isn't a well-formed CSV. See the screenshot below, taken after I saved the sample text you sent as software_summary.csv and opened it in Numbers on my Mac.

[Screenshot: the sample saved as software_summary.csv and opened in Numbers]

If it were a well-formed CSV, your props.conf could also include INDEXED_EXTRACTIONS = CSV, and Splunk would handle this file much more easily.

0 Karma

ryanoconnor
Builder

If possible, can you clean that index and re-index the file after you've made that change? Can you also post the updated version of the csv file once they've changed it so we can confirm it looks correct?

Thank you,
Ryan

0 Karma

dcascione
Explorer

Will the change to props.conf re-break the log data that is already in the index, or will it only affect new logs ingested after the change?

0 Karma

muebel
SplunkTrust

Event breaking is done at index time. More info here: http://docs.splunk.com/Documentation/Splunk/6.4.1/Admin/Configurationparametersandthedatapipeline

The already indexed events won't change.

0 Karma

dcascione
Explorer

Yes, the source file is a .csv. I just added the CSV reference to the stanza, so hopefully this will work! Thanks for the tip!!
[software_summary]
INDEXED_EXTRACTIONS = CSV
LINE_BREAKER = ([\r\n]+)
SHOULD_LINEMERGE = false

0 Karma

ryanoconnor
Builder

Can you also modify the first couple of lines? CSV indexed extractions work best if the first line of the file is a header row.

 Back to Index,
 HOST INFORMATION,

Should be

 software_build, count

So your CSV would look more like

 software_build, count
 Software build-2718055,10
 Software build-3116895,15
 Software build-2583090,35
 Software 5.5.0 build-1746974,22
0 Karma

dcascione
Explorer

I just reached out to the team who generates the logs to see if they can remove the very first "HOST INFORMATION" line. Thanks!

0 Karma

muebel
SplunkTrust

Hi dcascione, I think a good angle on this would be to check out the structured data options in props.conf described here: http://docs.splunk.com/Documentation/Splunk/6.4.1/admin/Propsconf#Structured_Data_Header_Extraction_...

Essentially, you could use the FIELD_NAMES setting to define the software_build and count fields, and the PREAMBLE_REGEX setting to skip the first couple of lines.
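As a rough sketch of that idea (untested against this file; the PREAMBLE_REGEX pattern is an assumption based on the two preamble lines in the posted sample), the stanza might look something like:

```ini
# props.conf
[software_summary]
# Parse the file as structured CSV data
INDEXED_EXTRACTIONS = CSV
# Skip the leading non-data lines (pattern assumed from the posted sample)
PREAMBLE_REGEX = ^(Back to Index|HOST INFORMATION),
# The file has no header row, so name the two columns explicitly
FIELD_NAMES = software_build,count
```

With something like this in place, a line such as "Software build-2718055,10" would be expected to come out with software_build and count as separate fields, but verify against a test index first.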

If you want to stick with the initial idea of breaking out the events, your config should be effective for treating each line as an event (SHOULD_LINEMERGE = false).

One issue here could be that this config is being set on a universal forwarder, which wouldn't do line breaking. The props.conf definitions would need to be placed on the upstream heavy forwarder (HF) or indexer.
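A minimal sketch of that split, assuming a hypothetical monitored path (not taken from the original post) and the sourcetype name from the question: the forwarder only assigns the sourcetype, while the line-breaking settings live on the instance that parses the data.

```ini
# inputs.conf on the universal forwarder
# (the monitored path below is a hypothetical placeholder)
[monitor:///opt/logs/software_summary.csv]
sourcetype = software_summary

# props.conf on the heavy forwarder or indexer that first parses the data
[software_summary]
LINE_BREAKER = ([\r\n]+)
SHOULD_LINEMERGE = false
```

Index-time settings like these typically only take effect after the parsing instance is restarted, and only for data indexed afterwards.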

Please let me know if this answers your question!

0 Karma

dcascione
Explorer

I was hoping to just line break the file in props.conf and then build the field extractions using the UI. Thanks
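For reference, a delimiter-based search-time extraction can also be sketched directly in config. This is only an illustration of that approach, not necessarily what the UI would generate, and the transform name software_summary_csv is a hypothetical placeholder:

```ini
# props.conf: break each line into its own event at index time,
# then apply a search-time extraction defined in transforms.conf
[software_summary]
LINE_BREAKER = ([\r\n]+)
SHOULD_LINEMERGE = false
REPORT-software_summary_fields = software_summary_csv

# transforms.conf: split each event on commas and name the two values
[software_summary_csv]
DELIMS = ","
FIELDS = software_build, count
```

Note that with this approach the "Back to Index," and "HOST INFORMATION," lines would still be indexed as their own events.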

0 Karma

dcascione
Explorer

According to the documentation, the stanza I added to props.conf, which includes LINE_BREAKER = ([\r\n]+), should break each line out into its own event. Not sure why this is not working.

  • Defaults to ([\r\n]+), meaning data is broken into an event for each line, delimited by any number of carriage return or newline characters.
0 Karma

muebel
SplunkTrust

I edited my original answer to address the event breaking difficulty.

0 Karma