Why is line breaking inconsistent - File Monitoring - Roxio SecureBurn Log file - .txt

danielansell
Path Finder

Every time a CD is burned with Roxio SecureBurn, a .txt log file for that CD is created. The format of the .txt log file is:

Date: Thu Mar 7 13:47:00 2019
Computer Name: ComputerName01
User Name: domain.accountname

Project includes 1 folder(s) and 2 file(s)

C:\Users\accountname\Desktop\TransferFolder\file1.txt e69f78a887b(rest of file hash)a35 3764543bytes 2019/3/7 08:13:15
C:\Users\accountname\Desktop\TransferFolder\file2.txt e69f78a887b(rest of file hash)a35 7764543bytes 2019/3/7 08:13:18
END OF FILE

My props.conf for the sourcetype I created includes:
SHOULD_LINEMERGE = false

Line breaking is occurring inconsistently. For some files each line shows up as its own event; for others the entire contents of the .txt file come back as a single event. Any ideas - perhaps a more bulletproof way to force a break? Do Windows .txt files normally have inconsistencies in their carriage returns or line breaks?

FrankVl
Ultra Champion

When you set SHOULD_LINEMERGE = false, you also need to specify a LINE_BREAKER; otherwise Splunk falls back to the default, which breaks on every newline. And while you're at it, it is always good to specify the timestamp settings explicitly as well.

So, try adding this to your props.conf:

LINE_BREAKER = ([\r\n]*)Date:\s+\w+\s+\w+\s+\d+
TIME_PREFIX = Date:\s+
MAX_TIMESTAMP_LOOKAHEAD = 25
TIME_FORMAT = %a %b %e %H:%M:%S %Y
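
To sanity-check the LINE_BREAKER pattern outside Splunk, the rule can be mimicked in a few lines of Python. This is only a rough sketch, not Splunk's implementation: split_events and event_time are hypothetical helpers, the sample text is made up from the format in the question, and %d stands in for %e because that is the day-of-month directive Python documents.

```python
import re
from datetime import datetime

# Two made-up burn logs back to back, with Windows-style CRLF line endings.
SAMPLE = (
    "Date: Thu Mar 7 13:47:00 2019\r\n"
    "Computer Name: ComputerName01\r\n"
    "User Name: domain.accountname\r\n"
    "\r\n"
    "Project includes 1 folder(s) and 2 file(s)\r\n"
    "END OF FILE\r\n"
    "Date: Fri Mar 8 09:05:10 2019\r\n"
    "Computer Name: ComputerName01\r\n"
    "END OF FILE\r\n"
)

# Same pattern as the LINE_BREAKER setting above.
LINE_BREAKER = re.compile(r"([\r\n]*)Date:\s+\w+\s+\w+\s+\d+")


def split_events(text):
    """Break text wherever LINE_BREAKER matches, discarding only capture group 1."""
    events, start = [], 0
    for m in LINE_BREAKER.finditer(text):
        if m.start(1) > start:
            events.append(text[start:m.start(1)])
        start = m.end(1)  # the next event begins right after the captured newlines
    if start < len(text):
        # Trim the trailing newline of the last event for readability.
        events.append(text[start:].rstrip("\r\n"))
    return events


def event_time(event):
    """Parse the timestamp that follows 'Date: ' (the TIME_PREFIX)."""
    stamp = re.match(r"Date:\s+(.+)", event).group(1).strip()
    return datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")


events = split_events(SAMPLE)
for event in events:
    print(event_time(event), "-", len(event.splitlines()), "lines")
```

Each event should start at a "Date:" line and run up to the next one, which is the behavior the setting is meant to produce for files that hold a single burn log.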

mfw113
New Member

Were you able to extract the file names and sizes from these logs? If so, would you be willing to share how?

danielansell
Path Finder

For file details, my extraction looks like this:

(?P<FileName>.+)\s(?P<FileHash>[0-9a-fA-F]{40})\s+(?P<FileSize>\d+)bytes\s+(?P<FileModDate>\d{4}\/\d+\/\d+\s\d+:\d+:\d+)
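
As a quick check of what that pattern captures, here is a minimal Python sketch that runs it against one file line. The hash in the sample is a placeholder (the real value is elided in the post), and the FILE_LINE and PLACEHOLDER_HASH names are only for illustration.

```python
import re

# The extraction above, with the final capture group closed.
FILE_LINE = re.compile(
    r"(?P<FileName>.+)\s(?P<FileHash>[0-9a-fA-F]{40})\s+"
    r"(?P<FileSize>\d+)bytes\s+(?P<FileModDate>\d{4}/\d+/\d+\s\d+:\d+:\d+)"
)

# Placeholder only: 40 hex characters standing in for the elided hash.
PLACEHOLDER_HASH = "0123456789abcdef0123456789abcdef01234567"

line = (
    r"C:\Users\accountname\Desktop\TransferFolder\file1.txt "
    + PLACEHOLDER_HASH
    + " 3764543bytes 2019/3/7 08:13:15"
)

match = FILE_LINE.search(line)
fields = match.groupdict() if match else {}
for name, value in fields.items():
    print(f"{name} = {value}")
```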

danielansell
Path Finder

The ultimate goal is actually to extract data for each file burned to CD. When each .txt file is returned as a single event, I tried to extract the data using the rex command. The first result is stored in my "FileSize" field, but the extraction does not continue on to the remaining lines.

So alternatively, if there is a more appropriate way to populate a FileSize, FileName, etc. field from each line, without breaking the event into single line events, that would be great as well.

FrankVl
Ultra Champion

That sounds like a matter of setting max_match=0 in your rex command. By default, the rex command stops after finding 1 match.
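
The difference between stopping at the first match and collecting every match can be illustrated with Python's re module. This is an analogy only, not how rex is implemented; the event text is copied from the sample in the question and the SIZE pattern is a simplified stand-in for the full extraction.

```python
import re

# A whole .txt log returned as one multi-line event (lines taken from the question).
event = (
    "Project includes 1 folder(s) and 2 file(s)\n"
    "C:\\Users\\accountname\\Desktop\\TransferFolder\\file1.txt "
    "e69f78a887b(rest of file hash)a35 3764543bytes 2019/3/7 08:13:15\n"
    "C:\\Users\\accountname\\Desktop\\TransferFolder\\file2.txt "
    "e69f78a887b(rest of file hash)a35 7764543bytes 2019/3/7 08:13:18\n"
    "END OF FILE\n"
)

SIZE = re.compile(r"(\d+)bytes")

# Stopping after one match: only the first file's size is captured.
first_only = SIZE.search(event).group(1)

# Collecting every match: one value per file line, like a multivalued field.
all_sizes = SIZE.findall(event)

print(first_only)
print(all_sizes)
```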

danielansell
Path Finder

I typically use rex to test my extractions before building my field extractions in my props.conf.
Will a field extraction defined in props.conf return multiple values - should max_match=0 be added to my props.conf?

If so, will I be able to perform stats functions on the multivalued field? For example, if I were to define a FileSize field in my props.conf as an extraction, and I'd like to return a total size for the burn job, would I be able to do the following:
...search... |eval DiscSize=sum(FileSize) by source

FrankVl
Ultra Champion

max_match is an option of the rex command, so it does not go in props.conf. Instead, you can use a REPORT extraction in props.conf and set MV_ADD = true in the corresponding transforms.conf stanza, so that every match is kept as a value of a multivalued field.

Yes, something like that would work, although the correct syntax uses stats, not eval: | stats sum(FileSize) AS DiscSize by source 🙂

It might also be valid to split this into single-line events at index time, as multivalued fields can be a bit difficult to work with sometimes. In that case you will have to find a way of dealing with the header lines of the file (if they are of any interest).
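
To make the intended result concrete, here is a small Python sketch of the per-source total. It is only an illustration of the aggregation, not Splunk code: the source names and the extra sizes are hypothetical, and disc_size_by_source is a made-up helper. The same loop covers both layouts discussed here, one multi-line event per file or one event per line.

```python
from collections import defaultdict

# Hypothetical events: the first source is a single multi-line event with a
# multivalued FileSize; the second source was split into single-line events.
events = [
    {"source": "burn_job_1.txt", "FileSize": ["3764543", "7764543"]},
    {"source": "burn_job_2.txt", "FileSize": ["1000"]},
    {"source": "burn_job_2.txt", "FileSize": ["250"]},
]


def disc_size_by_source(events):
    """Sum every FileSize value per source, whichever way the events were broken."""
    totals = defaultdict(int)
    for event in events:
        for size in event["FileSize"]:
            totals[event["source"]] += int(size)
    return dict(totals)


totals = disc_size_by_source(events)
for source, disc_size in totals.items():
    print(source, disc_size)
```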

danielansell
Path Finder

I added those settings to my props.conf and did not see a change in behavior. I'm still seeing the same thing: one of the .txt files has been split into 10 events where each line is its own event, while 2 other .txt files are still being returned as single multi-line events.

As I understand Splunk, changes in line breaking are applied at search time, correct?

Could there be an issue in how the .txt file is formatted?
I'm generally aware that there are some quirks with how Notepad.exe handles new lines (as evidenced by trying to modify any Splunk app in Notepad - there are never any line breaks). Because of that I use another tool that handles the new lines appropriately - are there any tools (e.g. Notepad++) that could identify what type of new line is present in my .txt file?
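
One way to check which line endings a file contains is to count them in the raw bytes. The following is a minimal Python sketch, assuming the log is ASCII or UTF-8 (a UTF-16 file would need decoding first); count_line_endings and the sample bytes are hypothetical.

```python
def count_line_endings(data: bytes) -> dict:
    """Count Windows (CRLF), Unix (LF) and bare CR line endings in raw bytes."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf   # LF that is not part of a CRLF pair
    cr = data.count(b"\r") - crlf   # CR that is not part of a CRLF pair
    return {"CRLF": crlf, "LF": lf, "CR": cr}


# Made-up sample mixing all three styles.
SAMPLE = b"Date: x\r\nUser Name: y\nEND OF FILE\r"
counts = count_line_endings(SAMPLE)
print(counts)
```

For a real log, read the file in binary mode (open(path, "rb")) and pass the bytes to the function; a file with more than one non-zero count has mixed line endings.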

FrankVl
Ultra Champion

Line breaking happens at index time. So to see the changes take effect, you need to restart Splunk and ingest fresh sample files; events that were already indexed keep their original breaks.

danielansell
Path Finder

That's my problem. I was expecting the props config to be interpreted at search time. The event I have that is broken up correctly was indexed after my props change. I didn't put two and two together with that.

I think I'll re-index the data and see if that serves my needs.

Thanks for the help, including the stats versus eval correction above.

nickhills
Ultra Champion

Forgive my ignorance - does each burn job result in a new file, or are all the job logs written into the same log?

danielansell
Path Finder

Each job results in a new file. As such, I intend to use the source field as a means to provide meaningful data (using either a transaction, or "by" to group data).
