Getting Data In

Can you help me parse the following raw data set using Splunk?

Stevelim
Communicator

I have a raw data set that goes like this:

Logtime: 20181010_15:30:34

ID: V12

ArrivalTime: 15:30:33
No OFFSET DIRECTION LOAD
1 14.3  Counter 100
2 14.5  Reverse100
ExitTime: 15:30:34
Max: 1000
MIN: 900

What is the best way to parse this data using Splunk?

maciep
Champion

Yeah, that's one terrible-looking log file. If you have any control over its format, change it to something a bit more Splunk-friendly. If not, then maybe something like this below.

I didn't test this at all and I'm sure the regex can be better... the example is just to give you an idea of how I'd parse it.

Essentially, I'd grab the whole thing as one event (at parse time) and then extract the fields I need from each event (at search time). Of course, if the format changes from event to event or is inconsistent in general, then I would have to modify the extractions accordingly.

props.conf

[your_sourcetype]
# PARSE-TIME SETTINGS
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)(?=Logtime:)
TIME_PREFIX = Logtime:\s*
TIME_FORMAT = %Y%m%d_%H:%M:%S
TIMESTAMP_LOOKAHEAD = 20

# SEARCH-TIME SETTINGS
EXTRACT-arrival_time = (?i)ArrivalTime:\s*(?<arrival_time>\S+)
EXTRACT-exit_time = (?i)ExitTime:\s*(?<exit_time>\S+)
EXTRACT-id = (?i)ID:\s*(?<id>\S+)
EXTRACT-max = (?i)Max:\s*(?<max>\S+)
EXTRACT-min = (?i)Min:\s*(?<min>\S+)
# (?m) lets ^ match at the start of each line inside the multi-line event
EXTRACT-no1_offset = (?im)^1\s+(?<no1_offset>\S+)\s*(?<no1_direction>counter|reverse)\s*(?<no1_load>\d+)
EXTRACT-no2_offset = (?im)^2\s+(?<no2_offset>\S+)\s*(?<no2_direction>counter|reverse)\s*(?<no2_load>\d+)
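
Once those extractions are in place, a search along these lines should chart the offsets over time. Again untested, just a sketch: your_sourcetype is the placeholder from the stanza above, and I'm assuming the extracted offsets are always numeric.

```
sourcetype=your_sourcetype
| timechart avg(no1_offset) AS no1_offset avg(no2_offset) AS no2_offset
```

Each whole-block event keeps the timestamp taken from its Logtime line, so timechart buckets on that.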

Stevelim
Communicator

This works great! I learnt a lot from the regex pattern, as I was primarily stuck on how to extract the big table chunk. I would love to convert the raw data into something Splunk-friendly, but unfortunately that is out of my control.

asabatini85
Path Finder

Which values are relevant for you?

You can use the ":" as a separator, but you would need to modify the log file.

FrankVl
Ultra Champion

And what exactly do you mean by parsing in this case? Timestamping and linebreaking, or field extractions (or both)?

Stevelim
Communicator

My apologies, I realised I did not think deeply enough about how it will appear in Splunk, since I was originally working on it in Excel. In Excel I am able to just fit it into columns, but I forgot that in Splunk it will be associated with time. My end state is to be able to run a search and create a time series chart of all the rows with No = 1.

| No = 1

and it should return me with something like:
15:30:33 OFFSET = 14.3 <- From first log file
15:30:33 OFFSET = 15.2 <- From second log file of similar format

After which I can then append a | timechart avg(OFFSET) by No to see all the OFFSET.

I hope it makes sense.

Stevelim
Communicator

I figured out that I can extract most of the fields out of the box via Splunk. I would like the key-value pairs to be something along these lines:

15:30:33 No1_OFFSET = 14.3
15:30:33 No2_OFFSET = 14.5
15:30:33 No1_Direction = Counter
15:30:33 No2_Direction = Reverse

Is this possible?

FrankVl
Ultra Champion

It is a damn ugly log file format to get into Splunk directly as separate events. It would be much easier if you had a timestamp on each line and no footer.

Your best bet now might be to get the whole chunk in as one event and then further extract the contents and split it up with search commands.

Use rex to extract the actual data lines into a multivalued field, then split the event into individual events, and then pull out the individual fields from each of those data lines.
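
Roughly like this, as an untested sketch: your_index and your_sourcetype are placeholders, data_line is just a field name I made up, and the regexes assume every data line looks like the two in your sample (number, offset, direction, load).

```
index=your_index sourcetype=your_sourcetype
| rex max_match=0 "(?m)^(?<data_line>\d+\s+\S+\s+[A-Za-z]+\s*\d+)\s*$"
| mvexpand data_line
| rex field=data_line "^(?<No>\d+)\s+(?<OFFSET>\S+)\s+(?<DIRECTION>[A-Za-z]+)\s*(?<LOAD>\d+)"
| timechart avg(OFFSET) by No
```

The first rex collects every table row into the multivalued data_line field (max_match=0 means no limit on matches), mvexpand turns each row into its own result while keeping the event's _time, and the second rex splits the row into fields. If the table layout varies between files, the patterns will need adjusting.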
