Control File: /dir/dir/dir/file_name
Data File: /dir/dir/dir/file_name.dat
Bad File: /dir/dir/dir/file_name.log
Discard File: /dir/dir/dir/file_name.log
(Allow all discards)
Number to load: ALL
Number to skip: 0
Errors allowed: 50000
Bind array: 1 rows, maximum of 256000 bytes
Continuation: none specified
Path used: Conventional
Silent options: FEEDBACK
Table TABLE_NAME, loaded from every logical record.
Insert option in effect for this table: APPEND
TRAILING NULLCOLS option in effect
Column Name Position Len Term Encl Datatype
NAME_ID FIRST * | O(") CHARACTER
Table TABLE:
1 Row successfully loaded.
0 Rows not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Space allocated for bind array: 1542 bytes(1 rows)
Read buffer bytes: 1048576
Total logical records skipped: 0
Total logical records read: 1
Total logical records rejected: 0
Total logical records discarded: 0
Run began on Wed Sep 09 08:50:36 2015
Run ended on Wed Sep 09 08:50:36 2015
Elapsed time was: 00:00:00.22
CPU time was: 00:00:00.05
The log file is above. For each log file that comes in during a 24-hour period, I need to search for it and send an email containing the following summary fields:
Data File:
Table
Number of Rows loaded -> 1 Row successfully loaded.
Number of Rows failed -> 0 Rows not loaded because all WHEN clauses were failed.
START -> Run began on Wed Sep 09 08:57:36 2015
END -> Run ended on Wed Sep 09 08:57:37 2015
Elapsed time was: 00:00:00.46
CPU time was: 00:00:00.10
I wrote the questions at the bottom of this answer, then realized it may not matter much, though there are several ways to do this. The lines are probably separate events, so the first step is to combine them; we can use transaction for that. I'm picking a one-minute maximum span between the first line and the last line of the log file to keep it more efficient - adjust as necessary.
... | transaction startswith="Control File" endswith="CPU time was" maxspan=1m
That should group the events together. Now, let's extract the data you need with rex. To the end of the above...
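If it helps to see what that grouping does conceptually, here's a rough Python sketch outside Splunk (the function name and sample lines are made up for illustration; transaction itself is doing the real work in the search):

```python
def group_loader_logs(lines):
    """Collect the lines between a 'Control File' marker and the next
    'CPU time was' marker into one multi-line record, mimicking what
    the transaction command does to the separate events."""
    records, current = [], []
    for line in lines:
        if line.startswith("Control File"):
            current = [line]                     # start a new record
        elif current:
            current.append(line)
            if line.startswith("CPU time was"):  # end of this record
                records.append("\n".join(current))
                current = []
    return records

sample = [
    "Control File: /dir/dir/dir/file_name",
    "Data File: /dir/dir/dir/file_name.dat",
    "CPU time was: 00:00:00.05",
]
print(len(group_loader_logs(sample)))  # 1
```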
... | rex "Data File: (?<data_file>[^\s]+)" | rex "Table (?<target_table>[^:]+)"
I took your string "Data File: /dir/dir/dir/file_name.dat" and made a field called "data_file" out of everything that isn't a space following the "Data File: " string. Right after that, I used rex to create a field called "target_table" out of everything that comes after the word "Table " up to the colon. Several of the other strings/captures will be much like that, so I'm leaving them as an exercise for you to build, but if you have any problems, add a comment to this and I or someone will try to help with that particular one!
One that's different will be 1 Row successfully loaded.
... | rex "(?<rows_success_string>\d+ Rows? successfully loaded\.)" | rex field=rows_success_string "(?<rows_success_count>\d+)"
This creates a field called "rows_success_string" holding the full "1 Row successfully loaded." text, then immediately runs rex on that new field and pulls the leading digits into a field called "rows_success_count". You didn't mention needing the count, but I thought I'd show the technique. You could easily do this in one rex, but this seemed easier to understand.
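The two-step extraction looks like this in Python, if that's easier to follow (illustrative only; "Rows?" also covers the plural form the log uses for counts other than one, and the escaped dot matches a literal period):

```python
import re

log_line = "1 Row successfully loaded."

# Step 1: capture the whole sentence into one field.
m = re.search(r"(?P<rows_success_string>\d+ Rows? successfully loaded\.)", log_line)
rows_success_string = m.group("rows_success_string")

# Step 2: pull the leading digits back out of that new field.
rows_success_count = int(re.search(r"\d+", rows_success_string).group())

print(rows_success_string)  # 1 Row successfully loaded.
print(rows_success_count)   # 1
```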
Those are the pieces I think you may need to get your data into fields. Next, you need to format it. I think the easiest way to format it might be to create a table out of your fields that you want, then "transpose" them to make it vertically oriented instead of left-right oriented. That would be something like
... | table data_file, target_table, rows_success, Field3, Field4, ... FieldN | transpose
Obviously, fill in the rest of the fields you need to show.
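To preview the effect of transpose: it turns the single left-to-right result row into a vertical column/value listing. A rough Python analogue (field names and values here are just the examples from this thread):

```python
# One extracted record as a mapping of field -> value.
row = {
    "data_file": "/dir/dir/dir/file_name.dat",
    "target_table": "TABLE",
    "rows_success": "1",
}

# "| table ... | transpose" produces roughly this vertical layout.
transposed = list(row.items())
for column, value in transposed:
    print(f"{column}: {value}")
```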
Last, create an alert from the "Save As" menu. Maybe have it run once per hour (or every 5 minutes, or once per day - whatever you want, keeping system load in mind), alert when the number of results is greater than 0, and send an email with the contents in-line, and perhaps attached too.
A couple of quick questions:
Is that log file being ingested already into Splunk?
Does each line come in as a separate event or does the entire log come in as a single event?
About how many of these log files get ingested each day? Less than 100? More than 1000?