Splunk Search

Regex not pulling out all values

gnovak
Builder

This really has me stumped. Not sure why this isn't working. I've got data in a log that looks like this:

    --- [1 - main] 2013-06-28/15:43:10.954 [00:00:00.182]
INFO: output files:
      Log Text File: /var/www/html/2013-06-28-15-43-10-script4.log.txt
      Log JSON File: /var/www/html/2013-06-28-15-43-10-script4.log.json
         Event File: /var/www/html/2013-06-28-15-43-10-script4.events.json
       Summary File: /var/www/html/2013-06-28-15-43-10-script4.summary.json
--- [1 - main] 2013-06-28/15:43:10.964 [00:00:00.191]
INFO: Script 4: https://www.nottellingyouthisinfo.com/page
 Users: 1
--- [1 - main] 2013-06-28/15:43:10.984 [00:00:00.213]
INFO: all threads started.
   size: 1
  group: Dropdown Thread Group
--- [10 - Thread-1] 2013-06-28/15:43:16.009 [00:00:05.236]
INFO: snapshot
[EVENTS]
            Event      OK         Avg       Dev      Min      Max     Err
  1:        Login       1     1979.00       NaN     1979     1979       0 [0.00%]
  2: Basic Search       1      848.00       NaN      848      848       0 [0.00%]
[ERRORS]
   << no errors >>
--- [1 - main] 2013-06-28/15:43:18.121 [00:00:07.349]
INFO: Run complete: 00:00:07.204
[EVENTS]
                     Event      OK         Avg       Dev      Min      Max     Err
  1:             Full Path       1     6973.00       NaN     6973     6973       0 [0.00%]
  2:                 Login       1     1979.00       NaN     1979     1979       0 [0.00%]
  3:          Basic Search       1      848.00       NaN      848      848       0 [0.00%]
  4:           Get Studies       1     2062.00       NaN     2062     2062       0 [0.00%]
  5: Get Study Level Items       1     1220.00       NaN     1220     1220       0 [0.00%]
  6:         Get Countries       1      847.00       NaN      847      847       0 [0.00%]
[ERRORS]
   << no errors >>

I have a search that should pull out all the Events and put them into a field Events. However it's only pulling out 2 of the events. It only shows Login and Full Path as the 2 events extracted. Here is the search:

sourcetype="ec2_web" "[EVENTS]" | rex field=_raw "\d:\s+(?<event>[\w+\s]+)\s+(?<Status>\d)\s+(?<Avg>\d+.\d+)"

All extractions are working great, except for event. I talked to 2 different people and they both thought the regex was fine. So my last resort is here! Any idea what could be causing it to extract only 2 events instead of 6? Also, for the 2 events it does extract, if I click on one of them and Splunk searches for that event, it shows up with 6 spaces after the event name. For example:

sourcetype="ec2_web" "[EVENTS]" | rex field=_raw "\d:\s+(?<event>[\w+\s]+)\s+(?<Status>\d)\s+(?<Avg>\d+.\d+)" | search event="Login      "

Ayn
Legend

Make the match for the event field non-greedy.

\d:\s+(?<event>[\w+\s]+?)\s+(?<Status>\d)\s+(?<Avg>\d+.\d+)

(The difference is the ? I added after the + at the end of the event capture group.)
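To see why the lazy quantifier matters, here is a minimal sketch in Python rather than Splunk. It assumes Python's `re` behaves like rex's PCRE for this pattern, and it uses Python's `(?P<name>...)` spelling for named groups. The `row` string is a hypothetical rebuild of the Login line with the spacing written out explicitly.

```python
import re

# One row from the [EVENTS] table; spacing is rebuilt explicitly so it is easy to count.
row = "  1:" + " " * 8 + "Login" + " " * 7 + "1" + " " * 5 + "1979.00"

# Same patterns as in the thread, except named groups use Python's (?P<name>...) form.
GREEDY = r"\d:\s+(?P<event>[\w+\s]+)\s+(?P<Status>\d)\s+(?P<Avg>\d+.\d+)"
LAZY = r"\d:\s+(?P<event>[\w+\s]+?)\s+(?P<Status>\d)\s+(?P<Avg>\d+.\d+)"

greedy = re.search(GREEDY, row)
lazy = re.search(LAZY, row)

# The greedy class also accepts whitespace, so it keeps all but one of the
# spaces before the Status digit; the lazy version stops at the name itself.
print(repr(greedy.group("event")))
print(repr(lazy.group("event")))
```

With 7 spaces between the name and the OK column, the greedy group keeps 6 of them and leaves only one for the following `\s+`, which would explain the trailing spaces in the `search event="Login      "` example.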

gnovak
Builder

well that did work. I guess my regex was selfish and greedy. thanks for the heads up!

gnovak
Builder

Any idea about getting rid of the 6 spaces after the event name? I'm still messing with that one... could be something I'm just overlooking.

gnovak
Builder

OK, I added max_match=1000 just to test and got more results! This resolved the issue. I'd forgotten about this, actually... Thanks for refreshing my memory!
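As a rough analogy only (Python, not SPL): rex with its default max_match=1 behaves like `re.search`, which stops at the first hit, while a larger max_match behaves more like `re.finditer`, which collects every hit. The `block` below is a shortened, hypothetical copy of the "Run complete" table.

```python
import re

# A few rows from the "Run complete" block, with columns trimmed for readability.
block = """[EVENTS]
  1:    Full Path       1     6973.00
  2:        Login       1     1979.00
  3: Basic Search       1      848.00
"""

# Lazy version of the thread's pattern, using Python's (?P<name>...) group syntax.
PATTERN = re.compile(
    r"\d:\s+(?P<event>[\w+\s]+?)\s+(?P<Status>\d)\s+(?P<Avg>\d+.\d+)"
)

# Comparable to the default max_match=1: only the first match is kept.
first_only = PATTERN.search(block).group("event")

# Comparable to a large max_match: every match is collected into a list.
all_events = [m.group("event") for m in PATTERN.finditer(block)]

print(first_only)
print(all_events)
```

In Splunk the multiple matches end up in a multivalue field instead of a Python list.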

Ayn
Legend

Are you SURE this is just one event? By default Splunk should break on each timestamp, so the "INFO: snapshot" and "INFO: Run complete" blocks should be in two separate events. That's the only way I can think of that you would actually get two matches instead of one out of this anyway, because by default rex matches only one value before quitting, and you haven't told it to match more than that.
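A small Python sketch of that reasoning, under the assumption that the data really is split at each "--- [" header line: taking only the first match per chunk gives exactly the two values reported in the question. The `log` text is an abbreviated, hypothetical copy of the last two blocks, and the splitting loop is a stand-in for Splunk's event breaking, not how Splunk implements it.

```python
import re

log = """--- [10 - Thread-1] 2013-06-28/15:43:16.009 [00:00:05.236]
INFO: snapshot
[EVENTS]
  1:        Login       1     1979.00
  2: Basic Search       1      848.00
--- [1 - main] 2013-06-28/15:43:18.121 [00:00:07.349]
INFO: Run complete: 00:00:07.204
[EVENTS]
  1:    Full Path       1     6973.00
  2:        Login       1     1979.00
"""

PATTERN = re.compile(
    r"\d:\s+(?P<event>[\w+\s]+?)\s+(?P<Status>\d)\s+(?P<Avg>\d+.\d+)"
)

# Start a new chunk at every header line that carries a timestamp.
chunks = []
for line in log.splitlines():
    if line.startswith("--- [") or not chunks:
        chunks.append([])
    chunks[-1].append(line)

# Keep only the first match in each chunk, like rex without max_match.
first_per_chunk = []
for lines in chunks:
    match = PATTERN.search("\n".join(lines))
    if match:
        first_per_chunk.append(match.group("event"))

print(first_per_chunk)
```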

gnovak
Builder

I updated the post to include the full log. This is classified as one event.

gnovak
Builder

It's one big chunk, just like this, in the log.

krugger
Communicator

Is that the text coming in from the log file? Or is that a multiline event?

The regex is correct and if each line is an event it should return all the lines starting with a number.
