This really has me stumped. Not sure why this isn't working. I've got data in a log that looks like this:
--- [1 - main] 2013-06-28/15:43:10.954 [00:00:00.182]
INFO: output files:
Log Text File: /var/www/html/2013-06-28-15-43-10-script4.log.txt
Log JSON File: /var/www/html/2013-06-28-15-43-10-script4.log.json
Event File: /var/www/html/2013-06-28-15-43-10-script4.events.json
Summary File: /var/www/html/2013-06-28-15-43-10-script4.summary.json
--- [1 - main] 2013-06-28/15:43:10.964 [00:00:00.191]
INFO: Script 4: https://www.nottellingyouthisinfo.com/page
Users: 1
--- [1 - main] 2013-06-28/15:43:10.984 [00:00:00.213]
INFO: all threads started.
size: 1
group: Dropdown Thread Group
--- [10 - Thread-1] 2013-06-28/15:43:16.009 [00:00:05.236]
INFO: snapshot
[EVENTS]
Event OK Avg Dev Min Max Err
1: Login 1 1979.00 NaN 1979 1979 0 [0.00%]
2: Basic Search 1 848.00 NaN 848 848 0 [0.00%]
[ERRORS]
<< no errors >>
--- [1 - main] 2013-06-28/15:43:18.121 [00:00:07.349]
INFO: Run complete: 00:00:07.204
[EVENTS]
Event OK Avg Dev Min Max Err
1: Full Path 1 6973.00 NaN 6973 6973 0 [0.00%]
2: Login 1 1979.00 NaN 1979 1979 0 [0.00%]
3: Basic Search 1 848.00 NaN 848 848 0 [0.00%]
4: Get Studies 1 2062.00 NaN 2062 2062 0 [0.00%]
5: Get Study Level Items 1 1220.00 NaN 1220 1220 0 [0.00%]
6: Get Countries 1 847.00 NaN 847 847 0 [0.00%]
[ERRORS]
<< no errors >>
I have a search that should pull out all the Events and put them into a field Events. However it's only pulling out 2 of the events. It only shows Login and Full Path as the 2 events extracted. Here is the search:
sourcetype="ec2_web" "[EVENTS]" | rex field=_raw "\d:\s+(?<event>[\w+\s]+)\s+(?<Status>\d)\s+(?<Avg>\d+.\d+)"
All extractions are working great, except for event. I talked to 2 different people and they both thought the regex was fine, so this is my last resort! Any idea what could be causing it to extract only 2 events instead of 6? Also, of the 2 events it does extract, if I click on one of them and Splunk searches for that event, the value shows up with 6 spaces after the event name. For example:
sourcetype="ec2_web" "[EVENTS]" | rex field=_raw "\d:\s+(?<event>[\w+\s]+)\s+(?<Status>\d)\s+(?<Avg>\d+.\d+)" | search event="Login "
Make the match for the event field non-greedy.
\d:\s+(?<event>[\w+\s]+?)\s+(?<Status>\d)\s+(?<Avg>\d+.\d+)
(The only difference is the ? I added after the + at the end of the event capture group.)
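To see why the lazy quantifier matters, here is a rough sketch in Python rather than SPL. It assumes Python's re module behaves like Splunk's regex engine for this pattern, apart from named groups being spelled (?P<name>...). The column padding in the sample line is hypothetical, since the post does not show the exact widths in the real log. Because the character class [\w+\s] includes whitespace, the greedy version swallows the padding after the name and gives back only the single space that the following \s+ needs.

```python
import re

# Python spells named groups (?P<name>...); the rest of each pattern is unchanged.
GREEDY = re.compile(r"\d:\s+(?P<event>[\w+\s]+)\s+(?P<Status>\d)\s+(?P<Avg>\d+.\d+)")
LAZY = re.compile(r"\d:\s+(?P<event>[\w+\s]+?)\s+(?P<Status>\d)\s+(?P<Avg>\d+.\d+)")

# Hypothetical padding: 7 spaces after the name, 4 after the OK count.
line = "2: Login" + " " * 7 + "1" + " " * 4 + "1979.00"

greedy_event = GREEDY.search(line).group("event")
lazy_event = LAZY.search(line).group("event")

# The greedy capture keeps all but one of the padding spaces.
print(repr(greedy_event))
# The lazy capture stops at the end of the name.
print(repr(lazy_event))
```

With this padding, the greedy capture ends in trailing spaces while the lazy one is just the event name, which would explain the stray spaces in the search term.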
Well, that did work. I guess my regex was selfish and greedy. Thanks for the heads up!
Any idea about getting rid of the 6 spaces after the event name? I'm still messing with that one... could be something I'm just overlooking.
OK, I added max_match=1000 just to test and got more results! This resolved the issue. I forgot about this, actually... Thanks for refreshing my memory!
Are you SURE this is just one event? By default Splunk should break on each timestamp, so the "INFO: snapshot" and "INFO: Run complete" blocks should be in two separate events. That's the only way I can think of that you would actually get two matches instead of one out of this anyway, because by default rex matches only one value before quitting and you haven't told it to match more than that.
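As a loose analogy in Python (an assumption for illustration, not how Splunk is implemented): rex with its default max_match=1 behaves like re.search, which stops at the first hit, while a larger max_match (or max_match=0 for unlimited) behaves more like re.finditer. The sketch below runs the non-greedy pattern over the "Run complete" block as it appears in the post, with named groups rewritten as (?P<name>...) for Python.

```python
import re

PATTERN = re.compile(r"\d:\s+(?P<event>[\w+\s]+?)\s+(?P<Status>\d)\s+(?P<Avg>\d+.\d+)")

# The final [EVENTS] block, copied from the log in the question.
raw = """[EVENTS]
Event OK Avg Dev Min Max Err
1: Full Path 1 6973.00 NaN 6973 6973 0 [0.00%]
2: Login 1 1979.00 NaN 1979 1979 0 [0.00%]
3: Basic Search 1 848.00 NaN 848 848 0 [0.00%]
4: Get Studies 1 2062.00 NaN 2062 2062 0 [0.00%]
5: Get Study Level Items 1 1220.00 NaN 1220 1220 0 [0.00%]
6: Get Countries 1 847.00 NaN 847 847 0 [0.00%]
[ERRORS]
<< no errors >>"""

# Like rex with the default max_match=1: only the first match is returned.
first_only = PATTERN.search(raw).group("event")

# Like rex with a higher max_match: every match becomes a value.
all_events = [m.group("event") for m in PATTERN.finditer(raw)]

print(first_only)
print(all_events)
```

In Splunk the multi-match case gives you a multivalued event field, so you may still need something like mvexpand afterwards if you want one row per event name.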
I updated the post to include the full log. This is classified as one event.
It's one big chunk, just like this, in the log.
Is that the text coming in from the log file? Or is that a multiline event?
The regex is correct and if each line is an event it should return all the lines starting with a number.