Hello,
I am building a table and supplying values from search. One of the values exists multiple times within each event. I want rex to stop after the first value returned. I thought that might mean a non-greedy match, but I can't seem to nail down the proper syntax.
I'm grateful for any help.
rex field=statement "(?<ALERTTYPE>[^\s]+)"
This rex returns two of the same values into my table for each line. (ALERT, ALERT). I want a single line, therefore I require a single result to be extracted.
UPDATE
I don't wish to waste anybody's time further on this - I am convinced the issue is with the 'statement' field. A simple query (no rex, etc) consistently produces two values on two lines when 'statement' is displayed in a table. Splunk returns that there are only two results, but each result has two lines (4 total). Other fields for example 'RecordNumber' produce a single line.
I have no clue why this is happening but it has nothing to do with rex.
It turns out the problem was a multivalue field, as others suggested. I modified my search string to eliminate the duplicates:
...| nomv statement | rex field=statement "(?<ALERTTYPE>[^\s]+)" ...
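For anyone hitting the same symptom, the multivalue behaviour can be reproduced without the original data. The following is a minimal run-anywhere sketch, not the poster's actual search: mvappend is used here only to fake a 'statement' field that holds the same value twice, and mvindex(statement, 0) is shown as one possible alternative to nomv that keeps just the first value before rex runs.

```spl
| makeresults
| eval statement=mvappend("ALTER DATABASE [DBA] MODIFY FILE", "ALTER DATABASE [DBA] MODIFY FILE")
| eval statement=mvindex(statement, 0)
| rex field=statement "^(?<ALERTTYPE>\S+)"
| table statement ALERTTYPE
```

Removing the mvindex line should bring the doubled rows back in the table, which is a quick way to confirm that the duplication comes from the field and not from the regex.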
If you want to get the first word of a string into a variable, this should do:
ALTER DATABASE [DBA] MODIFY FILE ( NAME = N'DBA', FILEGROWTH = 1048576KB )
your search | rex "^(?<AlertType>\S+)"
Gives
AlertType=ALTER
The ^ in the regex anchors the match to the start of the line, and \S+ then captures everything up to the first whitespace character.
I had to change that a bit, as the value is extracted from the 'statement' field;
rex field=statement "^(?<field1>\S+)"
But the result is still two lines,
Is your data multi-line?
I've been reading/learning as much as possible about rex before posting here (just to try to help myself and not waste anybody's time) but admittedly I was not fully able to figure out if my data is multi-lined or not. I would say, no. The sample data is near the bottom of this page.
I am really leaning towards the issue being caused by the fact there are two events for every instance, with the same date/timestamp. If I eliminate rex from the search entirely, and just display a table using the value of the 'statement' field, I STILL get two values in that field.
Try this without any rex:
.... | search "ALTER" | head 1
to see what one line looks like.
Not quite sure where to put that in the string? At the point where I look for alter, I look for 7 other alert types;
... where like(statement,"ALTER%") OR like(statement,"CREATE%")
Running that string where I thought it should go produces a single result, where I should see 93 for this date range.
It looks like your data is not being split into events correctly, or is arriving in big chunks. If you use head 1 and get 93 lines, the events are packed together. In the standard view, if you just search * over the last 15 min, every event should start with >.
head 1 gives me one result. The number of actual events for the date range is 93.
If I strip away all of the rex info and formatting completely, I still get the same duplication;
index=main ComputerName="COMPUTER.domain" LogName="Application" session_server_principal_name:PAP\\* SourceName="MSSQL$OTPMSSQL" | where NOT isnull(statement) | where like(statement,"ALTER%") | search statement="*" | table statement
Here is the result; 2 events are creating 4 lines in the table (2 events, with 2 lines each).
If all lines end with ) and contain only one ), you can split the lines like this:
| rex max_match=0 field=_raw "(?<lineData>[^)]+)" | mvexpand lineData | eval _raw=lineData.")"
But the underlying problem is how Splunk reads your data. You may be better off splitting the lines in props.conf by adding BREAK_ONLY_BEFORE.
Or, as others suggest, keep only the first hit in rex with {1}.
If you are only extracting the first word, try this:
... | rex field=statement "(?<ALERTTYPE>\w+)" max_match=1
Here's a runanywhere sample
| makeresults | eval statement="ALTER ALTER DATABASE [DBA] MODIFY FILE ( NAME = N'DBA', FILEGROWTH = 1048576KB )" | rex field=statement "(?<ALERTTYPE>\w+)" max_match=1 | table statement ALERTTYPE
Ok, these did not produce any value for the AlertType field.
How about using a quantifier? This will restrict it to the first match
rex field=statement "(?<ALERTTYPE>[^\s]{1})"
Your use of the "+" (plus sign) tells the regex engine to match one or more of the preceding character class.
This still returns two values, but only the letter 'A', not the entire word 'Alert'.
I'm starting to wonder if the issue is unrelated to the rex statement (ie; maybe the word 'ALERT' only appears once for each event and the return value is being duplicated for another reason.)
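On why the {1} version returned only the letter 'A': a quantifier such as {1} applies to the token in front of it, so [^\s]{1} means "exactly one non-whitespace character", not "the first match". A small run-anywhere sketch of the difference, using a made-up one-value statement and the hypothetical field names one_char and first_word:

```spl
| makeresults
| eval statement="ALTER DATABASE [DBA]"
| rex field=statement "(?<one_char>[^\s]{1})"
| rex field=statement "^(?<first_word>\S+)"
| table statement one_char first_word
```

Here one_char holds a single character while first_word holds the whole leading word, so the quantifier cannot be what collapses two values into one.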
Can you provide some sample data and the data you want extracted?
Be sure to use the code sample format when you post
I posted a data sample above (yesterday) in response to skoelpin. I'll try to post it in the question body.
rex field=statement "(^\w+)"
I think I needed to change this slightly, as Splunk returned "Error in 'rex' command: The regex '(^\w+)' does not extract anything. It should specify at least one named group. Format: (?<name>...)."
So I changed to;
rex field=statement "(?<ALERTTYPE>[^\w+])"
And that produced zero results for statement
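A likely reason for the empty result: inside square brackets a leading ^ negates the character class, so [^\w+] matches one character that is neither a word character nor a literal plus sign, instead of a word at the start of the string. A sketch of what was probably intended, with the anchor moved outside the brackets and the sample statement from earlier in the thread (assuming the field holds a single value):

```spl
| makeresults
| eval statement="ALTER DATABASE [DBA] MODIFY FILE ( NAME = N'DBA', FILEGROWTH = 1048576KB )"
| rex field=statement "^(?<ALERTTYPE>\w+)"
| table statement ALERTTYPE
```

On this sample the extracted ALERTTYPE should be ALTER.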