I need to extract two fields from the string below:
"source=/app/cups-drink/test/iron13-machine5a-43machine.log"
The first field name is "item" and value is "cups"
The second field name is "system" and value is "43machine"
This really isn't an answer, but more of a comment that applies to all of these great solutions. An approach using the rex command will work great. But if you try to put this into a configuration file as a permanent field extraction (props.conf or transforms.conf) and want to use it in a base search, you will probably not get the result you're looking for. The reason is that when you do a search for something like
sourcetype=mysourcetype myfieldfromsource=123
Splunk will look for the token "123" within the raw text of the event; it will not look in the source field.
If you want to extract a value from source with a regular expression and have it searchable as a field name in a base search, then you will need to make it an indexed field. Indexed fields are not recommended for a variety of very good reasons, not the least of which is that they must be defined in advance and are very inflexible. But if this is what you need to solve your problem, it is available to you.
If regex was that easy, then I would have answered.:)
... | rex field=source "^/[^/]+/(?<animal>[a-zA-Z]+)"
Which means: from the start of the string in the field called source, find a single slash, followed by one or more non-slash characters, followed by a single slash, then take all (but at least one) uppercase or lowercase letters you find, and put them in the field 'animal'.
As you'll find, the field will only contain 'dog' in this scenario, as the dash between 'dog' and 'focus' is not a letter.
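If you want to try that pattern outside Splunk first, here is a minimal sketch in Python. Two assumptions: the sample path is made up for illustration (only the 'dog-focus' part comes from the scenario), and Python's re module writes named groups as (?P<name>...) where rex uses (?<name>...).

```python
import re

# Hypothetical sample path; only 'dog-focus' is taken from the scenario.
source = "/opt/dog-focus/app.log"

# Same pattern as the rex example, with Python's (?P<name>...) group syntax.
pattern = re.compile(r"^/[^/]+/(?P<animal>[a-zA-Z]+)")

match = pattern.search(source)
# The capture stops at the dash because '-' is not in [a-zA-Z].
animal = match.group("animal") if match else None
print(animal)
```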
You can probably benefit from reading up on regular expressions if you want to make more dynamic extractions.
/K
How about:
Search | rex field=source "control(?<NUM>NUM)34/12\.log$"
faster 🙂
... | eval num="num" | ...
I am trying to extract the word "NUM" from source=c:/documents/app/test1/test12/controlNUM34/12.log.
You can do field extractions dynamically in the search with the rex command:
your_base_search | rex field=source "your regex with a capture group here"
To capture "34" and put it in a field called num:
your_base_search | rex field=source "(?<num>\d+)/[^/]+$"
which is to be read as: capture one or more digits (and call them num) that are followed by one slash, which is followed by one or more non-slash characters, followed by the end of the line.
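A quick way to check the digit pattern before putting it in a search is to run it against the sample path in Python. This is only a sketch: it assumes Python's re module, which spells the named group (?P<num>...) instead of (?<num>...), and it uses the source value quoted elsewhere in this thread.

```python
import re

# Source value as quoted in the question.
source = "c:/documents/app/test1/test12/controlNUM34/12.log"

# Digits, then one slash, then a final path segment with no further slashes.
pattern = re.compile(r"(?P<num>\d+)/[^/]+$")

match = pattern.search(source)
# Earlier digits (test1, test12) are rejected because more slashes follow them.
num = match.group("num") if match else None
print(num)
```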
Once you're happy with your regex field extraction, you should probably make it 'permanent' by adding the extraction rule to props.conf as an EXTRACT.
See more here:
http://docs.splunk.com/Documentation/Splunk/5.0.4/Knowledge/Addfieldsatsearchtime
http://docs.splunk.com/Documentation/Splunk/5.0.4/Knowledge/Createandmaintainsearch-timefieldextract...
http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Rex
/K
Given your question here and in other posts, I suggest that you read up on regex in general.
e.g. http://www.regular-expressions.info
http://gskinner.com/RegExr/
In this case (one of) the answer(s) is;
rex field=source "/app/(?<item>[a-z]+)([^/]+/){2}.+-(?<system>[^-]+)\.log$"
Which is: find '/app/', then take any a-z characters and call them item. Then jump over any non-slash characters followed by a slash, twice. Then skip through any characters up to the last dash, and capture the non-dash characters between that dash and .log at the end of the string. Call these non-dash characters system. The literal dash matters: without it, the greedy .+ would consume everything except the final character before .log.
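For anyone who wants to sanity-check this outside Splunk, here is a small sketch in Python. It assumes Python's re module, where named groups are written (?P<name>...) rather than (?<name>...). It runs the pattern with a literal dash in front of the system group, and also without that dash, to show what the greedy .+ does in each case.

```python
import re

# Source value from the question (without the leading "source=").
source = "/app/cups-drink/test/iron13-machine5a-43machine.log"

# Literal dash before the system group: .+ must stop at the last dash.
with_dash = re.search(
    r"/app/(?P<item>[a-z]+)([^/]+/){2}.+-(?P<system>[^-]+)\.log$", source
)

# No dash: greedy .+ gives back only one character to the system group.
without_dash = re.search(
    r"/app/(?P<item>[a-z]+)([^/]+/){2}.+(?P<system>[^-]+)\.log$", source
)

item = with_dash.group("item")
system = with_dash.group("system")
system_without_dash = without_dash.group("system")
print(item, system, system_without_dash)
```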
/K
I'm guessing that you want to extract XXX in the following scenario, where XXX is a string that follows 'control' and 'yy' is one or more digits. Not the literal string 'num', right?
/controlXXXyy/zzz.log
In that case;
rex field=source "/control(?<XXX>[a-zA-Z]+)\d+/[^/]+$"
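To see what that captures, here is a minimal check in Python against the path from the question. It is a sketch under one assumption: Python's re module needs the group written as (?P<XXX>...) instead of (?<XXX>...).

```python
import re

# Source value as quoted in the question.
source = "c:/documents/app/test1/test12/controlNUM34/12.log"

# Letters right after '/control', then digits, a slash, and the file name.
pattern = re.compile(r"/control(?P<XXX>[a-zA-Z]+)\d+/[^/]+$")

match = pattern.search(source)
xxx = match.group("XXX") if match else None
print(xxx)
```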
Hi Thinksplunk - can you give a few more samples? Are you trying to extract:
source=c:/documents/app/test1/test12/control*num*34/12.log
or:
source=c:/documents/app/test1/test12/controlnum*34*/12.log?