I have raw data like this,
09:00:06 08/01/2016 good TSMONW46PRDV [TSMONW46PRDV][AP] Disk Space Disk/File System/[C]/percent full=45.745, Disk/File System/[E]/percent full=34.595
I want to extract fields from this so that I get a result like this:
[C]/percent full=45.745
[E]/percent full=34.595
What is the best-suited option for this: eval or regex? Any help is really appreciated.
Here's the regex for the C percent full. It extracts only the number, so the result will look like this:
C_Full = 45.745
(?P<C_Full>(?<=C\]\/percent\sfull\=)\d{2}\.\d+)
Here's the regex for the E percent full
(?P<E_Full>(?<=E\]\/percent\sfull\=)\d{2}\.\d+)
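As a quick sanity check outside Splunk, the two patterns above can be run against the sample event from the question. This is only a sketch in Python, assuming Python's `re` module behaves like Splunk's regex engine for these particular patterns; the variable names are made up for illustration.

```python
import re

# Sample event copied from the question.
event = (
    "09:00:06 08/01/2016 good TSMONW46PRDV [TSMONW46PRDV][AP] Disk Space "
    "Disk/File System/[C]/percent full=45.745, "
    "Disk/File System/[E]/percent full=34.595"
)

# The two patterns from the answer, unchanged.
c_pattern = re.compile(r"(?P<C_Full>(?<=C\]\/percent\sfull\=)\d{2}\.\d+)")
e_pattern = re.compile(r"(?P<E_Full>(?<=E\]\/percent\sfull\=)\d{2}\.\d+)")

# The lookbehind anchors the match right after "...full=" without
# including that text in the captured value.
c_full = c_pattern.search(event).group("C_Full")
e_full = e_pattern.search(event).group("E_Full")

print("C_Full =", c_full)
print("E_Full =", e_full)
```

Only the numeric part ends up in each named group, which is what the answer describes.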
Like this
... | rex max_match=0 "File System\/(?<drive>[^,]+)" | mvexpand drive | ...
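To see what this search would roughly produce, here is a small Python sketch that mimics it on the sample event. It assumes that `re.finditer` is a fair stand-in for `max_match=0` (return every match) and that looping over the list is a fair stand-in for `mvexpand`; it is not how Splunk runs the search. Note that Python spells the named group `(?P<drive>...)` rather than `(?<drive>...)`.

```python
import re

# Sample event copied from the question.
event = (
    "09:00:06 08/01/2016 good TSMONW46PRDV [TSMONW46PRDV][AP] Disk Space "
    "Disk/File System/[C]/percent full=45.745, "
    "Disk/File System/[E]/percent full=34.595"
)

# Same pattern as in the rex command, with Python's named-group syntax.
drive_pattern = re.compile(r"File System/(?P<drive>[^,]+)")

# Collect every match, not just the first one.
drives = [m.group("drive") for m in drive_pattern.finditer(event)]

# One value per output line, roughly what mvexpand does to a multivalue field.
for drive in drives:
    print(drive)
```

Because the character class `[^,]+` stops at the comma, the same pattern picks up any number of drives in the event.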
Good call pulling out just the value!
One note: Splunk rarely needs regex lookbehind or lookahead, and lookaheads and lookbehinds are more expensive in resource usage. So if you do not need them, it is better to avoid them.
C_Full = 45.745
(?P<C_Full>(?<=C\]\/percent\sfull\=)\d{2}\.\d+)
Assuming the number could be bigger than 100 🙂, the pattern could be:
"C\]/percent\s+full=(?P<C_Full>\d{2,}\.\d+)"
So,
| rex "C\]/percent\s+full=(?P<C_Full>\d{2,}\.\d+)"
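The pattern without a lookbehind can be checked the same way. The sketch below is in Python and compares it with a `\d+` variant, which is my own assumption and not something proposed in this thread; the 100.000 and 5.250 sample values are hypothetical and only there to show where `\d{2,}` does and does not match.

```python
import re

# Pattern from the comment above: literal prefix, then a plain capture group.
two_or_more = re.compile(r"C\]/percent\s+full=(?P<C_Full>\d{2,}\.\d+)")

# Hypothetical variant that also accepts a single leading digit.
any_digits = re.compile(r"C\]/percent\s+full=(?P<C_Full>\d+\.\d+)")


def extract(pattern, text):
    """Return the captured C_Full value, or None when nothing matches."""
    match = pattern.search(text)
    return match.group("C_Full") if match else None


samples = [
    "[C]/percent full=45.745",   # value from the question
    "[C]/percent full=100.000",  # hypothetical full disk
    "[C]/percent full=5.250",    # hypothetical nearly empty disk
]

results = {s: (extract(two_or_more, s), extract(any_digits, s)) for s in samples}
for sample, (strict, loose) in results.items():
    print(sample, "->", strict, "|", loose)
```

If single-digit percentages can occur in your data, the `\d+` form may be the safer choice.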
I'm not sure where you read this, but it is not true. Lookaheads/lookbehinds can be used where needed with little impact on search performance. Obviously there are exceptions, such as indexing a massive amount of data in a short period of time, so it could be an issue in some circumstances, but in this case I doubt it. I actually posted a question about this last year:
https://answers.splunk.com/answers/294477/will-lookaheadslookbehinds-hurt-search-performance.html
So in my limited experience, deploying one lookahead was unnoticeable.
Also, why create a regular expression to account for disk usage greater than 100%? It's not needed.
I just tested this by creating a regular expression with a lookbehind and running a search in verbose mode. I inspected the job and it took 44.368 seconds. I then modified the extraction by removing the lookbehind, and the same search took 44.329 seconds, so the lookbehind was 39 ms slower, which is insignificant.
Thanks skoelpin for the info.
I was talking about general regex cost: if there is no need to use lookahead/lookbehind, it is better not to. Yes, scalability is my concern, for example indexing performance with lookahead/lookbehind on events of 1 MB each.
My point is more: why suggest lookahead/lookbehind when you do not need them?
I'm fine with using lookahead/lookbehind for this specific Splunk answer. That's why I up-voted it before I added my comment. My comment is just a suggestion. If you think it's wrong, that's fine with me, too.
Wow those are big events!!
Mine are 1-2 KB each, so I can see how lookbehinds could potentially be an issue for you.
I am able to get C_drive and its value correctly in a field. But how do I get E_drive as well with the same rex command? I may have a couple of other servers with more drives; how do I dynamically get the drive info with a regex?
You could create one field with many values or many fields with one value each; it's all preference.
One field with many values would look like this, where Drive is your field:
Drive = [C]/percent full=45.745
Drive = [E]/percent full=34.595
The advantage is that you only have one field. The disadvantage is that it can be difficult to isolate one drive when querying, such as when using ... | stats count by
Many fields with one value each would look like this:
C_Drive = 45.745
E_Drive = 34.595
The advantage of this is that it's easy to manipulate the fields in your searches. So if you only wanted to see the drive space on a single host, your search would look like this:
index=foo hostname=anoopambli C_Drive="*" OR E_Drive="*"
So depending on what route you want to go, I can help build your regular expression.
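To make the two shapes concrete, here is a rough Python sketch that builds both from the sample event without hard-coding drive letters. The pattern with separate `drive` and `pct` groups and the `<letter>_Drive` naming are assumptions for illustration, derived from the drives present in the sample; an equivalent Splunk extraction would need to be written and tested separately.

```python
import re

# Sample event copied from the question.
event = (
    "09:00:06 08/01/2016 good TSMONW46PRDV [TSMONW46PRDV][AP] Disk Space "
    "Disk/File System/[C]/percent full=45.745, "
    "Disk/File System/[E]/percent full=34.595"
)

# Hypothetical pattern: capture the drive letter and its value separately,
# so it works for however many drives a server reports.
pair_pattern = re.compile(
    r"File System/\[(?P<drive>[^\]]+)\]/percent full=(?P<pct>\d+(?:\.\d+)?)"
)
matches = list(pair_pattern.finditer(event))

# Option 1: one field ("Drive") holding many values.
drive_values = [
    "[{}]/percent full={}".format(m.group("drive"), m.group("pct"))
    for m in matches
]

# Option 2: one field per drive, each holding a single numeric value.
drive_fields = {
    "{}_Drive".format(m.group("drive")): float(m.group("pct"))
    for m in matches
}

print(drive_values)
print(drive_fields)
```

The second shape is easier to filter on a specific drive, while the first needs no knowledge of which drives exist.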
@anoopambli, did this help you? If so, could you accept the answer?
Also, if you want to get really good with regular expressions, check out www.regex101.com and play around. Once you get familiar with lookaheads and lookbehinds, it's pretty straightforward.