Garbage collection logs field extraction from log file

Path Finder

I would like to extract fields from the log below using regular expressions. Can someone help me?

28820.220: [Full GC (System.gc()) 8832K->8624K(37888K), 0.0261704 secs]
29372.500: [GC (Allocation Failure) 23984K->8816K(37888K), 0.0013546 secs]
29932.500: [GC (Allocation Failure) 24176K->8808K(37888K), 0.0017082 secs]
30492.500: [GC (Allocation Failure) 24168K->8960K(37888K), 0.0017122 secs]
31047.500: [GC (Allocation Failure) 24320K->8944K(37888K), 0.0020634 secs]
31602.500: [GC (Allocation Failure) 24304K->8992K(37888K), 0.0017542 secs]
32157.500: [GC (Allocation Failure) 24352K->8968K(37888K), 0.0018971 secs]
32420.247: [GC (System.gc()) 16160K->8944K(37888K), 0.0012816 secs]
32420.248: [Full GC (System.gc()) 8944K->8624K(37888K), 0.0205035 secs]

For Full GC entries such as 8944K->8624K(37888K), I would like to extract:

Field1: 8944 (whatever value appears in each Full GC entry)
Field2: 8624 (whatever value appears in each Full GC entry)
Field3: 37888 (whatever value appears in each Full GC entry)

Similarly for GC entries.

Early help would be appreciated, as my organization does not allow me to install the field extractor app to extract these fields easily.

Re: Garbage collection logs field extraction from log file

Legend

Hi nagaraju_chittathuru,
try this regex:

\[Full GC.*\)\)\s(?<FullGC1>[^K]*)K-\>(?<FullGC2>[^K]*)K\((?<FullGC3>[^\)]*)

If the unit could be M or G instead of K, you can use:

\[Full GC.*\)\)\s(?<FullGC1>[^KMG]*)(K|M|G)-\>(?<FullGC2>[^KMG]*)(K|M|G)\((?<FullGC3>[^KMG]*)

Test it at https://regex101.com/r/z3PqFP/1
Bye.
Giuseppe

Re: Garbage collection logs field extraction from log file

Path Finder

Hi cusello,
Thanks for the quick turnaround. When I build the query

mysearch | rex field=_raw "\[Full GC.*\)\)\s(?<FullGC1>[^KMG]*)(K|M|G)-\>(?<FullGC2>[^KMG]*)(K|M|G)\((?<FullGC3>[^KMG]*)" | table FullGC1, FullGC2, FullGC3, _raw

it returns only the first Full GC match, even though I have multiple Full GC entries in the same event.
At https://regex101.com/r/z3PqFP/1 the other occurrences are shown, but the actual query prints only one row.
Any help would be appreciated.

Re: Garbage collection logs field extraction from log file

Legend

Hi nagaraju_chittathuru,
try adding `max_match=0` to the rex command:

mysearch 
| rex max_match=0 "\[Full GC.*\)\)\s(?<FullGC1>[^KMG]*)(K|M|G)-\>(?<FullGC2>[^KMG]*)(K|M|G)\((?<FullGC3>[^KMG]*)" 
| table FullGC1, FullGC2, FullGC3, _raw
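
If you want to check the behaviour without real GC data, here is a minimal sketch that builds a single event containing two Full GC lines with makeresults and runs the same rex against it. The two sample lines are copied from the log in the question; everything else is standard SPL. With max_match=0, FullGC1, FullGC2 and FullGC3 should come back as multivalue fields holding one value per Full GC line, rather than only the first match.

```spl
| makeresults
| eval _raw="28820.220: [Full GC (System.gc()) 8832K->8624K(37888K), 0.0261704 secs]
32420.248: [Full GC (System.gc()) 8944K->8624K(37888K), 0.0205035 secs]"
| rex max_match=0 "\[Full GC.*\)\)\s(?<FullGC1>[^KMG]*)(K|M|G)-\>(?<FullGC2>[^KMG]*)(K|M|G)\((?<FullGC3>[^KMG]*)"
| table FullGC1, FullGC2, FullGC3
```

Removing max_match=0 from the rex line should reproduce the single-value result described above.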

Bye.
Giuseppe

Re: Garbage collection logs field extraction from log file

Path Finder

Hi cusello,
Thanks a lot, that works fine. I would like to extend the regex to also capture the timestamp and the GC time from the sample data below:

28820.220: [Full GC (System.gc()) 8832K->8624K(37888K), 0.0261704 secs]
29372.500: [GC (Allocation Failure) 23984K->8816K(37888K), 0.0013546 secs]

From this I am trying to extract the fields below. Could you help me?
28820.220 as "timestamp"
0.0261704 as "gctime"

mysearch 
| rex max_match=0 "\[Full GC.*\)\)\s(?<FullGC1>[^KMG]*)(K|M|G)-\>(?<FullGC2>[^KMG]*)(K|M|G)\((?<FullGC3>[^KMG]*)" 
| table FullGC1, FullGC2, FullGC3, _raw

Re: Garbage collection logs field extraction from log file

Legend

Try this:

^(?<timestamp>[^:]*): \[Full GC.*\)\)\s(?<FullGC1>[^K]*)K-\>(?<FullGC2>[^K]*)K\((?<FullGC3>[^\)]*)\),\s+(?<gctime>[^ ]*)

Bye.
Giuseppe
(if you're satisfied accept or upvote it)

Re: Garbage collection logs field extraction from log file

Legend

@nagaraju_chittathuru, based on the sample events provided, please try the following rex command.

<YourBaseSearch>
| rex field=_raw "\[([^\(]+)\(([^\)]+)\)[\)|\s]+(?<field1>\d+)K-\>(?<field2>\d+)K\((?<field3>\d+)K\)"
| table field1, field2, field3, _raw

You can use regex101.com to write and test your regular expressions. Splunk also has its own Interactive Field Extraction (IFX), which you can use to have Splunk generate the required regular expression.
Link to documentation: http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/ExtractfieldsinteractivelywithIFX




| eval message="Happy Splunking!!!"


Re: Garbage collection logs field extraction from log file

Path Finder

Hi niketnilay,
Thanks for the quick turnaround. When I build the query

mysearch | rex field=_raw "\[([^\(]+)\(([^\)]+)\)[\)|\s]+(?<field1>\d+)K-\>(?<field2>\d+)K\((?<field3>\d+)K\)"
| table field1, field2, field3, _raw

it returns only the first Full GC match, even though I have multiple Full GC entries in the same event. Any help would be appreciated.

Re: Garbage collection logs field extraction from log file

Legend

If you have multiple matches in the same event, you can use the max_match argument. Setting it to 0 makes rex return every match instead of only the first one.

| rex field=_raw "\[([^\(]+)\(([^\)]+)\)[\)|\s]+(?<field1>\d+)K-\>(?<field2>\d+)K\((?<field3>\d+)K\)" max_match=0
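
With max_match=0 the three fields become multivalue fields on a single result row. If you would rather see one row per GC entry, a common approach is to zip the values together, expand them, and split them apart again. The following is only a sketch under that assumption: the helper field name gc is made up for illustration, and mvzip is used with its default comma delimiter.

```spl
<YourBaseSearch>
| rex field=_raw max_match=0 "\[([^\(]+)\(([^\)]+)\)[\)|\s]+(?<field1>\d+)K-\>(?<field2>\d+)K\((?<field3>\d+)K\)"
| eval gc=mvzip(mvzip(field1, field2), field3)
| mvexpand gc
| eval field1=mvindex(split(gc, ","), 0), field2=mvindex(split(gc, ","), 1), field3=mvindex(split(gc, ","), 2)
| table field1, field2, field3
```

Zipping before mvexpand keeps the three values of each match aligned; expanding each field separately would lose that pairing.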



| eval message="Happy Splunking!!!"


Re: Garbage collection logs field extraction from log file

Path Finder

I am trying to extend the regex to extract the first timestamp using the pattern below:
\s\w+.(?\w+:)
Somehow it extracts only the part after the decimal point from the example below. Could you please help with this?
29372.500: [GC (Allocation Failure) 23984K->8816K(37888K), 0.0013546 secs]
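
One likely reason a \w+-based capture returns only the digits after the decimal point is that \w does not match ".", so a capture group placed after the dot can only see "500". Below is a minimal sketch that anchors on the start of the line and captures the whole number instead. The sample line is the one above; the field names gctype, heap_before, heap_after and heap_total are hypothetical and chosen only for illustration.

```spl
| makeresults
| eval _raw="29372.500: [GC (Allocation Failure) 23984K->8816K(37888K), 0.0013546 secs]"
| rex "^(?<timestamp>\d+\.\d+):\s+\[(?<gctype>[^\(]+?)\s+\(.*\)\s+(?<heap_before>\d+)K-\>(?<heap_after>\d+)K\((?<heap_total>\d+)K\),\s+(?<gctime>[\d\.]+)\s+secs\]"
| table timestamp, gctype, heap_before, heap_after, heap_total, gctime
```

For events that contain several GC lines, the same pattern would probably need max_match=0 plus a leading (?m) so that ^ can match at the start of every line, not only at the start of the event.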
