Need help with field extractions. Need to extract the fields in bold.
Here are two sample events
Sample1
40.156.209.1 | ssh | o*4RAGZLx404x22840423x1 | JG25721 | 2018-06-20 06:44:51,219 | SSH - git-upload-pack '/dga/dgiodbatc.git' | - | 0 | 4 | 1911 | cache:miss, refs, ssh:user:id:126642 | 2140 | 1hgs9dp |
Sample2
10.348.20.158,30.158.219.1 | https | i*1N0FIQQx408x22719240x2 | - | 2018-06-20 06:48:08,653 | "GET /rest/api/1.0/repos HTTP/1.1" | "" "Apache-HttpClient/4.5.3 (Java/1.8.0_77)" | - | - | - | - | - | - |
After extracting the first bold field, check whether it starts with "o"; if it does, extract the second bold field (i.e. 2140). If the first field starts with "i", ignore that event.
Maybe something like this as field extraction
^(?<ips>\S+)\s\|\s(?<protocol>\S+)\s\|\s(?<id>\S+)\s\|\s([^\|]+\|\s){8}(?<id_no>\S+)
the "*" makes it a little cumbersome, but this should work: base search | where NOT like(id,"i%")
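For anyone without the raw data, here is a rough offline check of the same idea in Python: extract with the regex, then drop ids starting with "i". It is only a sketch, not Splunk itself. Python's re module spells named groups (?P<name>...) where Splunk accepts (?<name>...), and the helper names extract and keep are made up for illustration.

```python
import re

# Same pattern as above, ported to Python's named-group syntax (?P<name>...).
PATTERN = re.compile(
    r"^(?P<ips>\S+)\s\|\s(?P<protocol>\S+)\s\|\s(?P<id>\S+)\s\|\s"
    r"(?:[^|]+\|\s){8}(?P<id_no>\S+)"
)

# The two sample events from the question, copied verbatim.
SAMPLE1 = ("40.156.209.1 | ssh | o*4RAGZLx404x22840423x1 | JG25721 | "
           "2018-06-20 06:44:51,219 | SSH - git-upload-pack '/dga/dgiodbatc.git' | "
           "- | 0 | 4 | 1911 | cache:miss, refs, ssh:user:id:126642 | 2140 | 1hgs9dp |")
SAMPLE2 = ('10.348.20.158,30.158.219.1 | https | i*1N0FIQQx408x22719240x2 | - | '
           '2018-06-20 06:48:08,653 | "GET /rest/api/1.0/repos HTTP/1.1" | '
           '"" "Apache-HttpClient/4.5.3 (Java/1.8.0_77)" | - | - | - | - | - | - |')


def extract(event):
    """Return the named groups as a dict, or None if the event does not match."""
    match = PATTERN.match(event)
    return match.groupdict() if match else None


def keep(fields):
    """Roughly mimic `where NOT like(id,"i%")`: drop ids starting with "i"."""
    return fields is not None and not fields["id"].startswith("i")


rows = [f for f in map(extract, (SAMPLE1, SAMPLE2)) if keep(f)]
for row in rows:
    print(row["id"], row["id_no"])
```

Both samples match the pattern (the second one yields "-" as id_no), so it is the filter step that removes the "i" event.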
Personally, I'd just extract all the fields btw and not use ([^\|]+\|\s){8}
to skip to the number later on, but if you don't need the other fields, well...
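To illustrate what "extract all the fields" could look like outside Splunk, here is a small Python sketch that splits the first sample on the pipe delimiter instead of skipping eight columns with a regex. The function name split_fields and the positional indexes are assumptions based only on the two samples in the question.

```python
# First sample event from the question, copied verbatim.
SAMPLE1 = ("40.156.209.1 | ssh | o*4RAGZLx404x22840423x1 | JG25721 | "
           "2018-06-20 06:44:51,219 | SSH - git-upload-pack '/dga/dgiodbatc.git' | "
           "- | 0 | 4 | 1911 | cache:miss, refs, ssh:user:id:126642 | 2140 | 1hgs9dp |")


def split_fields(event):
    """Split a pipe-delimited event into trimmed fields, dropping the empty tail."""
    parts = [part.strip() for part in event.split("|")]
    if parts and parts[-1] == "":
        parts.pop()  # the sample events end with a trailing "|"
    return parts


fields = split_fields(SAMPLE1)
# Assumed positions: index 2 is the id, index 11 is the number after the cache info.
event_id, id_no = fields[2], fields[11]
print(len(fields), event_id, id_no)
```

This only holds as long as no field contains a literal "|", which is true for the two samples but may not be for the full data.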
Hth,
-Kai.
Could you help me turn it into a query?
This is how I am composing it:
sourcetype="Raccess" (host="AVOP" OR host="BVOP") date_wday!=saturday AND date_wday !=sunday
| rex "^(?<ips>\S+)\s\|\s(?<protocol>\S+)\s\|\s(?<id>\S+)\s\|\s([^\|]+\|\s){8}(?<id_no>\S+)"
| where NOT like(id,"i%")
| timechart values(id_no)
This doesn't give me any result.
Yes, extracting all the fields would also help me a great deal. We just have to make sure the fields are extracted only from events whose third field starts with "o", not "i".
Would you mind putting your code into code blocks? 🙂
Well, it wasn't meant to be used as a rex command. I was thinking of a field extraction on the sourcetype in question, and then doing a search with that. That being said, in my test it works with rex.
If you insist on not extracting the fields for i* events (I just discarded those events with the NOT like() clause), you could do that directly in rex as well, e.g.
rex field=input "^(?<ips>\S+)\s\|\s(?<protocol>\S+)\s\|\s(?<id>o\S+)\s\|\s([^\|]+\|\s){8}(?<id_no>\S+)"
will only extract when id starts with "o", and then you can lose the "where NOT".
At least this works when I pipe your example through, like
| makeresults | eval input="40.156.209.1 | ssh | o*4RAGZLx404x22840423x1 | JG25721 | 2018-06-20 06:44:51,219 | SSH - git-upload-pack '/dga/dgiodbatc.git' | - | 0 | 4 | 1911 | cache:miss, refs, ssh:user:id:126642 | 2140 | 1hgs9dp |" | rex field=input "^(?<ips>\S+)\s\|\s(?<protocol>\S+)\s\|\s(?<id>o\S+)\s\|\s([^\|]+\|\s){8}(?<id_no>\S+)" | stats values(id_no)
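Going by the original question (keep ids starting with "o", skip those starting with "i"), the prefix can be put into the pattern itself. Below is a hedged Python sketch of that variant, run against both sample events. As before, Python needs (?P<name>...) for named groups where Splunk accepts (?<name>...), and the names O_ONLY and id_no_for are invented for the example.

```python
import re

# Variant of the pattern that only matches when the third field starts with "o".
O_ONLY = re.compile(
    r"^(?P<ips>\S+)\s\|\s(?P<protocol>\S+)\s\|\s(?P<id>o\S+)\s\|\s"
    r"(?:[^|]+\|\s){8}(?P<id_no>\S+)"
)

# The two sample events from the question, copied verbatim.
SAMPLE1 = ("40.156.209.1 | ssh | o*4RAGZLx404x22840423x1 | JG25721 | "
           "2018-06-20 06:44:51,219 | SSH - git-upload-pack '/dga/dgiodbatc.git' | "
           "- | 0 | 4 | 1911 | cache:miss, refs, ssh:user:id:126642 | 2140 | 1hgs9dp |")
SAMPLE2 = ('10.348.20.158,30.158.219.1 | https | i*1N0FIQQx408x22719240x2 | - | '
           '2018-06-20 06:48:08,653 | "GET /rest/api/1.0/repos HTTP/1.1" | '
           '"" "Apache-HttpClient/4.5.3 (Java/1.8.0_77)" | - | - | - | - | - | - |')


def id_no_for(event):
    """Return id_no when the event's id starts with "o", otherwise None."""
    match = O_ONLY.match(event)
    return match.group("id_no") if match else None


for name, event in (("Sample1", SAMPLE1), ("Sample2", SAMPLE2)):
    print(name, id_no_for(event))
```

With this variant the "i" event simply fails to match, so no fields are extracted from it and no separate filter is needed.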
I tried the following:
sourcetype="Raccess" (host="AVOP" OR host="BVOP") date_wday!=saturday AND date_wday !=sunday
| makeresults | eval input=_raw | rex field=_raw "^(?<ips>\S+)\s\|\s(?<protocol>\S+)\s\|\s(?<id>o\S+)\s\|\s([^\|]+\|\s){8}(?<id_no>\S+)" | stats values(id_no)
It fails with: Error in 'makeresults' command: This command must be the first command of a search.
"makeresults" is used a lot here to generate artificial result sets, since people don't have each other's raw data. It has to be the first command of a search, which is why it fails after your base search.
So you could cut and paste my last answer on its own, without any additional base search, to play around with it. Sorry, I took that for granted.
So what about
sourcetype="Raccess" (host="AVOP" OR host="BVOP") date_wday!=saturday AND date_wday !=sunday | rex field=_raw "^(?<ips>\S+)\s\|\s(?<protocol>\S+)\s\|\s(?<id>o\S+)\s\|\s([^\|]+\|\s){8}(?<id_no>\S+)" | stats values(id_no)
Doesn't that work? If not, then your raw data probably doesn't exactly match what you posted here, or I may be misunderstanding something. It happens: I am so used to using Splunk in a certain way, with certain data sets and questions, that I automatically misunderstand in my own way. 🙂
Thank you @knielsen