I have a file that I am monitoring on a Heavy Forwarder (HF). The file contains JSON logs. On the HF I have the following props.conf:
[EC-json]
KV_MODE=JSON
TIME_PREFIX="timestamp":"
TIME_FORMAT=%Y-%m-%dT%H:%M:%S
SHOULD_LINEMERGE=false
TRUNCATE=0
Once the data reaches the indexers, I am trying to create several search-time extractions on the search head (SH).
I first tested from the search bar using this search:
sourcetype=EC-json | rex field=_raw "userid\":.+?,ou=(?<user_org>\w+)," | rex field=_raw "SourceName.+?:.+?\/\/.+\/(?<PDF>.+?\.pdf)"
This was successful: I was able to create two new fields, user_org and PDF.
Then I tried using props.conf in /etc/apps/search/local/ on the SH:
[EC-json]
EXTRACT-user_org = userid\".+?,ou=(?<user_org>\w+),
EXTRACT-PDF = SourceName\".+?:.+?\/\/.+\/(?<PDF>.+?\.pdf)
Here is a sample of my data:
{"timestamp":"02/16/2018 08:02:23","Accountid":"userj", <snip> ,"SourceName":"https://share.org.com/sites/reports/ORGReports/report1.pdf","userid":"cn= joe user,ou=SOC,ou=org,ou=company,ou=us"}
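One way to sanity-check the two EXTRACT patterns above against this sample event, outside Splunk, is a short Python script. This is only a sketch under a few assumptions: Python's re module stands in for Splunk's PCRE engine, the named groups are rewritten from (?<name>...) to (?P<name>...), and the <snip> placeholder is kept as literal text. Under those assumptions the user_org pattern matches, but the props.conf PDF pattern does not: SourceName\" is followed by .+?: , which needs at least one character between the quote and the colon, while the search-bar version had no \" at that position.

```python
import re

# Sample event from the post; the "<snip>" placeholder is kept verbatim.
EVENT = (
    '{"timestamp":"02/16/2018 08:02:23","Accountid":"userj", <snip> ,'
    '"SourceName":"https://share.org.com/sites/reports/ORGReports/report1.pdf",'
    '"userid":"cn= joe user,ou=SOC,ou=org,ou=company,ou=us"}'
)

# props.conf EXTRACT patterns, with (?<name>...) rewritten to (?P<name>...).
PROPS_USER_ORG = r'userid\".+?,ou=(?P<user_org>\w+),'
PROPS_PDF = r'SourceName\".+?:.+?\/\/.+\/(?P<PDF>.+?\.pdf)'
# The PDF pattern as used with rex in the search bar (no \" after SourceName).
REX_PDF = r'SourceName.+?:.+?\/\/.+\/(?P<PDF>.+?\.pdf)'


def extract(pattern, text):
    """Return the named groups of the first match, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.groupdict() if m else None


user_org_result = extract(PROPS_USER_ORG, EVENT)
props_pdf_result = extract(PROPS_PDF, EVENT)
rex_pdf_result = extract(REX_PDF, EVENT)

print("props user_org:", user_org_result)
print("props PDF:     ", props_pdf_result)
print("rex PDF:       ", rex_pdf_result)
```

If the same behaviour holds in Splunk, changing the PDF stanza to begin with SourceName\": (or dropping the \" as in the rex version) would be one thing to try.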
I tried using the suggestions here: https://www.splunk.com/blog/2016/06/28/eureka-extracting-key-value-pairs-from-json-fields.html
and added the following to the props.conf on my SH to pull all the information from the userid:
EXTRACT-KVPS = (?:\\[rnt]|:")(?<_KEY_1>[^="\\]+)=(?:\\")?(?<_VAL_1>[^="\\]+)
But that doesn't seem to pull the info into the right fields. And all I care about is the first OU anyway.
Can someone help with my props.conf syntax?
Do I need to escape the quote after userid and SourceName or not?
Thanks.
I'm posting here because I used all three answers from @somesoni2 and @cpetterborg and @493669 to get things to work.
@somesoni2's extractions (although @cpetterborg's version would have worked as well)
@cpetterborg check via the Extraction Tool
@493669 config change in fields.conf
Thanks everyone!
In my case, what worked was the answer from @somesoni2:
your base search | extract reload=t
Sometimes things do not update even if you reload Splunk.
Also, extracted fields can take a few minutes to show up, so wait a few minutes before concluding that the extraction failed.
I'm putting this separate because I was testing both answers from @somesoni2 and @cpetterborg
OK, I'm really confused:
I used the suggestion from @somesoni2 for user_org, restarted, and ran my search. No luck: the field didn't show. I verified that the same regex worked with rex field=_raw ... in the search bar.
Then I started working through what @cpetterborg said and opened up the Splunk UI Extraction Tool and it showed user_org as an existing field! whaaa? If it's an existing field, why doesn't it show up?
I re-ran my search, and it still didn't show up. I tried running my search with | stats count by user_org but still no results. I clicked to open All Fields, in case it was hidden there but user_org wasn't listed.
I then finished going through the Extraction Tool and saved (just in case I had not properly passed everything across my cluster). I then tried searching again and nope, that field STILL doesn't show up anywhere.
BUT, it still shows up when going through the Extraction Tool....
Are you in verbose mode in your search? If not, try that.
Try to run this
your base search | extract reload=t
Then run it without the extract command. Sometimes that's required for field extractions to show up.
Try setting INDEXED_VALUE = false in fields.conf:
[user_org]
INDEXED_VALUE = false
What about trying the following regex in an auto field extraction (through the Splunk UI Field Extraction Tool):
"SourceName":"[^"]+/(?<PDF>[^/]+?\.pdf).*"userid":.+?,ou=(?<user_org>\w+),
It works on my development system with the example data you provided. It should produce the same results as a field extraction in the props.conf file, but is easier to work with.
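For anyone who wants to check this combined regex outside Splunk, here is a minimal Python sketch. It assumes Python's re module behaves like Splunk's PCRE engine for these constructs, rewrites the named groups as (?P<name>...), and uses the sample event from the question with its <snip> placeholder left in place.

```python
import re

# Sample event from the question; the "<snip>" placeholder is kept verbatim.
EVENT = (
    '{"timestamp":"02/16/2018 08:02:23","Accountid":"userj", <snip> ,'
    '"SourceName":"https://share.org.com/sites/reports/ORGReports/report1.pdf",'
    '"userid":"cn= joe user,ou=SOC,ou=org,ou=company,ou=us"}'
)

# Single pattern with two named groups, (?<name>...) rewritten to (?P<name>...).
COMBINED = (
    r'"SourceName":"[^"]+/(?P<PDF>[^/]+?\.pdf)'
    r'.*"userid":.+?,ou=(?P<user_org>\w+),'
)

match = re.search(COMBINED, EVENT)
# groupdict() returns only the named groups, i.e. the would-be Splunk fields.
combined_fields = match.groupdict() if match else {}

print(combined_fields)
```

One trade-off of a single pattern is that it only matches when SourceName appears before userid in the event; two separate extractions do not depend on key order.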
Try this (props.conf on SH, will need to restart SH)
[EC-json]
EXTRACT-user_org = userid\"\:\"[^,]+,ou=(?<user_org>[^,]+)
EXTRACT-PDF = SourceName\"\:\"([^\/]+\/+)+(?<PDF>[^\.]+\.pdf)
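A quick way to try these two patterns against the sample event before restarting the SH is a small Python script. This is a sketch, not Splunk itself: it assumes Python's re module matches these patterns the same way Splunk's PCRE engine does, and the named groups are rewritten as (?P<name>...).

```python
import re

# Sample event from the question; the "<snip>" placeholder is kept verbatim.
EVENT = (
    '{"timestamp":"02/16/2018 08:02:23","Accountid":"userj", <snip> ,'
    '"SourceName":"https://share.org.com/sites/reports/ORGReports/report1.pdf",'
    '"userid":"cn= joe user,ou=SOC,ou=org,ou=company,ou=us"}'
)

# The two EXTRACT patterns, with (?<name>...) rewritten to (?P<name>...).
PATTERNS = {
    "EXTRACT-user_org": r'userid\"\:\"[^,]+,ou=(?P<user_org>[^,]+)',
    "EXTRACT-PDF": r'SourceName\"\:\"([^\/]+\/+)+(?P<PDF>[^\.]+\.pdf)',
}

extracted = {}
for name, pattern in PATTERNS.items():
    m = re.search(pattern, EVENT)
    if m:
        # Only named groups become fields; the unnamed path group is ignored.
        extracted.update(m.groupdict())
    else:
        print(f"{name}: no match")

print(extracted)
```

Because [^,]+ stops at the first comma, user_org captures only the first OU, which is what the question asks for.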