I have the following content in my logs under index="abc_uyt". It contains four different sets of URLs:
RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/ronaldo/getbilledvspaid/v1
RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/ronaldo/paymenthistory/v1
RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/ronaldo/requesthistory/v1
RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/ronaldo/runninghistory/v1
RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/watson/invoicedetail/v1
RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/watson/invoicesummary/v1
RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/watson/gettingValue/v1
RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/watson/historyValue/v1
RuntimeException having https://microsoft.word.com/ringert/rkj3/obama/funatwork
RuntimeException having https://microsoft.word.com/ringert/rkj3/obama/runathome
RuntimeException having https://cisco-services.raj.com/ytr-services/gilchrist/vision
RuntimeException having https://cisco-services.raj.com/ytr-services/gilchrist/health
I want to get a count for each of ronaldo, watson, obama, and gilchrist, shown in tabular form like this:
ronaldo - 25
watson - 22
obama - 36
gilchrist - 21
Could anyone please assist? I have tried rex, sed, and count, but I am getting unexpected counts.
If the URL domains are fixed, you can try this:
index="abc_uyt"
| rex field=UrlFieldName "https:\/\/(google([^\/]+\/){5}|microsoft([^\/]+\/){3}|cisco([^\/]+\/){2})(?<name>[^\/]+)"
| stats count by name
See the regex working with your sample data here: https://regex101.com/r/t8coTo/1
Can you please consider the above scenario? The syntax is almost correct, but I am not getting the name-based count.
Above regex works with your new data samples as well. https://regex101.com/r/BCtKTw/1
In my query, I'm assuming there is a URL field which contains these logs or the URL portion of them. If there is no such field and you're searching through your whole log entry or the _raw field, just remove field=UrlFieldName from the above query.
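If you want to check the extraction without touching your index, a minimal sketch like the one below may help. It builds three throwaway events with makeresults from sample lines in the question and runs the same rex against them. The field name line is just a placeholder I made up for the test, not something from your data.

```spl
| makeresults
| eval line=split("RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/ronaldo/getbilledvspaid/v1;RuntimeException having https://microsoft.word.com/ringert/rkj3/obama/funatwork;RuntimeException having https://cisco-services.raj.com/ytr-services/gilchrist/vision", ";")
| mvexpand line
| rex field=line "https:\/\/(google([^\/]+\/){5}|microsoft([^\/]+\/){3}|cisco([^\/]+\/){2})(?<name>[^\/]+)"
| stats count by name
```

With these three lines, the result should be one row each for gilchrist, obama, and ronaldo. If a name is missing, the number of path segments for that domain is probably different from what the pattern expects.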
Thanks, it worked.
If the URLs will always end with either /something
OR /something/v1
(where the "v1" will literally always be "v1" and not anything else), then this should work:
| rex field=_raw "(?<name>\w+)\/\w+(\/v1)?$"
| stats count by name
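One caveat: the $ anchor only matches when the URL is the last thing in the field. If the raw event has text after the URL (for example ": Read timed out"), a possible workaround is to pull the URL into its own field first and then apply the same pattern to that field. This is only a sketch. The field name url is one I picked for illustration, and the first rex assumes the URL has no port number or other colon after "https://".

```spl
index="abc_uyt"
| rex field=_raw "(?<url>https:\/\/[^\s:]+)"
| rex field=url "(?<name>\w+)\/\w+(\/v1)?$"
| stats count by name
```

The first rex stops at the first whitespace or colon after the scheme, so the trailing message text is left out and the end-of-string anchor in the second rex can match.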
Alternately, if you have a finite list of names you're looking for, you could create a wildcard lookup containing those names. Here's a good answer that explains how to do that:
https://answers.splunk.com/answers/52580/can-we-use-wildcard-characters-in-a-lookup-table.html
I'll assume you load those in such that you wind up with something like this:
user, username
*ronaldo*, ronaldo
*watson*, watson
*obama*, obama
*gilchrist*, gilchrist
Once you have the names loaded into your wildcard lookup, you would do something like this:
your base search where the URLs are in a field called URL
| lookup your_wildcard_lookup user AS URL OUTPUT username
| stats count by username
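For the wildcard matching to take effect, the lookup definition has to declare the wildcard field. As a rough sketch, assuming the table above is saved as a CSV file (the file name here is hypothetical), the stanza in transforms.conf would look something like this:

```ini
[your_wildcard_lookup]
filename = your_wildcard_lookup.csv
match_type = WILDCARD(user)
```

The same setting can also be made in the UI under the lookup definition's advanced options, if you prefer not to edit the .conf file directly.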
The logs come in the sets below (log1 through log6):
log1: Caused by: java.RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/ronaldo/getbilledvspaid/v1: Read timed out
log2: Caused by: java.RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/ronaldo/saysfs/v1: Read timed out
log3: Caused by: java.RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/watson/invoicesummary/v1: Read timed out
log4: Caused by: java.RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/watson/iuaxaddd/v1: Read timed out
log5: KHGM PDF invoice service at endpoint: https://microsoft.word.com/ringert/rkj3/obama/funatwork
log6: and setting service endpoint URL: https://cisco-services.raj.com/ytr-services/gilchrist/health
For the first set, will the URLs always end with v1?
No, they do not always end with v1, so I have to rely on the URL domains instead.
The field you are looking for seems to be in different places in the URLs. What determines where the field is located?