Solved: Get count from multiple urls based on required pro...

arjun_krishna · 05-16-2018

I have the content below, with four different sets of URLs, in my logs under index="abc_uyt":

RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/ronaldo/getbilledvspaid/v1
RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/ronaldo/paymenthistory/v1
RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/ronaldo/requesthistory/v1
RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/ronaldo/runninghistory/v1

RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/watson/invoicedetail/v1
RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/watson/invoicesummary/v1
RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/watson/gettingValue/v1
RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/watson/historyValue/v1

RuntimeException having https://microsoft.word.com/ringert/rkj3/obama/funatwork
RuntimeException having https://microsoft.word.com/ringert/rkj3/obama/runathome

RuntimeException having https://cisco-services.raj.com/ytr-services/gilchrist/vision
RuntimeException having https://cisco-services.raj.com/ytr-services/gilchrist/health

I want to get the count for each of ronaldo, watson, obama, and gilchrist, along with the corresponding values, in tabular form like below:
ronaldo - 25
watson - 22
obama - 36
gilchrist - 21

Could anyone please assist? I have tried rex, sed, and count, but I am getting an unexpected count.

somesoni2 · 05-16-2018

If the URL domains are fixed, you can try something like this:

index="abc_uyt"
| rex field=UrlFieldName "https:\/\/(google([^\/]+\/){5}|microsoft([^\/]+\/){3}|cisco([^\/]+\/){2})(?<name>[^\/]+)"
| stats count by name

See the regex working with your sample data here: https://regex101.com/r/t8coTo/1
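
To sanity-check the pattern outside Splunk, here is a minimal Python sketch. It is only an offline approximation, not how Splunk runs the search: the named group is rewritten as (?P<name>...) because Python's re module uses that syntax, the sample lines are a subset of those in the question, and count_by_name is a hypothetical helper standing in for "stats count by name".

```python
import re
from collections import Counter

# Same pattern as the rex above, with the named group in Python syntax.
PATTERN = re.compile(
    r"https://(google([^/]+/){5}|microsoft([^/]+/){3}|cisco([^/]+/){2})"
    r"(?P<name>[^/]+)"
)

# A few sample lines copied from the question (not the full log).
SAMPLE_LINES = [
    "RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/ronaldo/getbilledvspaid/v1",
    "RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/ronaldo/paymenthistory/v1",
    "RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/watson/invoicedetail/v1",
    "RuntimeException having https://microsoft.word.com/ringert/rkj3/obama/funatwork",
    "RuntimeException having https://cisco-services.raj.com/ytr-services/gilchrist/vision",
]


def count_by_name(lines):
    """Rough stand-in for `stats count by name`: tally the captured segment."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)  # unanchored search over the whole line
        if match:
            counts[match.group("name")] += 1
    return counts


if __name__ == "__main__":
    for name, count in sorted(count_by_name(SAMPLE_LINES).items()):
        print(f"{name} - {count}")
```

The repeat counts ({5}, {3}, {2}) are what tie each domain to the position of the name segment, so a new domain or a different path depth would need its own alternative in the pattern.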

arjun_krishna · 05-16-2018

Can you please consider the above scenario? The syntax is almost correct, but I am not getting the name-based count.

somesoni2 · 05-16-2018

The regex above works with your new data samples as well: https://regex101.com/r/BCtKTw/1

In my query, I'm assuming there is a URL field which contains these logs or the URL portion of them. If there is no such field and you're searching through your whole log entry or the _raw field, just remove field=UrlFieldName from the query above.

arjun_krishna · 05-17-2018

Thanks, it worked.

elliotproebstel · 05-16-2018

If the URLs will always end with either /something OR /something/v1 (where the "v1" will literally always be "v1" and not anything else), then this should work:

| rex field=_raw "(?<name>\w+)\/\w+(\/v1)?$"
| stats count by name
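
A quick offline check of this pattern, again as a Python sketch rather than anything Splunk-specific. What it illustrates is the $ anchor: the match has to finish at the end of the text being searched, so a line with trailing text after the URL would not match. The sample lines come from this thread, the named group is rewritten in Python's (?P<name>...) syntax, and extract_name is a hypothetical helper.

```python
import re

# Same pattern as the rex above, with the named group in Python syntax.
PATTERN = re.compile(r"(?P<name>\w+)/\w+(/v1)?$")


def extract_name(line):
    """Return the captured name, or None when the pattern does not match."""
    match = PATTERN.search(line)
    return match.group("name") if match else None


if __name__ == "__main__":
    samples = [
        # URL is the last thing on the line, with and without /v1.
        "RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/ronaldo/getbilledvspaid/v1",
        "RuntimeException having https://microsoft.word.com/ringert/rkj3/obama/funatwork",
        # Trailing text after the URL: the end anchor no longer lines up.
        "Caused by: java.RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/ronaldo/getbilledvspaid/v1: Read timed out",
    ]
    for sample in samples:
        print(extract_name(sample), "<-", sample)
```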

elliotproebstel · 05-16-2018

Alternately, if you have a finite list of names you're looking for, you could create a wildcard lookup containing those names. Here's a good answer that explains how to do that:
https://answers.splunk.com/answers/52580/can-we-use-wildcard-characters-in-a-lookup-table.html
I'll assume you load those in such that you wind up with something like this:

user,username
*ronaldo*,ronaldo
*watson*,watson
*obama*,obama
*gilchrist*,gilchrist

Once you have the names loaded into your wildcard lookup, you would do something like this:

your base search where the URLs are in a field called URL
| lookup your_wildcard_lookup user AS URL OUTPUT username
| stats count by username
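
To make the wildcard-lookup idea concrete, here is a rough Python emulation. It is only an analogy for how a wildcard row maps a URL to a name, not Splunk's lookup implementation: fnmatchcase stands in for the wildcard matching, the first matching row wins in this sketch, and LOOKUP_ROWS, lookup_username, and count_by_username are hypothetical names. In Splunk itself, wildcard matching on the lookup field is enabled with match_type = WILDCARD(user) in the lookup's transforms.conf stanza.

```python
from collections import Counter
from fnmatch import fnmatchcase

# Rows of the hypothetical wildcard lookup: (user pattern, username).
LOOKUP_ROWS = [
    ("*ronaldo*", "ronaldo"),
    ("*watson*", "watson"),
    ("*obama*", "obama"),
    ("*gilchrist*", "gilchrist"),
]


def lookup_username(url):
    """Return the username of the first row whose pattern matches the URL."""
    for pattern, username in LOOKUP_ROWS:
        if fnmatchcase(url, pattern):  # case-sensitive wildcard match
            return username
    return None


def count_by_username(urls):
    """Rough stand-in for `stats count by username`; unmatched URLs are skipped."""
    matched = (lookup_username(url) for url in urls)
    return Counter(name for name in matched if name is not None)


if __name__ == "__main__":
    urls = [
        "https://google.yahoo.com/web/kiran/cart/groups/ronaldo/getbilledvspaid/v1",
        "https://google.yahoo.com/web/kiran/cart/groups/watson/invoicesummary/v1",
        "https://microsoft.word.com/ringert/rkj3/obama/funatwork",
        "https://cisco-services.raj.com/ytr-services/gilchrist/health",
    ]
    for name, count in sorted(count_by_username(urls).items()):
        print(f"{name} - {count}")
```

Unlike the position-based regex, this approach does not care where the name sits in the URL, but it would also match the name anywhere else in the string, so it fits best when the list of names is short and distinctive.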

arjun_krishna · 05-16-2018

The logs come in like the sets below (log1 through log6):
log1: Caused by: java.RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/ronaldo/getbilledvspaid/v1: Read timed out

log2: Caused by: java.RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/ronaldo/saysfs/v1: Read timed out

log3: Caused by: java.RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/watson/invoicesummary/v1: Read timed out

log4: Caused by: java.RuntimeException having https://google.yahoo.com/web/kiran/cart/groups/watson/iuaxaddd/v1: Read timed out

log5: KHGM PDF invoice service at endpoint: https://microsoft.word.com/ringert/rkj3/obama/funatwork

log6: and setting service endpoint URL: https://cisco-services.raj.com/ytr-services/gilchrist/health

somesoni2 · 05-16-2018

For the first set, will the URLs always end with v1?

arjun_krishna · 05-16-2018

No, they do not always end with v1; instead, I have to rely on the URL domains.

richgalloway · 05-16-2018

The field you are looking for seems to be in different places in the URLs. What determines where the field is located?

---
If this reply helps you, Karma would be appreciated.

Get count from multiple urls based on required properties
