I'm not sure how best to word the question, but below is what I am trying to do. Feel free to edit the question header.
We have a list of referrer URLs, e.g.:
www.example.com/this-file/doe?a=a
www.example.com/this-file/dane
www.example.com/this-file/doe
URL 1 and URL 3 are actually the same; the first just has URL parameters. Is there a method I can use to strip the URL parameters before running the search and doing a count? Ideally, the outcome would be:
www.example.com/this-file/doe - 2
www.example.com/this-file/dane - 1
We cannot pre-filter the data using props or inputs.conf. This would have to be done at search run time.
Currently our search string is:
index="test" | regex referrer="^http://www.example.com/these-files/*" | stats count by referrer | sort -count
Try this (the rex keeps everything before the first "?", so referrers without parameters are counted too):
index="test" referrer="http://www.example.com/these-files/*"
| rex field=referrer "^(?<new_referrer>[^?]+)"
| stats count by new_referrer | sort -count
An alternative would be to filter the referrers with a regular expression:
index="test" | regex referrer="^http://www.example.com/these-files/*" | rex field=referrer "^(?<url>[^?]+)"
| stats count by url | sort -count
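Another option, sketched here on the assumption that the field is still called referrer as in the question, is to strip the query string with eval and replace instead of rex. The field name new_referrer is only an illustrative choice. Because replace leaves a value unchanged when the pattern does not match, referrers without a "?" should pass through as they are and still be counted.

```
index="test" referrer="http://www.example.com/these-files/*"
| eval new_referrer=replace(referrer, "\?.*$", "")
| stats count by new_referrer
| sort -count
```

This runs entirely at search time, so nothing in props.conf or inputs.conf needs to change.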
Have you tried the faup app?
https://splunkbase.splunk.com/app/1545/
This may help you with handling URLs.
Will definitely look into this. @lguinn's answer did it for me so far though - thanks!