How to collapse the variable part of a URL to get a list of URLs?

Path Finder

Hi,

I'm trying to get a list of the URLs that users are visiting for each of the customer sites that we manage.

I have a lookup table that links hosts to a particular customer site.

I have gotten as far as:

| rex field=uri_path mode=sed "s/([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})/email/g"
| lookup host_grouping Host AS host OUTPUTNEW Customer, Environment 
| dedup Customer uri_path 
| fields _time Customer Environment uri_path

The lookup works just fine, but a lot of my URLs have a variable portion that is easily described with a regex. If I tabulate the output of this search, I end up with thousands of entries.

For example, in the path below, the values 28400 and 212 are the IDs of the channel and the stream, respectively.

 /api/channel/28400/stream/212/play

Instead of listing every combination of IDs that has been requested for this endpoint, I want to count the endpoint just once, for example by matching those integers with a regex.

There will be different URL formats, but the pieces I need to regex out can be assumed to be either integers or email addresses.

I think what I'm looking for is rex (http://docs.splunk.com/Documentation/SplunkCloud/7.0.0/SearchReference/Rex), but it doesn't seem to accept PCRE.

Any clues where I can find this in the documentation? Or could someone give an example of "flattening" fields that have predictable variable content?

1 Solution

Path Finder

This query actually works, but the search results don't show the edited field. If you expand a result to show its fields, you'll see that the values have been changed.
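For anyone landing here later, a minimal sketch of how the integer IDs could be collapsed in the same way as the email addresses, and how to make the edited values visible. It is meant to be appended to the same base search as the query in the question. The `{id}` placeholder and the second rex line are my own additions for illustration, not something from the original search, and the pattern assumes an ID is always a run of digits directly after a slash. The email rule runs first so that digits inside an address are not rewritten by the ID rule. Ending with stats instead of dedup puts uri_path in a results table, so the rewritten values show up directly rather than only when an event is expanded.

```
| rex field=uri_path mode=sed "s/([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})/email/g"
| rex field=uri_path mode=sed "s/\/[0-9]+/\/{id}/g"
| lookup host_grouping Host AS host OUTPUTNEW Customer, Environment
| stats count BY Customer Environment uri_path
```

With this, /api/channel/28400/stream/212/play becomes /api/channel/{id}/stream/{id}/play, so every channel and stream combination is counted under a single row per Customer and Environment. Note that the ID pattern would also rewrite the leading digits of a segment such as /2fa, so it may need tightening if your paths contain segments like that.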
