Path Analysis in Splunk

DanielFordWA
Contributor

Hi,

I use IIS server logs, and each hit has the following fields:

cs_uri_stem = the page the user is on
cs_Referer = the previous page

What I want to do is track back or forward by 3 or 4 steps.

Every user is identified by the cs_username field.

The question I want to answer is as follows:

For all users that looked at a product page "/Product//Product*/", what were the 4 pages they viewed before arriving at the product page?

Any ideas?

Thanks,

Dan

1 Solution

kristian_kolb
Ultra Champion

If you have something like what @ShaneNewman suggests, i.e. some form of session identifier, you can get to the value you want like so:

...| transaction sessionID 
| eval n = mvfind(cs_uri_stem, "Product/ProductX") 
| eval m = n - 4 
| eval prevpage4 = mvindex(cs_uri_stem, m) 
| table cs_uri_stem prevpage4

The transaction makes cs_uri_stem a multivalued field which you can search through with mvfind and mvindex.

http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/CommonEvalFunctions

Hope this helps,

/K

kristian_kolb
Ultra Champion

Good to hear it worked - you learn something new each day. Thank you.

DanielFordWA
Contributor

Your advice was good! I just found the solution:

Adding mvlist=t to the transaction command fixed the ordering issue.

http://answers.splunk.com/answers/54955/ordering-of-fields-in-a-transaction-mvfind-bug
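
A minimal sketch of where that option might go, reusing the search from this thread and assuming the same sourcetype, cs_username, and cs_uri_stem fields (the extract and search steps are left out for brevity). With mvlist=t, transaction keeps every value of a field as a list instead of a set of unique values sorted alphabetically, so mvfind and mvindex see the pages as a sequence. It is worth checking the resulting order against the raw events for one user before trusting the counts.

```spl
sourcetype="iis-2"
| transaction cs_username maxspan=30m mvlist=t
| eval n = mvfind(cs_uri_stem, "/SearchResults")
| eval m = n + 1
| eval nextpage1 = mvindex(cs_uri_stem, m)
| table cs_uri_stem nextpage1
```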

kristian_kolb
Ultra Champion

Oops, sorry for giving you bad advice.

One possible workaround could be to concatenate _time with cs_uri_stem before the transaction:

... | eval my_cs_uri_stem = _time . " " . cs_uri_stem

and then split them apart again later. It sounds ugly, but it may work.
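
A rough sketch of that workaround. It assumes the default transaction behaviour of returning unique values in alphabetical order, so that a leading epoch timestamp makes the alphabetical order match the time order. The mvmap function (available in newer Splunk versions) and the pages field name are additions for illustration, not something from the posts in this thread.

```spl
... | eval my_cs_uri_stem = _time . " " . cs_uri_stem
| transaction cs_username maxspan=30m
| eval pages = mvmap(my_cs_uri_stem, replace(my_cs_uri_stem, "^\S+ ", ""))
| table cs_username pages
```

The replace call strips everything up to the first space from each value, which should leave only the page in pages.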

DanielFordWA
Contributor

I think I have worked out the issue. If I look at an individual user's activity for a day and sort on the URL value, I get the same results as from our query above.

It seems that mvindex is working on values ordered by URL (A-Z) and not by the time of the user's hit.

For an individual's activity I see this order (A-Z by URL value):

/SearchResults/
/Toolbox/Dev Tools
/Toolbox/Dev Tools
/Toolbox/Product
/Toolbox/Product
/Toolbox/Service

The query above gives me these next pages:
nextpage1=/Toolbox/Dev Tools
nextpage2=/Toolbox/Product
nextpage3=/Toolbox/Service

How can I fix this?

DanielFordWA
Contributor

Also, the pages are not in the correct order. The page values in nextpage1 have been seen by the user, but not directly before or after the /Search/Results/ page.

I removed the eval m = n - 4 and added:

| eval nextpage1 = mvindex(cs_uri_stem, 1)
| eval nextpage2 = mvindex(cs_uri_stem, 2)
| eval nextpage3 = mvindex(cs_uri_stem, 3)
| stats count by cs_uri_stem nextpage1 nextpage2 nextpage3 cs_username

This shows pages the user has seen, but not in the correct order. I checked this for some users against the timestamps of all their hits in the logs.

DanielFordWA
Contributor

Thanks for the response.

I have tried the search below; however, the cs_uri_stem field contains all types of pages instead of just "/Search/SearchResults/".

sourcetype="iis-2" | extract auto=true | search | transaction cs_username maxspan=30m
| eval n = mvfind(cs_uri_stem, "/Search/SearchResults/")
| eval m = n + 1
| eval nextpage1 = mvindex(cs_uri_stem, m)
| stats count by cs_uri_stem nextpage1
| eval cs_uri_stem=urldecode(cs_uri_stem) | eval nextpage1=urldecode(nextpage1)

kristian_kolb
Ultra Champion

I don't think you should wildcard the string in mvfind() - just make it mvfind(cs_uri_stem, "/SearchResults").

Also, you might need to check whether m is a positive number.

/K
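
A sketch of that check, assuming the same field names as the search in this thread. The reason it matters: mvindex accepts negative indexes and counts them from the end of the list, so when the matched page is one of the first four in the transaction, m = n - 4 goes negative and prevpage4 quietly picks a page from the tail instead of coming back empty. Filtering on n also drops transactions where mvfind found no match, since n is then null. Note that m = 0 is a valid index (the first page), so the test is really for a non-negative value.

```spl
... | eval n = mvfind(cs_uri_stem, "/SearchResults")
| where n >= 4
| eval m = n - 4
| eval prevpage4 = mvindex(cs_uri_stem, m)
| table cs_uri_stem prevpage4
```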

DanielFordWA
Contributor

I tried the following:

sourcetype="iis-2" | extract auto=true | search | transaction cs_username maxspan=30m
| eval n = mvfind(cs_uri_stem, "/SearchResults.*")
| eval m = n - 4
| eval prevpage4 = mvindex(cs_uri_stem, m)
| table cs_uri_stem prevpage4
| eval cs_uri_stem=urldecode(cs_uri_stem)

The data looks a bit odd. I would expect to see /SearchResults/ in the cs_uri_stem field; however, it is populated with all different types of pages.

It would be good to see the number of hits on the search results page, then a list of all combinations of the previous 4 pages and the hits against each.
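
One way this could be sketched, assuming the page values in each transaction are in time order (the mvlist=t option reported elsewhere in this thread) and that only the first search results hit per transaction is of interest, since mvfind returns the first match only. The start_idx, end_idx, and path field names are made up for the example.

```spl
sourcetype="iis-2"
| transaction cs_username maxspan=30m mvlist=t
| eval n = mvfind(cs_uri_stem, "/SearchResults")
| where n >= 4
| eval start_idx = n - 4
| eval end_idx = n - 1
| eval path = mvjoin(mvindex(cs_uri_stem, start_idx, end_idx), " > ")
| stats count by path
| sort - count
```

Each row is one combination of four preceding pages with the number of transactions that followed it. The sum of the count column gives the number of transactions that reached the search results page with at least four pages before it.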

DanielFordWA
Contributor

Sorry, the error was due to my poor regex. Thanks for the answer, it works great.

DanielFordWA
Contributor

Hi, I am trying your solution, thanks for the response.

I use the cs_username field instead of session ID and look at the data over a 1 day range.

I get the error below:

Error in 'eval' command: Regex: nothing to repeat
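
The pattern that triggered this is not shown in the post, so the following is only an illustration. mvfind treats its second argument as a regular expression, not a wildcard string, and "nothing to repeat" is the usual regex complaint when a quantifier such as * has nothing in front of it, as in a pattern that starts with *. A self-contained test with made-up page values:

```spl
| makeresults
| eval pages = split("/Home/,/Search/SearchResults/,/Product/ProductX/", ",")
| eval n = mvfind(pages, "/Product/Product.*")
| table pages n
```

Here n should come back as 2, the zero-based position of the first matching value. Swapping the pattern for something like "*Product*" would be expected to reproduce the error.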

ShaneNewman
Motivator

You should be getting a GUID in IIS for the session as well. Use that to create a transaction. That will let you see each user's entire session, and you can then capture the previous 4 pages viewed from there.

DanielFordWA
Contributor

Hi, currently I have no session ID; I break the data down by user and by day.
