Path Analysis in Splunk

Contributor

Hi,

I use IIS server logs, and each hit has the following fields:

cs_uri_stem = the page the user is on
cs_Referer = the previous page

What I would want to do is track back or forward by 3 or 4 steps.

Every user is identified by the cs_username field.

The question I want to answer is as follows....

For all users that looked at a product page "/Product//Product*/", what were the previous 4 pages looked at before arriving at the product page?

Any ideas?

Thanks,

Dan

1 Solution

Ultra Champion

If you have something like @ShaneNewman suggests, i.e. some form of session identifier, you can get to the value you want like so:

...| transaction sessionID 
| eval n = mvfind(cs_uri_stem, "Product/ProductX") 
| eval m = n - 4 
| eval prevpage4 = mvindex(cs_uri_stem, m) 
| table cs_uri_stem prevpage4

The transaction makes cs_uri_stem a multivalued field which you can search through with mvfind and mvindex.
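
A minimal sketch of the same idea that returns all four previous pages as one path and counts each path. It assumes cs_username stands in for the session identifier (as the poster does further down) and uses "/Product/" as a placeholder pattern; the start_idx, end_idx, prevpages and path field names are illustrative only. mvlist=t is included because, by default, transaction keeps only the unique values of each field in lexicographic order, so positions in the list would not reflect the order of the hits. mvfind returns a zero-based index, mvindex accepts a start and an end index, and the where clause drops sessions with fewer than four pages before the match. It is worth checking the resulting order against the raw timestamps for a few users.

```
... | transaction cs_username maxspan=30m mvlist=t
| eval n = mvfind(cs_uri_stem, "/Product/")
| where n >= 4
| eval start_idx = n - 4
| eval end_idx = n - 1
| eval prevpages = mvindex(cs_uri_stem, start_idx, end_idx)
| eval path = mvjoin(prevpages, " > ")
| stats count by path
| sort - count
```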

http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/CommonEvalFunctions

Hope this helps,

/K

Ultra Champion

Good to hear it worked - you learn something new each day. Thank you.

Contributor

Your advice was good! I just found the solution.

Adding mvlist=t to the transaction command sorted the issue.

http://answers.splunk.com/answers/54955/ordering-of-fields-in-a-transaction-mvfind-bug

Ultra Champion

Oops, sorry for giving you bad advice.

One possible workaround could be to concatenate _time with cs_uri_stem before the transaction:

... | eval my_cs_uri_stem = _time . " " . cs_uri_stem

Then split them again afterwards. It sounds ugly, but it may work.

Contributor

I think I have worked out the issue. If I look at an individual user's activity for a day and sort it on the URL value, I get the same results as from our query above.

It seems that the values mvindex reads are ordered by URL value (A-Z) and not by the time of the user's hit.

For an individual's activity I see this order (A-Z by URL value):

/SearchResults/
/Toolbox/Dev Tools
/Toolbox/Dev Tools
/Toolbox/Product
/Toolbox/Product
/Toolbox/Service

I see the next pages in the above query as:
nextpage1=/Toolbox/Dev Tools
nextpage2=/Toolbox/Product
nextpage3=/Toolbox/Service

How can I fix this?

Contributor

Also, the pages are not in the correct order. The page values in nextpage1 have been seen by the user, but not before or after the /Search/Results/ page.

I removed the eval m = n - 4 line and added:

| eval nextpage1 = mvindex(cs_uri_stem, 1)
| eval nextpage2 = mvindex(cs_uri_stem, 2)
| eval nextpage3 = mvindex(cs_uri_stem, 3)
| stats count by cs_uri_stem nextpage1 nextpage2 nextpage3 cs_username

This shows pages the user has seen, but not in the correct order. I checked this for some users by looking at the timestamps of all of each user's hits in the logs.

Contributor

Thanks for the response.

I have tried the search below; however, the cs_uri_stem field contains all types of pages instead of just "/Search/SearchResults/".

sourcetype="iis-2" | extract auto=true | search | transaction cs_username maxspan=30m
| eval n = mvfind(cs_uri_stem, "/Search/SearchResults/")
| eval m = n + 1
| eval nextpage1 = mvindex(cs_uri_stem, m)
| stats count by cs_uri_stem nextpage1
| eval cs_uri_stem=urldecode(cs_uri_stem) | eval nextpage1=urldecode(nextpage1)

Ultra Champion

I don't think you should wildcard the string in mvfind() - just make it mvfind(cs_uri_stem, "/SearchResults").

Also, you might need to check whether m is a positive number.
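
The check matters because mvindex treats a negative index as counting back from the end of the list, so a match near the start of a session would silently return a page from the end instead of nothing. Since the index is zero-based, the condition is really "m is not negative". One way to write the guard, sketched with the field names already used in this thread:

```
| eval n = mvfind(cs_uri_stem, "/SearchResults")
| eval m = n - 4
| eval prevpage4 = if(m >= 0, mvindex(cs_uri_stem, m), null())
```

If mvfind finds no match, n and m are null, the condition is not met, and prevpage4 stays empty.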

/K

Contributor

I tried the following.

sourcetype="iis-2" | extract auto=true | search | transaction cs_username maxspan=30m
| eval n = mvfind(cs_uri_stem, "/SearchResults.*")
| eval m = n - 4
| eval prevpage4 = mvindex(cs_uri_stem, m)
| table cs_uri_stem prevpage4
| eval cs_uri_stem=urldecode(cs_uri_stem)

The data looks a bit odd. I would expect to have /SearchResults/ in the cs_uri_stem field; however, it is populated with all different types of pages.

It would be good to see the number of hits on the search results page, then a list of all combinations of the previous 4 pages and the hits against them.

Contributor

Sorry, the error was due to my poor regex. Thanks for the answer, it works great.

Contributor

Hi, I am trying your solution, thanks for the response.

I use the cs_username field instead of session ID and look at the data over a 1 day range.

I get the below error.

Error in 'eval' command: Regex: nothing to repeat

Motivator

You should be getting a GUID in IIS as well for the session. Use that to create a transaction. That will allow you to see each user's entire session; you can then capture the previous 4 pages viewed from there.

Contributor

Hi, currently I have no session ID; I break the data down by user and by day.
