
Splunk App for WebAnalytics - Run lookup "Generate user sessions" not selecting records

dchima
Path Finder

Greetings All,

After installing the app, I began following these setup steps:

These steps need to be done in order.
1. Get the data in. Use one of these sourcetype names "access_common", "access_combined", "iis", "apache:access" or "aws:cloudfront:accesslogs".
2. Configure the sites that you want to monitor.
3. Run "Generate Session" and "Generate pages" lookup searches.

At step 3, the "Generate Session" lookup search scans records but selects none. I reviewed the Job Inspector details after the run: no records are selected because my events have the eventtype "non-pageview" rather than "pageview", which is what the search requires.

partial search example -- search (eventtype=pageview site=*)

Please suggest next steps. Once I get past this, I can do steps 4 and 5:

  4. Enable Data Model Acceleration for Web datamodel.
  5. Configure goals (Optional)

Note:
Websites are set up and the 'site' field is populated.
tag=web -- brings back data
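
Since tag=web returns events, one way to see how those events are being classified is to count them per sourcetype and eventtype. This is a minimal sketch and not part of the app; it assumes the web tag covers the same events that the app's eventtypes are built on.

```
tag=web
| stats count by sourcetype, eventtype
```

If "pageview" never appears in the eventtype column, the events are reaching the app but are not meeting the pageview definition.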


dchima
Path Finder

I was able to figure this out.

1) I did not need any rules for changing my sourcetype, as I was exporting raw logs from my enterprise Splunk and then placing them in a file share.

2) My biggest pain point was that my Apache servers were logging with a custom format string that was not suitable for this app.

I would suggest the following for other Splunkers to try if you're having issues.

Run this query to see how closely your Apache data aligns with the fields the app relies on:

eventtype=pageview site=*
| eval time=_time
| eval http_referer = _time."_".http_referer
| eval http_referer_domain = _time."_".http_referer_domain
| eval http_referer_hostname = _time."_".http_referer_hostname
| fields _time time http_referer http_referer_domain http_referer_hostname site clientip http_user_agent http_request
| table _time time http_referer http_referer_domain http_referer_hostname site clientip http_user_agent http_request _raw

If things don't look aligned, delete the events from the index associated with your Apache logs, then delete or zero out the actual log files on disk.

On your Apache guest, change the log format string to something like the one below, then restart Apache.

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"%D" combined

Run a few transactions to populate the Apache logs; you can then collect them directly from Apache and copy them into your Splunk Apache log.

Once the above is done, you should be able to complete the rest of the steps.


dchima
Path Finder

I got a bit further today. Records are now being selected initially for "Generate user sessions", but the search then fails on the part of the SPL shown below. I think my main issue at this point is that many of the fields it uses are not present in my indexed records. I exported these records from my enterprise Splunk as raw events; they were originally apache_combined, but I rename the sourcetype to access_combined in my local props.conf. The rename appears to work fine.

The area I need assistance with is the mapping in my "/local/props.conf": many of the fields are not mapping across, and that is causing things not to work.

stats first(site) as site, first(user) as user, first(time) AS http_session_start, last(time) AS http_session_end, count(http_request) AS http_session_pageviews, first(duration) as http_session_duration, first(http_referer) as http_session_referrer, first(http_referer_domain) as http_session_referrer_domain, first(http_referer_hostname) as http_session_referrer_hostname by time,http_session
| search user=*
| eval http_session_referrer=replace(http_session_referrer,"^[0-9]*_",""), http_session_referrer_domain=if((http_session_referrer == "-"),"-",replace(http_session_referrer_domain,"^[0-9]*_","")), http_session_referrer_hostname=if((http_session_referrer == "-"),"-",replace(http_session_referrer_hostname,"^[0-9]*_",""))
| lookup WA_channels Hostname AS http_session_referrer_hostname OUTPUT Channel AS http_session_channel
| eval http_session_channel=if((http_session_referrer == "-"),"Direct",if(like(site,("%" . http_session_referrer_domain)),"Direct",if((isnull(http_session_channel) AND isnotnull(http_session)),"Referal",http_session_channel)))
| ifields + acceleration, datamodel_update_time, count, _time, site, user, http_session, http_session_start, http_session_end, http_session_pageviews, http_session_duration, http_session_referrer, http_session_referrer_domain, http_session_referrer_hostname, http_session_channel
| outputlookup WA_sessions createinapp=true


jbjerke_splunk
Splunk Employee

Hi dchima

The problem is most likely due to fields not being extracted correctly. The most important field for eventtype=pageview to work is "file". Check whether this field is extracted correctly; if not, try to fix the extraction.
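
A quick way to check this is to count how many events have the "file" field at all. The search below is only a sketch; it assumes the events carry the web tag, and the file_extracted field name is made up for illustration.

```
tag=web
| eval file_extracted=if(isnotnull(file), "yes", "no")
| stats count by sourcetype, file_extracted
```

A sourcetype where most events fall under "no" is a likely candidate for a broken or missing extraction.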

Because you can define whichever log format you want with Apache and NGINX, the app is not guaranteed to work automatically with your configuration; it assumes the default format. The app also comes with an alternative configuration under the sourcetype "apache:access", which might work better for you. This configuration corresponds to the one defined in the Add-on for Apache:
https://docs.splunk.com/Documentation/AddOns/released/ApacheWebServer/Configure

Please also review the troubleshooting steps here:
https://splunkbase.splunk.com/app/2699/#/details

Let me know how you get along.

j


dchima
Path Finder

Hello jbjerke,

Thank you for responding.

Initially my events had the eventtype "non-pageview" rather than "pageview", which is what the "Generate user sessions" search requires.

I looked at SplunkAppForWebAnalytics/default/eventtypes.conf and found how the two eventtypes are assigned:

[pageview]
search = eventtype=web-traffic status=200 NOT (eventtype=web-uri-nonpage OR eventtype=ua-bot OR eventtype=exclude-pageview OR eventtype=clientip-internal) (http_method=GET OR NOT http_method=*)

[non-pageview]
search = eventtype=web-traffic eventtype!=pageview

My problem was that 'status' was not being extracted, which is why the eventtype was non-pageview. I created an extraction rule for 'status', deleted the records from my apache.log and from the associated index, then reloaded the records into apache.log, which repopulated the index. The eventtype was now correct because 'status' was found.
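
For anyone writing a similar rule, a regular expression can be tried at search time with rex before it is made a permanent extraction. The sketch below is an assumption-laden example, not the rule used here: it assumes the sourcetype is access_combined and that the status code is the three-digit number right after the first quoted string (the request line), as in the Apache combined format. The pattern will likely need adjusting for a custom format.

```
sourcetype=access_combined
| rex field=_raw "^[^\"]*\"[^\"]*\"\s+(?<status>\d{3})\s"
| stats count by status
```

If the counts show sensible HTTP status codes, the same pattern is a reasonable starting point for a search-time extraction in props.conf.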

Then I tried to generate user sessions again. It made it past the first part but failed on the second part of the SPL.

1st part--

search (eventtype=pageview site=*)
| eval time='_time', http_referer=(('_time' . "_") . http_referer), http_referer_domain=(('_time' . "_") . http_referer_domain), http_referer_hostname=(('_time' . "_") . http_referer_hostname)
| fields + _time, time, http_referer, http_referer_domain, http_referer_hostname, site, clientip, http_user_agent, http_request
| transaction site clientip http_user_agent maxpause=30m maxspan=4h keepevicted=f
| eval user=md5(((clientip . "_") . http_user_agent)), http_session=md5(((((clientip . "_") . http_user_agent) . "_") . '_time'))

2nd part--
| stats first(site) as site, first(user) as user, first(time) AS http_session_start, last(time) AS http_session_end, count(http_request) AS http_session_pageviews, first(duration) as http_session_duration, first(http_referer) as http_session_referrer, first(http_referer_domain) as http_session_referrer_domain, first(http_referer_hostname) as http_session_referrer_hostname by time,http_session
| search user=*
| eval http_session_referrer=replace(http_session_referrer,"^[0-9]*_",""), http_session_referrer_domain=if((http_session_referrer == "-"),"-",replace(http_session_referrer_domain,"^[0-9]*_","")), http_session_referrer_hostname=if((http_session_referrer == "-"),"-",replace(http_session_referrer_hostname,"^[0-9]*_",""))
| lookup WA_channels Hostname AS http_session_referrer_hostname OUTPUT Channel AS http_session_channel
| eval http_session_channel=if((http_session_referrer == "-"),"Direct",if(like(site,("%" . http_session_referrer_domain)),"Direct",if((isnull(http_session_channel) AND isnotnull(http_session)),"Referal",http_session_channel)))
| ifields + acceleration, datamodel_update_time, count, _time, site, user, http_session, http_session_start, http_session_end, http_session_pageviews, http_session_duration, http_session_referrer, http_session_referrer_domain, http_session_referrer_hostname, http_session_channel
| outputlookup WA_sessions createinapp=true
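
To check whether the first part produces usable sessions before the stats step runs, the transaction output can be summarised on its own. This is a sketch rather than anything from the app: it reuses the transaction options shown above and relies on the duration and eventcount fields that the transaction command adds to each result. The output field names are made up for illustration.

```
eventtype=pageview site=*
| transaction site clientip http_user_agent maxpause=30m maxspan=4h keepevicted=f
| stats count AS sessions, avg(eventcount) AS avg_pageviews, avg(duration) AS avg_duration_seconds by site
```

If this returns no rows, the problem is upstream of the stats step, most likely in site, clientip or http_user_agent.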

*** Note -- I will try what you suggested next and let you know how things go...


jbjerke_splunk
Splunk Employee

I'm not sure what you mean by "failed" in your example. This error is most likely due to your log file not conforming to what the app expects.

Have you tried access_combined and apache:access as sourcetype names?

If neither of these works, you need to make sure all field extractions work. As you can see from the query, there are numerous fields that need to be extracted. Here are some examples:
status
file
http_request
http_referer

You can look in props.conf for more examples. You can also check the Web datamodel.
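
One way to see which of these fields are actually populated is to count non-null values per sourcetype. This is a hedged sketch: it assumes the web-traffic eventtype from the app's eventtypes.conf matches your events, and the with_* names are only illustrative.

```
eventtype=web-traffic
| stats count AS events, count(status) AS with_status, count(file) AS with_file, count(http_request) AS with_http_request, count(http_referer) AS with_http_referer by sourcetype
```

Any with_* column that is much lower than events points to an extraction that is not matching the log format.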

j


dchima
Path Finder

Hello jbjerke,

Yes, I understand that the log file does not conform to what the app expects. In my last comment I was simply stating that the second part of the SPL is not satisfied because of missing fields.

Currently my props.conf renames apache_combined to access_combined, and that is not populating all the fields the app expects. I will take your suggestion of switching to apache:access and let you know how things turn out. Below is the first stanza of my props.conf; I'll change it now, delete the records from the index, and reload the log files to see what happens.

[t-splunk@lgtisplunk1 default]$ cd /apps/splunk/etc/users/admin/SplunkAppForWebAnalytics/local/
[t-splunk@lgtisplunk1 local]$ cat props.conf

# Added this: first, rename the apache_combined sourcetype to access_combined, as the app requires it.

[apache_combined]
rename = access_combined
