Field Extractions Never Appear

vbrtrmn
Explorer

Starting a new project with Adobe's CQ5...

I'm starting with the access log, as it is straightforward.

I've done field extractions before for another custom log type, and they worked great. Now I can't seem to get any of my extractions to appear in the Search.

Walkthrough:

  • Created an index called adobe_cq5.
  • Created various "file" type data inputs using various CQ5 log files, setting the index to adobe_cq5. The one I started with is called cq5-access.
  • Go into Search and verify that I'm getting good data, which I am.
  • Select the drop down next to the first log line and click Extract Fields.
  • Under Generated Pattern, click Edit.
  • Put in a basic regex: ^(?P<FIELDNAME>\d+\.\d+\.\d+\.\d+?)
  • Click "Apply"
  • Check several lines to make sure the IP addresses are selected.
  • Click "Save"
  • In "Save Field Extraction" enter ip_address for the field name.
  • Click "Save"
  • Click "Close" on "Successfully Saved" dialog.
  • Reload the Search page.
  • Note that ip_address is not appearing in the log line list as it has for past projects.
  • Click "Pick fields"
  • Note that ip_address does not appear in Available Fields.
  • Go back to Extract Fields
  • Enter in: ^(?P<FIELDNAME>\d+\.\d+\.\d+\.\d+?)
  • Get two errors:
  • --Note: the values you want may already be extracted in the 'ip_address' field.
  • --Note: This regex already extracts ip_address for cq5-access.
  • Close out of Extract Fields
  • Browse to Manager » Fields » Field extractions
  • Verify extraction: cq5-access : EXTRACT-ip_address
  • Click Permissions, give Everyone Read permission, and set "Object should appear in" to "This app only (search)".
  • Click Save
  • Re-Check the search page, ip_address still does not appear.
  • Open up terminal
  • cat /opt/splunk/etc/apps/search/local/props.conf
  • Verify extraction: EXTRACT-ip_address = ^(?P<ip_address>\d+\.\d+\.\d+\.\d+?)

For my last project, I simply entered the Extract Fields tool, entered my regex, saved and the data appeared right in the Search.

props.conf for modified extraction

[cq5-access]
EXTRACT-ip_address = ^(?P<ip_address>\d+\.\d+\.\d+\.\d+?)

props.conf with original full extraction

[cq5-access]
EXTRACT-ip_address-username-day-month-year-hour-minute-second-http_type-http_request-http_code-referer-user_agent = ^(?P<ip_address>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s.+?\s(?P<username>.+?)\s(?P<day>\d\d)/(?P<month>\w\w\w)/(?P<year>\d\d\d\d):(?P<hour>\d\d):(?P<minute>\d\d):(?P<second>\d\d)\s.+?\s"(?P<http_type>\w+?)\s(?P<http_request>.+?)\sHTTP.+?"\s(?<http_code>\d+?)\s.+?\s"(?P<referer>.+?)"\s"(?P<user_agent>.+?)"

Sample data:

10.71.40.57 - admin 23/Apr/2013:16:15:14 -0400 "GET /crx/server/crx.default/jcr%3aroot/etc/map/http.1.json?_dc=1366748119022&node=xnode-339 HTTP/1.1" 200 175 "https://twcc-ci01.lab.webapps.rr.com:4602/crx/de/index.jsp" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:20.0) Gecko/20100101 Firefox/20.0"
10.71.40.57 - admin 23/Apr/2013:16:15:13 -0400 "GET /crx/de/icons/crxde_favicon.ico HTTP/1.1" 200 295606 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:20.0) Gecko/20100101 Firefox/20.0"
127.0.0.1 - admin 23/Apr/2013:16:42:31 -0400 "GET /bin/receive?sling:authRequestLogin=1 HTTP/1.1" 200 32 "-" "Jakarta Commons-HttpClient/3.1"
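
One way to rule out the regex itself is to run the full extraction against a sample line outside Splunk. The following is a minimal sketch in Python, assuming Python's re module behaves closely enough to Splunk's regex engine for this pattern. The only change to the pattern is that (?<http_code>...) is written as (?P<http_code>...), because Python does not accept the shorter spelling. The names ACCESS_RE and SAMPLE are made up for this illustration.

```python
import re

# Full extraction from props.conf, adapted for Python's re module:
# (?<http_code>...) becomes (?P<http_code>...), nothing else is changed.
ACCESS_RE = re.compile(
    r'^(?P<ip_address>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s.+?\s(?P<username>.+?)\s'
    r'(?P<day>\d\d)/(?P<month>\w\w\w)/(?P<year>\d\d\d\d):'
    r'(?P<hour>\d\d):(?P<minute>\d\d):(?P<second>\d\d)\s.+?\s'
    r'"(?P<http_type>\w+?)\s(?P<http_request>.+?)\sHTTP.+?"\s'
    r'(?P<http_code>\d+?)\s.+?\s"(?P<referer>.+?)"\s"(?P<user_agent>.+?)"'
)

# Third line of the sample data above.
SAMPLE = (
    '127.0.0.1 - admin 23/Apr/2013:16:42:31 -0400 '
    '"GET /bin/receive?sling:authRequestLogin=1 HTTP/1.1" 200 32 "-" '
    '"Jakarta Commons-HttpClient/3.1"'
)

match = ACCESS_RE.match(SAMPLE)
fields = match.groupdict() if match else {}

# Print every named group so a missing or truncated field is easy to spot.
for name, value in fields.items():
    print(f'{name} = {value}')
```

If every field prints as expected here, the pattern matches the data, and the missing field in Search is more likely a matter of where and how the extraction is applied than of the regex.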

jklumpp_splunk
Splunk Employee

This isn't necessarily related to your problem, but I don't think your regex will give you the expected results. The lazy modifier (?) at the end of your regex should cause the last section of the IP address to stop at only 1 digit, so if you have an IP that ends with 2 or 3 digits you won't get those. I believe the ip_address extraction in the original full extraction will work better.

Also, I've seen some unexpected results in Splunk when using the start of line character (^) so I try where possible not to use them. Here is a modified regex that removes the ^ (I look for the pattern following that IP in your example data instead) and updates the lazy modifier. Give it a shot...

(?P<ip_address>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s-
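
As a quick check outside Splunk, the unanchored pattern can be tried against the start of the sample lines. This is only a sketch in Python, with the dots escaped so they match literal periods; IP_RE and SAMPLE_PREFIXES are names invented for the example, and Python's re module is assumed to behave like Splunk's regex engine for a pattern this simple.

```python
import re

# Unanchored IP pattern followed by the " -" that trails the IP in the sample data.
IP_RE = re.compile(r'(?P<ip_address>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s-')

# Leading portions of two lines from the sample data.
SAMPLE_PREFIXES = [
    '10.71.40.57 - admin 23/Apr/2013:16:15:14 -0400',
    '127.0.0.1 - admin 23/Apr/2013:16:42:31 -0400',
]

extracted = [IP_RE.search(line).group('ip_address') for line in SAMPLE_PREFIXES]
print(extracted)  # ['10.71.40.57', '127.0.0.1']
```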

carasso
Splunk Employee

This is correct. As an example:

>>> import re
>>> re.findall(r'^(?P<ip_address>\d+\.\d+\.\d+\.\d+?)', '10.71.40.57 -')
['10.71.40.5']
>>> re.findall(r'^(?P<ip_address>\d+\.\d+\.\d+\.\d+)', '10.71.40.57 -')
['10.71.40.57']


kristian_kolb
Ultra Champion

Is cq5-access the sourcetype or a filename you're reading?

I'd try to use underscores instead of dashes in all names (sourcetypes, fields, anything), where possible. There have been issues with these not showing up when names have contained dashes.
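
To find names that would need renaming, something like the following rough Python sketch could scan a props.conf for dashes in stanza names and in the class part of EXTRACT-<class> keys. It assumes the file is simple enough for configparser to read, which is not true of every Splunk .conf file, and names_with_dashes and PROPS_TEXT are hypothetical names used only here.

```python
import configparser

# Stanza copied from the question; a raw string keeps the backslashes intact.
PROPS_TEXT = r"""
[cq5-access]
EXTRACT-ip_address = ^(?P<ip_address>\d+\.\d+\.\d+\.\d+?)
"""

def names_with_dashes(props_text):
    """Return stanza names and EXTRACT class names that contain a dash."""
    parser = configparser.RawConfigParser()  # no % interpolation
    parser.optionxform = str                 # keep key case as written
    parser.read_string(props_text)
    flagged = []
    for stanza in parser.sections():
        if '-' in stanza:
            flagged.append(stanza)
        for key in parser.options(stanza):
            # The dash right after EXTRACT is required syntax; only the
            # class name that follows it is checked.
            prefix, _, class_name = key.partition('-')
            if prefix == 'EXTRACT' and '-' in class_name:
                flagged.append(key)
    return flagged

print(names_with_dashes(PROPS_TEXT))  # ['cq5-access']
```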

http://splunk-base.splunk.com/answers/48611/bug-in-interactive-field-extractor-ifx

/K
