Splunk Search

Field Extractions Never Appear

vbrtrmn
Explorer

Starting a new project with Adobe's CQ5...

I'm starting with the access log, as it is straightforward.

I've done field extractions before for another custom log type, and they worked great. Now I can't seem to get any of my extractions to appear in the Search.

Walkthrough:

  • Created an index called adobe_cq5.
  • Created various "file" type data inputs using various CQ5 log files, setting the index to adobe_cq5. The one I started with is called cq5-access.
  • Go into Search and verify that I'm getting good data, which I am.
  • Select the drop down next to the first log line and click Extract Fields.
  • Under Generated Pattern, click Edit.
  • Put in a basic regex: ^(?P<FIELDNAME>\d+\.\d+\.\d+\.\d+?)
  • Click "Apply"
  • Check several lines to make sure the IP addresses are selected.
  • Click "Save"
  • In "Save Field Extraction" enter ip_address for the field name.
  • Click "Save"
  • Click "Close" on "Successfully Saved" dialog.
  • Reload the Search page.
  • Note that ip_address is not appearing in the log line list as it has for past projects.
  • Click "Pick fields"
  • Note that ip_address does not appear in Available Fields.
  • Go back to Extract Fields
  • Enter in: ^(?P<FIELDNAME>\d+\.\d+\.\d+\.\d+?)
  • Get two errors:
  • --Note: the values you want may already be extracted in the 'ip_address' field.
  • --Note: This regex already extracts ip_address for cq5-access.
  • Close out of Extract Fields
  • Browse to Manager » Fields » Field extractions
  • Verify extraction: cq5-access : EXTRACT-ip_address
  • Click Permissions, give Everyone Read permission, and set "Object should appear in" to "This app only (search)".
  • Click Save
  • Re-Check the search page, ip_address still does not appear.
  • Open up terminal
  • cat /opt/splunk/etc/apps/search/local/props.conf
  • Verify extraction: EXTRACT-ip_address = ^(?P<ip_address>\d+\.\d+\.\d+\.\d+?)

For my last project, I simply entered the Extract Fields tool, entered my regex, saved and the data appeared right in the Search.

props.conf for modified extraction

[cq5-access]
EXTRACT-ip_address = ^(?P<ip_address>\d+\.\d+\.\d+\.\d+?)

props.conf with original full extraction

[cq5-access]
EXTRACT-ip_address-username-day-month-year-hour-minute-second-http_type-http_request-http_code-referer-user_agent = ^(?P<ip_address>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s.+?\s(?P<username>.+?)\s(?P<day>\d\d)/(?P<month>\w\w\w)/(?P<year>\d\d\d\d):(?P<hour>\d\d):(?P<minute>\d\d):(?P<second>\d\d)\s.+?\s"(?P<http_type>\w+?)\s(?P<http_request>.+?)\sHTTP.+?"\s(?<http_code>\d+?)\s.+?\s"(?P<referer>.+?)"\s"(?P<user_agent>.+?)"

Sample data:

10.71.40.57 - admin 23/Apr/2013:16:15:14 -0400 "GET /crx/server/crx.default/jcr%3aroot/etc/map/http.1.json?_dc=1366748119022&node=xnode-339 HTTP/1.1" 200 175 "https://twcc-ci01.lab.webapps.rr.com:4602/crx/de/index.jsp" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:20.0) Gecko/20100101 Firefox/20.0"
10.71.40.57 - admin 23/Apr/2013:16:15:13 -0400 "GET /crx/de/icons/crxde_favicon.ico HTTP/1.1" 200 295606 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:20.0) Gecko/20100101 Firefox/20.0"
127.0.0.1 - admin 23/Apr/2013:16:42:31 -0400 "GET /bin/receive?sling:authRequestLogin=1 HTTP/1.1" 200 32 "-" "Jakarta Commons-HttpClient/3.1"
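
One way to rule out the regex itself is to run the full extraction against a sample line outside Splunk. The following is a minimal sketch using Python's re module, which is close to but not identical to the PCRE engine Splunk uses. The names FULL_EXTRACTION, SAMPLE, and extract_fields are made up for this illustration, and (?<http_code>...) is rewritten as (?P<http_code>...) because Python's re only accepts the (?P<name>...) form.

```python
import re

# Full extraction from props.conf. Assumption for this sketch: the
# (?<http_code>...) group is rewritten as (?P<http_code>...) because
# Python's re module only accepts the (?P<name>...) syntax.
FULL_EXTRACTION = re.compile(
    r'^(?P<ip_address>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s.+?\s(?P<username>.+?)\s'
    r'(?P<day>\d\d)/(?P<month>\w\w\w)/(?P<year>\d\d\d\d):'
    r'(?P<hour>\d\d):(?P<minute>\d\d):(?P<second>\d\d)\s.+?\s'
    r'"(?P<http_type>\w+?)\s(?P<http_request>.+?)\sHTTP.+?"\s'
    r'(?P<http_code>\d+?)\s.+?\s"(?P<referer>.+?)"\s"(?P<user_agent>.+?)"'
)

# Third line of the sample data, copied verbatim.
SAMPLE = ('127.0.0.1 - admin 23/Apr/2013:16:42:31 -0400 '
          '"GET /bin/receive?sling:authRequestLogin=1 HTTP/1.1" 200 32 "-" '
          '"Jakarta Commons-HttpClient/3.1"')


def extract_fields(line):
    """Return a dict of named groups, or None if the line does not match."""
    match = FULL_EXTRACTION.match(line)
    return match.groupdict() if match else None


fields = extract_fields(SAMPLE)
print(fields)
```

If this prints a dict with every field populated, the pattern matches the data, which would suggest the problem lies in how the extraction is attached (stanza name, permissions, app context) rather than in the regex.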

jklumpp_splunk
Splunk Employee

This isn't necessarily related to your problem, but I don't think your regex will give you the expected results. The lazy modifier (?) at the end of your regex should cause the last section of the IP address to stop at only one digit, so for an IP that ends with two or three digits you won't get the remaining digits. I believe the ip_address extraction in the original full extraction will work better.

Also, I've seen some unexpected results in Splunk when using the start of line character (^) so I try where possible not to use them. Here is a modified regex that removes the ^ (I look for the pattern following that IP in your example data instead) and updates the lazy modifier. Give it a shot...

(?P<ip_address>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s-
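
As a quick sanity check, the unanchored pattern can be tried against the sample data in Python. This is only a sketch: IP_PATTERN, samples, and extracted are names invented here, the sample lines are abbreviated (referer and user agent dropped), the dots are written escaped so they match only literal periods, and Python's re is not guaranteed to behave identically to Splunk's regex engine.

```python
import re

# Unanchored IP pattern: relies on the " -" that follows the IP in the
# access log instead of the start-of-line anchor (^).
IP_PATTERN = re.compile(r'(?P<ip_address>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s-')

# Abbreviated copies of two sample lines from the question.
samples = [
    '10.71.40.57 - admin 23/Apr/2013:16:15:13 -0400 '
    '"GET /crx/de/icons/crxde_favicon.ico HTTP/1.1" 200 295606',
    '127.0.0.1 - admin 23/Apr/2013:16:42:31 -0400 '
    '"GET /bin/receive?sling:authRequestLogin=1 HTTP/1.1" 200 32',
]

# search() scans the whole line, so no ^ anchor is needed.
extracted = [IP_PATTERN.search(line).group('ip_address') for line in samples]
print(extracted)  # ['10.71.40.57', '127.0.0.1']
```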

carasso
Splunk Employee

This is correct. As an example:

>>> import re
>>> re.findall(r'^(?P<ip_address>\d+\.\d+\.\d+\.\d+?)', '10.71.40.57 -')
['10.71.40.5']
>>> re.findall(r'^(?P<ip_address>\d+\.\d+\.\d+\.\d+)', '10.71.40.57 -')
['10.71.40.57']


kristian_kolb
Ultra Champion

Is cq5-access the sourcetype or a filename you're reading?

I'd try to use underscores instead of dashes in all names (sourcetypes, fields, anything), where possible. There have been issues with these not showing up when names have contained dashes.

http://splunk-base.splunk.com/answers/48611/bug-in-interactive-field-extractor-ifx

/K
