Regular expressions

I'm new to writing regular expressions and am having a difficult time building a field using Extract Fields. Unfortunately, Splunk is unable to automagically create one in this case.
There is a series of events I'm trying to monitor; a sample of them follows:

I'd like to isolate the county names (washington, broome, cattaraugus, dutchess) in the string and create a field for them.

Is the path to the file a part of the log messages themselves or is it simply the source (i.e. inputs.conf)? If it's simply the source, then you would have to do something like this...

rex field=source "F\:\\mssql\\backups\\(?P<county>\w*)\\.*"

The problem with that statement is Windows and their cursed backslash paths. When you put a backslash in front of the open parenthesis it kills the statement. I tried to escape the backslash but it doesn't seem to be working for me. If you can figure your way around it, you got your answer.

Edit: You have to triple-slash it to fully escape the path...

rex field=source "F\:\\\mssql\\\backups\\\(?P<county>\w*)\\.*"


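The (?P<county>\w*) piece is a named capture group, which is the syntax Splunk uses to extract a field. You won't be able to test it in Rubular, because Rubular runs Ruby's regex engine, which does not accept the (?P<name>...) form.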

So what you're saying is that the literal string "F:\mssql\backups\..." is in the event data itself? In that case, just drop the "field=source" and it'll extract from _raw.
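For anyone who wants to check the named group outside Splunk: Python's re module accepts the same (?P<name>...) syntax that Rubular rejects, so it can serve as a rough stand-in tester. The sketch below is only an illustration. The event strings (the "Backup written to" wording and the full.bak file name) are made up, since the real sample events are not shown in this thread, and \w+ is used instead of \w* so that an empty county never matches. Splunk's own handling of backslashes inside a quoted search string is a separate layer that this does not reproduce.

```python
import re

# Hypothetical event text: only the "F:\mssql\backups\" prefix and the
# county names come from the thread; everything else is invented.
events = [
    r"Backup written to F:\mssql\backups\washington\full.bak",
    r"Backup written to F:\mssql\backups\broome\full.bak",
]

# In a raw string, "\\" is the regex for one literal backslash.
# (?P<county>\w+) is a named capture group called "county".
pattern = re.compile(r"F:\\mssql\\backups\\(?P<county>\w+)\\")


def extract_county(event):
    """Return the county segment of the backup path, or None if absent."""
    match = pattern.search(event)
    return match.group("county") if match else None


counties = [extract_county(e) for e in events]
print(counties)
```

In Splunk the group name plays the same role: whatever the group captures ends up in a field called county.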

Thank you,
I'm testing with Rubular and noticed that the following piece of your string perfectly isolates the first half of the issue.
When I add (?P<county>\w*)\\.* I get an "undefined group option" error.

I believe you're using the (?P<county>...) piece to create a named capture group? Is this necessary when the goal is to create a permanent field extraction?

To answer your question: no, I'm not pulling this information from the directory\file source. I'm looking in a Windows application log.

Thanks again for your help.

The regex you would use to extract your requirement from a stream of data would be


but since these seem to be a series of Windows-based directories you are monitoring files from, then the filename will not actually appear as a field within the data, but as the source in the metadata. You will note the doubling of the backslash. This is because regexes originate in the Unix world, where the path separator is a forward slash, and where the backslash has a special meaning - specifically to "escape" special characters (to force them to lose their special properties).
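The effect of the doubled backslash can be seen in a small sketch. It uses Python's re module purely as a convenient tester (an assumption on my part, not something Splunk runs), and the path is just the prefix quoted earlier in this thread. A single backslash before "b" is read as the regex word-boundary escape, so the unescaped pattern silently fails to match, while the doubled backslash matches one literal backslash.

```python
import re

# Raw string, so this holds single backslashes exactly as a Windows path would.
path = r"F:\mssql\backups"

# Unescaped: "\b" is a word-boundary assertion, so the regex really means
# "mssql", then a boundary, then "ackups". That text is not in the path.
unescaped_hit = re.search(r"mssql\backups", path)

# Escaped: "\\" in the regex stands for exactly one literal backslash.
escaped_hit = re.search(r"mssql\\backups", path)

print(unescaped_hit is None)
print(escaped_hit.group(0))
```

Inside a Splunk search string there is an additional quoting layer on top of the regex one, which is why the earlier answer ended up needing extra backslashes.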

I'm not sure how or where you would configure the regex to get the field you want as data. You could get the data into the "host" field by using the "host_segment" parameter in your inputs.conf definition.

Thank you, but unfortunately when I run that command through Rubular to test it, it comes back with some errors.

I'm not directly monitoring a series of physical files in their directories. I'm looking through the application logs on the server these files are created on. I have 50+ databases backed up to disk on this server each evening, and I'm trying to create a query that will let me see success/failure lined up with an inputlookup CSV file of the county names, so my first task is to identify each event by its county name.
