Splunk Search

Problem with regular expression

Communicator

Hi everyone, I have created a regular expression query that matches one specific folder in a long list of pathnames and then cuts off everything after that folder:

   index=main | rex "\s\-\s\[(?<path_dd>.+)\specific_folder" | dedup path_dd | eval path="file:read:"+path_dd+"*" | sort by path | table path | outputlookup output.csv append=True

Next, I added an inputlookup table at the start of the query. This table also contains path names, and there is only one field per line: path

So I tried to edit the query like this:

 | inputlookup write_rules.csv | rex "(?<path_dd>.+)\/specific_folder" | table path_dd

But it's not working. Can anyone help me?
Thank you

Example of the file:

/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/host-manager/loader
/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/examples/loader
/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/xwiki_oracle/loader
/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/xwiki_oracle/SESSIONS.ser
/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/docs/loader
/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/manager/loader
/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost//loader
/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/xwiki_oracle/xwiki-temp/aether-repository/org/apache/maven/doxia/doxia-core/1.3/maven.repositories
/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/xwiki_oracle/xwiki-temp/aether-repository/org/apache/maven/doxia/doxia-core/1.3/doxia-core-1.3.pom
/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/xwiki_oracle/xwiki-temp/aether-repository/org/apache/maven/doxia/doxia-core/1.3/doxia-core-1.3.pom.ahc26f05574a43e4fce
/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/xwiki_oracle/xwiki-temp/aether-repository/org/apache/maven/doxia/doxia-core/1.3/doxia-core-1.3.pom.sha1.ahca7be2b392cec49e7
/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/xwiki_oracle/xwiki-temp/aether-repository/org/apache/maven/doxia/doxia-core/1.3/doxia-core-1.3.pom.sha1

1 Solution

SplunkTrust

Thanks, the sample entries are helpful, but I believe the problem is not in the regex.

When you run this, you have a Splunk built-in field called '_raw'. This is the default field that a rex statement works on.

index=main  | rex "\s\-\s\[(?<path_dd>.+)\specific_folder" | dedup path_dd | eval path="file:read:"+path_dd+"*" | sort by path| table path | outputlookup output.csv append=True

So the statement | rex "\s\-\s\[(?<path_dd>.+)\specific_folder" is the same as | rex field=_raw "\s\-\s\[(?<path_dd>.+)\specific_folder"

Whereas, when you run this (with inputlookup), there is no field named _raw. So here you have to specify the field from which path_dd will be extracted.

| inputlookup write_rules.csv | rex "(?<path_dd>.+)\/specific_folder" | table path_dd

So, replace | rex "(?<path_dd>.+)\/specific_folder" with | rex field=fieldFromCSVFile "(?<path_dd>.+)\/specific_folder"
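
Applied to the lookup from the question, where the only field is said to be named path, the corrected search might look like the sketch below. The field name path is taken from the question and is an assumption about the CSV header; adjust it if the header differs.

```
| inputlookup write_rules.csv
| rex field=path "(?<path_dd>.+)\/specific_folder"
| where isnotnull(path_dd)
| table path_dd
```

The where clause is an optional addition, not part of the original search. It drops rows where the pattern did not match, since rex leaves path_dd unset for those rows.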

Path Finder

This revised regex should do the named capture from the sample string you provided:
\S(?<pathdd>.+)tomcat7

Notes on modification to regex:
changed to match a single non-whitespace character (\S), here the leading slash, and then capture everything before the literal value "tomcat7"
also the named capture's name shouldn't contain a hyphen so changed it to pathdd
tested against your supplied input string at regex101.com

Match information
MATCH 1
pathdd [1-58] home/jenkins/qa-automation-smcconnell/Automation/Tomcats/

Hope this helps with the regex part of the question.
If you're working against input from an inputlookup command, I believe someson12 is correct: in the rex command you need to specify the field name from the csv that you want to apply the regex to.
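
To repeat the regex101 check inside Splunk itself, something like the following sketch may help. It uses makeresults only to build a one-row test input from one of the sample paths in the question; the field name path is an assumption taken from the question, and pathdd is the capture name used in this reply.

```
| makeresults
| eval path="/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/docs/loader"
| rex field=path "\S(?<pathdd>.+)tomcat7"
| table path pathdd
```

If the extraction works, pathdd should hold the same value as in the match information above. Once it does, the makeresults and eval lines can be swapped for | inputlookup write_rules.csv.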

SplunkTrust

Can you post some sample entries from the write_rules.csv file (ones that are not working)?

Communicator

The query doesn't produce any events, and the job inspector says that there are no matching fields.

SplunkTrust

Is the lookup table write_rules.csv empty? What does it return if you run just this?

| inputlookup write_rules.csv 

Communicator

No, it's not empty.

SplunkTrust

That is good. The remaining portion of the search looks for a specific pattern (regex), and it isn't able to find that pattern, which causes the end result to be empty. To see whether the pattern is correct, please provide some sample entries from the write_rules.csv file (which should be added as a lookup table file).

Communicator

I have added them to the question! : )
