Hi everyone, I have created a regular expression query that matches one specific folder in a long list of pathnames and then cuts everything that comes after this folder:
index=main | rex "\s\-\s\[(?<path_dd>.+)\specific_folder" | dedup path_dd | eval path="file:read:"+path_dd+"*" | sort by path | table path | outputlookup output.csv append=True
Next, I added an inputlookup at the start of the query. The lookup table always contains path names, with a single field per line: path
So I tried to edit the query like this:
| inputlookup write_rules.csv | rex "(?<path_dd>.+)\/specific_folder" | table path_dd
But it's not working. Can anyone help me?
Thank you
Example of the file:
/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/host-manager/loader
/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/examples/loader
/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/xwiki_oracle/loader
/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/xwiki_oracle/SESSIONS.ser
/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/docs/loader
/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/manager/loader
/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/_/loader
/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/xwiki_oracle/xwiki-temp/aether-repository/org/apache/maven/doxia/doxia-core/1.3/_maven.repositories
/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/xwiki_oracle/xwiki-temp/aether-repository/org/apache/maven/doxia/doxia-core/1.3/doxia-core-1.3.pom
/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/xwiki_oracle/xwiki-temp/aether-repository/org/apache/maven/doxia/doxia-core/1.3/doxia-core-1.3.pom.ahc26f05574a43e4fce
/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/xwiki_oracle/xwiki-temp/aether-repository/org/apache/maven/doxia/doxia-core/1.3/doxia-core-1.3.pom.sha1.ahca7be2b392cec49e7
/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/xwiki_oracle/xwiki-temp/aether-repository/org/apache/maven/doxia/doxia-core/1.3/doxia-core-1.3.pom.sha1
Thanks, the sample entries are helpful, but I believe the problem is not in the regex.
When you run this, you have a Splunk built-in field called '_raw'. This is the default field that a rex statement works on.
index=main | rex "\s\-\s\[(?<path_dd>.+)\specific_folder" | dedup path_dd | eval path="file:read:"+path_dd+"*" | sort by path| table path | outputlookup output.csv append=True
So the statement | rex "\s\-\s\[(?<path_dd>.+)\specific_folder"
is the same as | rex field=_raw "\s\-\s\[(?<path_dd>.+)\specific_folder"
Whereas, when you run this (with inputlookup), there is no field named _raw. So here you have to specify the name of the field from which path_dd will be extracted.
| inputlookup write_rules.csv | rex "(?<path_dd>.+)\/specific_folder" | table path_dd
So, replace | rex "(?<path_dd>.+)\/specific_folder"
with | rex field=fieldFromCSVFile "(?<path_dd>.+)\/specific_folder"
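Putting that together, here is a minimal sketch of the whole lookup-based search. It assumes the CSV header (and therefore the field name) is path, as described in the question, and it keeps specific_folder as the placeholder from the original post; adjust both to match the real file. It also uses "." for string concatenation and a plain "sort path", which are my choices rather than something taken from the original query.

```spl
| inputlookup write_rules.csv
| rex field=path "(?<path_dd>.+)\/specific_folder"
| dedup path_dd
| eval path="file:read:".path_dd."*"
| sort path
| table path
| outputlookup output.csv append=true
```

Rows whose path does not contain the folder get no path_dd value, so running the search only as far as the rex step with | table path path_dd is a quick way to confirm the extraction before writing to output.csv.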
This revised regex should do the named capture from the sample string you provided:
\S(?<pathdd>.+)tomcat7
Notes on the modification to the regex:
changed it to start at a non-whitespace character (\S) and capture everything before the literal value "tomcat7"
also, the named capture's name shouldn't contain a hyphen, so I changed it to pathdd
tested against your supplied input string at regex101.com
Match information
MATCH 1
pathdd [1-58] home/jenkins/qa-automation-smcconnell/Automation/Tomcats/
Hope this helps with the regex part of the question.
If you're working against input from an inputlookup command, I believe someson12 is correct: in the rex command you need to specify the name of the CSV field that you want to apply the regex to.
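The same check can be done inside Splunk without touching the lookup. This is only a sketch: it builds one throwaway event with makeresults, puts one of the sample paths from the question into a field that I have named path for illustration, and applies the regex from this answer to it.

```spl
| makeresults
| eval path="/home/jenkins/qa-automation-smcconnell/Automation/Tomcats/tomcat7/work/Catalina/localhost/docs/loader"
| rex field=path "\S(?<pathdd>.+)tomcat7"
| table path pathdd
```

For this input pathdd should come back as home/jenkins/qa-automation-smcconnell/Automation/Tomcats/, the same value regex101 reported. The leading slash is missing because \S consumes it before the capture group starts.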
Can you post some sample events from the write_rules.csv file (one which is not working)?
The query doesn't produce any events, and the job inspector says that there are no matching fields.
Is the lookup table write_rules.csv empty? What does it return if you just run this:
| inputlookup write_rules.csv
No, it's not empty.
That is good. The remaining portion of the search looks for a specific pattern (regex), and it is not able to find that pattern, which causes the end result to be empty. To see whether the pattern is correct, please provide some sample entries from the write_rules.csv file (which should be added as a lookup table file).
I have added it in the answer! :)