Hi Everyone,
I have to extract part of a file path.
The path will be in the format C:\a\b\c\abc\xyz\abc.h.
I want to skip the first 4 folders. That is, in this example I want to extract \abc\xyz\abc.h.
How can I do it using regex?
Hi @gcusello,
I am not able to get the path \xyz\abc.h using this regex.
Hi @anooshac,
when the path contains several backslashes there can be an issue, so please try:
| rex field=your_field "^\w:\\\\w+\\\\w+\\\\w+\\\\w+(?<filename>.*)"
ciao.
Giuseppe
Hi @gcusello, I am still not able to extract it.
Hi @anooshac,
the second regex is correct, as you can check at https://regex101.com/r/kpyTLl/2,
but Splunk behaves differently when the pattern contains backslashes, so you can try:
| rex field=your_field "^\w*:\\\\\w*\\\\\w*\\\\\w*\\\\\w*\\\\(?<filename>.*)"
as you can check using the following search:
| makeresults
| eval my_field="C:\a\b\c\abc\xyz\abc.h"
| rex field=my_field "^\w*:\\\\\w*\\\\\w*\\\\\w*\\\\\w*\\\\(?<filename>.*)"
Ciao.
Giuseppe
Hi @gcusello ,
I tested it and it is working fine. However, the paths in my data vary from one another. I may have data like the following. Will it still work in these cases?
C:\a\b\c\abc.pqr.a1.b1.jkl\xyz\abc.h
OK. Assuming that:
1. You always have a drive letter at the beginning
2. You don't have "empty parts" (consecutive backslashes are syntactically valid in a file path, but are typically not returned in the path of an existing file)
3. You want to extract the part after the first four components
The regex to do so would be like this:
[a-zA-Z]:\\([^\\]+\\){4}(?<remainder>.*)
The "remainder" capture group will capture the path after the first four directories.
Of course, if you want to do it with the "rex" command in Splunk, you need to escape all backslashes, which makes it something like this:
| rex "[a-zA-Z]:\\\\([^\\\\]+\\\\){4}(?<remainder>.*)"
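For anyone who wants to sanity-check the plain (non-Splunk) form of this pattern outside of Splunk, here is a minimal Python sketch. It is only an illustration under a few assumptions: the path has a single backslash after the drive letter (as in the sample paths from this thread), Python's re module needs (?P<remainder>...) instead of (?<remainder>...), and the helper name skip_components is made up for this example.

```python
import re

# Plain-regex form of the pattern: drive letter, colon, one backslash,
# then four "name\" components, then everything that is left.
# Python spells named groups (?P<name>...), unlike the (?<name>...) used by rex.
PATTERN = re.compile(r"[a-zA-Z]:\\([^\\]+\\){4}(?P<remainder>.*)")


def skip_components(path):
    """Return the part of the path after the first four components, or None."""
    match = PATTERN.search(path)
    return match.group("remainder") if match else None


# The two sample paths mentioned in this thread.
samples = [
    r"C:\a\b\c\abc\xyz\abc.h",
    r"C:\a\b\c\abc.pqr.a1.b1.jkl\xyz\abc.h",
]
for sample in samples:
    print(sample, "->", skip_components(sample))
```

For both sample paths the captured remainder is xyz\abc.h (without a leading backslash), and a path with fewer than four components gives no match at all.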
Hi @anooshac,
let me understand: you could have different log formats, "C:\a\b\c\abc\xyz\abc.h" or "C:\a\b\c\abc.pqr.a1.b1.jkl\xyz\abc.h", is that correct?
In this case, you could try:
| rex field=your_field "^\w*:\\\\[^\\\]*\\\\\w*\\\\[^\\\]*\\\\[^\\\]*\\\\(?<filename>.*)"
which you can test using this search:
| makeresults
| eval your_field="C:\a\b\c\abc\xyz\abc.h"
| append [ | makeresults | eval your_field="C:\a\b\c\abc.pqr.a1.b1.jkl\xyz\abc.h" ]
| rex field=your_field "^\w*:\\\\[^\\\]*\\\\\w*\\\\[^\\\]*\\\\[^\\\]*\\\\(?<filename>.*)"
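To see why the character class matters for the dotted folder name, here is a small Python sketch that compares a \w*-only pattern with a [^\\]* variant. It runs outside Splunk, so each literal backslash is written once inside a raw string, the named group uses Python's (?P<filename>...) spelling, and [^\\]* is used for every component (a slight simplification of the search above). The variable names are made up for this example.

```python
import re

# \w matches letters, digits and underscore only, so it cannot get past
# the dots in a folder name such as "abc.pqr.a1.b1.jkl".
WORD_ONLY = re.compile(r"^\w*:\\\w*\\\w*\\\w*\\\w*\\(?P<filename>.*)")

# [^\\]* matches any run of characters other than a backslash,
# so dots (and spaces, dashes, ...) inside a folder name are fine.
ANY_BUT_BACKSLASH = re.compile(
    r"^\w*:\\[^\\]*\\[^\\]*\\[^\\]*\\[^\\]*\\(?P<filename>.*)"
)

dotted = r"C:\a\b\c\abc.pqr.a1.b1.jkl\xyz\abc.h"

word_only_match = WORD_ONLY.search(dotted)
class_match = ANY_BUT_BACKSLASH.search(dotted)

print(word_only_match)                  # None: \w* stops at the first dot
print(class_match.group("filename"))    # the part after the fourth folder
```

The \w*-only pattern finds nothing on the dotted path, while the [^\\]* pattern still captures xyz\abc.h.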
Ciao.
Giuseppe
Hi @anooshac,
I suppose that you have this path in a field, so you could use something like this:
| rex field=your_field "^(?<path>\w:\\\w+\\\w+\\\w+\\\w+)"
that you can test at https://regex101.com/r/kpyTLl/1
There may be a difference between how regex101.com and Splunk handle backslashes, so if the above regex doesn't work, please try this:
| rex field=your_field "^(?<path>\w:\\\\w+\\\\w+\\\\w+\\\\w+)"
Ciao.
Giuseppe
Hi @gcusello, thanks for the response.
I don't want to extract the first 4 folders. I want to skip them and extract the rest of the path. I was finding it hard to write a regex. How can I do this?
Hi @anooshac,
it's the same thing:
| rex field=your_field "^\w:\\\w+\\\w+\\\w+\\\w+(?<filename>.*)"
Ciao.
Giuseppe