My Splunk source is in the format below:
source=/default/folder/20190403/file_PARADOX_7747_txt
I am trying to pick only the file name out of the source for some analysis, but I am unable to get rid of the unwanted process id appended at the end, i.e. I only need PARADOX from the above.
Below is the closest I have got so far; however, it does not separate the process id from the file name:
rex field=source "(?<logdir>[\w\W/]+)/file_(?<filename>[^.]+)_txt"
Ideally, I would like the below output:
Any help is appreciated. Thank you.
If you only want the filename, I think the answers from @FrankVI or @vnravikumar would be a good approach. If you want it all parsed out:
| rex field=source "(?<logdir>\/[\W\w]+\/[\W\w]+\/)(?<date>[^\/]+)\/file_(?<filename>[^\_]+)\_(?<processid>[^\_]+)\_(?<extension>.+)"
Here is what I used to test it:
| makeresults
| eval source = "/default/folder/20190403/file_PARADOX_7747_txt"
| rex field=source "(?<logdir>\/[\W\w]+\/[\W\w]+\/)(?<date>[^\/]+)\/file_(?<filename>[^\_]+)\_(?<processid>[^\_]+)\_(?<extension>.+)"
Thanks @FrankVI, @vnravikumar & @ragedsparrow for all your help.
Unfortunately, the file name in my source can contain multiple words, but it is always followed by the process id, like below:
source=/default/folder/20190403/file_PARADOX_7747_txt
source=/default/folder/20190402/file_AMR_CA_1234_txt
source=/default/folder/20190402/file_EMEA_IRE_DUB_8964_txt
If there is a way to grab the file name between "file_" and the first numeric digit ([0-9]), it will help.
I think this would work:
| rex field=source "(?<logdir>\/[\W\w]+\/[\W\w]+\/)(?<date>[^\/]+)\/file_(?<filename>[^\d]+)\_(?<processid>\d+)\_(?<extension>.+)"
I tested it here:
| makeresults
| eval source="/default/folder/20190402/file_EMEA_IRE_DUB_8964_txt"
| rex field=source "(?<logdir>\/[\W\w]+\/[\W\w]+\/)(?<date>[^\/]+)\/file_(?<filename>[^\d]+)\_(?<processid>\d+)\_(?<extension>.+)"
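If you want to check that pattern against all three of the sample sources in one go, something like the following should do it. This is only a sketch: the comma-separated split plus mvexpand is just one assumed way of generating the test rows, and the rex is the same one as above.

```
| makeresults
| eval source=split("/default/folder/20190403/file_PARADOX_7747_txt,/default/folder/20190402/file_AMR_CA_1234_txt,/default/folder/20190402/file_EMEA_IRE_DUB_8964_txt", ",")
| mvexpand source
| rex field=source "(?<logdir>\/[\W\w]+\/[\W\w]+\/)(?<date>[^\/]+)\/file_(?<filename>[^\d]+)\_(?<processid>\d+)\_(?<extension>.+)"
| table source logdir date filename processid extension
```

With those three samples I would expect filename to come out as PARADOX, AMR_CA and EMEA_IRE_DUB. Note that the pattern relies on the file name itself never containing a digit.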
Works like a charm! Thank you.
Hi
Try this
| makeresults
| eval source = "/default/folder/20190402/file_EMEA_IRE_DUB_8964_txt"
| rex field=source "file\_(?P<name>.+)_\d+"
Hi
Give this a try:
| makeresults
| eval source = "/default/folder/20190403/file_PARADOX_7747_txt"
| eval filename = mvindex(split(source,"_"),1)
Or, to avoid problems when a directory name contains an underscore:
| makeresults
| eval source = "/default/folder/20190403/file_PARADOX_7747_txt"
| rex field=source "\/(?P<filename>file.+)"
| eval filename = mvindex(split(filename,"_"),1)
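If the name can be several words long (e.g. file_EMEA_IRE_DUB_8964_txt), the split idea can probably be stretched by keeping everything between the first piece and the last two pieces and joining it back up. A rough sketch, assuming the last two underscore-separated pieces are always the process id and txt:

```
| makeresults
| eval source = "/default/folder/20190402/file_EMEA_IRE_DUB_8964_txt"
| rex field=source "\/(?P<filename>file.+)"
| eval filename = mvjoin(mvindex(split(filename,"_"),1,-3),"_")
```

Here mvindex with a negative end index counts from the end of the multivalue, so 1,-3 skips "file" at the front and the process id and "txt" at the back, and mvjoin puts the underscores back.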
[New]:
Try this
| makeresults
| eval source = "/default/folder/20190402/file_AMR_CA_1234_txt"
| rex field=source "file\_(?P<name>.+)_\d+"
You were pretty close. I guess this should work (unless the filename can also contain _, or other variations in the format cause it to break in some cases):
| rex field=source "(?<logdir>[\w\W/]+)/file_(?<filename>[^_]+)_(?<processid>[^_]+)_txt"
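If the filename can contain underscores, one possible variation is to let the filename part be greedy and pin the process id to digits sitting right before _txt at the end. This is a sketch under the assumption that the process id is always numeric and the source always ends in _txt:

```
| rex field=source "(?<logdir>.+)/file_(?<filename>.+)_(?<processid>\d+)_txt$"
```

Because the match is anchored at the end, a name such as EMEA_IRE_DUB should stay in one piece, with only the last run of digits going to processid.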