I really need to get good at regex and learn to do this myself, but alas, there are so many other things that seem to be a priority right now. I have the following log file names:
log_SVR-IES-PAN-RAMA-01-20170806
log_SVR-ORW-PAN-RAMA-01-20170806
log_SVR-IES-PAN-RAMA-01-20170813
log_SVR-ORW-PAN-RAMA-01-20170813
log_SVR-IES-PAN-RAMA-01-20170820
log_SVR-ORW-PAN-RAMA-01-20170820
log_SVR-IES-PAN-RAMA-01-20170827
log_SVR-ORW-PAN-RAMA-01-20170827
log_SVR-IES-PAN-RAMA-01-20170903
log_SVR-ORW-PAN-RAMA-01-20170903
log_SVR-IES-PAN-RAMA-01-20170910
log_SVR-ORW-PAN-RAMA-01-20170910
log_SVR-IES-PAN-RAMA-01
log_SVR-ORW-PAN-RAMA-01
I am monitoring the log files with the following stanza:
[monitor:///var/log2/gns/palo/log_*]
index = panlog
host_regex = (?<=log_).+-01
sourcetype = pan:log
no_appending_timestamp = true
So the question is: will the host_regex give just the host name, i.e. svr-orw-pan-rama-01 or svr-ies-pan-rama-01? According to the regexr.com/v1 site it should, but I want to make sure it is correct before I implement it.
The part of the pattern that matches between '(' and ')' (i.e. the capturing group) will be used, so rich's answer is correct. 'log_' is not inside the capturing group, and neither is '-01', so they will just be used to match.
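To make the capturing-group point concrete, here is a minimal Python sketch. It assumes Python's re module behaves like Splunk's regex engine for these simple patterns, so it only shows what each pattern matches or captures, not how Splunk assigns the host. The two patterns with parentheses are hypothetical illustrations and not necessarily the one from rich's answer.

```python
import re

# One of the monitored paths from the question.
path = "/var/log2/gns/palo/log_SVR-IES-PAN-RAMA-01-20170806"

# Pattern from the question: a lookbehind, but no capturing group.
lookbehind = re.search(r"(?<=log_).+-01", path)
whole_match = lookbehind.group(0)      # text matched by the whole pattern
group_count = lookbehind.re.groups     # number of capturing groups in the pattern

# Hypothetical pattern: '-01' sits outside the parentheses,
# so it is matched but not captured.
outside = re.search(r"log_(.+)-01", path).group(1)

# Hypothetical pattern: '-01' sits inside the parentheses,
# so it is part of the captured text.
inside = re.search(r"log_(.+-01)", path).group(1)

print(whole_match, group_count)
print(outside)
print(inside)
```

Only the text inside the parentheses comes back as group 1, so whether '-01' ends up in the captured value depends on which side of the closing ')' it sits.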
If the '-01' part can vary, you can use log_(.+)-\d+ instead. That would also match log_xxxxxxx-02, for example.
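It may be worth running that pattern against both the dated and the undated file names before deploying it. The sketch below does this in Python, again assuming the re module behaves like Splunk's engine for this pattern; the names list is just a sample taken from the question plus the log_xxxxxxx-02 example.

```python
import re

# The variable-suffix pattern suggested above.
pattern = re.compile(r"log_(.+)-\d+")

names = [
    "log_SVR-IES-PAN-RAMA-01-20170806",  # rotated file with a date suffix
    "log_SVR-IES-PAN-RAMA-01",           # current file, no date suffix
    "log_xxxxxxx-02",                    # example from the answer
]

# Map each file name to the text captured by group 1.
captures = {name: pattern.search(name).group(1) for name in names}

for name, host in captures.items():
    print(f"{name} -> {host}")
```

Because .+ is greedy, the trailing -\d+ consumes the date on the rotated files but the -01 on the undated ones, so in this sketch the captured value differs between the two kinds of file name. If the number should always be part of the host, it probably needs to sit inside the parentheses.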