Right now, we've got a path like /splunk/data-sources/domain-botnet.csv, with numerous files, but each is a .csv file.
I'm trying to import it so that the host field returns the domain-botnet part of the filename, but not the whole filename.
Right now it sort of works, but it only captures the first part of the filename, say, 'domain' or 'url', rather than the whole name I want. This is the regex I've come up with so far (keep in mind I'm a newbie at regex...): (url|domain|infrastructure|email|malware)-\w*
Can anyone give me some pointers on how to make this work? Note that this will be applied to Windows systems as well as Linux systems, so it needs to adapt to a variable-length path, traversing any number of directories and/or drive paths to extract the filename (minus the .csv extension).
Try this:
host_regex=(?:[\\/][^\\/]*){1,}[\\/]([^\.]*)\.csv
RegExr (http://www.regexr.com/) is a great tool for testing regular expressions.
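If you'd rather check the pattern from a script than in a browser, here is a minimal sketch in Python. It assumes Python's re module treats this particular pattern the same way Splunk's regex engine does, and that the first capture group is what ends up in host. The extract_host helper and the Windows sample path are made up for illustration.

```python
import re

# The pattern from the answer above, copied verbatim from host_regex.
HOST_REGEX = re.compile(r"(?:[\\/][^\\/]*){1,}[\\/]([^\.]*)\.csv")


def extract_host(path):
    """Return the first capture group, or None if the path does not match."""
    match = HOST_REGEX.search(path)
    return match.group(1) if match else None


# Sample paths: the Linux one is from the question, the Windows one is hypothetical.
samples = [
    "/splunk/data-sources/domain-botnet.csv",
    r"C:\splunk\data-sources\url-phishing.csv",
]

for sample in samples:
    print(sample, "->", extract_host(sample))
```

Both forward slashes and backslashes are accepted as separators, so the same setting should work on either platform, at least for paths shaped like these.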
Works perfectly, thanks!
In inputs.conf, use this:
host_regex=(?:/|\\)(\S+?)\.csv$
That should do it. HTH!
That does part of it: the host now shows up as "splunk/data-sources/domain-malware" or "splunk/data-sources/domain-botnet" or "splunk/data-sources/infrastructure-scan", but I only want the last segment of this: domain-malware or domain-botnet or infrastructure-scan, etc.
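The reason seems to be that the match starts at the first separator in the path, and \S+? is allowed to run across the later separators until it reaches .csv, so the whole directory chain lands in the capture group. A rough sketch in Python that reproduces this and tries one possible fix is below. The NARROW pattern is my own suggestion, not something from the answer above: it swaps \S+? for a character class that excludes both kinds of separator. This assumes Python's re behaves like Splunk's regex engine for these patterns; the helper name and the Windows path are made up for illustration.

```python
import re

# Pattern from the inputs.conf suggestion above, copied verbatim.
BROAD = re.compile(r"(?:/|\\)(\S+?)\.csv$")

# Hypothetical alternative: the capture group may not contain / or \,
# so only the final path segment can match before ".csv".
NARROW = re.compile(r"[\\/]([^\\/]+)\.csv$")


def first_group(pattern, path):
    """Return capture group 1 of the first match, or None if nothing matches."""
    match = pattern.search(path)
    return match.group(1) if match else None


linux_path = "/splunk/data-sources/domain-botnet.csv"
windows_path = r"D:\feeds\infrastructure-scan.csv"  # made-up example path

print(first_group(BROAD, linux_path))     # whole directory chain
print(first_group(NARROW, linux_path))    # last segment only
print(first_group(NARROW, windows_path))  # last segment only
```

If the narrower pattern behaves the same way in Splunk, the corresponding setting would be host_regex=[\\/]([^\\/]+)\.csv$, but test it against a few real paths first.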