What you need is a configuration something like this in inputs.conf:

    [monitor:///data/ftp/paloalto/PA*.csv]
    sourcetype = paloalto
    host = paloaltohostname
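If the device hostname happens to appear as a segment of the file path instead (for example, a hypothetical layout like /data/ftp/paloalto/&lt;hostname&gt;/PA0001.csv), inputs.conf can derive the host from the path with host_segment rather than hardcoding it:

```ini
# Hypothetical layout: /data/ftp/paloalto/<hostname>/PA*.csv
[monitor:///data/ftp/paloalto/*/PA*.csv]
sourcetype = paloalto
# use path segment 4 as the host (data=1, ftp=2, paloalto=3, <hostname>=4)
host_segment = 4
```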
You might be able to do more sophisticated host assignment if the information is available, e.g., in the data or in the file path. Then, in props.conf:

    [paloalto]
    REPORT-paextract = paloalto_extractions
    KV_MODE = none
KV_MODE = none just turns off some default extractions that don't usually work on CSV files. And then, also in props.conf:

    [source::...paloalto....csv]
    sourcetype = paloalto
    priority = 100

and in transforms.conf:

    [paloalto_extractions]
    DELIMS = ","
    FIELDS = "Domain", "Receive_Time", "Serial_Number", "Threat_Content_Type"

(and so on for the rest of the fields in the CSV header).
The first clause here exists to disable/override some default behavior that is clumsy and confusing (in particular, the automatic generation of headers). In theory, Splunk should have auto-generated the second clause (or something like it) based on the header in the CSV file and the fact that the file name ends in .csv, but this doesn't work well, so we turn it off. The second clause then creates, explicitly, the header that we do want.
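Once all three files are in place (and Splunk has been restarted so it picks up the changes), a quick sanity check is a search like the one below; the field names are just the ones from the FIELDS list above:

```
sourcetype=paloalto | head 10 | table Domain, Receive_Time, Serial_Number, Threat_Content_Type
```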
Actually, in my experience, CSV files, even if you specify the sourcetype, get auto-learned, and the fields are not extracted. I've found this to be true up through version 4.0.11; I haven't had a chance to upgrade to 4.0.12 yet.
BunnyHop, yes, I actually have the same issue. I named the sourcetype paloalto and yet I get sourcetypes like paloalto1 and palo_alto2. It doesn't really bother me too much, but I'm just confirming what you're saying.
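For what it's worth, my understanding is that those numbered variants come from Splunk's automatic header-based sourcetype learning on structured files. On 4.x you can usually suppress it per sourcetype in props.conf, though it's worth verifying on your version:

```ini
[paloalto]
# Stop Splunk from "learning" new sourcetype variants (paloalto1, palo_alto2, ...)
# from the CSV header; assumes Splunk 4.x behavior
CHECK_FOR_HEADER = false
```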
Thanks gkanapathy. Since I'm still a little new to Splunk's more advanced configuration, I'm still learning to grasp the concepts of the transforms.conf and props.conf files. Nevertheless, I appreciate the pointer in the right direction. I was actually going to try this today, before reading this comment, since I was dealing with setting up custom field extractions. Thanks!
Another suggestion is to use the TA for the Palo Altos along with the Palo Alto App. It will parse the data automatically.