When importing data into Splunk, I would like to pull all the files from a specific directory except special files that match the regex .*?\d{14}(.log)+ and all the pulled files should have the same host, "server-1".
I also need to be able to set specific sourcetypes based on the file name:
D:\logs\Alarm.log    sourcetype=alarm
D:\logs\Admin.log    sourcetype=admin
D:\logs\DC_W_SDK.log sourcetype=sdk
D:\logs\DC_S_SDK.log sourcetype=sdk
D:\logs\alarm-84569558741265.log    IGNORE
D:\logs\*-crash.log  sourcetype=crash
D:\logs\*.log        sourcetype=catch_all
How would one set this up in the most efficient manner?
P.S. I would possibly like the host to be the name that each server reports, as I need this import set up on ~50 different servers.
Hi,
1) Install the Universal Forwarder on those 50 servers from which you want logs. During the installation, the forwarder will make note of the local hostname (i.e. not necessarily the same as a DNS name), and tag all outgoing log data with this meta-information.
2) If you are going to have 50 identical source machines, it might be a very clever idea to make use of the Deployment Server (DS) functionality. By letting your Splunk Indexer also act as a Deployment Server, you can push out configuration changes to all clients, instead of logging on to them individually. NB. You should configure the Forwarders accordingly, i.e. configure the address of the DS, which is best done at installation time. So configure the DS first, then install the forwarders.
3) Working with a DS lets you push out (well, technically the clients poll the DS for) configuration changes, which are packaged into something called an app. An app is normally one or more .conf files, and in your case it would probably be an inputs.conf and possibly an outputs.conf. Assuming that you configure the destination to be fairly static (i.e. the Indexer address is hardcoded in the forwarder), we will concentrate on the inputs part.
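As a rough sketch of steps 2 and 3, the pieces could look like the following. The host names deploy.example.com and indexer.example.com, the server class name log_servers, the app name server_logs_inputs, and the server-* client pattern are all placeholders invented for illustration; 8089 and 9997 are the usual default management and receiving ports, but check what your environment actually uses.

```ini
# deploymentclient.conf on each forwarder: where to poll for apps
# (deploy.example.com is a placeholder for your Deployment Server)
[deployment-client]

[target-broker:deploymentServer]
targetUri = deploy.example.com:8089

# serverclass.conf on the Deployment Server: which clients get which app
# (log_servers and server_logs_inputs are made-up names; the app itself
#  would live under $SPLUNK_HOME/etc/deployment-apps/server_logs_inputs)
[serverClass:log_servers]
whitelist.0 = server-*

[serverClass:log_servers:app:server_logs_inputs]
restartSplunkd = true

# outputs.conf inside the app, only needed if the indexer address
# was not already set on the forwarder at installation time
[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
server = indexer.example.com:9997
```

The inputs.conf for the log directory would then sit in the same app, so a change to it reaches all ~50 forwarders the next time they poll.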
It could look something like what @martin_mueller did, i.e. do the sourcetype fixing on the indexer side, but you could also make explicit [monitor] stanzas for each of the known log files that you want. That lets you set the sourcetype and index on the forwarder side, which has its merits. Then just skip the rest of the catch-all stuff - it is probably not important anyway :-). It all depends on your situation and needs, really.
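A minimal sketch of the explicit-stanza approach, using the file names from the question. It assumes the forwarder should report its own host name (so no host line is set and the default recorded at installation applies), and it combines the two SDK logs into one wildcard stanza, which is my shorthand rather than something from the thread. Test that the wildcard stanzas pick up what you expect, since Splunk turns them into a directory monitor with an implicit whitelist.

```ini
# inputs.conf on the forwarder (deployed via the app)
# No "host =" line: each forwarder tags events with its own host name.

[monitor://D:\logs\Alarm.log]
sourcetype = alarm

[monitor://D:\logs\Admin.log]
sourcetype = admin

# Covers DC_W_SDK.log and DC_S_SDK.log
[monitor://D:\logs\DC_*_SDK.log]
sourcetype = sdk

[monitor://D:\logs\*-crash.log]
sourcetype = crash
```

Because only the named files are monitored, alarm-84569558741265.log and other 14-digit rotated files are never read, so no blacklist is needed in this variant.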
Or you can do a variant of what @martin_mueller outlined, and that is to not set ANY sourcetype in inputs.conf on the forwarder. This forces Splunk to detect/guess the sourcetype for each file it finds. Then use the indexer-side transforms for the known sourcetypes (however, you must call them from [source::D:\\logs\*] in props.conf instead of [catch_all], since there is no catch_all sourcetype).
Since Splunk can learn new sourcetypes, it should hopefully classify similar files in the same manner. The benefit of this comes when you find that you may want to do something with that catch-all data in the future - then logs of the same type have the same sourcetype. See more on why this can be a good thing in:
http://docs.splunk.com/Documentation/Splunk/6.1.3/Data/Whysourcetypesmatter
Bunching different types of data into a single sourcetype is not the Splunk way. But you should be aware that you might end up with quite a few odd-looking sourcetypes if you let Splunk do the guessing.
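For the variant where Splunk guesses the sourcetype, the forwarder side could be as small as this. It is only a sketch: the whitelist line is my addition to limit the input to .log files, and neither a sourcetype nor a host is set, so Splunk assigns both.

```ini
# inputs.conf on the forwarder: monitor the directory, set no sourcetype
[monitor://D:\logs]
# Only pick up .log files (an assumption; drop this to take everything)
whitelist = \.log$
# Skip the rotated files with a 14-digit stamp, as asked in the question
blacklist = \d{14}\.log$
```

The indexer-side props.conf stanza would then be keyed on the source rather than on a sourcetype, as described above.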
Hope this helps,
/K
You would set up your inputs.conf something like this:
[monitor://D:\logs\*]
blacklist = \d{14}\.log
host = server-1
sourcetype = catch_all
On the indexer side you set this in props.conf:
[catch_all]
TRANSFORMS-detect = detect_alarm_sourcetype, detect_admin_sourcetype, and so on
And this in transforms.conf:
[detect_alarm_sourcetype]
SOURCE_KEY = MetaData:Source
REGEX = Alarm\.log$
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::alarm
Add the other detection transform rules accordingly. If none of them match, the catch_all sourcetype is retained.
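Filled out for the file names in the question, the remaining rules might look roughly like this. The transform names detect_sdk_sourcetype and detect_crash_sourcetype are made up to follow the naming pattern above, and the regexes are my reading of the question's file list, so adjust them to the real file names.

```ini
# props.conf on the indexer: list every transform by name
[catch_all]
TRANSFORMS-detect = detect_alarm_sourcetype, detect_admin_sourcetype, detect_sdk_sourcetype, detect_crash_sourcetype

# transforms.conf on the indexer: one rule per target sourcetype
[detect_admin_sourcetype]
SOURCE_KEY = MetaData:Source
REGEX = Admin\.log$
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::admin

# Matches DC_W_SDK.log and DC_S_SDK.log
[detect_sdk_sourcetype]
SOURCE_KEY = MetaData:Source
REGEX = DC_[WS]_SDK\.log$
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::sdk

[detect_crash_sourcetype]
SOURCE_KEY = MetaData:Source
REGEX = -crash\.log$
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::crash
```

If each of the ~50 servers should report its own name, the host = server-1 line in inputs.conf can simply be left out, and the forwarder's default host is used instead.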
It should be added to the sourcetype stanza in props.conf on the indexer.
And to modify the timezone of all those sourcetypes, can you add TZ = UTC to the monitor stanza in inputs.conf?
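On the timezone question: TZ is a props.conf setting rather than an inputs.conf one, so it would go on whichever instance parses the data (the indexer, when universal forwarders are used). A sketch, assuming the events arrive with the catch_all sourcetype from the inputs.conf above:

```ini
# props.conf on the indexer
[catch_all]
TZ = UTC
```

As far as I understand it, timestamps are parsed before the sourcetype-rewriting transform runs, so the setting likely needs to sit under the incoming sourcetype (or a source:: or host:: stanza) rather than under alarm, admin, and so on. That is worth verifying with a test file before rolling it out.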
