Hello,
Splunk insists on trying to auto-detect headers in a tab-delimited CSV file for which I have manually defined the headers in my CONF files. I thought that putting this information in /etc/system/local would override /etc/apps/learned/, but that doesn't look like the case...
Here are my CONF files for system-local:
inputs.conf
[monitor:///logs/strauss_splunk/bulksession]
sourcetype=csv
source=strauss_sessions
index=strauss_sessions
host=WHITNEY
[monitor:///logs/strauss_splunk/bulkurl]
sourcetype=csv
source=strauss_url
index=strauss_url
host=WHITNEY
[monitor:///inetpub/strauss_splunk/bulkhit]
sourcetype=csv
source=strauss_hits
index=strauss_hits
host=WHITNEY
props.conf
[source::strauss_url]
SHOULD_LINEMERGE=false
CHECK_FOR_HEADER=false
TRANSFORMS-STRAUSSTSV=STRAUSSTSV-1
[source::strauss_sessions]
SHOULD_LINEMERGE=false
CHECK_FOR_HEADER=false
TRANSFORMS-STRAUSSTSV=STRAUSSTSV-2
[source::strauss_hits]
SHOULD_LINEMERGE=false
CHECK_FOR_HEADER=false
TRANSFORMS-STRAUSSTSV=STRAUSSTSV-3
transforms.conf
[STRAUSSTSV-3]
DELIMS = "  "
FIELDS = "SESSION_KEY", "HIT_KEY", "ID", "SECURE"
[STRAUSSTSV-2]
DELIMS = "  "
FIELDS = "SESSION_KEY", "ADDRESS", "CANISTER"
[STRAUSSTSV-1]
DELIMS = "  "
FIELDS = "SESSION_KEY", "HIT_KEY", "NAME", "VALUE", "TIMESTAMP"
* * * * * * * * * * * * * * * * * * * * * * * *
...and this all looks good, right? But... this is what Splunk writes to the /etc/apps/learned/ CONF files afterwards:
* * * * * * * * * * * * * * * * * * * * * * * *
props.conf
[csv-2]
KV_MODE = none
REPORT-AutoHeader = AutoHeader-1
SHOULD_LINEMERGE = False
given_type = csv
pulldown_type = true
[csv-3]
KV_MODE = none
REPORT-AutoHeader = AutoHeader-2
SHOULD_LINEMERGE = False
given_type = csv
pulldown_type = true
[csv-4]
KV_MODE = none
REPORT-AutoHeader = AutoHeader-3
SHOULD_LINEMERGE = False
given_type = csv
pulldown_type = true
[csv-5]
KV_MODE = none
REPORT-AutoHeader = AutoHeader-4
SHOULD_LINEMERGE = False
given_type = csv
pulldown_type = true
[csv-6]
KV_MODE = none
REPORT-AutoHeader = AutoHeader-5
SHOULD_LINEMERGE = False
given_type = csv
pulldown_type = true
transforms.conf
[AutoHeader-1]
DELIMS = "  "
FIELDS = "58b0c3f3c517dd9ee90cf256800dae98", "27eec7a8d949b8afe03a11b47604633b", "B99004EA29E4FA783416FAC3F7AB87A5", "N"
[AutoHeader-2]
DELIMS = "  "
FIELDS = "0f4c4f0c76bb2898ccbcfa816cfbe49b", "cbb2438acf31acf8acefacb3ff2b59a9", "EA9AE62469C9E2DCE926B32A545675EA", "N"
[AutoHeader-3]
DELIMS = "  "
FIELDS = "8ed97b717ce4b20e561a9c5a033f925c", "44bdc568a0b0052fd2232239257ebc6c", "897FFA5C1C1F078517B1FF8DB392AC54", "Y"
[AutoHeader-4]
DELIMS = "  "
FIELDS = "58b0c3f3c517dd9ee90cf256800dae98", "63.194.158.158", "LSSN_20110419_WHITNEY.dat"
[AutoHeader-5]
DELIMS = "  "
FIELDS = "becf1cd6433bd8ddf2a3f4e9da3fe133", "20c184e3ce2396fbda9d5071c8b3344d", "login_username", "XXXXXXXXXXXXXXXX", "2011-04-19 07:00:20.000"
Any ideas??
Thank you so much
Solved: I capitulated. Don't fight the beast. I ended up saying screw-it, splunk, you can auto-extract field names for me. But I wrote a REGEX rule that pointed to nullQueue to remove the first line.
See here:
http://www.splunk.com/support/forum:SplunkAdministration/4081
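For anyone landing here later, a minimal sketch of that kind of nullQueue rule might look like the following. The transform name drop_tsv_header is made up for illustration, and the REGEX assumes the header line starts with the literal text SESSION_KEY; adjust both to match your data.

```ini
# props.conf
[source::strauss_url]
# Hypothetical class and transform names, not from the original post
TRANSFORMS-dropheader = drop_tsv_header

# transforms.conf
[drop_tsv_header]
# Assumption: the header line begins with SESSION_KEY
REGEX = ^SESSION_KEY
DEST_KEY = queue
FORMAT = nullQueue
```

Events that match REGEX are routed to nullQueue at index time and discarded, so the header line should no longer be counted as an event.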
You might want to add this to your props.conf
http://www.splunk.com/base/Documentation/latest/admin/Propsconf
LEARN_SOURCETYPE = [true|false]
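As a rough sketch, the setting goes under a stanza in props.conf. The stanza below reuses one of the source stanzas from the question; whether that is the right scope for your setup is an assumption.

```ini
# props.conf
[source::strauss_url]
# Turn off source type learning for this source
LEARN_SOURCETYPE = false
```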
You would need to clean out the etc/apps/learned/props.conf file and reindex the data.
This is a test box. I have about 10+ GB/day of this stuff to index (short half-life), so I'm cleaning the index every time (splunk clean eventdata). I got it to work with auto field extraction by inserting my own header line, but the issue there is that the header line is included in the event count: if I have 27,804 events, I don't want it to say 27,805.
I think changes at this point would only apply to new events and not to events already indexed.
Thanks for the suggestions, but neither of those actually fixed it. The LEARN_SOURCETYPE setting did stop Splunk from trying to auto-define fields, but it didn't let my CONF files take over...
Light bulb went off when I re-read your question. You will need to use DELIMS = "\t" for tab and not " "
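A sketch of what that could look like for one of the stanzas from the question, with the tab written as \t. Note that it uses a REPORT- class instead of TRANSFORMS-, on the assumption that the DELIMS/FIELDS extraction is meant to run at search time (the transforms.conf documentation describes DELIMS as valid only for search-time extractions). Treat that change as a suggestion to verify, not a confirmed fix.

```ini
# transforms.conf
[STRAUSSTSV-1]
# "\t" is a tab character
DELIMS = "\t"
FIELDS = "SESSION_KEY", "HIT_KEY", "NAME", "VALUE", "TIMESTAMP"

# props.conf
[source::strauss_url]
SHOULD_LINEMERGE = false
CHECK_FOR_HEADER = false
# Assumption: search-time extraction, hence REPORT- rather than TRANSFORMS-
REPORT-STRAUSSTSV = STRAUSSTSV-1
```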
Thanks, haven't got it to work yet, but I'll keep investigating. I think I might have to change some other things around relating to the source types.
