How can I add the Linux /home directory to a server's Data Set and have Splunk index only the 2011 .bash_history data? If I add /home as a Data Set, Splunk indexes all data in /home going back to 2009, pushing me over the 500 MB limit of the free version.
Put ".bash_history*" in the whitelist option (edit your input in the GUI under Manager > Data inputs > Files & directories), or simply monitor exactly the file you want instead of the whole directory.
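Splunk's inputs.conf documents the whitelist as a regular expression matched against the full file path, not a shell glob, so it can help to try a candidate pattern against a few paths before saving it. The sketch below does that in Python; the pattern and the sample paths are assumptions for illustration, and the pattern is stricter than the one suggested above because it excludes rotated or backup copies.

```python
import re

# Hypothetical whitelist pattern: match only files named exactly ".bash_history".
# Assumption: the pattern is applied as an unanchored regex against the full path.
WHITELIST = re.compile(r"/\.bash_history$")

def is_whitelisted(path):
    """Return True if the path would pass the candidate whitelist pattern."""
    return WHITELIST.search(path) is not None

if __name__ == "__main__":
    for sample in ("/home/alice/.bash_history",
                   "/home/alice/notes.txt",
                   "/home/alice/.bash_history.bak"):
        print(sample, is_whitelisted(sample))
```

If rotated copies should be indexed as well, drop the trailing `$` from the pattern.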
Hm, not sure if I get you right. Do you want to see/search only the events from 2011 in your .bash_history file? If so, the search app lets you narrow a search to just that time range.
Splunk is going to eat the entire file; I don't know of a setting that will index only specific parts of a file.
However, if you want a specific subset of a file, I would advise writing a small script that writes all of the data from year xxxx (in your case 2011) to a separate file, and then having Splunk index that file. If your server is Unix-based, you can run the script from cron every day to keep your file up to date.
If you index the file with its own source/sourcetype, you can use MAX_DAYS_AGO in props.conf and set it to the number of days since the start of 2011; that way anything dated earlier is ignored.
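A minimal sketch of such a script, assuming the history file carries timestamps: bash only writes them (as `#<epoch seconds>` lines before each command) when HISTTIMEFORMAT is set, and without them there is no way to tell which year a command belongs to. The function name and the sample data are hypothetical.

```python
import re
from datetime import datetime, timezone

# bash writes "#<epoch seconds>" before each command when HISTTIMEFORMAT is set.
TIMESTAMP_LINE = re.compile(r"^#(\d+)$")

def filter_history_by_year(lines, year):
    """Keep timestamp markers and their commands when the marker falls in `year` (UTC)."""
    kept = []
    keep = False  # commands before the first timestamp marker are dropped
    for line in lines:
        stripped = line.rstrip("\n")
        match = TIMESTAMP_LINE.match(stripped)
        if match:
            stamp = datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)
            keep = stamp.year == year
        if keep:
            kept.append(stripped)
    return kept

if __name__ == "__main__":
    sample = ["#1262304000", "ls", "#1300000000", "cd /tmp", "#1330000000", "pwd"]
    print("\n".join(filter_history_by_year(sample, 2011)))
```

In a cron job, the same function could read each user's .bash_history and write the result to a separate file that Splunk monitors instead of the original.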
MAX_DAYS_AGO = <integer>
* Specifies the maximum number of days past, from the current date, that an extracted date can be valid.
* For example, if MAX_DAYS_AGO = 10, Splunk ignores dates that are older than 10 days ago.
* Defaults to 2000 (days).
* IMPORTANT: If your data is older than 2000 days, increase this setting.
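Because MAX_DAYS_AGO counts back from the current date, the value that corresponds to "since the start of 2011" grows by one every day and has to be recomputed whenever the setting is updated. A small sketch of that arithmetic follows; the helper name is hypothetical.

```python
from datetime import date

def days_since(start, today=None):
    """Number of whole days from `start` up to `today` (defaults to the current date)."""
    if today is None:
        today = date.today()
    return (today - start).days

if __name__ == "__main__":
    # Candidate value for MAX_DAYS_AGO if everything before 2011-01-01 should be ignored.
    print(days_since(date(2011, 1, 1)))
```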
.bash_history may not capture everything if the user has multiple sessions or the session terminates abnormally. See http://mywiki.wooledge.org/BashFAQ/088
Depending on your setup, you might want to consider using a version of bash with native syslog support compiled in. To help get you started: http://blog.rootshell.be/2009/02/28/bash-history-to-syslog/