I am using the AWS Add-on for Splunk to pull in VPC Flow Logs via CloudWatch. The problem is that Splunk is incorrectly identifying the accountId in each log as an epoch timestamp, placing all of our logs at the exact same time. I verified that the field extractions for this sourcetype specifically extract the AccountId; however, I don't know how to tell Splunk not to automatically interpret that field as an epoch timestamp. Has anyone run into this before, where Splunk finds a timestamp where none exists? Any thoughts on how to resolve this?
Timestamp extraction is done at parse time, so it is not something you set in inputs.conf. Instead, do this in props.conf on the indexer:
[source::/path/to/the/source/file]
# your timestamp settings here
This lets you keep the sourcetype you chose and change only the timestamp processing for this particular file (or set of files). The props.conf docs page describes the various settings available. In particular, I think you should consider these two:
TIME_PREFIX = <regular expression>
MAX_TIMESTAMP_LOOKAHEAD = <integer>
TIME_PREFIX tells Splunk where to start looking for the timestamp. If the timestamp is at the beginning of the event, you don't need this. But it can be useful to make Splunk skip over fields (like the AccountId).
MAX_TIMESTAMP_LOOKAHEAD tells Splunk how many characters to examine for the timestamp. Usually a number like 25 is enough. Splunk starts from either the beginning of the event or from the end of the TIME_PREFIX match (when one is specified), and looks only at the number of characters you specify. Again, this keeps Splunk from wandering past the region where the timestamp should be and picking up data from the wrong part of the event.
These settings will also make the event parsing a little faster.
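Putting it together, here is a sketch of what the stanza might look like for VPC Flow Logs. This assumes the default space-delimited flow log format, where the 12-digit account ID is the second field and the epoch "start" time is the eleventh; the source path is a placeholder, so adjust both it and the regex to match your actual data:

```ini
# props.conf -- sketch; path and field count are assumptions, adjust to your data
[source::/path/to/the/source/file]
# Skip the first ten space-delimited fields (version, account-id, interface-id,
# srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes) so Splunk
# starts looking for the timestamp at the epoch "start" field.
TIME_PREFIX = ^(?:\S+\s+){10}
# An epoch-seconds value is 10 digits; a tight lookahead keeps Splunk from
# reading past it into the rest of the event.
MAX_TIMESTAMP_LOOKAHEAD = 11
# Tell Splunk the timestamp is epoch seconds.
TIME_FORMAT = %s
```

With something like this in place, Splunk anchors on the start field instead of grabbing the 12-digit account ID near the beginning of the event.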