All Apps and Add-ons

Splunk Add-on for Amazon Web Services: How to prevent Splunk from identifying an accountId field as an epoch timestamp?

devenjarvis
Path Finder

I am using the AWS Add-on for Splunk to pull in vpcflowlogs via CloudWatch. The problem is that Splunk is incorrectly identifying the accountId in each log as an epoch timestamp, placing all of our logs at the exact same time. I verified that the field extractions for this sourcetype specifically extract the accountId; however, I don't know how to tell Splunk not to automatically treat that field as an epoch timestamp. Has anyone run into this before, where Splunk finds a timestamp where none exists? Any thoughts on how to resolve this?

1 Solution

lguinn2
Legend

Timestamp extraction is done at parsing time. You do not set this in inputs.conf. Instead, do this in props.conf on the indexer:

[source::/path/to/the/source/file]
# your timestamp settings here

This will allow you to keep the sourcetype that you choose, and change only the timestamp processing for this particular file (or set of files). The props.conf documentation describes the various settings. In particular, I think you should consider using these two:

TIME_PREFIX = <regular expression>
MAX_TIMESTAMP_LOOKAHEAD = <integer>

TIME_PREFIX tells Splunk where to start looking for the timestamp. If the timestamp is at the beginning of the event, you don't need this. But it can be useful to make Splunk skip over fields (like the AccountId).
MAX_TIMESTAMP_LOOKAHEAD tells Splunk how many characters to examine for the timestamp. Usually a number like 25 is enough. Splunk starts from the beginning of the line (or from the end of the TIME_PREFIX match, when specified) and looks only at the number of characters that you specify. Again, this keeps Splunk from moving past the region where it should find a timestamp and picking up data from the wrong parts of the event.

These settings will also make the event parsing a little faster.
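As a sketch, a sourcetype-scoped override might look like the following. Note that the stanza name and the field positions here are assumptions based on the standard VPC flow log record layout (version, account-id, interface-id, addresses, ports, protocol, packets, bytes, then the start timestamp); verify the field order against your own events before using it:

[aws:cloudwatchlogs:vpcflow]
# Assumption: the start timestamp is the 11th space-delimited field,
# so skip the first 10 fields (including the 12-digit account-id,
# which otherwise looks like an epoch timestamp to Splunk).
TIME_PREFIX = ^(?:\S+\s+){10}
# Epoch-seconds timestamps are 10 digits; don't look past them.
MAX_TIMESTAMP_LOOKAHEAD = 10

The same settings could go in a [source::...] stanza instead, as shown above, if you prefer to scope the override by source rather than by sourcetype.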


devenjarvis
Path Finder

I don't have a local props.conf set up on our indexers specific to this sourcetype. The only props.conf files specific to this sourcetype can be found in the AWS add-on on the forwarder and search head, and both of those are the default props file that comes with the app.


devenjarvis
Path Finder

I added the TIME_PREFIX to the props.conf on the forwarder and this appears to have done the trick! Thank you so much!


mreynov_splunk
Splunk Employee

That's great, but there is still a bug in the add-on (or rather a behavior we want to support), so it would be good to get that sample anyway, even if the customization serves as a workaround.
Also, hopefully you added this in local/props.conf, because if it is in default/props.conf, it WILL get overwritten on upgrade.

niddhi
Explorer

I have a similar case, where I want to filter the incoming data from CloudWatch logs. I am trying to configure props.conf and transforms.conf. What should be the value of source in this case?


mreynov_splunk
Splunk Employee

It should be cloudwatchlogs. I think it's getting confused about the format.
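For reference, the usual way to filter incoming events is to route unwanted ones to the null queue at parsing time. A minimal sketch, where the stanza names and the regex are placeholders (not taken from the add-on), would be:

props.conf:

[aws:cloudwatchlogs]
TRANSFORMS-dropunwanted = drop_unwanted

transforms.conf:

[drop_unwanted]
# Assumption: replace this with a pattern matching events to discard.
REGEX = pattern-to-drop
DEST_KEY = queue
FORMAT = nullQueue

These files belong on the indexer (or heavy forwarder) that parses the data, not in the input configuration.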


devenjarvis
Path Finder

I tried both cloudwatchlogs and cloudwatchlogs:vpcflow for the sourcetype and both had the same issue.


mreynov_splunk
Splunk Employee

Did you configure through UI or conf files? Basically, please share the config.


devenjarvis
Path Finder

I configured through the UI. Here is the resulting config for this input:

[input_name]
account = [account_name]
delay = 1800
groups = vpcflowlogs
index = taxhubprod
interval = 30
only_after = 1971-01-01T00:00:00
region = us-east-1
sourcetype = aws:cloudwatchlogs:vpcflow
stream_matcher = .*

Nothing there seems to indicate how Splunk would interpret the timestamp, however.


mreynov_splunk
Splunk Employee

Timestamp extraction is not handled through the input; it is either native Splunk behavior or a props/transforms override. For vpcflow logs it's the former.

The props for this sourcetype extract in the following format:

[aws:cloudwatchlogs:vpcflow]
EXTRACT-all = ^\s*(?P<version>[^\s]+)\s+(?P<account_id>[^\s]+)\s+(?P<interface_id>[^\s]+)\s+(?P<src_ip>[^\s]+)\s+(?P<dest_ip>[^\s]+)\s+(?P<src_port>[^\s]+)\s+(?P<dest_port>[^\s]+)\s+(?P<protocol_code>[^\s]+)\s+(?P<packets>[^\s]+)\s+(?P<bytes>[^\s]+)\s+(?P<start_time>[^\s]+)\s+(?P<end_time>[^\s]+)\s+(?P<vpcflow_action>[^\s]+)\s+(?P<log_status>[^\s]+)

This is the format it expects:

2 000000000000 eni-00000000 #ipv4-1# #ipv4-2# #port-1# #port-2# #protocol# #packets# #bytes# #timestamp# #timestamp# #action# OK

It seems Splunk is getting confused about the timestamp in your vpcflow events: instead of grabbing the start_time, it finds the account ID, which also looks like an epoch timestamp.

This does seem like a bug; could you share a sample event for me to try?


devenjarvis
Path Finder

I am trying to get hold of one; however, I can no longer see the events in Splunk because there are too many events with the same timestamp (Splunk won't display any of them). I am trying to pull one directly from AWS instead.
