Multiple sourcetypes on SQS based S3


Hello Splunkers,

I have a bit of an issue onboarding some AWS Canaries from S3. We have deployed the SQS/SNS and S3, and the files are coming in fine. However, each canary writes four files:

  1. TXT - detailed report -  I want this
  2. JSON - summary report - I want this
  3. PNG - image - I don't want this
  4. HTML - I don't want this

Because they all arrive on a single queue, they get the same sourcetype, which does not work because their structures are very different. As such, I've built the following props:

[aws:canaries:summary]
KV_MODE = none
LINE_BREAKER = ([\r\n]+)
category = Structured
description = AWS Canary JSON Summary File
disabled = false
pulldown_type = true

[aws:canaries:detailed]
LINE_BREAKER = (Start Canary)
TIME_PREFIX = timestamp\:\s
category = Application
description = AWS Canary Detailed TXT file
disabled = false
pulldown_type = true

This is on the HF and the Indexer cluster.

In order to separate and drop the files the following is also added to the props/transforms:


[aws:canaries]
TRANSFORMS-aws_canaries = set_aws_canary_json, set_aws_canary_txt, drop_aws_canary_png, drop_aws_canary_html


##### AWS Canaries - change sourcetype for SQS based S3 files and drop PNG files #####
[set_aws_canary_json]
SOURCE_KEY = MetaData:Source
REGEX = ^source::.*json
FORMAT = sourcetype::aws:canaries:summary
DEST_KEY = MetaData:Sourcetype

[set_aws_canary_txt]
SOURCE_KEY = MetaData:Source
REGEX = ^source::.*txt
FORMAT = sourcetype::aws:canaries:detailed
DEST_KEY = MetaData:Sourcetype

[drop_aws_canary_png]
SOURCE_KEY = MetaData:Source
REGEX = ^source::.*png
FORMAT = nullQueue
DEST_KEY = queue

[drop_aws_canary_html]
SOURCE_KEY = MetaData:Source
REGEX = ^source::.*html
FORMAT = nullQueue
DEST_KEY = queue

Again, both applied to HF and Indexers.

On the inputs I have the following:

aws_account = SplunkForwarderRole
aws_iam_role = canaries
index = canaries
interval = 300
s3_file_decoder = CustomLogs
sourcetype = aws:canaries
sqs_batch_size = 10
sqs_queue_region = us-east-1
sqs_queue_url =
disabled = 1


The idea is that the input receives the data and assigns it the sourcetype aws:canaries, and then the parsing/transforms stage alters the sourcetype.

On the SH I can see the files are being sourcetyped correctly (and dropped); however, event breaking is not working: the JSON is broken on every line, and it looks like the TXT file breaks at the date.

Anyone configured something similar?

I suspect the event breaking is happening as part of the aws:canaries sourcetype, but I'm not sure.

Any help appreciated!!

1 Solution


In the end, we created two queues and used filters to route the data.
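
For anyone landing here with the same problem, a minimal sketch of what the two-queue setup might look like in inputs.conf follows. It assumes the filters are S3 event notification suffix filters (.json to one queue, .txt to the other), so the PNG and HTML files never reach Splunk. The stanza names and the queue URL placeholders are hypothetical; the other values are copied from the input in the question. Setting the final sourcetype on the input matters because line breaking runs before index-time transforms, so events whose sourcetype is rewritten by a transform are still broken using the original sourcetype's settings.

```ini
# inputs.conf (sketch only; stanza names and queue URLs are placeholders)

# Input for the queue that only receives notifications for .json summary files
[aws_sqs_based_s3://canaries_summary]
aws_account = SplunkForwarderRole
aws_iam_role = canaries
index = canaries
interval = 300
s3_file_decoder = CustomLogs
sourcetype = aws:canaries:summary
sqs_batch_size = 10
sqs_queue_region = us-east-1
sqs_queue_url = <summary-queue-url>

# Input for the queue that only receives notifications for .txt detailed files
[aws_sqs_based_s3://canaries_detailed]
aws_account = SplunkForwarderRole
aws_iam_role = canaries
index = canaries
interval = 300
s3_file_decoder = CustomLogs
sourcetype = aws:canaries:detailed
sqs_batch_size = 10
sqs_queue_region = us-east-1
sqs_queue_url = <detailed-queue-url>
```

With inputs like these, the TRANSFORMS-aws_canaries line and the four transforms should no longer be needed, since each input sets its sourcetype directly and the LINE_BREAKER settings for aws:canaries:summary and aws:canaries:detailed can then apply at parse time.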
