How to configure Amazon Kinesis Modular Input for gzipped data from CloudWatch? Getting error "Malformed data: null"

scpack
New Member

I'm implementing the Kinesis Modular Input for Splunk to ingest VPC Flow Logs from CloudWatch, but am running into an issue. When CloudWatch submits logs to Kinesis, it always gzips the data before sending the records, which causes the Kinesis input to throw errors:

04-11-2016 15:02:46.680 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py" Malformed data: null
04-11-2016 15:02:46.680 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py" java.nio.charset.MalformedInputException: Input length = 1

I imagine it wouldn't be a hard fix, but my Java-fu is pretty rusty.
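
For anyone trying to reproduce this outside Splunk, here is a minimal Python sketch of the failure. It assumes the input decodes record bytes strictly as UTF-8, which this thread does not confirm, and the flow log line is made up. A gzip stream always begins with the bytes 0x1f 0x8b, and 0x8b cannot start a UTF-8 sequence, so a strict decoder gives up at the second byte, which would be consistent with "Input length = 1".

```python
import gzip

# Hypothetical flow log line standing in for what CloudWatch would send.
plain = (b"2 123456789012 eni-abc123 10.0.0.1 10.0.0.2 443 49152 6 "
         b"10 840 1460500000 1460500060 ACCEPT OK")
compressed = gzip.compress(plain)

# Every gzip stream begins with the magic bytes 0x1f 0x8b.
print(compressed[:2].hex())

try:
    # Strict decoding, comparable to a charset decoder that reports errors.
    compressed.decode("utf-8")
    failed_at = None
except UnicodeDecodeError as exc:
    # Index of the first byte that could not be decoded.
    failed_at = exc.start
    print(f"cannot decode byte at offset {failed_at}: {exc.reason}")
```

The same bytes decompress cleanly with `gzip.decompress(compressed)`, so the data itself is intact; only the text decoding step is at fault.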

Damien_Dallimor
Ultra Champion

Check out the latest version, 1.0.2, on Splunkbase.

Copy/paste from the release notes:

Pushed default charset decoding out of the main message processing flow and into custom handling, so custom handlers that you implement should in theory be able to process any binary or text payload.
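
To illustrate what "process any binary or text payload" could look like, here is a rough sketch of the order of operations. The real handlers are Java classes inside the modular input. This Python version only shows the idea, and `handle_record` and `GZIP_MAGIC` are hypothetical names, not part of the add-on. The handler receives raw bytes, gunzips them when they carry the gzip magic bytes, and only then applies a charset.

```python
import gzip

# First two bytes of every gzip stream.
GZIP_MAGIC = b"\x1f\x8b"


def handle_record(raw: bytes, charset: str = "utf-8") -> str:
    """Return the record as text, gunzipping first when the payload is gzip."""
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    # Charset decoding happens last, on bytes that are known to be text.
    return raw.decode(charset)


# Works for both compressed and plain text payloads.
print(handle_record(gzip.compress(b"compressed payload")))
print(handle_record(b"plain payload"))
```

Checking the magic bytes keeps a single handler usable for streams that mix compressed and uncompressed producers. A handler dedicated to CloudWatch data could skip the check and always gunzip.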

scpack
New Member

With the newly implemented message_handler I'm still running into a problem. Sorry for the in-comment nastiness, I can't do attachments yet.

Stanza Definition:


[kinesis://vpc_flow_logs_us-west-2]
app_name = vpc_flow_logs_us-west-2
aws_access_key_id = 
aws_secret_access_key = 
hec_batch_mode = 0
hec_https = 0
host = 
index = aws
initial_stream_position = TRIM_HORIZON
kinesis_endpoint = https://kinesis.us-west-2.amazonaws.com
message_handler_impl = com.splunk.modinput.kinesis.GZIPDataRecordDecoderHandler
output_type = stdout
sourcetype = aws:kinesis:vpc_flowlogs
stream_name = vpc_flow_logs_us-west-2

Example Record:

{
    "Records": [
        {
            "Data": "H4sIAAAAAAAAAL1d224cRw79lYaenUZdWLe8GY4SIIvFLtZ+WwSG4iiBAPkCS0mwCPLvy66ei6a7WKPWkAwSJJDGM31CFnl4yOL8dfXx9uHh5rfbd//7cnv17dV3r9+9fv/P67dvX/9wffXq6vOfn26/4o/BleQMWAjG4Y/vP//2w9fPv3/B3/zx5cP7X+8///kw//jt49fbm4/489tPd9/8DDe/3OQP7pub+3v89cPvPz98+Hr35fHu86fv7+4fb78+XH373+M7vP+1/vDqp/pO13/cfnqcXvDX1d0v+IbehWScd6aEAhASeOty8gClRB9SiiU4Cynjr5OJ+EqPL/YZ8IMf7xDk481HfF4L0eCfSLYYY17twePbu+EpxuHp4w82udHb0dkxmiHG0fkw2uDHUoZsBo/Pk4Y44OuMGY4fsP/P7Mvw+s2b63+/G/71j6u/X12GJ3DgKW504BFHGi2cwvM5pTTBmgBBzuKAIruBPIIyI8AYYAISYiwO4fjBRnn7JAF/szahvcZY4XhrM6jByQxwThCcYAsGopudzQ4OoAEnJU44Rd7ZUkg96xRGOMUIWyfEsrMOmsfaKO1uxcrbJ2Rf3Q1a5mH1tuK40TiLxjIw1n97PwHCJ8tFC5CX9rewiwYBc6kTzz2Fgxzs8eTRTk98kktTyUExlxYWctAiO4BEAWPAdIBM2sVrBX/j4AYHCH4Ef4rOlt0hUko/hYMcnPLPU3/Dh1H1Nw520CQ782Ga2E6eTKSVT9npQSNg4xMHJfoGhoMgPCMgJJ2AAIaDH3QCwvRuRi8ggOFgCB08+Clmz6/d4LyCx3FShHZK9aBpIQn9YBHiKiSlA8TOEI5aQhl9GEIxzg0FvWr2OR9yy0acYRsMv4TQsFFQsxG/hLBWREJPseKFw04SrEG7IKQUxxQGdDfvrak1EEJz8iGOgyWcmOS0CqoKwrmqm/P8WDmWUFNS3iciJdZjWVSE07h2Aq+GtznSTSEBi1bhM2TZlYSmip3nmGDl8TCxhKdUdCElZJ9UiY/lIArLcuG0lqhKT7e64zUSB1foVKvBQ1ZkcpZFTOiE7anLoCiWgmXhCZ1WUPDJ7bg2OltILUS8JmLpNvRMtM+sfgjyRM6yyAkdA2HYA9DTe8BJEIWT5pZFom2VeLbjbzYsac+sJyjBYdISyMo7mwJnUyonL3UcLKHrbnOHS4mWOnkpIRmjJcU56UEEFw/NExV+4KT5wdR4VIQjNojwRJufIoKONg+OaRKhVzKErFoyOBYhoauVBk2t1MvrCHPHTifCeRZ+YMtY/BgRRFnUcxgPQkBHm4vupm414fnP9Y/Xb1jwCBOEmnwUvY1FQyDjNcCeWiupcJ5pGKHXHN7FN/xNJv2NDxCTfkBMi8ylgqK/sXQZeuV2tgfh1yP9MeKNEy/PEQJ+eFf4ZcXD0mromAgw+8TZRDCkdmeLkyF48XnFmGsBpNIJAnZ+sLBV3mtWSjI2sCsIa0Czy+lMlwOwziM0SfY0PaYXtIFdQmhM9IB1KSuRUhBvM8xoFE2kMrQ41XZKOg9IjCMs26nR6eHhnFqkgoJm5QCclxoahdDMeTSVUuDgCd2oUKcWNbtbQazXcOCm0fgYlTr4QYEqzIqpTh4KEhMJS20+OC3iE2TFhMqy9RqPQf7SY23eK7kap47QbGwdbwHohOvAea+BkHqejJB5hfPDriSsw1syRmsEM0gLCRWLHuMJ/DpCi2P77s1HTn+LTDcf6cH56HSbQVFiGmGRUKfLtjoHKLLTg4bUU/VSJWkxyl59nC91a/LryM8RFtJi7aC81DwevMeXuBydg2QxtmQkAvgYqaDVfSwTRsCHRS8oiIbgCIkpHtSJ32Y4SEKACIqwCVC//HHZmfXwyxEP+gsjHoIgvBzPOiDM43CLAJeeeXy24iEYwsvxtITFYNcnSMpABEXYBqjfO8n7fh1grRpBGlIiSMJFZ+hUiNtddKIsxBo
SEsEQNsHpS7/BEklIykIETdgatsmFAjHGc4B4bUTQBM4wNx8kpbCQCJrAmIfmLQk6YTtx8IT+ep6Y7frCupS7SbCEVV1XbI8m8ALioAkdrWfmPGeIHO8BYicKay3OObdbKRDkQzYvTzh3uQ7fpTjC6+jJuI2YshxReHqM/Hrvg9AxyhxUoV/cFeeils9lDppwJqvOe+90slDmYAn9juoUszXrVWrP4osgtSXtuG9ALobjpJyOnSo0KlaYnl6JylGbFi8AtLyIjwVEtOuL+FI+x64pNG9Fa3FTatciL5zYS0K8/iatKMynR1NRoBYuMioKPkSrxhOofYt85dBcgmvBYZITennVB9W8Sq1cZDtELpXDRQedxEptXeQTfahWipSNmJoPVG+oKgqqgPiZwnpYqfqdUiqiFi9ytYfmoK0aFyTIwjJyN/ZMPxcOlqAx+oCkJnj831Egeeen6FkHSlPOZboulksGRD21KptwMljWCjyurqLgJzzP47YBCsTmxUsALSpwfK6yyqxicNpEYRuc3gHKjlguIIaozRX4EHmf2nl1jwh9xFhORG2qsBEROcPoQzjs6zlNQ2ImahOFy4LCIsaldStFzD5tmsBmnxoPVO3Tpgm89llvGBCD0yYJ2+Agv3EGeY6F0Zq4YKalTNwH/1ooCk8Q9XXtrYjaJIE7q6ZVv0vsCLHThMaYqSsrSU4KD7F5kcNABz2hZlalI0RsXuRkPTULacFh4ghUN9Ki17W3FIo5HAdHOHc3bSI+WgeIhSL0BrVdKaosjti6+DKfa6kj6TCFpeRyYizhMEEynyMtl+OgCd0RhWSsKo8j1i6+sBRqy6apfcNGzEgcRKF7ikpbY5SyEbF6kdVGsX1pVQwSP1lYjzhPZ2mxX+nZeKLF50dDhzCNx9o4fftetgEc/jJEfBpvXIngIgJ2+HebLUyCHJuJsjsmIzhCKrsbXmeP0VZIbbrACqm0BxXEILUZwzZIZ+axwKTVVq9yvITH63VtvrANT3cqOLrj6MX00uMeYDEbtSnDBTZqjF54X4fmmkZiB9TmDBsB9VrhFY3qOWqThheGhoZ+Wpzbc4YwuCe3psUQtTnDRkSkPmdD9kb1GBH7GBlC3XEfI8B6qlEMT5swbHY5knljadRWuMUQcVCGM0sunAWnlYyInYwvCwqtrwyomVUzzBFbGS+x0OrLeCYTaB0hDrZwhnPXU6TlcOxMoXENbyZAWhbioAp05Qp4hPyxhsDHEwfEQRQqGrSKLWascfk0bps079FFC2XK5+iuylZAHDyhL3Jj5bxq7UvZh9jKeAnZXo0FW1O59ulYsBggFprQWeQ+eZMmSyC2MnLwuKdtvHXbSwwQE00g1bli2gNzYoA4WEKH91SqqMp7iK2MfBVrdIkYpRWDxMEV+ven4bDnS6nEI3Yz8iol+CkrEVWK0BG7GTnlufkwKSk/xGZGNplkLsA1jxGxmJFRy6r+pgqJgzGcGWSaUpJSeiV2M3LqPrM6p4WHgy6c3bUCef1VUGKI2FWFddieE6xS2CZ2NHJT1Lya3xazELuwsCzyZj1YCw6LqkDPntaCVZNxExsaGUP2rJRo2YeJJlDS9qwCX2SgBMnA9DWqCT03BR8n2cXHHNJ0bcUbAHwsfI9g/TQBQGxoLJGtxkt73oMAcjosiShRCBBBEl4EqLmGFg7z9UqICJ7AZCKHn7mrhlJtUIrjIXjCJjznaFxa51QpOARJuAjOKY2DFHZLL6I8HIIiXABn2UlJplI4FTQEP9h6dqi6DmzeT5idfveyGCCCIbCaJ2Qt8xD04MKz40bkAqPFGGCx/Kn/YGgbFKxD8IOXuVtrEMsa4pKxFCRiOyNreENQa6VHCg8LP6DbDvX0KGZTYjcjh32OJV31Oi0DcdCDc2eIINlSkOQowomN9A4RO0loSD3eZEWv4yAKZxZo1spBCw8/T1iT7KhGson1jBvhdNRsn41rt4SkEHFQhbNHKJuodoSI9YyMXFs7bhP
7GRm5wr4POYkJOYvjYecKrQWNRi0NEQsaGYPcLI7oFEPEbkbG0g6y1ctAxGLGbXC6ffyZIGhKccRuxksO0HIIK3hTv6W0DBDlA4IES1gWd7mA1gniYAlLBWFxE1xTTiBWM3IeoUoQ9jnVFvl6ldjNyJVT5xBXAZXBTt8OJY2Hs9/QbKBE3QYKsZ2Rj8ZVMXtvofycRPTT3/8HQnOWOQCrAAA=",
            "PartitionKey": "4e8cecf660ea97791f347097c85223b3",
            "ApproximateArrivalTimestamp": 1460500032.299,
            "SequenceNumber": "49560977364991779866952104588195942057001415028632453122"
        }
    ],
    "NextShardIterator": "AAAAAAAAAAGsJQiCfpWm2CQ7S1XEkwV4ljjbVHp+0d4uxTSQ2RAxlwWGWhBJ10fMfM5a6CxZNISyRWLH8Ahe1h6eCjCY8EgZvsGF3Y5uHxIDEfR9uuANPUJTONlXzDKciCQ6kdoGeyB1uYKnm3bYsZssknjfazm5MUi1c8JfVt5hrpBHIOBWPIX03B+rZplYGLue6Z6PvCdEaqU1WnwncUpGFlYExUv/f3B0/DZXjYy9exMwCY9Nqw==",
    "MillisBehindLatest": 0
}
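
A quick way to confirm what is inside a record like this is to decode the Data field by hand. In the JSON output Data is base64 encoded, and the leading "H4sI" is what the gzip header bytes 0x1f 0x8b 0x08 look like in base64. The sketch below uses a synthetic payload rather than the record above. The envelope keys follow the CloudWatch Logs subscription format, and `decode_data_field` is a hypothetical helper name.

```python
import base64
import gzip
import json


def decode_data_field(data_b64: str) -> dict:
    """Undo the layers on a Data field: base64, then gzip, then JSON."""
    raw = base64.b64decode(data_b64)
    return json.loads(gzip.decompress(raw))


# Synthetic payload shaped like a CloudWatch Logs subscription message.
payload = {
    "messageType": "DATA_MESSAGE",
    "logGroup": "example-group",
    "logEvents": [
        {"id": "1", "timestamp": 1460500032299,
         "message": "2 123456789012 eni-abc123 10.0.0.1 10.0.0.2 443 49152 6 "
                    "10 840 1460500000 1460500060 ACCEPT OK"},
    ],
}
compressed = gzip.compress(json.dumps(payload).encode("utf-8"))
data_b64 = base64.b64encode(compressed).decode("ascii")

print(data_b64[:4])  # prints "H4sI"
decoded = decode_data_field(data_b64)
print(decoded["logEvents"][0]["message"])
```

Pasting a real Data value into `decode_data_field` should show the flow log events, provided the string is copied as a single line.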

Error Message:

04-12-2016 15:49:12.788 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py" Couldn't process record {SequenceNumber: 49560977364991779866952104588195942057001415028632453122,Data: java.nio.HeapByteBuffer[pos=1 lim=3146 cap=3146],PartitionKey: 4e8cecf660ea97791f347097c85223b3}. Skipping the record.
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py" Malformed data: null
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py" java.nio.charset.MalformedInputException: Input length = 1
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py"
       at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py"
       at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:816)
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py"
       at com.splunk.modinput.kinesis.KinesisModularInput$RecordProcessor.processRecordsWithRetries(Unknown Source)
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py"
       at com.splunk.modinput.kinesis.KinesisModularInput$RecordProcessor.processRecords(Unknown Source)
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py"
       at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ProcessTask.call(ProcessTask.java:125)
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py"
       at com.amazonaws.services.kinesis.clientlibrary.lib.worker.MetricsCollectingTaskDecorator.call(MetricsCollectingTaskDecorator.java:48)
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py"
       at com.amazonaws.services.kinesis.clientlibrary.lib.worker.MetricsCollectingTaskDecorator.call(MetricsCollectingTaskDecorator.java:23)
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py"
       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py"
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py"
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py"
       at java.lang.Thread.run(Thread.java:745)

Jeremiah
Motivator

Yes, this is a problem for any Kinesis stream that is fed from a CloudWatch subscription. Customizing the default message handler won't help because it's expecting text, not compressed data. As an alternative, have you looked at the AWS app? It supports reading VPC Flow Logs directly from CloudWatch.

Damien_Dallimor
Ultra Champion

A new version is coming imminently that skips the charset decoding and jumps straight to gzip decoding.
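
One thing to plan for once gzip decoding works: each Kinesis record from a CloudWatch subscription is a JSON envelope that can carry many log events, and CloudWatch also sends CONTROL_MESSAGE records that contain no log data. Whether the handler splits these into individual Splunk events is not stated in this thread, so the Python sketch below is only an assumption about how that post-processing could look. `events_from_payload` is a hypothetical name, and the field list assumes the default (version 2) VPC Flow Logs format.

```python
from typing import Dict, List

# Field order of the default (version 2) VPC Flow Logs record.
FLOW_LOG_FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def events_from_payload(payload: dict) -> List[Dict[str, str]]:
    """Turn one decoded subscription envelope into a list of flow log events."""
    # Control messages only test that the destination is reachable.
    if payload.get("messageType") != "DATA_MESSAGE":
        return []
    events = []
    for log_event in payload.get("logEvents", []):
        values = log_event["message"].split()
        events.append(dict(zip(FLOW_LOG_FIELDS, values)))
    return events


# Made-up envelope with a single flow log line.
sample = {
    "messageType": "DATA_MESSAGE",
    "logEvents": [
        {"id": "1", "timestamp": 1460500032299,
         "message": "2 123456789012 eni-abc123 10.0.0.1 10.0.0.2 443 49152 6 "
                    "10 840 1460500000 1460500060 ACCEPT OK"},
    ],
}
for event in events_from_payload(sample):
    print(event["srcaddr"], event["dstaddr"], event["action"])
```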
