How to configure Amazon Kinesis Modular Input for gzipped data from CloudWatch? Getting error "Malformed data: null"

scpack
New Member

I’m implementing the Kinesis Modular Input for Splunk to ingest VPC Flow Logs from CloudWatch, but I'm running into an issue. When CloudWatch delivers logs to Kinesis, it always gzips the data before putting the records on the stream, which causes the Kinesis input to throw errors:

04-11-2016 15:02:46.680 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py" Malformed data: null
04-11-2016 15:02:46.680 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py" java.nio.charset.MalformedInputException: Input length = 1

I imagine it wouldn't be a hard fix, but my Java-fu is pretty rusty.

1 Solution

Damien_Dallimor
Ultra Champion

Check out the latest version, 1.0.2, on Splunkbase.

Copied from the release notes:

Pushed default charset decoding out of the main message processing flow and into custom handling, so custom handlers that you implement should in theory be able to process any binary or text payload.

scpack
New Member

With the newly implemented message_handler I'm still running into a problem. Sorry for the in-comment nastiness, I can't do attachments yet.

Stanza Definition:


[kinesis://vpc_flow_logs_us-west-2]
app_name = vpc_flow_logs_us-west-2
aws_access_key_id = 
aws_secret_access_key = 
hec_batch_mode = 0
hec_https = 0
host = 
index = aws
initial_stream_position = TRIM_HORIZON
kinesis_endpoint = https://kinesis.us-west-2.amazonaws.com
message_handler_impl = com.splunk.modinput.kinesis.GZIPDataRecordDecoderHandler
output_type = stdout
sourcetype = aws:kinesis:vpc_flowlogs
stream_name = vpc_flow_logs_us-west-2

Example Record:

{
            "Data": "H4sIAAAAAAAAAL1d224cRw79lYaenUZdWLe8GY4SIIvFLtZ+WwSG4iiBAPkCS0mwCPLvy66ei6a7WKPWkAwSJJDGM31CFnl4yOL8dfXx9uHh5rfbd//7cnv17dV3r9+9fv/P67dvX/9wffXq6vOfn26/4o/BleQMWAjG4Y/vP//2w9fPv3/B3/zx5cP7X+8///kw//jt49fbm4/489tPd9/8DDe/3OQP7pub+3v89cPvPz98+Hr35fHu86fv7+4fb78+XH373+M7vP+1/vDqp/pO13/cfnqcXvDX1d0v+IbehWScd6aEAhASeOty8gClRB9SiiU4Cynjr5OJ+EqPL/YZ8IMf7xDk481HfF4L0eCfSLYYY17twePbu+EpxuHp4w82udHb0dkxmiHG0fkw2uDHUoZsBo/Pk4Y44OuMGY4fsP/P7Mvw+s2b63+/G/71j6u/X12GJ3DgKW504BFHGi2cwvM5pTTBmgBBzuKAIruBPIIyI8AYYAISYiwO4fjBRnn7JAF/szahvcZY4XhrM6jByQxwThCcYAsGopudzQ4OoAEnJU44Rd7ZUkg96xRGOMUIWyfEsrMOmsfaKO1uxcrbJ2Rf3Q1a5mH1tuK40TiLxjIw1n97PwHCJ8tFC5CX9rewiwYBc6kTzz2Fgxzs8eTRTk98kktTyUExlxYWctAiO4BEAWPAdIBM2sVrBX/j4AYHCH4Ef4rOlt0hUko/hYMcnPLPU3/Dh1H1Nw520CQ782Ga2E6eTKSVT9npQSNg4xMHJfoGhoMgPCMgJJ2AAIaDH3QCwvRuRi8ggOFgCB08+Clmz6/d4LyCx3FShHZK9aBpIQn9YBHiKiSlA8TOEI5aQhl9GEIxzg0FvWr2OR9yy0acYRsMv4TQsFFQsxG/hLBWREJPseKFw04SrEG7IKQUxxQGdDfvrak1EEJz8iGOgyWcmOS0CqoKwrmqm/P8WDmWUFNS3iciJdZjWVSE07h2Aq+GtznSTSEBi1bhM2TZlYSmip3nmGDl8TCxhKdUdCElZJ9UiY/lIArLcuG0lqhKT7e64zUSB1foVKvBQ1ZkcpZFTOiE7anLoCiWgmXhCZ1WUPDJ7bg2OltILUS8JmLpNvRMtM+sfgjyRM6yyAkdA2HYA9DTe8BJEIWT5pZFom2VeLbjbzYsac+sJyjBYdISyMo7mwJnUyonL3UcLKHrbnOHS4mWOnkpIRmjJcU56UEEFw/NExV+4KT5wdR4VIQjNojwRJufIoKONg+OaRKhVzKErFoyOBYhoauVBk2t1MvrCHPHTifCeRZ+YMtY/BgRRFnUcxgPQkBHm4vupm414fnP9Y/Xb1jwCBOEmnwUvY1FQyDjNcCeWiupcJ5pGKHXHN7FN/xNJv2NDxCTfkBMi8ylgqK/sXQZeuV2tgfh1yP9MeKNEy/PEQJ+eFf4ZcXD0mromAgw+8TZRDCkdmeLkyF48XnFmGsBpNIJAnZ+sLBV3mtWSjI2sCsIa0Czy+lMlwOwziM0SfY0PaYXtIFdQmhM9IB1KSuRUhBvM8xoFE2kMrQ41XZKOg9IjCMs26nR6eHhnFqkgoJm5QCclxoahdDMeTSVUuDgCd2oUKcWNbtbQazXcOCm0fgYlTr4QYEqzIqpTh4KEhMJS20+OC3iE2TFhMqy9RqPQf7SY23eK7kap47QbGwdbwHohOvAea+BkHqejJB5hfPDriSsw1syRmsEM0gLCRWLHuMJ/DpCi2P77s1HTn+LTDcf6cH56HSbQVFiGmGRUKfLtjoHKLLTg4bUU/VSJWkxyl59nC91a/LryM8RFtJi7aC81DwevMeXuBydg2QxtmQkAvgYqaDVfSwTRsCHRS8oiIbgCIkpHtSJ32Y4SEKACIqwCVC//HHZmfXwyxEP+gsjHoIgvBzPOiDM43CLAJeeeXy24iEYwsvxtITFYNcnSMpABEXYBqjfO8n7fh1grRpBGlIiSMJFZ+hUiNtddKIsxBo
SEsEQNsHpS7/BEklIykIETdgatsmFAjHGc4B4bUTQBM4wNx8kpbCQCJrAmIfmLQk6YTtx8IT+ep6Y7frCupS7SbCEVV1XbI8m8ALioAkdrWfmPGeIHO8BYicKay3OObdbKRDkQzYvTzh3uQ7fpTjC6+jJuI2YshxReHqM/Hrvg9AxyhxUoV/cFeeils9lDppwJqvOe+90slDmYAn9juoUszXrVWrP4osgtSXtuG9ALobjpJyOnSo0KlaYnl6JylGbFi8AtLyIjwVEtOuL+FI+x64pNG9Fa3FTatciL5zYS0K8/iatKMynR1NRoBYuMioKPkSrxhOofYt85dBcgmvBYZITennVB9W8Sq1cZDtELpXDRQedxEptXeQTfahWipSNmJoPVG+oKgqqgPiZwnpYqfqdUiqiFi9ytYfmoK0aFyTIwjJyN/ZMPxcOlqAx+oCkJnj831Egeeen6FkHSlPOZboulksGRD21KptwMljWCjyurqLgJzzP47YBCsTmxUsALSpwfK6yyqxicNpEYRuc3gHKjlguIIaozRX4EHmf2nl1jwh9xFhORG2qsBEROcPoQzjs6zlNQ2ImahOFy4LCIsaldStFzD5tmsBmnxoPVO3Tpgm89llvGBCD0yYJ2+Agv3EGeY6F0Zq4YKalTNwH/1ooCk8Q9XXtrYjaJIE7q6ZVv0vsCLHThMaYqSsrSU4KD7F5kcNABz2hZlalI0RsXuRkPTULacFh4ghUN9Ki17W3FIo5HAdHOHc3bSI+WgeIhSL0BrVdKaosjti6+DKfa6kj6TCFpeRyYizhMEEynyMtl+OgCd0RhWSsKo8j1i6+sBRqy6apfcNGzEgcRKF7ikpbY5SyEbF6kdVGsX1pVQwSP1lYjzhPZ2mxX+nZeKLF50dDhzCNx9o4fftetgEc/jJEfBpvXIngIgJ2+HebLUyCHJuJsjsmIzhCKrsbXmeP0VZIbbrACqm0BxXEILUZwzZIZ+axwKTVVq9yvITH63VtvrANT3cqOLrj6MX00uMeYDEbtSnDBTZqjF54X4fmmkZiB9TmDBsB9VrhFY3qOWqThheGhoZ+Wpzbc4YwuCe3psUQtTnDRkSkPmdD9kb1GBH7GBlC3XEfI8B6qlEMT5swbHY5knljadRWuMUQcVCGM0sunAWnlYyInYwvCwqtrwyomVUzzBFbGS+x0OrLeCYTaB0hDrZwhnPXU6TlcOxMoXENbyZAWhbioAp05Qp4hPyxhsDHEwfEQRQqGrSKLWascfk0bps079FFC2XK5+iuylZAHDyhL3Jj5bxq7UvZh9jKeAnZXo0FW1O59ulYsBggFprQWeQ+eZMmSyC2MnLwuKdtvHXbSwwQE00g1bli2gNzYoA4WEKH91SqqMp7iK2MfBVrdIkYpRWDxMEV+ven4bDnS6nEI3Yz8iol+CkrEVWK0BG7GTnlufkwKSk/xGZGNplkLsA1jxGxmJFRy6r+pgqJgzGcGWSaUpJSeiV2M3LqPrM6p4WHgy6c3bUCef1VUGKI2FWFddieE6xS2CZ2NHJT1Lya3xazELuwsCzyZj1YCw6LqkDPntaCVZNxExsaGUP2rJRo2YeJJlDS9qwCX2SgBMnA9DWqCT03BR8n2cXHHNJ0bcUbAHwsfI9g/TQBQGxoLJGtxkt73oMAcjosiShRCBBBEl4EqLmGFg7z9UqICJ7AZCKHn7mrhlJtUIrjIXjCJjznaFxa51QpOARJuAjOKY2DFHZLL6I8HIIiXABn2UlJplI4FTQEP9h6dqi6DmzeT5idfveyGCCCIbCaJ2Qt8xD04MKz40bkAqPFGGCx/Kn/YGgbFKxD8IOXuVtrEMsa4pKxFCRiOyNreENQa6VHCg8LP6DbDvX0KGZTYjcjh32OJV31Oi0DcdCDc2eIINlSkOQowomN9A4RO0loSD3eZEWv4yAKZxZo1spBCw8/T1iT7KhGson1jBvhdNRsn41rt4SkEHFQhbNHKJuodoSI9YyMXFs7bhP
7GRm5wr4POYkJOYvjYecKrQWNRi0NEQsaGYPcLI7oFEPEbkbG0g6y1ctAxGLGbXC6ffyZIGhKccRuxksO0HIIK3hTv6W0DBDlA4IES1gWd7mA1gniYAlLBWFxE1xTTiBWM3IeoUoQ9jnVFvl6ldjNyJVT5xBXAZXBTt8OJY2Hs9/QbKBE3QYKsZ2Rj8ZVMXtvofycRPTT3/8HQnOWOQCrAAA=",
            "PartitionKey": "4e8cecf660ea97791f347097c85223b3",
            "ApproximateArrivalTimestamp": 1460500032.299,
            "SequenceNumber": "49560977364991779866952104588195942057001415028632453122"
        }
    ],
    "NextShardIterator": "AAAAAAAAAAGsJQiCfpWm2CQ7S1XEkwV4ljjbVHp+0d4uxTSQ2RAxlwWGWhBJ10fMfM5a6CxZNISyRWLH8Ahe1h6eCjCY8EgZvsGF3Y5uHxIDEfR9uuANPUJTONlXzDKciCQ6kdoGeyB1uYKnm3bYsZssknjfazm5MUi1c8JfVt5hrpBHIOBWPIX03B+rZplYGLue6Z6PvCdEaqU1WnwncUpGFlYExUv/f3B0/DZXjYy9exMwCY9Nqw==",
    "MillisBehindLatest": 0
}

Error Message:

04-12-2016 15:49:12.788 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py" Couldn't process record {SequenceNumber: 49560977364991779866952104588195942057001415028632453122,Data: java.nio.HeapByteBuffer[pos=1 lim=3146 cap=3146],PartitionKey: 4e8cecf660ea97791f347097c85223b3}. Skipping the record.
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py" Malformed data: null
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py" java.nio.charset.MalformedInputException: Input length = 1
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py"
       at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py"
       at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:816)
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py"
       at com.splunk.modinput.kinesis.KinesisModularInput$RecordProcessor.processRecordsWithRetries(Unknown Source)
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py"
       at com.splunk.modinput.kinesis.KinesisModularInput$RecordProcessor.processRecords(Unknown Source)
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py"
       at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ProcessTask.call(ProcessTask.java:125)
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py"
       at com.amazonaws.services.kinesis.clientlibrary.lib.worker.MetricsCollectingTaskDecorator.call(MetricsCollectingTaskDecorator.java:48)
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py"
       at com.amazonaws.services.kinesis.clientlibrary.lib.worker.MetricsCollectingTaskDecorator.call(MetricsCollectingTaskDecorator.java:23)
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py"
       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py"
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py"
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
04-12-2016 15:49:12.820 -0700 ERROR ExecProcessor - message from "python /opt/splunk/etc/apps/kinesis_ta/bin/kinesis.py"
       at java.lang.Thread.run(Thread.java:745)

Jeremiah
Motivator

Yes, this is a problem for any Kinesis stream that is fed from a CloudWatch subscription. Customizing the default message handler won't help because it's expecting text, not compressed data. As an alternative, have you looked at the AWS app? It supports reading VPC Flow Logs directly from CloudWatch.
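
To illustrate why a text decoder chokes on this payload, here is a minimal Java sketch. It assumes a strict UTF-8 decoder; the thread does not say which charset the modular input actually uses, and the class and method names are hypothetical. It decodes the first eight Base64 characters of the "Data" field from the example record, which are the gzip header bytes, and then tries to read them as text.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class, not part of the Kinesis Modular Input.
class GzipCharsetFailureSketch {

    /** Strictly decodes bytes as UTF-8; returns the text or a description of the failure. */
    static String tryDecode(byte[] bytes) {
        // A fresh CharsetDecoder reports malformed input instead of replacing it.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            return e.getClass().getSimpleName() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // "H4sIAAAA" is the start of the Data field in the example record.
        // It decodes to 1f 8b 08 00 00 00; 1f 8b is the gzip magic number.
        byte[] gzipHeader = Base64.getDecoder().decode("H4sIAAAA");

        // 0x8b cannot start a UTF-8 sequence, so strict decoding fails on the second byte.
        System.out.println(tryDecode(gzipHeader));

        // Plain text decodes without complaint.
        System.out.println(tryDecode("plain text".getBytes(StandardCharsets.UTF_8)));
    }
}
```

Under that assumption, the failure on the second byte would be consistent with the `pos=1` shown for the HeapByteBuffer in the error message above.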

Damien_Dallimor
Ultra Champion

A new version is coming imminently that skips the charset decoding and jumps straight to gzip decoding.
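
For anyone wanting to see what "straight to gzip decoding" could look like, below is a rough sketch using only the JDK. It is not the add-on's actual handler code: the class and method names are made up for illustration, and it assumes the record data arrives as a `java.nio.ByteBuffer` (as the error message above suggests) and that the decompressed content is UTF-8 text.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch, not the add-on's GZIPDataRecordDecoderHandler.
class GzipRecordSketch {

    /** Gunzips the remaining bytes of a record buffer, then decodes the result as UTF-8. */
    static String gunzipToString(ByteBuffer data) throws IOException {
        // Read from a duplicate so the caller's buffer position is left untouched.
        ByteBuffer copy = data.duplicate();
        byte[] compressed = new byte[copy.remaining()];
        copy.get(compressed);

        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            // Charset decoding happens only after decompression.
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    /** Demo helper: gzips a string so the example has something to decode. */
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample payload, standing in for a CloudWatch Logs record.
        ByteBuffer record = ByteBuffer.wrap(gzip("{\"sample\":\"payload\"}"));
        System.out.println(gunzipToString(record));
    }
}
```

The key point is the ordering: the bytes are decompressed first and only then treated as text, so the gzip header never reaches the charset decoder.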
