All Apps and Add-ons

REST API Modular Input: Duplicates when fetching data from Twitter

bkarambelkar
New Member

I'll pulling Tweets from a list using the /lists/statuses API
https://dev.twitter.com/rest/reference/get/lists/statuses.

But the events are being duplicated, and I think this is because there is no way for me to specify the since_id param in the request URL.
What I would ideally like to do is, when tweets are fetched, store the max(id_str) value somewhere and pass it as a request param to the next invocation.
How can I accomplish this ?

Also does the module support ARRAYs and decomposes individual events from the Array ? Currently I'm passing count=1 argument, but ideally I would like to pass in count=100 (the max allowed) so as to be able to pull in more than 1 tweet per call.

0 Karma

Damien_Dallimor
Ultra Champion

I think that the twitter response json format may have changed since I wrote that response handler.

Try this instead :

class TwitterEventHandler:

    def __init__(self,**args):
        pass

    def __call__(self, response_object,raw_response_output,response_type,req_args,endpoint):       

        if response_type == "json":        
            output = json.loads(raw_response_output)
            last_tweet_indexed_id = 0
            for twitter_event in output:
                print_xml_stream(json.dumps(twitter_event))
                if "id_str" in twitter_event:
                    tweet_id = twitter_event["id_str"]
                    if tweet_id > last_tweet_indexed_id:
                        last_tweet_indexed_id = tweet_id

            if not "params" in req_args:
                req_args["params"] = {}

            req_args["params"]["since_id"] = last_tweet_indexed_id

        else:
            print_xml_stream(raw_response_output)
0 Karma

Damien_Dallimor
Ultra Champion

The App does come with an example custom response handler for Twitter.

Look at TwitterEventHandler in rest_ta/bin/responsehandlers.py

0 Karma

bkarambelkar
New Member

Thanks for the pointer Damien, But Setting the ResponseHandler to TwitterEventHandler produces no events in the index. At least with the DefaultEventHandler I was getting the index to populate.

Here are my settings
Endpoint URL : https://api.twitter.com/1.1/lists/statuses.json
URL Arguments : slug=XXXXXX,owner_screen_name=XXXXXX,count=100
Response Type : json
Response Handler : TwitterEventHandler
Stream Request : Checked
Source Type : From list / _json

Am I missing something ? I even checked Index Error Response, but nothing in the index.

Thanks for helping out.

0 Karma
Get Updates on the Splunk Community!

New in Observability - Improvements to Custom Metrics SLOs, Log Observer Connect & ...

The latest enhancements to the Splunk observability portfolio deliver improved SLO management accuracy, better ...

Improve Data Pipelines Using Splunk Data Management

  Register Now   This Tech Talk will explore the pipeline management offerings Edge Processor and Ingest ...

3-2-1 Go! How Fast Can You Debug Microservices with Observability Cloud?

Register Join this Tech Talk to learn how unique features like Service Centric Views, Tag Spotlight, and ...