REST API Modular Input: Duplicates when fetching data from Twitter

bkarambelkar
New Member

I'm pulling tweets from a list using the /lists/statuses API:
https://dev.twitter.com/rest/reference/get/lists/statuses

However, the events are being duplicated, and I think this is because I have no way to specify the since_id param in the request URL.
Ideally, each time tweets are fetched, I would store the max(id_str) value somewhere and pass it as a request param on the next invocation.
How can I accomplish this?

Also, does the module support arrays and decompose them into individual events? Currently I'm passing count=1, but ideally I would like to pass count=100 (the max allowed) so that I can pull in more than one tweet per call.
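
To illustrate what decomposing an array into individual events could look like, here is a minimal sketch in plain Python, run outside Splunk. The helper name split_tweets and the sample payload are made up for illustration; they are not part of the REST API Modular Input.

```python
import json

def split_tweets(raw_response):
    """Return one JSON string per tweet from an array response.

    Hypothetical helper for illustration only; not part of the app.
    """
    parsed = json.loads(raw_response)
    # An array is expected here; wrap a lone object so both shapes work.
    if isinstance(parsed, dict):
        parsed = [parsed]
    return [json.dumps(tweet) for tweet in parsed]

# Made-up sample payload shaped like an array of tweets.
sample = '[{"id_str": "101", "text": "first"}, {"id_str": "102", "text": "second"}]'
events = split_tweets(sample)
```

Each element of events would then be written out as its own event, so count=100 would yield up to 100 events per call.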

Damien_Dallimor
Ultra Champion

I think the Twitter response JSON format may have changed since I wrote that response handler.

Try this instead:

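import json

class TwitterEventHandler:
    """Indexes each tweet as its own event and carries since_id to the next poll."""

    def __init__(self, **args):
        pass

    def __call__(self, response_object, raw_response_output, response_type, req_args, endpoint):

        if response_type == "json":
            output = json.loads(raw_response_output)
            params = req_args.setdefault("params", {})
            # Start from the since_id sent with this request so that an
            # empty response does not reset the checkpoint.
            last_tweet_indexed_id = int(params.get("since_id", 0) or 0)
            for twitter_event in output:
                # print_xml_stream is the helper expected in responsehandlers.py
                print_xml_stream(json.dumps(twitter_event))
                if "id_str" in twitter_event:
                    # id_str is a string, so convert it before comparing
                    tweet_id = int(twitter_event["id_str"])
                    if tweet_id > last_tweet_indexed_id:
                        last_tweet_indexed_id = tweet_id

            # Only send since_id once at least one tweet id has been seen
            if last_tweet_indexed_id > 0:
                params["since_id"] = last_tweet_indexed_id

        else:
            print_xml_stream(raw_response_output)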
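
The since_id bookkeeping can also be tried outside Splunk. The following is only a sketch under stated assumptions: next_since_id is a hypothetical helper, the payloads are made up, and params stands in for req_args["params"].

```python
import json

def next_since_id(raw_response, current_since_id=0):
    """Return the highest tweet id seen so far (hypothetical helper)."""
    highest = int(current_since_id or 0)
    for tweet in json.loads(raw_response):
        if "id_str" in tweet:
            # Compare as integers, not strings.
            highest = max(highest, int(tweet["id_str"]))
    return highest

# Stand-in for req_args["params"]; values are made up.
params = {"count": 100}

# First poll returns two tweets; the larger id becomes the checkpoint.
first_poll = '[{"id_str": "1005"}, {"id_str": "1003"}]'
params["since_id"] = next_since_id(first_poll, params.get("since_id", 0))

# A later poll with no new tweets leaves the checkpoint unchanged.
empty_poll = '[]'
params["since_id"] = next_since_id(empty_poll, params["since_id"])
```

The sketch converts id_str to int before comparing, because as strings "9" would sort above "10".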

Damien_Dallimor
Ultra Champion

The App does come with an example custom response handler for Twitter.

Look at TwitterEventHandler in rest_ta/bin/responsehandlers.py

bkarambelkar
New Member

Thanks for the pointer, Damien, but setting the Response Handler to TwitterEventHandler produces no events in the index. With the DefaultEventHandler, at least the index was being populated.

Here are my settings:
Endpoint URL : https://api.twitter.com/1.1/lists/statuses.json
URL Arguments : slug=XXXXXX,owner_screen_name=XXXXXX,count=100
Response Type : json
Response Handler : TwitterEventHandler
Stream Request : Checked
Source Type : From list / _json

Am I missing something? I even checked Index Error Response, but still nothing shows up in the index.
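
One way to narrow this down might be to check what shape the response body actually has before any handler iterates over it. A rough sketch, outside Splunk; describe_response is a hypothetical helper, and the assumption that an error comes back as a JSON object with an "errors" key should be verified against a real response.

```python
import json

def describe_response(raw_response):
    """Summarize what a response body contains (hypothetical helper)."""
    try:
        parsed = json.loads(raw_response)
    except ValueError:
        return "not valid JSON"
    if isinstance(parsed, list):
        return "array of %d tweets" % len(parsed)
    # Assumption: errors arrive as an object with an "errors" key.
    if isinstance(parsed, dict) and "errors" in parsed:
        return "error object: %s" % parsed["errors"]
    return "unexpected JSON shape: %s" % type(parsed).__name__
```

If the body turns out to be an object rather than an array, a handler that loops over it expecting tweets would emit nothing useful.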

Thanks for helping out.
