REST API Modular Input: Duplicates when fetching data from Twitter

bkarambelkar
New Member

I'm pulling tweets from a list using the /lists/statuses API (https://dev.twitter.com/rest/reference/get/lists/statuses).

However, the events are being duplicated, and I think this is because I have no way to specify the since_id param in the request URL.
Ideally, when tweets are fetched, I would like to store the max(id_str) value somewhere and pass it as a request param on the next invocation.
How can I accomplish this?

Also, does the module support arrays and split them into individual events? Currently I'm passing count=1, but ideally I would like to pass count=100 (the max allowed) so that I can pull in more than one tweet per call.

0 Karma

Damien_Dallimor
Ultra Champion

I think that the Twitter response JSON format may have changed since I wrote that response handler.

Try this instead:

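import json

# print_xml_stream is assumed to be available in responsehandlers.py,
# which is where this class is meant to live.

class TwitterEventHandler:

    def __init__(self, **args):
        pass

    def __call__(self, response_object, raw_response_output, response_type, req_args, endpoint):

        if response_type == "json":
            output = json.loads(raw_response_output)
            last_tweet_indexed_id = 0
            # The response is a JSON array; emit each tweet as its own event.
            for twitter_event in output:
                print_xml_stream(json.dumps(twitter_event))
                if "id_str" in twitter_event:
                    # id_str is a string, so convert it before comparing.
                    tweet_id = int(twitter_event["id_str"])
                    if tweet_id > last_tweet_indexed_id:
                        last_tweet_indexed_id = tweet_id

            # Only update since_id when at least one tweet ID was seen,
            # so an empty response does not reset it to 0.
            if last_tweet_indexed_id > 0:
                if "params" not in req_args:
                    req_args["params"] = {}
                req_args["params"]["since_id"] = last_tweet_indexed_id

        else:
            print_xml_stream(raw_response_output)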
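
If you want to check the array splitting and the max ID selection outside of Splunk, here is a minimal standalone sketch. The function name split_tweets and the sample payload are made up for illustration and are not part of the app. It compares IDs as integers, because comparing them as strings would rank "99" above "100".

```python
import json


def split_tweets(raw_response_output, previous_since_id=None):
    """Split a JSON array of tweets into events and find the highest ID.

    Hypothetical helper for testing only; it is not part of the app.
    Returns (events, since_id), where since_id is a string or None.
    """
    tweets = json.loads(raw_response_output)
    events = []
    # Start from the previous since_id so an empty response keeps it.
    max_id = int(previous_since_id) if previous_since_id else 0
    for tweet in tweets:
        # Each array element becomes its own event.
        events.append(json.dumps(tweet))
        if "id_str" in tweet:
            # Integer comparison; string comparison would be lexicographic.
            max_id = max(max_id, int(tweet["id_str"]))
    since_id = str(max_id) if max_id > 0 else None
    return events, since_id


# Made-up sample payload shaped like a list of tweets.
sample = json.dumps([
    {"id_str": "99", "text": "older"},
    {"id_str": "100", "text": "newer"},
])
events, since_id = split_tweets(sample)
print(len(events), since_id)
```

With count=100 the handler would see up to 100 elements in the array, and each one would be emitted as a separate event in the same way.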
0 Karma

Damien_Dallimor
Ultra Champion

The App does come with an example custom response handler for Twitter.

Look at TwitterEventHandler in rest_ta/bin/responsehandlers.py

0 Karma

bkarambelkar
New Member

Thanks for the pointer, Damien, but setting the Response Handler to TwitterEventHandler produces no events in the index. At least with the DefaultEventHandler the index was being populated.

Here are my settings:
Endpoint URL : https://api.twitter.com/1.1/lists/statuses.json
URL Arguments : slug=XXXXXX,owner_screen_name=XXXXXX,count=100
Response Type : json
Response Handler : TwitterEventHandler
Stream Request : Checked
Source Type : From list / _json

Am I missing something? I even checked Index Error Response, but nothing shows up in the index.

Thanks for helping out.

0 Karma