REST API Modular Input: Duplicates when fetching data from Twitter

bkarambelkar
New Member

I'm pulling tweets from a list using the /lists/statuses API:
https://dev.twitter.com/rest/reference/get/lists/statuses

But the events are being duplicated, and I think this is because there is no way for me to specify the since_id param in the request URL.
Ideally, when tweets are fetched, I would like to store the max(id_str) value somewhere and pass it as a request param to the next invocation.
How can I accomplish this?

Also, does the module support arrays, i.e. does it split a JSON array into individual events? Currently I'm passing count=1, but ideally I would like to pass count=100 (the max allowed) so that I can pull in more than one tweet per call.


Damien_Dallimor
Ultra Champion

I think Twitter's JSON response format may have changed since I wrote that response handler.

Try this instead :

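import json

class TwitterEventHandler:

    def __init__(self, **args):
        pass

    def __call__(self, response_object, raw_response_output, response_type, req_args, endpoint):

        if response_type == "json":
            # /lists/statuses returns a JSON array of tweets
            output = json.loads(raw_response_output)
            last_tweet_indexed_id = 0
            for twitter_event in output:
                # emit each tweet as its own event
                # (print_xml_stream is the helper used by the handlers in responsehandlers.py)
                print_xml_stream(json.dumps(twitter_event))
                if "id_str" in twitter_event:
                    # id_str is a string, so convert it before comparing
                    tweet_id = int(twitter_event["id_str"])
                    if tweet_id > last_tweet_indexed_id:
                        last_tweet_indexed_id = tweet_id

            # only move since_id forward when this response contained tweets
            if last_tweet_indexed_id > 0:
                if "params" not in req_args:
                    req_args["params"] = {}
                req_args["params"]["since_id"] = last_tweet_indexed_id

        else:
            print_xml_stream(raw_response_output)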
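
To check the since_id logic outside Splunk, here is a minimal standalone sketch of the same idea. The function name next_since_id and the sample payload are made up for illustration and are not part of the app. It assumes the response is a JSON array of tweets that each carry an id_str field, and it compares IDs as integers, because comparing the strings directly can pick the wrong maximum ("99" sorts after "100").

```python
import json


def next_since_id(raw_json, previous_since_id=0):
    """Return the largest tweet ID found in a JSON array of tweets.

    Hypothetical helper for illustration only. Falls back to
    previous_since_id when the array holds no newer tweets.
    """
    tweets = json.loads(raw_json)
    max_id = int(previous_since_id)
    for tweet in tweets:
        # skip anything that is not a tweet object with an id_str field
        if isinstance(tweet, dict) and "id_str" in tweet:
            # convert to int so the comparison is numeric, not lexicographic
            max_id = max(max_id, int(tweet["id_str"]))
    return max_id


# made-up sample payload: two tweets whose IDs differ in length
sample = json.dumps([
    {"id_str": "99", "text": "older"},
    {"id_str": "100", "text": "newer"},
])

print(next_since_id(sample))
```

An empty response leaves the previous value untouched, so a poll that returns no new tweets does not reset since_id.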

Damien_Dallimor
Ultra Champion

The App does come with an example custom response handler for Twitter.

Look at TwitterEventHandler in rest_ta/bin/responsehandlers.py


bkarambelkar
New Member

Thanks for the pointer, Damien, but setting the Response Handler to TwitterEventHandler produces no events in the index. With the DefaultEventHandler, at least, the index was being populated.

Here are my settings:
Endpoint URL : https://api.twitter.com/1.1/lists/statuses.json
URL Arguments : slug=XXXXXX,owner_screen_name=XXXXXX,count=100
Response Type : json
Response Handler : TwitterEventHandler
Stream Request : Checked
Source Type : From list / _json

Am I missing something? I even checked Index Error Response, but still nothing shows up in the index.

Thanks for helping out.
