Using Splunk modular data inputs for the REST API to ingest Twitter data, how do I delete or filter out non-English events?

Engager

I am ingesting a lot of Twitter data for a project, and along with the English tweets I am also picking up Japanese and Hindi ones. I do not want to collect those tweets, so is there a way to limit the collection to English only?

Or is there a way to delete the non-English Twitter data?

I'm using the Splunk modular data inputs for the REST API.

Thanks.


Communicator

Use a filter! Twitter has a fantastic streaming API that you can use with Splunk. This tutorial walks through the setup: http://discoveredintelligence.ca/stream-twitter-splunk-10-simple-steps/

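Add the language request parameter to your endpoint URL so that only tweets in the languages you list are streamed (https://dev.twitter.com/streaming/overview/request-parameters#language). For example, to collect only English tweets: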
https://stream.twitter.com/1.1/statuses/filter.json?track=twitterapi&language=en
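
If you would rather generate the endpoint than type it by hand, here is a minimal Python sketch that builds the same URL. The function name build_filter_url and its arguments are made up for illustration; only the endpoint and the track and language parameters come from the example above.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL above.
FILTER_ENDPOINT = "https://stream.twitter.com/1.1/statuses/filter.json"


def build_filter_url(track, languages=("en",)):
    """Return the filter endpoint with track and language query parameters.

    build_filter_url is a hypothetical helper, not part of Splunk or Twitter.
    Multiple values are joined with commas; urlencode percent-encodes the
    commas as %2C.
    """
    params = {
        "track": ",".join(track),
        "language": ",".join(languages),
    }
    return f"{FILTER_ENDPOINT}?{urlencode(params)}"


url = build_filter_url(["twitterapi"])
print(url)
```

Paste the resulting URL into the endpoint field of your REST input. Passing more than one code in languages, for example ("en", "fr"), should widen the stream to those languages.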

