Using Splunk modular data inputs for the REST API to ingest Twitter data, how do I delete or filter out non-English events?

sunnyd
Engager

I am ingesting a lot of Twitter data for a project, and along with the English tweets I am also picking up Japanese and Hindi ones. I do not want to collect those, so is there a way to limit collection to English only?

Or is there a way to delete the non-English Twitter data?

I'm using the Splunk Modular Data inputs for the REST API.

Thanks.


gwobben
Communicator

Use a filter! Twitter has a fantastic streaming API which you can use with Splunk. Check out this great tutorial: http://discoveredintelligence.ca/stream-twitter-splunk-10-simple-steps/

Add the language request parameter to your endpoint URL (https://dev.twitter.com/streaming/overview/request-parameters#language) so that Twitter only sends tweets it has detected as English. For example:
https://stream.twitter.com/1.1/statuses/filter.json?track=twitterapi&language=en
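
If you would rather build that URL in code than by hand, here is a minimal sketch of how it could look. The helper names build_filter_url and is_english are made up for illustration and are not part of Splunk or the REST API modular input. The second helper assumes each event is a raw JSON tweet carrying the lang field that Twitter attaches to tweet objects, and could serve as a fallback wherever you post-process events before indexing.

```python
import json
from urllib.parse import urlencode

# Endpoint taken from the example above.
STREAM_URL = "https://stream.twitter.com/1.1/statuses/filter.json"


def build_filter_url(track, languages=("en",)):
    """Return the filter endpoint with track and language query parameters.

    Hypothetical helper: both parameters are sent as comma-separated lists.
    """
    params = {"track": ",".join(track), "language": ",".join(languages)}
    # safe="," keeps the comma separators literal instead of encoding them as %2C
    return f"{STREAM_URL}?{urlencode(params, safe=',')}"


def is_english(raw_event):
    """Return True if a raw JSON tweet has lang == "en" (hypothetical fallback filter)."""
    try:
        tweet = json.loads(raw_event)
    except ValueError:
        # Not valid JSON, so it cannot be confirmed as English; drop it.
        return False
    return isinstance(tweet, dict) and tweet.get("lang") == "en"


print(build_filter_url(["twitterapi"]))
```

Filtering at the endpoint is the better option of the two, since the non-English tweets never reach Splunk in the first place. The is_english check only helps for data that still slips through.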

