
Real-time REST search stops returning new data

Path Finder

I'm trying to build an external visualization of real-time Splunk data using D3. I'm using Node to proxy the search results, manage the search lifetimes, and deliver the results to the browser using Socket.IO. So far all of this is working great, but the visualizations I'm doing pull a lot of events. At around event count 0 (the default limit) I run out of new data and, because I'm using a negative offset value, the search starts returning the same data over and over.

I know that I can extend the event count, but eventually I'll run into the same problem, and then I'm just eating space on the server for no reason. Is there a way to clear the prior buffer or in some way keep the search streaming? I'm less concerned about dropped events (I'm sipping from a firehose) than I am about keeping the data coming.


Re: Real-time REST search stops returning new data

Splunk Employee

Need more information. Are you using a GET or a POST? Are you connecting to /services/search/jobs/export? What are your get/post args?


Re: Real-time REST search stops returning new data

Path Finder

On the POST to create the search I'm using:

{search: searchstring, search_mode: 'realtime', earliest_time: 'rt', latest_time: 'rt', rt_blocking: 'false', auto_cancel: 120}
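
For anyone following along, here is a rough sketch of how those arguments could be form-encoded in Node before sending the POST. The helper name `buildCreateJobBody` is made up for illustration, and the parameter names use the Splunk REST API spelling with underscores (`search_mode`, `earliest_time`, `latest_time`, `rt_blocking`, `auto_cancel`). It only builds the request body and does not make any HTTP call.

```javascript
// Hypothetical helper (not part of any Splunk SDK): form-encode the
// arguments used to create a real-time search job.
function buildCreateJobBody(searchString) {
  const params = new URLSearchParams({
    search: searchString,
    search_mode: 'realtime',
    earliest_time: 'rt',
    latest_time: 'rt',
    rt_blocking: 'false',
    auto_cancel: '120', // seconds of inactivity before the job is cancelled
  });
  // URLSearchParams serializes as application/x-www-form-urlencoded.
  return params.toString();
}

const body = buildCreateJobBody('search index=main');
console.log(body);
```

The resulting string can be passed as the body of whatever HTTP client is doing the POST, with a `Content-Type` of `application/x-www-form-urlencoded`.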


Re: Real-time REST search stops returning new data

Path Finder

For the GET that retrieves results, I ended up splunking the requests made by a real-time search in the GUI to see what the GUI is doing. I replaced xml with json and ended up with:

count: 0,
segmentation: 'full',
output_mode: 'json',
time_format: '%25s.%25Q',
max_lines: 10,
show_empty_fields: 'True',
offset: -100,
output_time_format: '%25Y-%25m-%25dT%25H%3A%25M%3A%25S.%25Q%25z',
field_list: '',
truncation_mode: 'abstract'


Re: Real-time REST search stops returning new data

Splunk Employee

I'm still not clear on what endpoint you are posting to.

In general, the best approach would be to GET /services/search/jobs/export, which will async stream the results to you. GET is more appropriate than POST in this case since the former will create minimal artifacts on the Splunk server. Can you try that?

Regarding what the UI does, you should not need to specify auto_cancel, count, segmentation, max_lines, show_empty_fields, offset, output_time_format, or truncation_mode if you use the above method. You will need to specify field_list, search, earliest_time, latest_time, and remote_server_list.
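
As a minimal sketch of the request described above, the URL for the GET could be put together like this in Node. Several things here are assumptions rather than anything confirmed in this thread: the helper name `buildExportUrl`, the base URL `https://localhost:8089`, the `output_mode` value of `json`, and the comma-separated format for `field_list` and `remote_server_list`. The code only builds the URL and does not open a connection.

```javascript
// Hypothetical helper: build the URL for a streaming GET against
// /services/search/jobs/export.
function buildExportUrl(baseUrl, options) {
  const url = new URL('/services/search/jobs/export', baseUrl);
  const params = {
    search: options.search,
    earliest_time: options.earliestTime || 'rt',
    latest_time: options.latestTime || 'rt',
    output_mode: 'json', // assumption: JSON output is wanted
  };
  // Assumption: both list parameters take comma-separated values.
  if (options.fieldList && options.fieldList.length > 0) {
    params.field_list = options.fieldList.join(',');
  }
  if (options.remoteServerList && options.remoteServerList.length > 0) {
    params.remote_server_list = options.remoteServerList.join(',');
  }
  for (const [name, value] of Object.entries(params)) {
    url.searchParams.set(name, value);
  }
  return url.toString();
}

const exportUrl = buildExportUrl('https://localhost:8089', {
  search: 'search index=main',
  fieldList: ['host', 'source'],
});
console.log(exportUrl);
```

Authentication and the long-lived response handling are left out on purpose, since they depend on the HTTP client in use.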


Re: Real-time REST search stops returning new data

Path Finder

Sorry about forgetting to add the endpoints. The search is kicked off with a POST to /search, and then I'm retrieving events with GETs to /search/{job no}/events. I'll try using jobs/export to see if I can get farther that way.


Re: Real-time REST search stops returning new data

Splunk Employee

No need to be sorry, it is kind of confusing at first. I have several apps that use GET /services/search/jobs/export for real-time searches. If you want an in-product example, check out how our export functionality works (streamingRequest() in $SPLUNK_HOME/lib/python2.7/site-packages/splunk/rest/__init__.py).


Re: Real-time REST search stops returning new data

Splunk Employee

zenmoto, would you mind getting in touch with me at ineeman@splunk.com? I have a sample that does exactly what you're trying to do, and I'd love to work through your issues with you. I'll post any resolutions here.

One thing to note is that the Node.js HTTP client does not work with Splunk's search/jobs/export endpoint, because Splunk returns responses that are not HTTP-compliant and Node closes the socket (at least in 4.2 and 4.3).


Re: Real-time REST search stops returning new data

Path Finder

I wasn't able to work on this for a couple of weeks, but I got back to it with fresh eyes. The problem was that I had specified an earliest time and a latest time of 'rt' on the POST, but I wasn't specifying anything on the GET that retrieves events. After specifying '-1m' as the earliest time and 'now' as the latest time on the GET, I started to get the behavior I had originally expected. I suspect that Araitz is right about /export being the better endpoint to use. I had been leery of it because I wanted to use the JSON output format and wanted each read to be parseable. This connection also has the potential to be long-running, so I was concerned about leaving it open, but in retrospect that would probably be quite a bit more efficient overall. On the next iteration I will likely try using the /export endpoint instead.
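
In case it helps someone else, the fix on the GET side looks roughly like the sketch below. The helper name `buildEventsQuery` is invented for illustration, and the defaults other than the two time bounds are simply the values I listed earlier in the thread. It only builds the query string to append to the events request.

```javascript
// Hypothetical helper: build the query string for the GET that polls
// events from a running real-time job. The important part is passing
// explicit earliest_time / latest_time bounds on every poll.
function buildEventsQuery(overrides = {}) {
  const defaults = {
    output_mode: 'json',
    count: 0,
    offset: -100,
    earliest_time: '-1m', // only events from the last minute
    latest_time: 'now',
  };
  const merged = { ...defaults, ...overrides };
  // URLSearchParams expects string values, so convert numbers explicitly.
  const pairs = Object.entries(merged).map(([name, value]) => [name, String(value)]);
  return new URLSearchParams(pairs).toString();
}

console.log(buildEventsQuery());
console.log(buildEventsQuery({ earliest_time: '-5m' }));
```

Widening or narrowing the window is then a matter of overriding `earliest_time` for a given poll.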
