Dashboards & Visualizations

Real-time REST search stops returning new data

zenmoto
Path Finder

I'm trying to build an external visualization of real-time data in Splunk using D3. I'm using Node to proxy the search results, manage the search lifetimes, and deliver them to the browser via Socket.IO. So far all of this is working great, but the visualizations I'm doing pull a lot of events. At around event count 0 (the default limit) I run out of new data, and (using a negative offset value) the search starts returning the same data over and over.

I know that I can extend the event count, but eventually I'll run into the same problem, and then I'm just eating space on the server without reason. Is there a way to clear the prior buffer or in some way keep the search streaming? I'm less concerned about dropped events (I'm sipping from a firehose) than I am keeping the data coming.

1 Solution

araitz
Splunk Employee

I'm still not clear on what endpoint you are posting to.

In general, the best approach would be to GET /services/search/jobs/export, which will asynchronously stream the results to you. GET is more appropriate than POST in this case, since the former creates minimal artifacts on the Splunk server. Can you try that?

Regarding what the UI does, you should not need to specify auto_cancel, count, segmentation, max_lines, show_empty_fields, offset, output_time_format, or truncation_mode if you use the above method. You will need to specify field_list, search, earliest_time, latest_time, and remote_server_list.

zenmoto
Path Finder

I wasn't able to work on this for a couple of weeks but got back to it with fresh eyes. The problem was that I had specified a from time and a to time of 'rt' on the POST, but I wasn't specifying anything on the GET to retrieve events. After specifying '-1m' as the earliest and 'now' as the latest on the GET, I started to get the behavior I had originally expected.

I suspect araitz is right that /export is the better endpoint to use. I had been leery of it because I wanted to use the JSON output format and wanted each read to be parseable. The connection also has the potential to be long-running, so I was concerned about leaving it open, but in retrospect that would probably be quite a bit more efficient overall. On the next iteration I will likely try the /export endpoint instead.

ineeman
Splunk Employee

zenmoto, would you mind getting in touch with me at ineeman@splunk.com? I have a sample that does exactly what you're trying to do, and I'd love to work through your issues with you. I'll post any resolutions here.

One thing to note is that the Node.js HTTP client does not work with Splunk's search/jobs/export endpoint, because Splunk returns non-HTTP-compliant responses and Node closes the socket (at least in 4.2 and 4.3).

araitz
Splunk Employee

No need to be sorry, it is kind of confusing at first. I have several apps that use GET /services/search/jobs/export? for real-time searches. If you want an in-product example, check out how our export functionality works (streamingRequest() in $SPLUNK_HOME/lib/python2.7/site-packages/splunk/rest/__init__.py).

zenmoto
Path Finder

Sorry about forgetting to add the endpoints: the search is kicked off with a POST to /search, and then I'm retrieving events with GETs to /search/{job no}/events. I'll try jobs/export to see if I can get farther that way.

zenmoto
Path Finder

For the GET to retrieve results, I ended up splunking the requests the GUI makes during a real-time search to see what it's doing. I replaced xml with json and ended up with:

{
    count: 0,
    segmentation: 'full',
    output_mode: 'json',
    time_format: '%25s.%25Q',
    max_lines: 10,
    show_empty_fields: 'True',
    offset: -100,
    output_time_format: '%25Y-%25m-%25dT%25H%3A%25M%3A%25S.%25Q%25z',
    field_list: '',
    truncation_mode: 'abstract'
}

zenmoto
Path Finder

On the POST to create the search I'm using:

{search: search_string, searchmode: 'realtime', earliest_time: 'rt', latest_time: 'rt', rt_blocking: 'false', auto_cancel: 120}

araitz
Splunk Employee

I need more information. Are you using a GET or a POST? Are you connecting to /services/search/jobs/export? What are your GET/POST arguments?
