Solved: Inconsistency between Splunk api vs GUI search res...

user121 · ‎05-31-2011

Inconsistency between Splunk api vs GUI search results.
I am using the REST API. When I submit a search string through the REST API, the search job endpoint reports a number of results and matching events (resultCount, eventCount) once isDone is set. But when I run the exact same search string manually in the GUI, I get a different number of matching events.
For example, "search earliest=xxx latest=yyy sourcetype=zzz" returns 100,000 matching events through the REST API, but the same search string (without the 'search' keyword) returns 300,000 matching events in the GUI. The difference is big. I am not specifying any other search options; I am 100% sure of that.

Does anybody know why there is such a difference for the same search between REST and the GUI?

Thanks

sideview · ‎05-31-2011

UPDATE:

In the end it's both quite simple and confusing. When you're using the REST API and you're interested in the count of events and nothing more, you have to tack " | stats count" onto the end of your search. When the job is done, hit the /results endpoint and retrieve the value of the count field. Although the 'eventCount' property on the job looks like what you want, it will NOT BE ACCURATE. Once the job passes 100,000 events, and the search was submitted with the default of status_buckets=0, Splunk decides there is no point in continuing to run the search, so it 'finalizes' it. Yes, you might argue that eventCount proceeding towards an accurate number amounts to meaningful progress, so why not continue the search anyway. I guess the official answer is that the properties on the job are really just meant to be internal debugging stuff, and for canonical answers you should use the appropriate search language and get field values from the /results endpoint.
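The two steps above (append " | stats count", then read the count field from /results) can be sketched in Python roughly as follows. This is only an illustration: the helper names are made up, the sample response body is hypothetical, and the JSON shape of the /results payload is an assumption that may differ between Splunk versions, so check what your server actually returns.

```python
import json


def with_stats_count(search: str) -> str:
    """Append ' | stats count' so the job returns one row holding the total."""
    return search.rstrip() + " | stats count"


def extract_count(results_body: str) -> int:
    """Read the 'count' field from the first row of a JSON /results payload.

    Assumption: the payload is either a bare list of rows or an object
    with a 'results' list. Verify this against your Splunk version.
    """
    payload = json.loads(results_body)
    rows = payload["results"] if isinstance(payload, dict) else payload
    return int(rows[0]["count"])


# The search string to submit when dispatching the job.
query = with_stats_count('search sourcetype="bankapp"')

# Hypothetical /results body, standing in for a real HTTP response.
sample_body = '{"results": [{"count": "375218"}]}'
total = extract_count(sample_body)
```

The point of the design is that the total comes from a result row produced by the search language, not from a property on the job.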

Anyway, the reason the same search does not quietly autofinalize in the flashtimeline view when it passes 100,000 events is that the UI submits the search with status_buckets=300. Whenever status_buckets is greater than 0, Splunk has to summarize the field results (into at least one bucket), so it doesn't let the search self-finalize; instead the search runs to completion so that the summaries it's building will be accurate.
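As a rough sanity check, you could flag any job whose eventCount should not be trusted under the rule described above. This is a minimal sketch: the function name is made up, the 100,000 figure is simply the number quoted in this thread (not a documented constant), and the sample values are the statusBuckets and eventCount properties from the two jobs posted further down in this thread.

```python
# Figure quoted in this thread; treat it as an assumption, not a documented limit.
AUTO_FINALIZE_THRESHOLD = 100_000


def event_count_is_suspect(job: dict) -> bool:
    """Return True when eventCount may stop short of the real total.

    Per the explanation above, that is the case when the job was
    dispatched with status_buckets=0 and has passed the threshold.
    """
    status_buckets = int(job.get("statusBuckets", 0))
    event_count = int(job.get("eventCount", 0))
    return status_buckets == 0 and event_count >= AUTO_FINALIZE_THRESHOLD


# Property values copied from the two jobs in this thread.
api_job = {"statusBuckets": "0", "eventCount": "110902"}
gui_job = {"statusBuckets": "300", "eventCount": "375218"}

api_suspect = event_count_is_suspect(api_job)
gui_suspect = event_count_is_suspect(gui_job)
```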

ORIGINAL ANSWER:

There definitely shouldn't be a difference in the results. But there must be a difference in the arguments being used at some level, because the UI itself uses the REST API to dispatch its searches.

Unfortunately it's a POST that kicks off the jobs, and the POST arguments don't show up in the access log. Otherwise troubleshooting would be very simple: we could just look in your splunkd_access log and read the arguments for ourselves.

I don't have any answers but I have more questions. 😃

Are either the first events or the last events the same in both search results?

Maybe somehow the timerange is being interpreted differently. If you go to 'inspect search job' in the UI or hit the jobs endpoint in the REST API, both jobs will have properties called earliestTime and latestTime. These represent the absolute-time equivalents of the time arguments you specified. Check that they are the same. Incidentally, it's not best practice to set earliest and latest in the search string when you're using the REST API. You can use the earliest_time and latest_time API args instead.
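To make that concrete, here is a minimal sketch of building the POST body with the time range passed as separate arguments, plus a check that two jobs resolved to the same absolute range. The argument names earliest_time and latest_time match the 'request' property visible in the UI's job inspector; the helper names are made up, and the ISO-8601 timestamp format used here is an assumption you should verify against your Splunk version.

```python
from urllib.parse import urlencode


def build_dispatch_body(search: str, earliest_time: str, latest_time: str) -> str:
    """URL-encode the POST body for dispatching a search job.

    The time range travels as separate arguments instead of being
    embedded in the search string.
    """
    return urlencode({
        "search": search,
        "earliest_time": earliest_time,
        "latest_time": latest_time,
    })


def same_time_range(job_a: dict, job_b: dict) -> bool:
    """Compare the absolute earliestTime/latestTime properties of two jobs."""
    return (job_a["earliestTime"], job_a["latestTime"]) == (
        job_b["earliestTime"],
        job_b["latestTime"],
    )


body = build_dispatch_body(
    'search sourcetype="bankapp"',
    "2011-05-30T02:30:00.000+00:00",  # assumed format, see note above
    "2011-05-30T06:00:00.000+00:00",
)
```

If same_time_range returns False for the API job and the UI job, the time arguments are being interpreted differently and that is the first thing to fix.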

How long do the searches take to complete? It's possible that somehow a lower default threshold is being set to auto_finalize the search in the API.

Is there anything special about that sourcetype? Was this sourcetype ever renamed? Does it happen with other sourcetypes as well?

Incidentally how are you determining the eventCount for both searches?

user121 · ‎06-03-2011

THANKS A LOT!!
Just adding status_buckets=<integer> to the POST parameters of my API search solved the problem! 🙂

sideview · ‎06-03-2011

This is just summarizing my answer: add "| stats count" onto the end of your REST search, and then when the job has finished, make a separate request to the /results endpoint and retrieve the value of the 'count' field from the first row. I know it seems complicated. The other way is to submit your search with status_buckets set to 1, but then the search will run MUCH slower and do tons of work that you don't need. It's FAR better to take the step into the world of the search language and start with " | stats count".

user121 · ‎06-01-2011

Hello Nick, thanks for the reply.
I am adding the job inspections of both searches in case they give us any clues, one from the API and the other from the GUI. I don't see any differences in the search string; the only difference is in the providers, and I don't understand why it would use different sources if the search is run on a single platform. Anyway, the REST API search has more sources and fewer results (110k), and the GUI search has fewer sources but more results (375k). The username doesn't matter; it can be one user or different users, and all get the same result. And no, the sourcetype was never changed, and the timezones are also the same. The API finishes the search relatively quickly (less than 30 seconds) compared to the GUI (about a minute).

Thanks!

GUI Search -

  `Search job properties

createTime  2011-06-01T07:01:16.000+00:00
cursorTime  2011-05-30T02:30:00.000+00:00
delegate    None
diskUsage   0
doneProgress    1.0
dropCount   0
eai:acl {'sharing': 'global', 'perms': {'read': ['user1'], 'write': ['user1']}, 'app': 'search', 'modifiable': 'true', 'can_write': 'true', 'owner': 'user1'}
earliestTime    2011-05-30T02:30:00.000+00:00
eventAvailableCount 10000
eventCount  375218
eventFieldCount 26
eventIsStreaming    True
eventIsTruncated    False
eventSearch search sourcetype="bankapp" earliest=05/30/2011:02:30:00 latest=05/30/2011:06:00:00
eventSorting    desc
isDone  True
isFailed    False
isFinalized False
isPaused    False
isPreviewEnabled    1
isRealTimeSearch    False
isSaved False
isSavedSearch   False
isZombie    False
keywords    earliest::05/30/2011:02:30:00 latest::05/30/2011:06:00:00 sourcetype::bankapp
label   None
latestTime  2011-05-30T06:00:00.000+00:00
messages    {'info': ['Your timerange was substituted based on your search string', '[splunk-tx-a1p] Your timerange was substituted based on your search string', '[splunk-tx-a2p] Your timerange was substituted based on your search string', '[splunk-tx-a3p] Your timerange was substituted based on your search string', '[splunk-nc-a2p] Your timerange was substituted based on your search string', '[splunk-nc-a3p] Your timerange was substituted based on your search string'], 'warn': ['Unable to distribute to peer named splunk-nc-a1p:8089 at uri https://splunk-nc-a1p:8089 because peer has status = "Down".']}
modifiedTime    2011-06-01T07:18:56.000+00:00
performance {'dispatch.fetch': {'duration_secs': '20.058', 'invocations': '102'}, 'command.search.typer': {'duration_secs': '0.001', 'output_count': '0', 'input_count': '0', 'invocations': '1'}, 'dispatch.timeline': {'duration_secs': '47.979', 'invocations': '102'}, 'command.search.index': {'duration_secs': '0.001', 'invocations': '1'}, 'dispatch.preview': {'duration_secs': '0.101', 'invocations': '101'}, 'command.search.tags': {'duration_secs': '0.001', 'output_count': '0', 'input_count': '0', 'invocations': '1'}, 'command.search.filter': {'duration_secs': '0.001', 'invocations': '1'}, 'command.fields': {'duration_secs': '0.001', 'output_count': '0', 'input_count': '0', 'invocations': '1'}, 'command.search': {'duration_secs': '0.002', 'output_count': '0', 'input_count': '0', 'invocations': '2'}}
priority    5
remoteSearch    litsearch ( "sourcetype::bankapp" ) _time>=1306722600.000 _time<1306735200.000 | litsearch sourcetype="bankapp" _time>=1306722600.000 _time<1306735200.000 | fields keepcolorder=t * "*" "host" "index" "source" "sourcetype" "splunk_server"
reportSearch    None
request {'time_format': '%s.%Q', 'search': 'search sourcetype="bankapp" earliest=05/30/2011:02:30:00 latest=05/30/2011:06:00:00', 'required_field_list': '*', 'max_count': '10000', 'ui_dispatch_app': 'search', 'latest_time': None, 'status_buckets': '300', 'ui_dispatch_view': 'flashtimeline', 'earliest_time': None, 'auto_cancel': '100'}
resultCount 10000
resultIsStreaming   True
resultPreviewCount  10000
runDuration 73.526
scanCount   375218
search  search sourcetype="bankapp" earliest=05/30/2011:02:30:00 latest=05/30/2011:06:00:00
searchEarliestTime  1306722600.000000000
searchLatestTime    1306735200.000000000
searchProviders ['splunk-tx-a1p', 'splunk-tx-a2p', 'splunk-tx-a3p', 'splunk-nc-a2p', 'splunk-nc-a3p', 'splunkn-nc-a1p']
sid 1306911674.727
statusBuckets   300
ttl 555
Server info: Splunk 4.1.3, splunksearch, Wed Jun 1 07:19:41 2011; User: user1`

Rest API search -
Splunk Atom Feed: search sourcetype="bankapp" earliest=05/30/2011:02:30:00 latest=05/30/2011:06:00:00
Updated: 2011-06-01T06:49:28.000+00:00
Splunk build: search sourcetype="bankapp" earliest=05/30/2011:02:30:00 latest=05/30/2011:06:00:00

cursorTime  1970-01-01T00:00:00.000+00:00
delegate
diskUsage   0
doneProgress    1.00000
dropCount   0
eai:acl app search can_write true modifiable true owner user3 perms read user3 write user3 sharing global
earliestTime    2011-05-30T02:30:00.000+00:00
eventAvailableCount 110902
eventCount  110902
eventFieldCount 0
eventIsStreaming    1
eventIsTruncated    0
eventSearch search sourcetype="bankapp" earliest=05/30/2011:02:30:00 latest=05/30/2011:06:00:00
eventSorting    desc
isDone  1
isFailed    0
isFinalized 0
isPaused    0
isPreviewEnabled    0
isRealTimeSearch    0
isSaved 0
isSavedSearch   0
isZombie    0
keywords    earliest::05/30/2011:02:30:00 latest::05/30/2011:06:00:00 sourcetype::bankapp
label
latestTime  2011-05-30T06:00:00.000+00:00
messages    info Your timerange was substituted based on your search string [splunk-nc-a1p] Your timerange was substituted based on your search string [splunk-nc-a2p] Your timerange was substituted based on your search string [splunk-nc-a3p] Your timerange was substituted based on your search string [splunk-tx-a1p] Your timerange was substituted based on your search string [splunk-tx-a2p] Your timerange was substituted based on your search string [splunk-tx-a3p] Your timerange was substituted based on your search string
performance command.fields duration_secs 0.001 input_count 0 invocations 1 output_count 0 command.search duration_secs 0.002 input_count 0 invocations 2 output_count 0 command.search.filter duration_secs 0.001 invocations 1 command.search.index duration_secs 0.001 invocations 1 command.search.tags duration_secs 0.001 input_count 0 invocations 1 output_count 0 command.search.typer duration_secs 0.001 input_count 0 invocations 1 output_count 0 dispatch.fetch duration_secs 5.373 invocations 71 dispatch.timeline duration_secs 3.267 invocations 71
priority    5
remoteSearch    litsearch ( "sourcetype::bankapp" ) _time>=1306722600.000 _time<1306735200.000 | litsearch sourcetype="bankapp" _time>=1306722600.000 _time<1306735200.000 | fields keepcolorder=t "host" "index" "source" "sourcetype" "splunk_server"
reportSearch
request search search sourcetype="bankapp" earliest=05/30/2011:02:30:00 latest=05/30/2011:06:00:00
resultCount 110902
resultIsStreaming   1
resultPreviewCount  110902
runDuration 17.015000
scanCount   110902
searchEarliestTime  1306722600.000000000
searchLatestTime    1306735200.000000000
searchProviders splunk-nc-a1p splunk-nc-a2p splunk-nc-a3p splunk-tx-a1p splunk-tx-a2p splunk-tx-a3p splunkn-tx-a1p
sid 1306910951.708
statusBuckets   0
ttl 574

events - results - results_preview - timeline - summary - control: 2011-06-01T06:49:28.000+00:00 | user3
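
Comparing two property dumps like these by eye is error-prone, so a small diff helper can make the differing properties stand out. This is only a sketch: the function name is made up, and the dictionaries hold a handful of values copied by hand from the two jobs above rather than anything fetched from Splunk.

```python
def diff_properties(a: dict, b: dict) -> dict:
    """Return {key: (value_in_a, value_in_b)} for every key whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# A few properties copied from the GUI job and the REST API job above.
gui_job = {
    "earliestTime": "2011-05-30T02:30:00.000+00:00",
    "latestTime": "2011-05-30T06:00:00.000+00:00",
    "eventCount": "375218",
    "scanCount": "375218",
    "statusBuckets": "300",
}
api_job = {
    "earliestTime": "2011-05-30T02:30:00.000+00:00",
    "latestTime": "2011-05-30T06:00:00.000+00:00",
    "eventCount": "110902",
    "scanCount": "110902",
    "statusBuckets": "0",
}

differences = diff_properties(gui_job, api_job)
for name, (gui_value, api_value) in differences.items():
    print(f"{name}: GUI={gui_value} API={api_value}")
```

With these values the time range matches, and the counts differ along with statusBuckets.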

sideview · ‎06-02-2011

Got it. I found out what was going on and updated my answer. See above.

jdunlea_splunk · ‎12-14-2011

I am also using the Splunk REST API to do summary indexing (adding a " | collect index=" at the end of my search). Does what you're saying mean that when my search runs and encounters more than 100,000 rows, which it is THEN supposed to summarize and populate the SI with, it will stop searching for events after 100,000 rows and only summarize the first 100,000 into the SI?
