
Getting the SPL of search requests made via the API, using the `join` command

Na_Kang_Lim
Path Finder

I want to get the `search` string of requests that were made to the search head via the API (not the UI), for example with the Splunk Python SDK.

My idea is to parse the _internal log to get the search_id, then join it with the _audit log to get the search. Here is my SPL:

index=_internal sourcetype=splunkd_access source=*splunkd_access.log method=POST useragent IN (axios*, curl*, python-requests*, splunk-sdk-python*, node-fetch*) NOT user IN (splunk-system-user, "-")
| rex field=uri_path ".*/search/jobs/(?<search_id>[^/]+)"
| eval search_id = "'" . search_id . "'"
| where isnotnull(search_id) AND !like(search_id, "'export'")
| join search_id [ 
    search index=_audit action=search info=granted 
    | fields search_id search
]
| table _time host clientip user useragent search_id search

However, the `search` column of this query comes back empty, even though the search_id column has the correct value in the form `'<search_id>'`.

If I copy one `'<search_id>'` value and run a query like:

index=_audit action=search info=granted search_id="'<search_id>'" | table _time search

I get the corresponding search.

Somehow my `join` command is not working. 


livehybrid
SplunkTrust

Here is another version that avoids the limitations of append. It might also be a more efficient search:

(index=_internal sourcetype=splunkd_access source=*splunkd_access.log method=POST useragent IN (axios*, curl*, python-requests*, splunk-sdk-python*, node*) NOT user IN (splunk-system-user, "-")) OR (index=_audit "info=completed" "action=search" NOT user IN (splunk-system-user, "-"))
| rex field=uri_path ".*\/search(\/v2)?\/jobs\/(?<extracted_search_id>[^\/]+)"
| eval extracted_search_id = "'" . extracted_search_id . "'"
| eval search_id = coalesce(search_id,extracted_search_id)
| where isnotnull(search_id) AND !like(search_id, "'export'")
| stats first(_time) as _time, values(host) as host, first(clientip) as clientip, first(search) as search, first(user) as user, first(useragent) as useragent by search_id
| table _time host clientip user useragent search_id search
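
The coalesce step matters because the two event types populate different fields: the _audit events already carry a quoted search_id, while the splunkd_access events only get one from the rex. Below is a minimal sketch with made-up IDs. It assumes that eval's `.` concatenation returns null when one operand is null, so rows without an extracted ID keep their existing search_id. The field `n` exists only to tell the two sample rows apart.

```spl
| makeresults count=2
| streamstats count as n
``` row 1 mimics an _audit event, row 2 a splunkd_access event (IDs are invented) ```
| eval search_id = if(n=1, "'1751531000.123'", null())
| eval extracted_search_id = if(n=2, "1751531000.456", null())
| eval extracted_search_id = "'" . extracted_search_id . "'"
| eval search_id = coalesce(search_id, extracted_search_id)
| table n search_id extracted_search_id
```

Under that assumption, both rows should end up with a quoted search_id, which is what lets the later `stats ... by search_id` group them.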

Note - I also tweaked the regex so you can extract the search_id from clients hitting the v2 endpoints.
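
To sanity-check the tweaked regex without touching real data, something like the following sketch can be run as-is. The three uri_path values are invented samples, not taken from any log.

```spl
| makeresults
``` sample paths are hypothetical: a v1 job, a v2 job, and the export endpoint ```
| eval uri_path = split("/services/search/jobs/1751531000.123,/services/search/v2/jobs/1751531000.456/results,/services/search/jobs/export", ",")
| mvexpand uri_path
| rex field=uri_path ".*\/search(\/v2)?\/jobs\/(?<search_id>[^\/]+)"
| table uri_path search_id
```

The v1 and v2 paths should each yield their job ID. The third path yields `export`, which is why the searches in this thread filter that value out.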


🌟 Did this answer help you? If so, please consider:

  • Adding karma to show it was useful
  • Marking it as the solution if it resolved your issue
  • Commenting if you need any clarification

Your feedback encourages the volunteers in this community to continue contributing

livehybrid
SplunkTrust

Hi @Na_Kang_Lim 

join is very rarely the way to go for this. You could try the following, which uses append to combine the two result sets, although even this has limitations (50,000 events, I believe). I will put together a version without append too.

index=_internal sourcetype=splunkd_access source=*splunkd_access.log method=POST useragent IN (axios*, curl*, python-requests*, splunk-sdk-python*, node*) NOT user IN (splunk-system-user, "-")
| rex field=uri_path ".*\/search(\/v2)?\/jobs\/(?<search_id>[^\/]+)"
| eval search_id = "'" . search_id . "'"
| where isnotnull(search_id) AND !like(search_id, "'export'")
| append [search index=_audit "info=completed" "action=search" NOT user IN (splunk-system-user, "-")]
| stats first(_time) as _time, values(host) as host, first(clientip) as clientip, first(search) as search, first(user) as user, first(useragent) as useragent by search_id
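
The append-then-stats pattern can be tried in isolation with synthetic events. This is only a sketch: the IP, user, user agent and search string are made up, and the first makeresults row stands in for a splunkd_access event while the appended row stands in for an _audit event.

```spl
| makeresults
``` stand-in for a splunkd_access event (all values are invented) ```
| eval search_id = "'1751531000.123'", clientip = "10.0.0.5", useragent = "python-requests/2.31"
| append
    [| makeresults
     ``` stand-in for the matching _audit event ```
     | eval search_id = "'1751531000.123'", search = "search index=main", user = "alice"]
| stats first(clientip) as clientip, first(search) as search, first(user) as user, first(useragent) as useragent by search_id
```

Because both rows share the same search_id, stats collapses them into a single row that carries fields from both sides, which is the effect the join was meant to achieve.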
