Getting the SPL of search requests made via the API, using the `join` command

Na_Kang_Lim
Path Finder

I want to get the `search` (the SPL) of requests made to the search head through the API rather than the UI, for example via the Splunk Python SDK.

My idea is to parse the _internal log to get the search_id, then join it with the _audit log to get the search. Here is my SPL:

index=_internal sourcetype=splunkd_access source=*splunkd_access.log method=POST useragent IN (axios*, curl*, python-requests*, splunk-sdk-python*, node-fetch*) NOT user IN (splunk-system-user, "-")
| rex field=uri_path ".*/search/jobs/(?<search_id>[^/]+)"
| eval search_id = "'" . search_id . "'"
| where isnotnull(search_id) AND !like(search_id, "'export'")
| join search_id [ 
    search index=_audit action=search info=granted 
    | fields search_id search
]
| table _time host clientip user useragent search_id search

However, the `search` column in this query's results is empty, although the search_id column has the correct value, in the form `'<search_id>'`.

If I take one of those `'<search_id>'` values and run a query like:

index=_audit action=search info=granted search_id="'<search_id>'" | table _time search

I get the corresponding search.

Somehow my `join` command is not working.

livehybrid
Ultra Champion

Here is another version that avoids the limitations of append; it might also be a more efficient search:

(index=_internal sourcetype=splunkd_access source=*splunkd_access.log method=POST useragent IN (axios*, curl*, python-requests*, splunk-sdk-python*, node*) NOT user IN (splunk-system-user, "-")) OR (index=_audit "info=completed" "action=search" NOT user IN (splunk-system-user, "-"))
| rex field=uri_path ".*\/search(\/v2)?\/jobs\/(?<extracted_search_id>[^\/]+)"
| eval extracted_search_id = "'" . extracted_search_id . "'"
| eval search_id = coalesce(search_id,extracted_search_id)
| where isnotnull(search_id) AND !like(search_id, "'export'")
| stats first(_time) as _time, values(host) as host, first(clientip) as clientip, first(search) as search, first(user) as user, first(useragent) as useragent by search_id
| table _time host clientip user useragent search_id search

Note - I also tweaked the regex so you can extract the search_id from clients hitting the v2 endpoints.
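To make the effect of the regex tweak concrete, here is a minimal Python sketch of the same extraction and quoting steps. It is an illustration only, not part of the search: the `uri_path` values are made up, the helper name `extract_search_id` is hypothetical, and Python writes named groups as `(?P<name>...)` where `rex` uses `(?<name>...)`.

```python
import re

# Same pattern as the rex above, rewritten with Python's named-group syntax.
SID_PATTERN = re.compile(r".*/search(/v2)?/jobs/(?P<extracted_search_id>[^/]+)")

# The pattern from the original question, without the optional /v2 segment.
OLD_PATTERN = re.compile(r".*/search/jobs/(?P<search_id>[^/]+)")


def extract_search_id(uri_path, pattern=SID_PATTERN):
    """Return the path segment after /jobs/ wrapped in single quotes, or None."""
    match = pattern.search(uri_path)
    if match is None:
        return None
    # Mirrors the eval step: _audit stores search_id with surrounding quotes.
    return "'" + match.group(match.lastindex) + "'"


# Hypothetical uri_path values, invented for this example.
samples = [
    "/services/search/jobs/1751531000.123/results",
    "/services/search/v2/jobs/1751531000.456/results",
    "/services/search/jobs/export",
    "/services/search/jobs",
]

for path in samples:
    print(path, "->", extract_search_id(path), "| old regex:", extract_search_id(path, OLD_PATTERN))
```

In this sketch the v2 path only yields an ID with the tweaked pattern, and the `export` path yields `'export'`, which is the value the `where` clause filters out.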

livehybrid
Ultra Champion

Hi @Na_Kang_Lim 

join is very rarely the right tool for this. You could try the following, which uses append to combine the two result sets, although even this has limitations (50,000 events, I believe). I will put together a version without append too.

index=_internal sourcetype=splunkd_access source=*splunkd_access.log method=POST useragent IN (axios*, curl*, python-requests*, splunk-sdk-python*, node*) NOT user IN (splunk-system-user, "-")
| rex field=uri_path ".*\/search(\/v2)?\/jobs\/(?<search_id>[^\/]+)"
| eval search_id = "'" . search_id . "'"
| where isnotnull(search_id) AND !like(search_id, "'export'")
| append [search index=_audit "info=completed" "action=search" NOT user IN (splunk-system-user, "-")]
| stats first(_time) as _time, values(host) as host, first(clientip) as clientip, first(search) as search, first(user) as user, first(useragent) as useragent by search_id
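The idea behind `stats ... by search_id` can be sketched in Python as follows: events from both sources are grouped on the shared key, and each field takes the first value seen for it. This is a simplified model under stated assumptions, not how Splunk implements it: the event dictionaries are invented, `merge_by_search_id` is a hypothetical helper, and every field is treated like `first()` (the real search uses `values()` for host).

```python
from collections import defaultdict


def merge_by_search_id(events):
    """Group events by search_id, keeping the first non-null value of each field."""
    merged = defaultdict(dict)
    for event in events:
        sid = event.get("search_id")
        if sid is None:
            continue  # events without the by-field do not produce a row
        for field, value in event.items():
            if value is not None:
                # setdefault keeps the first value seen, like first()
                merged[sid].setdefault(field, value)
    return dict(merged)


# Invented events standing in for splunkd_access and _audit results.
access_events = [
    {"search_id": "'1751531000.123'", "clientip": "10.0.0.5",
     "useragent": "python-requests/2.31", "user": "alice"},
]
audit_events = [
    {"search_id": "'1751531000.123'", "user": "alice",
     "search": "search index=main | head 5"},
    {"search_id": "'1751531000.999'", "user": "bob",
     "search": "search index=web | stats count"},
]

rows = merge_by_search_id(access_events + audit_events)
for sid, row in rows.items():
    print(sid, row.get("clientip"), row.get("useragent"), row.get("search"))
```

Note that in this model a search ID that appears only in the audit events still produces a row, just without clientip or useragent, so a final filter on one of those fields may be needed if only API requests are wanted.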
