
Integrating Splunk Search API and Quarto to Create Reproducible Investigation Workflows

dfirr
New Member



Splunk is More Than Just the Web Console

For Digital Forensics and Incident Response (DFIR) practitioners, Splunk is a core part of the daily workflow. Its schema-on-the-fly approach and powerful Search Processing Language (SPL) allow iterative, flexible investigation, which suits the exploratory nature of forensic analysis.

However, many users only interact with Splunk through its web interface. What if you want to document your investigative process? Add commentary? Re-run past searches with identical parameters? This is where the Splunk Search API becomes invaluable.

To demonstrate, we’ve published a walkthrough of an investigation using the Boss of the SOC v3 dataset. The entire analysis—from SPL query to result output—was scripted, executed, and rendered into an HTML report using code.

Curious? Let’s dive into how you can do the same.

MSTICPy Makes the Search API Easy

Interacting with APIs often involves complexities like authentication, pagination, and rate limits. Microsoft Threat Intelligence Python Security Tools (MSTICPy) abstracts away much of that pain.

Originally designed for threat hunting, MSTICPy supports multiple data sources, including Splunk and Splunk Cloud. It lets you submit SPL queries and get the results back as pandas DataFrames, ready for further analysis, visualization, or transformation in your Python environment.

To get started, install MSTICPy from PyPI. If you’re using RStudio (as we’ll explain later), you may want to pin the pandas version (e.g., 1.5.x) to ensure compatibility. Here’s how we set up our environment using Miniforge on Windows:

```
> conda create -n msticpy python=3.10 pandas=1.5.3 pip notebook ipykernel
> conda activate msticpy
> pip install msticpy[splunk]
```

Embedding Search Logic in Markdown with Quarto

The walkthrough we mentioned earlier was rendered using Quarto, a next-generation scientific and technical publishing system. When combined with tools like RStudio or VS Code, Quarto lets you write Markdown enriched with executable code blocks—in R, Python, or even Bash.

Screenshot: Quarto document with inline R, Python and Bash code

 

Within a single .qmd file, you can mix prose, code, and live output. Upon rendering, you get a self-contained HTML file where each query and result is visible and reproducible.

 

Screenshot: HTML output generated by Quarto
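
To make the structure of a .qmd file concrete, here is a minimal sketch in Python that assembles one: a YAML header, a line of prose, and one executable chunk. The function name, heading, and chunk contents are our own illustrations rather than part of the walkthrough. The `embed-resources: true` setting is the Quarto HTML option for bundling assets into a single file.

```python
FENCE = "`" * 3  # built at runtime so this example nests cleanly inside Markdown


def minimal_qmd(title):
    """Return the text of a small Quarto document with one Python chunk."""
    lines = [
        "---",
        f'title: "{title}"',
        "format:",
        "  html:",
        "    embed-resources: true",  # ask Quarto for a single self-contained HTML file
        "---",
        "",
        "## Findings",
        "",
        "Commentary on the search goes here.",
        "",
        FENCE + "{python}",
        "# code in this chunk runs when the document is rendered",
        "print('hello from a Quarto chunk')",
        FENCE,
        "",
    ]
    return "\n".join(lines)
```

Writing the returned text to a file ending in `.qmd` gives you something to open in RStudio or VS Code and extend with your own searches.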

 

For newcomers, we recommend starting with RStudio Desktop or RStudio Server, both of which support Quarto natively. With the reticulate package, you can access Python objects seamlessly from R.

 

From Query to HTML Report

Step 1: Connect to Splunk via API

Here’s how the connection looks in R using reticulate to import MSTICPy:

```
# Install pacman only if it is missing
if (!requireNamespace("pacman", quietly = TRUE)) install.packages("pacman")
# lubridate supplies as_datetime(), used later; tidyverse 2.0+ attaches it already
pacman::p_load(tidyverse, lubridate, reticulate)
mp <- import("msticpy")
qry_splunk <- mp$QueryProvider("Splunk")
qry_splunk$connect(host = "172.17.0.1", port = "8089",
                   username = "admin", password = "testpassword") # Not recommended!
```

 

Equivalent Python code:

```
import msticpy as mp
qry_splunk = mp.QueryProvider("Splunk")
qry_splunk.connect(host="172.17.0.1", port="8089",
                   username="admin", password="testpassword") # Not recommended!
```

Note: Don’t hardcode credentials in production. Use msticpyconfig.yaml or API tokens.
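
One simple way to keep credentials out of the document is to read them from environment variables at run time. The sketch below assumes hypothetical variable names (`SPLUNK_HOST`, `SPLUNK_USERNAME`, `SPLUNK_PASSWORD`, `SPLUNK_PORT`), and the helper names are ours. It passes the same `connect` arguments as the examples above and does not replace msticpyconfig.yaml.

```python
import os

# Hypothetical environment variable names; adjust them to your own conventions.
REQUIRED_VARS = {
    "host": "SPLUNK_HOST",
    "username": "SPLUNK_USERNAME",
    "password": "SPLUNK_PASSWORD",
}


def splunk_connect_kwargs(env=os.environ):
    """Build keyword arguments for connect() from environment variables."""
    missing = [var for var in REQUIRED_VARS.values() if not env.get(var)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    kwargs = {arg: env[var] for arg, var in REQUIRED_VARS.items()}
    # Fall back to 8089, the port used in the examples above
    kwargs["port"] = env.get("SPLUNK_PORT", "8089")
    return kwargs


def connect_from_env():
    """Connect to Splunk without writing credentials into the document."""
    import msticpy as mp  # imported lazily so the helper above works without MSTICPy

    provider = mp.QueryProvider("Splunk")
    provider.connect(**splunk_connect_kwargs())
    return provider
```

With the variables set in your shell, `qry_splunk = connect_from_env()` would stand in for the hardcoded call, and the rendered report never contains the password.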

Step 2: Execute a Search and Retrieve Results

Define your SPL query:

```
spl <- r"(
| inputlookup security_example_data.csv 
| table timestamp threat_src_ip threat_dest_ip threat_status
| head 5
)"
```

Run it and store results as a DataFrame:

```
sample_df <- qry_splunk$exec_query(spl)
sample_df
```

 

Screenshot: DataFrame preview with search results
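
In Python, Step 2 might look like the following sketch. It assumes `provider` is the connected `QueryProvider` from Step 1. `run_search` is a helper name introduced here for illustration, and the SPL is the same query shown above.

```python
# Same SPL as the R example; a triple-quoted string keeps the pipes readable
SPL = """
| inputlookup security_example_data.csv
| table timestamp threat_src_ip threat_dest_ip threat_status
| head 5
"""


def run_search(provider, spl=SPL):
    """Submit SPL through a connected provider and return whatever it returns."""
    # strip() removes the leading and trailing newlines left by the triple quotes
    return provider.exec_query(spl.strip())
```

Calling `sample_df = run_search(qry_splunk)` would then give you the result to inspect or pass on to later chunks.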

 

What’s convenient is that the resulting DataFrame can be referenced in a separate tab or window at any time. This makes it easy to revisit or reuse your data during further analysis.

dfirr_3-1764960837635.png

Screenshot: Viewing the DataFrame in a separate tab

However, you may notice that the timestamp field is in Unix time format and stored as a string. In fact, when retrieving data via the Search API, all fields are returned as strings by default.

To improve readability, let's convert this to a proper datetime format. In R, this is easily done using a pipeline. We enjoy using the Tidyverse, which provides a consistent and expressive grammar for data manipulation—reminiscent of SPL’s pipe-based syntax.

```
sample_df |> 
  mutate(
    timestamp = timestamp |> as.numeric() |> as_datetime()
  )
```

Screenshot: Human-readable timestamps after conversion
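
For readers who prefer to stay in Python, a rough pandas equivalent of the conversion is sketched below. The small DataFrame is a stand-in with placeholder values, not real search output. It mirrors the string-typed `timestamp` column described above.

```python
import pandas as pd

# Stand-in for the search result: every field arrives as a string (placeholder values)
sample_df = pd.DataFrame(
    {
        "timestamp": ["1700000000", "1700003600"],
        "threat_status": ["placeholder", "placeholder"],
    }
)

# String -> number -> timezone-aware datetime (seconds since the Unix epoch, UTC)
converted = sample_df.assign(
    timestamp=pd.to_datetime(
        pd.to_numeric(sample_df["timestamp"]), unit="s", utc=True
    )
)
print(converted)
```

Like the R pipeline, `assign` returns a new DataFrame and leaves `sample_df` untouched, so the raw strings remain available if you need them.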

 

Step 3: Render to HTML

Once your code and results are finalized in RStudio, click the “Render” button (or press Ctrl + Shift + K). Quarto compiles everything into a polished HTML report with embedded code and output.

Screenshot: Final HTML report with code and tables
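
Outside RStudio, the same render can be triggered with the Quarto command line (`quarto render`). The sketch below wraps that command in Python, for example to re-run a report on a schedule. `report.qmd` is a hypothetical file name and the helper names are our own.

```python
import shutil
import subprocess


def quarto_render_command(qmd_path, to="html"):
    """Return the argument list for rendering a .qmd file with the Quarto CLI."""
    return ["quarto", "render", str(qmd_path), "--to", to]


def render(qmd_path):
    """Render the document, failing early if Quarto is not installed."""
    if shutil.which("quarto") is None:
        raise RuntimeError("The quarto CLI was not found on PATH")
    # check=True raises CalledProcessError if the render fails
    subprocess.run(quarto_render_command(qmd_path), check=True)
```

`render("report.qmd")` re-executes every chunk, so the searches run again against Splunk each time the report is rebuilt.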

 

Toward “Literate Log Analysis”

Inspired by a 2018 talk by Masaru Nagaku on Literate Computing for Reproducible Infrastructure, we coined the term Literate Log Analysis to describe this approach. Thanks to MSTICPy and Quarto, we now have the tools to write forensic investigations as narratives—code, commentary, and context all in one place.


If this piques your interest, give it a try! And while you're at it, maybe take R for a spin too.

Happy Splunking!

Author Profile

Shintaro Watanabe is a seasoned cybersecurity professional specializing in incident response and information security planning. He is a Staff Engineer in the Information Security Division at JCOM Co., Ltd., Japan’s largest cable TV operator. 

Beyond technical expertise, Shintaro is an effective communicator who aligns stakeholders and drives security improvements. He is active in organizations such as ICT-ISAC Japan, CRIC CSF, and JCTA, and holds numerous certifications including CISA, CISSP, and ten GIACs. He also teaches SANS SEC504 in Japan.

