OpenAI API add-on

Remigiusz
Explorer

Hi,
I'd like to ask whether I can use generative AI to generate SPL based on my Splunk indexes and the data models built on them. The idea is to type what you want from Splunk into an input field and get usable SPL back.
Is this possible using the OpenAI API add-on? Is there any other recommended tool?

PickleRick
SplunkTrust

1. OpenAI produces things that are rarely usable.

Example:

[begin chatgpt]

Certainly! Here's an example of a Splunk SPL search that finds all network sessions initiated from a host with IP 172.16.0.4 (stored in the src_ip field) from the last two weeks and performs a timechart of the count over destination IP addresses (stored in the dest_ip field) aggregated to the /26 level:

 
index=<your_index> src_ip="172.16.0.4" earliest=-2w
| stats count by dest_ip
| iprange dest_ip
| eval dest_ip_prefix = cidrize(dest_ip, 26)
| stats sum(count) as count by dest_ip_prefix, _time
| timechart span=1d sum(count) by dest_ip_prefix

[end chatgpt]

At first glance it seems legit. The problem is that Splunk has no "iprange" command and no "cidrize" eval function (and the /26 aggregation was the point of the whole exercise!)

And even if it did, the final two lines are completely pointless. The first stats has already dropped _time, so there is nothing left to chart over time, and running stats by _time without binning rarely does anything useful anyway. It should have been done with just the timechart.

2. As partially shown above, automatically generated code is often highly suboptimal performance-wise, even when it gives you the right results.
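
For comparison, here is a rough sketch of how the same request could be written by hand, using only commands that do exist (rex, eval, timechart). It assumes dest_ip always holds a dotted-quad IPv4 address; the field names net24, last_octet and dest_net are made up for the example. Since a /26 keeps the first three octets and only masks the last one, rounding the last octet down to a multiple of 64 gives the network address. Treat it as a starting point, not a tested solution:

```
index=<your_index> src_ip="172.16.0.4" earliest=-2w
| rex field=dest_ip "^(?<net24>\d+\.\d+\.\d+)\.(?<last_octet>\d+)$"
| eval dest_net = net24 . "." . tostring(floor(tonumber(last_octet) / 64) * 64) . "/26"
| timechart span=1d count by dest_net
```

A single timechart does both the binning and the counting, so no intermediate stats is needed. Keep in mind that timechart by default shows only the top series and lumps the rest into OTHER, so the limit option may need adjusting if there are many destination networks.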

Remigiusz
Explorer

I had similar problems with the SPL generated on the ChatGPT site, so I'm curious whether the Splunk add-on would at least partially solve this problem. Did you use the add-on, or was that output from their regular website?

PickleRick
SplunkTrust

I would not count on any automatic solution to "fix" such stuff.

So-called "AI" is just a generator based on a huge corpus of already-seen solutions. It only correlates known patterns; it doesn't _understand_ what you're trying to do.
