OpenAI API add-on

Remigiusz
Explorer

Hi,
I want to ask whether I can use generative AI to generate SPL based on my Splunk indexes and the data models built on them. The idea is to type what you want from Splunk into an input field and get usable SPL back.
Is this possible using the OpenAI API add-on? Is there any other recommended tool?

PickleRick
SplunkTrust

1. OpenAI produces things that are rarely usable.

Example:

[begin chatgpt]

Certainly! Here's an example of a Splunk SPL search that finds all network sessions initiated from a host with IP 172.16.0.4 (stored in the src_ip field) from the last two weeks and performs a timechart of the count over destination IP addresses (stored in the dest_ip field) aggregated to the /26 level:

 
index=<your_index> src_ip="172.16.0.4" earliest=-2w
| stats count by dest_ip
| iprange dest_ip
| eval dest_ip_prefix = cidrize(dest_ip, 26)
| stats sum(count) as count by dest_ip_prefix, _time
| timechart span=1d sum(count) by dest_ip_prefix

[end chatgpt]

At first glance it seems legit. The problem is that Splunk has no "iprange" command and no "cidrize" function (and that was the point of the whole exercise!)

And even if it did, the final two lines are completely pointless. Running stats by _time without binning first usually doesn't do anything useful. The whole aggregation should have been done with just the timechart.

2. As partially shown above, automatically generated code is often highly suboptimal performance-wise, even when it gives you the right results.
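
For comparison, the request from the example can be written by hand using eval functions that do exist. This is only a minimal, untested sketch: it assumes dest_ip always holds a dotted-quad IPv4 address, and octets, last_octet and dest_net are field names made up for illustration. A /26 covers 64 addresses, so rounding the last octet down to a multiple of 64 gives the network address, and a single timechart then does all of the aggregation.

```spl
index=<your_index> src_ip="172.16.0.4" earliest=-2w
| eval octets=split(dest_ip, ".")
| eval last_octet=tonumber(mvindex(octets, 3))
| eval dest_net=mvjoin(mvindex(octets, 0, 2), ".") . "." . tostring(last_octet - (last_octet % 64)) . "/26"
| timechart span=1d count by dest_net
```

Note that timechart keeps only the ten largest series by default and groups the rest under OTHER, so a limit= option may be needed if many /26 networks show up.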

Remigiusz
Explorer

I had similar problems with the SPL generated on the ChatGPT site, so I'm curious whether the Splunk add-on will at least partially solve this problem. Did you use the add-on, or was that output from their regular website?

PickleRick
SplunkTrust

I would not count on any automatic solution to "fix" such stuff.

So-called "AI" is just a generator based on a huge corpus of already-seen solutions. It only correlates known patterns; it doesn't _understand_ what you're trying to do.
