OpenAI API add-on

Remigiusz
Explorer

Hi,
I want to ask whether I can use generative AI to generate SPL based on my Splunk indices and the data models in those indices. The idea is to type what you want from Splunk into an input field and get back usable SPL.
Is this possible using the OpenAI API add-on? Is there any other recommended tool?

PickleRick
SplunkTrust

1. OpenAI produces things that are rarely usable.

Example:

[begin chatgpt]

Certainly! Here's an example of a Splunk SPL search that finds all network sessions initiated from a host with IP 172.16.0.4 (stored in the src_ip field) from the last two weeks and performs a timechart of the count over destination IP addresses (stored in the dest_ip field) aggregated to the /26 level:

 
index=<your_index> src_ip="172.16.0.4" earliest=-2w
| stats count by dest_ip
| iprange dest_ip
| eval dest_ip_prefix = cidrize(dest_ip, 26)
| stats sum(count) as count by dest_ip_prefix, _time
| timechart span=1d sum(count) by dest_ip_prefix

[end chatgpt]

At first glance it looks legitimate. The problem is that Splunk doesn't have any "iprange" command or "cidrize" function (and grouping by /26 was the point of the whole exercise!)

And even if it did, the final two lines are completely pointless. Running stats by _time without binning it first usually doesn't do anything useful. The whole thing should have been done with just the timechart.

2. As partially shown above, automatically generated code, even when it gives you the right results, is often highly suboptimal performance-wise.
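
For comparison, here is a minimal sketch of how the same request could be written using only commands and functions that do exist in SPL. It assumes dest_ip always holds a dotted-quad IPv4 address. The field names net24, last_octet and dest_net are made up for illustration. A /26 covers 64 addresses, so rounding the last octet down to a multiple of 64 gives the network address. There is no intermediate stats, because timechart does both the binning and the counting.

```spl
index=<your_index> src_ip="172.16.0.4" earliest=-2w
| rex field=dest_ip "^(?<net24>\d+\.\d+\.\d+)\.(?<last_octet>\d+)$"
| eval dest_net = net24 . "." . (floor(tonumber(last_octet) / 64) * 64) . "/26"
| timechart span=1d count by dest_net
```

Treat this as a starting point rather than a tuned search. By default timechart keeps only the top 10 series and lumps the rest into OTHER, so with many destination networks you would probably want to set limit explicitly.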

Remigiusz
Explorer

I had similar problems with the SPL generated on the ChatGPT site, so I'm curious whether the Splunk add-on will at least partially solve this problem. Did you use the add-on, or was that output from the regular website?

PickleRick
SplunkTrust

I would not count on any automatic solution to "fix" such stuff.

So-called "AI" is just a generator based on some huge corpus of already-seen solutions. It only correlates known patterns; it doesn't _understand_ what you're trying to do.
