All Posts

Thanks for the reply.  Cheers.
Can you explain @richgalloway 's main question: How can two events produce 4 transactions (durations)? Here is an emulation of the two events you illustrated, and the transaction command to follow.

| makeresults format=csv data="_raw
2024-10-10T06:30:11.478-04:00 | INFO | 1 | | xxxxxxxxxxxxxxxxx : Start View Refresh (price_vw) !!!
2024-10-10T06:30:11.509-04:00 | INFO | 1 | | xxxxxxxxxxxxxxxxx : End View Refresh (price_vw) !!!"
| eval _time = strptime(replace(_raw, "(\S+).*", "\1"), "%FT%T.%3N%z")
| sort - _time
``` the above emulates index=* ("Start View Refresh (price_vw)" OR "End View Refresh (price_vw)") ```
| transaction endswith="End View Refresh" startswith="Start View Refresh"

The result is

_time            2024-10-10 03:30:11.478
_raw             2024-10-10T06:30:11.478-04:00 | INFO | 1 | | xxxxxxxxxxxxxxxxx : Start View Refresh (price_vw) !!!
                 2024-10-10T06:30:11.509-04:00 | INFO | 1 | | xxxxxxxxxxxxxxxxx : End View Refresh (price_vw) !!!
closed_txn       1
duration         0.031
eventcount       2
field_match_sum  0
linecount        2

As richgalloway predicted, one duration.
So your timestamp extraction definition is not used: unless &auto_extract_timestamp=true is added to the /event URI, that endpoint skips timestamp extraction completely and uses the "time" field from the event's envelope or, if there isn't one, the current time on the receiving component (in your case, the HF).
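For illustration, a minimal sketch of the two behaviours (the host, token, and payload below are placeholders, not taken from this thread):

POST https://<hf>:8088/services/collector/event
Authorization: Splunk <hec-token>
{"time": 1728556211, "event": "...", "sourcetype": "change:auditor"}

POST https://<hf>:8088/services/collector/event?auto_extract_timestamp=true
Authorization: Splunk <hec-token>
{"event": "...", "sourcetype": "change:auditor"}

The first form uses the envelope "time" field (epoch seconds) or, failing that, the clock of the receiving component; the second form asks HEC to run the sourcetype's timestamp extraction against the event body.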
@PickleRick  It is sending to services/collector/event    
Thanks @dural_yyz  Will try that.
No. Your data is not "in Splunk". You're fetching the results from the remote data source on every single search. I would ingest the data into Splunk's index and simply do stats-based "join" on that data.
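A minimal sketch of that stats-based approach, assuming both tables have already been ingested (the index and sourcetype names here are placeholders, not from this thread):

index=db_assets (sourcetype=host_table OR sourcetype=contact_table)
| stats values(host) as host values(contact) as contact by ip
| where isnotnull(host)

The final where keeps the result equivalent to a left join driven by the host table; drop it if you want rows from either side.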
Hi @PickleRick  Thanks for your help.

1. Like I mentioned, the DB is on a different connection; even if it's possible, it will take a while until the DB team works on this. So, as a workaround, I will need to do this at least to get the data now.

2. Yes, 50k is for the join.

3. Thanks. Let me look into _internal. The alerting that I am looking for is not only for a case where certain data hits a Splunk internal threshold; I also need it for other cases (non-Splunk internal thresholds), for example, if my scheduled report contains empty data or if data hits a certain threshold (max/min).

4. Sorry, perhaps my explanation in the example is not clear enough because it's difficult to lay it out without a real example in SPL. Both tables (host table and contact table) in the example can already be reached from Splunk via a DBX query. Like I mentioned before, the problem is that we cannot join in the DB; both are on different connections; the host table is in Connection A, and the contact table is in Connection B.

| dbxquery connection="connectionA" query="select ip, host from table host"
| dbxquery connection="connectionB" query="select ip, contact from table contact"

I do not search remotely on every search; instead, I ran this command for each subnet to find the number of rows, for example 10.0.1.0/16 => 20k rows, and so on.

| dbxquery connection="connectionB" query="select ip, contact from table contact where ip::inet<'10.0.0.0/16'"
| dbxquery connection="connectionB" query="select ip, contact from table contact where ip::inet<'10.1.0.0/16'"
| dbxquery connection="connectionB" query="select ip, contact from table contact where ip::inet<'10.2.0.0/16'"
| dbxquery connection="connectionB" query="select ip, contact from table contact where ip::inet<'10.3.0.0/16'"

Once I figure out the number of rows, I group them until the total is right below 50k, so I am saving subsearches. If one subnet is above 50k, I will need to split it. I hope this makes sense. Note that this is only a workaround.

join max=0 type=left ip [| dbxquery connection="connectionB" query="select ip, contact from table contact where ip::inet<'10.0.0.0/16' OR ip::inet<'10.1.0.0/16'" | eval source="group1" ]
I can make it wider by reducing the time frame, but last 7 days is usually the default. Are you sure we can't do it on the CSS part? Thanks
Yes, I understand that it's pushed to the HEC input on the HF. But to which API endpoint? There are at least three endpoints for the HEC input:

/services/collector/raw
/services/collector/event
/services/collector/mint

Additionally, the /event endpoint can accept parameters that change the ingestion process. So I repeat my question: to which endpoint is your data being sent?
Ahh... you found out yourself what I had just written to you. Good job. Remember that case matters in field names. It might or might not matter for field values, depending on how you're using the condition.

something | search a=b

will match whenever field a has a value of either b or B, but

something | where a="B"

will match only the upper-case B.
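If you want the where comparison to ignore case on the value side, one option (just a sketch, not from the original post) is to normalise the value first:

something | where lower(a)="b"

This matches both b and B, while field names themselves still have to be typed with their exact case.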
Case matters for field names, so if you indeed use status_Code<300 when the field is named status_code, it won't match.
Ok. Several things.

1. I'm not sure why you're trying to join two SQL database search results in Splunk. I get it that if you have a hammer everything looks like a nail, but don't you have a better tool for it? SQL and Splunk are fairly disjoint worlds, and the interfacing between them is fairly limited.

2. The 50k result limit is not a general subsearch limit. The general subsearch limit is 10k results; 50k is the limit on results for the join command only.

3. Splunk runs a search, Splunk gets results (possibly incomplete if, as in your case, a subsearch hits limits), Splunk sends results. That's it. You can try searching _internal, or even try accessing the specific search job's logs for signs of anomalies, but that would have to be something completely separate from the main search. You don't get any "metadata" about your search from within a saved search. I already wrote that to you.

4. What I would probably do if I wanted to do something like this (still remembering my first point) would be to first get the data into Splunk, either with a DB Connect input instead of searching remotely on every search, or at least by summary indexing (I don't know your source data, so I don't know which option would be better). Then you can simply do stats over your data in the index instead of bending over backwards to do joins over external data.
Hi @yuanliu  First, I would like to thank you for your help. This is "partly" related to my previous post that you solved, but I will describe it better here: https://community.splunk.com/t5/Splunk-Search/How-do-I-quot-Left-join-quot-by-appending-CSV-to-an-index-in/m-p/697794#M237015

This is just an example:

I have a host table containing IP and hostname, approximately 100k rows with unique IPs.
I have a contact table containing IP and contact, approximately more than 1 million rows with unique IPs.

Both can be accessed with a DBX query, but unfortunately they are located in different DB connections, so it's not possible to join them at the backend. So, the workaround is to filter by subnet on the contact DB and use subsearches to join the contact DB with the host DB.

Due to the 50k row limit when using a subsearch, I ran a separate query on the contact DB to find out the number of rows for each subnet, then I grouped them together to make sure the number of rows stays below 50k (please see the diagram below): Group 1 = 40 rows, Group 2 = 45k rows, and Group 3 = 30k rows. After that, I used a left join for each group on the contact DB with the host DB.

Since I don't control the growth of data in the contact DB, I am trying to figure out a way to get an email alert if one of the groups exceeds the 50k limit. I think I am able to create a scheduled report to produce the stats of each subnet in the group, but going back to my original question: I simply want to know if it's possible for Splunk to send me an email alert only if it meets certain thresholds. The subsearch is only one of my cases. Another case: I have multiple reports that run daily, and I intend to read the reports only if there is a problem, such as empty data, meeting certain thresholds, etc.

Input: Host table
ip          host
10.0.0.1    host1
10.0.0.2    host2
10.0.0.3    host3
10.1.0.1    host4
10.1.0.2    host5
10.1.0.3    host6
10.2.0.1    host7
10.2.0.2    host8
10.2.0.3    host9

Contact table
ip          contact
10.0.0.1    person1
10.0.0.2    person2
10.0.0.3    person3
10.1.0.1    person4
10.1.0.2    person5
10.1.0.3    person6
10.2.0.1    person7
10.2.0.2    person8
10.2.0.3    person9

Output: Join host and contact DB
ip          host     contact
10.0.0.1    host1    person1
10.0.0.2    host2    person2
10.0.0.3    host3    person3
10.1.0.1    host4    person4
10.1.0.2    host5    person5
10.1.0.3    host6    person6
10.2.0.1    host7    person7
10.2.0.2    host8    person8
10.2.0.3    host9    person9
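On the alerting part: a scheduled search saved as an alert with a trigger condition and an email action can do this. A rough sketch, reusing the dbxquery from the post above (the 45k threshold is just an assumed safety margin below the 50k join limit):

| dbxquery connection="connectionB" query="select ip, contact from table contact where ip::inet<'10.0.0.0/16' OR ip::inet<'10.1.0.0/16'"
| stats count as group1_rows
| where group1_rows > 45000

Saved as an alert that triggers when the number of results is greater than zero, with the "Send email" action attached, it would only email when a group is approaching the limit. The "empty report" case works the same way in reverse: schedule the report search as an alert that triggers when the number of results equals zero.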
Figured it out... my column name had one upper-case letter in it... I think I need to slow down from the Splunk-ing excitement.
TIME_PREFIX is a regex match for what immediately precedes your timestamp. There are extra quotes, spaces, and what appear to be JSON key-value pair identifiers. I would make the value more explicit and add a MAX_TIMESTAMP_LOOKAHEAD key once you establish a proper match above.
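For example, a rough props.conf sketch (the regex, lookahead, and TIME_FORMAT are assumptions and need to be tuned against your actual raw events; as noted elsewhere in the thread, they only apply if timestamp extraction actually runs for the HEC endpoint in use):

[change:auditor]
# regex matching everything immediately before the timestamp, i.e. the JSON key,
# the colon, and the opening quote of the timeDetected value
TIME_PREFIX = "timeDetected"\s*:\s*"
# only look as far ahead as the timestamp itself
MAX_TIMESTAMP_LOOKAHEAD = 40
# set to whatever format timeDetected actually uses, for example:
# TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N
TZ = UTC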
Tried both of the below... I only see errors which are >=499... for some reason I don't see the success ones, none of the 200s are showing... something is wrong.

AND ((status_code>=199 status_Code<300) OR (status_code>=499)) - understanding that there is an implied AND in it

AND ((status_code>=199 AND status_Code<300) OR (status_code>=499)) - explicit AND mentioned
@PickleRick Updated the post with the settings in place on the HF. Data is being received at the Heavy Forwarder via the HEC input. It then gets forwarded to the indexers.
@dural_yyz  I don't see any specific settings for this sourcetype under local props.conf. I added TIME_PREFIX and TZ values, but that didn't change anything. This is on the source which is receiving the data, i.e. the Heavy Forwarder. Do I need to place any of these settings on the indexer/SH as well?

[change:auditor]
category = Custom
pulldown_type = 1
TIME_PREFIX = timeDetected
TZ = UTC

The system time zone on the HF is set to EDT.
You can simply do ...  ((status_code>=199 status_code<300) OR (status_code>=499))  
What are your settings for that sourcetype/source/host? And how are you pushing the events (to which endpoint)?