Splunk Search

How to make a timechart of active connections between start and end times?

mcaulsc
Path Finder

Hi,

Any thoughts appreciated.

I have some connection data captured at connection termination; it has connection start and end times, CONSTA and CONEND, in the format "2022-10-18 15:40:00.000000".
What I'd like to do is timechart, in say 5-minute intervals, the number of connections that were active in each interval: all connections in the 15:40 - 15:45 interval that had started but not terminated, then the same for 15:45 - 15:50, and so on across the timechart.
Hopefully that makes sense.

Thanks in advance
Steve


mcaulsc
Path Finder

Hi,

In the data there is nothing else of that type; there is lots of other connection data, but nothing that would help us. It is essentially a dataset of records cut at socket termination. Each report would be for a limited number of ports, normally 1, maybe 3 in unusual circumstances, but code for 1 would work. Dataset size: to be honest, no idea, I don't manage that, but I'd guess up to 1GB.


johnhuang
Motivator

How do you distinguish the different connections? E.g., by source IP, session ID, or user?

For active connections, do you need to account for a CONSTA that began in an earlier interval? For example, if CONSTA is at 3:30 PM and CONEND at 5:30 PM, the connection needs to be counted in every interval between 3:30 PM and 5:30 PM (one possible way to do that explicitly is sketched below).
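
If so, one explicit way to handle it would be to expand each connection into every 5-minute bucket it overlaps. This is just a sketch using mvrange against the CONSTA/CONEND fields from the question, not the approach eventually taken in this thread, and mvexpand can get expensive for long-lived connections:

| eval start_epoch=strptime(CONSTA, "%Y-%m-%d %H:%M:%S.%N")
| eval end_epoch=strptime(CONEND, "%Y-%m-%d %H:%M:%S.%N")
| eval bucket=mvrange(floor(start_epoch/300)*300, end_epoch, 300) ``` one value per 5-minute bucket the connection overlaps ```
| mvexpand bucket
| eval _time=bucket
| timechart span=5m count AS active_connections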

Some sample data could help.

mcaulsc
Path Finder

Hi,

Connections can be distinguished by PORT in this example. So the count for an interval would be any records that had started before or during the interval but had not yet terminated.

Sample data:

PORT  CONSTA                     CONEND
55300 2022-10-15 07:33:40.919446 2022-10-19 07:25:46.440451
55300 2022-10-15 07:33:57.704310 2022-10-19 07:17:45.082410
55300 2022-10-15 07:34:39.447813 2022-10-19 07:25:42.237149
55300 2022-10-15 07:34:48.125852 2022-10-19 07:38:46.745400
55300 2022-10-15 07:34:55.561466 2022-10-19 07:33:44.027628
55300 2022-10-15 07:48:57.706834 2022-10-19 07:17:45.086782
55300 2022-10-15 07:57:37.375340 2022-10-19 07:24:24.382886
55300 2022-10-15 08:01:05.756897 2022-10-19 06:50:45.339550
55300 2022-10-15 08:02:24.329066 2022-10-19 07:22:37.247145
55300 2022-10-15 08:21:32.202534 2022-10-19 06:46:35.348080
55300 2022-10-15 08:44:39.721370 2022-10-19 07:38:46.746090
55300 2022-10-15 08:59:03.215559 2022-10-19 07:39:39.372292
55300 2022-10-16 15:08:47.255462 2022-10-19 07:24:24.353308
55300 2022-10-16 17:37:33.308594 2022-10-19 06:50:45.328513
55300 2022-10-17 10:00:11.261248 2022-10-18 10:00:12.426608


johnhuang
Motivator

Since you're looking for a running total of active connections, we're going to ignore the port. Basically the approach is:

  • Extract all start and end connection timestamps and convert them to epoch time.
    • Note: if your Splunk timezone doesn't match the events' timezone, you need to offset it (see the sketch after this list).
  • Expand each start and end time into its own event and sort from oldest to newest.
  • Calculate the running total with accum:
    • Each start event adds 1 to the total.
    • Each end event subtracts 1 from the total.
  • Run timechart:
    • If you want to show the max active connections per interval, use max(active_connections).
    • If you want to show the number of active connections at the end of each interval, use last(active_connections).
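
On the timezone note: a minimal sketch of a fixed offset, where the 3600 seconds (one hour) and the start_epoch/end_epoch names are only assumed examples; adjust the sign and size of the offset to your environment:

| eval tz_offset_sec=3600 ``` assumed example: event timestamps are one hour ahead of Splunk server time ```
| eval start_epoch=strptime(CONNSTA, "%Y-%m-%d %H:%M:%S.%N") - tz_offset_sec
| eval end_epoch=strptime(CONNEND, "%Y-%m-%d %H:%M:%S.%N") - tz_offset_sec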


Using your sample data (not the best use case):

| makeresults
| eval events="55300,2022-10-15 07:33:40.919446,2022-10-19 07:25:46.440451|55300,2022-10-15 07:33:57.704310,2022-10-19 07:17:45.082410|55300,2022-10-15 07:34:39.447813,2022-10-19 07:25:42.237149|55300,2022-10-15 07:34:48.125852,2022-10-19 07:38:46.745400|55300,2022-10-15 07:34:55.561466,2022-10-19 07:33:44.027628|55300,2022-10-15 07:48:57.706834,2022-10-19 07:17:45.086782|55300,2022-10-15 07:57:37.375340,2022-10-19 07:24:24.382886|55300,2022-10-15 08:01:05.756897,2022-10-19 06:50:45.339550|55300,2022-10-15 08:02:24.329066,2022-10-19 07:22:37.247145|55300,2022-10-15 08:21:32.202534,2022-10-19 06:46:35.348080|55300,2022-10-15 08:44:39.721370,2022-10-19 07:38:46.746090|55300,2022-10-15 08:59:03.215559,2022-10-19 07:39:39.372292|55300,2022-10-16 15:08:47.255462,2022-10-19 07:24:24.353308|55300,2022-10-16 17:37:33.308594,2022-10-19 06:50:45.328513|55300,2022-10-17 10:00:11.261248,2022-10-18 10:00:12.426608"
| eval events=split(events, "|")
| mvexpand events
| rex field=events "^(?<PORT>\d+)\,(?<CONNSTA>[^\,]*)\,(?<CONNEND>[^\,]*)"
``` parse each timestamp into epoch time and tag it as a start or an end marker ```
| eval start_time=strptime(CONNSTA, "%Y-%m-%d %H:%M:%S.%N")."|start"
| eval end_time=strptime(CONNEND, "%Y-%m-%d %H:%M:%S.%N")."|end"
| eval event_list=MVAPPEND(start_time, end_time)
``` give each start/end marker its own row so the running total sees both ```
| mvexpand event_list
| rex field=event_list "^(?<_time>\d+\.\d+)\|(?<event_type>.*)"
| eval event_count=CASE(event_type="start", 1, event_type="end", -1)
``` running total of open connections: +1 at each start, -1 at each end ```
| sort 0 _time
| accum event_count AS active_connection
| timechart span=5m max(active_connection) AS active_connection
| fillnull value=0
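
One caveat on the last two lines: fillnull value=0 writes a zero into any 5-minute bucket in which no start or end event landed, even though connections may still have been open across it. A possible variant, sketched here with the standard filldown command rather than taken from the answer above, carries the last known count forward instead:

| timechart span=5m last(active_connection) AS active_connection
| filldown active_connection ``` carry the last known count across quiet buckets ```
| fillnull value=0 ``` buckets before the first event genuinely start at zero ```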

mcaulsc
Path Finder

Thanks,

There are a couple of functions in there that I've not used yet; it certainly produces a timechart. I just need to reduce the data and try to validate what I see against the data.
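
One way to sanity-check a point on the chart might be to pick an arbitrary instant and count the rows whose start/end interval covers it, then compare that number with the corresponding bucket. This is sketched against the CONNSTA/CONNEND fields from the demo search, and the timestamp is arbitrary:

| eval check_epoch=strptime("2022-10-17 12:00:00.000000", "%Y-%m-%d %H:%M:%S.%N") ``` arbitrary instant to check ```
| where strptime(CONNSTA, "%Y-%m-%d %H:%M:%S.%N") <= check_epoch AND strptime(CONNEND, "%Y-%m-%d %H:%M:%S.%N") > check_epoch
| stats count AS active_connections

With the sample data above this should return 15, since all fifteen connections start before that instant and end after it.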

Thanks for your time and the solution.

johnhuang
Motivator

I have a few ideas, but I'd like to have a better understanding:

Are there any other events, e.g. polling, heartbeat, keep-alive, etc., available that could be leveraged to determine which ports are active in each time interval?

How big is the dataset? Roughly how many hosts and ports are in scope for this report? 
