Solved: How to calculate duration time after office hour w...

Splunkin · ‎03-14-2021

Im currently having trouble with query to get result of user activity duration after office hour within a month.

i expected result will be like below:
Query A:
user A = 13d 22h 12m 2s
user B = 10d 15h 27m 3s

OR

Query B:
user A=320h 23m 2s
user B=267h 42m 1s

Both answer can be acceptable for me to get as i try multiple set of query for the result above. Both query take time range after 5.30pm until 8.00am only for a month.

Below sample query :
Query A:
| eval date_hourmin = strftime(_time, "%H%M")
| where date_hourmin>=1730 OR date_hourmin<=800
| transaction user,date_hourmin
| stats sum(duration) as duration by user
| sort - duration
| eval string_dur = tostring(round(duration), "duration")
| eval formatted_dur = replace(string_dur,"(?:(\d+)\+)?0?(\d+):0?(\d+):0?(\d+)","\1d \2h \3m \4s")
| eval result=replace(formatted_dur, "^d (0h (0m )?)?","")
| table user duration string_dur formatted_dur result

Query B:
| eval date_hourmin = strftime(_time, "%H%M")
| where date_hourmin>=1730 OR date_hourmin<=800
| convert timeformat="%b %Y" ctime(_time) as date_month
| streamstats earliest(date_hourmin) as time_in latest(date_hourmin) as time_out by date_month
| eval duration=time_out-time_in
| stats values(src_ip) as SourceIP values(srcPort) as SourcePort values(dstService) as DestService earliest(time_in) as TimeIn latest(time_out) as TimeOut values(dest_ip) as DestIP values(dstPort) as DestPort sum(duration) as TotalDuration count by date_month,user
| eval secs=TotalDuration%60,mins=floor((TotalDuration/60)%60),hrs=floor((TotalDuration/3600)%60)
| eval HOURS=if(len(hrs)=1,"0".tostring(hrs), tostring(hrs)),MINUTES=if(len(mins)=1,"0".tostring(mins), tostring(mins)),SECONDS=if(len(secs)=1,"0".tostring(secs), tostring(secs))
| eval Time=HOURS.":".MINUTES.":".SECONDS
| table user SourceIP SourcePort DestIP DestPort DestService Time count
| sort - duration limit=10

Both query display the result, HOWEVER, its look like both query are not giving accurate result.
Ive been struggling for this kind of query for a month now, perhaps im missing something here.

Really appreciated if anyone can help n assist on this. TQ.

tscroggins · ‎03-14-2021

If your boundary is the previous month between 8:00 AM local time on the first day of the month and 8:00 AM local time on the first day of the following month, you can set time constraints in the base search:

... earliest=-1mon@mon+8h latest=@mon+8h

When user activity crosses time boundaries, where do you begin counting time? For example:

Mar 1 14:30 user=foo src_ip=10.0.0.1 src_port=65535 dest_ip=10.1.0.1 dest_port=80 dest_service=http
Mar 1 17:45 user=foo src_ip=10.0.0.1 src_port=65535 dest_ip=10.1.0.1 dest_port=80 dest_service=http

Do you begin counting at 17:30 with the assumption the user has been working since 14:30, or do you begin counting at 17:45? For simplicity, I'll assume 17:45 and report the duration of observed activity.

This returns _time values after 5:30 PM local time or before 8:00 AM local time:

| where (_time > relative_time(_time, "@d+1050m")) OR (_time < relative_time(_time, "@d+8h"))

17:30 to 08:00 is less than 24 hours, so we can shift _time values by eight hours (28,800 seconds) to align or normalize them to the same calendar day:

| eval _time=_time - 28800

Finally, we can calculate the range (duration in seconds) of _time by day and user and then sum the duration for a grand total:

| bin _time span=1d as date_day
| stats range(_time) as duration by date_day user
| stats sum(duration) as duration by user
| fieldformat duration=tostring(duration, "duration")

Using Splunk audit data as an example:

index=_audit user=* earliest=-1mon@mon+8h latest=@mon+8h
| where (_time > relative_time(_time, "@d+1050m")) OR (_time < relative_time(_time, "@d+8h"))
| eval _time=_time - 28800
| bin _time span=1d as date_day
| stats range(_time) as duration by date_day user
| stats sum(duration) as duration by user
| fieldformat duration=tostring(duration, "duration")

You can extend stats with additional fields as needed:

| stats range(_time) as duration by date_day user src_ip src_port dest_ip dest_port dest_service
| stats sum(duration) as duration by user src_ip src_port dest_ip dest_port dest_service
| fieldformat duration=tostring(duration, "duration")

The fieldformat command uses the tostring function to display the duration field (in seconds) in duration format: [d+]HH:MM:SS[.FFFFFF]. The underlying duration field value remains a decimal number.

View solution in original post

tscroggins · ‎03-14-2021

If your boundary is the previous month between 8:00 AM local time on the first day of the month and 8:00 AM local time on the first day of the following month, you can set time constraints in the base search:

... earliest=-1mon@mon+8h latest=@mon+8h

When user activity crosses time boundaries, where do you begin counting time? For example:

Mar 1 14:30 user=foo src_ip=10.0.0.1 src_port=65535 dest_ip=10.1.0.1 dest_port=80 dest_service=http
Mar 1 17:45 user=foo src_ip=10.0.0.1 src_port=65535 dest_ip=10.1.0.1 dest_port=80 dest_service=http

Do you begin counting at 17:30 with the assumption the user has been working since 14:30, or do you begin counting at 17:45? For simplicity, I'll assume 17:45 and report the duration of observed activity.

This returns _time values after 5:30 PM local time or before 8:00 AM local time:

| where (_time > relative_time(_time, "@d+1050m")) OR (_time < relative_time(_time, "@d+8h"))

17:30 to 08:00 is less than 24 hours, so we can shift _time values by eight hours (28,800 seconds) to align or normalize them to the same calendar day:

| eval _time=_time - 28800

Finally, we can calculate the range (duration in seconds) of _time by day and user and then sum the duration for a grand total:

| bin _time span=1d as date_day
| stats range(_time) as duration by date_day user
| stats sum(duration) as duration by user
| fieldformat duration=tostring(duration, "duration")

Using Splunk audit data as an example:

index=_audit user=* earliest=-1mon@mon+8h latest=@mon+8h
| where (_time > relative_time(_time, "@d+1050m")) OR (_time < relative_time(_time, "@d+8h"))
| eval _time=_time - 28800
| bin _time span=1d as date_day
| stats range(_time) as duration by date_day user
| stats sum(duration) as duration by user
| fieldformat duration=tostring(duration, "duration")

You can extend stats with additional fields as needed:

| stats range(_time) as duration by date_day user src_ip src_port dest_ip dest_port dest_service
| stats sum(duration) as duration by user src_ip src_port dest_ip dest_port dest_service
| fieldformat duration=tostring(duration, "duration")

The fieldformat command uses the tostring function to display the duration field (in seconds) in duration format: [d+]HH:MM:SS[.FFFFFF]. The underlying duration field value remains a decimal number.

Splunkin · ‎03-15-2021

Hi tscroggins,

Thank you for your sharing idea, looks like i manage to get the duration and seem accurate for me. Kudos to you.

However, i wonder if can i use this topic to list out top 3 src_ip, src_port, dest_ip, dest_port, service by using stats values? or is it need to change to dc(field) first before get back the values?
what i understand when using the stats values(field), it will list out all the values from the field.
i can limit it using eval field = mvindex(field,0,2) --> for three field, but it will return on the first three field value that it capture.

Ex output:

User src_ip src_port dest_ip dest_port service duration
A 1.1.1.1 22 1.1.1.1 80 http 45h 22m 3s
2.2.2.2 445 2.2.2.2 443 https
3.3.3.3 53 3.3.3.3 53 dns

For now i have to do it manually for each user n all the detail from the user.
really helpful if get breakthrough this problem.

tscroggins · ‎03-16-2021

@Splunkin

If you've used something like this:

| stats range(_time) as duration by date_day user src_ip src_port dest_ip dest_port dest_service
| stats sum(duration) as duration by user src_ip src_port dest_ip dest_port dest_servic

you can use streamstats to find the top 3:

| sort 0 + user - duration

| streamstats count by user

| where count < 4

Splunkin · ‎03-16-2021

hi @tscroggins

thanks for your suggest. Its work, but it separate each field which the duration of it also will be split.

| stats range(_time) as duration by date_day user src_ip srcPort dest_ip dstPort dstService
| stats sum(duration) as duration by user src_ip srcPort dest_ip dstPort dstService
| sort 0 + user - duration
| streamstats values(src_ip) as src_ip values(srcPort) as srcPort values(dest_ip) as dest_ip values(dstPort) as dstPort values(dstService) as dstService count by user duration
| where count <4
| fieldformat duration=tostring(duration,"duration")

What i would expect is something like stats values, where it would be display like below.

Sample as above comment that i mention.
User src_ip src_port dest_ip dest_port service duration
10.1.1.1 22 10.1.1.1 80 http
A 10.1.1.2 23 10.1.1.2 443 https 45h 22m 3s
10.1.1.3 25 10.1.1.3 8447 https

10.1.1.4 25 10.1.1.4 80 http
B 10.1.1.5 37 10.1.1.5 443 https 40h 22m 3s
10.1.1.6 445 10.1.1.6 8447 https

tscroggins · ‎03-16-2021

@Splunkin

Don't add values() aggregations to streamstats. We're only using streamstats to count sorted duration values by user:

| sort 0 + user - duration
| streamstats count by user
| where count <4
| fieldformat duration=tostring(duration,"duration")

The earlier stats command has already provided tabular output with the fields you need. You can rename them in stats or with separate rename commands if desired.

How to calculate duration time after office hour within a month

eval

fields

stats

transaction

Can’t make it to .conf25? Join us online!

Calling All Security Pros: Ready to Race Through Boston?

Beyond Detection: How Splunk and Cisco Integrated Security Platforms Transform ...

Customer success is front and center at .conf25

Are you a member of the Splunk Community?