Hi, how can I extract a table like this? ("myserver" is a field that is already extracted.)
source destination duration V
server1 myserver 0.001 9288
myserver server2 0.002 9288
server2 myserver 0.032 0298
myserver server1 0.004 9298
FYI: the duration is calculated as described below:
Line1 (duration 00:00:00.001) = (12:00:59.853) - (12:00:59.852)
Line2 (duration 00:00:00.002) = (start_S 12:00:59.855) - (start_S 12:00:59.853)
Line3 (duration 00:00:00.110) = (forWE_APP_AS: G 12:00:59.994) - (forWE_APP_AS: P 12:00:59.884)
Line4 (duration 00:00:00.004) = (end_E 12:01:00.007) - (end_E 12:01:00.003)
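(For reference, a duration like these can be computed in SPL by parsing the two timestamps with strptime and subtracting. This is just a sketch; the timestamps are hardcoded purely for illustration.)

```
| makeresults
| eval start_t=strptime("12:00:59.852", "%H:%M:%S.%Q"), end_t=strptime("12:00:59.853", "%H:%M:%S.%Q")
| eval duration=round(end_t-start_t, 3)
```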
Here is the log: (G=get, P=push)
12:00:59.852 app module1: G[server1]Q[000]V[9288]
12:00:59.853 app start_S: A_B V[9288]X[000000]G[0]L:
12:00:59.855 app module2: A_B V[9288]X[000000]G[0]L:
12:00:59.855 app start_S: C_D V[9288]X[000000]G[0]L:
12:00:59.881 app module3: A_B V[9288]X[000000]G[0]L:
12:00:59.884 app forWE_APP_AS: P[server2]K[000]V[0288]
12:00:59.994 app forWE_APP_AS: G[server2]K[000]V[0298]
12:00:59.995 app module2: A_B V[9298]X[000000]G[0]K:
12:01:00.003 app end_E: A_B V[9298]X[000000]G[0]K:
12:01:00.007 app module1: P[server1]K[458]V[9298]
12:01:00.007 app end_E: C_D V[9298]X[000000]G[0]K:
any idea?
Thanks
Trying to understand the rules...
What are your rules for determining that the second of your log lines has destination 'myserver'?
What is it about the 2nd log line that makes it the end time for your summary table row 1 calculation, and what then makes that line also the start time for your second data line?
What defines the logic that says the last 3 lines of your data make up the start/end times for the duration calculation of your last table row?
Hi
1 - the server name is extracted from the source field of the filename (it does not exist in the log)
2, 3 - actually this is the flow, from source to destination and the response returned to the source:
server1>myserver>server2>myserver>server1
0.001 0.002 0.110 0.004
I want to calculate each duration.
step 1: need this condition ... | WHERE V=V AND X=X
12:00:59.853 app start_S: A_B V[0001]X[000000]G[0]L:
12:00:59.855 app start_S: C_D V[0001]X[000000]G[0]L:
step 2: need this condition ... | WHERE X=X AND V=V+10
12:00:59.884 app forWE_APP_AS: P[server2]K[000]V[0288]X[000000]
12:00:59.994 app forWE_APP_AS: G[server2]K[000]V[0298]X[000000]
step 3: need this condition ... | WHERE V=V AND X=X
12:01:00.003 app end_E: A_B V[1000]X[000000]G[0]K:
12:01:00.007 app end_E: C_D V[1000]X[000000]G[0]K:
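A minimal sketch of how one of these start/end pairings could be done in SPL, assuming the two events of a step share the same V and X values (the rex patterns are assumptions based on the sample lines and would need adjusting to the real extractions):

```
... | rex "V\[(?<V>\d+)\]"
| rex "X\[(?<X>\d+)\]"
| streamstats min(_time) as start_time max(_time) as end_time by V X
| eval duration=end_time-start_time
```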
All of these steps tell me the duration of packets as they are sent and received.
FYI: about these fields:
A_B =start
C_D =end
P=push
G=get
any idea?
There are lots of things going on here that make this difficult. Your step 2 has no X value in your original example data, but in your explanation you show it having one. You have data in events that needs to be pushed up to previous events to make those values usable.
This example with your data shows some techniques to manipulate it. It makes lots of assumptions about grouping and does not get all the data in the right place, but hopefully it gives you some pointers.
| makeresults
| eval event=split("12:00:59.852 app module1: G[server1]Q[000]V[9288];12:00:59.853 app start_S: A_B V[9288]X[000000]G[0]L:;12:00:59.855 app module2: A_B V[9288]X[000000]G[0]L:;12:00:59.855 app start_S: C_D V[9288]X[000000]G[0]L:;12:00:59.881 app module3: A_B V[9288]X[000000]G[0]L:;12:00:59.884 app forWE_APP_AS: P[server2]K[000]V[0288];12:00:59.994 app forWE_APP_AS: G[server2]K[000]V[0298];12:00:59.995 app module2: A_B V[9298]X[000000]G[0]K:;12:01:00.003 app end_E: A_B V[9298]X[000000]G[0]K:;12:01:00.007 app module1: P[server1]K[458]V[9298];12:01:00.007 app end_E: C_D V[9298]X[000000]G[0]K:",";")
| mvexpand event
| table event
| eval myserver="myserver"
| rex field=event "(?<t>\d+:\d+:\d+\.\d+)\s\w+\s+"
| eval _time=strptime(t, "%H:%M:%S.%Q")
| rex field=event "app\s+(?<type>(module1:\sG|start_S:\s[AC]_[BD]|forWE_APP_AS:\s\w|end_E:\s[AC]_[BD]))"
| rex field=event "\s[PG]\[(?<server>[^]]*)\]"
| eval condition=case(match(type,"module1: G"), "1_1_1", match(type,"start_S: A_B"), "2_1_1", match(type,"start_S: C_D"), "2_2", match(type,"forWE_APP_AS: P"), "3_1", match(type,"forWE_APP_AS: G"), "3_2", match(type,"end_E: A_B"), "4_1", match(type,"end_E: C_D"), "4_2")
| rex field=condition "(?<step>\d)_(?<bound>\d)(_(?<group>\d))?"
| rex field=event "V\[(?<V>\d+)\]"
| eval VV=if(condition="3_2", V-10, tonumber(V))
| rex field=event "X\[(?<X>\d+)\]"
| eval X=coalesce(X, "000000")
| streamstats global=f min(_time) as mint max(_time) as maxt by VV step
| eval duration=if(bound=2,maxt-mint,null())
| streamstats global=f min(mint) as mint max(maxt) as maxt by VV group
| eval duration=coalesce(duration, if(group=1, maxt-mint, null()))
| fields - event t type mint maxt
| where isnotnull(bound)
| eval source=case(step=1 OR step=3, server, step=2 OR step=4, myserver)
| eval destination=case(step=1 OR step=3, myserver, step=2 OR step=4, server)
| table source destination duration V *
| stats values(source) as source values(destination) as destination max(duration) as duration by step VV
Thank you for the answer.
Here is the output:
step VV source destination duration
1 9288 server1 myserver 0.000000
2 9288 myserver 0.002000
3 288 server2 myserver 0.110000
4 9298 myserver 0.004000
1 - destinations for steps 2 and 4 are missing (should be 2=server2, 4=server1)
2 - VV for step 2 should be 0288
3 - VV for step 3 should be 0298
4 - duration for step 1 should be 0.001000
any idea?
Yes, filling in those gaps is challenging: for step 2, you need data from future events to populate server2 into the destination, whereas in step 4 you need data from past events. You can use forms of streamstats, but the challenge is that you don't have a common correlation id to group the events together, i.e. how do you know that server2, from the rows containing V=0288 and V=0298, is related to V=9288 in the data where it needs to be applied?
From your explanation earlier, I assume that you may have many events like this running concurrently in the log with different values of V - is that correct?
If you have the possibility of changing your logging output, that would make the reporting side easier. At the moment, I can't easily see a way to get where you want to go without very specific tweaking of the SPL, which may not be useful with your real data.
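As one possible direction (a sketch, not a tested solution): when a value such as the server name only appears in a later event, the usual SPL trick is to reverse the stream, carry the most recent non-null value backwards with streamstats, and reverse again. This assumes the flows do not interleave, which may not hold with real concurrent data:

```
... | reverse
| streamstats last(server) as next_server
| reverse
| eval destination=coalesce(destination, next_server)
```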