Splunk Search

parse ps command

lorinj62
Engager

I have events like this:

11/06/2023 12:34:56 ip 1.2.3.4 This is record 1 of 5
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
user 1 1.0 0.0 2492 604 ? Ss 12:27 0:00 proc01
user 6 0.5 0.0 2608 548 ? S 12:27 0:00 proc02
user 19 0.0 0.0 12168 7088 ? S 12:27 0:00 proc03
user 223 0.0 0.1 852056 39300 ? Ssl 12:27 0:00 proc04
user 470 0.0 0.0 7844 6016 pts/0 Ss 12:27 0:00 proc05
user 683 0.0 0.0 7872 3380 pts/0 R+ 12:37 0:00 proc06

11/06/2023 12:34:56 ip: 1.2.3.4 This is record 2 of 5
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
user 1 0.0 0.0 2492 604 ? Ss 12:27 0:00 proc07
user 6 9.0 0.0 2608 548 ? S 12:27 0:00 proc08
user 19 6.0 0.0 12168 7088 ? S 12:27 0:00 proc09
user 223 0.0 0.1 852056 39300 ? Ssl 12:27 0:00 proc10
user 470 0.0 0.0 7844 6016 pts/0 Ss 12:27 0:00 proc11
user 683 0.0 0.0 7872 3380 pts/0 R+ 12:37 0:00 proc12

The events repeat with different data but the same structure: record 1 of 18, record 2 of 18, and so on.

The dates and times are the same for each "subsection" of the ps command.

I want to graph each "proc" to show its CPU and memory usage over time. The processes appear in random order. I already have the timestamp line parsed with fields extracted (such as the ip), and I want the header of the ps output to supply the field names for the ps data.

I'm struggling with this! I tried mvexpand and rex with max_match=0, but couldn't get either to work.

Thanks for any help.


FelixLeh
Contributor

Try this:

| rex field=_raw "(?<header>[^\n]+)"
| eval temp = split(_raw,"
")
| mvexpand temp
| regex temp="proc\d+"
| rex field=temp "(?<USER>[^\s]+)\s+(?<PID>[^\s]+)\s+(?<CPU>[^\s]+)\s+(?<MEM>[^\s]+)\s+(?<VSZ>[^\s]+)\s+(?<RSS>[^\s]+)\s+(?<TTY>[^\s]+)\s+(?<STAT>[^\s]+)\s+(?<START>[^\s]+)\s+(?<TIME>[^\s]+)\s+(?<COMMAND>.+)"
| rename CPU as "%CPU" MEM as "%MEM"
| fields - temp

You can then use transforming commands (such as chart or timechart) split by the COMMAND field, which holds the process names.
Be careful to keep the literal line break inside the quotes of the split() call so that the event is split on newlines. If that doesn't work, try "\n" instead.

EDIT: The version from @ITWhisperer with mvindex is simpler.
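
As a rough sketch of such a transforming command: assuming the extraction above has already run and the rename line is left out (so the fields are still named CPU and MEM), something like the following might chart peak CPU per process over time. The span, limit, and useother values are illustrative assumptions, not part of the original answer.

```spl
| timechart span=1m limit=0 useother=f max(CPU) by COMMAND
```

Swapping max(CPU) for max(MEM) should give the matching memory chart. limit=0 with useother=f is meant to keep every process as its own series instead of folding the less active ones into OTHER.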


lorinj62
Engager

Thanks, all. I ended up using a modified version of @FelixLeh's search, and it works well!


ITWhisperer
SplunkTrust

Assuming each event contains the timestamp line, followed by a header line, and then one line per process, you could try something like this:

| eval process=mvindex(split(_raw,"
"),2,-1)
| mvexpand process
| rex field=process "(?<USER>[^\s]+)\s+(?<PID>[^\s]+)\s+(?<CPU>[^\s]+)\s+(?<MEM>[^\s]+)\s+(?<VSZ>[^\s]+)\s+(?<RSS>[^\s]+)\s+(?<TTY>[^\s]+)\s+(?<STAT>[^\s]+)\s+(?<START>[^\s]+)\s+(?<TIME>[^\s]+)\s+(?<COMMAND>.*)"
| chart max(CPU) max(MEM) by _time PID
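
To try the extraction without indexed data, a run-anywhere sketch along these lines might help. It builds a single fake event with makeresults from the sample in the question (shortened to two process lines, which is an assumption made for brevity), then applies the same split, mvindex, and mvexpand steps. The literal line breaks inside the quoted strings are intentional, and the final table is only there to inspect the extracted fields.

```spl
| makeresults
| eval _raw="11/06/2023 12:34:56 ip 1.2.3.4 This is record 1 of 5
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
user 1 1.0 0.0 2492 604 ? Ss 12:27 0:00 proc01
user 6 0.5 0.0 2608 548 ? S 12:27 0:00 proc02"
| eval process=mvindex(split(_raw,"
"),2,-1)
| mvexpand process
| rex field=process "(?<USER>[^\s]+)\s+(?<PID>[^\s]+)\s+(?<CPU>[^\s]+)\s+(?<MEM>[^\s]+)\s+(?<VSZ>[^\s]+)\s+(?<RSS>[^\s]+)\s+(?<TTY>[^\s]+)\s+(?<STAT>[^\s]+)\s+(?<START>[^\s]+)\s+(?<TIME>[^\s]+)\s+(?<COMMAND>.*)"
| table _time USER PID CPU MEM COMMAND
```

If this yields one row per process line, the same pipeline can be pointed at the real events. Since PIDs repeat across records in the sample, splitting the chart by COMMAND rather than PID may be closer to the per-process graph asked for in the question.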

 
