parse ps command

lorinj62
Engager

I have events like this:

11/06/2023 12:34:56 ip 1.2.3.4 This is record 1 of 5
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
user 1 1.0 0.0 2492 604 ? Ss 12:27 0:00 proc01
user 6 0.5 0.0 2608 548 ? S 12:27 0:00 proc02
user 19 0.0 0.0 12168 7088 ? S 12:27 0:00 proc03
user 223 0.0 0.1 852056 39300 ? Ssl 12:27 0:00 proc04
user 470 0.0 0.0 7844 6016 pts/0 Ss 12:27 0:00 proc05
user 683 0.0 0.0 7872 3380 pts/0 R+ 12:37 0:00 proc06

11/06/2023 12:34:56 ip: 1.2.3.4 This is record 2 of 5
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
user 1 0.0 0.0 2492 604 ? Ss 12:27 0:00 proc07
user 6 9.0 0.0 2608 548 ? S 12:27 0:00 proc08
user 19 6.0 0.0 12168 7088 ? S 12:27 0:00 proc09
user 223 0.0 0.1 852056 39300 ? Ssl 12:27 0:00 proc10
user 470 0.0 0.0 7844 6016 pts/0 Ss 12:27 0:00 proc11
user 683 0.0 0.0 7872 3380 pts/0 R+ 12:37 0:00 proc12

and repeating with different data, but the same structure: record 1 of 18...record 2 of 18...etc.

The dates and times are the same for each "subsection" of the ps command.

I want to graph each "proc" to show its CPU and memory usage over time. The processes will be in a random order. I already have the timestamp line parsed with fields extracted (like the ip), and I want the header row of the ps output to become the field names for the ps data.

I'm struggling with this! I tried mvexpand and/or max_match=0 but failed.

Thanks for any help.

1 Solution

FelixLeh
Contributor

Try this:

| rex field=_raw "(?<header>[^\n]+)"
| eval temp = split(_raw,"
")
| mvexpand temp
| regex temp="proc\d+"
| rex field=temp "(?<USER>[^\s]+)\s+(?<PID>[^\s]+)\s+(?<CPU>[^\s]+)\s+(?<MEM>[^\s]+)\s+(?<VSZ>[^\s]+)\s+(?<RSS>[^\s]+)\s+(?<TTY>[^\s]+)\s+(?<STAT>[^\s]+)\s+(?<START>[^\s]+)\s+(?<TIME>[^\s]+)\s+(?<COMMAND>.*)"
| rename CPU as "%CPU" MEM as "%MEM"
| fields - temp

You can then use transforming commands with the COMMAND field (the procs).
Also, be careful to keep the literal new line inside the split command so the events split correctly. If that doesn't work, try "\n" instead.
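
For example, if you skip the rename step so the fields keep their CPU and MEM names, a transforming command along these lines (just a sketch; the 5-minute span is an assumption) should graph each proc's CPU over time:

| timechart span=5m max(CPU) by COMMAND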

EDIT: The version from @ITWhisperer with mvindex is simpler...

lorinj62
Engager

Thanks all. I ended up using a modified version of @FelixLeh's answer. It works well!

ITWhisperer
SplunkTrust

Assuming each event contains the timestamp line, followed by a header line, then lines for each process, you could try something like this:

| eval process=mvindex(split(_raw,"
"),2,-1)
| mvexpand process
| rex field=process "(?<USER>[^\s]+)\s+(?<PID>[^\s]+)\s+(?<CPU>[^\s]+)\s+(?<MEM>[^\s]+)\s+(?<VSZ>[^\s]+)\s+(?<RSS>[^\s]+)\s+(?<TTY>[^\s]+)\s+(?<STAT>[^\s]+)\s+(?<START>[^\s]+)\s+(?<TIME>[^\s]+)\s+(?<COMMAND>.*)"
| chart max(CPU) max(MEM) by _time PID
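
Since the processes come back in a random order, splitting by COMMAND rather than PID may keep each series aligned to the same proc (assuming the proc names are stable across records); a minimal variant of the last line:

| chart max(CPU) max(MEM) by _time COMMAND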
