Solved: Joining two records from a csv file based on a col...

madakkas · ‎02-22-2018

I am working on a monitoring tool in which I have to monitor job completion and calculate estimates accordingly.

So far I have been able to capture the start and end times of the jobs as shown below, and I have written them to a file, job_monitor.csv. I run this search every two minutes and append the relevant jobs.

sl_no,JOBNAME,START_TIME,END_TIME
1,S3,,15.51.42
2,S2,,15.21.35
3,J3,,14.52.28
4,J2,,14.51.22
5,S1,,15.01.28
6,J1,,14.31.02
7,S3,15.21.42,
8,S2,15.01.34,
9,S1,14.51.28,
10,J3,14.51.28,
11,J2,14.31.22,
12,J1,14.30.02,

Once the above details are captured, I want to convert them into the format below. There can be several jobs with the same JOBNAME, but two such jobs cannot run in parallel; i.e., if there is an entry for START_TIME, then the next END_TIME belongs to that same job. I have not been able to map the records so that they look like this:

sl_no,JOBNAME,START_TIME,END_TIME
1,S3,15.21.42,15.51.42
2,S2,15.01.34,15.21.35
3,J3,14.51.28,14.52.28
4,J2,14.31.22,14.51.22
5,S1,14.51.28,15.01.28
6,J1,14.30.02,14.31.02

Any guidance would be appreciated. Thank you in advance to all here.

mayurr98 · ‎02-23-2018

You can try something like this

<your_base_search> | stats values(START_TIME) as START_TIME values(END_TIME) as END_TIME by JOBNAME
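For readers who want to see what this grouping does to the sample rows, here is a minimal sketch in plain Python (not SPL). It only imitates the idea of `stats values(...) by JOBNAME`: collect the distinct non-empty START_TIME and END_TIME values for each JOBNAME. The function name `values_by_job` and the inline `RAW` string are illustrative assumptions; the rows themselves are copied from the question.

```python
import csv
import io
from collections import defaultdict

# Sample rows copied from the question (job_monitor.csv).
RAW = """sl_no,JOBNAME,START_TIME,END_TIME
1,S3,,15.51.42
2,S2,,15.21.35
3,J3,,14.52.28
4,J2,,14.51.22
5,S1,,15.01.28
6,J1,,14.31.02
7,S3,15.21.42,
8,S2,15.01.34,
9,S1,14.51.28,
10,J3,14.51.28,
11,J2,14.31.22,
12,J1,14.30.02,
"""


def values_by_job(csv_text):
    """Collect the distinct non-empty START_TIME/END_TIME values per JOBNAME."""
    grouped = defaultdict(lambda: {"START_TIME": set(), "END_TIME": set()})
    for row in csv.DictReader(io.StringIO(csv_text)):
        for field in ("START_TIME", "END_TIME"):
            if row[field]:  # skip the empty half of each record
                grouped[row["JOBNAME"]][field].add(row[field])
    # Return the distinct values as sorted lists, one entry per job.
    return {
        job: {field: sorted(vals) for field, vals in fields.items()}
        for job, fields in grouped.items()
    }


result = values_by_job(RAW)
print(result["S3"])
```

With one start and one end per job, each list holds a single value, which matches the requested output. If a JOBNAME appeared in more than one run, each list would hold several values and the start/end pairing would no longer be explicit.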

FrankVl · ‎02-23-2018

Any specific reason to do it like this, instead of suggesting to use the transaction command?

Especially since he states that job names can be re-used (but not in parallel), using transactions may give more accurate results than using stats values()..., right?
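To make the concern about reused job names concrete, here is a rough Python sketch (not SPL, and not a description of how the transaction command works internally) of order-aware pairing: rows are processed in time order, a start opens a run, and the next end for the same JOBNAME closes it. The function name `pair_runs` and the second J1 run (16.00.00 to 16.05.10) are hypothetical, added only for illustration. The sketch assumes zero-padded HH.MM.SS values from a single day, so that string order equals time order.

```python
def pair_runs(rows):
    """Pair each start with the next end of the same job, in time order."""
    # Assumption: zero-padded HH.MM.SS strings from one day sort chronologically.
    events = sorted(rows, key=lambda r: r["START_TIME"] or r["END_TIME"])
    open_runs = {}  # JOBNAME -> record still waiting for its END_TIME
    runs = []
    for row in events:
        job = row["JOBNAME"]
        if row["START_TIME"]:
            record = {"JOBNAME": job, "START_TIME": row["START_TIME"], "END_TIME": ""}
            open_runs[job] = record
            runs.append(record)
        elif row["END_TIME"]:
            record = open_runs.pop(job, None)
            if record is None:
                # End without a captured start: keep it rather than drop it.
                runs.append({"JOBNAME": job, "START_TIME": "", "END_TIME": row["END_TIME"]})
            else:
                record["END_TIME"] = row["END_TIME"]
    return runs


# J1 runs twice; the second run is hypothetical data, not from the question.
rows = [
    {"JOBNAME": "J1", "START_TIME": "", "END_TIME": "16.05.10"},
    {"JOBNAME": "J1", "START_TIME": "", "END_TIME": "14.31.02"},
    {"JOBNAME": "J1", "START_TIME": "16.00.00", "END_TIME": ""},
    {"JOBNAME": "J1", "START_TIME": "14.30.02", "END_TIME": ""},
]
runs = pair_runs(rows)
for run in runs:
    print(run)
```

Here the two J1 runs come out as two separate records, whereas grouping by JOBNAME alone would merge them into one row with two start values and two end values.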

madakkas · ‎02-26-2018

The stats solution did work, thank you for that. I just appended a few additional where clauses to meet my needs.

Thank you, mayurr98.

mayurr98 · ‎02-23-2018

stats is better than the transaction command any time; I gave this solution with performance in mind. I think it would still give accurate results as long as there is a unique START_TIME and END_TIME for a specific JOBNAME, because I could see from the table that there are only two records for each JOBNAME.

madakkas · ‎02-22-2018

Just to add: there could be a situation where a job has started but has not completed. In that case I would need the record as below:

sl_no,JOBNAME,START_TIME,END_TIME
1,S3,15.21.42,

This would let me know that the job has started but not completed yet. Thanks again.
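As a small illustration of this requirement, the Python sketch below (not SPL) merges start and end rows per JOBNAME and leaves END_TIME blank when no end row has been captured yet. It assumes at most one run per JOBNAME. The names `merge_rows`, `to_csv` and `still_running` are illustrative, and the three-row input is a hypothetical snapshot built from values in this thread in which S3 has started but not finished.

```python
import csv
import io

# Hypothetical snapshot: S2 has finished, S3 has started but not ended.
RAW = """sl_no,JOBNAME,START_TIME,END_TIME
1,S2,,15.21.35
2,S3,15.21.42,
3,S2,15.01.34,
"""


def merge_rows(csv_text):
    """Merge start and end rows per JOBNAME; END_TIME stays blank if absent."""
    merged = {}  # dicts preserve insertion order
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = merged.setdefault(row["JOBNAME"], {"START_TIME": "", "END_TIME": ""})
        for field in ("START_TIME", "END_TIME"):
            if row[field]:
                record[field] = row[field]
    return merged


def to_csv(merged):
    """Write the merged records back out with a fresh sl_no column."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["sl_no", "JOBNAME", "START_TIME", "END_TIME"])
    for sl_no, (job, rec) in enumerate(merged.items(), start=1):
        writer.writerow([sl_no, job, rec["START_TIME"], rec["END_TIME"]])
    return out.getvalue()


merged = merge_rows(RAW)
report = to_csv(merged)
# Jobs with a start but no end are the ones still running.
still_running = [job for job, rec in merged.items()
                 if rec["START_TIME"] and not rec["END_TIME"]]
print(report)
print(still_running)
```

The S3 line in the report ends with a trailing comma, i.e. an empty END_TIME, which is the shape requested above.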

Joining two records from a csv file based on a column
