Hi All,
I am working on predicting the start time of a job, and I have the scheduled time as an independent variable.
Approach 1:
I am thinking of converting the H:M:S start time and scheduled time into seconds, then predicting the start time in seconds using the scheduled time in seconds and the hour of the scheduled time as independent variables. I would then convert the prediction back into H:M:S and append it to the respective date.
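A minimal sketch of the conversion steps in approach 1, using only the standard library. The function names, the sample time, and the fixed offset standing in for a model prediction are all hypothetical; the sketch assumes the times are "HH:MM:SS" strings and the date is known separately.

```python
from datetime import date, datetime, time, timedelta


def hms_to_seconds(hms: str) -> int:
    """Convert an 'HH:MM:SS' string to seconds since midnight."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


def seconds_to_datetime(day: date, seconds: float) -> datetime:
    """Attach predicted seconds-since-midnight to a date.

    Adding a timedelta (instead of building a time object) means a
    prediction past 86400 seconds rolls over to the next day.
    """
    return datetime.combine(day, time.min) + timedelta(seconds=round(seconds))


# Features for one job (hypothetical sample values).
sched_sec = hms_to_seconds("23:58:30")   # scheduled time in seconds
sched_hour = sched_sec // 3600           # hour of the scheduled time

# Stand-in for a model prediction of the start time in seconds.
predicted_sec = sched_sec + 200

predicted_start = seconds_to_datetime(date(2024, 1, 15), predicted_sec)
print(predicted_start)
```

Note that jobs scheduled shortly before midnight can start on the following day, so a plain seconds-since-midnight target wraps around; the timedelta-based conversion above handles that only if the predicted value is allowed to exceed 86400.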
Approach 2:
Another approach is to convert the start time and scheduled time into epoch time, take the difference between them, and predict that difference using the scheduled time in epoch, the hour of the scheduled time, and the type of the job as independent variables.
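A minimal sketch of how the target and features in approach 2 could be built, again with the standard library only. The record layout, the field names, the sample jobs, and the assumption that timestamps are in UTC are all hypothetical.

```python
from datetime import datetime, timezone


def parse_utc(ts: str) -> datetime:
    """Parse 'YYYY-MM-DD HH:MM:SS' as a UTC datetime (assumed timezone)."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)


# Hypothetical job records.
jobs = [
    {"scheduled": "2024-01-15 23:58:30", "started": "2024-01-16 00:01:50", "job_type": "etl"},
    {"scheduled": "2024-01-16 06:00:00", "started": "2024-01-16 06:00:45", "job_type": "report"},
]

rows = []
for job in jobs:
    sched = parse_utc(job["scheduled"])
    start = parse_utc(job["started"])
    rows.append({
        "sched_epoch": sched.timestamp(),                      # feature
        "sched_hour": sched.hour,                              # feature
        "job_type": job["job_type"],                           # feature (needs encoding)
        "delay_sec": start.timestamp() - sched.timestamp(),    # target
    })

# After predicting a delay, the start time is recovered by adding it back.
predicted_delay = rows[0]["delay_sec"]  # stand-in for a model prediction
recovered = datetime.fromtimestamp(rows[0]["sched_epoch"] + predicted_delay, tz=timezone.utc)
print(recovered)
```

Because the target is a difference between two full timestamps, a job that starts after midnight needs no special handling here.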
Please let me know which approach is better, and whether the RandomForestRegressor algorithm is feasible here.
Thanks in advance!
This question is impossible to answer well without knowing more about the data, but here are a few suggestions based on what you've provided:
Thanks Aoliner,
I have tried both approaches, and I got good results with approach 1, converting the start time into seconds.
Random forest is working fine for me, but I have some outliers, and because of them my RMSE is high even though the R-squared value comes out at 0.99.
I have one question: should we remove the outliers (deviations in the data), or should they stay in?
Do you consider the outliers to be noise (e.g., measurement error, external interference, etc.) or a phenomenon you want to model?
Also, perfect prediction isn't always possible, especially in the presence of random noise or factors missing from your dataset. You may find it difficult to do better than R^2=0.99.
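To get a feel for how much a few outliers drive RMSE, it can help to compare it with error measures that are less sensitive to large residuals. The sketch below uses made-up start times in seconds, with one job that started far later than predicted; the function names and numbers are hypothetical.

```python
import math
from statistics import mean, median


def rmse(actual, predicted):
    """Root mean squared error: squares residuals, so outliers dominate."""
    return math.sqrt(mean([(a - p) ** 2 for a, p in zip(actual, predicted)]))


def mae(actual, predicted):
    """Mean absolute error: outliers count linearly."""
    return mean([abs(a - p) for a, p in zip(actual, predicted)])


def median_abs_error(actual, predicted):
    """Median absolute error: largely unaffected by a few outliers."""
    return median([abs(a - p) for a, p in zip(actual, predicted)])


# Made-up data: the last job is an outlier.
actual = [100, 200, 300, 400, 5000]
predicted = [110, 190, 310, 390, 400]

print("RMSE:", rmse(actual, predicted))
print("MAE:", mae(actual, predicted))
print("Median abs error:", median_abs_error(actual, predicted))
```

If the median absolute error is small while RMSE is large, the model fits typical jobs well and the RMSE mostly reflects the outliers, which ties back to the question of whether those outliers are noise or something worth modeling.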