I am trying to find a way to compare the results listed in a table to each other.
Basically the table lists the results of many different test where some test have the same names, but have been run and rerun so they have same test_names but different test_IDs.
something like this
test_name | test_id | Score | Drift |
test 1 | .98 | 100 | |
test 1 | .99 | 98 | -2 |
test 1 | 1.00 | 100 | 2 |
test 2 | .01 | 30 | |
test 3 | 0.54 | 34 | |
test 3 | 0.55 | 76 | 42 |
I am looking for a way to take the score from line one and have some sort of logic that will look at the result of the next line and if the test has the sane test_name BUT a different test_ID it will subtract the first lines score from the second lines score and continue along until the (for example) the next line has a different test_name and it skips the line until it finds another line where the following line have the same test_name. and it continues on until all scores are compared.
The delta command almost works but I need to have the ability to say BY test_name
something like
| delta Score as Drift by test_name
unfortunately delta doesn't accept by clauses
I am trying to find a way to calculate the drift column using Splunk so I can create a detection where the drift exceeds a specific threshold.
Hi @dolj
You should be able to use the streamstats command, which allows you to perform operations on a stream of events and group them by a specific field. In your case, you want to calculate the difference in scores for tests with the same test_name but different test_id. Here's how you can do it:
Here's a Splunk search query that should accomplish this:
| your_search_here | sort test_name, test_id | streamstats current=f last(Score) as previous_score by test_name | eval Drift = if(isnull(previous_score), null(), Score - previous_score) | table test_name, test_id, Score, Drift
Please let me know how you get on and consider adding karma to this or any other answer if it has helped.
Regards
Will
You made a point of emphasizing the different test_id (sane test_name BUT a different test_ID)
Is it possible to have a row with the SAME test_id at some point, i.e. could you insert a row at row 3 with test_name=test 1 and test_id="0.98"?
If so, the simple streamstats solution suggested by @livehybrid won't work.
Is it possible to have the same test_id and if so, what should be the behaviour?
Ah @bowesmana , I may have misunderstood the ask here, as you say. I used streamstats by test_name after sorting by (an assumed sequential) test_id. Although I'm not sure why there being the same test_id for multiple test_name would affect the output here, as I'm not using the test_id in the streamstats? I may have missed something though (and not had coffee yet!)
@dolj Please let us know how you are getting on, and if you clarify the requirement I'd be happy to help further and update the previously posted search if required 🙂
Please let me know how you get on and consider adding karma to this or any other answer if it has helped.
Regards
Will
Hi @dolj
You should be able to use the streamstats command, which allows you to perform operations on a stream of events and group them by a specific field. In your case, you want to calculate the difference in scores for tests with the same test_name but different test_id. Here's how you can do it:
Here's a Splunk search query that should accomplish this:
| your_search_here | sort test_name, test_id | streamstats current=f last(Score) as previous_score by test_name | eval Drift = if(isnull(previous_score), null(), Score - previous_score) | table test_name, test_id, Score, Drift
Please let me know how you get on and consider adding karma to this or any other answer if it has helped.
Regards
Will