Hi,
I have a very specific problem. I have a field with the following values at different timestamps. Example:
1,3,20
0
2,3,43,9,12
3,3,40,8,20,9,80
2,3,20,9,30
6,2,0,3,30,4,42,5,29,6,80,9,92
This field actually represents very specific information, which I need to extract to feed my calculation.
The first number tells us how many fields there are to be extracted.
The second (and every other even number) is the name of the field to be extracted.
The third (and every other odd number) is the value of the field, whose name is stated just before.
That means that the last example I stated, 6,2,0,3,30,4,42,5,29,6,80,9,92, decodes to six fields: field 2 = 0, field 3 = 30, field 4 = 42, field 5 = 29, field 6 = 80, and field 9 = 92.
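In ordinary code, the decoding rule above could be sketched like this (a hypothetical Python illustration of the format, not part of the Splunk question):

```python
# Decode one field value of the form "count,name,value,name,value,...".
def decode(raw: str) -> dict:
    nums = raw.split(",")
    count = int(nums[0])           # first number: how many pairs follow
    pairs = nums[1:1 + 2 * count]  # alternating name, value
    return {pairs[i]: int(pairs[i + 1]) for i in range(0, len(pairs), 2)}

print(decode("6,2,0,3,30,4,42,5,29,6,80,9,92"))
# {'2': 0, '3': 30, '4': 42, '5': 29, '6': 80, '9': 92}
```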
I want to be able to extract these fields, assigning them the appropriate name. Is there a command / function that handles this well?
Thanks in advance!
You could try something like this
| makeresults
| eval _raw="1,3,20
0
2,3,43,9,12
3,3,40,8,20,9,80
2,3,20,9,30
6,2,0,3,30,4,42,5,29,6,80,9,92"
| multikv noheader=t
| table _raw
| streamstats count as row
| eval count=mvindex(split(_raw,","),0)
| rex mode=sed "s/,(?<mv>\d+,\d+)/!\1/g"
| rex max_match=0 "!(?<mv>\d+,\d+)"
| eval _raw=replace(_raw,"!",",")
| mvexpand mv
| eval name=mvindex(split(mv,","),0)
| eval value=mvindex(split(mv,","),1)
| eval {name}=value
| stats values(*) as * values(_raw) as _raw by row
| fields - row name value mv count
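For readers less fluent in SPL, the same transformation can be sketched in plain Python (an illustration under the stated format assumption, not Splunk itself): each event is expanded into (row, name, value) records, as mvexpand does, and then regrouped by row into one column per field name, as the final stats does.

```python
# Sample events in the "count,name,value,name,value,..." format.
events = ["1,3,20", "0", "2,3,43,9,12"]

records = []                               # one record per name/value pair
for row, raw in enumerate(events, start=1):
    nums = raw.split(",")
    for i in range(1, len(nums) - 1, 2):   # skip the leading count
        records.append((row, nums[i], int(nums[i + 1])))

tables = {}                                # regroup: row -> {name: value}
for row, name, value in records:
    tables.setdefault(row, {})[name] = value
```

Note that an event with a count of 0 produces no records, so (unlike the SPL above, which keeps the original event) it simply contributes no columns.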
Hi again @ITWhisperer,
I am still working with the field extraction you helped me with (super solution, thanks!). However, I am now facing a new issue - the search has become too heavy (it now runs for 9 minutes instead of 4, and scans 300 MB instead of 1 MB) - this is due to the last stats command 😞
I have tried implementing a couple of solutions to this:
- streamstats and eventstats - these, however, cannot properly set the data back to its original state
- dedup - I thought this would be a better solution, but it turns out that dedup is even slower than stats
- using eval after streamstats, but then I would just end up using the stats command again
I have yet to try data model acceleration, but maybe you know another way to speed it up? In my search, I only use fast mode. I am not able to reduce the time range of the search, as I have already hit the minimum time range necessary for the dashboard to be useful.
Thanks!
Do you need to rebuild the rows, or can you do your further calculations without them? Do you need to keep _raw?
I need the other fields back, as I need them for further calculation.
I do not need _raw - I removed values(_raw) as _raw from the stats command, and that helped with the size of the search (now only 15 MB), but it still takes 10 minutes instead of the original 5.
I tried this trick with streamstats instead, but it seems that streamstats doesn't really do anything and the duplicate rows are not removed.
Genius. It works. Thanks!