I have some data that I need to pull out. This data can be in any one of three fields (symbol, symbols, or p1) and can contain any number of entries. I think I have that part right. What I'm attempting to do is put them all into one consolidated variable called syms. Then I want to remove the individual entries from syms that contain a "." or "-" character. Further, I want to show only the syms with ":GR". So far the search below is not working and it's making me crazy. Does anyone know where I'm going wrong?
index=snaptor sourcetype=AccessApp index=snaptor query_string="*"
| fillnull value=NULL
| rex field=query_string "(symbol=|symbols=|p1=)+(?<syms>[.:/-\w]+(,[.:/-\w]+)*|[\w])"
| eval syms=upper(syms)
| rex field=syms mode=sed "s/%2C/,/g"
| rex field=syms mode=sed "s/\+/,/g"
| rex field=syms mode=sed "s/%2E/\./g"
| rex field=syms mode=sed "s/%23/\#/g"
| rex field=syms mode=sed "s/%2F/\//g"
| makemv delim="," syms
| regex syms!="(?=\\-)|(?=\\.)"
| regex syms="(?=:GR)"
| eval sym_count=mvcount(syms)
| mvexpand syms
| stats count by syms, sym_count, uri, productid
| sort -sym_count
Based on your two example values, this should give you a field GR with three values in total:
index=snaptor sourcetype=AccessApp index=snaptor query_string="*"
| rex field=query_string "(p1|symbols?)=(?<symbols>[^&\"\s]+)"
| makemv tokenizer="([^,+]+)" symbols
| eval GR = mvfilter(match(symbols, ":GR$"))
First I try to pull out the values into one field similar to your start, not sure if this regex will work for your entire data set.
Second, split up the string of symbols - I've assumed commas and plus signs are your separators.
Third, filter for things ending in :GR.
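For anyone who wants to sanity-check the three steps outside Splunk, here is a minimal Python sketch of the same logic. It is not SPL and not a drop-in replacement: the function name `gr_symbols` and the shortened sample strings are made up for illustration, and Python's `re` module is used in place of Splunk's regex engine (so the named group is written `(?P<symbols>...)`).

```python
import re

# Step 1 pattern: the text after p1=, symbol= or symbols=,
# up to the next '&', double quote, or whitespace.
EXTRACT = re.compile(r'(?:p1|symbols?)=(?P<symbols>[^&"\s]+)')


def gr_symbols(query_string):
    """Return the symbols in query_string that end in ':GR' (hypothetical helper)."""
    match = EXTRACT.search(query_string)
    if match is None:
        return []
    # Step 2: split on commas and plus signs, dropping empty pieces.
    tokens = re.findall(r"[^,+]+", match.group("symbols"))
    # Step 3: keep only the values ending in :GR.
    return [tok for tok in tokens if tok.endswith(":GR")]


# Shortened stand-ins for the two sample query strings (assumption: trimmed for space).
sample_1 = "?p1=PRGO+UTIW+-EOG150724C87+AA+USB:GR+-TGT150717P82+&p2=R&pro=null"
sample_2 = "?productid=rtb&searchListCC=all&symbols=1113:GR,1:HK,APTS,DMLP:GR,DXJ"

for qs in (sample_1, sample_2):
    print(gr_symbols(qs))
```

The filter works on each individual value after the split, which is the same role `mvfilter` plays in the search above.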
I'll give it a go. I understand your logic there so I guess that's a good first step 🙂 thanks.
What's the separator for individual values, comma?
If you need help building/debugging RegEx, this is a GREAT tool. Download it and work inside of it to build your RegEx:
http://www.ultrapico.com/expresso.htm
A big part of your frustration is the delay in building/fixing RegEx inside searches; try to avoid doing that if you can.
A good on-line tool for testing regex strings is www.regex101.com.
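In the same spirit, an extraction pattern can be checked against a handful of tiny query strings in a short script before it goes into a search. The sketch below is only an illustration: the helper `extract` and the test strings are invented, and Python's `re` module is not identical to the regex engine Splunk uses, so treat a pass here as a quick first check rather than proof.

```python
import re

# Candidate pattern: capture the value of p1=, symbol= or symbols=.
PATTERN = re.compile(r'(?:p1|symbols?)=(?P<symbols>[^&"\s]+)')


def extract(query_string):
    """Return the captured symbol string, or None if no field matches."""
    m = PATTERN.search(query_string)
    return m.group("symbols") if m else None


# One made-up case per field name, plus one string that should not match.
cases = {
    "?symbol=BBT&x=1": "BBT",
    "?a=1&symbols=1113:GR,1:HK": "1113:GR,1:HK",
    "?p1=AA+USB:GR+&p2=R": "AA+USB:GR+",
    "?productid=rtb&uuid=42": None,
}

for qs, expected in cases.items():
    got = extract(qs)
    status = "ok" if got == expected else "MISMATCH"
    print(f"{status}: {qs!r} -> {got!r}")
```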
It'd be great if you posted some sample data.
I guess at first I'm just looking to get a count of ':GR' symbols; that might work for now. So in the above case I guess the total would be 3 (1 in the first and 2 in the second).
Sure:
One piece of data may look like this:
"?p1=PRGO+UTIW+-EOG150724C87+-JUNO150717C55+MON+-EOG150724P86.5+-AMGN150731C152.5+PANW+PTEN+TLT+-PANW150731P180+T+-AMGN150717P157.5+-MON150821C105+-KR150717P72.5+NLYPRD+-UNP150731C99+TGT+EOG+CHSCM+JUNO+-POT150717P34+-TLT150710P116+UNP+KR+-PRGO150821C190+POT+-PTEN150821C19+-PRGO150717P195+AMGN+-USB150807C43.5+-T150702P34.5+PLG+-PANW150731P175+GSF+-TGT150717P80+-MON150724C107+-MON150710P105+AA+USB:GR+-TGT150717P82+-TLT150710C116+&p2=R&uuid=1a208cfc-ce40-11d1-b8a2-ac19649eaa77&pro=null&mc=null"eDisplay=null"
another may look like this:
"?productid=rtb&subClientProductCode=ABAL"etype=D&uuid=4072000124&scoCompanyId=0&timeout=0.01&searchListCC=all&symbols=1113:GR,1:HK,APTS,AW9U:SG,BBT,BIK,BX,CAD=,CEF,CERN,COP,DEM:GB,DMLP:GR,DXJ,GILD,GIM,GLD,HEDJ,HKD=,IAK,IDX,IGA,IHI,IXP,IYR"