How to edit my search and regex to consolidate 3 fields into one field, remove values that contain "." or "-", and only show values with ":GR"?

billycote · ‎06-30-2015

I have some data that I need to pull out. This data can be in any one of three fields (symbol, symbols or p1) and can contain any number of entries. I think I have that part right. What I'm attempting to do is put them all into one consolidated variable, called syms. Then I want to remove the individual entries from syms that contain a "." or "-" character. Further, I want to show only the syms with ":GR". So far the search below is not working and it's making me crazy. Anyone know where I'm going wrong?

  index=snaptor sourcetype=AccessApp index=snaptor query_string="*"
| fillnull value=NULL
| rex field=query_string "(symbol=|symbols=|p1=)+(?<syms>[.:/-\w]+(,[.:/-\w]+)*|[\w])"
| eval syms=upper(syms)
| rex field=syms mode=sed "s/%2C/,/g"
| rex field=syms mode=sed "s/\+/,/g"
| rex field=syms mode=sed "s/%2E/\./g"
| rex field=syms mode=sed "s/%23/\#/g"
| rex field=syms mode=sed "s/%2F/\//g"
| makemv delim="," syms
| regex syms!="(?=\\-)|(?=\\.)"
| regex syms="(?=:GR)"
| eval sym_count=mvcount(syms)
| mvexpand syms
| stats count by syms, sym_count, uri, productid
| sort -sym_count

martin_mueller · ‎06-30-2015

Based on your two example values, this should give you a field GR with three values in total:

  index=snaptor sourcetype=AccessApp query_string="*"
| rex field=query_string "(p1|symbols?)=(?<symbols>[^&\"\s]+)"
| makemv tokenizer="([^,+]+)" symbols
| eval GR = mvfilter(match(symbols, ":GR$"))

First, I pull the values out into one field, similar to your start; I'm not sure this regex will work for your entire data set.
Second, I split up the string of symbols. I've assumed commas and plus signs are your separators.
Third, I filter for values ending in :GR.
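
To check the logic outside Splunk, the three steps can be roughly sketched in Python. This is only an approximation: Python's `re` module is close to, but not identical to, the PCRE engine Splunk uses. The function name `extract_gr` and the shortened query string are made up for illustration.

```python
import re

# Step 1: same pattern as the rex above, using Python's (?P<name>...) group syntax.
FIELD_RE = re.compile(r'(p1|symbols?)=(?P<symbols>[^&"\s]+)')
# Step 2: same idea as the makemv tokenizer, runs of characters that are not "," or "+".
TOKEN_RE = re.compile(r'[^,+]+')


def extract_gr(query_string):
    """Return the symbols ending in :GR (hypothetical helper, not a Splunk API)."""
    m = FIELD_RE.search(query_string)
    if m is None:
        return []
    symbols = TOKEN_RE.findall(m.group("symbols"))
    # Step 3: like mvfilter(match(symbols, ":GR$")), keep values ending in :GR.
    return [s for s in symbols if re.search(r":GR$", s)]


# Shortened, made-up query string in the shape of a p1 request.
print(extract_gr("?p1=AA+USB:GR+-TGT150717P82+&p2=R"))
```

If the Python version picks out the wrong values for a given query string, the rex pattern is the first thing to look at in the search.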

billycote · ‎06-30-2015

I'll give it a go. I understand your logic there so I guess that's a good first step 🙂 thanks.

martin_mueller · ‎06-30-2015

What's the separator for individual values, comma?

woodcock · ‎06-30-2015

Download this tool and work inside of it to build your RegEx:

http://www.ultrapico.com/expresso.htm

If you need help building/debugging RegEx, this is a GREAT tool:

http://www.Debuggex.com

A big part of your frustration is the delay in building/fixing RegEx inside searches; try to avoid doing that if you can.

richgalloway · ‎06-30-2015

A good on-line tool for testing regex strings is www.regex101.com.


martin_mueller · ‎06-30-2015

It'd be great if you posted some sample data.

billycote · ‎06-30-2015

I guess at first I'm just looking to get a count of ':GR' symbols; that might work for now. So in the above case I guess that total would be 3 (1 in the first and 2 in the second).

billycote · ‎06-30-2015

Sure:
One piece of data may look like this:

"?p1=PRGO+UTIW+-EOG150724C87+-JUNO150717C55+MON+-EOG150724P86.5+-AMGN150731C152.5+PANW+PTEN+TLT+-PANW150731P180+T+-AMGN150717P157.5+-MON150821C105+-KR150717P72.5+NLYPRD+-UNP150731C99+TGT+EOG+CHSCM+JUNO+-POT150717P34+-TLT150710P116+UNP+KR+-PRGO150821C190+POT+-PTEN150821C19+-PRGO150717P195+AMGN+-USB150807C43.5+-T150702P34.5+PLG+-PANW150731P175+GSF+-TGT150717P80+-MON150724C107+-MON150710P105+AA+USB:GR+-TGT150717P82+-TLT150710C116+&p2=R&uuid=1a208cfc-ce40-11d1-b8a2-ac19649eaa77&pro=null&mc=null&quoteDisplay=null"

Another may look like this:
"?productid=rtb&subClientProductCode=ABAL&quotetype=D&uuid=4072000124&scoCompanyId=0&timeout=0.01&searchListCC=all&symbols=1113:GR,1:HK,APTS,AW9U:SG,BBT,BIK,BX,CAD=,CEF,CERN,COP,DEM:GB,DMLP:GR,DXJ,GILD,GIM,GLD,HEDJ,HKD=,IAK,IDX,IGA,IHI,IXP,IYR"
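
As a quick offline check of the expected total, here is a rough Python sketch that applies the stated rules (drop values containing "." or "-", keep values ending in ":GR") to shortened copies of the two samples. The helper name `gr_symbols` and the trimmed strings are assumptions for illustration; the symbols in them are taken from the posted data.

```python
import re


def gr_symbols(query_string):
    """Hypothetical helper: :GR symbols found under symbol=, symbols= or p1=."""
    m = re.search(r'(?:p1|symbols?)=([^&"\s]+)', query_string)
    if m is None:
        return []
    # Assumes "," and "+" are the only separators between symbols.
    values = re.split(r"[,+]", m.group(1))
    # Drop empty strings and anything containing "." or "-".
    values = [v for v in values if v and not re.search(r"[.-]", v)]
    return [v for v in values if v.endswith(":GR")]


# Trimmed versions of the two posted samples (not the full strings).
samples = [
    "?p1=PRGO+UTIW+-EOG150724C87+-EOG150724P86.5+AA+USB:GR+-TGT150717P82+&p2=R",
    "?productid=rtb&timeout=0.01&symbols=1113:GR,1:HK,APTS,CAD=,DMLP:GR,DXJ",
]
per_event = [gr_symbols(s) for s in samples]
total = sum(len(g) for g in per_event)
print(per_event, total)
```

On these trimmed strings the total comes out as 3: one :GR symbol in the first sample and two in the second.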
