I have log entries that look roughly like this (I had to break up the URLs just to be able to post this):
2013-09-01T00:00:00.000 url1=http_:_//_foo._com_/hedgehogs.html?primate=ring-tailed%20lemur&movie=princess%20bride url2=http_:_//_bar._com_/weasels.html?primate=gorilla&terrain=marshland another_field=something another_field=something
I need to extract the query parameters, but still know which URL they came from.
For example, given the above entry, it should show up for a search like:
search url1_primate="ring-tailed lemur"
But NOT show up for a search like:
search url1_primate="gorilla"
If I search on the value of url1 or url2, then I have to do something like:
search url1="primate=ring-tailed lemur"
But that matches any event containing that string. With my real data I get false positives, because a URL might contain the equivalent of "least_loved_primate=ring-tailed lemur", which also matches.
So, in summary, I think what I need to do is:
1. Extract the url1 and url2 fields explicitly
2. Extract the values of each url field into url1_* and url2_* fields with duplicate transforms
3. Prefix extracted query parameter keys with url1_ or url2_
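To make the intended result concrete, here is a rough Python sketch of those three steps. It is not Splunk configuration, only an illustration of the output I am after. The function name `extract_prefixed_params`, the regex, and the un-broken form of the URLs are my own assumptions for the example.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Assumed log format: whitespace-separated key=value pairs, where a value
# runs up to the next whitespace character.
FIELD_RE = re.compile(r"(\w+)=(\S+)")


def extract_prefixed_params(line, url_fields=("url1", "url2")):
    """Return the URL fields plus their query parameters, prefixed by field name."""
    fields = {}
    for key, value in FIELD_RE.findall(line):
        if key not in url_fields:
            continue
        # Step 1: keep the URL field itself.
        fields[key] = value
        # Steps 2 and 3: split out the query string, decode each parameter,
        # and store it under "<url field>_<parameter name>".
        for name, val in parse_qsl(urlsplit(value).query):
            fields.setdefault(f"{key}_{name}", []).append(val)
    return fields


line = (
    "2013-09-01T00:00:00.000 "
    "url1=http://foo.com/hedgehogs.html?primate=ring-tailed%20lemur&movie=princess%20bride "
    "url2=http://bar.com/weasels.html?primate=gorilla&terrain=marshland "
    "another_field=something"
)
result = extract_prefixed_params(line)
print(result["url1_primate"])  # the lemur belongs to url1 only
print(result["url2_primate"])  # the gorilla belongs to url2 only
```

Values are kept as lists because a query string can repeat a parameter name. With this shape, a search on url1_primate can match "ring-tailed lemur" without matching "gorilla", and a parameter such as least_loved_primate ends up in its own field instead of causing a false match.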
I can do the first thing, but the others are hard. Ideas?
Paul