Hi,
I would like to extract fields from unstructured data that contains multiple labels, each followed by an HTML anchor tag with an href:
Sample events:
Change: <a href="https://xxyyzz.com/changes/12345">#12345</a> - Review: <a href="https://xxyyzz.com/reviews/7890">#7890</a>
Change: <a href="https://xxyyzz.com/changes/1345">#1345</a> - Review: <a href="https://xxyyzz.com/reviews/7891">#7891</a>
Review: <a href="https://zzyyyxxx/reviews/205657">205657</a>
I wish to get results for the above data as follows:
change_url | change | review_url | review |
https://xxyyzz.com/changes/12345 | #12345 | https://xxyyzz.com/reviews/7890 | #7890 |
https://xxyyzz.com/changes/1345 | #1345 | https://xxyyzz.com/reviews/7891 | #7891 |
 | | https://zzyyyxxx/reviews/205657 | #205657 |
Can someone suggest how I can use rex to obtain the above fields?
Try
| kv pairdelim="-" kvdelim=":\s"
| foreach Change Review
[rex field=<<FIELD>> "href=\"?(?<<<FIELD>>_url>[^\">]+)\"?>(?<<<FIELD>>_value>[^\<]+)"]
Here is an emulation that you can play with and compare with real data:
| makeresults
| eval data = mvappend("Change: <a href=\"https://xxyyzz.com/changes/12345\">#12345</a> - Review: <a href=\"https://xxyyzz.com/reviews/7890\">#7890</a>",
"Change: <a href=\"https://xxyyzz.com/changes/1345\">#1345</a> - Review: <a href=\"https://xxyyzz.com/reviews/7891\">#7891</a>",
"Review: <a href=\"https://zzyyyxxx/reviews/205657\">205657</a>")
| mvexpand data
| rename data AS _raw
``` data emulation above ```
Putting the two together, I get:
Change | Change_url | Review | Review_url |
#12345 | https://xxyyzz.com/changes/12345 | #7890 | https://xxyyzz.com/reviews/7890 |
#1345 | https://xxyyzz.com/changes/1345 | #7891 | https://xxyyzz.com/reviews/7891 |
 | | 205657 | https://zzyyyxxx/reviews/205657 |
Hi @firoagni ,
you have to use two regexes because part of the event may be missing (the third sample has no Change label), so please try this:
<your_search>
| rex "Change:\s*\<a href\=\"(?<change_url>[^\"]*)\"\>(?<change>[^\<]*)"
| rex "Review:\s*\<a href\=\"(?<review_url>[^\"]*)\"\>(?<review>[^\<]*)"
| table change_url change review_url review
you can test these regexes at https://regex101.com/r/Vnsxl9/1 and https://regex101.com/r/Vnsxl9/2
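If you want to check this without touching real data, the sketch below simply chains the sample-event emulation from this thread with the two rex commands. It is only a test harness built from the sample events quoted above, not a statement about how your real events look; replace everything before the first rex with your own search.

```spl
| makeresults
| eval data = mvappend("Change: <a href=\"https://xxyyzz.com/changes/12345\">#12345</a> - Review: <a href=\"https://xxyyzz.com/reviews/7890\">#7890</a>",
    "Change: <a href=\"https://xxyyzz.com/changes/1345\">#1345</a> - Review: <a href=\"https://xxyyzz.com/reviews/7891\">#7891</a>",
    "Review: <a href=\"https://zzyyyxxx/reviews/205657\">205657</a>")
| mvexpand data
| rename data AS _raw
| rex "Change:\s*\<a href\=\"(?<change_url>[^\"]*)\"\>(?<change>[^\<]*)"
| rex "Review:\s*\<a href\=\"(?<review_url>[^\"]*)\"\>(?<review>[^\<]*)"
| table change_url change review_url review
```

Because each rex runs independently, an event without a Change label should just leave change_url and change empty, while review_url and review are still extracted.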
Ciao.
Giuseppe
Hi @firoagni ,
good for you, see you next time!
Ciao and happy splunking
Giuseppe
P.S.: Karma Points are appreciated by all the contributors 😉