Splunk Search

How to extract URL into a new field?

Vfinney
Observer

What rex command would extract the URL from Cisco eStreamer logs into a new field?

0 Karma
1 Solution

oscar84x
Contributor

Splunk should be extracting all of those fields for you automatically, because the events use "=" as the key-value delimiter. I just tested the two lines you sent and everything was extracted automatically.

Either way, the rex command would be something like this:

<your search>
| rex field=_raw "\burl\b\=(?<url>[^ ]+)\s"


0 Karma

gcusello
Legend

Hi @Vfinney,
try this:

url\=(?<url>[^ ]*)

You can test it at https://regex101.com/r/OiAXaV/1
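
As a hedged sketch of how that pattern plugs into a search: <your search> stands in for whatever base search returns the eStreamer events, and the trailing stats command is only one illustrative way to use the new field, not something from the original post.

```
<your search>
| rex field=_raw "url\=(?<url>[^ ]*)"
| stats count by url
```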

Ciao.
Giuseppe

0 Karma

woodcock
Esteemed Legend

If you set KV_MODE=auto for your sourcetype on your Search Head, this will be done for you (along with the other fields). To do it manually, just add this to your search:

... | rex "url=(?<url>\S+)"
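
One way to sanity-check this without touching indexed data is to build a throwaway event with makeresults. The sketch below is a test harness added for illustration, not part of the original answer: the _raw value is a shortened fragment of the first sample event posted in this thread, and the final table command is only there to display the result.

```
| makeresults
| eval _raw="url_category=\"Search Engines\" ip_proto=TCP dest_port=443 url=https://fcmatch.google.com ssl_server_cert_status=\"Not Checked\""
| rex "url=(?<url>\S+)"
| table url
```

For that fragment the url field should come back as https://fcmatch.google.com. The url_category key is left alone because the pattern requires "=" immediately after "url".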
0 Karma

Vfinney
Observer

Sorry... data certainly would help. Not sure what I was thinking.

rec_type=71 tcp_flags=0 sec_intel_event=No monitor_rule_8=0 client_version="" monitor_rule_7=N/A monitor_rule_4=N/A snmp_out=0 monitor_rule_2=N/A ssl_server_name="" monitor_rule_1=N/A dest_pkts=8 sensor=ipkdol02p last_pkt_sec=1574690104 ssl_flow_error=0 ssl_url_category=0 src_ip=10.140.6.154 ssl_rule_id=0 dns_rec_id=0 src_port=59057 ssl_cert_fingerprint=0000000000000000000000000000000000000000 monitor_rule_6=N/A rec_type_simple=RNA event_desc="Flow Statistics" first_pkt_sec=1574690104 security_context=00000000000000000000000000000000 netflow_src=00000000-0000-0000-0000-000000000000 has_ipv6=1 monitor_rule_5=N/A ssl_actual_action=Unknown src_autonomous_system=0 src_bytes=863 connection_id=40461 event_usec=0 dest_autonomous_system=0 dest_mask=0 monitor_rule_3=N/A dest_ip_country="united states" iface_ingress=vrf client_app="SSL client" user_agent="" ssl_flow_status=Unknown snmp_in=0 file_count=0 dest_ip=216.58.193.142 ssl_ticket_id=0000000000000000000000000000000000000000 dest_tos=0 fw_rule_reason=N/A sinkhole_uuid=00000000-0000-0000-0000-000000000000 fw_policy=00000000-0000-0000-0000-00005dd129e8 sec_intel_ip=N/A ssl_flow_messages=0 ssl_cipher_suite=TLS_NULL_WITH_NULL_NULL http_referrer="" src_mask=0 legacy_ip_address=0.0.0.0 ssl_version=Unknown ssl_flow_flags=0 event_sec=1574693323 dns_query="" url_category="Search Engines" ip_proto=TCP dest_port=443 url=https://fcmatch.google.com ssl_server_cert_status="Not Checked" mac_address=00:00:00:00:00:00 netbios_domain="" dns_ttl=0 src_tos=0 ssl_policy_id=00000000000000000000000000000000 ssl_session_id=0000000000000000000000000000000000000000000000000000000000000000 iface_egress=outside num_ioc=0 referenced_host="" event_type=1003 dest_bytes=4538 dns_resp_id=0 user="No Authentication Required" ip_layer=0 fw_rule="Default Action" fw_rule_action=Allow rec_type_desc="Connection Statistics" ssl_expected_action=Unknown app_proto=HTTPS vlan_id=0 sec_zone_ingress=Internal-ASA sec_zone_egress=External-ASA event_subtype=1 
http_response=0 web_app=Google url_reputation="Well known" src_ip_country=unknown src_pkts=8 instance_id=1 ips_count=0

rec_type=71 tcp_flags=0 sec_intel_event=No monitor_rule_8=0 client_version="" monitor_rule_7=N/A monitor_rule_4=N/A snmp_out=0 monitor_rule_2=N/A ssl_server_name="" monitor_rule_1=N/A dest_pkts=10 sensor=ipkdol02p last_pkt_sec=1574690104 ssl_flow_error=0 ssl_url_category=0 src_ip=10.140.6.154 ssl_rule_id=0 dns_rec_id=0 src_port=59066 ssl_cert_fingerprint=0000000000000000000000000000000000000000 monitor_rule_6=N/A rec_type_simple=RNA event_desc="Flow Statistics" first_pkt_sec=1574690104 security_context=00000000000000000000000000000000 netflow_src=00000000-0000-0000-0000-000000000000 has_ipv6=1 monitor_rule_5=N/A ssl_actual_action=Unknown src_autonomous_system=0 src_bytes=2750 connection_id=40463 event_usec=0 dest_autonomous_system=0 dest_mask=0 monitor_rule_3=N/A dest_ip_country="united states" iface_ingress=vrf client_app="SSL client" user_agent="" ssl_flow_status=Unknown snmp_in=0 file_count=0 dest_ip=198.8.70.129 ssl_ticket_id=0000000000000000000000000000000000000000 dest_tos=0 fw_rule_reason=N/A sinkhole_uuid=00000000-0000-0000-0000-000000000000 fw_policy=00000000-0000-0000-0000-00005dd129e8 sec_intel_ip=N/A ssl_flow_messages=0 ssl_cipher_suite=TLS_NULL_WITH_NULL_NULL http_referrer="" src_mask=0 legacy_ip_address=0.0.0.0 ssl_version=Unknown ssl_flow_flags=0 event_sec=1574693323 dns_query="" url_category="Web Advertisements" ip_proto=TCP dest_port=443 url=https://p.rfihub.com ssl_server_cert_status="Not Checked" mac_address=00:00:00:00:00:00 netbios_domain="" dns_ttl=0 src_tos=0 ssl_policy_id=00000000000000000000000000000000 ssl_session_id=0000000000000000000000000000000000000000000000000000000000000000 iface_egress=outside num_ioc=0 referenced_host="" event_type=1003 dest_bytes=4843 dns_resp_id=0 user="No Authentication Required" ip_layer=0 fw_rule="Default Action" fw_rule_action=Allow rec_type_desc="Connection Statistics" ssl_expected_action=Unknown app_proto=HTTPS vlan_id=0 sec_zone_ingress=Internal-ASA sec_zone_egress=External-ASA event_subtype=1 
http_response=0 web_app="Rocket Fuel" url_reputation="Benign sites with security risks" src_ip_country=unknown src_pkts=11 instance_id=1 ips_count=0

0 Karma

woodcock
Esteemed Legend

You have not given us anything to work with (sample events, field names, and explanation of which text to clip), so, short of that, I can VERY highly recommend the URL Toolbox app:
https://splunkbase.splunk.com/app/2734/

0 Karma

oscar84x
Contributor

If you're looking for the syntax of the command, it's in the Splunk documentation linked below. If you'd like help with the regex and the command, please provide a few sample events.

https://docs.splunk.com/Documentation/Splunk/8.0.0/SearchReference/Rex

0 Karma

gcusello
Legend

Hi @Vfinney,
the command is rex. Could you share a sample event and say what you want to extract?

Ciao.
Giuseppe

0 Karma