Re: Regex question/request

Mike6960 · ‎09-20-2019

Is it possible to use regex to extract values in events that always end with .PDF? I have a chain of events, and somewhere in this process a PDF document is generated, so the name of the PDF is not in all the events.

dmarling · ‎09-30-2019

Based on the example you provided in the question comments, this should return the data you are looking for:

| rex "(?<PDFFileName>\S+)\.[Pp][Dd][Ff]"

Here's the regex101 link showing it working on your example: https://regex101.com/r/OyLl8z/1
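
Note that the pattern above keeps the extension outside the capture group, so the field holds the name without ".pdf". If the extension should be part of the extracted value, one possible variant moves it inside the group. The sketch below is untested against real data: it uses makeresults to fake a single event from the fragment posted in this thread, so it can be pasted into a search bar as-is.

```spl
| makeresults
| eval _raw=": Get file ABC_6_2019-09-30_VK-161.2285507.pdf from /opt/mulesoft/"
| rex "(?<PDFFileName>\S+\.[Pp][Dd][Ff])"
| table PDFFileName
```

Because \S+ stops at whitespace, this relies on the file name containing no spaces and being followed by a space or the end of the event. In a real search, the first two lines would be replaced by the base search.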

If this comment/answer was helpful, please up vote it. Thank you.

jpolvino · ‎09-20-2019

Sounds like you're saying that you're looking for all events related to one that eventually generates a PDF? If so, is there a unique identifier that ties them all together? I'm asking because you could use a subsearch to gather all unique identifiers from those "PDF" events, and then use those identifiers later in your search to find related events.

Posting a sample list of events will help.
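
As a rough sketch of that subsearch idea, assuming the events share a field that ties them together: the index name your_index and the field name transaction_id below are placeholders made up for illustration, not names taken from the posted events. The inner search collects the identifiers of events that mention a PDF, and the outer search then returns every event carrying one of those identifiers.

```spl
index=your_index
    [ search index=your_index ".pdf"
      | stats count by transaction_id
      | fields transaction_id ]
| rex "(?<PDFFileName>\S+\.[Pp][Dd][Ff])"
| stats values(PDFFileName) as PDFFileName by transaction_id
```

Subsearches are subject to result-count and runtime limits, so this approach fits best when the number of PDF-generating chains in the time range is modest.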

somesoni2 · ‎09-20-2019

You should be able to use the following regex (assuming that your PDF file name contains alphanumeric characters only):

your base search | rex "(?<PDFFileName>[A-Za-z0-9_]+\.(pdf|PDF))"

Again, for a better solution, please provide sample data and highlight the portion you want to extract.

Mike6960 · ‎09-30-2019

Thanks, this is almost what I need. Because I did not supply an example, it is not quite everything I need.

This is a fragment of the events:

: Get file ABC_6_2019-09-30_VK-161.2285507.pdf from /opt/mulesoft/

I would like to extract the value: ABC_6_2019-09-30_VK-161.2285507.pdf
It always ends with .PDF, but the first part can differ. In my example it starts with ABC, but it could also be ZZ, for example.

richgalloway · ‎09-20-2019

Please share some sample data and what you want extracted from it.

---
If this reply helps you, Karma would be appreciated.
