What does an empty line represent in a regular expression?

royimad
Builder

I am looking for a regular expression in Splunk that searches a file and returns the text starting with a given word (e.g. Total) and ending at the next empty line (which marks a new paragraph, etc.).

A random text chosen from the web:

A Memorandum of Understanding was signed by Total and MOGE on July 9, 1992. In addition to the construction of offshore gas facilities by the partners, a separate company in which PTT-EP, MOGE, and other affiliates of Total and Unocal are investors (the Moattama Gas Transportation Company - MGTC) built a 346-kilometer subsea pipeline to bring the gas to landfall in Myanmar, and a 63-kilometer onshore pipeline, with control and metering units, to carry the gas to the border with Thailand, which purchases most of the field's output under a long-term sales and purchase agreement.

Construction was carried out between fall 1995 and mid-1998, with gas production beginning in July 1998. The total investment outlay was approximately US$1 billion. Further capital expenditure will be required during the field's lifetime to drill additional wells and install compressors. The export production threshold of 525 million cubic feet per day was reached in early 2001.

In this case, the regular expression would return the following:

"Total and MOGE on July 9, 1992. In addition to the construction of offshore gas facilities by the partners, a separate company in which PTT-EP, MOGE, and other affiliates of Total and Unocal are investors (the Moattama Gas Transportation Company - MGTC) built a 346-kilometer subsea pipeline to bring the gas to landfall in Myanmar, and a 63-kilometer onshore pipeline, with control and metering units, to carry the gas to the border with Thailand, which purchases most of the field's output under a long-term sales and purchase agreement"

To make it easier: my text doesn't otherwise contain 7 consecutive spaces, so you can also look for a new line that begins with 7 consecutive spaces.

1 Solution

sowings
Splunk Employee

I would probably go with "(?ms)(?<capture>Total.*)\n^\n"

I haven't tested it, but the principle is: (?ms) -- use both multiline and single line mode together. This allows . to match any character (including a newline), while allowing ^ and $ to reference beginning and ending of a line (as demarcated with newline characters).

Next, start capturing with the word Total until you find a newline followed by a newline which is itself at the beginning of a line.

The tester at http://gskinner.com/RegExr/ suggests that (?ms)(?<capture>Total.*)^ might be enough.
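One caveat worth checking before relying on this: in single line mode `.*` is greedy, so when the text contains more than one empty line the capture may run to the last empty line rather than the first. Below is a minimal sketch of that difference, under a few assumptions: it uses Python's `re` module rather than Splunk's own regex engine, so the named group is written `(?P<capture>...)`; the sample text and variable names are made up for illustration; and the lazy `.*?` variant is an alternative that has not been tested in Splunk.

```python
import re

# Hypothetical sample: three paragraphs separated by empty lines.
TEXT = (
    "Signed by Total and MOGE on July 9, 1992.\n"
    "A second line of the same paragraph.\n"
    "\n"
    "Construction began in 1995.\n"
    "\n"
    "Gas production began in 1998.\n"
)

# (?m): ^ matches at the start of each line; (?s): . also matches newlines.
GREEDY = re.compile(r"(?ms)(?P<capture>Total.*)\n^\n")
LAZY = re.compile(r"(?ms)(?P<capture>Total.*?)\n^\n")

greedy_capture = GREEDY.search(TEXT).group("capture")
lazy_capture = LAZY.search(TEXT).group("capture")

# The greedy capture runs past the first empty line to the last one.
print(repr(greedy_capture))
# The lazy capture stops at the first empty line after "Total".
print(repr(lazy_capture))
```

If the goal is a single paragraph, the lazy form is probably the one to try first.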

rturk
Builder

Hi Royimad,

Have you tried http://www.pythonregex.com/? Splunk uses PCRE, whose syntax is close to Python's for patterns like this, so I find this site really useful for testing regular expressions.

Using this, the following regex gives you what you listed above:

A Memorandum of Understanding was signed by (?P<blah>.*)\n\n

The dot (.) won't match newline characters, so bounding the search with two \n's will ensure it breaks on a blank line.
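To illustrate the point about the dot, here is a small sketch in Python's `re` module (an assumption: Splunk's own engine is not used here, and the shortened sample strings and variable names are hypothetical). It also shows a limitation that follows from the same rule: because `.` stops at a newline, the pattern only matches when the captured paragraph sits on a single physical line.

```python
import re

# Default flags: . does not match a newline.
PATTERN = re.compile(r"A Memorandum of Understanding was signed by (?P<blah>.*)\n\n")

# Hypothetical sample where the paragraph is one physical line.
ONE_LINE = (
    "A Memorandum of Understanding was signed by Total and MOGE.\n"
    "\n"
    "Construction was carried out later.\n"
)

# Hypothetical sample where the same paragraph is wrapped over two lines.
WRAPPED = (
    "A Memorandum of Understanding was signed by Total\n"
    "and MOGE.\n"
    "\n"
    "Construction was carried out later.\n"
)

one_line_match = PATTERN.search(ONE_LINE)
wrapped_match = PATTERN.search(WRAPPED)

# The capture ends right before the blank line.
print(one_line_match.group("blah"))
# No match: . cannot cross the line break inside the paragraph.
print(wrapped_match)
```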

Hope this helps 🙂

royimad
Builder

Do you think there is a reverse regular expression, one that captures text starting at the end of a line and ending at the beginning of a given character?

Thanks for your help

royimad
Builder

Thanks sowings, (?ms)(?<capture>Total.*)^ captures my text.
How would I then get the total number of lines?
