Python Extract Substring Using Regex


There are many situations in which you need to extract a substring from a string in Python. For instance, while working on large datasets, you may need to get specific data from text fields or match a particular pattern in a string, such as an email address or phone number. Substring extraction also helps with text processing and analysis.

This post covers four approaches: the “re.search()”, “re.match()”, “re.findall()”, and “re.finditer()” methods.

Method 1: Python Extract Substring Using Regex in “re.search()” Method

The Python “re.search()” method looks for the first occurrence of the given pattern within a string and returns a “Match” object, or None if nothing matches. Use it when you want to locate a specific substring within a longer string but do not know where or how often it occurs.

Syntax

To use the re.search() method, follow the given syntax:

re.search(pattern, string, flags)

Here:

  • “pattern” represents the regex that you want to search for.
  • “string” refers to the string in which you want to search.
  • “flags” represents optional parameters, such as multi-line mode or case-insensitive matching.
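As a minimal sketch of the optional “flags” argument, the snippet below runs the same search with and without re.IGNORECASE. The sample text and variable names are assumptions made only for illustration.

```python
import re

# Hypothetical sample text, used only for illustration
text = 'Welcome to LINUXHINT tutorials'

# Without a flag the search is case-sensitive, so nothing is found
case_sensitive = re.search(r'linuxhint', text)

# re.IGNORECASE makes the same pattern match regardless of letter case
case_insensitive = re.search(r'linuxhint', text, flags=re.IGNORECASE)

print(case_sensitive)            # None
print(case_insensitive.group())  # LINUXHINT
```

Several flags can be combined with the “|” operator, for example re.IGNORECASE | re.MULTILINE.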

Example 1: Extracting Text-based Substring Using “re.search()” Method

To use the “re.search()” method to extract a substring, first import the “re” module, which provides regex support:

import re

Define the string from which you want to retrieve a substring:

string = 'Linuxhint is the best tutorial website'

Then, specify the regex. Here, “r” marks a raw string, so backslashes are treated as literal characters, and “best” is the regular expression:

regex = r'best'

Pass the created “regex” and “string” to the re.search() method and store the resulting object in “match”:

match = re.search(regex, string)

Now, add the given condition to extract the matched substring from the “match” object returned by the re.search() method, and display it on the console:

if match:
    sub_string = match.group()
    print(sub_string)

The output shows that the substring “best” has been extracted by using the “group()” method of the match object.
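The steps of this example can be collected into one small helper. This is only a sketch: the function name extract_first is hypothetical, and it returns None when nothing matches so that the caller never invokes “group()” on None.

```python
import re

def extract_first(pattern, text):
    """Return the first substring matching pattern, or None if there is no match."""
    match = re.search(pattern, text)
    # re.search() returns None on failure, so guard before calling group()
    return match.group() if match else None

string = 'Linuxhint is the best tutorial website'
found = extract_first(r'best', string)
missing = extract_first(r'worst', string)

print(found)    # best
print(missing)  # None
```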

Example 2: Extracting Numeric Substring Using “re.search()” Method

Now, define a numeric string and search for the first occurrence of one or more digits in it by passing “\d+” as the regex to the “re.search()” method:

string = '039-6546-0987'
print(re.search(r'\d+', string))

In the specified regex:

  • “\” escapes the letter “d”, so “\d” matches a digit character.
  • “+” means one or more digits in a row.

The “re.search()” method returns a “Match” object for the first run of digits, “039”, which spans positions 0 to 3 of the string.
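Printing a “Match” object shows its details rather than the bare substring. The short sketch below, which reuses the string from this example, shows how “group()” and “span()” can be used to pull out the matched text and its position.

```python
import re

string = '039-6546-0987'
match = re.search(r'\d+', string)

print(match)          # the Match object itself, including span and matched text
print(match.group())  # 039
print(match.span())   # (0, 3)
```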

Method 2: Python Extract Substring Using Regex in “re.match()” Method

The “re.match()” method only looks for the regex at the start of the string and returns a “Match” object if the search succeeds. Use it when you know that the substring occurs at the beginning of the given string.

Syntax

To invoke the re.match() method, follow the given syntax:

re.match(pattern, string, flags)

Example

First, define the regular expression as “^l.......t$”. This regex matches strings that begin with “l”, end with “t”, and have exactly nine characters, because each of the seven dots matches any single character:

regex = r'^l.......t$'

Then, declare the string and pass it to the re.match() method together with the regex:

string = 'linuxhint'
result = re.match(regex, string)

Add an “if-else” condition with print statements for the cases where a “Match” object has or has not been returned:

if result:
    print("Search has been completed successfully", result)
else:
    print("Search was unsuccessful.")

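Output

Because “linuxhint” fits the pattern, the success message is printed, followed by the returned “Match” object.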
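To make the difference from “re.search()” concrete, here is a small sketch that looks for “hint” in the same string with both methods. The variable names are chosen only for illustration.

```python
import re

string = 'linuxhint'

# 'hint' is present, but not at the start of the string
at_start = re.match(r'hint', string)
anywhere = re.search(r'hint', string)

print(at_start)          # None
print(anywhere.group())  # hint

# A pattern that does begin the string succeeds with re.match()
prefix = re.match(r'linux', string)
print(prefix.group())    # linux
```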

Method 3: Python Extract Substring Using Regex in “re.findall()” Method

The “re.findall()” method searches for every occurrence of a pattern within the given string and returns a list of the extracted substrings in the order they are found. Use it when you need to retrieve all matching substrings at once.

Syntax

To invoke the re.findall() method, follow the given syntax:

re.findall(pattern, string, flags)

Example

Define a string containing numeric values. Then, specify the regex pattern as “r'\d+'” to match one or more digits:

string = '4 Hour Boot camp Linuxhint course for $14.99'
regex = r'\d+'

Then, call the “re.findall()” method and pass the defined regex and the string as arguments:

matches = re.findall(regex, string)

Now, iterate over the list of matched substrings stored in the “matches” variable and print each element on the console:

for match in matches:
    print(match)

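Output

The loop prints 4, 14, and 99 on separate lines. The decimal point in 14.99 is not a digit, so the price is split into two matches.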
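If the price should stay in one piece, the pattern can allow an optional fractional part. The sketch below is one possible variation and not part of the original example; the variable names are illustrative.

```python
import re

string = '4 Hour Boot camp Linuxhint course for $14.99'

# \d+ alone treats the decimal point as a separator
digits_only = re.findall(r'\d+', string)

# An optional non-capturing group keeps a fractional part with its number
with_decimals = re.findall(r'\d+(?:\.\d+)?', string)

print(digits_only)    # ['4', '14', '99']
print(with_decimals)  # ['4', '14.99']
```

The group is written as “(?:...)” because “re.findall()” returns only the group contents when a pattern contains a capturing group.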

Method 4: Python Extract Substring Using Regex in “re.finditer()” Method

The “re.finditer()” method works like the re.findall() method, but it returns an iterator of “Match” objects rather than a list of substrings. It is useful when the data set is large and you do not need to store all matches at once, because re.finditer() yields the matches one at a time.

Syntax

To invoke the re.finditer() method, follow the given syntax:

re.finditer(pattern, string, flags)

Example

First, create a string. Then, define the regex pattern as “r'[A-Z]+'”, which matches one or more uppercase letters:

string = 'Linuxhint is the Best Tutorial Website'
regex = r'[A-Z]+'

Pass the regex and the string as arguments to the “re.finditer()” method and store the resulting iterator in “matches”:

matches = re.finditer(regex, string)

Lastly, iterate over “matches”, extract each substring with the “group()” method, and print it on the console:

for match in matches:
    sub_string = match.group()
    print(sub_string)

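Output

The loop prints each uppercase letter in the string on its own line, one for every capitalized word.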
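Since every item produced by “re.finditer()” is a “Match” object, position information is available alongside the text, which a plain list from “re.findall()” does not carry. A minimal sketch follows; the “pairs” list is an illustrative addition.

```python
import re

string = 'Linuxhint is the Best Tutorial Website'

# Each item from finditer() is a Match object, so start() is available
for match in re.finditer(r'[A-Z]+', string):
    print(match.group(), match.start())

# Collect the same information as (substring, start index) pairs
pairs = [(m.group(), m.start()) for m in re.finditer(r'[A-Z]+', string)]
print(pairs)
```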

We have covered the essential approaches for extracting a substring in Python.

Conclusion

To extract a substring using regex in Python, use the “re.search()”, “re.match()”, “re.findall()”, or “re.finditer()” method. Depending on your requirements, use “re.search()” when you need only the first occurrence of the regex, “re.match()” when the substring sits at the start of the string, “re.findall()” to retrieve all matching substrings as a list, and “re.finditer()” to process the matches one at a time. This blog covered the methods for extracting a substring in Python.
