Regular Expressions (RegEx) are a powerful tool in Python for pattern matching and text processing. Whether you’re validating user input, searching for patterns in strings, or manipulating text, RegEx simplifies complex operations efficiently.
Regular expressions allow you to search for specific patterns in strings. For example, suppose you need to find employees whose names start with ‘h’, end with ‘i’, and have exactly four characters in between. Using RegEx, the pattern `^h....i$` helps identify such names.
- `^h` ensures the string starts with ‘h’.
- `....` represents any four characters.
- `i$` ensures the string ends with ‘i’.

Name | Matches? |
---|---|
harini | Yes |
henri | No |
Python provides the `re` module to work with regular expressions. A basic function for pattern matching is `re.match()`, which checks for a match only at the beginning of the string.

    import re

    pattern = r'^h....i$'
    string = 'harini'
    # re.match() tries to match the pattern at the start of the string
    result = re.match(pattern, string)
    if result:
        print("Pattern Matched")
    else:
        print("Pattern Not Matched")

Output: Pattern Matched
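As a small sketch, the same pattern can be checked against several names at once, which reproduces the Yes/No table. The helper `matches_pattern` and the sample list `names` are made up for illustration; they are not part of the `re` module.

```python
import re

# Pattern from the text: 'h', then any four characters, then 'i'
PATTERN = r'^h....i$'

def matches_pattern(name):
    """Return True if the name fits PATTERN (hypothetical helper)."""
    return re.match(PATTERN, name) is not None

# Sample names, assumed here for illustration
names = ['harini', 'henri', 'hitesh']
results = {name: matches_pattern(name) for name in names}

for name, ok in results.items():
    print(name, 'Yes' if ok else 'No')
```

Here `henri` fails because only three characters sit between ‘h’ and ‘i’, and `hitesh` fails because it does not end with ‘i’.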
Metacharacters define the matching rules. Below are some essential metacharacters:
Symbol | Description | Example |
---|---|---|
. | Matches any character except newline | h.llo → matches hello |
^ | Matches the start of the string | ^hello → matches hello world |
$ | Matches the end of the string | world$ → matches hello world |
* | Matches 0 or more repetitions of the preceding element | a* → matches aaa |
+ | Matches 1 or more repetitions of the preceding element | a+ → matches aaa |
[] | Matches any one character inside the brackets | [abc] → matches a, b, or c |
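
The rows of the metacharacter table can be tried out directly. The sketch below runs each pattern with `re.search()` and records the substring it matched; the list name `examples` is only an illustrative choice.

```python
import re

# (pattern, text) pairs taken from the metacharacter table
examples = [
    (r'h.llo', 'hello'),
    (r'^hello', 'hello world'),
    (r'world$', 'hello world'),
    (r'a*', 'aaa'),
    (r'a+', 'aaa'),
    (r'[abc]', 'b'),
]

matched = []
for pattern, text in examples:
    m = re.search(pattern, text)
    # m.group() is the substring the pattern matched, if any
    matched.append(m.group() if m else None)
    print(pattern, '->', matched[-1])
```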
Special sequences simplify pattern matching:
Sequence | Description | Example |
---|---|---|
\d | Matches any digit (0-9) | \d+ → matches 123 |
\w | Matches word characters (letters, digits, and underscore) | \w+ → matches hello123 |
\s | Matches any whitespace character | \s → matches spaces or tabs |
\b | Matches word boundaries | \bword\b → matches word |
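
A short sketch of the special sequences in use. The sample strings are invented for illustration, and raw strings (`r'...'`) are used so that the backslashes reach the `re` module unchanged.

```python
import re

# Sample text, assumed here for illustration
text = 'Order 66 shipped to hello123 on day 7'

digits = re.findall(r'\d+', text)   # runs of digits
words = re.findall(r'\w+', text)    # runs of letters, digits, underscores
parts = re.split(r'\s+', text)      # split on runs of whitespace

# \b keeps 'day' from matching inside 'daylight' or 'today'
whole = re.findall(r'\bday\b', 'day daylight today day')

print(digits)
print(words)
print(parts)
print(whole)
```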
Python’s `re` module provides several other useful functions:
findall() – Find All Matches

    import re

    # Collect every character that is A, F, or P
    result = re.findall('[AFP]', 'FACE Prep')
    print(result)

Output: ['F', 'A', 'P']
split() – Split String by a Pattern

    import re

    # Split wherever one or more digits appear
    result = re.split(r'\d+', 'FACE10Prep3Python')
    print(result)

Output: ['FACE', 'Prep', 'Python']
search() – Find First Match

    import re

    # search() scans the whole string for the first match
    result = re.search('FA', 'FACE Prep')
    if result:
        print("Pattern Found")
    else:
        print("Pattern Not Found")

Output: Pattern Found
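
When a match is found, `re.search()` returns a match object rather than a plain boolean. The sketch below, using the same sample string, reads the matched text and its position, and contrasts `search()` with `match()`, which only tries the start of the string.

```python
import re

text = 'FACE Prep'

found = re.search(r'Prep', text)     # scans the whole string
at_start = re.match(r'Prep', text)   # only tries position 0, so no match here

if found:
    # group() is the matched text, span() its (start, end) indices
    print(found.group(), found.span())
print(at_start)
```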
A regular expression is a sequence of characters that defines a search pattern, primarily used for string matching and text processing.
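To match repeated occurrences of a character such as ‘m’, use `m*` (zero or more occurrences) or `m+` (one or more occurrences).

    import re

    # 'm+' needs at least one 'm'; here it matches the 'mm' in 'programming'
    result = re.search('m+', 'programming')
    if result:
        print("Found")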
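
The difference between `*` and `+` shows up when the character is absent. A minimal sketch, with the word `python` chosen only as an example of a string that has no ‘m’:

```python
import re

plus_hit = re.search(r'm+', 'programming')   # matches 'mm'
plus_miss = re.search(r'm+', 'python')       # no 'm' at all, so None
star_any = re.search(r'm*', 'python')        # zero 'm's is allowed: empty match

print(plus_hit.group())
print(plus_miss)
print(repr(star_any.group()))
```

Because `m*` accepts zero occurrences, it matches the empty string at the start of any input, so it is usually combined with other characters in a longer pattern.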
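To match an optional character, follow it with `?`. For example, the pattern `!?` matches zero or one exclamation mark.

    import re

    # Note: '!?' can also match the empty string, so this search
    # succeeds even on strings without '!'
    result = re.search('!?', 'Hello World!')
    if result:
        print("Found")
    else:
        print("Not Found")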
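
Since `!?` on its own can match the empty string, a search for it succeeds on any input. A slightly more telling sketch anchors the optional character to surrounding text; the pattern `World!?$` is an illustrative choice, not taken from the original example.

```python
import re

# 'World', optionally followed by '!', at the end of the string
pattern = r'World!?$'

with_mark = re.search(pattern, 'Hello World!')
without_mark = re.search(pattern, 'Hello World')
no_match = re.search(pattern, 'Hello World?')

print(with_mark.group())
print(without_mark.group())
print(no_match)
```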
Regular expressions in Python are a powerful tool for pattern matching and text processing. With the `re` module, you can efficiently search, replace, and manipulate strings using metacharacters, special sequences, and built-in functions like `match()`, `findall()`, and `search()`. Whether you’re validating input, extracting data, or parsing text, mastering RegEx can significantly improve your coding efficiency.
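
Replacing text is mentioned here but not demonstrated in the examples. As a minimal sketch, `re.sub()` from the same module substitutes every match of a pattern; the replacement string `'-'` is an arbitrary choice for illustration.

```python
import re

# Replace each run of digits with a hyphen
cleaned = re.sub(r'\d+', '-', 'FACE10Prep3Python')
print(cleaned)
```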