2 Python Regular Expressions Flashcards

Question

How can I match alphanumeric characters only (both letters and digits)?

Answer 1

import re

text = "User123@ai.com"
alphanumeric = re.findall(r"[a-zA-Z0-9]", text)
print(alphanumeric)

Output: ['U', 's', 'e', 'r', '1', '2', '3', 'a', 'i', 'c', 'o', 'm']

Answer 2

Use \d+ with re.findall() to find all digit sequences (like "123", "99").

import re

text = "I have 2 cats, 1 dog, and 12 fishes."
numbers = re.findall(r"\d+", text)
print(numbers)

Output: ['2', '1', '12']

Answer 3

import re

text = "Version 2.0 released on 2025-05-01"
cleaned = re.sub(r"\d", "", text)
print(cleaned)

Output: Version . released on --

Answer 4

import re

text = "Room 401 has 3 chairs and 2 tables."
digits = re.findall(r"\d", text)
print("Total digits:", len(digits))

Output: Total digits: 5

Answer 5

import re

text = "We launched in 2019, updated in 2021, and expanded in 2023."
years = re.findall(r"\d{4}", text)
print(years)

Output: ['2019', '2021', '2023']

\d{4} matches exactly four consecutive digits (a common year format).
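One caveat worth a quick sketch: \d{4} also matches the first four digits of any longer run of digits. Assuming you only want standalone four-digit numbers, adding \b on both sides is one way to tighten the pattern. The sample sentence below is made up for illustration.

```python
import re

# Hypothetical sample text: one year plus a longer number.
text = "Founded in 2019 with order ID 1234567."

# Without boundaries, \d{4} also grabs the first four digits of 1234567.
loose = re.findall(r"\d{4}", text)

# With \b on both sides, only standalone four-digit runs match.
strict = re.findall(r"\b\d{4}\b", text)

print(loose)   # ['2019', '1234']
print(strict)  # ['2019']
```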

Answer 6

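import re

text = "apple ant banana anchor ball"
matches = re.findall(r"\ba\w*", text)
print(matches)

Output: ['apple', 'ant', 'anchor']

\ba\w* matches an "a" at the start of a word, followed by zero or more word characters (so "a", "ap", "apple" etc.). Without the leading \b, the pattern would also match inside words, returning "anana" from "banana" and "all" from "ball".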

Answer 7

import re

text = "<b>Hello NLP!</b> <b>Data Science</b>"
matches = re.findall(r"<b>(.*?)</b>", text)
print(matches)

Output: ['Hello NLP!', 'Data Science']

.*? is non-greedy: it matches the shortest content between an opening tag and its closing tag.
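To see why the non-greedy form matters, here is a small side-by-side sketch of greedy .* versus non-greedy .*?. The sample string and the li tag name are assumptions chosen only for illustration.

```python
import re

# Hypothetical sample text with two tagged items (tag name is an assumption).
text = "<li>regex</li><li>tokens</li>"

# Greedy: .* runs to the LAST closing tag, swallowing everything in between.
greedy = re.findall(r"<li>(.*)</li>", text)

# Non-greedy: .*? stops at the FIRST closing tag it can reach.
lazy = re.findall(r"<li>(.*?)</li>", text)

print(greedy)  # ['regex</li><li>tokens']
print(lazy)    # ['regex', 'tokens']
```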

Answer 8

import re

text = "color or colour - both are valid."
matches = re.findall(r"colou?r", text)
print(matches)

Output: ['color', 'colour']

u? means: the letter "u" may or may not be there.

Answer 9

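import re

text = "My numbers are +91-9876543210 and 9876543210"

matches = re.findall(r"(\+91-)?\d{10}", text)
print(matches)

Output: ['+91-', '']

Because the pattern contains a capturing group, findall() returns only the group's text, not the full match. Alternation without a group returns the full numbers:

matches = re.findall(r"\+91-\d{10}|\d{10}", text)
print(matches)

Output: ['+91-9876543210', '9876543210']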
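Another way to get the full numbers, sketched here as an alternative rather than the card's own method, is a non-capturing group (?:...). It keeps the optional prefix grouped for the ? quantifier but does not change what findall() returns.

```python
import re

text = "My numbers are +91-9876543210 and 9876543210"

# (?:...) groups without capturing, so findall() returns the whole match.
full_numbers = re.findall(r"(?:\+91-)?\d{10}", text)

print(full_numbers)  # ['+91-9876543210', '9876543210']
```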

Answer 10

Parentheses ( ) do three things:
They group patterns together.
They capture the matched text (used in findall(), finditer(), etc.).
They let you apply quantifiers (*, +, ?, etc.) to the entire group.
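A minimal sketch of grouping, capturing, and quantifying working together. The sample string is made up; note how findall() reports the captured group while finditer() gives access to the full match.

```python
import re

# Hypothetical sample text with two runs of "ab".
text = "abab cd abababab"

# The + quantifier applies to the whole group "ab", not just to "b".
runs = [m.group(0) for m in re.finditer(r"(ab)+", text)]

# findall() returns the group's last captured repetition instead of the full match.
captured = re.findall(r"(ab)+", text)

print(runs)      # ['abab', 'abababab']
print(captured)  # ['ab', 'ab']
```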

Answer 11

import re

text = "+91-9876543210"
match = re.search(r"(\+91-)(\d{10})", text)
print(match.groups())

Output: ('+91-', '9876543210')

Answer 12

import re

text = "hahaha hehe"
match = re.search(r"(ha){3}", text)
print(match.group())

Output: hahaha

Answer 13

\b matches the position between a word character ([a-zA-Z0-9_]) and a non-word character (like space, punctuation, or line start/end). It does not match any character, just a position.

Think of it like:
"start of a word" → \bword
"end of a word" → word\b
"exact word" → \bword\b

Pattern      Meaning
\bword\b     Match the exact word "word"
\bun\w+      Words starting with "un"
\w+ed\b      Words ending in "ed"
\b\w+\b      Match complete words (tokenizing)
\b\w{3}\b    Match all 3-letter words
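The patterns in the table can be tried out on one sentence. This is just an illustrative sketch; the sentence is made up so that each pattern has something to match.

```python
import re

# Hypothetical sample sentence for the boundary patterns above.
text = "The unhappy cat walked and jumped until noon."

starts_with_un = re.findall(r"\bun\w+", text)    # words starting with "un"
ends_with_ed = re.findall(r"\w+ed\b", text)      # words ending in "ed"
tokens = re.findall(r"\b\w+\b", text)            # complete words
three_letters = re.findall(r"\b\w{3}\b", text)   # 3-letter words only

print(starts_with_un)  # ['unhappy', 'until']
print(ends_with_ed)    # ['walked', 'jumped']
print(tokens)          # ['The', 'unhappy', 'cat', 'walked', 'and', 'jumped', 'until', 'noon']
print(three_letters)   # ['The', 'cat', 'and']
```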

Answer 14

import re

text = "I saw a cat. It was not in scatter or catalog."
matches = re.findall(r"\bcat\b", text)
print(matches)

Output: ['cat']

Only the exact word cat is matched. scatter and catalog are ignored.

Answer 15

A space
A tab (\t)
A newline (\n)
A carriage return (\r)
Basically, any whitespace (invisible spacing)
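A quick sketch confirming that \s picks up each of the whitespace characters listed above. The test string is made up and packs one of each between single letters.

```python
import re

# Hypothetical string: space, tab, newline, and carriage return between letters.
text = "a b\tc\nd\re"

whitespace = re.findall(r"\s", text)
letters = re.split(r"\s", text)

print(whitespace)  # [' ', '\t', '\n', '\r']
print(letters)     # ['a', 'b', 'c', 'd', 'e']
```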

Answer 16

import re

text = "Natural Language Processing"
matches = re.findall(r"\s", text)
print(matches)

Output: [' ', ' ']

Answer 17

import re

text = "Hello \tWorld \n NLP "
cleaned = re.sub(r"\s+", " ", text).strip()
print(cleaned)

Output: Hello World NLP

\s+ matches each run of whitespace, re.sub() replaces the run with a single space, and .strip() removes the leading and trailing space.

Answer 18

Any letter → A to Z, a to z
Any digit → 0 to 9
The underscore → _

⚠️ Note: It does not match special characters or spaces.
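One detail the list above leaves out: in Python 3, \w on a str pattern also matches Unicode letters and digits, not only A to Z, a to z, and 0 to 9. Passing the re.ASCII flag restricts it to the ASCII set described above. A small sketch, with a made-up sample string:

```python
import re

# Hypothetical sample text containing accented letters (é and ï).
text = "caf\u00e9_42 na\u00efve!"

# Default (Unicode) matching: accented letters count as word characters.
unicode_words = re.findall(r"\w+", text)

# With re.ASCII, \w is limited to [a-zA-Z0-9_], so words split at the accents.
ascii_words = re.findall(r"\w+", text, re.ASCII)

print(unicode_words)  # ['café_42', 'naïve']
print(ascii_words)    # ['caf', '_42', 'na', 've']
```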

Answer 19

import re

text = "NLP_101 is #awesome!"
matches = re.findall(r"\w", text)
print(matches)

Output: ['N', 'L', 'P', '_', '1', '0', '1', 'i', 's', 'a', 'w', 'e', 's', 'o', 'm', 'e']

Answer 20

import re

text = "AI_2025 is great!"
words = re.findall(r"\w+", text)
print(words)

Output: ['AI_2025', 'is', 'great']

Answer 21

import re

text = "AI is amazing! #future-ready_2025"
cleaned = re.sub(r"[^\w\s]", "", text)
print(cleaned)

Output: AI is amazing futureready_2025

Answer 22

import re

text = "Email: ai_dev2025@example.com"
count = len(re.findall(r"\w", text))
print(count)

Output: 25

Answer 23

import re

# Match at the start
re.search(r'\AData', 'Data Science is powerful')  # ✅ Match: 'Data'

# No match if not at start
re.search(r'\AScience', 'Data Science is cool')  # ❌ No match

# Even with multiline text, \A still only matches the start of the full string
text = '''AI is rising
Machine learning is growing'''
re.search(r'\AAI', text)  # ✅ Match: 'AI'
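For contrast, a short sketch of how \A differs from ^ once the re.MULTILINE flag is set: ^ then matches at the start of every line, while \A stays pinned to the start of the whole string. The text reuses the two-line sample above.

```python
import re

text = "AI is rising\nMachine learning is growing"

# With re.MULTILINE, ^ matches at the start of each line.
line_starts = re.findall(r"^\w+", text, re.MULTILINE)

# \A ignores the flag and matches only at the very start of the string.
string_start = re.findall(r"\A\w+", text, re.MULTILINE)

print(line_starts)   # ['AI', 'Machine']
print(string_start)  # ['AI']
```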

Answer 24

import re

# Match at the end
re.search(r'end\Z', 'This is the real end')  # ✅ Match: 'end'

# No match if not at end
re.search(r'world\Z', 'Hello world! ')  # ❌ No match

# Match at the true end even with newlines inside the text
text = 'Durga is learning\nRegex is fun\nEND'
re.search(r'END\Z', text)  # ✅ Match: 'END'

Tip: When dealing with full documents, .strip() helps clean trailing spaces/newlines before using \Z.
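The tip about .strip() can be seen in a small sketch. In Python, $ also matches just before a single trailing newline, while \Z matches only at the true end of the string. The sample text is made up and ends with a newline on purpose.

```python
import re

# Hypothetical document text that ends with a trailing newline.
text = "Regex is fun\nEND\n"

dollar_match = bool(re.search(r"END$", text))             # $ tolerates one trailing newline
z_match = bool(re.search(r"END\Z", text))                 # \Z does not
z_after_strip = bool(re.search(r"END\Z", text.strip()))   # stripping removes the newline first

print(dollar_match)   # True
print(z_match)        # False
print(z_after_strip)  # True
```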

(48 cards)