Copyright @ Kodeclik 2025. All rights reserved.
Sooner or later, a string in your Python program will contain some “nuisance” characters that need to be removed. This blogpost shows how to remove non-alphabetic characters from a Python string.
We explore three different methods to accomplish our objective!
This approach uses Python's built-in string method isalpha() in combination with a generator expression (a comprehension written without square brackets) and join().
In this program, the isalpha() method checks each character to verify that it is an alphabetic letter. The generator expression yields only the characters that pass this check, and join() combines them back into a string. This method is straightforward and easy to understand, making it ideal for simple text cleaning tasks.
The output is Kodeclik.
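One detail worth knowing is that isalpha() accepts any Unicode letter, not just the 26 English ones. Here is a minimal sketch that wraps the same idea in a helper function. The name keep_letters is ours, chosen only for illustration:

```python
def keep_letters(text):
    """Return text with every non-letter character removed."""
    # str.isalpha() is True for any Unicode letter, not only a-z and A-Z
    return ''.join(char for char in text if char.isalpha())


print(keep_letters("K0odecl1ik!"))   # Kodeclik
print(keep_letters("Straße 12!"))    # Straße
```

If you only want plain English letters, the regular expression method is the better fit.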
This solution employs regular expressions through the re module.
Here, the re.sub() function replaces every character that matches the pattern [^a-zA-Z] with an empty string. The caret ^ inside the square brackets means "not", so this pattern matches any character that is not a letter from a to z or A to Z.
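The output is again Kodeclik.
Regular expressions offer more flexibility and power when the pattern matching requirements get more complex: to change which characters are kept, you only need to edit the character class inside the square brackets. Keep in mind that a-zA-Z covers only the unaccented English letters.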
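To make the flexibility point concrete, here is a small sketch that varies the character class. The variable names letters_only and letters_and_digits are ours, and the last line illustrates that a-zA-Z is limited to ASCII letters:

```python
import re

inputstring = "K0odecl1ik!"

# Keep letters only (ASCII a-z and A-Z)
letters_only = re.sub(r'[^a-zA-Z]', '', inputstring)

# Keep letters and digits: just widen the character class
letters_and_digits = re.sub(r'[^a-zA-Z0-9]', '', inputstring)

print(letters_only)        # Kodeclik
print(letters_and_digits)  # K0odecl1ik

# Unlike isalpha(), the class a-zA-Z does not include letters such as ß
print(re.sub(r'[^a-zA-Z]', '', "Straße"))  # Strae
```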
The translate() method provides a highly efficient way to remove multiple characters at once.
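This program creates a translation “table” using str.maketrans(). The third argument lists the characters to delete: each one is mapped to None, which translate() treats as an instruction to remove it. The string.punctuation, string.digits, and string.whitespace constants from the string module supply the ASCII punctuation marks, the digits 0 to 9, and the whitespace characters. Unlike the first two methods, this one removes only the characters it is told about, so any other symbol (an emoji, for example) stays in the string. This method is particularly efficient when processing large strings because it performs the removal operation in a single pass. For our input the output is once more Kodeclik.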
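The sketch below peeks inside the translation table and shows what happens to a character that is not listed in it. The name to_remove and the sample string with the euro sign are ours, added only for illustration:

```python
import string

# The third argument to str.maketrans() lists the characters to delete
to_remove = string.punctuation + string.digits + string.whitespace
translator = str.maketrans('', '', to_remove)

# The table maps each character code to None, meaning "delete this character"
print(translator[ord('!')])   # None

print("K0odecl1ik!".translate(translator))   # Kodeclik

# Characters missing from the table are left alone, e.g. the euro sign
print("Kode€clik 2025".translate(translator))   # Kode€clik
```

So translate() works from a list of characters to throw away, while the isalpha() and regular expression methods work from a rule about which characters to keep.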
Here are some applications of what you have learnt so far! In the examples below we slightly update the code so that it retains spaces for readability, but you can change that if you so desire.
Here we aim to sanitize user comments that might contain non-alphabetic characters:
This method uses a generator expression with two conditions. The char.isalpha() checks if a character is a letter, while char.isspace() checks if it's a whitespace character. The join() method then combines all retained characters. Note that the spaces on either side of a removed symbol are all retained, so the result can contain several spaces in a row. This approach is particularly useful for cleaning social media comments or user reviews where emoticons, excessive punctuation, and numbers are common but need to be removed while maintaining readability.
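If the doubled spaces left behind by removed symbols matter, one possible follow-up step is to collapse them. Here is a minimal sketch that reuses the comment from this example. The variable names kept and tidy are ours:

```python
user_comment = "Hey!!! This product is AMAZING!!! <3 :) Would buy again... 100% satisfied!!!"

# Step 1: keep letters and whitespace only
kept = ''.join(char for char in user_comment if char.isalpha() or char.isspace())

# Step 2: split() with no arguments splits on any run of whitespace,
# so joining the pieces with one space collapses the gaps
tidy = ' '.join(kept.split())

print(tidy)   # Hey This product is AMAZING Would buy again satisfied
```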
Here we are standardizing product descriptions that might contain a range of non-alphabetic characters:
This solution uses regular expressions with the pattern [^a-zA-Z\s]. The ^ inside square brackets means "not", a-zA-Z represents all letters, and \s represents whitespace characters, so the pattern matches anything that is neither a letter nor whitespace. The re.sub() function replaces every character that matches this pattern with an empty string. Because hyphens are deleted rather than replaced, hyphenated words run together: Air-Max becomes AirMax. This method is ideal for cleaning product descriptions or titles that often contain special characters, model numbers, and prices.
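If word boundaries matter, an alternative is to replace each run of unwanted characters with a single space instead of with nothing. This is a sketch of that variation, not the pattern used in the example for this section, and the name spaced is ours:

```python
import re

product_desc = "$Special-Edition* Nike Air-Max 2024 (Limited Release) @ $299.99!!!"

# Replace every run of non-letters (the + means "one or more") with one space
spaced = re.sub(r'[^a-zA-Z]+', ' ', product_desc)

# strip() removes the space left at the very start and end
clean_text = spaced.strip()

print(clean_text)   # Special Edition Nike Air Max Limited Release
```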
Here we demonstrate how to clean email subject lines which can contain a lot of extraneous information:
This approach uses the translate() method with a custom translation table. The table is created by combining punctuation and digits from the string module. The if c != ' ' filter is only a safeguard: neither constant contains a space, so spaces are kept with or without it. The translate() method then removes all specified characters in a single efficient pass. This method is particularly effective for email subjects or headers that often contain various prefixes, brackets, and reference numbers.
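A quick way to convince yourself about the role of the space filter is to build the table both ways and compare. In this sketch the names removable, filtered_table, and simple_table are ours:

```python
import string

email_subject = "RE: [URGENT!!!] Your Order #12345 Status Update - Shipped!"
removable = string.punctuation + string.digits

# Neither constant contains a space, so the filter has nothing to skip
print(' ' in removable)   # False

# Build the table with the filter, and without it
filtered_table = str.maketrans('', '', ''.join(c for c in removable if c != ' '))
simple_table = str.maketrans('', '', removable)
print(filtered_table == simple_table)   # True

clean_text = email_subject.translate(simple_table)
print(clean_text.split())
# ['RE', 'URGENT', 'Your', 'Order', 'Status', 'Update', 'Shipped']
```

The filter would start to matter if you added string.whitespace to the characters being removed, as the first translate() example does.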
So we have learnt three different methods to "clean up" your string. Which method is your favorite?
Enjoy this blogpost? Want to learn Python with us? Sign up for 1:1 or small group classes.
# Method 1: isalpha() with join()
inputstring = "K0odecl1ik!"
only_letters = ''.join(char for char in inputstring if char.isalpha())
print(only_letters)
# Output: Kodeclik

# Method 2: regular expressions
import re
inputstring = "K0odecl1ik!"
clean_text = re.sub(r'[^a-zA-Z]', '', inputstring)
print(clean_text)
# Output: Kodeclik

# Method 3: translate()
import string
inputstring = "K0odecl1ik!"
translator = str.maketrans('', '', string.punctuation + string.digits + string.whitespace)
clean_text = inputstring.translate(translator)
print(clean_text)
# Output: Kodeclik

# Application 1: sanitizing user comments (letters and whitespace are kept)
user_comment = "Hey!!! This product is AMAZING!!! <3 :) Would buy again... 100% satisfied!!!"
clean_text = ''.join(char for char in user_comment if char.isalpha() or char.isspace())
print(clean_text)
# Output: Hey This product is AMAZING   Would buy again  satisfied
# (the spaces around removed symbols stay, so some gaps are wider than one space)

# Application 2: standardizing product descriptions
import re
product_desc = "$Special-Edition* Nike Air-Max 2024 (Limited Release) @ $299.99!!!"
clean_text = re.sub(r'[^a-zA-Z\s]', '', product_desc)
print(clean_text)
# Output: SpecialEdition Nike AirMax  Limited Release
# (hyphens are deleted, so hyphenated words run together; two trailing spaces also remain)

# Application 3: cleaning email subject lines
import string
email_subject = "RE: [URGENT!!!] Your Order #12345 Status Update - Shipped!"
translator = str.maketrans('', '', ''.join(c for c in string.punctuation + string.digits if c != ' '))
clean_text = email_subject.translate(translator)
print(clean_text)
# Output: RE URGENT Your Order  Status Update  Shipped