开发者

How do I lowercase a string in Python?

开发者 https://www.devze.com 2023-03-22 04:23 出处:网络
Is there a way to convert a string to lowercase? "Kilometers"→"kilometers" See 开发者_开发问答How to change a string into uppercase? for the opposite.Use str.lower():

Is there a way to convert a string to lowercase?

"Kilometers"  →  "kilometers"

See 开发者_开发问答How to change a string into uppercase? for the opposite.


Use str.lower():

"Kilometer".lower()


The canonical Pythonic way of doing this is

>>> 'Kilometers'.lower()
'kilometers'

However, if the purpose is to do case insensitive matching, you should use case-folding:

>>> 'Kilometers'.casefold()
'kilometers'

Here's why:

>>> "Maße".casefold()
'masse'
>>> "Maße".lower()
'maße'
>>> "MASSE" == "Maße"
False
>>> "MASSE".lower() == "Maße".lower()
False
>>> "MASSE".casefold() == "Maße".casefold()
True

This is a str method in Python 3, but in Python 2, you'll want to look at the PyICU or py2casefold - several answers address this here.

Unicode Python 3

Python 3 handles plain string literals as unicode:

>>> string = 'Километр'
>>> string
'Километр'
>>> string.lower()
'километр'

Python 2, plain string literals are bytes

In Python 2, the below, pasted into a shell, encodes the literal as a string of bytes, using utf-8.

And lower doesn't map any changes that bytes would be aware of, so we get the same string.

>>> string = 'Километр'
>>> string
'\xd0\x9a\xd0\xb8\xd0\xbb\xd0\xbe\xd0\xbc\xd0\xb5\xd1\x82\xd1\x80'
>>> string.lower()
'\xd0\x9a\xd0\xb8\xd0\xbb\xd0\xbe\xd0\xbc\xd0\xb5\xd1\x82\xd1\x80'
>>> print string.lower()
Километр

In scripts, Python will object to non-ascii (as of Python 2.5, and warning in Python 2.4) bytes being in a string with no encoding given, since the intended coding would be ambiguous. For more on that, see the Unicode how-to in the docs and PEP 263

Use Unicode literals, not str literals

So we need a unicode string to handle this conversion, accomplished easily with a unicode string literal, which disambiguates with a u prefix (and note the u prefix also works in Python 3):

>>> unicode_literal = u'Километр'
>>> print(unicode_literal.lower())
километр

Note that the bytes are completely different from the str bytes - the escape character is '\u' followed by the 2-byte width, or 16 bit representation of these unicode letters:

>>> unicode_literal
u'\u041a\u0438\u043b\u043e\u043c\u0435\u0442\u0440'
>>> unicode_literal.lower()
u'\u043a\u0438\u043b\u043e\u043c\u0435\u0442\u0440'

Now if we only have it in the form of a str, we need to convert it to unicode. Python's Unicode type is a universal encoding format that has many advantages relative to most other encodings. We can either use the unicode constructor or str.decode method with the codec to convert the str to unicode:

>>> unicode_from_string = unicode(string, 'utf-8') # "encoding" unicode from string
>>> print(unicode_from_string.lower())
километр
>>> string_to_unicode = string.decode('utf-8') 
>>> print(string_to_unicode.lower())
километр
>>> unicode_from_string == string_to_unicode == unicode_literal
True

Both methods convert to the unicode type - and same as the unicode_literal.

Best Practice, use Unicode

It is recommended that you always work with text in Unicode.

Software should only work with Unicode strings internally, converting to a particular encoding on output.

Can encode back when necessary

However, to get the lowercase back in type str, encode the python string to utf-8 again:

>>> print string
Километр
>>> string
'\xd0\x9a\xd0\xb8\xd0\xbb\xd0\xbe\xd0\xbc\xd0\xb5\xd1\x82\xd1\x80'
>>> string.decode('utf-8')
u'\u041a\u0438\u043b\u043e\u043c\u0435\u0442\u0440'
>>> string.decode('utf-8').lower()
u'\u043a\u0438\u043b\u043e\u043c\u0435\u0442\u0440'
>>> string.decode('utf-8').lower().encode('utf-8')
'\xd0\xba\xd0\xb8\xd0\xbb\xd0\xbe\xd0\xbc\xd0\xb5\xd1\x82\xd1\x80'
>>> print string.decode('utf-8').lower().encode('utf-8')
километр

So in Python 2, Unicode can encode into Python strings, and Python strings can decode into the Unicode type.


With Python 2, this doesn't work for non-English words in UTF-8. In this case decode('utf-8') can help:

>>> s='Километр'
>>> print s.lower()
Километр
>>> print s.decode('utf-8').lower()
километр


Also, you can overwrite some variables:

s = input('UPPER CASE')
lower = s.lower()

If you use like this:

s = "Kilometer"
print(s.lower())     - kilometer
print(s)             - Kilometer

It will work just when called.


Don't try this, totally un-recommend, don't do this:

import string
s='ABCD'
print(''.join([string.ascii_lowercase[string.ascii_uppercase.index(i)] for i in s]))

Output:

abcd

Since no one wrote it yet you can use swapcase (so uppercase letters will become lowercase, and vice versa) (and this one you should use in cases where i just mentioned (convert upper to lower, lower to upper)):

s='ABCD'
print(s.swapcase())

Output:

abcd


lowercasing

This method not only converts all uppercase letters of the Latin alphabet into lowercase ones, but also shows how such logic is implemented. You can test this code in any online Python sandbox.

def turnIntoLowercase(string):
    
    lowercaseCharacters = ''
    
    abc = ['a','b','c','d','e','f','g','h','i','j','k','l','m', 
           'n','o','p','q','r','s','t','u','v','w','x','y','z',
           'A','B','C','D','E','F','G','H','I','J','K','L','M',
           'N','O','P','Q','R','S','T','U','V','W','X','Y','Z']
  
    for character in string:
        if character not in abc:
            lowercaseCharacters += character
        elif abc.index(character) <= 25:
            lowercaseCharacters += character
        else: 
            lowercaseCharacters += abc[abc.index(character) - 26]
    return lowercaseCharacters

string = str(input("Enter your string, please: " ))

print(turnIntoLowercase(string = string))

Performance check

Now, let's enter the following string (and press Enter) to make sure everything works as intended:

# Enter your string, please: 

"PYTHON 3.11.2, 15TH FeB 2023"

Result:

"python 3.11.2, 15th feb 2023"


If you want to convert a list of strings to lowercase, you can map str.lower:

list_of_strings = ['CamelCase', 'in', 'Python']
list(map(str.lower, list_of_strings))            # ['camelcase', 'in', 'python']


There are several different ways in which this can be done.

  1. Using .lower() method
original_string = "UPPERCASE"
lowercase_string = original_string.lower()
print(lowercase_string)  # Output: "uppercase"
  1. Using str.lower()
original_string = "UPPERCASE"
lowercase_string = str.lower(original_string)
print(lowercase_string)  # Output: "uppercase"
  1. Using combination of str.translate() and str.maketrans()
original_string = "UPPERCASE"
lowercase_string = original_string.translate(str.maketrans(string.ascii_uppercase, string.ascii_lowercase))
print(lowercase_string)  # Output: "uppercase"
0

精彩评论

暂无评论...
验证码 换一张
取 消