textcase
A feature-rich Python text case conversion library.
Features
- Text Case Conversion: Convert strings between various text cases (e.g., snake_case, kebab-case, camelCase).
- Extensible Design: Easily extend the library with custom cases and boundaries.
- Acronym Handling: Properly detects and formats acronyms in strings (as in HTTPRequest).
- Non-ASCII Support: Handles non-ASCII characters seamlessly (no inference is made about the input language itself).
- 100% Test Coverage: Comprehensive tests ensure reliability and correctness.
- Well-Documented: Clean documentation with usage examples for easy understanding.
- Performant: Efficient implementation without the use of regular expressions.
- Zero Dependencies: The library has no external dependencies, making it lightweight and easy to integrate.
Installation
Usage
You can convert strings into a case using the convert function:
from textcase import case, convert
print(convert("ronnie james dio", case.SNAKE))
print(convert("Ronnie_James_dio", case.CONSTANT))
print(convert("RONNIE_JAMES_DIO", case.KEBAB))
print(convert("RONNIE-JAMES-DIO", case.CAMEL))
print(convert("ronnie-james-dio", case.PASCAL))
print(convert("RONNIE JAMES DIO", case.LOWER))
print(convert("ronnie james dio", case.UPPER))
print(convert("ronnie-james-dio", case.TITLE))
print(convert("ronnie james dio", case.SENTENCE))
By default, convert and CaseConverter.convert will split along a set of default word boundaries, that is:
- Underscores: _,
- Hyphens: -,
- Spaces: " ",
- Changes in capitalization from lowercase to uppercase: aA,
- Adjacent digits and letters: a1, 1a, A1, 1A,
- Acronyms: AAa (as in HTTPRequest).
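To make the default boundaries concrete, here is a minimal, regex-free sketch of a splitter that applies the rules listed above. It is not the library's implementation; the function name split_words and its structure are hypothetical and exist only for illustration.

```python
def split_words(text: str) -> list[str]:
    """Split text on the default boundaries listed above (illustrative sketch only)."""
    words: list[str] = []
    current = ""
    for i, ch in enumerate(text):
        if ch in "_- ":
            # Delimiters end the current word and are dropped.
            # Leading, trailing, and repeated delimiters produce no empty words.
            if current:
                words.append(current)
            current = ""
            continue
        prev = text[i - 1] if i > 0 else ""
        nxt = text[i + 1] if i + 1 < len(text) else ""
        boundary = bool(current) and (
            (prev.islower() and ch.isupper())                        # aA
            or (prev.isalpha() and ch.isdigit())                     # a1, A1
            or (prev.isdigit() and ch.isalpha())                     # 1a, 1A
            or (prev.isupper() and ch.isupper() and nxt.islower())   # AAa
        )
        if boundary:
            words.append(current)
            current = ""
        current += ch
    if current:
        words.append(current)
    return words


# Joining the lowercased words with "_" gives a snake_case-style result.
print("_".join(word.lower() for word in split_words("HTTPRequest")))
print(split_words("__weird--var _name-"))
```

The acronym rule looks one character ahead: an uppercase letter starts a new word only when it follows another uppercase letter and precedes a lowercase one, which is what separates HTTP from Request.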
For more precision, you can specify boundaries to split based on the word boundaries of a particular case. For example, you can explicitly specify which boundaries will be used:
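The effect of restricting the boundaries can be illustrated without the library itself. The sketch below uses two hypothetical helpers, split_on and to_title (neither is part of textcase), and assumes that a boundary is just a delimiter character. It shows how the same input yields different words depending on which delimiters are active.

```python
def split_on(text: str, delimiters: str) -> list[str]:
    """Split text on any character in `delimiters`, dropping empty words."""
    words: list[str] = []
    current = ""
    for ch in text:
        if ch in delimiters:
            if current:
                words.append(current)
            current = ""
        else:
            current += ch
    if current:
        words.append(current)
    return words


def to_title(text: str, delimiters: str) -> str:
    """Capitalize each word found by split_on and join the words with spaces."""
    return " ".join(word.capitalize() for word in split_on(text, delimiters))


file_name = "2020-04-16_my_cat_cali"
print(to_title(file_name, "_-"))  # hyphens and underscores both split
print(to_title(file_name, "_"))   # only underscores split; the date stays intact
```

Limiting the split to underscores keeps the hyphenated date together as a single word, which is the kind of precision that explicitly chosen boundaries are meant to give.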
This library can detect acronyms in camel-like strings. It also ignores any leading, trailing, or duplicate delimiters:
The library also supports non-ASCII characters. However, no inference is made about the input language itself. For example, the Dutch digraph "ij" is treated as two separate Unicode characters and will not be capitalized as a unit. In contrast, the character "æ" will be capitalized as expected. Likewise, the English text "I THINK I DO" will be converted to "i think i do", not "I think I do". Within these limits, the library can handle various characters:
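The same language-agnostic behavior can be observed with Python's built-in string methods, which map characters one by one with no knowledge of the language. The snippet below uses only str.capitalize and str.lower; that the library's results match these built-ins is an assumption made for illustration, not something the text above confirms.

```python
# Dutch "ij" is two separate characters, so only the "i" is uppercased.
print("ijsselmeer".capitalize())

# "æ" is a single character with an uppercase form, so it capitalizes as expected.
print("æble".capitalize())

# Lowercasing knows nothing about the English pronoun "I".
print("I THINK I DO".lower())
```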
By default, characters followed by digits and vice-versa are considered word boundaries. In addition, any special ASCII characters (besides _ and -) are ignored:
You can also test what case a string is in:
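One plausible way to implement such a check is to convert the string to the target case and compare the result with the input. Whether the library does exactly this is an assumption. The helpers to_kebab, to_snake, and looks_like below are hypothetical, deliberately simplified stand-ins rather than textcase functions.

```python
from typing import Callable


def to_kebab(text: str) -> str:
    """Simplified converter: splits only on "_", "-", and spaces (illustrative)."""
    for delimiter in "_ ":
        text = text.replace(delimiter, "-")
    return "-".join(part.lower() for part in text.split("-") if part)


def to_snake(text: str) -> str:
    """Simplified converter built on to_kebab (illustrative)."""
    return to_kebab(text).replace("-", "_")


def looks_like(text: str, converter: Callable[[str], str]) -> bool:
    """A string is already in a case if converting it to that case changes nothing."""
    return converter(text) == text


print(looks_like("css-class-name", to_kebab))
print(looks_like("css-class-name", to_snake))
print(looks_like("UPPER_CASE_VAR", to_snake))
```

The round-trip comparison keeps the check consistent with the conversion rules: any string that the converter would alter is, by definition, not in that case.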