textcase
Python library for text case conversions.
Features
Text case conversion: convert strings between various text cases (such as snake_case, kebab-case, and camelCase).
Extensible: extend the library with custom word boundaries and cases.
Accurate: handles any word boundaries in strings, including acronyms (as in "HTTPRequest").
Non-ASCII Support: handles non-ASCII characters seamlessly (no inferences about the input language are made).
Tiny, Performant & Zero Dependencies: a regex-free, efficient library that stays lightweight with no external dependencies.
100% test coverage: every line of code is rigorously tested for reliability.
100% type annotated codebase: full type annotations for best developer experience.
Installation
Usage
You can convert a string to a text case. You can also test which case a string is in.
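The two operations above can be illustrated without the library itself. The following is a minimal sketch, not the library's API: the names `split_on_delimiters`, `to_snake`, `to_kebab`, `to_constant`, and `is_case` are hypothetical, and the splitter only understands underscore, hyphen, and space delimiters. It assumes that a string "is in" a case when converting it to that case leaves it unchanged.

```python
DELIMITERS = "_- "  # assumption: only these three delimiters are recognized here


def split_on_delimiters(text: str) -> list[str]:
    """Split text into words on delimiters, skipping empty pieces."""
    words: list[str] = []
    current = ""
    for ch in text:
        if ch in DELIMITERS:
            if current:
                words.append(current)
            current = ""
        else:
            current += ch
    if current:
        words.append(current)
    return words


def to_snake(text: str) -> str:
    """Lowercase every word and join with underscores."""
    return "_".join(word.lower() for word in split_on_delimiters(text))


def to_kebab(text: str) -> str:
    """Lowercase every word and join with hyphens."""
    return "-".join(word.lower() for word in split_on_delimiters(text))


def to_constant(text: str) -> str:
    """Uppercase every word and join with underscores."""
    return "_".join(word.upper() for word in split_on_delimiters(text))


def is_case(text: str, convert) -> bool:
    """A string matches a case if converting it changes nothing."""
    return convert(text) == text


print(to_snake("Hello World"))            # hello_world
print(is_case("hello_world", to_snake))   # True
print(is_case("hello-world", to_snake))   # False
```

Expressing the match as a round trip keeps the check consistent with the conversion: any rule added to a converter is automatically reflected in the corresponding test.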
Boundaries
By default, the library splits words along the following set of word boundaries:
- Underscores: "_"
- Hyphens: "-"
- Spaces: " "
- Interpuncts: "·"
- Changes in capitalization from lowercase to uppercase: "aA"
- Adjacent digits and letters: "a1", "1a", "A1", "1A"
- Acronyms: "AAa" (as in "HTTPRequest")
You can learn more about boundaries here.
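To make the default boundaries listed above concrete, here is a regex-free sketch of a splitter that applies them in a single pass. It is an illustration under assumptions, not the library's implementation: `split_words` and `DELIMITERS` are hypothetical names, and the real library may order or combine these rules differently.

```python
DELIMITERS = {"_", "-", " ", "·"}  # characters that separate words and are dropped


def split_words(text: str) -> list[str]:
    """Split text on delimiters, case changes, digit/letter changes, and acronyms."""
    words: list[str] = []
    current = ""
    for i, ch in enumerate(text):
        if ch in DELIMITERS:
            # Leading, trailing, and repeated delimiters never produce empty words.
            if current:
                words.append(current)
            current = ""
            continue
        if current:
            prev = text[i - 1]
            nxt = text[i + 1] if i + 1 < len(text) else ""
            lower_to_upper = prev.islower() and ch.isupper()          # "aA"
            digit_letter = (prev.isdigit() and ch.isalpha()) or (
                prev.isalpha() and ch.isdigit()
            )                                                         # "a1", "1a", "A1", "1A"
            acronym_end = prev.isupper() and ch.isupper() and nxt.islower()  # "AAa"
            if lower_to_upper or digit_letter or acronym_end:
                words.append(current)
                current = ""
        current += ch
    if current:
        words.append(current)
    return words


print(split_words("HTTPRequest"))        # ['HTTP', 'Request']
print(split_words("__hello--world  "))   # ['hello', 'world']
print(split_words("camelCase2go"))       # ['camel', 'Case', '2', 'go']
```

The acronym rule looks one character ahead: an uppercase letter that follows another uppercase letter and precedes a lowercase one starts a new word, which is what separates "HTTP" from "Request".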
Precision
For more precision, you can restrict splitting to the word boundaries of a particular case by explicitly specifying which boundaries to use.
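The effect of restricting the boundaries can be sketched with a small standalone example. The helpers `split_on` and `to_title` are hypothetical and do not reflect the library's actual signatures; the sample file name is made up for illustration.

```python
def split_on(text: str, delimiters: set[str]) -> list[str]:
    """Split text only on the given delimiter characters, skipping empty pieces."""
    words: list[str] = []
    current = ""
    for ch in text:
        if ch in delimiters:
            if current:
                words.append(current)
            current = ""
        else:
            current += ch
    if current:
        words.append(current)
    return words


def to_title(words: list[str]) -> str:
    """Capitalize each word and join with spaces."""
    return " ".join(word.capitalize() for word in words)


name = "2020-04-16_my_cat_cali"

# Splitting on both hyphens and underscores breaks the date apart.
print(to_title(split_on(name, {"_", "-"})))  # 2020 04 16 My Cat Cali

# Splitting on underscores only keeps the date intact.
print(to_title(split_on(name, {"_"})))       # 2020-04-16 My Cat Cali
```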
This library can detect acronyms in camel-like strings. It also ignores any leading, trailing, or duplicate delimiters.
Non-ASCII Characters
The library also supports non-ASCII characters. However, it makes no inferences about the input language. For example, in Dutch, the digraph "ij" is treated as two separate Unicode characters and will not be capitalized as a single unit. In contrast, the character "æ" will be capitalized as expected. Similarly, the English text "I THINK I DO" will be converted to "i think i do", not "I think I do". Within these limits, the library can handle a wide variety of characters.
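The same language-agnostic behavior can be observed with Python's built-in string methods, which apply per-character Unicode case mappings without any language-specific tailoring. The snippet below uses only the standard library; whether the library relies on these exact methods internally is an assumption, and the sample words are chosen purely for illustration.

```python
# "æ" is a single code point with an uppercase form, so it capitalizes as expected.
danish = "ærøskøbing".capitalize()
print(danish)   # Ærøskøbing

# The Dutch digraph "ij" is two code points; only the first one is capitalized.
dutch = "ijsselmeer".capitalize()
print(dutch)    # Ijsselmeer

# Lowercasing knows nothing about the English pronoun "I".
english = "I THINK I DO".lower()
print(english)  # i think i do
```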
Punctuation
By default, transitions between letters and digits are considered word boundaries. In addition, punctuation characters are stripped (except for the delimiter of the current case), and other special characters are ignored. You can control this behavior using the strip_punctuation argument.
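As a rough illustration of what stripping punctuation while preserving the target delimiter could look like, consider the sketch below. The function `join_words` is hypothetical; only the name of its `strip_punctuation` parameter is borrowed from the text above, and the library's real behavior may differ. This sketch also assumes that "punctuation" means the ASCII set in `string.punctuation`.

```python
import string


def join_words(words: list[str], delimiter: str, strip_punctuation: bool = True) -> str:
    """Lowercase words and join them, optionally removing ASCII punctuation."""
    if strip_punctuation:
        # Keep the delimiter of the target case even though it is punctuation.
        removable = set(string.punctuation) - {delimiter}
        words = ["".join(ch for ch in word if ch not in removable) for word in words]
        # Drop words that consisted of punctuation only.
        words = [word for word in words if word]
    return delimiter.join(word.lower() for word in words)


print(join_words(["Hello,", "world!"], "_"))                           # hello_world
print(join_words(["Hello,", "world!"], "_", strip_punctuation=False))  # hello,_world!
print(join_words(["well-known", "fact."], "-"))                        # well-known-fact
```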