Calculate character count, byte count (UTF-8/ASCII), word count, and line count.
String length can be measured in different ways: byte count (storage size), code unit count (what languages such as JavaScript and Java report as a string's length), codepoint count (Unicode units), or grapheme count (user-perceived characters). "Character count" is ambiguous and can mean any of the last three, depending on the system. Understanding these distinctions is crucial when working with international text, databases with size limits, or APIs with character restrictions.
Character count and byte count are not the same thing: an emoji might be 1 character visually but 4 bytes in UTF-8. Word count typically splits on whitespace. Line count depends on newline characters. Byte count reflects actual storage size and varies with encoding (UTF-8, UTF-16, etc.).
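As a rough illustration of these counts, here is a minimal Python sketch. The function name `text_stats` and the sample string are hypothetical, and this is not necessarily how this tool computes its numbers. It assumes that "character count" means codepoints, that words are separated by runs of whitespace, and that a trailing newline does not start an extra line.

```python
def text_stats(text: str) -> dict:
    """Return several length measurements for one string (illustrative only)."""
    return {
        # Python's len() on a str counts Unicode codepoints.
        "codepoints": len(text),
        # Byte counts depend on the encoding chosen.
        "utf8_bytes": len(text.encode("utf-8")),
        # "utf-16-le" is used so no 2-byte byte-order mark is prepended.
        "utf16_bytes": len(text.encode("utf-16-le")),
        # split() with no argument splits on runs of whitespace.
        "words": len(text.split()),
        # splitlines() also treats \r, \r\n and a few other separators as line breaks.
        "lines": len(text.splitlines()),
    }


# "\U0001F30D" is the globe emoji: 1 codepoint, 4 bytes in UTF-8.
sample = "Hello \U0001F30D\nBonjour le monde\n"
stats = text_stats(sample)
print(stats)
```

For the sample string, the codepoint count (25) is smaller than the UTF-8 byte count (28) because the single emoji takes four bytes, and the UTF-16 byte count (52) is larger still.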
Emojis can be composed of multiple Unicode codepoints. A family emoji might be several person emojis joined by Zero-Width Joiners (U+200D), so it displays as one character but counts as several codepoints. Different systems count these differently.
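To make the disagreement concrete, the sketch below measures one three-person family emoji four ways. The helper `approx_grapheme_count` is a hypothetical, deliberately simplified approximation: it only merges codepoints joined by a Zero-Width Joiner and attaches nonspacing or enclosing marks to the preceding character. It is not a full implementation of Unicode grapheme segmentation (it ignores skin-tone modifiers and flag sequences, for example), so a dedicated segmentation library would be the safer choice in real code.

```python
import unicodedata

ZWJ = "\u200d"  # Zero-Width Joiner


def approx_grapheme_count(text: str) -> int:
    """Very rough user-perceived character count (assumption: ZWJ and marks only)."""
    count = 0
    joined = False  # True immediately after a ZWJ
    for ch in text:
        if ch == ZWJ:
            joined = True
            continue
        # Nonspacing/enclosing marks (including variation selectors) and
        # codepoints that follow a ZWJ extend the previous cluster.
        if joined or unicodedata.category(ch) in ("Mn", "Me"):
            joined = False
            continue
        count += 1
    return count


# man + ZWJ + woman + ZWJ + girl, written with escapes to avoid font issues
family = "\U0001F468\u200d\U0001F469\u200d\U0001F467"

codepoints = len(family)
utf16_units = len(family.encode("utf-16-le")) // 2
utf8_bytes = len(family.encode("utf-8"))
graphemes = approx_grapheme_count(family)

print(codepoints, utf16_units, utf8_bytes, graphemes)
```

The same emoji measures as 5 codepoints, 8 UTF-16 code units, 18 UTF-8 bytes, and (under this approximation) 1 grapheme, which is why two systems can report different lengths for identical text.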
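This tool uses UTF-8, the most common web encoding. UTF-8 uses 1-4 bytes per codepoint: ASCII characters use 1 byte, characters from most other scripts use 2-3 bytes, and most emojis use 4 bytes per codepoint. An emoji built from several codepoints takes correspondingly more.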
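The byte widths follow directly from the codepoint's numeric value. The sketch below uses a hypothetical helper, `utf8_width`, that applies the standard UTF-8 range boundaries and checks the result against Python's own encoder. It assumes a single-codepoint input and does not handle lone surrogates, which cannot be encoded in UTF-8.

```python
def utf8_width(ch: str) -> int:
    """Number of bytes UTF-8 needs for a single codepoint."""
    cp = ord(ch)
    if cp < 0x80:        # U+0000..U+007F: ASCII
        return 1
    if cp < 0x800:       # U+0080..U+07FF: e.g. Latin accents, Greek, Cyrillic
        return 2
    if cp < 0x10000:     # U+0800..U+FFFF: e.g. most CJK characters
        return 3
    return 4             # U+10000..U+10FFFF: e.g. most emojis


samples = {
    "A": "A",                   # U+0041
    "e-acute": "\u00e9",        # U+00E9
    "CJK zhong": "\u4e2d",      # U+4E2D
    "grinning face": "\U0001F600",
}

widths = {name: utf8_width(ch) for name, ch in samples.items()}

# Cross-check the range logic against the real encoder.
for name, ch in samples.items():
    assert widths[name] == len(ch.encode("utf-8"))

print(widths)
```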