Entropy of String

Calculate the Shannon entropy of a string. Free online entropy calculator. No signup, 100% private, browser-based.

How it works

Information entropy (Shannon entropy) measures the average amount of information — or unpredictability — in a string, in bits per character. It was formalized by Claude Shannon in his 1948 paper "A Mathematical Theory of Communication" and is the foundational concept in data compression, cryptography, and information theory.

**Formula** H = −Σ p(c) × log₂(p(c)) for each unique character c in the string, where p(c) is the probability (relative frequency) of character c. A uniform distribution (every character equally likely) maximizes entropy: for a 256-character alphabet, maximum entropy is log₂(256) = 8 bits/char. A string of all identical characters has entropy 0 (completely predictable).
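
In code, the formula is just a frequency count followed by the sum above. A minimal TypeScript sketch (the function name and structure are illustrative, not this tool's actual source):

```ts
// Shannon entropy of a string, in bits per character.
function shannonEntropy(s: string): number {
  const chars = [...s]; // split into Unicode code points
  if (chars.length === 0) return 0;
  const counts = new Map<string, number>();
  for (const ch of chars) counts.set(ch, (counts.get(ch) ?? 0) + 1);
  let h = 0;
  for (const n of counts.values()) {
    const p = n / chars.length; // p(c): relative frequency of character c
    h -= p * Math.log2(p);      // accumulate −p(c) × log₂(p(c))
  }
  return h;
}

console.log(shannonEntropy("AAAAAAA").toFixed(3)); // "0.000" — fully predictable
console.log(shannonEntropy("ABCDEFG").toFixed(3)); // "2.807" — log₂(7), uniform
```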

**Interpreting entropy for security** Password entropy estimates information-theoretic strength: a random lowercase password has log₂(26) ≈ 4.7 bits/char, and a random printable-ASCII password (95 characters: letters, digits, symbols) has log₂(95) ≈ 6.57 bits/char. A 20-character random printable-ASCII password therefore has ~131 bits of entropy — computationally unbreakable. Conversely, low entropy per character does not make a password weak if it is sufficiently long.
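
The per-character figures translate into total strength by simple multiplication. A quick sketch, assuming truly random selection (the function name passwordBits is illustrative):

```ts
// Strength of a truly random password: bits = length × log₂(alphabet size).
function passwordBits(length: number, alphabetSize: number): number {
  return length * Math.log2(alphabetSize);
}

console.log(passwordBits(20, 26).toFixed(1)); // "94.0"  — lowercase only
console.log(passwordBits(20, 62).toFixed(1)); // "119.1" — alphanumeric
console.log(passwordBits(20, 95).toFixed(1)); // "131.4" — printable ASCII
```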

**Data compression context** Shannon entropy represents the theoretical lower bound on lossless compression. Text with 4 bits/char entropy can, in theory, be compressed to 50% of its original size (from 8 bits/byte ASCII). Zip and gzip approach this limit for English text. Truly random data (entropy = 8 bits/char) cannot be compressed.
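
To make the bound concrete, a sketch of the theoretical floor, assuming the 8-bits-per-character ASCII baseline above (minCompressedBytes is an illustrative name):

```ts
// Lossless-compression floor implied by Shannon entropy:
// n characters at H bits/char need at least ⌈n × H / 8⌉ bytes.
function minCompressedBytes(charCount: number, bitsPerChar: number): number {
  return Math.ceil((charCount * bitsPerChar) / 8);
}

// 10 000 ASCII characters (10 000 bytes) at 4 bits/char:
console.log(minCompressedBytes(10_000, 4)); // 5000 — the 50% figure above
```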

Frequently Asked Questions

What is the Shannon entropy of 'AAAAAAA' vs 'ABCDEFG'?
'AAAAAAA' has entropy 0 bits/char: every character is A (p = 1.0), and −1.0 × log₂(1.0) = 0. 'ABCDEFG' has 7 distinct characters, each with p = 1/7, so entropy = −7 × (1/7) × log₂(1/7) = log₂(7) ≈ 2.807 bits/char — the maximum possible for a 7-character alphabet, achieved because every character is equally probable. Adding repeated characters reduces entropy.
How does entropy relate to password strength?
Password strength in bits = length × bits_per_character. Random lowercase = log₂(26) ≈ 4.7 bits/char. Random alphanumeric (62 chars) ≈ 5.95 bits/char. Random printable ASCII (95 chars) ≈ 6.57 bits/char. A 20-char random printable-ASCII password has 131 bits — computationally unbreakable. Important: entropy assumes truly random selection. 'P@ssw0rd' has low entropy despite using multiple character types, because the substitution pattern is predictable. Shannon entropy of the string itself (measured by this tool) underestimates password security if the password is chosen from a natural language pattern.
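
The formula can also be inverted to find how long a password must be for a target strength. A sketch, assuming truly random selection and an illustrative 128-bit target:

```ts
// Invert bits = length × log₂(alphabet): length = ⌈target / log₂(alphabet)⌉.
function requiredLength(targetBits: number, alphabetSize: number): number {
  return Math.ceil(targetBits / Math.log2(alphabetSize));
}

console.log(requiredLength(128, 26)); // 28 — lowercase only
console.log(requiredLength(128, 62)); // 22 — alphanumeric
console.log(requiredLength(128, 95)); // 20 — printable ASCII
```
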
What does Shannon entropy tell us about data compressibility?
Shannon entropy is the theoretical minimum bits per symbol for lossless compression. A string with H=3 bits/char can theoretically be compressed from 8 bits/char (ASCII) to 3 bits/char — 62.5% compression. Huffman coding achieves near-optimal compression for known distributions. LZ77/LZ78 (used in gzip/zip) achieve near-entropy compression even for unknown distributions. A string with entropy 8 bits/char (uniformly random bytes) cannot be compressed — its compressed output is at least as large as the input.
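
A quick empirical check under Node.js, using the standard node:zlib and node:crypto APIs (the sample text is arbitrary):

```ts
// Uniformly random bytes (≈8 bits/byte) do not shrink under gzip,
// while repetitive English-like text shrinks dramatically.
import { Buffer } from "node:buffer";
import { randomBytes } from "node:crypto";
import { gzipSync } from "node:zlib";

const random = randomBytes(100_000); // entropy ≈ 8 bits/byte
const text = Buffer.from(
  "the quick brown fox jumps over the lazy dog ".repeat(2000),
  "utf8",
);

// Random data: output slightly LARGER than input (gzip headers + stored blocks).
console.log(`random: ${random.length} -> ${gzipSync(random).length} bytes`);
// Text: far smaller than input.
console.log(`text:   ${text.length} -> ${gzipSync(text).length} bytes`);
```
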
Can I use entropy to detect encryption or random data?
Yes. Encrypted data and random data have entropy near 8 bits/byte (near-maximum for byte values 0–255). Compressed data also has high entropy (that's why compressing already-compressed or encrypted data doesn't help). Unencrypted ASCII text typically has entropy 4–5 bits/byte. Executable binaries: 5–7 bits/byte. A file with measured entropy > 7.5 bits/byte is likely encrypted, compressed, or contains random padding. Malware analysis tools use entropy to detect packed/encrypted executables.
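
A byte-level sketch of this heuristic for Node.js (the input path "sample.bin" and the 7.5 bits/byte cutoff are illustrative assumptions, not fixed standards):

```ts
// Byte-level entropy: bits per byte over the alphabet 0–255,
// the measure used to flag encrypted or compressed content.
import { readFileSync } from "node:fs";

function byteEntropy(buf: Uint8Array): number {
  if (buf.length === 0) return 0;
  const counts = new Array<number>(256).fill(0);
  for (const b of buf) counts[b]++;
  let h = 0;
  for (const n of counts) {
    if (n === 0) continue;
    const p = n / buf.length; // p(b): relative frequency of byte value b
    h -= p * Math.log2(p);
  }
  return h;
}

const data = readFileSync("sample.bin"); // hypothetical input file
const h = byteEntropy(data);
console.log(
  `${h.toFixed(2)} bits/byte ->`,
  h > 7.5 ? "likely encrypted, compressed, or random" : "likely plain data",
);
```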