The Unicode Consortium develops the Unicode Standard.
Its goal is to replace the many existing character sets with a single standard and its Unicode Transformation Formats (UTF).
The Unicode Standard has become a success and is implemented in HTML, XML, Java, JavaScript, e-mail, ASP, PHP, etc.
The Unicode Consortium cooperates with the leading standards development organizations, like ISO, W3C, and ECMA.
The Unicode Standard covers (almost) all the characters, punctuation marks, and symbols in the world.
The default character encoding in HTML5 is UTF-8.
The Unicode Standard is also supported in many operating systems and all modern browsers.
Unicode can be implemented by different character encodings.
The most commonly used encodings are UTF-8 and UTF-16:
UTF-8:
A character in UTF-8 can be from 1 to 4 bytes long.
UTF-8 can represent any character in the Unicode Standard.
UTF-8 is backwards compatible with ASCII.
UTF-8 is the preferred encoding for e-mail and web pages.
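The byte lengths and the ASCII compatibility described above can be checked with a short Python sketch. The helper name utf8_length and the four sample characters are illustrative choices, not part of the original notes.

```python
def utf8_length(ch: str) -> int:
    """Return how many bytes one character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


# One sample per UTF-8 length: A, e with acute, euro sign, grinning face emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {encoded.hex()} ({len(encoded)} bytes)")

# ASCII text produces exactly the same bytes in ASCII and in UTF-8.
assert "Hello".encode("utf-8") == "Hello".encode("ascii")
```

Characters in the ASCII range take one byte, and characters outside the Basic Multilingual Plane (such as most emoji) take four.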
UTF-16:
UTF-16 (16-bit Unicode Transformation Format) is a variable-length character encoding for Unicode,
capable of encoding the entire Unicode repertoire using one or two 16-bit code units per character.
UTF-16 is used in major operating systems and environments, like Microsoft Windows, Java and .NET.
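To see why UTF-16 counts as variable-length, the following Python sketch splits a character into its 16-bit code units. The helper name utf16_units is hypothetical, and big-endian byte order is chosen only to make the output easy to read.

```python
from typing import List


def utf16_units(ch: str) -> List[int]:
    """Return the 16-bit code units of a string encoded as UTF-16 (big-endian, no BOM)."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


# A character in the Basic Multilingual Plane needs a single code unit.
print([hex(u) for u in utf16_units("A")])

# A character above U+FFFF (here U+1F600) needs two code units, a surrogate pair.
print([hex(u) for u in utf16_units("\U0001F600")])
```

Code that treats every UTF-16 code unit as one character will therefore miscount characters above U+FFFF.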
Tip:
All HTML 4 processors support UTF-8,
and all HTML5 and XML processors support both UTF-8 and UTF-16!
If an HTML5 web page uses a character set other than UTF-8,
it should be specified in the <meta> tag, for example: <meta charset="ISO-8859-1">
w3schools.com/charsets/ref_utf_symbols.asp
Unicode Characters (graphemica.com)
Char | Entity | HTML Dec | HTML Hex | Description |
---|---|---|---|---|
ă | abreve | &#259; | &#x0103; | latin small letter a with breve |
â | acirc | &#226; | &#x00E2; | latin small letter a with circumflex |
Î | Icirc | &#206; | &#x00CE; | latin capital letter I with circumflex |
î | icirc | &#238; | &#x00EE; | latin small letter i with circumflex |
à | agrave | &#224; | &#x00E0; | latin small letter a with grave = latin small letter a grave |
á | aacute | &#225; | &#x00E1; | latin small letter a with acute |
æ | aelig | &#230; | &#x00E6; | latin small letter ae = latin small ligature ae |
ç | ccedil | &#231; | &#x00E7; | latin small letter c with cedilla |
è | egrave | &#232; | &#x00E8; | latin small letter e with grave |
é | eacute | &#233; | &#x00E9; | latin small letter e with acute |
ê | ecirc | &#234; | &#x00EA; | latin small letter e with circumflex |
Δ | Delta | &#916; | &#x0394; | greek capital letter delta |
Φ | Phi | &#934; | &#x03A6; | greek capital letter phi |
φ | phi | &#966; | &#x03C6; | greek small letter phi |
α | alpha | &#945; | &#x03B1; | greek small letter alpha |
β | beta | &#946; | &#x03B2; | greek small letter beta |
γ | gamma | &#947; | &#x03B3; | greek small letter gamma |
μ | mu | &#956; | &#x03BC; | greek small letter mu |
π | pi | &#960; | &#x03C0; | greek small letter pi |
ϖ | piv | &#982; | &#x03D6; | greek pi symbol |
ρ | rho | &#961; | &#x03C1; | greek small letter rho |
ψ | psi | &#968; | &#x03C8; | greek small letter psi |
ω | omega | &#969; | &#x03C9; | greek small letter omega |
& | amp | &#38; | &#x0026; | ampersand |
< | lt | &#60; | &#x003C; | less than |
> | gt | &#62; | &#x003E; | greater than |
± | plusmn | &#177; | &#x00B1; | plus-minus sign = plus-or-minus sign |
µ | micro | &#181; | &#x00B5; | micro sign |
¼ | frac14 | &#188; | &#x00BC; | vulgar fraction one quarter = fraction one quarter |
½ | frac12 | &#189; | &#x00BD; | vulgar fraction one half = fraction one half |
¾ | frac34 | &#190; | &#x00BE; | vulgar fraction three quarters = fraction three quarters |
• | bull | &#8226; | &#x2022; | bullet = black small circle |
• | bullet | &#8226; | &#x2022; | bullet = black small circle |
′ | prime | &#8242; | &#x2032; | prime = minutes = feet |
″ | Prime | &#8243; | &#x2033; | double prime = seconds = inches |
‾ | oline | &#8254; | &#x203E; | overline = spacing overscore |
⁄ | frasl | &#8260; | &#x2044; | fraction slash |
← | larr | &#8592; | &#x2190; | leftwards arrow |
↑ | uarr | &#8593; | &#x2191; | upwards arrow |
→ | rarr | &#8594; | &#x2192; | rightwards arrow |
↓ | darr | &#8595; | &#x2193; | downwards arrow |
↔ | harr | &#8596; | &#x2194; | left right arrow |
↵ | crarr | &#8629; | &#x21B5; | downwards arrow with corner leftwards = carriage return |
⇒ | rArr | &#8658; | &#x21D2; | rightwards double arrow |
∃ | exist | &#8707; | &#x2203; | there exists |
∅ | empty | &#8709; | &#x2205; | empty set = null set = diameter |
∈ | isin | &#8712; | &#x2208; | element of |
∉ | notin | &#8713; | &#x2209; | not an element of |
∑ | sum | &#8721; | &#x2211; | n-ary summation |
− | minus | &#8722; | &#x2212; | minus sign |
∗ | lowast | &#8727; | &#x2217; | asterisk operator |
√ | radic | &#8730; | &#x221A; | square root = radical sign |
∝ | prop | &#8733; | &#x221D; | proportional to |
∞ | infin | &#8734; | &#x221E; | infinity |
∼ | sim | &#8764; | &#x223C; | tilde operator = varies with = similar to |
∽ | | &#8765; | &#x223D; | reversed tilde |
∿ | | &#8767; | &#x223F; | sine wave |
∾ | | &#8766; | &#x223E; | inverted lazy s |
∻ | | &#8763; | &#x223B; | homothetic |
∷ | | &#8759; | &#x2237; | proportion |
∺ | | &#8762; | &#x223A; | geometric proportion |
∹ | | &#8761; | &#x2239; | excess |
∸ | | &#8760; | &#x2238; | dot minus |
≅ | cong | &#8773; | &#x2245; | approximately equal to |
≊ | approxeq | &#8778; | &#x224A; | almost equal or equal to |
≈ | approx | &#8776; | &#x2248; | almost equal to = asymptotic to |
≃ | sime | &#8771; | &#x2243; | asymptotically equal to |
≂ | | &#8770; | &#x2242; | minus tilde |
≁ | | &#8769; | &#x2241; | not tilde |
≀ | | &#8768; | &#x2240; | wreath product |
≠ | ne | &#8800; | &#x2260; | not equal to |
≡ | equiv | &#8801; | &#x2261; | identical to |
≤ | le | &#8804; | &#x2264; | less-than or equal to |
≥ | ge | &#8805; | &#x2265; | greater-than or equal to |
⊂ | sub | &#8834; | &#x2282; | subset of |
⊃ | sup | &#8835; | &#x2283; | superset of |
⊄ | nsub | &#8836; | &#x2284; | not a subset of |
⊥ | perp | &#8869; | &#x22A5; | up tack = orthogonal to = perpendicular |
⊤ | | &#8868; | &#x22A4; | down tack |
⊣ | | &#8867; | &#x22A3; | left tack |
⊢ | | &#8866; | &#x22A2; | right tack |
⊡ | | &#8865; | &#x22A1; | squared dot operator |
⊠ | | &#8864; | &#x22A0; | squared times |
⊟ | | &#8863; | &#x229F; | squared minus |
⊞ | | &#8862; | &#x229E; | squared plus |
⊝ | | &#8861; | &#x229D; | circled dash |
⊜ | | &#8860; | &#x229C; | circled equals |
⊛ | | &#8859; | &#x229B; | circled asterisk operator |
⊚ | | &#8858; | &#x229A; | circled ring operator |
⊙ | | &#8857; | &#x2299; | circled dot operator |
⊘ | | &#8856; | &#x2298; | circled division slash |
⊗ | otimes | &#8855; | &#x2297; | circled times |
⊖ | | &#8854; | &#x2296; | circled minus |
⊕ | oplus | &#8853; | &#x2295; | circled plus |
⊔ | | &#8852; | &#x2294; | square cup |
⊓ | | &#8851; | &#x2293; | square cap |
⊒ | | &#8850; | &#x2292; | square original of or equal to |
⊑ | | &#8849; | &#x2291; | square image of or equal to |
⊐ | | &#8848; | &#x2290; | square original of |
⊏ | | &#8847; | &#x228F; | square image of |
⊎ | | &#8846; | &#x228E; | multiset union |
⊍ | | &#8845; | &#x228D; | multiset multiplication |
⊌ | | &#8844; | &#x228C; | multiset |
♥ | hearts | &#9829; | &#x2665; | black heart suit = valentine |
◯ | bigcirc | &#9711; | &#x25EF; | large circle |
★ | bigstar | &#9733; | &#x2605; | black star |
⊟ | boxminus | &#8863; | &#x229F; | squared minus |
⊞ | boxplus | &#8862; | &#x229E; | squared plus |
⊠ | boxtimes | &#8864; | &#x22A0; | squared times |
✓ | check | &#10003; | &#x2713; | check mark |
✓ | checkmark | &#10003; | &#x2713; | check mark |
† | dagger | &#8224; | &#x2020; | dagger |
˜ | DiacriticalTilde | &#732; | &#x02DC; | small tilde |
÷ | div | &#247; | &#x00F7; | division sign |
÷ | divide | &#247; | &#x00F7; | division sign |
$ | dollar | &#36; | &#x0024; | dollar sign |
⟹ | DoubleLongRightArrow | &#10233; | &#x27F9; | long rightwards double arrow |
⇒ | DoubleRightArrow | &#8658; | &#x21D2; | rightwards double arrow |
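
Rows in the same five-column layout as the table above can be derived from the character itself. The sketch below uses Python's standard html and unicodedata modules. The function name table_row is hypothetical, and codepoint2name only knows the HTML 4 entity names, so newer names such as check come back empty here.

```python
import html
import unicodedata
from html.entities import codepoint2name


def table_row(ch: str) -> str:
    """Build 'Char | Entity | HTML Dec | HTML Hex | Description |' for one character."""
    cp = ord(ch)
    entity = codepoint2name.get(cp, "")          # HTML 4 entity name, or "" if none
    name = unicodedata.name(ch, "").lower()      # official Unicode character name
    return f"{ch} | {entity} | &#{cp}; | &#x{cp:04X}; | {name} |"


for ch in ["\u00e9", "\u2297", "\u2713"]:
    print(table_row(ch))

# Going the other way: numeric and named references decode to the same character.
print(html.unescape("&#8855;") == html.unescape("&otimes;"))
```

The description column produced this way is the formal Unicode name, which may differ in wording from the informal descriptions used in some rows of the table.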