HTML Charsets

To display an HTML page correctly, a web browser must know which character set (character encoding) the page uses.
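
To see why this matters, here is a minimal Python sketch (run outside the HTML page; the sample word and variable names are illustrative assumptions). It stores text as UTF-8 and then decodes the same bytes with the wrong and the right charset, which is roughly what happens when a browser misreads or correctly reads a page.

```python
# The sample word and variable names are illustrative assumptions.
text = "café"
stored_bytes = text.encode("utf-8")        # how a UTF-8 page stores the text

misread = stored_bytes.decode("cp1252")    # bytes interpreted as Windows-1252 (ANSI)
correct = stored_bytes.decode("utf-8")     # bytes interpreted with the right charset

print(misread)   # garbled: the two UTF-8 bytes of "é" show up as two characters
print(correct)
```

The bytes are the same in both cases; only the charset used to interpret them changes.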


The HTML charset Attribute

The charset attribute of the <meta> tag specifies the character set of an HTML page:

Example

				
					<meta charset="UTF-8">
				
			

The HTML5 specification encourages web developers to use the UTF-8 character set.

Nearly every character and symbol in the world is covered by UTF-8!

The ASCII Character Set

ASCII was the first character encoding standard for the web. It defines 128 characters (values 0 to 127), including:

  • English letters (A-Z, a-z)
  • Numbers (0-9)
  • Special characters like ! $ + - ( ) @ < >
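
The ASCII range described above can be explored with a short Python sketch. The helper name `is_ascii` is a hypothetical name chosen for illustration.

```python
# All 128 ASCII values (0 to 127) as characters.
ascii_chars = [chr(code) for code in range(128)]

# Control codes (0 to 31) and DEL (127) are not printable.
printable = [c for c in ascii_chars if c.isprintable()]

print(len(ascii_chars), len(printable))
print(ord("A"), ord("Z"), ord("0"), ord("9"))


def is_ascii(text):
    """Return True if every character has a value below 128 (hypothetical helper)."""
    return all(ord(c) < 128 for c in text)


print(is_ascii("Hello!"), is_ascii("Héllo"))
```

Any text that fails this check needs one of the larger character sets described in the following sections.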

The ANSI Character Set

ANSI (Windows-1252) was the original Windows character set:

  • Identical to ASCII for the values from 0 to 127
  • Special characters from 128 to 159
  • Identical to ISO-8859-1 and to the Unicode code points from 160 to 255
				
					<meta charset="Windows-1252">
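
The three ranges listed for ANSI can be checked with a small Python sketch. It assumes Python's built-in `cp1252` codec as a stand-in for Windows-1252; `ansi_char` is a hypothetical helper name. Python's codec treats five values in the 128 to 159 range as unassigned.

```python
def ansi_char(code):
    """Decode one byte value as Windows-1252, or return None if unassigned."""
    try:
        return bytes([code]).decode("cp1252")
    except UnicodeDecodeError:
        return None


# 0 to 127: same characters as ASCII.
same_as_ascii = all(ansi_char(c) == chr(c) for c in range(128))

# 128 to 159: extra printable characters, with a few unassigned values.
unassigned = [c for c in range(128, 160) if ansi_char(c) is None]

# 160 to 255: same characters as ISO-8859-1.
same_as_latin1 = all(
    ansi_char(c) == bytes([c]).decode("latin-1") for c in range(160, 256)
)

print(same_as_ascii, unassigned, same_as_latin1)
print(ansi_char(128), ansi_char(153))
```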
				
			

The ISO-8859-1 Character Set

ISO-8859-1 was the default character set for HTML 4. It supports 256 different character codes. HTML 4 also supported UTF-8.

  • Identical to ASCII for the values from 0 to 127
  • Defines no printable characters from 128 to 159 (the range is reserved for control codes)
  • Identical to ANSI and to the Unicode code points from 160 to 255

HTML 4 Example

				
					<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">
				
			

HTML 5 Example

				
					<meta charset="ISO-8859-1">
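
The ISO-8859-1 properties listed above can be verified in Python, as a sketch that uses the built-in `latin-1` codec as a stand-in for ISO-8859-1. The variable names are illustrative.

```python
import unicodedata

# Every byte value decodes to the Unicode code point with the same number.
direct_mapping = all(
    ord(bytes([c]).decode("latin-1")) == c for c in range(256)
)

# 128 to 159 decode to control characters (Unicode category "Cc"),
# so there is nothing printable in that range.
controls_only = all(
    unicodedata.category(bytes([c]).decode("latin-1")) == "Cc"
    for c in range(128, 160)
)

# The euro sign lives at 128 in ANSI, so ISO-8859-1 cannot represent it.
try:
    "€".encode("latin-1")
    euro_supported = True
except UnicodeEncodeError:
    euro_supported = False

print(direct_mapping, controls_only, euro_supported)
```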

				
			

The UTF-8 Character Set

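  • Identical to ASCII for the values from 0 to 127
  • Defines no printable characters from 128 to 159 (these code points are control codes)
  • Uses the same character numbers as ANSI and 8859-1 from 160 to 255, but stores each of them as two bytes
  • Continues from the value 256 with many more characters (Unicode defines well over 100 000)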
				
					<meta charset="UTF-8">
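
UTF-8 is a variable-length encoding: the character numbers match the other sets up to 255, but the stored bytes differ above 127. The following Python sketch shows this; the sample characters are arbitrary choices for illustration.

```python
# Sample characters are arbitrary: ASCII, Latin-1 range, and beyond.
samples = ["A", "é", "€", "中", "😀"]

for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X}  {len(encoded)} byte(s)  {encoded.hex()}")

byte_counts = [len(ch.encode("utf-8")) for ch in samples]

# "é" is number 233 in ISO-8859-1 and in Unicode,
# but UTF-8 stores it as two bytes instead of one.
latin1_bytes = "é".encode("latin-1")
utf8_bytes = "é".encode("utf-8")
print(latin1_bytes, utf8_bytes)
```

This is why a page saved as ISO-8859-1 or ANSI cannot simply be relabeled as UTF-8: any character above 127 has to be re-encoded.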
				
			

Differences Between Character Sets

The following table displays the differences between the character sets described above:

Numb  ASCII  ANSI  8859  UTF-8  Description
32                              space
33    !      !     !     !      exclamation mark
34    "      "     "     "      quotation mark
35    #      #     #     #      number sign
36    $      $     $     $      dollar sign
37    %      %     %     %      percent sign
38    &      &     &     &      ampersand
39    '      '     '     '      apostrophe
40    (      (     (     (      left parenthesis
41    )      )     )     )      right parenthesis
42    *      *     *     *      asterisk
43    +      +     +     +      plus sign
44    ,      ,     ,     ,      comma
45    -      -     -     -      hyphen-minus
46    .      .     .     .      full stop
47    /      /     /     /      solidus
48    0      0     0     0      digit zero
49    1      1     1     1      digit one
50    2      2     2     2      digit two
51    3      3     3     3      digit three
52    4      4     4     4      digit four
53    5      5     5     5      digit five
54    6      6     6     6      digit six
55    7      7     7     7      digit seven
56    8      8     8     8      digit eight
57    9      9     9     9      digit nine
58    :      :     :     :      colon
59    ;      ;     ;     ;      semicolon
60    <      <     <     <      less than
61    =      =     =     =      equals sign
62    >      >     >     >      greater than
63    ?      ?     ?     ?      question mark
64    @      @     @     @      commercial at
65    A      A     A     A      Latin A
66    B      B     B     B      Latin B
67    C      C     C     C      Latin C
68    D      D     D     D      Latin D
69    E      E     E     E      Latin E
70    F      F     F     F      Latin F
71    G      G     G     G      Latin G
72    H      H     H     H      Latin H
73    I      I     I     I      Latin I
74    J      J     J     J      Latin J
75    K      K     K     K      Latin K
76    L      L     L     L      Latin L
77    M      M     M     M      Latin M
78    N      N     N     N      Latin N
79    O      O     O     O      Latin O
80    P      P     P     P      Latin P
81    Q      Q     Q     Q      Latin Q
82    R      R     R     R      Latin R
83    S      S     S     S      Latin S
84    T      T     T     T      Latin T
85    U      U     U     U      Latin U
86    V      V     V     V      Latin V
87    W      W     W     W      Latin W
88    X      X     X     X      Latin X
89    Y      Y     Y     Y      Latin Y
90    Z      Z     Z     Z      Latin Z
91    [      [     [     [      left square bracket
92    \      \     \     \      reverse solidus
93    ]      ]     ]     ]      right square bracket
94    ^      ^     ^     ^      circumflex accent
95    _      _     _     _      low line
96    `      `     `     `      grave accent
97    a      a     a     a      Latin small a
98    b      b     b     b      Latin small b
99    c      c     c     c      Latin small c
100   d      d     d     d      Latin small d
101   e      e     e     e      Latin small e
102   f      f     f     f      Latin small f
103   g      g     g     g      Latin small g
104   h      h     h     h      Latin small h
105   i      i     i     i      Latin small i
106   j      j     j     j      Latin small j
107   k      k     k     k      Latin small k
108   l      l     l     l      Latin small l
109   m      m     m     m      Latin small m
110   n      n     n     n      Latin small n
111   o      o     o     o      Latin small o
112   p      p     p     p      Latin small p
113   q      q     q     q      Latin small q
114   r      r     r     r      Latin small r
115   s      s     s     s      Latin small s
116   t      t     t     t      Latin small t
117   u      u     u     u      Latin small u
118   v      v     v     v      Latin small v
119   w      w     w     w      Latin small w
120   x      x     x     x      Latin small x
121   y      y     y     y      Latin small y
122   z      z     z     z      Latin small z
123   {      {     {     {      left curly bracket
124   |      |     |     |      vertical line
125   }      }     }     }      right curly bracket
126   ~      ~     ~     ~      tilde
127   DEL
128          €                  euro sign
129                             NOT USED
130          ‚                  single low-9 quotation mark
131          ƒ                  Latin small f with hook
132          „                  double low-9 quotation mark
133          …                  horizontal ellipsis
134          †                  dagger
135          ‡                  double dagger
136          ˆ                  modifier letter circumflex accent
137          ‰                  per mille sign
138          Š                  Latin S with caron
139          ‹                  single left-pointing angle quotation mark
140          Œ                  Latin capital ligature OE
141                             NOT USED
142          Ž                  Latin Z with caron
143                             NOT USED
144                             NOT USED
145          ‘                  left single quotation mark
146          ’                  right single quotation mark
147          “                  left double quotation mark
148          ”                  right double quotation mark
149          •                  bullet
150          –                  en dash
151          —                  em dash
152          ˜                  small tilde
153          ™                  trade mark sign
154          š                  Latin small s with caron
155          ›                  single right-pointing angle quotation mark
156          œ                  Latin small ligature oe
157                             NOT USED
158          ž                  Latin small z with caron
159          Ÿ                  Latin Y with diaeresis
160                             no-break space
161          ¡     ¡     ¡      inverted exclamation mark
162          ¢     ¢     ¢      cent sign
163          £     £     £      pound sign
164          ¤     ¤     ¤      currency sign
165          ¥     ¥     ¥      yen sign
166          ¦     ¦     ¦      broken bar
167          §     §     §      section sign
168          ¨     ¨     ¨      diaeresis
169          ©     ©     ©      copyright sign
170          ª     ª     ª      feminine ordinal indicator
171          «     «     «      left-pointing double angle quotation mark
172          ¬     ¬     ¬      not sign
173                             soft hyphen
174          ®     ®     ®      registered sign
175          ¯     ¯     ¯      macron
176          °     °     °      degree sign
177          ±     ±     ±      plus-minus sign
178          ²     ²     ²      superscript two
179          ³     ³     ³      superscript three
180          ´     ´     ´      acute accent
181          µ     µ     µ      micro sign
182          ¶     ¶     ¶      pilcrow sign
183          ·     ·     ·      middle dot
184          ¸     ¸     ¸      cedilla
185          ¹     ¹     ¹      superscript one
186          º     º     º      masculine ordinal indicator
187          »     »     »      right-pointing double angle quotation mark
188          ¼     ¼     ¼      vulgar fraction one quarter
189          ½     ½     ½      vulgar fraction one half
190          ¾     ¾     ¾      vulgar fraction three quarters
191          ¿     ¿     ¿      inverted question mark
192          À     À     À      Latin A with grave
193          Á     Á     Á      Latin A with acute
194          Â     Â     Â      Latin A with circumflex
195          Ã     Ã     Ã      Latin A with tilde
196          Ä     Ä     Ä      Latin A with diaeresis
197          Å     Å     Å      Latin A with ring above
198          Æ     Æ     Æ      Latin AE
199          Ç     Ç     Ç      Latin C with cedilla
200          È     È     È      Latin E with grave
201          É     É     É      Latin E with acute
202          Ê     Ê     Ê      Latin E with circumflex
203          Ë     Ë     Ë      Latin E with diaeresis
204          Ì     Ì     Ì      Latin I with grave
205          Í     Í     Í      Latin I with acute
206          Î     Î     Î      Latin I with circumflex
207          Ï     Ï     Ï      Latin I with diaeresis
208          Ð     Ð     Ð      Latin Eth
209          Ñ     Ñ     Ñ      Latin N with tilde
210          Ò     Ò     Ò      Latin O with grave
211          Ó     Ó     Ó      Latin O with acute
212          Ô     Ô     Ô      Latin O with circumflex
213          Õ     Õ     Õ      Latin O with tilde
214          Ö     Ö     Ö      Latin O with diaeresis
215          ×     ×     ×      multiplication sign
216          Ø     Ø     Ø      Latin O with stroke
217          Ù     Ù     Ù      Latin U with grave
218          Ú     Ú     Ú      Latin U with acute
219          Û     Û     Û      Latin U with circumflex
220          Ü     Ü     Ü      Latin U with diaeresis
221          Ý     Ý     Ý      Latin Y with acute
222          Þ     Þ     Þ      Latin Thorn
223          ß     ß     ß      Latin small sharp s
224          à     à     à      Latin small a with grave
225          á     á     á      Latin small a with acute
226          â     â     â      Latin small a with circumflex
227          ã     ã     ã      Latin small a with tilde
228          ä     ä     ä      Latin small a with diaeresis
229          å     å     å      Latin small a with ring above
230          æ     æ     æ      Latin small ae
231          ç     ç     ç      Latin small c with cedilla
232          è     è     è      Latin small e with grave
233          é     é     é      Latin small e with acute
234          ê     ê     ê      Latin small e with circumflex
235          ë     ë     ë      Latin small e with diaeresis
236          ì     ì     ì      Latin small i with grave
237          í     í     í      Latin small i with acute
238          î     î     î      Latin small i with circumflex
239          ï     ï     ï      Latin small i with diaeresis
240          ð     ð     ð      Latin small eth
241          ñ     ñ     ñ      Latin small n with tilde
242          ò     ò     ò      Latin small o with grave
243          ó     ó     ó      Latin small o with acute
244          ô     ô     ô      Latin small o with circumflex
245          õ     õ     õ      Latin small o with tilde
246          ö     ö     ö      Latin small o with diaeresis
247          ÷     ÷     ÷      division sign
248          ø     ø     ø      Latin small o with stroke
249          ù     ù     ù      Latin small u with grave
250          ú     ú     ú      Latin small u with acute
251          û     û     û      Latin small u with circumflex
252          ü     ü     ü      Latin small u with diaeresis
253          ý     ý     ý      Latin small y with acute
254          þ     þ     þ      Latin small thorn
255          ÿ     ÿ     ÿ      Latin small y with diaeresis
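
Rows of a comparison like the one above can be regenerated programmatically. The sketch below assumes Python's `ascii`, `cp1252` and `latin-1` codecs as stand-ins for the ASCII, ANSI and 8859 columns, and uses the Unicode code point with the same number for the UTF-8 column. The function names `cell` and `table_row` are hypothetical.

```python
def cell(code, codec):
    """Return the printable character for a byte value in a codec, or ''."""
    try:
        ch = bytes([code]).decode(codec)
    except UnicodeDecodeError:
        return ""          # value not assigned in this character set
    return ch if ch.isprintable() else ""


def table_row(code):
    """Build (number, ASCII, ANSI, 8859, UTF-8) for one value from 0 to 255."""
    unicode_char = chr(code)   # the code point that UTF-8 would encode
    return (
        code,
        cell(code, "ascii"),
        cell(code, "cp1252"),
        cell(code, "latin-1"),
        unicode_char if unicode_char.isprintable() else "",
    )


for code in (65, 128, 129, 233):
    print(table_row(code))
```

Non-printable values, such as control codes and the no-break space, come out as empty cells here.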

HTML Character Sets: Ensuring Proper Encoding for Your Web Content

Character encoding determines whether your text displays correctly across different platforms and languages. This section summarizes the character sets used in HTML and what each one implies.

Character Encoding Basics

At the core of character encoding lies the concept of mapping characters to numerical values. The most widely recognized character encoding is ASCII, which covers the basic English alphabet, numbers, and punctuation. However, as the web has become more global, the need for more comprehensive character sets has emerged.

Common HTML Character Sets

– ANSI (Windows-1252): A Windows character set that matches ISO-8859-1 except for the values 128 to 159, where it adds printable characters such as the euro sign and curly quotation marks.

– UTF-8: The most widely used character encoding on the web, UTF-8 can represent a vast array of characters from multiple language scripts, including Chinese, Japanese, and Arabic.

– ISO-8859-1: A character encoding that supports the Latin alphabet and some additional characters used in Western European languages.

Comparing Character Set Capabilities

When choosing a character set for your HTML content, it’s essential to consider the language and script requirements of your target audience. UTF-8 is generally the recommended choice as it provides the broadest support for international characters, ensuring your content is accessible to a global audience.
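
One way to check the language requirements mentioned above is to test whether sample text can be encoded in each candidate charset. This is a minimal Python sketch; the helper `can_encode` and the sample strings are assumptions made for illustration.

```python
def can_encode(text, charset):
    """Return True if every character of text exists in the charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# Sample strings are arbitrary examples of three scripts.
samples = {"English": "Hello", "Portuguese": "Olá", "Japanese": "日本語"}
charsets = ("ascii", "latin-1", "cp1252", "utf-8")

for language, text in samples.items():
    support = {cs: can_encode(text, cs) for cs in charsets}
    print(language, support)
```

Only UTF-8 accepts all three samples, which is consistent with the recommendation to use it for international content.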

Charsets, or character sets, define the range of characters that can be represented in a document. Choosing the right charset ensures your content is displayed correctly across different platforms and languages.

The most common charsets used in HTML include ASCII, ANSI, ISO-8859-1, and UTF-8. ASCII is a basic character set that includes only English letters, numbers, and a limited set of symbols. ANSI and ISO-8859-1 extend ASCII with additional characters for Western European languages.

However, the industry standard today is UTF-8, an encoding of the Unicode character set that can represent characters from virtually any language. UTF-8 is backward compatible with ASCII: the values 0 to 127 are stored as the same single bytes, which makes it a versatile choice for multilingual web content.

When selecting a charset for your HTML documents, consider the languages and scripts you need to support. UTF-8 is generally the best option, as it provides comprehensive character coverage while maintaining compatibility with legacy systems. By mastering HTML charsets, you can ensure your web content is accessible and displayed correctly for all your users.
