XML Encoding

XML documents can contain non-ASCII characters, such as Devanagari letters used in India (अ आ इ) or accented French letters (é è ê).

To avoid errors, specify the XML encoding, or save XML files as Unicode.

XML Encoding Errors

When you load an XML document, you may get one of two different errors that indicate an encoding problem:

An invalid character was found in text content.

You get this error if your XML contains non-ASCII characters and the file was saved in a single-byte ANSI encoding with no encoding specified.
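As a rough illustration of this first error, here is a minimal sketch using Python's standard xml.etree.ElementTree module. This is not the parser that produces the message quoted above, so the error wording differs. It assumes the parser falls back to UTF-8 when no encoding is declared, which Python's parser does. The helper name `parses` and the sample text are made up for the example.

```python
import xml.etree.ElementTree as ET

def parses(data: bytes) -> bool:
    """Return True if the bytes parse as XML, False on a parse error."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False

# "é" stored as the single byte 0xE9 (ISO-8859-1), as an ANSI editor would save it.
body = "<note><message>café</message></note>".encode("iso-8859-1")

no_declaration = b'<?xml version="1.0"?>' + body
with_declaration = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + body

print(parses(no_declaration))    # the lone 0xE9 byte is not valid UTF-8
print(parses(with_declaration))  # the declaration tells the parser how to decode it
```

The same bytes are rejected or accepted depending only on whether the declaration names the encoding the file was actually saved in.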

Switch from current encoding to specified encoding not supported.

You get this error if your XML file was saved as double-byte Unicode (UTF-16) but declares an 8-bit encoding (Windows-1252, ISO-8859-1, or UTF-8).

You also get this error if your XML file was saved as single-byte ANSI (or ASCII) but declares a double-byte encoding (UTF-16).
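Both mismatches can be reproduced with a short sketch in Python's standard xml.etree.ElementTree module. The parser there reports the problem with its own wording rather than the message quoted above, so the sketch only checks whether parsing succeeds. The helpers `parses` and `build` are hypothetical names introduced for the example.

```python
import xml.etree.ElementTree as ET

def parses(data: bytes) -> bool:
    """Return True if the bytes parse as XML, False on a parse error."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False

def build(declared: str, actual: str) -> bytes:
    """Encode a small document as `actual` while declaring `declared`."""
    text = f'<?xml version="1.0" encoding="{declared}"?><note>hello</note>'
    return text.encode(actual)

print(parses(build("UTF-16", "utf-16")))      # declaration matches the bytes
print(parses(build("ISO-8859-1", "utf-16")))  # two-byte file, 8-bit declaration
print(parses(build("UTF-16", "iso-8859-1")))  # 8-bit file, two-byte declaration
```

Note that the document text here is plain ASCII: the mismatch alone is enough to stop the parser, even without any special characters.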

Windows Notepad

Windows Notepad has traditionally saved files as single-byte ANSI by default (recent Windows 10 and 11 versions default to UTF-8 instead).

If you select "Save as...", you can specify double-byte Unicode (UTF-16).
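If you are unsure how an editor saved a file, one option is to look at its first bytes. The Python sketch below checks for a byte order mark (BOM). It assumes the editor writes one, which is common for UTF-16 files but not guaranteed, so a result of None means "unknown" rather than "ANSI". The function name `sniff_bom` is made up for the example, and UTF-32 marks are not handled.

```python
import codecs
from typing import Optional

# Byte order marks that editors commonly write at the start of a file.
_BOMS = [
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 (little-endian)"),
    (codecs.BOM_UTF16_BE, "UTF-16 (big-endian)"),
]

def sniff_bom(data: bytes) -> Optional[str]:
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None

sample = codecs.BOM_UTF16_LE + "<note/>".encode("utf-16-le")
print(sniff_bom(sample))
print(sniff_bom(b"<note/>"))
```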

Save the XML file below as Unicode (note that the document does not contain an encoding attribute):

<?xml version="1.0"?>
<note>
<from>saina</from>
<to>Guru</to>
<message>India: अआइ. French: éèê</message>
</note>

The file above will NOT generate an error, but it will if you specify an 8-bit encoding.

The encoding below will give an error message:

<?xml version="1.0" encoding="windows-1252"?>

The encoding below will give an error message:

<?xml version="1.0" encoding="ISO-8859-1"?>

The encoding below will give an error message:

<?xml version="1.0" encoding="UTF-8"?>

The encoding below will NOT give an error:

<?xml version="1.0" encoding="UTF-16"?>


Conclusion

  • Always use the encoding attribute
  • Use an editor that supports encoding
  • Make sure you know what encoding the editor uses
  • Use the same encoding in your encoding attribute
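One way to follow these rules without relying on an editor is to let a program write the file. This is a minimal sketch with Python's standard xml.etree.ElementTree module; the file name, element content, and temporary directory are illustrative assumptions.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

note = ET.Element("note")
ET.SubElement(note, "message").text = "French: é è ê"

path = os.path.join(tempfile.mkdtemp(), "note.xml")

# One `encoding` argument controls both the declaration and the bytes written,
# so the two cannot disagree.
ET.ElementTree(note).write(path, encoding="UTF-8", xml_declaration=True)

with open(path, "rb") as f:
    raw = f.read()

print(raw.splitlines()[0])  # the generated XML declaration
print(ET.parse(path).getroot()[0].text == "French: é è ê")  # round-trip check
```

Because the declaration is generated from the same setting that encodes the text, the file reads back without either of the errors described in this chapter.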
