If you have ever seen strange characters in online content where they do not belong, you have probably run into an encoding issue. This article explains how to solve some of these issues when working with the PressPage platform.
What is encoding? Computers store all information as bytes, and that includes text in every alphabet. A character encoding (often loosely called a character set) is the mapping of specific characters (such as letters and numbers) to specific bytes.
The problem is that there is not just one encoding. This means that a certain set of bytes may represent the letter 'A' in one encoding, and something wildly different in another.
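To make this concrete, here is a minimal Python sketch that decodes one and the same byte with three different encodings. The byte value 0xC1 and the three encodings are chosen purely for illustration; they are not specific to PressPage.

```python
# One byte, interpreted with three different encodings.
raw = bytes([0xC1])

as_ebcdic = raw.decode("cp037")          # EBCDIC, used on IBM mainframes
as_windows_1252 = raw.decode("cp1252")   # Windows-1252 (Western European)
as_windows_1251 = raw.decode("cp1251")   # Windows-1251 (Cyrillic)

print(as_ebcdic, as_windows_1252, as_windows_1251)  # A Á Б
```

The byte never changes; only the table used to look it up does.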
You may have heard of encoding standards such as ASCII, which has been around since the sixties. Other well-known encodings include ISO-8859-1 and Windows-1252 (popularly, if not quite accurately, known as ANSI).
Since 2008, UTF-8 has been the most widely used encoding for web pages, and it has become the de facto standard for encoding online. More than 95 percent of all web pages use UTF-8, and for some languages the share is close to 100 percent.
If you look at the source code of web pages or XML files, you may also see a reference to the encoding used. For example, the HTML tag
<meta charset="utf-8"> instructs the browser to interpret the document as UTF-8 encoded.
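As a rough illustration of how a program can read that declaration, the sketch below uses Python's standard html.parser module to pick up the charset attribute of a meta tag. The names CharsetFinder and declared_charset are made up for this example, and real browsers apply more elaborate detection rules than this.

```python
from html.parser import HTMLParser


class CharsetFinder(HTMLParser):
    """Remembers the charset of the first <meta charset="..."> tag it sees."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        if tag == "meta" and self.charset is None:
            self.charset = dict(attrs).get("charset")


def declared_charset(html_text):
    """Return the declared charset, or None if the page declares none."""
    finder = CharsetFinder()
    finder.feed(html_text)
    return finder.charset


sample = '<html><head><meta charset="utf-8"><title>Demo</title></head></html>'
print(declared_charset(sample))  # utf-8
```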
The Windows-1252 encoding is used by default in older versions of programs in the Microsoft Office suite, such as Word and Excel. So, as you can imagine, this can cause issues when you import content generated from Word or Excel into a web application, or the other way around: when you export web content and then open it in Word or Excel.
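The mismatch is easy to reproduce. In the Python sketch below, a made-up contact name is stored as UTF-8 and then read back as if it were Windows-1252, which is roughly what happens when a legacy program guesses the wrong encoding.

```python
original = "Søren Muñoz"  # hypothetical name, for illustration only

utf8_bytes = original.encode("utf-8")   # what a web application writes
misread = utf8_bytes.decode("cp1252")   # what a Windows-1252 reader sees

print(misread)  # SÃ¸ren MuÃ±oz

# As long as nothing was lost along the way, reversing the wrong step
# recovers the original text.
repaired = misread.encode("cp1252").decode("utf-8")
print(repaired == original)  # True
```

Each accented letter takes two bytes in UTF-8, and Windows-1252 shows each of those bytes as a separate character. That is why a single letter turns into a pair of odd symbols.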
Like most other web applications, PressPage uses UTF-8 for the content it exports, such as CSV files of mail contact lists.
If you use a legacy version of Microsoft Excel, opening such a CSV file may cause some strange characters to appear, especially if there are letters with accents or diacritics (such as a tilde, or the letter ø) in the source text.
The good news is that this can be solved by instructing Excel to import the file as UTF-8. This web page contains step-by-step instructions for Microsoft Excel 2007.
Similarly, if you are working in Excel or Word and you wish to export your file so you can import it into a web application later, make sure to choose UTF-8 encoding when you save the file.
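If you would rather convert a file outside of Excel or Word, a small script can do the re-encoding. The following is a sketch under stated assumptions, not an official PressPage tool: the function name reencode and the file names are hypothetical, and the source file is assumed to really be Windows-1252.

```python
import os
import tempfile


def reencode(src, dst, src_encoding, dst_encoding):
    """Read src as src_encoding and write the same text to dst as dst_encoding."""
    # newline="" leaves the line endings exactly as they were.
    with open(src, "r", encoding=src_encoding, newline="") as infile:
        text = infile.read()
    with open(dst, "w", encoding=dst_encoding, newline="") as outfile:
        outfile.write(text)


with tempfile.TemporaryDirectory() as folder:
    legacy = os.path.join(folder, "contacts_legacy.csv")    # hypothetical file
    for_web = os.path.join(folder, "contacts_utf8.csv")
    for_excel = os.path.join(folder, "contacts_excel.csv")

    # Simulate a CSV file saved by an older Office program.
    with open(legacy, "wb") as f:
        f.write("name\r\nSøren\r\n".encode("cp1252"))

    # Windows-1252 to UTF-8, for importing into a web application.
    reencode(legacy, for_web, "cp1252", "utf-8")

    # UTF-8 to UTF-8 with a byte order mark ("utf-8-sig").
    reencode(for_web, for_excel, "utf-8", "utf-8-sig")

    with open(for_excel, "rb") as f:
        print(f.read()[:3])  # b'\xef\xbb\xbf'
```

The second conversion adds a byte order mark at the start of the file. Excel generally treats that marker as a hint that a CSV file is UTF-8, although behavior can differ between versions, so it is worth testing with the version you use.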