Should we HTML-encode special characters before storing them in the database?

Don’t HTML-encode your characters before storage. You should store as pure a form of your data as possible. HTML encoding is needed because you are going to display the data on an HTML page, so do the encoding during the processing of the data to create the page. For example, suppose you decide you’re also … Read more
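
As a minimal sketch of "store raw, encode on output": the snippet below keeps the value exactly as entered and applies HTML escaping only when building a page. The names `stored_comment`, `render_html` and `render_plain_text` are made up for illustration, and the "database" is just a variable.

```python
import html

# Hypothetical stored value: kept exactly as the user typed it.
stored_comment = 'Tom & Jerry <3 "cheese"'


def render_html(comment: str) -> str:
    """Escape only at the moment the value is placed into an HTML page."""
    return "<p>" + html.escape(comment) + "</p>"


def render_plain_text(comment: str) -> str:
    """A non-HTML output (for example a plain-text email) uses the raw value."""
    return comment


print(render_html(stored_comment))
# <p>Tom &amp; Jerry &lt;3 &quot;cheese&quot;</p>
print(render_plain_text(stored_comment))
# Tom & Jerry <3 "cheese"
```

Had the value been stored already escaped, the plain-text output would have needed an extra decoding step.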

What is the most efficient binary to text encoding?

This really depends on the nature of the binary data, and the constraints that “text” places on your output. First off, if your binary data is not compressed, try compressing before encoding. We can then assume that the distribution of 1/0 or individual bytes is more or less random. Now: why do you need text? … Read more
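
To make the "compress first, then encode" advice concrete, here is a rough sketch that compresses a sample payload with `zlib` and compares the size of three text encodings from Python's standard `base64` module. The payload and the choice of encodings are assumptions for illustration; which alphabet is acceptable depends on what "text" has to mean for your output.

```python
import base64
import zlib

# Hypothetical payload: 1024 bytes of very repetitive data.
raw = b"example payload " * 64
compressed = zlib.compress(raw, 9)

# Candidate binary-to-text encodings from the standard library.
encodings = {
    "hex": base64.b16encode,     # 2 characters per byte
    "base64": base64.b64encode,  # 4 characters per 3 bytes
    "base85": base64.b85encode,  # 5 characters per 4 bytes
}

sizes = {}
for name, encode in encodings.items():
    text = encode(compressed)
    sizes[name] = len(text)
    ratio = len(text) / len(compressed)
    print(f"{name}: {len(text)} chars, {ratio:.2f}x the compressed size")

# The encoded text still round-trips back to the original bytes.
restored = zlib.decompress(base64.b85decode(base64.b85encode(compressed)))
```

Denser alphabets give less overhead but use more punctuation characters, which may not survive every transport.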

Why is base128 not used? [closed]

The problem is that at least 32 characters of the ASCII character set are ‘control characters’ which may be interpreted by the receiving terminal. E.g., there’s the BEL (bell) character that makes the receiving terminal chime. There are also the STX (Start Of Text) and EOT (End Of Transmission) characters, which do exactly what their names imply. … Read more
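
The count of control characters can be checked with a short sketch. It assumes the usual definition of ASCII control characters: code points 0 to 31 plus 127 (DEL).

```python
# ASCII control characters: code points 0-31 and 127 (DEL).
control = [c for c in range(128) if c < 32 or c == 127]
printable = [c for c in range(128) if c not in control]

print(len(control))    # 33
print(len(printable))  # 95, including the space character

# Two of the control codes mentioned above.
BEL = 0x07  # bell
EOT = 0x04  # end of transmission
print(BEL in control, EOT in control)  # True True
```

With only 95 printable characters available, an encoding that needs 128 distinct safe symbols cannot stay within 7-bit ASCII.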

Emoji value range

The Unicode standard’s Unicode® Technical Report #51 includes a list of emoji (emoji-data.txt):
…
21A9 ; text ; L1 ; none ; j # V1.1 (↩) LEFTWARDS ARROW WITH HOOK
21AA ; text ; L1 ; none ; j # V1.1 (↪) RIGHTWARDS ARROW WITH HOOK
231A ; emoji ; L1 ; none ; j … Read more
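
Lines in the format quoted above can be read with a few lines of Python. This is only a sketch based on the two complete sample lines: the function name `parse_line` and the label `style` for the second field are assumptions, and the real file may contain entries this does not handle.

```python
# The two complete sample lines quoted above.
sample = """\
21A9 ; text ; L1 ; none ; j # V1.1 (↩) LEFTWARDS ARROW WITH HOOK
21AA ; text ; L1 ; none ; j # V1.1 (↪) RIGHTWARDS ARROW WITH HOOK
"""


def parse_line(line: str):
    """Split one line into (code point, second field, trailing comment)."""
    data, _, comment = line.partition("#")
    fields = [field.strip() for field in data.split(";")]
    code_point = int(fields[0], 16)  # first field is a hexadecimal code point
    style = fields[1]                # "text" or "emoji" in the sample
    return code_point, style, comment.strip()


parsed = [parse_line(line) for line in sample.splitlines()]
for code_point, style, comment in parsed:
    print(f"U+{code_point:04X} {chr(code_point)} {style}")
```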

How do I correct the character encoding of a file?

Follow these steps with Notepad++:
1- Copy the original text
2- In Notepad++, open a new file, change Encoding -> pick an encoding you think the original text follows. Also try the encoding “ANSI”, as Unicode files are sometimes read as ANSI by certain programs
3- Paste
4- Then convert to Unicode by going … Read more
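
The same idea (reinterpret the bytes under the encoding the text was really written in) can be sketched outside Notepad++. The example assumes the common case of UTF-8 text that was misread as Windows-1252, which is what "ANSI" means on many Western systems; the helper name `repair` is made up for illustration.

```python
def repair(garbled: str, misread_as: str = "cp1252", actual: str = "utf-8") -> str:
    """Undo a wrong decode: recover the original bytes, then decode correctly."""
    return garbled.encode(misread_as).decode(actual)


original = "café"

# Simulate the problem: UTF-8 bytes displayed as if they were Windows-1252.
garbled = original.encode("utf-8").decode("cp1252")
print(garbled)  # cafÃ©

repaired = repair(garbled)
print(repaired)  # café
```

If the guess about the wrong encoding is incorrect, `encode` or `decode` will typically raise an error or produce different garbage, so trying a few candidate encodings is part of the process.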

Why does base64 encoding require padding if the input length is not divisible by 3?

Your conclusion that padding is unnecessary is right. It’s always possible to determine the length of the input unambiguously from the length of the encoded sequence. However, padding is useful in situations where base64 encoded strings are concatenated in such a way that the lengths of the individual sequences are lost, as might happen, for … Read more
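
Both points can be illustrated with a small sketch: a single unpadded string decodes fine because the missing padding follows from its length, but two unpadded strings glued together no longer decode to the original data. The helper names `encode_unpadded` and `decode_unpadded` are made up for this example.

```python
import base64


def encode_unpadded(data: bytes) -> str:
    """Base64-encode and drop the trailing '=' padding."""
    return base64.b64encode(data).decode("ascii").rstrip("=")


def decode_unpadded(text: str) -> bytes:
    """Restore the padding implied by the length, then decode."""
    padding = "=" * (-len(text) % 4)
    return base64.b64decode(text + padding)


a = encode_unpadded(b"hi")     # "aGk"     (padded form: "aGk=")
b = encode_unpadded(b"there")  # "dGhlcmU" (padded form: "dGhlcmU=")

# One sequence on its own: the length tells us how much padding was removed.
print(decode_unpadded(a))  # b'hi'
print(decode_unpadded(b))  # b'there'

# Concatenated without padding, the boundary between the two is lost.
joined = decode_unpadded(a + b)
print(joined == b"hithere")  # False
```

With padding kept, each piece ends on a 4-character boundary, so the seam between concatenated pieces stays recoverable.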

Base64 Encode String in VBScript

I was originally using some VBScript code from Antonin Foller: Base64 Encode VBS Function and Base64 Decode VBS Function. Searching Antonin’s site, I saw he had some code for quoted printable encoding, using the CDO.Message object, so I tried that. Finally, I ported the code mentioned in Mark’s answer to VBScript (also used some code … Read more