What is base64 and how does it work?

Base64 is a binary-to-text encoding scheme that represents binary data in an ASCII string format by translating it into a radix-64 representation

Posted by Gregory Pacheco on December 16, 2022
Base64 is a binary-to-text encoding scheme that represents binary data as an ASCII string by translating it into a radix-64 representation. It was developed to carry 8-bit binary data over communication channels that reliably support only ASCII text, such as email. It does not make the data more compact: every 3 input bytes become 4 output characters, so the encoded form is about 33% larger than the original.

Here's how base64 works:

  1. The input data is divided into blocks of 3 bytes (24 bits). If the length of the input is not a multiple of 3 bytes, the final block holds only 1 or 2 bytes, and zero bits are appended to it until its length is a multiple of 6 bits.
  2. Each full block of 3 bytes is then divided into 4 groups of 6 bits. A final block of 1 byte yields 2 groups, and a final block of 2 bytes yields 3 groups.
  3. Each group of 6 bits (a value from 0 to 63) is then used to index a character in the base64 character set. The set consists of 64 characters: A-Z, a-z, 0-9, "+" and "/". A 65th character, "=" (equal sign), is not part of the index table. It is appended as padding so that the output length is a multiple of 4: one "=" when the final block held 2 bytes, two when it held 1 byte.
  4. The resulting base64-encoded string is made up of these indexed characters, followed by any padding.
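
The steps above can be sketched in a few lines of Python. This is an illustrative sketch, not a reference implementation: the names `ALPHABET` and `b64_encode` are made up for this post, and real code should use the standard library's `base64` module instead.

```python
# The 64 index characters of the standard base64 alphabet.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def b64_encode(data: bytes) -> str:
    """Encode bytes to base64 by following the four steps literally."""
    out = []
    # Step 1: walk the input in blocks of up to 3 bytes.
    for i in range(0, len(data), 3):
        block = data[i:i + 3]
        # Pack the block into a 24-bit integer; missing bytes become zero bits.
        n = int.from_bytes(block + b"\x00" * (3 - len(block)), "big")
        # Steps 2 and 3: split into four 6-bit groups and index the alphabet.
        chars = [ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # A block of k bytes carries k + 1 meaningful characters;
        # the remaining positions are filled with "=" padding.
        keep = len(block) + 1
        out.append("".join(chars[:keep]) + "=" * (4 - keep))
    # Step 4: the encoded string is the concatenation of all blocks.
    return "".join(out)


print(b64_encode(b"Hello, World!"))
```

Packing each block into a single integer keeps the bit arithmetic short; a production encoder would normally process the data in bulk rather than one block at a time.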

Here's an example of how base64 encoding works using the following input data: "Hello, World!"

  1. The input data is converted to binary form:
    01001000 01100101 01101100 01101100 01101111 00101100 00100000 01010111 01101111 01110010 01101100 01100100 00100001
  2. The binary data is divided into blocks of 3 bytes (24 bits). The input is 13 bytes long, so the last block holds a single byte:
    01001000 01100101 01101100
    01101100 01101111 00101100
    00100000 01010111 01101111
    01110010 01101100 01100100
    00100001
  3. Each block of 3 bytes is divided into 4 groups of 6 bits. The last block has only 8 bits, so four zero bits are appended to fill its second group:
    010010 000110 010101 101100
    011011 000110 111100 101100
    001000 000101 011101 101111
    011100 100110 110001 100100
    001000 010000
  4. Each group of 6 bits is used to index a specific character in the base64 character set. The last block produces only 2 characters, so two "=" padding characters are appended:
    S G V s
    b G 8 s
    I F d v
    c m x k
    I Q = =

The resulting base64-encoded string is "SGVsbG8sIFdvcmxkIQ=="
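
To check a worked example like this one by hand, it can help to print the 6-bit groups and compare the final string against Python's standard `base64` module. The helper name `six_bit_groups` below is hypothetical and exists only for this illustration.

```python
import base64


def six_bit_groups(data: bytes) -> list[str]:
    """Return the input as a list of 6-bit groups, zero-padded at the end."""
    bits = "".join(f"{byte:08b}" for byte in data)
    # Append zero bits until the total length is a multiple of 6.
    bits += "0" * (-len(bits) % 6)
    return [bits[i:i + 6] for i in range(0, len(bits), 6)]


data = b"Hello, World!"
groups = six_bit_groups(data)

# Print four groups per line, matching the layout of the example.
for i in range(0, len(groups), 4):
    row = groups[i:i + 4]
    print(" ".join(row), "->", [int(g, 2) for g in row])

# The standard library gives the final encoded string, padding included.
print(base64.b64encode(data).decode("ascii"))
```

Each printed index is the position of the corresponding output character in the base64 character set.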
                                    

To decode a base64-encoded string, the process is reversed. The string is divided into blocks of 4 characters, and each character is looked up in the base64 character set to recover its 6-bit value. The four 6-bit values of a block are concatenated into 24 bits and split back into 3 bytes. Any "=" padding in the last block signals that it carries fewer bytes: one "=" means 2 bytes, two mean 1 byte, and the leftover zero bits are discarded.
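
A decoder following this description might look like the sketch below. The names `ALPHABET` and `b64_decode` are invented for this post, and the sketch assumes padded, standard-alphabet input with no whitespace; Python's built-in `base64.b64decode` is the right tool for real use.

```python
# The 64 index characters of the standard base64 alphabet.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def b64_decode(text: str) -> bytes:
    """Decode a padded base64 string block by block."""
    if len(text) % 4 != 0:
        raise ValueError("padded base64 length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(text), 4):
        block = text[i:i + 4].rstrip("=")
        pad = 4 - len(block)
        if pad > 2:
            raise ValueError("a block can have at most two '=' characters")
        # Concatenate the 6-bit values of the characters in this block.
        # ALPHABET.index raises ValueError for any character outside the set.
        n = 0
        for ch in block:
            n = (n << 6) | ALPHABET.index(ch)
        # Restore the bit positions lost to padding so n spans 24 bits.
        n <<= 6 * pad
        # Split into 3 bytes and drop the ones that were only padding.
        out += n.to_bytes(3, "big")[:3 - pad]
    return bytes(out)


print(b64_decode("SGVsbG8sIFdvcmxkIQ=="))
```

The sketch does not check that the discarded padding bits are zero, so it accepts some non-canonical inputs that a stricter decoder would reject.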