BASE 64 ENCODING AND PRIVACY ENHANCED MAIL

INTRODUCTION
We live in a world of electronics and computers, and the internet and email are now among the most widely used communication media. What is the purpose of communication without security? Communication over the internet needs security protocols, and a number of encoding schemes are used worldwide alongside them.
This seminar presents an encoding scheme called base 64 and several of its implementations, including PEM. Base64 is a generic term for any number of similar encoding schemes that encode binary data by treating it numerically and translating it into a base 64 representation. Base64 encoding schemes are commonly used when binary data needs to be stored or transferred over media that are designed to deal with textual data, so that the data remains intact without modification during transport. The most common example is email. Besides being the default encoding for file attachments in Multipurpose Internet Mail Extensions (MIME), base64 is also used in a number of other applications, such as storing complex data in XML.
PEM, UTF-7, OpenPGP and MIME are implementations that use base 64, some of them in combination with encryption schemes. PEM was the first of these to use base 64 in securing email.
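The advantage of base 64 is that it lets binary data pass unchanged through text-only channels and, being a simple algorithm, it is easy to implement. Base 64 by itself is an encoding rather than encryption: anyone can decode it, so confidentiality has to come from the encryption scheme used together with it.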

ENCRYPTION
Encryption refers to algorithmic schemes that encode plain text into a non-readable form, or cipher text, providing privacy. The receiver of the encrypted text uses a "key" to decrypt the message, returning it to its original plain text form. The key is the trigger mechanism of the algorithm. A cipher (or cypher) is a pair of algorithms that perform the encryption and the reversing decryption. The detailed operation of a cipher is controlled both by the algorithm and, in each instance, by a key. This is a secret parameter (ideally known only to the communicants) for a specific message exchange context. Keys are important, as ciphers without variable keys can be trivially broken with only the knowledge of the cipher used and are therefore useless (or even counter-productive) for most purposes. Historically, ciphers were often used directly for encryption or decryption without additional procedures such as authentication or integrity checks.

HISTORY OF ENCRYPTION
The earliest forms of secret writing required little more than local pen and paper analogs, as most people could not read. More literacy, or literate opponents, required actual cryptography. The main classical cipher types are transposition ciphers, which rearrange the order of letters in a message (e.g., 'hello world' becomes 'ehlol owrdl' in a trivially simple rearrangement scheme), and substitution ciphers, which systematically replace letters or groups of letters with other letters or groups of letters (e.g., 'fly at once' becomes 'gmz bu podf' by replacing each letter with the one following it in the Latin alphabet). Simple versions of either offered little confidentiality from enterprising opponents, and still do. An early substitution cipher was the Caesar cipher, in which each letter in the plaintext was replaced by a letter some fixed number of positions further down the alphabet. It was named after Julius Caesar who is reported to have used it, with a shift of 3, to communicate with his generals during his military campaigns, just like Excess-3 code in Boolean algebra. There is record of several early Hebrew ciphers as well. The earliest known use of cryptography is some carved cipher text on stone in Egypt (ca 1900 BC), but this may have been done for the amusement of literate observers. The next oldest is bakery recipes from Mesopotamia.
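The letter-shifting substitution described above can be sketched in a few lines of Python. This is only an illustration: the function name caesar_shift is hypothetical, and the sketch assumes that only the 26 unaccented Latin letters are shifted while spaces and other characters pass through unchanged.

```python
import string

def caesar_shift(text: str, shift: int) -> str:
    """Replace each Latin letter with the one `shift` positions later, wrapping around at z."""
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    k = shift % 26
    # Build a translation table: a -> letter k places later, and so on.
    table = str.maketrans(
        lower + upper,
        lower[k:] + lower[:k] + upper[k:] + upper[:k],
    )
    return text.translate(table)

print(caesar_shift("fly at once", 1))   # gmz bu podf
print(caesar_shift("gmz bu podf", -1))  # fly at once
```

A shift of 1 reproduces the 'fly at once' example, and a shift of 3 gives the cipher attributed to Caesar. Since there are only 25 useful shifts, trying them all breaks the cipher, which is why such schemes offer little confidentiality.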
Until the advent of the Internet, encryption was rarely used by the public and was largely a military tool. The development of digital computers and electronics after WWII made much more complex ciphers possible. Furthermore, computers allowed for the encryption of any kind of data representable in a binary format, unlike classical ciphers, which only encrypted written language texts; this was new and significant. Today, with online marketing, banking, healthcare and other services, even the average householder is aware of encryption. The process of hiding information is now collectively denoted by the term cryptography. The term is derived from the Greek language: 'kryptos' means hidden or secret, and 'graphein' means to write.
Modern cryptography intersects the disciplines of mathematics, computer science, and engineering. Applications of cryptography include ATM cards, computer passwords, and electronic commerce.

BASE 64: AN INTRODUCTION
Base64 is a generic term for any number of similar encoding schemes that encode binary data by treating it numerically and translating it into a base 64 representation. The term Base64 originates from a specific MIME content transfer encoding. Base64 encoding schemes are commonly used when binary data needs to be stored or transferred over media that are designed to deal with textual data. This ensures that the data remains intact without modification during transport. For this reason, Base 64 encoding is commonly used in email systems. The email systems that were developed back in the time of Arpanet were designed to support only letters (A-Z, a-z), numbers (0-9) and some limited punctuation marks. So in order to transfer files that can contain more than letters and digits (e.g., a picture.jpg file), Base 64 encoding is used.
Since its introduction, Base64 encoding has quickly gained popularity. Besides being the default encoding for files sent as attachments by Multipurpose Internet Mail Extensions (MIME), it is now used in a number of other places, including storing complex data in XML and implementing HTTP basic authentication in web servers.
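As an illustration of the HTTP basic authentication use mentioned above, the following Python sketch builds the header value with the standard base64 module. The helper name basic_auth_header and the sample credentials are assumptions made for the example, not part of the original report.

```python
import base64

def basic_auth_header(user: str, password: str) -> str:
    """Join user and password with ':' and Base64-encode the result."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Authorization: Basic {token}"

header = basic_auth_header("Aladdin", "open sesame")
print(header)  # Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==

# Anyone who sees the header can reverse the encoding without a key.
decoded = base64.b64decode(header.split()[-1]).decode("utf-8")
print(decoded)  # Aladdin:open sesame
```

The second half of the sketch shows that the encoding is reversible by anyone, so basic authentication is normally only used over an encrypted connection.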

HISTORY AND IMPLEMENTATIONS OF BASE64
PEM (PRIVACY ENHANCED MAIL)
Privacy Enhanced Mail (PEM) is an early IETF proposal for securing email using public key cryptography. Although PEM became an IETF proposed standard, it was never widely deployed or used.
MIME
Multipurpose Internet Mail Extensions (MIME) is an Internet standard that extends the format of e-mail to support text in character sets other than ASCII and non-text attachments.
UTF 7
UTF-7 (7-bit Unicode Transformation Format) is a variable-length character encoding that was proposed for representing Unicode text using a stream of ASCII characters, for example for use in Internet E-mail messages. UTF-7 was first proposed as an experimental protocol in RFC 1642, A Mail-Safe Transformation Format of Unicode.
OPEN PGP
OpenPGP is a non-proprietary protocol for encrypting email using public key cryptography. It is based on PGP as originally developed by Phil Zimmermann. The OpenPGP protocol defines standard formats for encrypted messages, signatures, and certificates for exchanging public keys. Beginning in 1997, the OpenPGP Working Group was formed in the Internet Engineering Task Force (IETF) to define this standard. Over the past decade, PGP, and later OpenPGP, has become the standard for nearly all of the world's encrypted email.

BASE 64
Base64 is a different way of interpreting bits of data in order to transmit that data over a text-only medium, such as the body of an e-mail. In an 8-bit character set such as extended ASCII, there are 256 possible characters. However, only a fraction of these characters are actually printable and readable when you are looking at them onscreen or sending them in an e-mail. We need a way to convert unreadable characters into readable characters, do something with them (e.g., send them in an e-mail), and convert them back to their original format.
How do you convert unreadable, nonprintable characters into readable, printable characters? There are many ways to do this, but the way we are covering now is base64 encoding. The 256 characters in the character set are numbered 0 through 255. For the tech savvy, this is the same as 2^8, that is, 8 binary placeholders, or a byte. So for any such character, you simply need one byte to represent it. As far as a computer is concerned, there is no difference between a character and a number between 0 and 255 (which is a string of 8 binary placeholders); the only difference is how it is interpreted. Because we are now detached from characters, you can also apply these same techniques to binary data, for example a picture or an executable file. All you are doing is interpreting data one byte at a time.
The problem with representing data one byte at a time in a readable manner is that there are not 256 readable characters in the character set, so we cannot print a character for each of the 256 combinations that a byte can offer. So we need to take a different approach to looking at the bits in a byte. What if, instead of looking at a whole byte, we looked at half of a byte, or 4 bits (also known as a nibble), at a time? This is entirely possible because 2^4 is equal to 16, and there are certainly sixteen readable characters that we could use to represent each value of a nibble. This is hexadecimal (hex) notation, which uses the characters 0-9 and A-F.
The problem with using hex is that, since you are using one character (which is, remember, one byte long in storage space) to represent every four bits, anything you translate into hex will be exactly twice as big as the original data. This might not seem like a problem for a small message, but imagine you are trying to send an image or an executable. The original size of perhaps a megabyte or more is now doubled. Sending this over email or a slow Internet connection will take twice as long.
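The doubling caused by hex can be checked directly. The short Python sketch below uses the built-in bytes.hex and bytes.fromhex methods; the four sample byte values are arbitrary and chosen only for illustration.

```python
# Four arbitrary byte values, some of which are not printable characters.
data = bytes([0, 200, 255, 10])

hex_text = data.hex()  # two hex digits per byte
print(hex_text)                   # 00c8ff0a
print(len(data), len(hex_text))   # 4 8

# The conversion is lossless: the original bytes come back unchanged.
restored = bytes.fromhex(hex_text)
print(restored == data)           # True
```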

BASE64 AS AN ALTERNATIVE
We now know that using 16 different characters to represent each half byte is a viable option, but not our ideal option because it is only half as space efficient as a byte. So how else can we dice bytes up to get our goal: readable characters for any value of 0 to 255? Instead of looking at one byte at a time, and trying to chop that byte up, take several bytes and see what we can do with them.
As you can easily see, using three bytes, we have a total of 24 bits. How else can we chop 24 bits up? If instead of 3 bytes of 8 bits each we use 4 "clumps" of 6 bits each, what are we left with? Each clump can take 2^6 = 64 different values. So instead of needing 3 instances of a character that can represent any of 256 different combinations, we now need just 4 instances of a character that can represent any of 64 different combinations. The same bits as in the above table fit into the table below.
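The regrouping of 24 bits into four 6-bit values can be sketched as follows. The function name regroup and the sample input b"bas" are assumptions chosen for illustration.

```python
def regroup(three_bytes: bytes) -> list:
    """Split 3 bytes (24 bits) into four 6-bit values, each in the range 0..63."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly 3 bytes")
    n = int.from_bytes(three_bytes, "big")  # the 24 bits as one integer
    # Shift each 6-bit clump down to the low bits and mask off the rest.
    return [(n >> shift) & 0x3F for shift in (18, 12, 6, 0)]

print(format(int.from_bytes(b"bas", "big"), "024b"))  # 011000100110000101110011
print(regroup(b"bas"))                                # [24, 38, 5, 51]
```

No bits are added or lost in this step; the same 24 bits are simply read in groups of six instead of groups of eight.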

BASE 64 ALPHABETS
The particular choice of character set selected for the 64 characters required for the base varies between implementations. The general rule is to choose a set of 64 characters that is both part of a subset common to most encodings, and also printable. This combination leaves the data unlikely to be modified in transit through information systems, such as email, that were traditionally not 8-bit clean. For example, MIME's Base64 implementation uses uppercase A-Z (26 characters), lowercase a-z (26 characters), 0-9 (10 characters), '+' (1 character) and '/' (1 character). 26 + 26 + 10 + 1 + 1 = 64, just the number we need. As you can surmise, base64 is still less space efficient than using a full byte, but instead of hex's double space usage, base64 uses only one and a third times as much space. In other words, for every 3 bytes, you must have 4 base64 characters. All of the characters listed above are easily readable. Other variations, usually derived from Base64, share this property but differ in the symbols chosen for the last two values.
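A short Python sketch can confirm both the size of the MIME alphabet and the four-thirds size ratio. The constant name ALPHABET and the 768-byte sample input are assumptions made for the example; the encoding itself is done by the standard base64 module.

```python
import base64
import string

# A-Z, a-z, 0-9, '+' and '/', in the order used by MIME's Base64.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
print(len(ALPHABET))  # 64

# 768 bytes covering every possible byte value three times over.
data = bytes(range(256)) * 3
encoded = base64.b64encode(data).decode("ascii")

print(set(encoded) <= set(ALPHABET))  # True: only alphabet characters appear
print(len(data), len(encoded))        # 768 1024
```

Because 768 is a multiple of 3, no padding is needed here and the output is exactly 4/3 of the input length.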

ENCODING INTO BASE 64
BASE 64 ENCODING ALGORITHM
The Base64 encoding process is to:
Divide the input bytes stream into blocks of 3 bytes.
Divide 24 bits of each 3-byte block into 4 groups of 6 bits.
Map each group of 6 bits to 1 printable character, based on the 6-bit value using the Base64 character set map.
If the last 3-byte block has only 1 byte of input data, pad it with 2 zero bytes (\x00\x00). After encoding it as a normal block, overwrite the last 2 characters with 2 equal signs (==), so the decoding process knows that 2 zero bytes were padded.
If the last 3-byte block has only 2 bytes of input data, pad it with 1 zero byte (\x00). After encoding it as a normal block, overwrite the last character with 1 equal sign (=), so the decoding process knows that 1 zero byte was padded.
Carriage return (\r) and new line (\n) characters are inserted into the output character stream to keep lines short (MIME limits encoded lines to 76 characters). They are ignored by the decoding process.
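The steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the names ALPHABET and b64_encode are hypothetical, the MIME alphabet is assumed, and the final line-break step is left out for brevity.

```python
import string

# Assumed alphabet: the MIME Base64 character set.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def b64_encode(data: bytes) -> str:
    out = []
    for i in range(0, len(data), 3):            # blocks of 3 bytes
        block = data[i:i + 3]
        missing = 3 - len(block)                # 0, 1 or 2 bytes short
        n = int.from_bytes(block + b"\x00" * missing, "big")  # pad with zero bytes
        # Four groups of 6 bits, each mapped to one printable character.
        chars = [ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        if missing:
            chars[-missing:] = "=" * missing    # overwrite padded positions with '='
        out.extend(chars)
    return "".join(out)

print(b64_encode(b"base64 is fun!!"))  # YmFzZTY0IGlzIGZ1biEh
print(b64_encode(b"f"))                # Zg==
print(b64_encode(b"fo"))               # Zm8=
```

The output can be compared against Python's own base64.b64encode, which follows the same rules.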

BASE64 DECODING
We will now tackle translating from base64 characters back into normal bytes. We will use the same mapping of values (0 through 63) to base64 characters (A-Z, a-z, 0-9, '+', and '/'). The reverse process is relatively simple now that we know how to perform the forward operation. Let's start with the base64 string "YmFzZTY0IGlzIGZ1biEh". Right now, that makes no sense. We begin the same way, by looking up the value for each base64 character. The 6-bit values of each group of four characters are then joined into 24 bits and split back into three bytes, which here yields the text "base64 is fun!!".
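The reverse process can be sketched in Python along the same lines. The names VALUE and b64_decode are hypothetical, the MIME alphabet is assumed, and the sketch does only minimal validation of its input.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
VALUE = {ch: i for i, ch in enumerate(ALPHABET)}   # character -> value 0..63

def b64_decode(text: str) -> bytes:
    text = "".join(text.split())                   # ignore CR, LF and other whitespace
    if len(text) % 4:
        raise ValueError("length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(text), 4):               # groups of 4 characters
        group = text[i:i + 4]
        pad = group.count("=")                     # 0, 1 or 2 padded bytes
        n = 0
        for ch in group:
            n = (n << 6) | (0 if ch == "=" else VALUE[ch])  # rebuild the 24 bits
        out += n.to_bytes(3, "big")[:3 - pad]      # drop the padded zero bytes
    return bytes(out)

print(b64_decode("YmFzZTY0IGlzIGZ1biEh"))  # b'base64 is fun!!'
print(b64_decode("Zg=="))                  # b'f'
```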

APPLICATIONS
URL APPLICATIONS
Base64 encoding can be helpful when fairly lengthy identifying information is used in an HTTP environment. For example, a database persistence framework for Java objects might use Base64 encoding to encode a relatively large unique id (generally 128-bit) into a string for use as an HTTP parameter in HTTP forms or HTTP GET URLs. Also, many applications need to encode binary data in a way that is convenient for inclusion in URLs, including in hidden web form fields, and Base64 is a convenient encoding that renders the data not only compactly, but also in a form that is relatively unreadable to a casual human observer.
Using standard Base64 in a URL requires encoding of the '+' and '/' characters into special percent-encoded hexadecimal sequences ('+' = '%2B' and '/' = '%2F'), which makes the string unnecessarily longer.
For this reason, a modified Base64 for URL variant exists, in which no padding '=' is used and the '+' and '/' characters of standard Base64 are replaced by '-' and '_' respectively. URL encoders and decoders are then no longer necessary and the length of the encoded value is not affected, which leaves the same encoded form intact for use in relational databases, web forms, and object identifiers in general.
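The difference between the standard and the URL variant can be seen with Python's standard base64 and urllib.parse modules. The two sample bytes are chosen only because they produce '+' and '/' in the standard output, and the helper name restore is hypothetical.

```python
import base64
from urllib.parse import quote

raw = b"\xfb\xff"  # sample bytes that encode to '+' and '/' in standard Base64

standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")
unpadded = url_safe.rstrip("=")          # the URL variant drops the '=' padding

print(standard)                  # +/8=
print(quote(standard, safe=""))  # %2B%2F8%3D  (percent-encoding makes it longer)
print(unpadded)                  # -_8

def restore(text: str) -> bytes:
    """Re-add the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

print(restore(unpadded) == raw)  # True
```

Python's urlsafe_b64encode keeps the '=' padding, so the sketch strips it by hand and restores it before decoding.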
FOR PRIVACY PROTECTION SYSTEMS
Base 64 encoding is commonly used by proxy web sites or anonymizing websites to encode the URL of the website being visited. That is, these sites hide the names of the sites we are visiting and so help protect our privacy. They are commonly used by people all around the world to bypass country restrictions.
These systems use base 64 encoding to obscure the page URL, so that the address is not immediately readable. Since base 64 can be decoded by anyone, this hides the address from casual observers and simple filters rather than truly encrypting it. The figure represents such a site, which uses base 64 encoding on the address.
XML
XML identifiers and name tokens are encoded using two variants.
PROGRAM IDENTIFIERS
There are other variants that use '_-' or '._' when the Base64 variant string must be used within valid identifiers for programs.
REGULAR EXPRESSIONS
Another variant called modified Base64 for regexps uses '!-' instead of '*-' to replace the standard Base64 '+/', because both '+' and '*' may be reserved for regular expressions (note that '[]' used in the IRCu variant above would not work in that context).
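Variants that differ only in the last two symbols can be tried out with the altchars parameter of Python's base64.b64encode and base64.b64decode. The sample bytes below are an assumption chosen so that both replaced symbols appear in the output.

```python
import base64

raw = b"\xfb\xff"  # encodes to "+/8=" in standard Base64

# '!-' replaces '+/' in the variant for regular expressions.
regex_variant = base64.b64encode(raw, altchars=b"!-").decode("ascii")
# '_-' replaces '+/' in one of the variants for program identifiers.
ident_variant = base64.b64encode(raw, altchars=b"_-").decode("ascii")

print(regex_variant)  # !-8=
print(ident_variant)  # _-8=

# Decoding needs the same pair of replacement characters.
print(base64.b64decode(regex_variant, altchars=b"!-") == raw)  # True
```

The sketch keeps the '=' padding; a variant that forbids it would have to strip the padding after encoding and restore it before decoding.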
