Keywords: data compression, arithmetic coding, Wavelet-based algorithms Abstract. Data compression is a common requirement for most of the computerized applications. There are number of data compression algorithms, which are dedicated to compress different data formats. Even for a single data type there are number of different compression algorithms, which use different approaches.

This paper examines lossless data compression algorithm “Arithmetic Coding” In this method, a code word is not used to represent a symbol of the text. Instead it uses a fraction to represent the entire source message. The occurrence probabilities and the cumulative probabilities of a set of symbols in the source message are taken into account. The cumulative probability range is used in both compression and decompression processes. In the encoding process, the cumulative probabilities are calculated and the range is created in the beginning.

While reading the source character by character, the corresponding range of the character within the cumulative probability range is selected. Then the selected range is divided into sub parts according to the probabilities of the alphabet. Then the next character is read and the corresponding sub range is selected. In this way, characters are read repeatedly until the end of the message is encountered.

Finally a number should be taken from the final sub range as the output of the encoding process. This will be a fraction in that sub range. Therefore, the entire source message can be represented using a fraction. To decode the encoded message, the number of characters of the source message and the probability/frequency distribution are needed.

Introduction. Compression is the art of representing the information in a compact form rather than its original or uncompressed form. This is very useful when processing, storing or transferring a huge file, which needs lots of resources. If the algorithms used to encrypt works properly, there should be a significant difference between the original file and the compressed file. Compression can be classified as either lossy or lossless. Lossless compression techniques reconstruct the original data from the compressed file without any loss of data. Some of the main techniques in use are the Huffman Coding, Run Length Encoding, Arithmetic Encoding and Dictionary Based Encoding.

Image compression is the application of data compression on digital images. In effect, the objective is to reduce redundancy of the image data in order to be able to store or transmit data in an efficient form. Lossy wavelet based compression is especially suitable for natural images such as photos in applications where minor loss of fidelity is acceptable to achieve a substantial reduction in bit rate.

Smooth areas of the image are efficiently represented with a few low-frequency wavelet coefficients, while important edge features are represented with a few high-frequency coefficients, localized around the edge. The majority of the information is localized in low frequency filters while the high frequency filters are sparse. Wavelet-based algorithms have been adopted by government agencies as a standard method for coding fingerprint images, and are considered in the JPEG2000 standardization activity.

Figure 1.Image compression/decompression system

We implemented wavelet with integer lifting

The integer wavelet with lifting has three steps:

I) Separation step: Separating the main signal to odd and even parts. II) Lifting step: we apply the prediction filters and update even and odd signals. III) Normalization step

The next step is implementing the coder/decoder units shown in Figure 1. For our coder and decoder we have chosen Arithmetic coding over Huffman code.. We used C++ for our compressor and de-compressor. The input to the compressor systems is a 256 gray scale bitmap file. In compressor, first, we read the bitmap matrix and pass it to wavelet module. The integer to integer wavelet is applied to the matrix in two dimensions both horizontally and vertically. The arithmetic encoder, code the transformed 2-D wavelet and generates the compressed file.

In de-compressor, the compressed file is then passed to the decoder for decompression. The inverse integer wavelet transform is then applied to generate the bitmap matrix. The final bitmap image is generated which is the retrieved image. Description. Our system has the following classes:

-Wavelet class: The wavelet class is for integer to integer forward and inverse wavelet transforms. It does the forward integer wavelet transform both in 1 dimension and also 2- D on matrix which corresponds to the image column and row pixels. In the inverse wavelet transform, we reverse all forward process. It does both 1-D and 2-D wavelet transform.

-Image class: This class reads and write image from the 256 gray scale bitmap file. It reads the image before the transform and compression and also regenerates the .bmp file after the decompression and inverse wavelet transform.

-Arithmetic Coder class: In arithmetic coding, we separated the source modeling from entropy coding. For coding purposes the only information needed for modeling a data source is its number of data symbols, and the probability of each symbol. During the actual coding process what is used is data that is computed from the probabilities. The arithmetic encoder does the coding and the decoder generates the original decoded symbols.

-Codec class: It instantiates image, wavelet, Arithmetic coder and compress and decompress image using wavelet and arithmetic coding.

-Utilities class: Utilities are some utility functions for observing the outputs, doing test and debugging.

-Matrix class: This is a class for data matrix manipulation and matrix processing. Algorithm Steps.

1. We begin with a \current interval” [L;H) initialized to [0; 1). 2. For each symbol of the file, we perform two steps :

(a) We subdivide the current interval into subintervals, one for each possible alphabet symbol. The size of a symbol’s subinterval is proportional to the estimated probability that the symbol will be the next symbol in the file, according to the model of the input.

(b) We select the subinterval corresponding to the symbol that actually occurs next in the file, and make it the new current interval.

3. We output enough bits to distinguish the final current interval from all other possible final intervals.

Results.

The following is a sample of our 2-level wavelet transform applied on a 512 x 512 gray scale bitmap image.

Figure 3. Results from applying 2 levels of wavelet on a 512 x 512 bitmap

The idea behind wavelet transform is expressed in Fig.3. Most of the image information is in the low frequency filters. High frequency filters only represent the fine details. For the lossy compression, the idea is to ignore the high level transforms and regenerate your signal using your low frequency filters. Below, we show our results from applying our compression to different sizes.

Conclusion.

References.

[1] Amir Said,Introduction to Arithmetic Coding Theory and Practice,

Hewlett-Packard Laboratories Report, HPL-2004-76, Palo Alto, CA, April 2004.

[2] C. Sidney Burrus, Ramesh A. Gopinath, Haitato, “Introduction to Wavelets and Wavelet Transforms, Aprimer,” Prentice-Hall, New Jersey, 1998.

[3] M. D. Adams and F. Kossentini, “Reversible Integer-to-Integer Wavelet Transforms for Image Compression: Performance Evaluation and Analysis,” IEEE Trans. on Image Processing, vol. 9, no. 6, pp. 1010-1024, Jun. 2000.

[4] Paul G. Howard AND Jeffrey Scott Vitter, “Arithmetic Coding for Data Compression”, Proceedings of the IEEE, vol. 82, no.6, June 1994.