## May 20, 2018

Question.  List three pattern substitution based compression algorithms. For each algorithm, give one application where the method is used with respect
to multimedia data.

• Repetitive Sequence Suppression
• Run-Length Encoding
• Pattern Substitution

Repetitive Sequence Suppression — one application from: silence suppression in audio, ‘white space’ in text, simple uniform backgrounds in images.

Run-Length Encoding — one application from: computer-graphics-generated images, faxes, the latter (entropy-coding) stage of the JPEG pipeline.

Pattern Substitution — one application from: pattern recognition/token substitution, entropy coding (Huffman), LZW/GIF, vector quantisation.

Question. What is the basic concept used in defining an Information Theoretic approach to data compression?

The entropy of an information source S, defined as

H(S) = Σ_i p_i log2(1/p_i),

where p_i is the probability that symbol S_i occurs in S, is the basis of Information Theoretic compression algorithms: it gives the theoretical minimum average number of bits needed per symbol.

Question: What advantages does the arithmetic coding algorithm offer over Huffman coding algorithm with respect to data compression?

• Better compression ratio than Huffman coding: the average code length approaches the Shannon ideal (entropy) value.
  – Huffman coding uses an integer number of bits for each symbol, so no symbol can be coded in fewer than 1 bit.
  – Arithmetic coding effectively uses a fractional (decimal) number of bits per symbol.

Disadvantages of the arithmetic coding algorithm:
• Memory: potentially large symbol tables are needed.
• Speed: possibly complex computations, due to the large symbol tables.
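A toy sketch (not a full codec; the two-symbol source and its probabilities are assumed purely for illustration) of the key idea: arithmetic coding narrows a single [low, high) interval symbol by symbol, so the whole message is represented by one fractional number.

```python
# Toy sketch only: a two-symbol source with assumed probabilities,
# showing how arithmetic coding narrows a single [low, high) interval.
probs = {"A": 0.8, "B": 0.2}

def encode_interval(message, probs):
    # Build cumulative ranges: A -> [0.0, 0.8), B -> [0.8, 1.0).
    cum, ranges = 0.0, {}
    for sym, p in probs.items():
        ranges[sym] = (cum, cum + p)
        cum += p
    low, high = 0.0, 1.0
    for sym in message:
        span = high - low
        r_lo, r_hi = ranges[sym]
        low, high = low + span * r_lo, low + span * r_hi
    return low, high  # any value in [low, high) identifies the message

low, high = encode_interval("AAB", probs)
# The interval width is the product of the symbol probabilities, so
# -log2(width) is the fractional bit cost of the whole message.
```

The final interval width here is 0.8 × 0.8 × 0.2 = 0.128, i.e. about 2.97 bits for three symbols — a fractional cost Huffman coding cannot achieve.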
Question: Given the following Differential Pulse Code Modulated (DPCM) Sequence reconstruct the original signal.
+4 + 2 + 3 - 2 + 3 - 1 + 1 + 1

DPCM decoding: start with the accumulator at zero; for each number, add the value to the current accumulator and output the accumulator value.
So solution is:
4 6 9 7 10 9 10 11
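The accumulation step above can be sketched in a few lines of Python:

```python
def dpcm_decode(diffs):
    # Accumulate the differences starting from zero to recover the signal.
    acc, out = 0, []
    for d in diffs:
        acc += d
        out.append(acc)
    return out

signal = dpcm_decode([4, 2, 3, -2, 3, -1, 1, 1])
# signal == [4, 6, 9, 7, 10, 9, 10, 11]
```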

Question: Given the following Run Length Encoded (RLE) Sequence reconstruct the
original 2D 8x8 (binary) data array.
(0, 8),
(0, 1), (1, 1), (0, 4), (1, 1), (0, 1),
(0, 1), (1, 2), (0, 2), (1, 2), (0, 1),
(0, 1), (1, 6), (0, 1),
(0, 2), (1, 4), (0, 2),
(0, 3), (1, 2), (0, 3),
(0, 2), (1, 1), (0, 2), (1, 1), (0, 2),
(0, 1), (1, 1), (0, 4), (1, 1), (0, 1)

The RLE format is a sequence of (colour, length) pairs, so to reconstruct the array just ‘parse’ each row, expanding each colour to length repeated values:

0 0 0 0 0 0 0 0
0 1 0 0 0 0 1 0
0 1 1 0 0 1 1 0
0 1 1 1 1 1 1 0
0 0 1 1 1 1 0 0
0 0 0 1 1 0 0 0
0 0 1 0 0 1 0 0
0 1 0 0 0 0 1 0
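A minimal sketch of this expansion in Python (the run list is taken verbatim from the question):

```python
# The (colour, length) pairs from the question, expanded row by row.
runs = [
    (0, 8),
    (0, 1), (1, 1), (0, 4), (1, 1), (0, 1),
    (0, 1), (1, 2), (0, 2), (1, 2), (0, 1),
    (0, 1), (1, 6), (0, 1),
    (0, 2), (1, 4), (0, 2),
    (0, 3), (1, 2), (0, 3),
    (0, 2), (1, 1), (0, 2), (1, 1), (0, 2),
    (0, 1), (1, 1), (0, 4), (1, 1), (0, 1),
]

flat = [colour for colour, length in runs for _ in range(length)]
rows = [flat[i:i + 8] for i in range(0, 64, 8)]  # the 8x8 binary array
```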

Question:
Calculate the uncompressed digital output if a video signal is sampled using the following values: 25 frames per second, 160 x 120 pixels, True (Full) color depth.

True color = 24 bits (3 bytes) per pixel

So the uncompressed data rate is
160 × 120 pixels × 24 bits × 25 fps = 11,520,000 bits (1,440,000 bytes) per second.

Question:
If a suitable CD stereo quality audio signal is included with the video signal in part (c), how long would it take for the signal to be transmitted on a 128 kbps channel?

CD-quality audio has a bandwidth of up to 22,050 Hz, so the sampling rate is 44,100 Hz (Nyquist).
Stereo audio = 44,100 × 2 channels × 16 bits = 1,411,200 bits per second.

So the uncompressed stream is 11,520,000 + 1,411,200 = 12,931,200 bits per second. The channel is 128 kbps (128,000 bits per second). Hence transmitting each second of signal takes 12,931,200 / 128,000 ≈ 101 seconds.
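The two rate calculations and the transmission time can be checked with a few lines of Python:

```python
video_bps = 160 * 120 * 24 * 25       # pixels/frame x bits/pixel x frames/s
audio_bps = 44_100 * 2 * 16           # samples/s x channels x bits/sample
total_bps = video_bps + audio_bps
seconds_per_second = total_bps / 128_000  # transmit time per second of signal
```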

Question:
For compressing an image for power point slide, which compression would you prefer-regular JPEG or JPEG 2000? Why?

JPEG 2000: it handles computer-generated images (sharp edges etc.) better, and its compression ratio is roughly 20% higher than that of the original JPEG.

Question: What are the key differences between the JPEG and MPEG I-Frame compression
pipeline?

Four main differences:
• JPEG uses the YIQ colour space whilst MPEG uses YUV (YCrCb).
• MPEG uses larger DCT block sizes (16, even 32) as opposed to JPEG’s 8.
• Different quantization: MPEG usually uses a constant quantization value.
• In JPEG, Differential Pulse Code Modulation (DPCM) is applied only to the DC component, and the zig-zag-scanned AC components get RLE; in MPEG the complete zig-zag scan gets RLE.
Question: What processes give rise to the lossy nature of JPEG/MPEG video compression?

Lossy steps:
• Colour-space subsampling of the IQ or UV (chroma) components.
• Quantization, which reduces the bits needed for the DCT coefficients.
Question
(A) In MPEG audio compression, what is frequency masking?

When an audio signal consists of multiple frequencies, the sensitivity of the ear changes with the relative amplitudes of the signals. If two frequencies are close and the amplitude of one is lower than that of the other, the quieter frequency may not be heard.

(B) Briefly describe the cause of frequency masking in the human auditory system?

• Stereocilia in inner ear get excited as fluid pressure waves flow over them.
• Stereocilia of different lengths and tensions sit on the basilar membrane, so they resonate in sympathy with different frequencies of the fluid waves (banks of stereocilia for each frequency band).
• Stereocilia already excited by a frequency cannot be further excited by a lower amplitude near frequency wave.
(C) In MPEG audio compression, what is temporal masking?

After the ear hears a loud sound, consisting of multiple frequencies, it takes a further short while before it can hear a quieter sound close in frequency.

(D) Briefly describe the cause of temporal masking in the human auditory system?

• (Like frequency masking) Stereocilia in inner ear get excited as fluid pressure waves flow over them and respond to different frequencies.
• Stereocilia already excited by a certain frequency will take a while to return to rest state, as inner ear is a closed fluid chamber and pressure waves will eventually dampen down.
• Similar to frequency masking, stereocilia in a ‘dampening state’ may not respond to a lower-amplitude nearby frequency wave.
(E) Briefly describe, using a suitable diagram if necessary, the MPEG-1 audio compression algorithm, outlining how frequency masking and temporal masking are encoded.

MPEG audio compression basically works by:
• Dividing the audio signal up into a set of frequency subbands (Filtering)
• Use filter banks to achieve this.
• Subbands approximate critical bands.
• Each band quantised according to the audibility of quantization noise.

Frequency Masking: MPEG audio encodes this by quantising each filter bank with adaptive values derived from the energy of neighbouring bands, defined by a look-up table.

Temporal Masking: not as easy to model as frequency masking. MP3 achieves it with a 50% overlap between successive transform windows (giving window sizes of 36 or 12) and applies basic frequency masking as above.

Question: Construct coding tree for ROADROADIES by Shannon-Fano Algorithm and calculate the entropy. Get the Compression Ratio also for the same.

Shannon-Fano Algorithm:
1. Sort the symbols according to the frequency count of their occurrences.
2. Recursively divide the symbols into two parts, each with approximately the same number of counts, until all the parts contain only 1 symbol.
• Below are the letters with their occurrence counts (in descending order):

Symbol   Count
R          2
O          2
A          2
D          2
I          1
E          1
S          1
log2(1/pi) is calculated as:
Ex: For symbol R:
pi=Count/total number of counts = 2/11=0.18181
log2(1/0.18181) = 2.459 approx.

Code is calculated as:
Ex: by traversing the tree from the top down, appending one bit at each split.

No. of bits used for a symbol = count × code length (number of bits in its code):
Ex: For symbol R (count 2, 2-bit code): 2 × 2 = 4
For symbol S (count 1, 4-bit code): 1 × 4 = 4
(The exact code lengths depend on how the near-equal splits are chosen.)

Calculate entropy:

H = Σ_i p_i log2(1/p_i) = (8/11) × log2(11/2) + (3/11) × log2(11) ≈ 2.7322

Hence, about 2.73 bits are required on average to represent each symbol. [It can also be asked as: What is the average number of bits needed for each character?]

Compression Ratio: This can be found as:

Default ASCII encoding: 8 bits per symbol
Result after Shannon-Fano coding: ≈ 2.7322 bits per symbol

Compression Ratio = (11 × 8) / (11 × 2.7322) ≈ 2.93
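The entropy and compression-ratio arithmetic above can be reproduced directly:

```python
from collections import Counter
from math import log2

msg = "ROADROADIES"
counts = Counter(msg)            # R:2 O:2 A:2 D:2 I:1 E:1 S:1
n = len(msg)                     # 11 symbols
entropy = sum((c / n) * log2(n / c) for c in counts.values())
ratio = 8 / entropy              # vs. 8-bit ASCII encoding
```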

Question:
Consider the following uncompressed sequence of character bytes: AAAABBBBCAAAAACCCCCAAAAABBBBBBDDDDDD. Encode this sequence with Run-Length Coding. Also calculate the compression ratio.

Solution:
A4B4CA5C5A5B6D6

Compression ratio = 36/15=2.4
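The encoding step can be sketched with `itertools.groupby`; the count is dropped for runs of length 1, matching the `A4B4CA5...` style above:

```python
from itertools import groupby

def rle_encode(data):
    # Each run becomes symbol + run length; a lone symbol keeps no count.
    parts = []
    for sym, run in groupby(data):
        n = sum(1 for _ in run)
        parts.append(sym + (str(n) if n > 1 else ""))
    return "".join(parts)

encoded = rle_encode("AAAABBBBCAAAAACCCCCAAAAABBBBBBDDDDDD")
ratio = 36 / len(encoded)
```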

Question: Construct Huffman tree to encode the table below. What is the average number of bits needed for each character?
Symbol     A   B   C   D   E   F   G
Frequency  28  10  20  13   5  15   9

Answer: Huffman coding is a lossless data compression algorithm. The most frequent character gets the smallest code and the least frequent character gets the largest code.

Steps to build Huffman Tree

Step 1. Build a min heap containing 7 nodes, where each node is the root of a tree with a single node.

Symbol    Frequency
E           5
G           9
B           10
D           13
F           15
C           20
A          28

Step 2. Extract two minimum frequency nodes from min heap. Add a new internal node with frequency 5 + 9 = 14.

Now the min heap contains 6 nodes, where 5 nodes are roots of trees with a single element each, and one heap node is the root of a tree with 3 elements.

B           10
D           13
Node     14
F           15
C           20
A           28

Step 3: Extract two minimum frequency nodes from heap. Add a new internal node with frequency 10 + 13 = 23

Now the min heap contains 5 nodes, where 3 nodes are roots of trees with a single element each, and two heap nodes are roots of trees with more than one node.

Node     14
F           15
C           20
Node     23
A           28

Step 4: Extract two minimum frequency nodes. Add a new internal node with frequency 14 + 15 = 29

Now the min heap contains 4 nodes, where 2 nodes are roots of trees with a single element each, and two heap nodes are roots of trees with more than one node.

C           20
Node     23
A           28
Node     29

Step 5: Extract two minimum frequency nodes. Add a new internal node with frequency 20 + 23 = 43
Now min heap contains 3 nodes.

A      28
Node 29
Node 43

Step 6: Extract two minimum frequency nodes. Add a new internal node with frequency 28 + 29 = 57

Now min heap contains 2 nodes.

Node 43
Node 57

Step 7: Extract two minimum frequency nodes. Add a new internal node with frequency 43 + 57 = 100

Now min heap contains only one node.

Node 100

Since the heap contains only one node, the algorithm stops here.

Traverse the tree formed starting from the root. Maintain an auxiliary array. While moving to the left child, write 0 to the array. While moving to the right child, write 1 to the array

The codes are as follows (one valid bit assignment; code lengths are what matter):

Symbol   Code
C          00
B          010
D          011
A          10
E          1100
G          1101
F          111

Average number of bits needed for each character:

Average = Σ (frequency_i × code length_i) / total frequency
        = (28×2 + 10×3 + 20×2 + 13×3 + 5×4 + 15×3 + 9×4) / 100
        = 266 / 100 = 2.66

Entropy:

H = Σ_i p_i log2(1/p_i) ≈ 2.63

So on average 2.66 bits are required to represent each symbol, close to the entropy lower bound of about 2.63 bits.
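A minimal heap-based sketch of the construction above. It tracks only the code length of each symbol (which fully determines the average), not the bit patterns:

```python
import heapq

def huffman_code_lengths(freqs):
    # Each heap entry: (frequency, tiebreak id, {symbol: code length}).
    # Merging two subtrees adds one bit to every symbol inside them.
    heap = [(f, i, {s: 0}) for i, (s, f) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)
        f2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (f1 + f2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

freqs = {"A": 28, "B": 10, "C": 20, "D": 13, "E": 5, "F": 15, "G": 9}
lengths = huffman_code_lengths(freqs)
avg_bits = sum(freqs[s] * lengths[s] for s in freqs) / sum(freqs.values())
```

The unique tiebreak id keeps heap comparisons away from the dicts when two frequencies are equal.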

Question:
Suppose eight characters have a distribution A:(1), B:(1), C:(1), D:(2), E:(3), F:(5), G:(5), H:(10). Draw a Huffman tree for this distribution.

Solution: the same min-heap procedure applies. The successive merges of the two smallest frequencies are: 1 + 1 = 2; 1 + 2 = 3; 2 + 3 = 5; 3 + 5 = 8; 5 + 5 = 10; 8 + 10 = 18; 10 + 18 = 28 (the root). The tree follows by making each merged pair the children of its new internal node.

Question:
Perform run length encoding for the following and calculate compression ratio AC0000AAAAB00000000A0AAA00000.

Solution:
Run length encoding
Before Encoding : AC0000AAAAB00000000A0AAA00000
After Encoding :   1A1C404A1B801A103A50

Compression Ratio
Compression Ratio = total number of characters before encoding / total number of characters after encoding

Compression Ratio  = 29/20 = 1.45

Question:

a. Why are the display order and encoding order of MPEG frames different? Given the display order of frames, find the coding order:  I(1) B(2) B(3) P(4) B(5) B(6) P(7) B(8) B(9) P(10) I(11)
b. State the difference between frequency masking and temporal masking in MPEG audio. How are these two encoded in MPEG audio?
c. If the block size for a 2D DCT transform is 8 X8, and we use only the DC components to create a thumbnail image, what fraction of the original pixels would we be using?

Solution:

a.
P and B frames are much more complicated, since they depend on other frames. So we have to buffer a previous frame and a future frame to decode P and B frames. The name "future frame" indicates that the frame is displayed after the current frame; however, it is actually encoded before it. So the display order of frames in an MPEG sequence is different from the decoding order. Coding order: I(1) P(4) B(2) B(3) P(7) B(5) B(6) P(10) B(8) B(9) I(11)
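For a simple closed sequence like this one, the reordering rule is: emit each reference (I or P) frame first, then the B frames displayed before it. A minimal sketch (real encoders with open GOPs are more involved):

```python
def coding_order(frames):
    # B frames depend on the *next* reference (I or P) frame, so each
    # reference frame is emitted first, then the held-back B frames.
    out, pending_b = [], []
    for ftype, idx in frames:
        if ftype == "B":
            pending_b.append((ftype, idx))
        else:
            out.append((ftype, idx))
            out.extend(pending_b)
            pending_b.clear()
    out.extend(pending_b)  # trailing B frames, if the sequence ends on one
    return out

display = [("I", 1), ("B", 2), ("B", 3), ("P", 4), ("B", 5), ("B", 6),
           ("P", 7), ("B", 8), ("B", 9), ("P", 10), ("I", 11)]
order = coding_order(display)
```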

b.