Improving the efficiency of the tensor approximation for image compression by using the trained dictionary
Authors
Tchobanou M.K., Makarov D.V.

Date of publication
2016
|
Abstract
Nowadays tensor decomposition methods are frequently used in digital signal processing to improve the characteristics of standard algorithms. In this paper, we consider one of the most effective tensor decomposition methods and show that it can be used for image compression. We also introduce an approach that uses a dictionary of pre-calculated elements to increase the efficiency of image compression.
Tensor analysis and the theory of tensor approximations are playing an increasingly important role in computational mathematics and numerical analysis. An efficient representation of a tensor (by tensor we mean only an array with d indices) with a small number of parameters may give us the ability to work with d-dimensional problems, with d as high as 10, 100, or even 1000. The application of this approach is warranted by the exponential growth in the transmission, storage and processing of visual information [1].
The purpose of tensor approximation is to find regularities among the elements of an input tensor and to decompose it into tensors of lower dimension with a smaller total number of elements [2].
There are three effective methods of tensor approximation:
• Canonical decomposition (the CANDECOMP, PARAFAC algorithms [3]).
• Tucker decomposition (the N-mode PCA and N-mode SVD algorithms [4]).
• Tensor-Train Decomposition (the TT decomposition algorithm [5]).
The canonical decomposition offers an effective way to represent d-dimensional tensors: it decomposes the initial tensor into a set of two-dimensional tensors (factor matrices).
Thus, it is possible to approximate a tensor of size n by n by n by … by n with O(dnr) elements, where r is the canonical rank. The advantage of the method is that if the rank r is small, the tensor can be represented very compactly: for example, a 256 by 256 by 256 tensor has 16,777,216 entries, whereas a rank-20 canonical representation takes only 3·256·20 = 15,360 parameters. However, the computation of the canonical rank is an NP-hard problem, and the numerical algorithms for computing an approximate representation in such cases might fail.
The Tucker decomposition is stable, but its number of parameters grows exponentially in d. It is suitable for “small” dimensions, especially the three-dimensional case; for large d it is not suitable.
The TT decomposition has neither of these drawbacks: there is a stable algorithm for computing it, and the number of elements in the approximation is close to that of the canonical approximation.
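To make this concrete, below is a minimal NumPy sketch of the TT-SVD procedure [5]: the tensor is processed by a sequence of truncated SVDs of its unfoldings, each keeping the smallest rank that fits the error budget. The function name, the eps parameter, and the error-splitting scheme are our illustrative assumptions, not code from the paper.

    import numpy as np

    def tt_svd(a, eps=1e-6):
        # Sequential truncated SVDs of tensor unfoldings (TT-SVD).
        # Each core has shape (r_{k-1}, n_k, r_k).
        dims, d = a.shape, a.ndim
        delta = eps * np.linalg.norm(a) / max(np.sqrt(d - 1), 1.0)
        cores, r_prev, c = [], 1, a
        for k in range(d - 1):
            c = c.reshape(r_prev * dims[k], -1)
            u, s, vt = np.linalg.svd(c, full_matrices=False)
            # smallest rank whose discarded tail fits the error budget
            tail2 = np.cumsum(s[::-1] ** 2)[::-1]
            keep = np.nonzero(tail2 <= delta ** 2)[0]
            r = max(int(keep[0]), 1) if keep.size else s.size
            cores.append(u[:, :r].reshape(r_prev, dims[k], r))
            c, r_prev = s[:r, None] * vt[:r], r
        cores.append(c.reshape(r_prev, dims[-1], 1))
        return cores

    # Contracting the cores back recovers the tensor (exactly for tiny eps).
    a = np.random.rand(6, 6, 6, 6)
    b = None
    for g in tt_svd(a, eps=1e-12):
        b = g if b is None else np.tensordot(b, g, axes=(-1, 0))
    print(np.allclose(a, b.reshape(a.shape)))  # True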
Sparse representation (e.g., wavelets [6]) is an efficient tool for image compression. The image is projected onto a more suitable basis in which it has a large number of repeating elements, which allows it to be encoded more effectively.
The Wavelet Tensor-Train (WTT, [7]) is a modification of the TT decomposition method for computing a sparse representation. It is possible to compute a set of orthogonal matrices that form a good basis for the image; these matrices are then used as filters for the image. Applying the filters to compute the sparse image reduces to a sequence of matrix multiplications [8].
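As a minimal sketch of this idea (a single transform level only, not the full WTT algorithm of [7]): the orthogonal filter matrix is obtained from an SVD of an unfolding of the signal itself, and both filtering and reconstruction reduce to one matrix multiplication. The unfolding factor and the function names are our assumptions.

    import numpy as np

    def wtt_level(x, unfold=2):
        # Unfold the signal into an (unfold x N/unfold) matrix; the left
        # singular vectors give an orthogonal, signal-adapted filter.
        m = x.reshape(unfold, -1)
        u, _, _ = np.linalg.svd(m, full_matrices=False)
        return u, u.T @ m        # filtering is a matrix multiplication

    def wtt_level_inverse(u, coeffs):
        # u is orthogonal, so reconstruction is exact: m = u @ coeffs.
        return (u @ coeffs).ravel()

    x = np.arange(16, dtype=float)
    u, c = wtt_level(x)
    print(np.allclose(x, wtt_level_inverse(u, c)))  # True

The matrix u in this sketch is precisely the kind of signal-dependent filter discussed below: it must accompany the coefficients, which motivates the signal-independent and dictionary-based variants.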
We showed that using tensor decomposition to compute a sparse image representation via WTT is a more effective way of compressing images than applying tensor decomposition directly.
WTT filters are signal-dependent, so they must be stored together with the compressed image. This significantly reduces the efficiency of image compression via WTT, because the filters usually have high entropy.
We also showed that WTT filters can be made signal-independent by tuning the algorithm’s parameters. With a pre-calculated signal-independent filter set, there is no need to store the filters with the encoded image. Such filters are less efficient than signal-dependent ones, but the result is still better than storing the filters with the image. A comparison showed that the efficiency of the introduced method is close to that of JPEG [8].
Moreover, WTT can be applied to small blocks of an image instead of the whole image. For example, we can apply WTT to blocks of size 8 by 8, by analogy with the discrete cosine transform (DCT) in JPEG. Our research showed that in this case the parameters of WTT cannot be tuned to make the filters signal-independent, which means that special filters are needed for each block.
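For illustration, the tiling step might look as follows, assuming a grayscale image stored as a 2-d NumPy array (the function name is ours):

    import numpy as np

    def split_into_blocks(img, b=8):
        # Crop to a multiple of b, then cut into non-overlapping b x b
        # blocks -- the same tiling JPEG uses before its per-block DCT.
        h, w = (img.shape[0] // b) * b, (img.shape[1] // b) * b
        blocks = img[:h, :w].reshape(h // b, b, w // b, b)
        return blocks.swapaxes(1, 2).reshape(-1, b, b)

    print(split_into_blocks(np.zeros((480, 640))).shape)  # (4800, 8, 8)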
We also analyzed the efficiency of applying the same filters to different blocks and found that filters computed for a source block can be effective both for blocks similar to the source block and for completely different ones [9]. Therefore, we can train a dictionary of filters and use it to increase the efficiency of image compression.
Dictionary training consists of finding filters that are more effective than the DCT and cover a large number of blocks. A more precise description of the training procedure is given in [9], where we also compared the DCT to WTT with a dictionary.
However, using WTT with a dictionary raises an issue: how to choose the right filters from the dictionary for a given input block? For this purpose we suggest three methods (the first two are sketched in code after this list):
• The first and simplest method is brute force: try every filter set in the dictionary. This method is extremely slow and cannot be used in a real-time image compression system, but it is still effective, because it is guaranteed to find the best filter for the input block if that filter is present in the dictionary. It is also worth noting that image reconstruction remains fast, close to the speed of JPEG image reconstruction.
• The second method uses special hash functions. In this case, a hash value is stored in the dictionary alongside each filter. For an input block, we compute its hash, search the dictionary for the nearest hash value, and choose the filter associated with that value. This reduces the complexity of finding filters to logarithmic, but such a search can produce false positives, especially for a large dictionary.
• Another method is to use machine learning techniques; this is a topic for further research. Our assumption is that modern machine learning methods can find patterns between a specific set of filters and the image blocks to which they can be effectively applied. If so, it will be possible to eliminate these shortcomings and reach a speed comparable to that of the second method (hash functions).
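A minimal sketch of the first two selection strategies follows; the cost function and the scalar hash are illustrative assumptions rather than the exact functions used in our experiments.

    import numpy as np

    def pick_brute_force(block, dictionary, encoding_cost):
        # Method 1: try every filter set. Guaranteed to find the best
        # entry present in the dictionary, but linear in its size.
        return int(np.argmin([encoding_cost(block, f) for f in dictionary]))

    def pick_by_hash(block, sorted_hashes, hash_fn):
        # Method 2: binary search over precomputed, sorted scalar hashes,
        # O(log n); the nearest hash may point to a suboptimal filter.
        h = hash_fn(block)
        i = int(np.searchsorted(sorted_hashes, h))
        if i == 0:
            return 0
        if i == len(sorted_hashes):
            return len(sorted_hashes) - 1
        return i if sorted_hashes[i] - h < h - sorted_hashes[i - 1] else i - 1

Both functions return the index of the dictionary entry whose filters are then applied to the block.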
Thus, by using WTT we can achieve image compression efficiency similar to that of JPEG; moreover, we can train a dictionary in order to increase the efficiency of WTT further.
The introduced method can be used as an alternative or as an extension to DCT-based compression algorithms. When used as an extension, it can make existing image and video compression algorithms more efficient.
Further directions of research in the area of using WTT for image compression:
• Determining the optimal block size for WTT with a dictionary. We used 8 by 8 blocks for ease of comparison with the discrete cosine transform, but smaller blocks might prove better.
• Finding the most efficient way of choosing filters from the dictionary for a given input block of the image.
• Using WTT for color image compression. There are many ways to apply tensor decomposition to color images.
Keywords
tensor, tensor approximation, tensor decomposition, lossy image compression, train tensor approximation, wavelet tensor-train, signal-dependent filters, sparse representation.
Library reference
Tchobanou M.K., Makarov D.V. Improving the efficiency of the tensor approximation for image compression by using the trained dictionary // Problems of Perspective Micro- and Nanoelectronic Systems Development - 2016. Proceedings / edited by A. Stempkovsky, Moscow, IPPM RAS, 2016. Part 1. P. 218-223.
URL of paper
http://www.mes-conference.ru/data/year2016/pdf/D098.pdf
|
|