Lossless compression

Lossless compression is a class of data compression algorithms that allows the original data to be perfectly reconstructed from the compressed data. By contrast, lossy compression permits reconstruction only of an approximation of the original data, though usually with greatly improved compression ratios (and therefore smaller media sizes).

By the pigeonhole principle, no lossless compression algorithm can efficiently compress all possible data. For this reason, many different algorithms exist that are designed either with a specific type of input data in mind or with specific assumptions about what kinds of redundancy the uncompressed data are likely to contain.

Lossless data compression is used in many applications. For example, it is used in the ZIP file format and in the GNU tool gzip. It is also often used as a component within lossy data compression technologies (e.g. lossless mid/side joint stereo preprocessing by MP3 encoders and other lossy audio encoders).

Lossless compression is used in cases where it is important that the original and the decompressed data be identical, or where deviations from the original data would be unfavourable. Typical examples are executable programs, text documents, and source code. Some image file formats, like PNG or GIF, use only lossless compression, while others like TIFF and MNG may use either lossless or lossy methods. Lossless audio formats are most often used for archiving or production purposes, while smaller lossy audio files are typically used on portable players and in other cases where storage space is limited or exact replication of the audio is unnecessary.

Lossless compression techniques

Most lossless compression programs do two things in sequence: the first step generates a statistical model for the input data, and the second step uses this model to map input data to bit sequences in such a way that "probable" (e.g. frequently encountered) data will produce shorter output than "improbable" data.

The primary encoding algorithms used to produce bit sequences are Huffman coding (also used by the deflate algorithm) and arithmetic coding. Arithmetic coding achieves compression rates close to the best possible for a particular statistical model, which is given by the information entropy, whereas Huffman compression is simpler and faster but produces poor results for models that deal with symbol probabilities close to 1.
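The gap between Huffman coding and the entropy bound can be checked numerically. The following is a minimal sketch, not a production coder: the two probability tables are made up for illustration, and the function names are hypothetical. It builds Huffman code lengths with a priority queue and compares the average code length with the Shannon entropy of the same model.

```python
import heapq
import math

def entropy_bits(probs):
    """Shannon entropy of the model, in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def huffman_code_lengths(probs):
    """Return the Huffman code length (in bits) assigned to each symbol."""
    if len(probs) == 1:
        return [1]
    # Heap entries: (probability, unique tie-breaker, symbols in this subtree)
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for sym in s1 + s2:
            lengths[sym] += 1          # merged symbols move one level deeper
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths

def average_length(probs, lengths):
    return sum(p * n for p, n in zip(probs, lengths))

balanced = [0.5, 0.25, 0.125, 0.125]   # illustrative model: Huffman is optimal here
skewed = [0.99, 0.01]                  # illustrative model: one symbol close to probability 1

print(huffman_code_lengths(balanced), entropy_bits(balanced))  # [1, 2, 3, 3] 1.75
print(huffman_code_lengths(skewed), entropy_bits(skewed))
```

For the skewed model, Huffman coding cannot spend less than one whole bit per symbol, while the entropy is below 0.1 bit per symbol; an arithmetic coder is not bound to whole bits and can approach the entropy.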

There are two primary ways of constructing statistical models. In a static model, the data is analyzed and a model is constructed, then this model is stored with the compressed data. This approach is simple and modular, but has the disadvantage that the model itself can be expensive to store, and also that it forces the use of a single model for all data being compressed, and so performs poorly on files that contain heterogeneous data. Adaptive models dynamically update the model as the data is compressed. Both the encoder and decoder begin with a trivial model, yielding poor compression of initial data, but as they learn more about the data, performance improves. Most popular types of compression used in practice now use adaptive coders.
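How an adaptive model improves as it sees more data can be illustrated with a small sketch. It assumes an order-0 byte model with simple frequency counts and reports the ideal code length of each byte (minus the base-2 logarithm of its current probability) rather than producing an actual bit stream; the function name is hypothetical.

```python
import math

def adaptive_cost_bits(data):
    """Ideal code length (bits) of each byte under an adaptive order-0 model."""
    counts = [1] * 256     # trivial starting model: every byte value equally likely
    total = 256
    costs = []
    for b in data:
        costs.append(-math.log2(counts[b] / total))
        counts[b] += 1     # a decoder can apply the identical update after decoding b
        total += 1
    return costs

costs = adaptive_cost_bits(b"a" * 1000)
print(costs[0], costs[-1])   # 8.0 bits for the first byte, well under 1 bit for the last
```

Because the update depends only on symbols already processed, encoder and decoder stay synchronized without the model ever being stored.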

Lossless compression methods may be categorized according to the type of data they are designed to compress. While, in principle, any general-purpose lossless compression algorithm (general-purpose meaning that it can accept any bit string) can be used on any type of data, many are unable to achieve significant compression on data that are not of the form for which they were designed. Many of the lossless compression techniques used for text also work reasonably well for indexed images.

Multimedia

These techniques take advantage of the specific characteristics of images, such as the common phenomenon of contiguous 2-D areas of similar tones. Every pixel but the first is replaced by the difference to its left neighbor. This leads to small values having a much higher probability than large values. This is often also applied to sound files, and can compress files that contain mostly low frequencies and low volumes. For images, this step can be repeated by taking the difference to the top pixel, and in videos the difference to the pixel in the next frame can be taken as well.
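The left-neighbor difference step can be sketched in a few lines. This is an illustration only: the pixel row is invented, the function names are hypothetical, and real formats typically wrap the differences to the sample bit depth, which is omitted here.

```python
def delta_encode(row):
    """Replace every sample except the first with its difference from the left neighbor."""
    if not row:
        return []
    return [row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]

def delta_decode(deltas):
    """Undo delta_encode by accumulating the differences."""
    out = []
    for d in deltas:
        out.append(d if not out else out[-1] + d)
    return out

row = [100, 101, 103, 103, 102, 104, 104, 105]   # made-up row of similar tones
encoded = delta_encode(row)
print(encoded)   # [100, 1, 2, 0, -1, 2, 0, 1]
```

After the first sample, the encoded values cluster around zero, which is what lets a subsequent entropy coder assign them short codes.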

A hierarchical version of this technique takes neighboring pairs of data points, stores their difference and sum, and on a higher level with lower resolution continues with the sums. This is called the discrete wavelet transform. JPEG2000 additionally uses data points from other pairs and multiplication factors to mix them into the difference. These factors must be integers, so that the result is an integer under all circumstances. The values therefore grow, which increases file size, but the hope is that the distribution of values is more peaked.^{[citation needed]}
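The pairwise sum-and-difference hierarchy can be sketched as follows. This is a simplified, unnormalized Haar-style transform written for illustration, not the 5/3 filter used by JPEG 2000; the function names are hypothetical and the input length is assumed to be a power of two.

```python
def forward(samples):
    """Multi-level pairwise sum/difference transform (length must be a power of two)."""
    sums = list(samples)
    details = []                      # difference bands, finest level first
    while len(sums) > 1:
        pairs = list(zip(sums[0::2], sums[1::2]))
        details.append([a - b for a, b in pairs])
        sums = [a + b for a, b in pairs]
    return sums[0], details

def inverse(total, details):
    """Rebuild the original samples from the overall sum and the difference bands."""
    sums = [total]
    for diffs in reversed(details):
        rebuilt = []
        for s, d in zip(sums, diffs):
            a = (s + d) // 2          # s + d equals 2*a, so the division is exact
            rebuilt.extend([a, s - a])
        sums = rebuilt
    return sums

samples = [10, 12, 11, 11, 50, 52, 51, 49]   # made-up data
total, details = forward(samples)
print(total, details)   # 246 [[-2, 0, -2, 2], [0, 2], [-158]]
```

The sums grow at each level, as the paragraph above notes, but most of the stored differences are small and only integer arithmetic is involved, so the inverse is exact.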

The adaptive encoding uses the probabilities from the previous sample in sound encoding, from the left and upper pixel in image encoding, and additionally from the previous frame in video encoding. In the wavelet transformation, the probabilities are also passed through the hierarchy.

Historical legal issues

Many of these methods are implemented in open-source and proprietary tools, particularly LZW and its variants. Some algorithms are patented in the United States and other countries, and their legal usage requires licensing by the patent holder. Because of patents on certain kinds of LZW compression, and in particular licensing practices by patent holder Unisys that many developers considered abusive, some open-source proponents encouraged people to avoid using the Graphics Interchange Format (GIF) for compressing still image files in favor of Portable Network Graphics (PNG), which combines the LZ77-based deflate algorithm with a selection of domain-specific prediction filters. However, the patents on LZW expired on June 20, 2003.^[1]

Many of the lossless compression techniques used for text also work reasonably well for indexed images, but there are other techniques that do not work for typical text that are useful for some images (particularly simple bitmaps), and other techniques that take advantage of the specific characteristics of images (such as the common phenomenon of contiguous 2-D areas of similar tones, and the fact that color images usually have a preponderance of a limited range of colors out of those representable in the color space).

As mentioned previously, lossless sound compression is a somewhat specialized area. Lossless sound compression algorithms can take advantage of the repeating patterns shown by the wave-like nature of the data, essentially by using autoregressive models to predict the "next" value and encoding the (hopefully small) difference between the expected value and the actual data. If the difference between the predicted and the actual data (called the error) tends to be small, then certain difference values (like 0, +1, −1 etc. on sample values) become very frequent, which can be exploited by encoding them in few output bits.
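The predict-then-encode-the-error idea can be sketched with a fixed second-order predictor. The predictor, the synthetic sine wave standing in for audio, and the function names are all assumptions chosen for illustration; real codecs usually fit the predictor coefficients to the signal.

```python
import math

def residuals(samples):
    """Prediction errors of a fixed predictor: pred = 2*x[n-1] - x[n-2]."""
    out = list(samples[:2])           # the first two samples are kept verbatim as warm-up
    for n in range(2, len(samples)):
        pred = 2 * samples[n - 1] - samples[n - 2]
        out.append(samples[n] - pred)
    return out

def reconstruct(res):
    """Invert residuals() by re-running the same predictor on the decoded samples."""
    out = list(res[:2])
    for n in range(2, len(res)):
        pred = 2 * out[n - 1] - out[n - 2]
        out.append(res[n] + pred)
    return out

# A slowly varying sine wave quantized to integers stands in for audio samples.
wave = [round(10000 * math.sin(2 * math.pi * n / 200)) for n in range(400)]
res = residuals(wave)
print(max(abs(s) for s in wave), max(abs(r) for r in res[2:]))
```

The samples span roughly ±10000, while the residuals after the warm-up stay within a couple of dozen, so they need far fewer bits each, and the decoder recovers the samples exactly.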

It is sometimes beneficial to compress only the differences between two versions of a file (or, in video compression, of successive images within a sequence). This is called delta encoding (from the Greek letter Δ, which in mathematics denotes a difference), but the term is typically only used if both versions are meaningful outside compression and decompression. For example, while the process of compressing the error in the above-mentioned lossless audio compression scheme could be described as delta encoding from the approximated sound wave to the original sound wave, the approximated version of the sound wave is not meaningful in any other context.

Lossless compression methods

No lossless compression algorithm can efficiently compress all possible data (see the section Limitations below for details). For this reason, many different algorithms exist that are designed either with a specific type of input data in mind or with specific assumptions about what kinds of redundancy the uncompressed data are likely to contain.

Some of the most common lossless compression algorithms are listed below.

General purpose

bzip2 – combines the Burrows–Wheeler transform with RLE and Huffman coding
Finite State Entropy – entropy encoding, tabled variant of ANS, used by LZFSE and Zstandard
Huffman coding – entropy encoding, pairs well with other algorithms, used by Unix's pack utility
Lempel–Ziv compression (LZ77 and LZ78) – dictionary-based algorithm that forms the basis for many other algorithms
- Lempel–Ziv–Markov chain algorithm (LZMA) – very high compression ratio, used by 7zip and xz
- Lempel–Ziv–Oberhumer (LZO) – designed for speed at the expense of compression ratio
- Lempel–Ziv–Storer–Szymanski (LZSS) – used by WinRAR in tandem with Huffman coding
  - Deflate – combines LZSS compression with Huffman coding, used by ZIP, gzip, and PNG images
- Lempel–Ziv–Welch (LZW) – used by GIF images and Unix's compress utility
- Lempel–Ziv Finite State Entropy (LZFSE) – combines Lempel–Ziv and Finite State Entropy, used by iOS and macOS
- Zstandard (ZSTD) – combines LZ77, Finite State Entropy and Huffman coding, used by the Linux kernel
Prediction by partial matching (PPM) – optimized for compressing plain text
Run-length encoding (RLE) – simple scheme that provides good compression of data containing many runs of the same value
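Of the schemes listed above, run-length encoding is the simplest to show in full. The following is a minimal sketch that represents each run as a (value, length) pair; the pair representation and function names are illustrative choices, not a standard on-disk format.

```python
from itertools import groupby

def rle_encode(data):
    """Collapse each run of identical bytes into a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(data)]

def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original bytes."""
    return b"".join(bytes([value]) * count for value, count in pairs)

packed = rle_encode(b"AAAAAABBBCDDDD")
print(packed)   # [(65, 6), (66, 3), (67, 1), (68, 4)]
```

Input without long runs yields one pair per byte, so the output can be larger than the input, which is one reason RLE is usually combined with other stages.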

Audio

Apple Lossless (ALAC – Apple Lossless Audio Codec)
Adaptive Transform Acoustic Coding (ATRAC)
Audio Lossless Coding (also known as MPEG-4 ALS)
Direct Stream Transfer (DST)
Dolby TrueHD
DTS-HD Master Audio
Free Lossless Audio Codec (FLAC)
Meridian Lossless Packing (MLP)
Monkey's Audio (Monkey's Audio APE)
MPEG-4 SLS (also known as HD-AAC)
OptimFROG
Original Sound Quality (OSQ)
RealPlayer (RealAudio Lossless)
Shorten (SHN)
TTA (True Audio Lossless)
WavPack (WavPack lossless)
WMA Lossless (Windows Media Lossless)

Raster graphics

AVIF – AOMedia Video 1 Image File Format
FLIF – Free Lossless Image Format
HEIF – High Efficiency Image File Format (lossless or lossy compression, using HEVC)
ILBM – (lossless RLE compression of Amiga IFF images)
JBIG2 – (lossless or lossy compression of B&W images)
JPEG 2000 – (includes a lossless compression method via the LeGall-Tabatabai 5/3^[2]^[3]^[4] reversible integer wavelet transform)
JPEG-LS – (lossless/near-lossless compression standard)
JPEG XL – (lossless or lossy compression)
JPEG XR – formerly WMPhoto and HD Photo, includes a lossless compression method
LDCT – Lossless Discrete Cosine Transform^[5]^[6]
PCX – PiCture eXchange
PDF – Portable Document Format (lossless or lossy compression)
PNG – Portable Network Graphics
TGA – Truevision TGA
TIFF – Tagged Image File Format (lossless or lossy compression)
WebP – (lossless or lossy compression of RGB and RGBA images)

3D Graphics

OpenCTM – Lossless compression of 3D triangle meshes

Video

See the list of lossless video codecs.

Cryptography

Cryptosystems often compress data (the "plaintext") before encryption for added security. When properly implemented, compression greatly increases the unicity distance by removing patterns that might facilitate cryptanalysis.^[7] However, many ordinary lossless compression algorithms produce headers, wrappers, tables, or other predictable output that might instead make cryptanalysis easier. Thus, cryptosystems must utilize compression algorithms whose output does not contain these predictable patterns.

Genetics and Genomics

Genetics compression algorithms (not to be confused with genetic algorithms) are the latest generation of lossless algorithms that compress data (typically sequences of nucleotides) using both conventional compression algorithms and specific algorithms adapted to genetic data. In 2012, a team of scientists from Johns Hopkins University published the first genetic compression algorithm that does not rely on external genetic databases for compression. HapZipper was tailored for HapMap data and achieves over 20-fold compression (95% reduction in file size), providing 2- to 4-fold better compression, and in much less time, than leading general-purpose compression utilities.^[8]

Genomic sequence compression algorithms, also known as DNA sequence compressors, exploit the fact that DNA sequences have characteristic properties, such as inverted repeats. The most successful compressors are XM and GeCo.^[9] For eukaryotes, XM is slightly better in compression ratio, though for sequences larger than 100 MB its computational requirements are impractical.

Executables

Self-extracting executables contain a compressed application and a decompressor. When executed, the decompressor transparently decompresses and runs the original application. This is especially common in demo coding, where competitions are held for demos with strict size limits, as small as 1k. This type of compression is not strictly limited to binary executables, but can also be applied to scripts, such as JavaScript.

Lossless compression benchmarks

Lossless compression algorithms and their implementations are routinely tested in head-to-head benchmarks. There are a number of better-known compression benchmarks. Some benchmarks cover only the data compression ratio, so winners in these benchmarks may be unsuitable for everyday use due to the slow speed of the top performers. Another drawback of some benchmarks is that their data files are known, so some program writers may optimize their programs for best performance on a particular data set. The winners on these benchmarks often come from the class of context-mixing compression software.

Matt Mahoney, in his February 2010 edition of the free booklet Data Compression Explained, additionally lists the following:^[10]

The Calgary Corpus dating back to 1987 is no longer widely used due to its small size. Matt Mahoney currently maintains the Calgary Compression Challenge, created and maintained from May 21, 1996 through May 21, 2016 by Leonid A. Broukhis.
The Large Text Compression Benchmark^[11] and the similar Hutter Prize both use a trimmed Wikipedia XML UTF-8 data set.
The Generic Compression Benchmark,^[12] maintained by Mahoney himself, tests compression of data generated by random Turing machines.
Sami Runsas (author of NanoZip) maintains Compression Ratings, a benchmark similar to the Maximum Compression multiple file test, but with minimum speed requirements. It also offers a calculator that allows the user to weight the importance of speed and compression ratio. The top programs here are fairly different because of the speed requirement. In January 2010, the top programs were NanoZip followed by FreeArc, CCM, flashzip, and 7-Zip.
The Monster of Compression benchmark by N. F. Antonio tests compression on 1Gb of public data with a 40-minute time limit. As of Dec. 20, 2009 the top ranked archiver is NanoZip 0.07a and the top ranked single file compressor is ccmx 1.30c, both context mixing.

The Compression Ratings website published a chart summary of the "frontier" in compression ratio and time.^[13]

The Compression Analysis Tool^[14] is a Windows application that enables end users to benchmark the performance characteristics of streaming implementations of LZF4, Deflate, ZLIB, GZIP, BZIP2 and LZMA using their own data. It produces measurements and charts with which users can compare the compression speed, decompression speed and compression ratio of the different compression methods, and examine how the compression level, buffer size and flushing operations affect the results.

Limitations

Lossless data compression algorithms cannot guarantee compression for all input data sets. In other words, for any lossless data compression algorithm, there will be an input data set that does not get smaller when processed by the algorithm, and for any lossless data compression algorithm that makes at least one file smaller, there will be at least one file that it makes larger. This is easily proven with elementary mathematics using a counting argument, as follows:

Assume that each file is represented as a string of bits of some arbitrary length.
Suppose that there is a compression algorithm that transforms every file into an output file that is no longer than the original file, and that at least one file will be compressed into an output file that is shorter than the original file.
Let M be the least number such that there is a file F with length M bits that compresses to something shorter. Let N be the length (in bits) of the compressed version of F.
Because N < M, every file of length N keeps its size during compression. There are 2^N such files. Together with F, this makes 2^N + 1 files that all compress into one of the 2^N files of length N.
But 2^N is smaller than 2^N + 1, so by the pigeonhole principle there must be some file of length N that is simultaneously the output of the compression function on two different inputs. That file cannot be decompressed reliably (which of the two originals should that yield?), which contradicts the assumption that the algorithm was lossless.
We must therefore conclude that our original hypothesis (that the compression function makes no file longer) is necessarily untrue.
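The counting argument can be made concrete by brute force on short bit strings. In the sketch below, toy_compress is a hypothetical scheme invented for illustration: it never lengthens a string and shortens some, so, as the argument predicts, two different inputs must share an output.

```python
from itertools import product

def all_bit_strings(max_len):
    """Yield every bit string of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)

def find_collision(compress, max_len):
    """Return two distinct inputs with the same output, or None if there is none."""
    seen = {}
    for s in all_bit_strings(max_len):
        out = compress(s)
        if out in seen:
            return seen[out], s
        seen[out] = s
    return None

def toy_compress(s):
    # Hypothetical scheme: shortens strings starting with "00", never lengthens any.
    return s[1:] if s.startswith("00") else s

collision = find_collision(toy_compress, 3)
print(collision)   # ('0', '00'): both inputs map to '0', so decoding is ambiguous
```

The identity mapping, which shortens nothing, is the only kind of never-expanding scheme for which the search finds no collision.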

Any lossless compression algorithm that makes some files shorter must necessarily make some files longer, but it is not necessary that those files become very much longer. Most practical compression algorithms provide an "escape" facility that can turn off the normal coding for files that would become longer by being encoded. In theory, only a single additional bit is required to tell the decoder that the normal coding has been turned off for the entire input; however, most encoding algorithms use at least one full byte (and typically more than one) for this purpose. For example, deflate compressed files never need to grow by more than 5 bytes per 65,535 bytes of input.
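The bounded growth on incompressible input can be observed with Python's standard zlib module, which wraps a deflate stream in a small header and checksum. This is a rough measurement sketch: the input size is arbitrary and the exact overhead depends on the zlib build, so no specific figure is claimed.

```python
import os
import zlib

data = os.urandom(200_000)          # high-entropy input with no redundancy to exploit
packed = zlib.compress(data, 9)
overhead = len(packed) - len(data)

assert zlib.decompress(packed) == data   # still lossless
print(f"input {len(data)} bytes, output {len(packed)} bytes, overhead {overhead} bytes")
```

The overhead covers the per-block headers of deflate's uncompressed ("stored") blocks plus the zlib wrapper, and stays a tiny fraction of the input size.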

In fact, if we consider files of length N, if all files were equally probable, then for any lossless compression that reduces the size of some file, the expected length of a compressed file (averaged over all possible files of length N) must necessarily be greater than N.^{[citation needed]} So if we know nothing about the properties of the data we are compressing, we might as well not compress it at all. A lossless compression algorithm is useful only when we are more likely to compress certain types of files than others; then the algorithm could be designed to compress those types of data better.

Thus, the main lesson from the argument is not that one risks big losses, but merely that one cannot always win. To choose an algorithm always means implicitly to select a subset of all files that will become usefully shorter. This is the theoretical reason why we need to have different compression algorithms for different kinds of files: there cannot be any algorithm that is good for all kinds of data.

The "trick" that allows lossless compression algorithms, used on the type of data they were designed for, to consistently compress such files to a shorter form is that the files the algorithms are designed to act on all have some form of easily modeled redundancy that the algorithm is designed to remove, and thus belong to the subset of files that that algorithm can make shorter, whereas other files would not get compressed or even get bigger. Algorithms are generally quite specifically tuned to a particular type of file: for example, lossless audio compression programs do not work well on text files, and vice versa.

In particular, files of random data cannot be consistently compressed by any conceivable lossless data compression algorithm: indeed, this result is used to define the concept of randomness in algorithmic complexity theory.

It's provably impossible to create an algorithm that can losslessly compress any data.^[15] While there have been many claims through the years of companies achieving "perfect compression" where an arbitrary number N of random bits can always be compressed to N − 1 bits, these kinds of claims can be safely discarded without even looking at any further details regarding the purported compression scheme. Such an algorithm contradicts fundamental laws of mathematics because, if it existed, it could be applied repeatedly to losslessly reduce any file to length 0. Allegedly "perfect" compression algorithms are often derisively referred to as "magic" compression algorithms for this reason.

On the other hand, it has also been proven^{[citation needed]} that there is no algorithm to determine whether a file is incompressible in the sense of Kolmogorov complexity. Hence it's possible that any particular file, even if it appears random, may be significantly compressed, even including the size of the decompressor. An example is the digits of the mathematical constant pi, which appear random but can be generated by a very small program. However, even though it cannot be determined whether a particular file is incompressible, a simple theorem about incompressible strings shows that over 99% of files of any given length cannot be compressed by more than one byte (including the size of the decompressor).
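The "over 99%" figure follows from counting the available shorter outputs. The sketch below computes an upper bound on the share of N-bit files that any lossless scheme can shrink by more than a given number of bits; it counts output strings only and ignores the decompressor's size, which would only tighten the bound. The function name and the choice N = 64 are illustrative.

```python
from fractions import Fraction

def max_fraction_compressible(n_bits, saved_bits):
    """Upper bound on the share of n-bit files that can shrink by more than saved_bits bits."""
    # Outputs must have length at most n_bits - saved_bits - 1, and there are
    # 2^(n_bits - saved_bits) - 1 bit strings that short (including the empty one).
    shorter_outputs = 2 ** (n_bits - saved_bits) - 1
    return Fraction(shorter_outputs, 2 ** n_bits)

bound = max_fraction_compressible(n_bits=64, saved_bits=8)
print(float(bound))   # just under 1/256, i.e. below 0.4%
```

Since fewer than 1 in 256 files of a given length can lose more than one byte, more than 99% cannot.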

Mathematical background

Abstractly, a compression algorithm can be viewed as a function on sequences (normally of octets). Compression is successful if the resulting sequence is shorter than the original sequence (and the instructions for the decompression map). For a compression algorithm to be lossless, the compression map must form an injection from "plain" to "compressed" bit sequences.

The pigeonhole principle prohibits a bijection between the collection of sequences of length N and any subset of the collection of sequences of length N−1. Therefore, it is not possible to produce a lossless algorithm that reduces the size of every possible input sequence.

Points of application in real compression theory

Real compression algorithm designers accept that streams of high information entropy cannot be compressed, and accordingly, include facilities for detecting and handling this condition. An obvious way of detection is applying a raw compression algorithm and testing if its output is smaller than its input. Sometimes, detection is made by heuristics; for example, a compression application may consider files whose names end in ".zip", ".arj" or ".lha" uncompressible without any more sophisticated detection. A common way of handling this situation is quoting input, or uncompressible parts of the input in the output, minimizing the compression overhead. For example, the zip data format specifies the 'compression method' of 'Stored' for input files that have been copied into the archive verbatim.^[16]
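The detect-and-store strategy described above can be sketched as a thin wrapper around a real compressor. The one-byte method tags and the function names here are hypothetical and are not the ZIP format's actual field layout; zlib stands in for the "raw compression algorithm".

```python
import os
import zlib

STORED, DEFLATED = b"\x00", b"\x01"   # hypothetical one-byte method tags

def pack(data):
    """Compress with zlib, but fall back to a verbatim copy if that is not smaller."""
    candidate = zlib.compress(data)
    if len(candidate) < len(data):
        return DEFLATED + candidate
    return STORED + data              # incompressible input is quoted verbatim

def unpack(blob):
    method, payload = blob[:1], blob[1:]
    if method == DEFLATED:
        return zlib.decompress(payload)
    if method == STORED:
        return payload
    raise ValueError("unknown method tag")

text = b"abc" * 1000
noise = os.urandom(4096)
print(len(pack(text)), len(pack(noise)))
```

With this design the output is never more than one byte longer than the input, at the cost of running the compressor once to find out.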

The Million Random Digit Challenge

Mark Nelson, in response to claims of magic compression algorithms appearing in comp.compression, has constructed a 415,241 byte binary file of highly entropic content, and issued a public challenge of $100 to anyone to write a program that, together with its input, would be smaller than his provided binary data yet be able to reconstitute it without error.^[17]

The FAQ for the comp.compression newsgroup contains a challenge by Mike Goldman offering $5,000 for a program that can compress random data. Patrick Craig took up the challenge, but rather than compressing the data, he split it up into separate files all of which ended in the number 5, which was not stored as part of the file. Omitting this character allowed the resulting files (plus, in accordance with the rules, the size of the program that reassembled them) to be smaller than the original file. However, no actual compression took place, and the information stored in the names of the files was necessary to reassemble them in the correct order in the original file, and this information was not taken into account in the file size comparison. The files themselves are thus not sufficient to reconstitute the original file; the file names are also necessary. Patrick Craig agreed that no meaningful compression had taken place, but argued that the wording of the challenge did not actually require this. A full history of the event, including discussion on whether or not the challenge was technically met, is on Patrick Craig's web site.^[18]

References

[1] "LZW Patent Information". About Unisys. Unisys. Archived from the original on 2009-06-02.
[2] Sullivan, Gary (8–12 December 2003). "General characteristics and design considerations for temporal subband video coding". ITU-T. Video Coding Experts Group. Retrieved 13 September 2019.
[3] Unser, M.; Blu, T. (2003). "Mathematical properties of the JPEG2000 wavelet filters" (PDF). IEEE Transactions on Image Processing. 12 (9): 1080–1090. Bibcode:2003ITIP...12.1080U. doi:10.1109/TIP.2003.812329. PMID 18237979. S2CID 2765169. Archived from the original (PDF) on 2019-10-13.
[4] Bovik, Alan C. (2009). The Essential Guide to Video Processing. Academic Press. p. 355. ISBN 9780080922508.
[5] Ahmed, Nasir; Mandyam, Giridhar D.; Magotra, Neeraj (17 April 1995). "DCT-based scheme for lossless image compression". Digital Video Compression: Algorithms and Technologies 1995. International Society for Optics and Photonics. 2419: 474–478. Bibcode:1995SPIE.2419..474M. doi:10.1117/12.206386. S2CID 13894279.
[6] Komatsu, K.; Sezaki, Kaoru (1998). "Reversible discrete cosine transform". Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181). 3: 1769–1772 vol.3. doi:10.1109/ICASSP.1998.681802. ISBN 0-7803-4428-6. S2CID 17045923.
[7] Alfred J. Menezes; Jonathan Katz; Paul C. van Oorschot; Scott A. Vanstone (16 October 1996). Handbook of Applied Cryptography. CRC Press. ISBN 978-1-4398-2191-6.
[8] Chanda, P.; Elhaik, E.; Bader, J.S. (2012). "HapZipper: sharing HapMap populations just got easier". Nucleic Acids Res. 40 (20): 1–7. doi:10.1093/nar/gks709. PMC 3488212. PMID 22844100.
[9] Pratas, D.; Pinho, A. J.; Ferreira, P. J. S. G. (2016). "Efficient compression of genomic sequences". Data Compression Conference (PDF). Snowbird, Utah.
[10] Matt Mahoney (2010). "Data Compression Explained" (PDF). pp. 3–5.
[11] "Large Text Compression Benchmark". mattmahoney.net.
[12] "Generic Compression Benchmark". mattmahoney.net.
[13] Visualization of compression ratio and time
[14] "Compression Analysis Tool". Free Tools. Noemax Technologies.
[15] "comp.compression Frequently Asked Questions (part 1/3) / Section - [9] Compression of random data (WEB, Gilbert and others)". faqs.org.
[16] ".ZIP File Format Specification". PKWARE, Inc. chapter V, section J.
[17] Nelson, Mark (2006-06-20). "The Million Random Digit Challenge Revisited".
[18] Craig, Patrick. "The $5000 Compression Challenge". Retrieved 2009-06-08.

External links

"LZF compression format". github. Retrieved 2017-10-17.
Phamdo, Nam. "Theory of Data Compression". Data Compression. Retrieved 2017-10-17.
"Lossless comparison". Hydrogenaudio Knowledgebase. 2015-01-05. Retrieved 2017-10-17.
"Lossless and lossy audio formats for music". Bobulous Central. 2003-11-06. Retrieved 2017-10-17.
"Image Compression Benchmark". Archived from the original on 2013-02-10.
Overview of US patent #7,096,360, "[a]n "Frequency-Time Based Data Compression Method" supporting the compression, encryption, decompression, and decryption and persistence of many binary digits through frequencies where each frequency represents many bits."
Image rars (graphical representation of compression)
