Portable Network Graphics (PNG) Specification (2nd Edition)
The W3C published it as a Recommendation on 10 November 2003.

First, byte order.

At present, the x86 and ARM processors commonly used in hosts are all little-endian; this is called the host byte order.

PNG uses network byte order (big-endian).
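A small sketch of the difference, using Python's `struct` module (`>` means big-endian/network order, `<` little-endian):

```python
import struct

# The same four bytes read as a big-endian (network-order) unsigned integer,
# which is how PNG stores lengths, and as a little-endian (host-order) one.
raw = bytes([0x00, 0x00, 0x00, 0x0D])          # the value 13 in network order
(network_order,) = struct.unpack(">I", raw)    # big-endian: 13
(host_le,) = struct.unpack("<I", raw)          # little-endian: a huge number
```

The same bytes yield completely different numbers, which is why a PNG decoder must always convert explicitly.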

An ordinary true-color image is composed of at least three channels (red, green, and blue), so each pixel occupies at least three bytes. Storing every pixel this way is very space-inefficient.

So we build a palette: the colors we need are preset in the palette, and afterwards each pixel only needs to store its index into the palette.

And to keep the palette from taking up too much space, its capacity is capped at 256 entries. An index therefore never exceeds 255 and fits in a single byte, and one byte is one channel, so indexed mode has only one channel.
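A minimal sketch of the idea (the pixel values here are just illustrative):

```python
# Indexed mode: distinct colors go into a palette of at most 256 entries,
# and each pixel becomes a one-byte index instead of three bytes.
pixels = [(255, 0, 0), (0, 255, 0), (255, 0, 0), (0, 0, 255)]

palette = []   # list of (R, G, B) triples, capacity capped at 256
indexed = []   # one byte-sized index per pixel
for color in pixels:
    if color not in palette:
        palette.append(color)
    indexed.append(palette.index(color))
```

Four pixels that would take twelve bytes in true color now take four index bytes plus a small shared palette.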

An image is made up of many pixels and can be regarded as a two-dimensional array. We can extract pixels at particular positions, for example every other pixel horizontally and every other pixel vertically, forming a new, smaller two-dimensional array that represents the original image compactly, though with some loss of detail. This extraction is called pass extraction; the extracted data are interleaved, and each pass contains a thumbnail of the whole picture. Recombining all the passes reproduces the complete image, and even when the data is incomplete, the passes received so far still yield a thumbnail. It was a way to speed up image display during network transmission, but today's networks are so fast that it is almost useless.

The middle area of the chromaticity diagram is white; this is the white point, one parameter of the color space. The other parameter is the reference primaries, used in color-mapping conversions.

By setting the coordinates of the white point we can shift the central white area toward red, green, or blue, and thereby adjust how strongly the color mapping changes.

The PNG specification does not specify an application program interface, but it involves four kinds of images: the source image, the reference image, the PNG image, and the delivered image. Their relationship is as follows:

The number of bits used to encode each sample is the sample depth.

PNG has three ways to declare a color space: an ICC profile, an sRGB declaration, and a chromaticity-plus-white-point declaration.

The ICC profile is flexible and easy to adapt; the sRGB declaration commits the image to one specific color space and may take more capacity; the last is the most precise. For the first two, a gamma value (gAMA) is also recommended.

We need some way to convert a reference image into a PNG image. The process is as follows:

Separating the alpha channel: in fact, many reference images have no alpha channel at all, and can be treated as fully opaque, saving a channel.

If the image contains at most 256 distinct pixel values and the sample depth is 8 or less, an index (palette) can be built.

If the color samples have the same depth and every pixel's channels carry the same value, a single channel can represent everything: a grayscale image.

Transparency can also be expressed without an alpha channel, by designating a single color value as transparent.

PNG does not support arbitrary sample depths, only 1, 2, 4, 8, and 16. Any other depth must be adjusted by software.

For example, if the original depth was 5, it has to be scaled up to 8.

If different channels have different depths, the maximum depth is chosen and the others are adjusted to it.

This depth conversion is reversible.
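One common way to do the conversion is linear rescaling; a sketch (the function name is mine, not from the spec):

```python
def scale_sample(value, from_depth, to_depth):
    """Rescale one sample between bit depths, e.g. depth 5 -> depth 8,
    by mapping the old maximum onto the new maximum linearly."""
    max_from = (1 << from_depth) - 1
    max_to = (1 << to_depth) - 1
    return round(value * max_to / max_from)

up = scale_sample(16, 5, 8)     # widen a depth-5 sample to depth 8
back = scale_sample(up, 8, 5)   # narrowing again recovers the original value
```

Scaling up and back down returns the original sample, which is the reversibility the text mentions.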

There are five color types:

Two methods of pass extraction are used here.

The first is the null method, which does nothing. (Why does doing nothing rigidly count as a method?)

The second method obtains seven reduced images through multiple scans. This is the Adam7 algorithm (not the Adam optimizer from deep learning).

However, this algorithm is hard to find described on Chinese websites, and the Wikipedia article (https://en.wikipedia.org/wiki/Adam7_algorithm) is not very clear either. So I will describe it briefly here:
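The seven passes can be described by one 8x8 grid that tiles the whole image; a sketch:

```python
# The Adam7 pass pattern: each cell of this 8x8 grid says which of the seven
# passes the corresponding pixel belongs to; the grid repeats across the image.
ADAM7 = [
    [1, 6, 4, 6, 2, 6, 4, 6],
    [7, 7, 7, 7, 7, 7, 7, 7],
    [5, 6, 5, 6, 5, 6, 5, 6],
    [7, 7, 7, 7, 7, 7, 7, 7],
    [3, 6, 4, 6, 3, 6, 4, 6],
    [7, 7, 7, 7, 7, 7, 7, 7],
    [5, 6, 5, 6, 5, 6, 5, 6],
    [7, 7, 7, 7, 7, 7, 7, 7],
]

def pass_pixels(width, height, n):
    """Coordinates of the pixels belonging to pass n (1..7), in scan order."""
    return [(x, y) for y in range(height) for x in range(width)
            if ADAM7[y % 8][x % 8] == n]
```

Pass 1 picks one pixel per 8x8 tile, later passes fill in more and more detail, and together the seven passes cover every pixel exactly once.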

The reduced images above are then read line by line (the original image, of course, is read with the null method). Many operations are possible at this point, such as turning the extraction above into output.

There are several filter types; the chosen type is written as a byte before the filtered array.

This step is encoding and compression.

The encoded data is divided into one or more blocks.

A standard PNG file consists of many blocks (called chunks in the specification), and each block has four parts: length, name, data, and CRC check code.

The standard defines 18 block types, and custom blocks can be added.

These 18 block types are:

Critical blocks:

IHDR (image header), PLTE (palette), IDAT (image data), IEND (image trailer).

Ancillary blocks:

Transparency-related: tRNS (transparency information)

Color-related: cHRM (chromaticity and white point), gAMA (gamma value), iCCP (embedded ICC profile), sBIT (significant bits), sRGB (standard RGB color space).

Text-related: iTXt (internationalized text), tEXt (text), zTXt (compressed text).

Time-related: tIME (last modification time)

Others: bKGD (background color), hIST (histogram), pHYs (physical pixel size), sPLT (suggested palette).

There are two classes of errors: transmission errors or file corruption, which can destroy much or all of the data stream; and syntax errors, such as invalid or missing blocks.

These two kinds of errors should be handled differently.

You can submit related extensions to ISO/IEC or the PNG group to register new block types and text keywords, and to extend new filter methods, interlace methods, and compression methods.

This section describes the binary structure of the data stream.

The first eight bytes of every PNG data stream are 137 80 78 71 13 10 26 10.

In Python bytes notation this is b'\x89PNG\r\n\x1a\n'.

This signature indicates that what follows is a PNG data stream. Encountering a null byte is not the end; the stream is only complete when IEND is reached.
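A minimal sketch of checking the signature (the function name is mine):

```python
# The eight-byte PNG signature, checked before anything else is parsed.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def looks_like_png(data):
    """True if the byte string begins with the PNG signature."""
    return data[:8] == PNG_SIGNATURE

# The same eight bytes written in decimal, as listed above.
header = bytes([137, 80, 78, 71, 13, 10, 26, 10])
```

The two spellings denote the same bytes, which is why both notations appear in the spec.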

Each block consists of the following four parts:

Through the naming convention, a PNG decoder can still learn some things about a block from its name even when it cannot identify the block's purpose.

Each of the four letters in a block name carries a flag in its case:

The first letter means ancillary: lowercase means the block is ancillary, uppercase means it is critical.

The second letter means private: lowercase means the block is private, with no international standard definition; uppercase indicates one of the 18 standard block types mentioned above.

The third letter is reserved: lowercase marks the block as invalid under this version, uppercase means it can be used. (This is an agreement for future expansion.)

The fourth letter means safe-to-copy: lowercase means a PNG editor that does not recognize the block may still copy it along unchanged; uppercase means the block is unsafe to copy blindly once the image data has been modified, so the editor must copy selectively.

The check code is computed with the CRC-32 algorithm; see its description for details.
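The four parts above can be read and verified in a few lines; a minimal sketch (the function name is mine, and the hand-built IEND below is just a demonstration value):

```python
import io
import struct
import zlib

def read_chunk(stream):
    """Read one block: length, name, data, CRC, verifying the CRC."""
    (length,) = struct.unpack(">I", stream.read(4))
    name = stream.read(4)
    data = stream.read(length)
    (crc,) = struct.unpack(">I", stream.read(4))
    # The CRC-32 covers the name and the data, but not the length field.
    if crc != zlib.crc32(name + data):
        raise ValueError("corrupt block")
    ancillary = bool(name[0] & 0x20)   # bit 5 of the first letter: lowercase?
    return name, data, ancillary

# A hand-built, valid IEND block: zero-length data, name, matching CRC.
iend = struct.pack(">I", 0) + b"IEND" + struct.pack(">I", zlib.crc32(b"IEND"))
name, data, ancillary = read_chunk(io.BytesIO(iend))
```

The same bit-5 test works for the other three letters, giving the private, reserved, and safe-to-copy flags.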

PNG images can be streamed, that is, previewed in a PNG viewer before the end of the file has been read.

So before reading the image content, some things need to be prepared, such as the palette for indexed images.

It seems to have been written in chapter 4.3?

As we wrote in Chapter 4.4, there are five color types:

Color types are recorded in IHDR.

In grayscale mode, how sample values map to brightness depends on gAMA, sRGB, or iCCP; absent those, it depends on the display device.

Color samples are not necessarily proportional to light intensity; the relationship can be adjusted by setting gAMA.

The color-type value is computed as follows: start from 0; add 1 if a palette is used, 2 if true color is used, and 4 if an alpha channel is used. A palette cannot be combined with grayscale.
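Enumerating the valid combinations makes the rule concrete:

```python
# Color type = palette flag (1) + color flag (2) + alpha flag (4).
PALETTE, COLOR, ALPHA = 1, 2, 4

color_types = {
    "grayscale": 0,
    "truecolor": COLOR,                 # 2
    "indexed": PALETTE + COLOR,         # 3: a palette always implies color
    "grayscale_alpha": ALPHA,           # 4
    "truecolor_alpha": COLOR + ALPHA,   # 6
}
```

The five values 0, 2, 3, 4, 6 are exactly the five color types; 1, 5, and 7 never occur because a palette without color, or a palette with an alpha channel, is not allowed.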

Transparency can be expressed in four ways: an alpha channel; a transparent color set in a tRNS block; an alpha table set in tRNS in indexed mode; and complete opacity, when neither an alpha channel nor tRNS is present.

Alpha-channel sample depths are 8 and 16, and the alpha channel is stored with each pixel; 0 means fully transparent and the maximum value fully opaque. Alpha is used to composite the image's foreground against a background.

Some ordinary image formats carry no transparency; some even store pixel values premultiplied by alpha, with the compositing against a black background done in advance. PNG does not do this.

Integer types are multi-byte: a short is 2 bytes, an int is 4, and a long is 8.

PNG uses network byte order: the most significant byte (MSB) comes first, the least significant byte (LSB) last.

That is, the rows and pixels of each PNG image are packed tightly, with no padding between pixels.

When the depth is less than 8, the end of a scanline may not fill a whole byte; the unused low-order bits are simply left unprocessed.

Filtering improves the compressibility of the data and is reversible. PNG allows scanline data to be filtered; in other words, filtering is optional.

The filtered byte sequence is the same length as before, except that a filter-type byte is prepended according to the filter used; type 0 means no filtering was applied. The specific filter methods are explained later.

Interlaced mode improves the perceived loading speed of images over a network on CRT displays (in other words, without a slow network and a CRT display, interlacing is of little use).

Please refer to [4.5.2 Channel Extraction ](#4.5.2 Channel Extraction).

Due to how Adam7 works, images with width or height less than 5 will lack some passes (pass 2 needs a fifth column; pass 3 needs a fifth row).

The purpose of filtering is to improve the compression ratio, and the choice of filter is not unique. In interlaced mode, all the reduced images use the same filter method; in non-interlaced mode there is only one image, and of course only one method.

Filter method 0 is defined in this standard; other values are reserved for future use. Method 0 contains five filter types, and each scanline may use a different type.

The PNG specification does not mandate which filter type to use; selection heuristics are discussed later.

These filters apply to bytes, regardless of pixels, channels, or depth: bytes go in, filtered bytes come out.

The following are the definitions of several parameters:

Org () represents the original value of bytes;

Flt () represents the filtered value;

Rc () represents the reconstructed value;

Paeth () See [9.4 Filter Type 4](#9.4 Filter Type 4).

If there is no previous pixel, 0 is used in its place; the first line of each reduced image has no previous line, and 0 is used there too.

Because filtering proceeds in this order, reconstruction must also be computed in the same order.

The input and output values of the filter are unsigned bytes.

Filter types 0, 1, and 2 are very simple: do nothing, subtract the byte to the left, and subtract the byte above.

But for type 3, Flt(x) = Org(x) - floor((Org(a) + Org(b)) / 2), and Org(a) + Org(b) can overflow a byte, so single-byte arithmetic cannot be used; a wider type such as short is needed (or, of course, the right-shift trick).
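A sketch of the average filter and its inverse (Python integers never overflow, but the comment notes where C code must widen; the function names are mine):

```python
def filt_average(org_x, org_a, org_b):
    """Filter type 3: Flt(x) = Org(x) - floor((Org(a) + Org(b)) / 2).
    Org(a) + Org(b) can reach 510, so in C the sum needs a type wider than
    one byte. The result is taken modulo 256 to fit back into a byte."""
    return (org_x - (org_a + org_b) // 2) % 256

def recon_average(flt_x, rc_a, rc_b):
    """The inverse: Rc(x) = Flt(x) + floor((Rc(a) + Rc(b)) / 2), mod 256."""
    return (flt_x + (rc_a + rc_b) // 2) % 256
```

Filtering followed by reconstruction returns the original byte, which is the reversibility the filters are required to have.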

The Paeth algorithm first computes a linear estimate from the three neighboring bytes (left, above, and upper-left), then selects the neighbor closest to that estimate as the predictor. Be careful not to overflow. Its function is as follows:
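A sketch of the Paeth predictor as commonly implemented (ties prefer the left neighbor, then the one above):

```python
def paeth_predictor(a, b, c):
    """Pick whichever of a (left), b (above), c (upper left) is closest
    to the linear estimate p = a + b - c; ties prefer a, then b."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    if pb <= pc:
        return b
    return c
```

The filtered byte is then Org(x) minus this predictor, modulo 256, and reconstruction adds the predictor back.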

As with the filtering above, the default compression is method 0. Both the filter method and the compression method are recorded in IHDR.

Of course, zlib compression is used here, with deflate as compression method 8 and a sliding window of at most 32768 bytes.

This zlib check value (Adler-32) is different from the CRC check code of PNG blocks; do not confuse them.

The filtered scanlines are packed and compressed into a single zlib data stream, which is split across one or more PNG blocks; unpacking those blocks and concatenating them yields the zlib data stream again.

Of course, this also enables streaming reads. A zlib data stream can be decompressed incrementally: even if it is cut off, the data already decompressed can still be used, which is how the interlaced passes can be displayed early. The Python approach above can therefore be improved as follows:

That is, set up a continuous loop of reading, decompressing, and parsing.
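A sketch of the incremental decompression this relies on, using Python's `zlib.decompressobj` (the payload here is just an illustrative stand-in for filtered scanlines):

```python
import zlib

# Feed the compressed stream to one decompressobj piece by piece; output
# becomes available as input arrives, without waiting for the end.
original = b"filtered scanline bytes " * 100
compressed = zlib.compress(original)
half = len(compressed) // 2

d = zlib.decompressobj()
early = d.decompress(compressed[:half])   # may already yield some output
rest = d.decompress(compressed[half:]) + d.flush()
```

In a real decoder, each IDAT payload would be fed to the same object in turn, and completed scanlines would be unfiltered as soon as they appear.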

The following introduces the 18 blocks of the PNG specification:

IHDR is the first block in PNG data stream. The composition is as follows:

Therefore, the block length of IHDR is 13 and will not change.
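The fixed 13-byte layout can be unpacked in one call; a sketch (the field names and sample values are mine):

```python
import struct

def parse_ihdr(data):
    """Unpack the fixed 13-byte IHDR data (all integers big-endian):
    width (4), height (4), bit depth (1), color type (1),
    compression method (1), filter method (1), interlace method (1)."""
    width, height, depth, color, comp, filt, interlace = struct.unpack(
        ">IIBBBBB", data)
    return {"width": width, "height": height, "bit_depth": depth,
            "color_type": color, "compression": comp,
            "filter_method": filt, "interlace": interlace}

# An illustrative IHDR: 800x600, depth 8, truecolor+alpha, Adam7 interlaced.
ihdr = parse_ihdr(struct.pack(">IIBBBBB", 800, 600, 8, 6, 0, 0, 1))
```

The format string `>IIBBBBB` is exactly 13 bytes, matching the fixed block length above.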

The palette is a two-dimensional array, which can be regarded as Array[n][3]; a color is referenced by its index n.

So n does not exceed 255, and the length of the PLTE data is a multiple of 3.

In any case, the palette samples are 8 bits deep. Even if the image's depth is 1, 2, or 4, the palette is still 8.

All IDAT blocks together are a zlib data stream, see [10 compression ](# 10 compression).

This block's data is empty; it marks the end of the PNG data stream. Of course, if this block is damaged, the image can usually still be decoded.

This is the block carrying transparency information; it has three forms:

Grayscale mode (any pixel equal to this gray value is treated as transparent)

True-color mode (the transparent color is given by these three sample values)

Indexed mode (here tRNS is equivalent to an alpha table, with up to as many entries as the palette, giving the transparency of each index)

Why do grayscale and true-color modes use 2 bytes per sample? Because they must accommodate a depth of 16 bits, while indexed mode's depth is always 8 or less.

This block sets the CIE chromaticity parameters. Its composition is as follows:

The stored value is 100000 times the actual value.

The CIE chromaticity space is a two-dimensional diagram; the four points for red, green, blue, and white form an approximate triangle that pins down how far colors may deviate.

This block stores a single unsigned int; dividing its value by 100000 gives the actual gamma value.
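A tiny sketch of reading it (45455 is just an illustrative stored value, roughly gamma 1/2.2):

```python
import struct

# gAMA stores the gamma value multiplied by 100000 as one big-endian
# unsigned int; cHRM uses the same x100000 convention for its coordinates.
stored = struct.pack(">I", 45455)
gamma = struct.unpack(">I", stored)[0] / 100000
```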

This block is used to set the ICC description.

PNG only supports fixed sample depths. If the source image's depth does not match, it is forcibly rescaled, but sBIT preserves the original significant bits here so the source image can be restored. (So it is generally not needed; why convert a standard image back into a non-standard one?)

Different channel numbers have different sBIT lengths.

This block declares that the sRGB color space is used, with no ICC profile needed to describe it. sRGB contains a single unsigned byte indicating the rendering intent.

The meaning of this value is as follows:

gAMA and cHRM are recommended alongside sRGB, because some decoders do not support sRGB and can fall back on them for compatibility.

These are the keywords used in text messages. A keyword is not deeply significant, just a label that can be chosen freely, but standard keywords can be recognized by image software.

The text contains the following components:

Of course, the compression method here is also 0, and zlib is used to decompress the data that follows.

The internationalized text block is a bit more advanced.

For language tags, see RFC 3066, ISO 646, and ISO 639.

Background color.

The histogram gives the approximate usage frequency of each color in the palette.

If the PNG browser cannot provide all the colors in the palette, the histogram can help create a similar palette.

Of course, a viewer that cannot display the full palette can only approximate it.

This block is used to represent the actual size of pixels on the screen. Its structure is as follows:

There are two unit settings. If 0, the block only expresses an aspect ratio, not real sizes; if 1, the unit is the meter, i.e. the values say how many pixels one meter contains.

The channel length is determined by the sample depth: two bytes at depth 16, one byte at depth 8 or below.

Palette names are case-sensitive and are limited by keyword parameters.

In grayscale PNG images, each entry usually has equal red, green, and blue values, but this is not required.

Each frequency value is proportional to the fraction of the image's pixels using that color; it is not an absolute count.

Universal Time (UTC) is used here.

The formula behind it is too complicated. Forget it; I will just mention that it exists.