This page presents a multimedia technology course project.

1. Introduction

Images play an important role in the digital age. Whether in daily life or in professional settings, the demand for storing and transmitting images keeps growing. Raw image data, however, usually occupies a large amount of storage space and is inefficient to transmit, so effective image compression is essential. This report explains the principles and implementation of JPEG (Joint Photographic Experts Group) image compression in detail, using code examples to help the reader understand this widely used lossy compression standard.

2. Image Storage and Compression

2.1 Image Storage

An image is made up of pixels, each of which carries color information. Color images are usually represented with three RGB channels (red, green, blue) or four RGBA channels (red, green, blue, alpha), where the alpha channel describes transparency. In a three-channel image, each pixel is described by three values; RGB(255, 255, 255), for example, is white.

2.2 Image Compression

To store and transmit images efficiently, the image data must be compressed. Depending on how faithfully the image is preserved, image compression is divided into lossless and lossy compression.

2.2.1 Lossless Compression

The basic idea behind lossless compression is that identical color information only needs to be stored once. Lossless compression guarantees that the decompressed data is exactly identical to the original, so the process is fully reversible. Lossless algorithms work by removing or reducing redundancy in the data; common lossless formats include PNG and GIF. In typical cases, lossless compression shrinks a file to about 1/2 to 1/4 of its original size.

2.2.2 Lossy Compression

Lossy compression discards part of the image information during compression, so the decompressed image is not identical to the original; the process is irreversible. By dropping details that the human eye barely notices, lossy compression reduces file size dramatically while trying to preserve perceived quality. Common lossy formats include JPEG and WebP. Because information is discarded, the compression ratio can be very high, at the cost of some loss in image quality.

2.3 Image Formats

Common image formats fall into lossy and lossless categories:

  • Lossy formats: JPEG, WebP (WebP also offers a lossless mode)
  • Lossless formats: PNG, BMP, GIF

Image formats are usually identified by the file extension, but the extension is not always accurate. The actual format can be determined by inspecting the header bytes of the image data. For example:

  • JPEG: starts with 0xFF D8 and ends with 0xFF D9.
  • PNG: starts with 0x89 50 4E 47 0D 0A 1A 0A and ends with 00 00 00 00 49 45 4E 44 AE 42 60 82 (the IEND chunk).
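
As a quick illustration, the following sketch (the file path is only an example) reads the leading bytes of a file and matches them against the signatures listed above:

def sniff_image_format(path):
    """Guess an image format from its leading magic bytes."""
    with open(path, 'rb') as f:
        header = f.read(8)
    if header.startswith(b'\xff\xd8'):
        return 'JPEG'
    if header.startswith(b'\x89PNG\r\n\x1a\n'):   # 89 50 4E 47 0D 0A 1A 0A
        return 'PNG'
    return 'unknown'

print(sniff_image_format('data/dog.jpg'))   # expected: JPEG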

3. JPEG Compression

3.1 JPEG Overview

JPEG is a widely used lossy image compression standard that is particularly well suited to natural images. JPEG files commonly use the extensions .jpg and .jpeg, which are equivalent. JPEG compression reduces file size significantly by removing information the human eye barely perceives, while trying to preserve image quality.

3.2 JPEG Compression Steps

JPEG compression consists of several steps, each of which reduces the amount of image data in a different way while trying to preserve image quality. The principle and implementation of each step are described below.

3.2.1 Color Space Conversion

The human eye is more sensitive to luminance than to chrominance, so JPEG first converts the image from the RGB color space to the YUV (YCbCr) color space, where:

  • Y: luminance (brightness), i.e. the grayscale value.
  • U, V: chrominance, describing the color and saturation of the image (U corresponds to the blue-difference component Cb, V to the red-difference component Cr).

The conversion formulas are:

$$
Y = 0.299R + 0.587G + 0.114B
$$

$$
U = -0.1687R - 0.3313G + 0.5B + 128
$$

$$
V = 0.5R - 0.4187G - 0.0813B + 128
$$

Python implementation:

import cv2
import numpy as np

def rgb_to_yuv(image):
    """Convert an RGB image to the YUV (YCbCr) color space."""
    img = np.array(image, dtype=np.float32)
    R = img[:, :, 0]
    G = img[:, :, 1]
    B = img[:, :, 2]
    Y = np.round(0.299 * R + 0.587 * G + 0.114 * B)
    U = np.round(-0.1687 * R - 0.3313 * G + 0.5 * B + 128)
    V = np.round(0.5 * R - 0.4187 * G - 0.0813 * B + 128)
    Y = np.clip(Y, 0, 255).astype(np.uint8)
    U = np.clip(U, 0, 255).astype(np.uint8)
    V = np.clip(V, 0, 255).astype(np.uint8)
    return Y, U, V

def yuv_to_rgb(Y, U, V):
    """Convert YUV back to RGB."""
    Y = Y.astype(np.float32)
    U = U.astype(np.float32) - 128
    V = V.astype(np.float32) - 128
    R = Y + 1.402 * V
    G = Y - 0.344136 * U - 0.714136 * V
    B = Y + 1.772 * U
    R = np.clip(R, 0, 255).astype(np.uint8)
    G = np.clip(G, 0, 255).astype(np.uint8)
    B = np.clip(B, 0, 255).astype(np.uint8)
    return np.stack([R, G, B], axis=2)

# Example
if __name__ == "__main__":
    # cv2.imread returns BGR, so convert to RGB before the manual conversion
    image = cv2.cvtColor(cv2.imread('data/dog.jpg'), cv2.COLOR_BGR2RGB)
    Y, U, V = rgb_to_yuv(image)

    # Save the Y, U and V channels
    cv2.imwrite('Y.png', Y)
    cv2.imwrite('U.png', U)
    cv2.imwrite('V.png', V)
    cv2.imwrite('YUV.png', cv2.merge([Y, U, V]))   # stacked for visualization only
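
As a sanity check (not part of the codec itself), the manual conversion can be compared against OpenCV's built-in one. Note that OpenCV loads images in BGR order and returns YCrCb with channels ordered Y, Cr, Cb, so the sketch below reorders the channels; the coefficients are the same BT.601 ones, so the results should agree up to rounding:

# Cross-check against OpenCV's built-in conversion
bgr = cv2.imread('data/dog.jpg')                 # OpenCV loads images as BGR
rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)
Y, U, V = rgb_to_yuv(rgb)

ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)   # channel order: Y, Cr, Cb
Y_cv, V_cv, U_cv = cv2.split(ycrcb)              # Cr plays the role of V, Cb of U

print(np.abs(Y.astype(int) - Y_cv.astype(int)).max())   # differences should be small (rounding only)
print(np.abs(U.astype(int) - U_cv.astype(int)).max())
print(np.abs(V.astype(int) - V_cv.astype(int)).max())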

3.2.2 Downsampling

Because the human eye is less sensitive to chrominance than to luminance, JPEG exploits this by downsampling the U and V chroma components to reduce the amount of data. The most common scheme is 4:2:0, in which both the horizontal and vertical resolution of the chroma components are halved.

Python implementation:

def downsample(Y, U, V):
    """4:2:0 downsampling of the chroma components (simple decimation)."""
    U_ds = U[::2, ::2]
    V_ds = V[::2, ::2]
    return Y, U_ds, V_ds

def upsample(U_ds, V_ds):
    """Upsample the chroma components by pixel replication."""
    U = U_ds.repeat(2, axis=0).repeat(2, axis=1)
    V = V_ds.repeat(2, axis=0).repeat(2, axis=1)
    return U, V

# Example
if __name__ == "__main__":
    image = cv2.cvtColor(cv2.imread('data/dog.jpg'), cv2.COLOR_BGR2RGB)
    Y, U, V = rgb_to_yuv(image)
    Y_ds, U_ds, V_ds = downsample(Y, U, V)

    # Save the downsampled chroma components
    cv2.imwrite('Y_downsampled.png', Y_ds)
    cv2.imwrite('U_downsampled.png', U_ds)
    cv2.imwrite('V_downsampled.png', V_ds)
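
A rough way to see what 4:2:0 buys: for every 2x2 block of pixels, 4 luminance samples plus 1 U and 1 V sample are kept instead of 12 samples, i.e. about half the raw data. Continuing from the example above:

# Count samples before and after 4:2:0 subsampling (illustration only)
before = 3 * Y.size                         # Y, U, V at full resolution
after = Y.size + U_ds.size + V_ds.size      # full-resolution Y plus quarter-size U and V
print(f"samples: {before} -> {after} (ratio {after / before:.2f})")   # roughly 0.5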

3.2.3 Discrete Cosine Transform (DCT)

Human vision is not very sensitive to high-frequency information. The discrete cosine transform (DCT) converts the image from the spatial domain to the frequency domain, separating its low- and high-frequency content. JPEG splits the image into 8x8 pixel blocks and applies a two-dimensional DCT to each block, which concentrates the signal energy in the low-frequency coefficients.

The two-dimensional DCT is defined as:

$$
F(u, v) = \frac{1}{4} C(u) C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x, y) \cos \left[\frac{(2x+1)u\pi}{16}\right] \cos \left[\frac{(2y+1)v\pi}{16}\right]
$$

where $C(u) = \frac{1}{\sqrt{2}}$ for $u = 0$ and $C(u) = 1$ otherwise.

Python implementation:

import math
import numpy as np
import cv2

def alpha(u):
    """Normalization factor C(u) in the DCT formula."""
    return 1 / math.sqrt(2) if u == 0 else 1

def DCT_block(block):
    """Apply a 2-D DCT to an 8x8 block."""
    block = block.astype(np.float32) - 128   # level shift to center values around 0
    dct = np.zeros((8, 8), dtype=np.float32)
    for u in range(8):
        for v in range(8):
            sum_val = 0.0
            for x in range(8):
                for y in range(8):
                    sum_val += block[x, y] * math.cos((2*x + 1) * u * math.pi / 16) * math.cos((2*y + 1) * v * math.pi / 16)
            dct[u, v] = 0.25 * alpha(u) * alpha(v) * sum_val
    return np.round(dct)

def apply_dct(image):
    """Apply the DCT block by block over a single-channel image."""
    h, w = image.shape
    # Pad the image so that its height and width are multiples of 8
    h_padded = h if h % 8 == 0 else h + (8 - h % 8)
    w_padded = w if w % 8 == 0 else w + (8 - w % 8)
    padded = np.zeros((h_padded, w_padded), dtype=np.float32)
    padded[:h, :w] = image
    blocks = []
    for i in range(0, h_padded, 8):
        for j in range(0, w_padded, 8):
            block = padded[i:i+8, j:j+8]
            dct_block = DCT_block(block)
            blocks.append(dct_block)
    return blocks

# Example
if __name__ == "__main__":
    image = cv2.cvtColor(cv2.imread('data/dog.jpg'), cv2.COLOR_BGR2RGB)
    Y, U, V = rgb_to_yuv(image)
    Y_ds, U_ds, V_ds = downsample(Y, U, V)
    dct_blocks = apply_dct(Y_ds)

    # Print the first DCT block
    print("First DCT block:\n", dct_blocks[0])
First DCT block:
[[ 39. 4. -4. 0. 0. 2. -2. -1.]
[ -1. -1. 0. 0. 0. 0. 1. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0.]]
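
The quadruple loop above is easy to follow but slow. As an optional alternative, OpenCV's cv2.dct computes the same orthonormal 2-D DCT, so a faster block transform can be sketched as follows (a sketch assuming the same 8x8 float blocks as above):

def DCT_block_fast(block):
    """Same transform as DCT_block, using OpenCV's orthonormal 2-D DCT."""
    shifted = block.astype(np.float32) - 128   # level shift, as before
    return np.round(cv2.dct(shifted))

# For any 8x8 block b, the two implementations should agree up to rounding, e.g.:
# print(np.abs(DCT_block_fast(b) - DCT_block(b)).max())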

3.2.4 Quantization

Quantization is the key lossy step in JPEG compression. It divides the DCT coefficients by a quantization matrix and rounds the results, reducing their precision and hence the file size. The quantization matrix is predefined and is usually chosen according to the desired output quality: the larger a value in the matrix, the more of the corresponding frequency information is discarded, giving a higher compression ratio and a more visible drop in image quality.

Standard quantization matrix (luminance):

QY = np.array([
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99]
])

Python implementation:

def quantize(blocks, Q):
    """Quantize DCT coefficient blocks with the matrix Q."""
    quantized = []
    for block in blocks:
        quant = np.round(block / Q).astype(np.int32)
        quantized.append(quant)
    return quantized

# Example
if __name__ == "__main__":
    # Continuing from the code above
    quantized_blocks = quantize(dct_blocks, QY)

    # Print the first quantized block
    print("First quantized block:\n", quantized_blocks[0])
First quantized block:
[[ 2. 0. -0. 0. 0. 0. -0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0.]]
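
The matrix above corresponds to one fixed quality level. As a sketch of how encoders derive a matrix from a 1-100 quality setting, the scaling rule used by the IJG/libjpeg reference implementation looks roughly like this (it is not used by the code above):

def scale_quant_matrix(Q, quality):
    """Scale a base quantization matrix by a 1-100 quality setting (IJG-style)."""
    quality = max(1, min(100, quality))
    scale = 5000 / quality if quality < 50 else 200 - 2 * quality
    Q_scaled = np.floor((Q * scale + 50) / 100)
    return np.clip(Q_scaled, 1, 255).astype(np.int32)

print(scale_quant_matrix(QY, 90))   # smaller values -> finer quantization, better quality
print(scale_quant_matrix(QY, 10))   # larger values  -> coarser quantization, smaller files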

3.2.5 Zigzag Ordering

In a quantized 8x8 block, the low-frequency information is concentrated in the top-left corner and the high-frequency information toward the bottom-right. To compress the data further, JPEG reads the block in a zigzag order, turning the 2-D matrix into a 1-D array in which the zero values end up grouped at the end, which makes the subsequent encoding more effective.

Zigzag scan order (each cell shows its position in the 1-D output):

00 01 05 06 14 15 27 28
02 04 07 13 16 26 29 42
03 08 12 17 25 30 41 43
09 11 18 24 31 40 44 53
10 19 23 32 39 45 52 54
20 22 33 38 46 51 55 60
21 34 37 47 50 56 59 61
35 36 48 49 57 58 62 63

Python implementation:

def zigzag_order(block):
    """Read an 8x8 block in zigzag order."""
    zigzag_indices = [
        (0,0),(0,1),(1,0),(2,0),(1,1),(0,2),(0,3),(1,2),
        (2,1),(3,0),(4,0),(3,1),(2,2),(1,3),(0,4),(0,5),
        (1,4),(2,3),(3,2),(4,1),(5,0),(6,0),(5,1),(4,2),
        (3,3),(2,4),(1,5),(0,6),(0,7),(1,6),(2,5),(3,4),
        (4,3),(5,2),(6,1),(7,0),(7,1),(6,2),(5,3),(4,4),
        (3,5),(2,6),(1,7),(2,7),(3,6),(4,5),(5,4),(6,3),
        (7,2),(7,3),(6,4),(5,5),(4,6),(3,7),(4,7),(5,6),
        (6,5),(7,4),(7,5),(6,6),(5,7),(6,7),(7,6),(7,7)
    ]
    return [block[i][j] for i, j in zigzag_indices]

def zigzag(blocks):
    """Apply zigzag ordering to every block."""
    zigzagged = []
    for block in blocks:
        z = zigzag_order(block)
        zigzagged.append(z)
    return zigzagged

# Example
if __name__ == "__main__":
    # Continuing from the code above
    zigzagged_blocks = zigzag(quantized_blocks)

    # Print the first block after zigzag ordering
    print("First block after zigzag ordering:\n", zigzagged_blocks[0])
First block after zigzag ordering:
[2.0, 0.0, -0.0, 0.0, 0.0, 0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
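
The hard-coded index table can also be generated programmatically by walking the anti-diagonals of the block and alternating direction; the small sketch below should reproduce the zigzag_indices table used above:

def make_zigzag_indices(n=8):
    """Generate the zigzag scan order by walking the anti-diagonals of an n x n block."""
    order = []
    for s in range(2 * n - 1):
        diagonal = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            diagonal.reverse()   # even diagonals run bottom-left to top-right
        order.extend(diagonal)
    return order

# make_zigzag_indices() should equal the zigzag_indices table hard-coded in zigzag_order
print(make_zigzag_indices()[:10])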

3.2.6 Differential Pulse-Code Modulation (DPCM) of the DC Coefficients

In the zigzag-ordered data, the first element is the DC coefficient, which represents the average brightness of the block. Because the DC coefficients of neighboring blocks change only slightly, JPEG encodes them with differential pulse-code modulation (DPCM): instead of the DC value itself, only its difference from the previous block's DC value is stored, which reduces the amount of data.

Python implementation:

def DPCM_dc(zigzagged):
    """DPCM-encode the DC coefficients of the zigzag-ordered blocks."""
    dpcm = []
    previous = 0
    for block in zigzagged:
        current = block[0]
        diff = current - previous
        dpcm.append(diff)
        previous = current
    return dpcm

# Example
if __name__ == "__main__":
    # Continuing from the code above
    dpcm_dc = DPCM_dc(zigzagged_blocks)

    # Print the first ten DC differences
    print("First ten DC differences:\n", dpcm_dc[:10])

Output:

First ten DC differences:
[50.0, -2.0, -13.0, -7.0, -3.0, 0.0, -1.0, 0.0, -1.0, -2.0]
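
Decoding is the mirror image: the differences are accumulated back into absolute DC values. A quick round-trip check, continuing from the example above:

def inverse_dpcm_dc(dpcm):
    """Rebuild absolute DC values from the stored differences."""
    dc = []
    previous = 0
    for diff in dpcm:
        previous += diff
        dc.append(previous)
    return dc

# The recovered DC values should equal the originals
original_dc = [block[0] for block in zigzagged_blocks]
print(inverse_dpcm_dc(dpcm_dc) == original_dc)   # expected: True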

3.2.7 Run-Length Encoding (RLE) of the AC Coefficients

The AC coefficients describe the detail within a block; after quantization most of the remaining energy sits in the low-frequency coefficients, while the high-frequency coefficients are mostly zero. JPEG encodes the AC coefficients with run-length encoding (RLE), storing pairs of (number of preceding zeros, non-zero value). This compresses the long runs of zeros in the high-frequency part very effectively.

Python implementation:

def RLE_ac(zigzagged):
    """Run-length encode the AC coefficients of the zigzag-ordered blocks."""
    rle = []
    for block in zigzagged:
        ac = []
        zero_count = 0
        for coef in block[1:]:
            if coef == 0:
                zero_count += 1
            else:
                while zero_count > 15:
                    ac.append((15, 0))   # ZRL: a run of 16 zeros
                    zero_count -= 16
                ac.append((zero_count, coef))
                zero_count = 0
        if zero_count > 0:
            ac.append((0, 0))            # EOB: End of Block
        rle.append(ac)
    return rle

# Example
if __name__ == "__main__":
    # Continuing from the code above
    rle_ac = RLE_ac(zigzagged_blocks)

    # Print the RLE encoding of the first block
    print("First RLE-encoded block:\n", rle_ac[0])

Output:

First RLE-encoded block:
[(0, 0)]

3.2.8 Entropy Coding

JPEG then applies Huffman coding to the quantized data as an entropy-coding stage to shrink it further. Huffman coding assigns shorter codewords to frequently occurring symbols and longer codewords to rare ones. In the JPEG standard, the DC and AC coefficients use separate Huffman tables, and the luminance and chrominance components use different tables as well.

Note: the code in this report simplifies the entropy-coding stage and does not include a full Huffman implementation. A real encoder would use an entropy-coding library or implement the Huffman algorithm itself.
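
To give a feel for the idea, the sketch below builds a generic Huffman prefix code from symbol frequencies. It is only an illustration of the principle: real JPEG entropy coding works on the (run/size, amplitude) symbols and the specific Huffman table format defined in the standard, which this sketch does not implement.

import heapq
from collections import Counter

def build_huffman_codes(symbols):
    """Build a prefix code from symbol frequencies (generic Huffman, not JPEG's tables)."""
    freq = Counter(symbols)
    if len(freq) == 1:   # degenerate case: only one distinct symbol
        return {next(iter(freq)): '0'}
    # Heap entries: (frequency, tie-breaker, {symbol: code-so-far})
    heap = [(f, i, {sym: ''}) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, codes1 = heapq.heappop(heap)
        f2, _, codes2 = heapq.heappop(heap)
        merged = {s: '0' + c for s, c in codes1.items()}
        merged.update({s: '1' + c for s, c in codes2.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]

# Frequent symbols receive shorter codes (the exact codes may vary)
print(build_huffman_codes([0, 0, 0, 0, 1, 1, 2, 3]))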

3.3 Full Implementation

Putting the steps above together, the following is a simplified Python implementation of JPEG compression and decompression. It covers color space conversion, downsampling, DCT, quantization, zigzag ordering, DPCM and RLE, but not Huffman coding or other more advanced optimizations.

import cv2
import numpy as np
import math

# Standard luminance quantization matrix
QY = np.array([
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99]
])

# Zigzag scan order for an 8x8 block
ZIGZAG_INDICES = [
    (0,0),(0,1),(1,0),(2,0),(1,1),(0,2),(0,3),(1,2),
    (2,1),(3,0),(4,0),(3,1),(2,2),(1,3),(0,4),(0,5),
    (1,4),(2,3),(3,2),(4,1),(5,0),(6,0),(5,1),(4,2),
    (3,3),(2,4),(1,5),(0,6),(0,7),(1,6),(2,5),(3,4),
    (4,3),(5,2),(6,1),(7,0),(7,1),(6,2),(5,3),(4,4),
    (3,5),(2,6),(1,7),(2,7),(3,6),(4,5),(5,4),(6,3),
    (7,2),(7,3),(6,4),(5,5),(4,6),(3,7),(4,7),(5,6),
    (6,5),(7,4),(7,5),(6,6),(5,7),(6,7),(7,6),(7,7)
]

def rgb_to_yuv(image):
    """Convert an RGB image to the YUV (YCbCr) color space."""
    img = np.array(image, dtype=np.float32)
    R = img[:, :, 0]
    G = img[:, :, 1]
    B = img[:, :, 2]
    Y = np.round(0.299 * R + 0.587 * G + 0.114 * B)
    U = np.round(-0.1687 * R - 0.3313 * G + 0.5 * B + 128)
    V = np.round(0.5 * R - 0.4187 * G - 0.0813 * B + 128)
    Y = np.clip(Y, 0, 255).astype(np.uint8)
    U = np.clip(U, 0, 255).astype(np.uint8)
    V = np.clip(V, 0, 255).astype(np.uint8)
    return Y, U, V

def yuv_to_rgb(Y, U, V):
    """Convert YUV back to RGB."""
    Y = Y.astype(np.float32)
    U = U.astype(np.float32) - 128
    V = V.astype(np.float32) - 128
    R = Y + 1.402 * V
    G = Y - 0.344136 * U - 0.714136 * V
    B = Y + 1.772 * U
    R = np.clip(R, 0, 255).astype(np.uint8)
    G = np.clip(G, 0, 255).astype(np.uint8)
    B = np.clip(B, 0, 255).astype(np.uint8)
    return np.stack([R, G, B], axis=2)

def downsample(Y, U, V):
    """4:2:0 downsampling of the chroma components."""
    U_ds = U[::2, ::2]
    V_ds = V[::2, ::2]
    return Y, U_ds, V_ds

def upsample(U_ds, V_ds):
    """Upsample the chroma components by pixel replication."""
    U = U_ds.repeat(2, axis=0).repeat(2, axis=1)
    V = V_ds.repeat(2, axis=0).repeat(2, axis=1)
    return U, V

def alpha(u):
    """Normalization factor C(u) in the DCT formula."""
    return 1 / math.sqrt(2) if u == 0 else 1

def DCT_block(block):
    """Apply a 2-D DCT to an 8x8 block."""
    block = block.astype(np.float32) - 128   # level shift
    dct = np.zeros((8, 8), dtype=np.float32)
    for u in range(8):
        for v in range(8):
            sum_val = 0.0
            for x in range(8):
                for y in range(8):
                    sum_val += block[x, y] * math.cos((2*x + 1) * u * math.pi / 16) * math.cos((2*y + 1) * v * math.pi / 16)
            dct[u, v] = 0.25 * alpha(u) * alpha(v) * sum_val
    return np.round(dct)

def IDCT_block(block):
    """Inverse 2-D DCT of an 8x8 coefficient block."""
    idct = np.zeros((8, 8), dtype=np.float32)
    for x in range(8):
        for y in range(8):
            sum_val = 0.0
            for u in range(8):
                for v in range(8):
                    sum_val += alpha(u) * alpha(v) * block[u, v] * math.cos((2*x + 1) * u * math.pi / 16) * math.cos((2*y + 1) * v * math.pi / 16)
            idct[x, y] = 0.25 * sum_val + 128   # undo the level shift after the transform
    return np.clip(np.round(idct), 0, 255).astype(np.uint8)

def apply_dct(image):
    """Apply the DCT block by block over a single-channel image."""
    h, w = image.shape
    h_padded = h if h % 8 == 0 else h + (8 - h % 8)
    w_padded = w if w % 8 == 0 else w + (8 - w % 8)
    padded = np.zeros((h_padded, w_padded), dtype=np.float32)
    padded[:h, :w] = image
    blocks = []
    for i in range(0, h_padded, 8):
        for j in range(0, w_padded, 8):
            block = padded[i:i+8, j:j+8]
            blocks.append(DCT_block(block))
    return blocks

def quantize(blocks, Q):
    """Quantize DCT coefficient blocks with the matrix Q."""
    return [np.round(block / Q).astype(np.int32) for block in blocks]

def dequantize(blocks, Q):
    """Undo quantization by multiplying with the matrix Q."""
    return [block * Q for block in blocks]

def zigzag_order(block):
    """Read an 8x8 block in zigzag order."""
    return [block[i][j] for i, j in ZIGZAG_INDICES]

def inverse_zigzag(z):
    """Rebuild an 8x8 block from its zigzag-ordered list."""
    block = np.zeros((8, 8), dtype=np.float32)
    for idx, (i, j) in enumerate(ZIGZAG_INDICES):
        block[i, j] = z[idx]
    return block

def zigzag(blocks):
    """Apply zigzag ordering to every block."""
    return [zigzag_order(block) for block in blocks]

def DPCM_dc(zigzagged):
    """DPCM-encode the DC coefficients."""
    dpcm = []
    previous = 0
    for block in zigzagged:
        current = block[0]
        dpcm.append(current - previous)
        previous = current
    return dpcm

def RLE_ac(zigzagged):
    """Run-length encode the AC coefficients."""
    rle = []
    for block in zigzagged:
        ac = []
        zero_count = 0
        for coef in block[1:]:
            if coef == 0:
                zero_count += 1
            else:
                while zero_count > 15:
                    ac.append((15, 0))   # ZRL: a run of 16 zeros
                    zero_count -= 16
                ac.append((zero_count, coef))
                zero_count = 0
        if zero_count > 0:
            ac.append((0, 0))            # EOB: End of Block
        rle.append(ac)
    return rle

def merge_blocks(blocks, image_shape):
    """Stitch 8x8 blocks back into a single-channel image."""
    h, w = image_shape
    h_padded = h if h % 8 == 0 else h + (8 - h % 8)
    w_padded = w if w % 8 == 0 else w + (8 - w % 8)
    image = np.zeros((h_padded, w_padded), dtype=np.uint8)
    idx = 0
    for i in range(0, h_padded, 8):
        for j in range(0, w_padded, 8):
            image[i:i+8, j:j+8] = blocks[idx]
            idx += 1
    return image[:h, :w]

def jpeg_compress(image_path):
    """Simplified JPEG compression (only the Y channel is transformed and quantized)."""
    image = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2RGB)
    Y, U, V = rgb_to_yuv(image)
    Y_ds, U_ds, V_ds = downsample(Y, U, V)
    dct_blocks = apply_dct(Y_ds)
    quantized_blocks = quantize(dct_blocks, QY)
    zigzagged_blocks = zigzag(quantized_blocks)
    dpcm_dc = DPCM_dc(zigzagged_blocks)
    rle_ac = RLE_ac(zigzagged_blocks)
    # Simplification: return the encoded DC/AC coefficients together with the
    # (uncompressed) downsampled chroma and the Y shape needed for reconstruction
    return dpcm_dc, rle_ac, U_ds, V_ds, Y.shape

def jpeg_decompress(dpcm_dc, rle_ac, U_ds, V_ds, image_shape):
    """Simplified JPEG decompression."""
    # Rebuild the DC coefficients from the DPCM differences
    dc = []
    previous = 0
    for diff in dpcm_dc:
        previous += diff
        dc.append(previous)
    # Rebuild the zigzag-ordered coefficient lists from the RLE pairs
    zigzagged = []
    for i in range(len(dc)):
        block = [dc[i]]
        for run, coef in rle_ac[i]:
            if run == 0 and coef == 0:   # End of Block
                break
            block.extend([0] * run)
            block.append(coef)
        block.extend([0] * (64 - len(block)))   # pad the remaining zeros
        zigzagged.append(block)
    # Inverse zigzag, dequantize and inverse DCT
    quantized_blocks = [inverse_zigzag(z) for z in zigzagged]
    dequantized_blocks = dequantize(quantized_blocks, QY)
    reconstructed_blocks = [IDCT_block(block) for block in dequantized_blocks]
    # Merge the blocks back into the Y channel
    reconstructed_Y = merge_blocks(reconstructed_blocks, image_shape)
    # Upsample the chroma and crop it to the Y channel's size
    h, w = image_shape
    U_rec, V_rec = upsample(U_ds, V_ds)
    U_rec, V_rec = U_rec[:h, :w], V_rec[:h, :w]
    # Convert back to RGB
    return yuv_to_rgb(reconstructed_Y, U_rec, V_rec)

def jpeg_compress_decompress(image_path, output_path):
    dpcm_dc, rle_ac, U_ds, V_ds, y_shape = jpeg_compress(image_path)
    reconstructed_image = jpeg_decompress(dpcm_dc, rle_ac, U_ds, V_ds, y_shape)
    # cv2.imwrite expects BGR, so convert the reconstructed RGB image first
    cv2.imwrite(output_path, cv2.cvtColor(reconstructed_image, cv2.COLOR_RGB2BGR))
    print(f"Reconstructed image saved to {output_path}")

# Usage example
if __name__ == "__main__":
    input_image = 'data/dog.jpg'   # input image path
    output_image = 'output.jpg'    # output image path
    jpeg_compress_decompress(input_image, output_image)

Note: the code above is a simplified version that implements the basic JPEG pipeline but leaves out entropy coding and further optimizations. The real JPEG standard contains many more details, such as Huffman coding and the choice of different quantization tables.

Running the code compresses and decompresses the original image, so you can observe the effect of compression and the change in image quality.
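
To quantify that change, the reconstruction can be compared against the original with PSNR. A minimal sketch, continuing from the script above (cv2, numpy and math already imported); the file names match the usage example:

def psnr(original, reconstructed):
    """Peak signal-to-noise ratio between two 8-bit images, in dB (higher is better)."""
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float('inf')
    return 10 * math.log10(255.0 ** 2 / mse)

original = cv2.imread('data/dog.jpg')
reconstructed = cv2.imread('output.jpg')
if original is not None and reconstructed is not None and original.shape == reconstructed.shape:
    print(f"PSNR: {psnr(original, reconstructed):.2f} dB")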

4. Online Demo

Online demo: click here to try JPEG image compression in your browser.