The Go Blog

The Go image package

Nigel Tao
21 September 2011

Introduction

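The image and image/color packages define a number of types: color.Color and color.Model describe colors, image.Point and image.Rectangle describe basic 2-D geometry, and image.Image brings the two concepts together to represent a rectangular grid of colors. A separate article covers image composition with the image/draw package.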

Colors and Color Models

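Color is an interface that defines the minimal method set of any type that can be considered a color: one that can be converted to red, green, blue and alpha values. The conversion may be lossy, such as when converting from the CMYK or YCbCr color spaces.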

type Color interface {
    // RGBA returns the alpha-premultiplied red, green, blue and alpha values
    // for the color. Each value ranges within [0, 0xFFFF], but is represented
    // by a uint32 so that multiplying by a blend factor up to 0xFFFF will not
    // overflow.
    RGBA() (r, g, b, a uint32)
}

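There are three important subtleties about the return values. First, the red, green and blue are alpha-premultiplied: a fully saturated red that is also 25% transparent is represented by RGBA returning a 75% r. Second, the channels have a 16-bit effective range: 100% red is represented by RGBA returning an r of 65535, not 255, so that converting from CMYK or YCbCr is not as lossy. Third, the type returned is uint32, even though the maximum value is 65535, to guarantee that multiplying two values together won't overflow. Such multiplications occur when blending two colors according to an alpha mask from a third color, in the style of Porter and Duff's classic algebra: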

dstr, dstg, dstb, dsta := dst.RGBA()
srcr, srcg, srcb, srca := src.RGBA()
_, _, _, m := mask.RGBA()
const M = 1<<16 - 1
// The resultant red value is a blend of dstr and srcr, and ranges in [0, M].
// The calculation for green, blue and alpha is similar.
dstr = (dstr*(M-m) + srcr*m) / M

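The last line of that code snippet would have been more complicated if we worked with non-alpha-premultiplied colors, which is why Color uses alpha-premultiplied values.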

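The image/color package also defines a number of concrete types that implement the Color interface. For example, RGBA is a struct that represents the classic "8 bits per channel" color.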

type RGBA struct {
    R, G, B, A uint8
}

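Note that the R field of an RGBA is an 8-bit alpha-premultiplied color in the range [0, 255]. RGBA satisfies the Color interface by multiplying that value by 0x101 to generate a 16-bit alpha-premultiplied color in the range [0, 65535]. Similarly, the NRGBA struct type represents an 8-bit non-alpha-premultiplied color, as used by the PNG image format. When manipulating an NRGBA's fields directly, the values are non-alpha-premultiplied, but when calling the RGBA method, the return values are alpha-premultiplied.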

A Model is simply something that can convert Colors to other Colors, possibly lossily. For example, the GrayModel can convert any Color to a desaturated Gray. A Palette can convert any Color to one from a limited palette.

type Model interface {
    Convert(c Color) Color
}

type Palette []Color

Points and Rectangles

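A Point is an (x, y) co-ordinate on the integer grid, with axes increasing right and down. It is neither a pixel nor a grid square. A Point has no intrinsic width, height or color, but the visualizations below use a small colored square.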

type Point struct {
    X, Y int
}

p := image.Point{2, 1}

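A Rectangle is an axis-aligned rectangle on the integer grid, defined by its top-left and bottom-right Point. A Rectangle also has no intrinsic color, but the visualizations below outline rectangles with a thin colored line, and call out their Min and Max Points.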

type Rectangle struct {
    Min, Max Point
}

For convenience, image.Rect(x0, y0, x1, y1) is equivalent to image.Rectangle{image.Point{x0, y0}, image.Point{x1, y1}}, but is much easier to type.

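A Rectangle is inclusive at the top-left and exclusive at the bottom-right. For a Point p and a Rectangle r, p.In(r) if and only if r.Min.X <= p.X && p.X < r.Max.X, and similarly for Y. This is analogous to how a slice s[i0:i1] is inclusive at the low end and exclusive at the high end. (Unlike arrays and slices, a Rectangle often has a non-zero origin.)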

r := image.Rect(2, 1, 5, 5)
// Dx and Dy return a rectangle's width and height.
fmt.Println(r.Dx(), r.Dy(), image.Pt(0, 0).In(r)) // prints 3 4 false

Adding a Point to a Rectangle translates the Rectangle. Points and Rectangles are not restricted to be in the bottom-right quadrant.

r := image.Rect(2, 1, 5, 5).Add(image.Pt(-4, -2))
fmt.Println(r.Dx(), r.Dy(), image.Pt(0, 0).In(r)) // prints 3 4 true

Intersecting two Rectangles yields another Rectangle, which may be empty.

r := image.Rect(0, 0, 4, 3).Intersect(image.Rect(2, 2, 5, 5))
// Size returns a rectangle's width and height, as a Point.
fmt.Printf("%#v\n", r.Size()) // prints image.Point{X:2, Y:1}

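Points and Rectangles are passed and returned by value. A function that takes a Rectangle argument will be as efficient as a function that takes two Point arguments, or four int arguments.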

Images

An Image maps every grid square in a Rectangle to a Color from a Model. "The pixel at (x, y)" refers to the color of the grid square defined by the points (x, y), (x+1, y), (x+1, y+1) and (x, y+1).

type Image interface {
    // ColorModel returns the Image's color model.
    ColorModel() color.Model
    // Bounds returns the domain for which At can return non-zero color.
    // The bounds do not necessarily contain the point (0, 0).
    Bounds() Rectangle
    // At returns the color of the pixel at (x, y).
    // At(Bounds().Min.X, Bounds().Min.Y) returns the upper-left pixel of the grid.
    // At(Bounds().Max.X-1, Bounds().Max.Y-1) returns the lower-right one.
    At(x, y int) color.Color
}

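A common mistake is assuming that an Image's bounds start at (0, 0). For example, an animated GIF contains a sequence of Images, and each Image after the first typically only holds pixel data for the area that changed, and that area doesn't necessarily start at (0, 0). The correct way to iterate over an Image m's pixels looks like: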

b := m.Bounds()
for y := b.Min.Y; y < b.Max.Y; y++ {
    for x := b.Min.X; x < b.Max.X; x++ {
        doStuffWith(m.At(x, y))
    }
}

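Image implementations do not have to be based on an in-memory slice of pixel data. For example, a Uniform is an Image of enormous bounds and uniform color, whose in-memory representation is simply that color.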

type Uniform struct {
    C color.Color
}

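Typically, though, programs will want an image based on a slice. Struct types like RGBA and Gray (which other packages refer to as image.RGBA and image.Gray) hold slices of pixel data and implement the Image interface.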

type RGBA struct {
    // Pix holds the image's pixels, in R, G, B, A order. The pixel at
    // (x, y) starts at Pix[(y-Rect.Min.Y)*Stride + (x-Rect.Min.X)*4].
    Pix []uint8
    // Stride is the Pix stride (in bytes) between vertically adjacent pixels.
    Stride int
    // Rect is the image's bounds.
    Rect Rectangle
}

These types also provide a Set(x, y int, c color.Color) method that allows modifying the image one pixel at a time.

m := image.NewRGBA(image.Rect(0, 0, 640, 480))
m.Set(5, 5, color.RGBA{255, 0, 0, 255})

If you're reading or writing a lot of pixel data, it can be more efficient, but more complicated, to access these struct types' Pix field directly.

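The slice-based Image implementations also provide a SubImage method, which returns an Image backed by the same array. Modifying the pixels of a sub-image will affect the pixels of the original image, analogous to how modifying the contents of a sub-slice s[i0:i1] will affect the contents of the original slice s.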

m0 := image.NewRGBA(image.Rect(0, 0, 8, 5))
m1 := m0.SubImage(image.Rect(1, 2, 5, 5)).(*image.RGBA)
fmt.Println(m0.Bounds().Dx(), m1.Bounds().Dx()) // prints 8 4
fmt.Println(m0.Stride == m1.Stride)             // prints true

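For low-level code that works on an image's Pix field, be aware that ranging over Pix can affect pixels outside an image's bounds. In the example above, the pixels covered by m1.Pix are shaded in blue. Higher-level code, such as the At and Set methods or the image/draw package, will clip their operations to the image's bounds.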

Image Formats

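The standard package library supports a number of common image formats, such as GIF, JPEG and PNG. If you know the format of a source image file, you can decode from an io.Reader directly.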

import (
    "image/jpeg"
    "image/png"
    "io"
)

// convertJPEGToPNG converts from JPEG to PNG.
func convertJPEGToPNG(w io.Writer, r io.Reader) error {
    img, err := jpeg.Decode(r)
    if err != nil {
        return err
    }
    return png.Encode(w, img)
}

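If you have image data of unknown format, the image.Decode function can detect the format. The set of recognized formats is constructed at run time and is not limited to those in the standard package library. An image format package typically registers its format in an init function, and the main package will "underscore import" such a package solely for the side effect of format registration.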

import (
    "image"
    "image/png"
    "io"

    _ "code.google.com/p/vp8-go/webp"
    _ "image/jpeg"
)

// convertToPNG converts from any recognized format to PNG.
func convertToPNG(w io.Writer, r io.Reader) error {
    img, _, err := image.Decode(r)
    if err != nil {
        return err
    }
    return png.Encode(w, img)
}
