Getting lines out of an image

Develop a model of the image

We need a way to model the image in memory on the computer in some way. The standard method is to consider the image as a function that takes in two arguments x and y and returns the intensity of the pixel. The intensity of the pixel in color images typically comes in multiple channels representing a component of the overall color of the pixel.

In most cases for raster graphics we layout the image with the origin (0,0) at the top left corner of the image. The Y coordinate increases going down the image and the X coordinate increases going left to right.

Make the image grayscale

Multiple channels per pixel can make the following operations after this one more difficult. If we can clobber all the channels into a single channel we can simplify things for us later on. In most cases we represent this with an 8-bit unsigned integer giving us a range of 0-255 per pixel.

Blur the image

By blurring the image we reduce the amount of detail in the image. When we are looking for overall features in the image we need to reduce the detail and only have the large features remain.

The blurring operation we do to accomplish this is called a gaussian blur.

Get the gradient

The gradient of an image is the derivative of the image. Areas of the image that have sudden changes in intensity will be local maximums. This helps highlight the particular spots of the image where we might find “edges”. In this context, we define edges as spots on the image where the intensity has a sudden change.

A common operator to use for generating the gradient is called the sobel operator

Generate the edge pixels

The gradient appears to show the edges however stil has some extraneous elements in it that aren’t the features we are looking for. So far we have been following an algorithm called the Canny edge detection algorithm.

Additional operations are done on the image to reduce the non-edge pixels: Non-maximum suppression, Double threshold, and Edge tracking by hysteresis.

The Hough transform

At a high-level the hough transform essentially just takes all the edge pixels and for each edge pixel finds all the possible lines that the edge pixel can be a part of. Using all the possible lines, find the lines that have the most edge pixels in common.

Details about how this is accomplished, Hough transform.