Vehicle Detection and Tracking

https://github.com/bayne/CarDetection

The goals / steps of this project are the following:

  • Perform a Histogram of Oriented Gradients (HOG) feature extraction on a labeled training set of images and train a Linear SVM classifier
  • Optionally, you can also apply a color transform and append binned color features, as well as histograms of color, to your HOG feature vector.
  • Note: for those first two steps don’t forget to normalize your features and randomize a selection for training and testing.
  • Implement a sliding-window technique and use your trained classifier to search for vehicles in images.
  • Run your pipeline on a video stream (start with the test_video.mp4 and later implement on full project_video.mp4) and create a heat map of recurring detections frame by frame to reject outliers and follow detected vehicles.
  • Estimate a bounding box for vehicles detected.


Feature extraction

To detect cars in an image, we need to be able to classify subsamples of the full frame into two categories: car and non-car. A classifier requires features which it uses to determine whether a subsample is a car or not. In this project I focused on a particular type of feature called the Histogram of Oriented Gradients.

Histogram of Oriented Gradients (HOG)

The histogram of oriented gradients (HOG) is a feature descriptor used in computer vision and image processing for the purpose of object detection. The technique counts occurrences of gradient orientation in localized portions of an image. (Wikipedia)

I wrapped the call to the skimage.feature.hog() function in my own class called FeatureExtractor. I intended for this class to encapsulate the logic for extracting features from a given image. It was also useful for optimization in later parts of the pipeline.
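
Below is a rough sketch of what such a wrapper might look like; the constructor arguments and the extract() method are assumptions for illustration, and only the call to skimage.feature.hog() comes from the write-up.

from skimage.feature import hog

class FeatureExtractor:
    """Wraps skimage.feature.hog() so the rest of the pipeline never calls it directly."""

    def __init__(self, orientations=9, pixels_per_cell=8, cells_per_block=2):
        self.orientations = orientations
        self.pixels_per_cell = pixels_per_cell
        self.cells_per_block = cells_per_block

    def extract(self, channel):
        # channel is a single 2D image plane, e.g. the S or V channel of an HSV frame
        return hog(channel,
                   orientations=self.orientations,
                   pixels_per_cell=(self.pixels_per_cell, self.pixels_per_cell),
                   cells_per_block=(self.cells_per_block, self.cells_per_block),
                   feature_vector=True)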

Parameter selection

I chose the HOG parameters through trial and error, manually modifying them until I got better results. The quality of the results was evaluated based on the timeliness and accuracy of the final output. The pixels_per_cell parameter, however, varied with the sliding window scale so that features extracted from the video would match the feature size of the training data set.

Training the classifier

The classifier was trained using only HOG features from the saturation and value channels of the HSV image. I initially only used the saturation channel but later found that including the value channel improved the performance significantly. I used a support vector machine to power the classifier via sklearn.svm.LinearSVC().

I wrapped the classifier in my own class called CarClassifier. The purpose of this was to split the training step out from the rest of the code and limit the pickling to only the CarClassifier.
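
A minimal sketch of the training step, assuming the images are already loaded and a FeatureExtractor-like helper is available; the variable names below (extractor, car_images, noncar_images) are placeholders rather than the actual CarClassifier internals.

import numpy as np
import cv2
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

def hog_features_sv(image_bgr, extractor):
    # HOG features from the saturation and value channels only
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    return np.concatenate([extractor.extract(hsv[:, :, 1]),
                           extractor.extract(hsv[:, :, 2])])

# car_images / noncar_images: lists of BGR training images (placeholders)
X = np.array([hog_features_sv(img, extractor) for img in car_images + noncar_images])
y = np.array([1] * len(car_images) + [0] * len(noncar_images))

scaler = StandardScaler().fit(X)                      # normalize the features
X_train, X_test, y_train, y_test = train_test_split(  # randomized train/test split
    scaler.transform(X), y, test_size=0.2)

classifier = LinearSVC()
classifier.fit(X_train, y_train)
print('Test accuracy:', classifier.score(X_test, y_test))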

Sub-sampling the full frame

Since the goal of the project is to find cars in a given frame, we need some type of mechanism to get subsamples of the full frame. The classifier is fed these subsamples and tells us which is a car and which is not a car. The approach I went with for generating these subsamples was a sliding window search.

Intermediate step (pipeline demonstration): example images of a full frame, a subsample classified as a car, and a subsample classified as not a car.

The simplest approach would be to slide windows of varying sizes across the entire image. Although this would result in the highest number of possible true positives, it would also be too computationally expensive. One of the constraints of this project is to reduce the amount of time required to process each frame.

My implementation of the sliding window search generates several windows with the given attributes:

  • Smaller windows are closer to the middle of the frame (farther away cars are smaller)
  • The top half of the frame is excluded from being searched

I used a couple of tunable parameters to find the best distribution of windows (a sketch of the window generator follows the list):

region

A rectangle that defines where the windows are generated; this is how I prevented the window search from searching in the sky.

window_y_step_ratio

Defined the amount of overlap each window had when it slid in the y direction

window_x_step_ratio

Defined the amount of overlap each window had when it slid in the x direction

scale_rate_y

The rate at which the windows grew as they approached the bottom of the frame

min_window_size

The smallest window size which is found nearest the top of the region
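
A sketch of how these parameters might drive the window generator; the exact layout logic in the project may differ.

def sliding_windows(region, min_window_size, scale_rate_y,
                    window_x_step_ratio, window_y_step_ratio):
    """Yield (x1, y1, x2, y2) search windows that grow toward the bottom of the region."""
    x_min, y_min, x_max, y_max = region
    y, size = y_min, min_window_size
    while y + size <= y_max:
        x = x_min
        while x + size <= x_max:
            yield (int(x), int(y), int(x + size), int(y + size))
            x += size * window_x_step_ratio   # controls the horizontal overlap
        y += size * window_y_step_ratio       # controls the vertical overlap
        size *= scale_rate_y                  # windows grow as they approach the bottom

With scale_rate_y greater than 1, the smallest windows stay near the top of the region, which matches the "farther away cars are smaller" observation above.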

Optimization

The HOG feature extractor is an expensive operation, so optimization efforts were focused on reducing the number of calls to that function. Initially I implemented the feature extractor to extract on demand when given a subsample. Using this method resulted in the same areas of the image having the operation performed on them more than once. I optimized this process as follows (a sketch appears after the list):

  1. Get every window size that will be used
  2. Create a FeatureExtractor for each given window size
  3. Extract the HOG features for each given window size in the region of interest
  4. Subsample the HOG features for any given window
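
A sketch of the idea, assuming the HOG parameters match those used on the training data; the block indexing below is simplified and the real implementation may differ.

from skimage.feature import hog

def region_hog(channel, pixels_per_cell, cells_per_block=2, orientations=9):
    # feature_vector=False keeps the block layout so windows can be sliced out later
    return hog(channel,
               orientations=orientations,
               pixels_per_cell=(pixels_per_cell, pixels_per_cell),
               cells_per_block=(cells_per_block, cells_per_block),
               feature_vector=False)

def window_features(hog_blocks, window, pixels_per_cell, cells_per_block=2):
    x1, y1, x2, y2 = window
    bx, by = x1 // pixels_per_cell, y1 // pixels_per_cell        # pixel -> block coordinates
    n_blocks = (x2 - x1) // pixels_per_cell - cells_per_block + 1
    return hog_blocks[by:by + n_blocks, bx:bx + n_blocks].ravel()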

Development optimization

Since my development cycle involved a significant amount of parameter tuning, it paid off to reduce the time required to generate an output video. The generated HOG features were also saved to disk to be reused in future iterations. This was implemented in a class I refer to as the FeatureExtractorCache.
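
A simple disk cache along these lines would work; the key scheme and file layout here are assumptions rather than the project's actual FeatureExtractorCache.

import os
import pickle

class FeatureExtractorCache:
    """Memoizes HOG extraction results on disk between pipeline runs."""

    def __init__(self, extractor, cache_dir='hog_cache'):
        self.extractor = extractor
        self.cache_dir = cache_dir
        os.makedirs(cache_dir, exist_ok=True)

    def extract(self, key, channel):
        # key identifies the frame and window scale, e.g. 'frame42_scale64'
        path = os.path.join(self.cache_dir, '{}.p'.format(key))
        if os.path.exists(path):
            with open(path, 'rb') as f:
                return pickle.load(f)
        features = self.extractor.extract(channel)
        with open(path, 'wb') as f:
            pickle.dump(features, f)
        return features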

Processing the video

Video of the final output: https://drive.google.com/file/d/0B1CQ1n9EZIF6RjBOOFhYSW1DeDg/view?usp=sharing

Up to this point the pipeline was focused on a single frame. When given a video, there is additional information that can be used to accurately find a car in the image. I utilized a heatmap that persisted from frame to frame and was the underlying structure that powered the eventual annotations (bounding boxes around the cars) on the frame.

Once again I turned the heatmap into a class which I called Heatmap. The heatmap collected the windows that were identified as cars by the classifier and attempted to filter out false positives.

image

The filtering of the false positives was controlled by the following tunable parameters (a sketch follows the list):

warmup_rate

This is the value that gets added to the heatmap when a given pixel is found in a window that was positively identified as a car

cooldown_rate

The rate at which all pixels in the heatmap decrease. This prevents false positives from persisting in the heatmap

threshold

The minimum value of a pixel in the heatmap to be considered as a true positive. This is eventually used when generating the bounding box.
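
A sketch of how the three parameters interact; the default values below are placeholders, not the tuned ones.

import numpy as np

class Heatmap:
    """Persists across frames; pixels heat up where the classifier fires repeatedly."""

    def __init__(self, shape, warmup_rate=10, cooldown_rate=5, threshold=30):
        self.heat = np.zeros(shape, dtype=np.float32)
        self.warmup_rate = warmup_rate
        self.cooldown_rate = cooldown_rate
        self.threshold = threshold

    def update(self, car_windows):
        self.heat = np.maximum(self.heat - self.cooldown_rate, 0)  # cool every pixel
        for x1, y1, x2, y2 in car_windows:
            self.heat[y1:y2, x1:x2] += self.warmup_rate             # warm positive detections

    def thresholded(self):
        # binary mask of pixels hot enough to be considered true positives
        return (self.heat >= self.threshold).astype(np.uint8)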

Bounding box

The bounding box was the final output and what was used to annotate the frame with a true positive of a car. This was relatively straightforward and merely used a contour finding function provided by OpenCV. The contour was generated around the thresholded heatmap values.
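
A sketch of that final step with OpenCV's contour finding; the drawing details are illustrative.

import cv2

def bounding_boxes(binary_heatmap):
    # cv2.findContours returns 2 or 3 values depending on the OpenCV version; [-2] is the contours
    contours = cv2.findContours(binary_heatmap, cv2.RETR_EXTERNAL,
                                cv2.CHAIN_APPROX_SIMPLE)[-2]
    return [cv2.boundingRect(c) for c in contours]  # (x, y, w, h) per detected car

def annotate(frame, boxes):
    for x, y, w, h in boxes:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 0, 255), 3)
    return frame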

Possible problems/improvements

  • Try a different classifier
    • The current classifier produces a fair amount of false positives
  • Look at a different approach for the contour finding
    • If cars have overlapping heatmaps they will result in the same contour

Finding Lane Lines on the Road - Part Deuce

The goal of this project (from the Udacity Self-driving Car nanodegree):

In this project, your goal is to write a software pipeline to identify the lane boundaries in a video from a front-facing camera on a car.

giphy

I used a combination of different computer vision techniques (camera calibration, region of interest, perspective transform) to create a software pipeline to process images and detect traffic lanes. The technologies used to accomplish this:

  • Python 3.5
  • OpenCV 3

The pipeline consisted of these components:

  • Distortion correction
  • Region of Interest
  • Perspective transform
  • Lane pixel detection
  • Lane detection
  • Curvature inference

Camera Calibration

Cameras typically have some level of distortion in the images they take. The distortion can cause the image to appear to be warped in some areas. Since we will be using the images to attempt to infer the dimensions of the pictured objects, we need to make sure that the distortion is corrected.

Calibration Images

image

Calibration images are a set of images of various calibration objects that have known attributes. By determining the transformation required to go from the known attributes to the actual attributes displayed in the image, we are able to generate a function that can correct for distortion.

Performing the calibration is relatively straightforward (assuming you have multiple calibration images and are using a chessboard pattern):

  1. For each of the calibration images find all the corners in the image with cv2.findChessboardCorners(image, patternSize[, corners[, flags]])
  2. Generate the transformation matrix for distortion correction using cv2.calibrateCamera(objectPoints, imagePoints, imageSize[, cameraMatrix[, distCoeffs[, rvecs[, tvecs[, flags[, criteria]]]]]]) → retval, cameraMatrix, distCoeffs, rvecs, tvecs

My implementation for this project can be found here:

https://github.com/bayne/CarND-Advanced-Lane-Lines-solution/blob/master/main.py#L9
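
A condensed sketch of those two steps; the chessboard size and image path below are assumptions that depend on the calibration set.

import glob
import cv2
import numpy as np

pattern = (9, 6)  # inner corners of the chessboard
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

object_points, image_points = [], []
for filename in glob.glob('camera_cal/*.jpg'):
    gray = cv2.cvtColor(cv2.imread(filename), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        object_points.append(objp)
        image_points.append(corners)

# camera_matrix and dist_coeffs are what the rest of the pipeline needs
ret, camera_matrix, dist_coeffs, rvecs, tvecs = cv2.calibrateCamera(
    object_points, image_points, gray.shape[::-1], None, None)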

Saving the calibration

Since the distortion is a property of the camera, we only need to calculate the distortion correction matrix once. During development I was going to be running the pipeline many times, so to save time I saved the distortion matrix to a pickle file and reloaded it from disk instead of recalculating it.
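
For example (the file name is arbitrary):

import pickle

# after running cv2.calibrateCamera once
with open('calibration.p', 'wb') as f:
    pickle.dump({'camera_matrix': camera_matrix, 'dist_coeffs': dist_coeffs}, f)

# on subsequent runs
with open('calibration.p', 'rb') as f:
    calibration = pickle.load(f)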

Pipeline

The pipeline consisted of 12 tunable parameters that were used to configure how each step ran:

  • Region of interest (region)
  • Perspective transform (source_points, destination_points)
  • Lane pixel detection
    • Color threshold (yellow_lane_hsv_range, white_lane_hsv_range)
    • Edge detection (gradient_x_threshold, gradient_y_threshold, gradient_magnitude_threshold, gradient_direction_threshold, ksize)
  • Lane detection (window_margin, window_min)

Region of Interest

straight_lines1

I removed the parts of the image that do not contain lane lines by masking out parts of the image that aren’t in the specified region.

https://github.com/bayne/CarND-Advanced-Lane-Lines-solution/blob/master/main.py#L439
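
A sketch of the masking step, assuming the region is given as a list of polygon vertices:

import cv2
import numpy as np

def region_of_interest(image, vertices):
    mask = np.zeros_like(image)
    fill = (255,) * image.shape[2] if image.ndim == 3 else 255
    cv2.fillPoly(mask, [np.array(vertices, dtype=np.int32)], fill)
    return cv2.bitwise_and(image, mask)  # everything outside the polygon becomes black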

Distortion Correction

straight_lines1

Using the pre-calculated distortion correction matrix, the next step is to undistort the image:

https://github.com/bayne/CarND-Advanced-Lane-Lines-solution/blob/master/main.py#L131
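
With the saved calibration this is essentially a one-liner:

import cv2

# camera_matrix and dist_coeffs come from the calibration step above
undistorted = cv2.undistort(image, camera_matrix, dist_coeffs)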

Perspective Transform

straight_lines1

The image is transformed to a bird’s eye view to help accentuate curvature in the road:

https://github.com/bayne/CarND-Advanced-Lane-Lines-solution/blob/master/main.py#L251
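
A sketch using the source_points and destination_points parameters mentioned earlier (both assumed to be float32 arrays of four corner points):

import cv2

M = cv2.getPerspectiveTransform(source_points, destination_points)
Minv = cv2.getPerspectiveTransform(destination_points, source_points)  # to project the lane back later
warped = cv2.warpPerspective(undistorted, M,
                             (undistorted.shape[1], undistorted.shape[0]),
                             flags=cv2.INTER_LINEAR)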

Lane pixel detection

Detecting the lane pixels is done by reducing the image to a binary image of the pixels that belong to lane lines.

Color Threshold

straight_lines1

Color thresholding removes the colors outside a given range:

https://github.com/bayne/CarND-Advanced-Lane-Lines-solution/blob/master/main.py#L145
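
A sketch using the yellow_lane_hsv_range and white_lane_hsv_range parameters, each assumed to be a (lower, upper) pair of HSV bounds:

import cv2

hsv = cv2.cvtColor(undistorted, cv2.COLOR_BGR2HSV)
yellow = cv2.inRange(hsv, yellow_lane_hsv_range[0], yellow_lane_hsv_range[1])
white = cv2.inRange(hsv, white_lane_hsv_range[0], white_lane_hsv_range[1])
color_binary = cv2.bitwise_or(yellow, white)  # keep pixels matching either lane color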

Edge Detection

straight_lines1

By tuning a Sobel filter to focus on characteristics found in lane lines, I was able to reduce the amount of noise unrelated to lane lines.

https://github.com/bayne/CarND-Advanced-Lane-Lines-solution/blob/master/main.py#L172
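
A sketch of the thresholded Sobel filters; each *_threshold parameter is assumed to be a (low, high) pair, and the way they are combined here is one plausible variant.

import cv2
import numpy as np

gray = cv2.cvtColor(undistorted, cv2.COLOR_BGR2GRAY)
sobel_x = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=ksize)
sobel_y = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=ksize)
magnitude = np.sqrt(sobel_x ** 2 + sobel_y ** 2)
direction = np.arctan2(np.abs(sobel_y), np.abs(sobel_x))

def within(values, threshold):
    scaled = np.uint8(255 * values / np.max(values))
    return (scaled >= threshold[0]) & (scaled <= threshold[1])

edge_binary = (within(np.abs(sobel_x), gradient_x_threshold) &
               within(np.abs(sobel_y), gradient_y_threshold) &
               within(magnitude, gradient_magnitude_threshold) &
               (direction >= gradient_direction_threshold[0]) &
               (direction <= gradient_direction_threshold[1]))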

Lane Detection

straight_lines1

Lanes are detected using a sliding window that searches for pixels belonging to the lane, based on the pixels previously detected as lane pixels:

https://github.com/bayne/CarND-Advanced-Lane-Lines-solution/blob/master/main.py#L274
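
A compact sketch of the window search driven by window_margin and window_min; the number of windows and the re-centering rule are assumptions.

import numpy as np

def find_lane_pixels(binary_warped, window_margin=100, window_min=50, n_windows=9):
    # Histogram of the bottom half gives the most likely x positions of the two lane lines
    histogram = np.sum(binary_warped[binary_warped.shape[0] // 2:, :], axis=0)
    midpoint = histogram.shape[0] // 2
    nonzero_y, nonzero_x = binary_warped.nonzero()
    window_height = binary_warped.shape[0] // n_windows

    def search(x_base):
        indices = []
        for window in range(n_windows):
            y_high = binary_warped.shape[0] - window * window_height
            y_low = y_high - window_height
            in_window = ((nonzero_y >= y_low) & (nonzero_y < y_high) &
                         (nonzero_x >= x_base - window_margin) &
                         (nonzero_x < x_base + window_margin)).nonzero()[0]
            indices.append(in_window)
            if len(in_window) > window_min:
                x_base = int(np.mean(nonzero_x[in_window]))  # re-center on detected pixels
        indices = np.concatenate(indices)
        return nonzero_x[indices], nonzero_y[indices]

    left = search(np.argmax(histogram[:midpoint]))
    right = search(np.argmax(histogram[midpoint:]) + midpoint)
    return left, right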

Position & Curvature

straight_lines1

I was able to calculate the curvature and the position of the car with respect to the lane lines by carefully choosing the source_points and destination_points used in the perspective transformation step. Using the knowledge that the width of a lane is 12 feet and the length of a lane line is 10 feet, I am able to create a pixel-to-feet conversion function.

The position of the car with respect to the center of the lane is calculated by finding the offset of the middle of the lane from the middle of the image.

The curvature of the lane is computed by fitting a polynomial to the lane pixels with np.polyfit, which finds a best-fit polynomial for the provided points.

https://github.com/bayne/CarND-Advanced-Lane-Lines-solution/blob/master/main.py#L393
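
A sketch of the conversion and the curvature math; the pixel spans assumed below for the 12 ft lane width and 10 ft lane line depend entirely on the chosen destination_points.

import numpy as np

feet_per_pixel_x = 12 / 700   # assume a 12 ft lane spans ~700 px in the warped image
feet_per_pixel_y = 10 / 80    # assume a 10 ft lane line spans ~80 px

def curvature_feet(lane_x, lane_y, y_eval):
    # Fit x = A*y^2 + B*y + C in real-world units, then apply the radius-of-curvature formula
    fit = np.polyfit(lane_y * feet_per_pixel_y, lane_x * feet_per_pixel_x, 2)
    return ((1 + (2 * fit[0] * y_eval * feet_per_pixel_y + fit[1]) ** 2) ** 1.5) / abs(2 * fit[0])

def center_offset_feet(left_x_bottom, right_x_bottom, image_width):
    # positive means the car sits to the right of the lane center
    lane_center = (left_x_bottom + right_x_bottom) / 2
    return (image_width / 2 - lane_center) * feet_per_pixel_x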

Problems & Improvements

  • The region of interest significantly impacts the robustness of the pipeline since it must be tuned for the video feed.
  • The color thresholding is also tuned for the particular conditions of the video.
  • Significant inclines or declines in the road would break the assumption that the bird's-eye view is of a flat plane.
  • Markers on the road that appear lane-line-like (spilled paint) will completely throw off the lane detection.
  • Using the information provided by previous frames would increase the smoothness and make it more robust in sudden changes between frames.

Behavioral Cloning

The goals / steps of this project are the following:

  • Use the simulator to collect data of good driving behavior
  • Build a convolutional neural network in Keras that predicts steering angles from images
  • Train and validate the model with a training and validation set
  • Test that the model successfully drives around track one without leaving the road
  • Summarize the results with a written report

Model Architecture and Training Strategy

Model

The network architecture is a slightly modified version of the model Nvidia published for its self-driving car work.

Nvidia’s Model

The primary differences in my model are:

  • the removal of the 10-neuron fully connected layer
  • the addition of a dropout layer
  • some additional pre-processing steps

The Keras framework allows us to describe the model pretty succinctly:

from keras.models import Sequential
from keras.layers import Cropping2D, Lambda, Conv2D, Flatten, Dense, Dropout

model = Sequential()

# trim image to only see section with road
model.add(Cropping2D(cropping=((50,20), (0,0)), input_shape=(160,320,3)))

# Normalize and center the data around 0
model.add(Lambda(lambda x: x/127.5 - 1.))

# Nvidia's architecture
model.add(Conv2D(24, (5, 5), padding='same', activation='relu', strides=(2, 2)))
model.add(Conv2D(36, (5, 5), padding='same', activation='relu', strides=(2, 2)))
model.add(Conv2D(48, (5, 5), padding='same', activation='relu', strides=(2, 2)))
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(Flatten())
model.add(Dense(100))

# Added dropout to prevent overfitting
model.add(Dropout(0.6))

model.add(Dense(50))
model.add(Dense(1))

Pre-processing

For pre-processing, I added a layer to normalize the images (for better numeric stability) and a cropping layer to remove the parts of the frame that don't contain the road.

Overfitting prevention

To prevent overfitting, a dropout layer was added. Keras also abstracts the validation mechanism, which makes it easy to monitor validation loss and catch overfitting.

Appropriate training data

Since this is an end-to-end behavior cloning system, the training data and how that training data is handled determine whether the network will provide the desired result.

Shuffling

One of the key changes I made that yielded the best results was changing how the shuffling worked. Initially the shuffling of training data was done on a per-frame basis; however, I intuited that there was valuable information in the order of the frames. I changed the shuffling to shuffle batches of frames rather than the individual frames themselves. I believe this provided the network extra information on how the steering angle changed across a group of sequential frames.

To still get the benefit that shuffling provides against overfitting, the groups of frames were themselves shuffled (see the sketch below).
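
A sketch of that group-wise shuffle; the group size is an arbitrary placeholder.

import random

def shuffle_in_groups(samples, group_size=32):
    """Shuffle groups of consecutive frames rather than individual frames."""
    groups = [samples[i:i + group_size] for i in range(0, len(samples), group_size)]
    random.shuffle(groups)
    return [sample for group in groups for sample in group]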

Camera Angles

image

The simulator also provides multiple camera views from the car that can be used to provide additional training data for the steering angle. Since the driving mode only provides one viewport, a new hyperparameter had to be introduced that represents the steering offset expected for each of the side viewports.
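
A sketch of how that offset might be applied to the three camera images; the value and the row layout are assumptions.

STEERING_OFFSET = 0.2  # hypothetical value, tuned as a hyperparameter

def samples_from_row(center_path, left_path, right_path, angle):
    return [(center_path, angle),
            (left_path, angle + STEERING_OFFSET),    # left camera: steer back toward the center
            (right_path, angle - STEERING_OFFSET)]   # right camera: steer back toward the center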

Less than ideal training

Initially it seems to make sense to just provide training data that shows perfect runs of the track. When you only provide the ideal path, the network is unable to react to conditions that are less than ideal. By providing training data of a driver that is swerving inside the lane, the network is given better ranges of steering angles along with frames that show the conditions in which the car will fall off the road if corrective action isn’t taken.

Data augmentation

For additional, more varied training data, a simple augmentation is to flip the images horizontally and negate the associated steering angle. This doubles the amount of data provided to the network.
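
A sketch of the flip augmentation:

import numpy as np

def flipped(image, angle):
    # the mirrored frame is a valid training sample with the opposite steering angle
    return np.fliplr(image), -angle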

Traffic Sign Classification

In the context of this project, traffic sign classification is taking images of traffic signs and matching them to the information that they are trying to convey to human drivers. By using machine learning, we are able to use examples of traffic signs that have been pre-matched to their meaning and train a system to automatically identify new images.

Dataset

The Real-Time Computer Vision research group at the Institut für Neuroinformatik provides a dataset of German traffic sign images labeled with their appropriate classification. This dataset is called the German Traffic Sign Recognition Benchmark (GTSRB).

image

Summary

# Load pickled data
import pickle

training_file = 'train.p'
validation_file= 'valid.p'
testing_file = 'test.p'

with open(training_file, mode='rb') as f:
    train = pickle.load(f)
with open(validation_file, mode='rb') as f:
    valid = pickle.load(f)
with open(testing_file, mode='rb') as f:
    test = pickle.load(f)

n_train = len(train["features"])
n_valid = len(valid["features"])
n_test = len(test["features"])
width, height = len(test["features"][0]), len(test["features"][0][0])
image_shape = (width, height)
n_classes = len(set(test["labels"]))

print("Number of training examples =", n_train)
print("Number of testing examples =", n_test)
print("Image data shape =", image_shape)
print("Number of classes =", n_classes)
Number of training examples = 34799
Number of testing examples = 12630
Image data shape = (32, 32)
Number of classes = 43

Visualization

The training set is the largest of the sets since it is the set of samples that actually builds the model. The test samples are never used for training and are kept completely separate until the very end, when the accuracy of the classifier is checked.

output_8_1

  • Blue: Train
  • Orange: Validation
  • Green: Test
  • Y-Axis: Number of samples
  • X-Axis: Identifier/Index for the type of sign

The Classifier

The goal of the classifier is to take images of traffic signs that it has never seen before and be able to accurately classify them. This was accomplished using machine learning by applying a large training data set to a convolutional neural network.

Preprocessing

For the neural network to accept the dataset, some transformations needed to be applied to the images. I went with the simplest transformation: converting the image to grayscale. Since I was using an architecture adapted from the LeNet architecture, the initial shape of the sample needed to be a single-channel 32x32 image.

import tensorflow as tf

features_placeholder = tf.placeholder(tf.float32, (None, height, width, None), name='features_placeholder')
features = tf.image.rgb_to_grayscale(features_placeholder)

Architecture

I used the LeNet architecture as a starting point and added on top of it to produce better results. I found that adding a 1x1 convolution as the first layer increased the model's capacity enough to better capture the problem. Adding the dropout operation also improved performance by making the neural network more generalized.

LeNet Architecture (initial approach, Testing Accuracy: 0.879)

| Layer Number | Type                | Input Shape | Output Shape | Activation                   |
|--------------|---------------------|-------------|--------------|------------------------------|
| 1            | Convolutional (5x5) | 32x32x1     | 28x28x6      | Rectified Linear Unit (ReLU) |
| 2            | Pooling (2x2)       | 28x28x6     | 14x14x6      | -                            |
| 3            | Convolutional (5x5) | 14x14x6     | 10x10x16     | ReLU                         |
| 4            | Pooling (2x2)       | 10x10x16    | 5x5x16       | -                            |
| 5            | Fully-Connected     | 120         | 84           | ReLU                         |
| 6            | Fully-Connected     | 84          | 43           | ReLU                         |

Solution

The LeNet architecture was chosen as a starting point since it is an effective architecture for classifying images in the MNIST dataset. By itself, the architecture was able to reach a 0.879 validation accuracy, but it was limited by a couple of factors. The LeNet architecture was too small to model the traffic sign concept, since MNIST data has far fewer features than you would find in traffic sign images. By adding a 1x1 convolutional layer I was able to increase the size of the network and make it better at modeling the concept.

Modified LeNet Architecture (Testing Accuracy: 0.941)

| Layer Number | Type                      | Input Shape | Output Shape | Activation                   |
|--------------|---------------------------|-------------|--------------|------------------------------|
| 1            | Convolutional (1x1)       | 32x32x1     | 32x32x1      | Rectified Linear Unit (ReLU) |
| 2            | Convolutional (5x5)       | 32x32x1     | 28x28x6      | Rectified Linear Unit (ReLU) |
| 3            | Pooling (2x2)             | 28x28x6     | 14x14x6      | -                            |
| 4            | Convolutional (5x5)       | 14x14x6     | 10x10x16     | ReLU                         |
| 5            | Pooling (2x2)             | 10x10x16    | 5x5x16       | -                            |
| 6            | Fully-Connected + Dropout | 120         | 84           | ReLU                         |
| 7            | Fully-Connected + Dropout | 84          | 43           | ReLU                         |

Training

# The number of passes the training set is run through the neural network
EPOCHS = 20
# The number of samples run through the neural network at one time
BATCH_SIZE = 256
# The constant used in generating the weight deltas during back-propagation
LEARNING_RATE = 0.001
# The probability that a neuron will be zeroed out in the fully connected layers
DROPOUT = 0.60

For training the neural network, I used the AdamOptimizer provided by TensorFlow. As for the hyper-parameters, they were chosen by experimenting with different values.

The number of epochs was increased because using dropout requires more iterations to create a more generalized model. Batch size is typically memory-limited, so if more memory is available, increasing the batch size is beneficial for training the network. Since I am using dropout on one of the layers, a new hyperparameter (the dropout probability) is introduced.
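
A sketch of the training loop under those hyperparameters, written against the TensorFlow 1.x API used above; the labels_placeholder name and the loss definition are assumptions, while logits, features_placeholder, and dropout_prob come from the model code.

import tensorflow as tf
from sklearn.utils import shuffle

labels_placeholder = tf.placeholder(tf.int32, (None,))
one_hot_y = tf.one_hot(labels_placeholder, n_classes)
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=one_hot_y, logits=logits))
train_step = tf.train.AdamOptimizer(learning_rate=LEARNING_RATE).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(EPOCHS):
        X_shuffled, y_shuffled = shuffle(X_train, y_train)
        for offset in range(0, len(X_shuffled), BATCH_SIZE):
            sess.run(train_step, feed_dict={
                features_placeholder: X_shuffled[offset:offset + BATCH_SIZE],
                labels_placeholder: y_shuffled[offset:offset + BATCH_SIZE],
                dropout_prob: DROPOUT,  # the dropout hyperparameter defined above
            })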

Outside the Dataset

As an exercise of showing the effectiveness of the classifier on data outside the dataset, I found several images of German traffic signs on the internet and ran them through the classifier.

New Images

I used a couple of approaches to grab traffic sign images. I started by grabbing the top images from Google for the phrase “German traffic signs”. This produced some SVG-based images, which are less realistic but an interesting test case.

The other approach was taking advantage of Google Street View to produce images in a real-life setting. I took a screenshot and did some quick processing to get it into a format suitable for the classifier.

Although the new images were classified successfully, I could see cases where images would not be classified correctly. Since the dataset has under-represented labels, I could see images in those classes with unusual features (blurry, occluded, rotated) being classified incorrectly.

image

Performance

# Grab the largest logit, the index corresponds to the image's predicted class
classify_operation = tf.argmax(logits, 1)
new_y = np.zeros(n_classes)

new_x = [plt.imread(filename) for filename in filenames]

with tf.Session() as sess:
    saver.restore(sess, tf.train.latest_checkpoint('.'))

    # Dropout probability is set to 1 to make the neural networks behavior deterministic
    index = sess.run(classify_operation, feed_dict={features_placeholder: new_x, logits_placeholder: new_y, dropout_prob: 1.0})
    print(filenames)
    print(index)

Running the images through the classifier worked well, with every image classified correctly. Considering the test set reported 0.94 accuracy, this didn't appear too unusual.

Certainty


# Grab the top 3 probabilities produced by running the logits through a softmax function
top_operation = tf.nn.top_k(tf.nn.softmax(logits), k=3)
new_y = np.zeros(n_classes)

new_x = [plt.imread(filename) for filename in filenames]

with tf.Session() as sess:
    saver.restore(sess, tf.train.latest_checkpoint('.'))
    top = sess.run(top_operation, feed_dict={features_placeholder: new_x, logits_placeholder: new_y, dropout_prob: 1.0})

    print(top)

Output

[
       [  1.00000000e+00,   2.83870527e-08,   1.58895819e-09],
       [  9.74874496e-01,   1.66735873e-02,   6.44520996e-03],
       [  1.00000000e+00,   5.54841106e-09,   1.71303116e-09],
       [  9.99998569e-01,   7.89253022e-07,   3.30958187e-07],
       [  9.99987125e-01,   6.46488252e-06,   4.42701048e-06]
]

The top probabilities for each image were surprisingly high. Some of the probabilities were essentially 1.00, indicating that the classifier was highly confident that the provided image belonged to the corresponding class.

Finding Lane Lines on the Road

This is a write-up for Project 1 in Udacity’s Self-Driving Car Nanodegree

Project repository: https://github.com/udacity/CarND-LaneLines-P1

Project Rubric: https://review.udacity.com/#!/rubrics/322/view

When we drive, we use our eyes to decide where to go. The lines on the road that show us where the lanes are act as our constant reference for where to steer the vehicle. Naturally, one of the first things we would like to do in developing a self-driving car is to automatically detect lane lines using an algorithm.

In this project you will detect lane lines in images using Python and OpenCV. OpenCV means “Open-Source Computer Vision”, which is a package that has many useful tools for analyzing images.

More about the nanodegree program: https://www.udacity.com/drive

Pipeline

The purpose of the pipeline is to compose several different operations together, apply them to an image, and produce an annotated image that shows where a lane on a road would be.

My pipeline consists of multiple steps:

  • Applying a color mask
  • Performing edge detection
  • Selecting regions to search for lane lines
  • Using the Hough transform to find line segments
  • Extrapolating the lane from the line segments provided by the Hough transform

Applying the color mask

solidyellowleft

Lanes on the road are typically a single color and designed to stand out from the background. We can use this fact to help us filter out elements in the image that are irrelevant. To accomplish this, we use something called a color mask. A color mask can be applied to an image to remove all colors except for the ones we specify.

This can be done using OpenCV’s cv2.inRange(src, lowerb, upperb[, dst]) → dst thresholding function. You provide the image and a range of the colors you want to create a mask for.

For this project I created two masks: one for white lanes and one for yellow lanes.

White lane

solidyellowleft

For the white lane, I manually found threshold values that seemed close and provided good results when run against the example images.

Yellow lane

solidyellowleft

The yellow lane was a bit more involved since it's not as simple as setting all the channels to the same value. I used a color picker to grab the RGB components and plugged them into colorizer to transform them to HSV space. Using the cv2.cvtColor() function to convert the image from the BGR colorspace to HSV, I was able to create a mask that isolated the yellow lane.
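
A sketch of the two masks combined; the threshold values below are illustrative, not the ones actually used in the project.

import cv2
import numpy as np

hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
white_mask = cv2.inRange(image, np.array([200, 200, 200]), np.array([255, 255, 255]))
yellow_mask = cv2.inRange(hsv, np.array([15, 80, 100]), np.array([35, 255, 255]))
color_masked = cv2.bitwise_and(image, image, mask=cv2.bitwise_or(white_mask, yellow_mask))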

Performing edge detection

solidyellowleft

I then use the Canny edge detector to pull out edges in the image: cv2.Canny(image, threshold1, threshold2[, apertureSize])

Selecting regions to search for lane lines

solidyellowleft

After performing edge detection, there are still a fair number of irrelevant edges that need to be ignored if we are to find the lane lines. We remove the majority of the image and focus on a region where we are most likely to find lane lines.

Using the Hough transform to find line segments

solidyellowleft

The Hough transform is the operation in the pipeline that actually finds line segments in the image and provides the most information about where the lane lines could be. Initially I performed the Hough transform on a single region, which produced some spurious results. When I adjusted the pipeline to instead use two regions (one for the left line and one for the right), it significantly increased the accuracy.

A large amount of time was spent tweaking the parameters and manually tuning the Hough transform to provide good results.
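
The call itself is a single OpenCV function; the parameter values below are illustrative rather than the tuned ones.

import cv2
import numpy as np

segments = cv2.HoughLinesP(edges, rho=2, theta=np.pi / 180, threshold=20,
                           minLineLength=20, maxLineGap=100)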

Extrapolating the lane from the line segments provided by the Hough transform

solidyellowleft

Once we have the line segments produced by the Hough transform, we can find a line suitable for annotating the initial image. I found cv2.fitLine(points, distType, param, reps, aeps) to work relatively well at extrapolating the lane from the segments found by the Hough transform.
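
A sketch of the extrapolation step for one region's segments; the helper for evaluating the fitted line is an assumption.

import cv2
import numpy as np

points = np.array([(x, y)
                   for x1, y1, x2, y2 in segments[:, 0]
                   for (x, y) in ((x1, y1), (x2, y2))], dtype=np.float32)
vx, vy, x0, y0 = cv2.fitLine(points, cv2.DIST_L2, 0, 0.01, 0.01).flatten()

def x_at(y):
    # evaluate the fitted line at a given row, e.g. the bottom of the image or the top of the region
    return int(x0 + (y - y0) * vx / vy)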

Potential shortcomings with my current pipeline

The challenge video exposed a couple flaws with my pipeline:

  • Highly sensitive to color
  • Requires hard coded regions
  • Highly dependent on lane location

Possible improvements to my pipeline

  • Use information from past frames
  • Better tuned hough transform and edge detection
  • Automatically calculate region
  • Automatically calculate color mask range

Getting lines out of an image

Develop a model of the image

We need a way to model the image in memory on the computer in some way. The standard method is to consider the image as a function that takes in two arguments x and y and returns the intensity of the pixel. The intensity of the pixel in color images typically comes in multiple channels representing a component of the overall color of the pixel.

In most cases for raster graphics we lay out the image with the origin (0,0) at the top left corner of the image. The Y coordinate increases going down the image and the X coordinate increases going left to right.

Make the image grayscale

Multiple channels per pixel can make the operations that follow more difficult. If we collapse all the channels into a single channel, we can simplify things later on. In most cases we represent this with an 8-bit unsigned integer, giving us a range of 0-255 per pixel.

Blur the image

By blurring the image we reduce the amount of detail in the image. When we are looking for overall features in the image we need to reduce the detail and only have the large features remain.

The blurring operation we use to accomplish this is called a Gaussian blur.

Get the gradient

The gradient of an image is the derivative of the image. Areas of the image that have sudden changes in intensity will be local maxima. This helps highlight the particular spots of the image where we might find “edges”. In this context, we define edges as spots in the image where the intensity changes suddenly.

A common operator to use for generating the gradient is the Sobel operator.

Generate the edge pixels

The gradient appears to show the edges; however, it still has some extraneous elements that aren't the features we are looking for. So far we have been following an algorithm called Canny edge detection.

Additional operations are done on the image to reduce the non-edge pixels: Non-maximum suppression, Double threshold, and Edge tracking by hysteresis.

The Hough transform

At a high level, the Hough transform takes all the edge pixels and, for each edge pixel, finds all the possible lines that the pixel could be a part of. From all these candidate lines, it keeps the lines that have the most edge pixels in common.

Details about how this is accomplished can be found in the Hough transform article.
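
Putting the whole chain together in OpenCV terms (file name and thresholds are placeholders):

import cv2
import numpy as np

image = cv2.imread('road.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)       # collapse the channels to one
blurred = cv2.GaussianBlur(gray, (5, 5), 0)          # reduce detail before edge detection
edges = cv2.Canny(blurred, 50, 150)                  # gradient, non-max suppression, hysteresis
segments = cv2.HoughLinesP(edges, 1, np.pi / 180, 50, minLineLength=40, maxLineGap=20)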

Faster Functional Tests

Functional tests

Functional tests are among the tests that provide the most benefit, and they pay for themselves earlier than other types of tests. The Symfony documentation defines functional tests as follows:

Functional tests check the integration of the different layers of an application (from the routing to the views). They are no different from unit tests as far as PHPUnit is concerned, but they have a very specific workflow:

  • Make a request;
  • Test the response;
  • Click on a link or submit a form;
  • Test the response;
  • Rinse and repeat.

Too slow

One issue with functional tests is that they are slow. Since a functional test requires testing the entire stack, the performance benefits of mocking out services are not available. There are many approaches to speeding up functional tests but in this post we’ll focus on the major bottleneck: the database.

A highly recommended way of structuring your functional tests is to rebuild the database per test. By starting with a pristine database based on some dataset, you reduce the number of issues you may run into with tests sharing state with each other. If for some reason you don't do this, this post probably won't apply to you as much.

A functional test for an application using Doctrine

<?php

class TestTheTesting extends WebTestCase
{
    public function setUp()
    {
        $this->createDatabase();
        $this->client = self::createClient();
        $this->container = $this->client->getContainer();
        $metadatas = $this->getMetadatas();
        if (!empty($metadatas)) {
            $tool = new \Doctrine\ORM\Tools\SchemaTool(
                $this->container->get('doctrine.orm.entity_manager')
            );
            $tool->dropSchema($metadatas);
            $tool->createSchema($metadatas);
        }
    }
}

Lowest hanging fruit

One quick speed-up is to use SQLite for your database when running tests. In this example we are using Doctrine as a persistence layer, so switching out the database is a simple configuration change: override the Doctrine configuration in the test-specific config file (config_test.yml):

config_test.yml

...
doctrine:
    dbal:
        default_connection: default
        connections:
            default:
                driver: pdo_sqlite
                memory: true
                db_name: %database_name%_test
                charset: UTF8
...

In my own tests, this gave a significant speed boost: from 4-5 seconds per test to about 1 second.

benchmark

Database pooling

A slightly more involved approach, with maximum environment parity, is to use a database pool.

The database pool approach front-loads the majority of the work before the tests are ever run.

Standard Approach

Standard approach

Database Pooling

Database Pooling

The “fetch database from pool” step is significantly faster than creating a database on the fly. Although the same amount of work is done (and more, due to the pooling overhead), this new method makes the database creation step asynchronous. Since testing during development is sporadic, happening in small bursts, we can use this fact to get a significant speed increase.

Note: My particular setup involves multiple databases; the database used in the pool for this case is populated via a MySQL dump file. The test results above are for the database managed by Doctrine.

Speed up

So how do we implement such a thing?

Create the pool filler

A database pool should have an interface to get an item from the pool and a way to fill the pool. Below is one MySQL-specific implementation:

DatabasePool.php

<?php

class DatabasePool
{
    public function __construct($host, $port, $user, $password)
    {
        $this->pdo = new \PDO(
            "mysql:host=$host;port=$port",
            $user,
            $password
        );
        $this->host = $host;
        $this->port = $port;
        $this->user = $user;
        $this->password = $password;
    }

    /**
     * Returns the name of the database that is in a pristine condition
     *
     * @return string
     */
    public function get()
    {
        $pooledDatabases = $this->getPool();

        if (count($pooledDatabases) == 0) {
            throw new \RuntimeException("Database pool is empty");
        } else {
            return array_pop($pooledDatabases);
        }
    }

    /**
     * Drops the database that was used from the pool
     */
    public function drop()
    {
        $this->pdo->exec("DROP DATABASE {$this->get()}");
    }

    /**
     * Fills the pool with the given number of databases
     *
     * @param int $size The number of pristine databases to create
     *
     * @return string
     */
    public function fillPool($size)
    {
        if ($size <= 0) {
            throw new \UnexpectedValueException('Pool size must be greater than 0');
        }
        $result = $this->getPool();
        $delta = $size - count($result);
        if ($delta < 0) {
            foreach ($this->getPool() as $pooledDatabase) {
                $this->pdo->exec("DROP DATABASE {$pooledDatabase}");
            }
            $delta = $size;
        }
        for ($i = 0; $i < $delta; $i++) {
            $databaseName = 'test_'.str_replace('.', '_', microtime(true));
            $this->createDatabase($databaseName);
        }
        $pooledDatabases = $this->getPool();
        return array_pop($pooledDatabases);
    }

    /**
     * Returns all the database names that are within the pool
     *
     * @return array
     */
    private function getPool()
    {
        $sql = <<<SQL
SELECT `schema_name` FROM information_schema.schemata
WHERE `schema_name` LIKE 'test_%' ORDER BY `schema_name` DESC;
SQL;
        return $this->pdo->query($sql)->fetchAll(\PDO::FETCH_COLUMN);
    }

    /**
     * Creates a pristine database with the given name
     *
     * @param string $name
     * @return string
     */
    private function createDatabase($name)
    {
        // Your database creation code here
    }

}

Pool management is tracked using a naming scheme for the pooled databases, in this case test_<MICROTIMESTAMP>.

Create a command to fill your database pool

With your new DatabasePool you need some way to invoke it and have it run alongside your tests in a separate process. A Symfony command handles that nicely:

FillPoolCommand.php

...
    protected function execute(InputInterface $input, OutputInterface $output)
    {
        $pool = new DatabasePool(
            $this->getContainer()->getParameter('database_host'),
            $this->getContainer()->getParameter('database_port'),
            $this->getContainer()->getParameter('database_user'),
            $this->getContainer()->getParameter('database_password')
        );
        $pool->fillPool($input->getArgument('pool_size'));
        if ($input->getOption('watch')) {
            while(true) {
                $pool->fillPool($input->getArgument('pool_size'));
                sleep(5);
            }
        }
    }
...

You run this command in the background while you are developing and you will always have fresh databases to work with.

Update your tests to use the database pool

Now you need to update your tests to use the newly created database pool. Since config_test.yml overrides the configuration for the test environment, we are able to create a custom database connection that pulls database names from the pool. We also use the handy expression language that Symfony supports in its configuration files:

config_test.yml

...
services:
    database_pool:
        class: DatabasePool
        arguments:
            - %database_host%
            - %database_port%
            - %database_user%
            - %database_password%
    doctrine.dbal.default_connection:
        class: Doctrine\DBAL\Portability\Connection
        factory_class: Doctrine\DBAL\DriverManager
        factory_method: getConnection
        arguments:
            - driver:   %database_driver%
              host:     %database_host%
              port:     %database_port%
              dbname:   @=service('database_pool').get()
              user:     %database_user%
              password: %database_password%

The magic lies here

dbname:   @=service('database_pool').get()

The dbname for the Doctrine connection is evaluated on each test run based on the DatabasePool::get() method.

Fin

By making these improvements I was able to shave my test run time from 39 seconds for 10 tests down to 2.1 seconds. This only captures the time saved while staring at the screen waiting for the tests to finish; I haven't looked into the time saved by no longer being able to open up reddit because my tests were already done.

Remote streaming with Plex by using sshfs

What is Plex?

image

Plex is an amazing piece of software that pretty much gives you your own personal Netflix. You can take your own video files and have Plex stream them to its many clients on different platforms (Roku, Android, iOS, Web).

The common setup for Plex is to have one computer on your local network running Plex and have your Plex clients connect to that. My problem was that my files were not local and on a remote server.

How the library is populated

For Plex to actually work, you have to tell it where your media files are on the server. Plex has a nice web interface that makes this easy to configure:

image

So using this interface you can point Plex to any directory on your machine and tell it to search there for media files.

The remote file problem

I have a remote server with a couple of media files that I would like Plex to stream. The manual way is to just copy the files from the remote server to the local machine and then have Plex stream them from there.

This is annoying and time consuming depending on the bandwidth between the local and remote machine.

Lets just use sshfs

Network-mounted directories to the rescue! For simplicity's sake, I just used sshfs to mount my remote directory so that Plex thinks it's a local directory.

First install sshfs and fuse

$ sudo pacman -S sshfs fuse

Create a local directory that you will mount the remote location to:

$ mkdir /mnt/remote

Then update your /etc/fstab

/etc/fstab

#
# /etc/fstab: static file system information
#
/dev/sda1 / ext4 rw,relatime,data=ordered 0 1
<USERNAME>@<HOST>: /mnt/remote  fuse.sshfs noauto,x-systemd.automount,_netdev,users,idmap=user,IdentityFile=/<PATH>/<TO>/<YOUR>/<PRIVATE_KEY>,allow_other,reconnect 0 0

See here for more detailed documentation.

Done

And there you go: no more needing to copy the files over locally; you have instant access to the files on your remote server, provided you have enough bandwidth to it.

Deploying a Symfony2 app using Doctrine and PostgreSQL to Heroku

Symfony2 Angular TodoMVC

What is Heroku

Heroku is a platform to quickly and easily deploy your applications to. They abstract much of the sysadmin-type work away so the developer can stay at the application level. If you get everything configured correctly, you can end up with a single-button-click deployment.

image

The free tier is more than enough to get started and get a sample Symfony2 application up and running with a database backend.

Getting Symfony2 on Heroku

First follow this documentation found on the Heroku site to get a Symfony2 application up and running:

https://devcenter.heroku.com/articles/getting-started-with-symfony2

Making a deploy button

Heroku has documentation on how to do this:

https://devcenter.heroku.com/articles/heroku-button

Essentially you create an app.json file that describes the application and tells Heroku how to deploy it.

Using Doctrine with Symfony2 on Heroku

The above documentation gets you to the point where you have a fully functioning Symfony2 application running on Heroku; however, it stops there. If you want your application to be database-driven and running on Heroku's free tier, there are a couple more steps you have to take.

Since you install the standard edition in the tutorial, Doctrine is already included as a dependency in your project.

composer.json

...
"require": {
    "php": ">=5.3.3",
    "symfony/symfony": "2.6.x-dev",
    "doctrine/orm": "~2.2,>=2.2.3",
    "doctrine/doctrine-bundle": "~1.2",
    "twig/extensions": "~1.0",
    "symfony/assetic-bundle": "~2.3",
    "symfony/swiftmailer-bundle": "~2.3",
    "symfony/monolog-bundle": "~2.4",
    "sensio/distribution-bundle": "~3.0",
    "sensio/framework-extra-bundle": "~3.0",
    "incenteev/composer-parameter-handler": "~2.0"
},
...

Heroku provides a free PostgreSQL database instance, so we are going to use that for this example. There also appears to be a MySQL equivalent (ClearDB); most of these steps should apply there too, with PostgreSQL replaced by MySQL.

(Optional) Include PDO_PGSQL as a composer extension dependency

This tells Heroku to enable/install the pdo_pgsql.so extension for your application. This is optional, however, since pdo_pgsql.so is one of the automatically installed extensions.

composer.json

"require": {
    ...
    "ext-pdo": "*",
    "ext-pdo_pgsql": "*",
    ...
},

Add the PostgreSQL Heroku add-on

After you have deployed your application, use the heroku command to enable the PostgreSQL add-on for your application:

$ heroku addons:add heroku-postgresql:hobby-dev

Also add it to your app.json file to have it automatically enabled for your deploy button:

app.json

{
    ...
    "addons": [
        "heroku-postgresql:hobby-dev"
    ]
}

Automatically set your database parameters

When you add the PostgreSQL add-on, Heroku automatically provisions a PostgreSQL database for you and provides the access credentials via an environment variable: DATABASE_URL. Unfortunately that's all they provide, so you have to parse the URL to put it into a format acceptable to Doctrine.

Using a combination of Composer pre/post-install commands and some custom code to pull and push environment variables, we can get Doctrine configured correctly when deployed.

Parse and populate the environment variables

I created a simple static callback to grab the DATABASE_URL variable, parse it, and push the individual pieces back into the environment as separate variables.

HerokuDatabase.php

<?php

namespace Acme\DemoBundle;

use Composer\Script\Event;

class HerokuDatabase
{
    public static function populateEnvironment(Event $event)
    {
        $url = getenv("DATABASE_URL");

        if ($url) {
            $url = parse_url($url);
            putenv("DATABASE_HOST={$url['host']}");
            putenv("DATABASE_USER={$url['user']}");
            putenv("DATABASE_PASSWORD={$url['pass']}");
            $db = substr($url['path'],1);
            putenv("DATABASE_NAME={$db}");
        }

        $io = $event->getIO();

        $io->write("DATABASE_URL=".getenv("DATABASE_URL"));
    }
}

Update composer.json to run HerokuDatabase::populateEnvironment

Add this as a pre-install step so the environment is populated before the configuration stage:

composer.json

"scripts": {
    "pre-install-cmd": [
        "Acme\\DemoBundle\\HerokuDatabase::populateEnvironment"
    ],
    "post-install-cmd": [
        "Incenteev\\ParameterHandler\\ScriptHandler::buildParameters",
        "Sensio\\Bundle\\DistributionBundle\\Composer\\ScriptHandler::clearCache",
        "Sensio\\Bundle\\DistributionBundle\\Composer\\ScriptHandler::installAssets",
        "Sensio\\Bundle\\DistributionBundle\\Composer\\ScriptHandler::installRequirementsFile"
    ],
    "post-update-cmd": [
        "Incenteev\\ParameterHandler\\ScriptHandler::buildParameters",
        "Sensio\\Bundle\\DistributionBundle\\Composer\\ScriptHandler::clearCache",
        "Sensio\\Bundle\\DistributionBundle\\Composer\\ScriptHandler::installAssets",
        "Sensio\\Bundle\\DistributionBundle\\Composer\\ScriptHandler::installRequirementsFile"
    ]
},

Note: Ensure that you include the buildParameters callback in your post-install/update commands.

Configure composer-parameter-handler to use the new variables for the database configuration

The Heroku-provided buildpack for PHP runs Composer in non-interactive mode. This means that composer-parameter-handler simply copies parameters.yml.dist to parameters.yml.

Luckily, composer-parameter-handler can be configured to automatically replace some parameters with environment variables.

In our case, it would look like:

composer.json

"extra": {
    "incenteev-parameters": {
        "file": "app/config/parameters.yml",
        "env-map": {
            "database_host": "DATABASE_HOST",
            "database_port": "DATABASE_PORT",
            "database_name": "DATABASE_NAME",
            "database_user": "DATABASE_USER",
            "database_password": "DATABASE_PASSWORD"
        }
    }
}

Initialize the database

Almost done; now we just need to initialize the database. Using the heroku command, run the schema creation command:

$ heroku run php app/console doctrine:schema:create

For automation of the deployment button:

app.json

"scripts": {
    "postdeploy": "php app/console doctrine:schema:create"
}

Fin

And you’re done! On the first run of your application it may take a while to load (cache is warming up), however every subsequent page load should be fast :)

Counting Github commits

The Github API enables you to do some pretty awesome things with their data. You can query for information and use it to build statistics or you can interface into their system and make changes to your repositories.

I spent some time creating a Github project comparison tool as a way to help compare two projects hosted on Github. While building it I ran into an interesting problem that I will discuss here: how to count things.

The Github API

image

The API GitHub provides is fairly straightforward and well documented. For example, you can get the set of commits for a repository by simply making a GET request to:

https://api.github.com/repos/bayne/github-compare/commits

Which returns something like:

image

Pagination

Typically when you make a request to a URI that does not end in an ID, you are going to receive a collection. So in the above example a collection is returned.

Additionally, the GitHub API automatically paginates long collections. The pagination is controlled by two parameters: per_page and page. Traversing the collection requires diving into the response headers:

Link:<https://api.github.com/repositories/458058/issues?page=2>; rel="next", <https://api.github.com/repositories/458058/issues?page=26>; rel="last"

By using the Link header, you are able to traverse the collection.

Getting the count

Getting the count of the collection requires getting the entire collection, which can be an expensive operation. However, by using the per_page parameter we can use a little hack to make it cheap.

Set the per_page parameter to 1

https://api.github.com/repos/symfony/symfony/issues?per_page=1

Which gives this in your response header

Link:<https://api.github.com/repositories/458058/issues?per_page=1&page=2>; rel="next", <https://api.github.com/repositories/458058/issues?per_page=1&page=760>; rel="last"

Now you know exactly how many issues are in that repository. By looking at the last link and the value of its page parameter, you can count the number of items in the collection without traversing it.

Getting the count without “last”

What if you don’t have the link to the last page? Unfortunately the commits endpoint doesn’t provide you with the last page.

Counting the number of commits efficiently is a bit more involved. Thinking back to my computer science background, the best way to do this is to divide and conquer with a binary search. The binary search has to be two-step, though, since a binary search requires an upper bound (and this problem does not have one).

The first part of the algorithm finds the upper bound, which my intuition says is best done by growing the guess exponentially (optimally, you would probably fit it to some heuristic based on the distribution of repository sizes). From there, a binary search finds the page that is only partially filled.

github-services.js

// Binary search over page numbers: callback(page) resolves to a negative number when the last
// page is beyond `page`, a positive number when it is before `page`, and zero when found.
function binarySearch(lower, upper, callback) {
  var deferred = $q.defer();
  function recurse(lower, upper) {
    var midpoint = Math.ceil((lower+upper)/2);
    callback(midpoint).then(function (result) {
      if (result < 0) {
        recurse(midpoint + 1, upper);
      } else if (result > 0) {
        recurse(lower, midpoint - 1);
      } else {
        deferred.resolve(midpoint);
      }
    });
    return deferred.promise;
  }
  return recurse(lower, upper);
}

// compare(page) resolves to -1 when `page` is full and has a 'next' link (the last page is
// further on), 1 when `page` is empty (past the end), and 0 when `page` is the last page.
var compare = function (page) {
  var deferred = $q.defer();
  $http.get(
    urlParser(url)
      .addToSearch('page', page)
      .url+'&per_page=100',
    {stopPropagate: true}
  ).then(function (response) {
      var next = parse_link_header(response.headers('Link')).next;
      var result;
      if (response.data.length === 0) {
        result = 1;
      } else if (response.data.length == 100 && next !== undefined) {
        result = -1;
      } else {
        result = 0;
      }
      deferred.resolve(result);
  });
  return deferred.promise;
};

// getUpperbound (not shown) grows the page number exponentially until compare() reports that
// the last page has been passed, giving the binary search its ceiling.
getUpperbound(compare).then(function (upperBound) {
  binarySearch(1, upperBound, compare).then(function (lastPage) {
    $http.get(
      urlParser(url)
        .addToSearch('page', lastPage)
        .url+'&per_page=100',
      {stopPropagate: true}
    ).then(function (response) {
        deferred.resolve(response.data.length+(lastPage-1)*100);
    });
  });
});