OpenStax-CNX
Low-Level Overview

Module by: Blake Miller, John Slack, Thomas Ladd

Algorithmically, our red cup replacement pipeline breaks down into three main sections: cup identification, finding a suitable replacement image, and merging the found image into the original.  Each part presents its own technical challenges and solutions.

Identification

Our test identification algorithm is based on simple template matching.  A template image of the desired object is correlated with the original image, and the correlation between the two is computed at every point.  The correlation is then normalized with respect to the local intensity of the original image, giving a correlation value in the range between -1 and 1.  This process is encapsulated in the MATLAB function normxcorr2, which takes two grayscale image matrices and returns a correlation matrix whose width and height are one less than the sums of the widths and heights of the input matrices.
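The idea can be sketched in a few lines of NumPy.  This is a simplified, "valid"-region-only stand-in for MATLAB's normxcorr2 (which pads out to the full size described above); the function and variable names here are our own, not from the original code.

```python
import numpy as np

def norm_xcorr(template, image):
    """Per-window normalized cross-correlation ("valid" region only).
    Each window of the image is compared against the template after
    both are mean-centered, giving a score in [-1, 1]."""
    th, tw = template.shape
    ih, iw = image.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    out = np.zeros((ih - th + 1, iw - tw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            win = image[r:r + th, c:c + tw]
            w = win - win.mean()
            denom = t_norm * np.sqrt((w ** 2).sum())
            out[r, c] = (t * w).sum() / denom if denom > 0 else 0.0
    return out

# A template cut directly out of a larger image correlates perfectly
# (value 1.0) at its true location.
img = np.random.default_rng(0).random((20, 20))
tpl = img[5:10, 7:12].copy()
corr = norm_xcorr(tpl, img)
peak = np.unravel_index(np.argmax(corr), corr.shape)
print(peak)  # (5, 7)
```

Because of the per-window normalization, the score is insensitive to uniform brightness changes, which is why a fixed threshold works at all.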

The program sets a threshold value (around 0.7, chosen experimentally) to determine whether the template has matched a cup in the original image.  Each color channel is correlated and compared against the threshold separately.  The program then ANDs the resulting filtered correlation matrices together, so a match is only declared if it holds in red, green, and blue.  This prevents pure red (100% red, 0% green, 0% blue) from matching white (100% red, 100% green, 100% blue).  At this stage, all points that exceed the threshold are considered matches.  To find the actual location of each cup, the algorithm finds the overall maximum correlation, records a cup at that location, and then masks out the area of the found cup.  This neutralizes the other over-threshold points corresponding to the same cup, preventing overlapping cup hits.  The algorithm then finds the next greatest maximum value and repeats until all points over the threshold have been accounted for.
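The greedy peak-picking loop above can be sketched as follows (a NumPy stand-in for the MATLAB original; the mask radius and threshold here are illustrative):

```python
import numpy as np

def find_matches(corr, threshold=0.7, mask_h=5, mask_w=5):
    """Greedy peak picking: take the global maximum over threshold,
    record a cup there, mask out a neighborhood around it so nearby
    over-threshold points from the same cup are ignored, repeat."""
    corr = corr.copy()
    hits = []
    while corr.max() > threshold:
        r, c = np.unravel_index(np.argmax(corr), corr.shape)
        hits.append((r, c))
        r0, r1 = max(0, r - mask_h), r + mask_h + 1
        c0, c1 = max(0, c - mask_w), c + mask_w + 1
        corr[r0:r1, c0:c1] = -1.0  # mask out this cup's neighborhood
    return hits

# Two separated peaks; the weaker neighbor of the first peak is
# over threshold but gets masked, so only two cups are reported.
corr = np.zeros((20, 20))
corr[3, 4] = 0.95
corr[3, 5] = 0.80   # same cup as (3, 4)
corr[12, 15] = 0.90
print(find_matches(corr))  # [(3, 4), (12, 15)]
```

In the real pipeline the input to this loop is the ANDed three-channel correlation, so a point must clear the threshold in all of red, green, and blue before it can ever become a peak.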

Unfortunately, this approach only works for one size of cup in the source image (the size of the template).  To detect all cup sizes, the scale of the template relative to the source image must change, and the correlation must be run at each scale.  Our algorithm scales down the original image using imresize and leaves the template small (saving runtime by shrinking the correlation rather than growing it).  After each small change in scale, the correlation function runs and saves matched regions to an accumulation array.  The function also keeps track of the masks of previous match regions so smaller cups aren't found erroneously inside larger cups.  The match regions are recorded at the scale of the original image, so the algorithm keeps track of the scale factor at each step and sizes each recorded region accordingly.
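A sketch of the multi-scale loop, assuming a matcher like the one above.  The crude nearest-neighbor `rescale` stands in for MATLAB's imresize, and `match_fn` stands in for the threshold-and-mask matcher; both names are ours:

```python
import numpy as np

def multiscale_search(image, template, match_fn, scales):
    """Shrink the *image* at each scale (cheaper than growing the
    template), run the matcher, and map every hit back to
    original-image coordinates via the scale factor."""
    def rescale(a, s):
        # nearest-neighbor resize, standing in for imresize
        h = max(1, int(round(a.shape[0] * s)))
        w = max(1, int(round(a.shape[1] * s)))
        rows = (np.arange(h) / s).astype(int).clip(0, a.shape[0] - 1)
        cols = (np.arange(w) / s).astype(int).clip(0, a.shape[1] - 1)
        return a[np.ix_(rows, cols)]

    found = []
    for s in scales:
        small = rescale(image, s)
        for (r, c) in match_fn(small, template):
            # record (row, col, cup size) at the original image's scale
            found.append((int(r / s), int(c / s), template.shape[0] / s))
    return found

# With a dummy matcher that always reports a hit at (4, 6) in the
# scaled image, a scale factor of 0.5 maps back to (8, 12) and a
# template of height 8 corresponds to a cup of height 16.
hits = multiscale_search(np.zeros((32, 32)), np.zeros((8, 8)),
                         lambda img, tpl: [(4, 6)], [0.5])
print(hits)  # [(8, 12, 16.0)]
```

The accumulated masks from earlier (larger-cup) scales would be rescaled and passed into `match_fn` in the same way, which is what suppresses spurious small-cup hits inside already-found large cups.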

Search

The search algorithm builds on the idea of template matching and expands it to a wider scope.  Ideally the program would exactly match the regions around each cup and ignore the cup itself.  Since our correlation function cannot exclude the middle area, we had to use a different approach.  The replacement algorithm generates blocks around the found cup with widths proportional to the size of the cup to be replaced.  Each individual block is then correlated against the image bank, much as described above.  The main difference is that the search algorithm must consider all blocks simultaneously: a match is only a match if it works all the way around the suspect region.  To achieve this, the correlation matrix for each block is shifted by the block's displacement from the origin of the replacement region, and the shifted matrices are merged.  This generates a single correlation matrix that takes all blocks into account.  The algorithm then finds the region with the highest correlation across all the images and passes that region to the merge algorithm.
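The shift-and-merge step can be sketched as follows.  For simplicity this stand-in scores each block with negative sum-of-squared-differences rather than normalized correlation; the structure of the merge is the same, and all names are ours:

```python
import numpy as np

def block_score(image, patch):
    """Per-block similarity map (negative SSD here for simplicity,
    standing in for normalized correlation)."""
    ph, pw = patch.shape
    out = np.zeros((image.shape[0] - ph + 1, image.shape[1] - pw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            d = image[r:r + ph, c:c + pw] - patch
            out[r, c] = -(d * d).sum()
    return out

def find_surround(image, blocks, out_shape):
    """Merge shifted per-block maps: the score for placing the region
    origin at (r, c) sums each block's map at (r+dr, c+dc), so every
    candidate origin accumulates evidence from all blocks at once.
    `blocks` is a list of (dr, dc, patch) offsets from the origin."""
    total = np.zeros(out_shape)
    for dr, dc, patch in blocks:
        m = block_score(image, patch)
        total += m[dr:dr + out_shape[0], dc:dc + out_shape[1]]
    return np.unravel_index(np.argmax(total), total.shape)

# Strips cut from above and below a chosen origin are best explained
# by that origin, so the merged score peaks there.
rng = np.random.default_rng(1)
img = rng.random((20, 20))
blocks = [(0, 0, img[3:5, 4:12].copy()),    # strip above the cup area
          (6, 0, img[9:11, 4:12].copy())]   # strip below it
print(find_surround(img, blocks, (13, 13)))  # (3, 4)
```

Summing after the shift is what enforces the "all the way around" requirement: a region that matches only one side of the cup scores poorly in the merged matrix.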

We built our test image bank from a relatively small number of images and used MATLAB's imread function to load each one serially.  The program runs the block-based correlation above on each image, keeping track of the highest correlation value and its associated region.
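The serial scan over the bank is a simple running-maximum loop, sketched here with a placeholder scoring function (in the real pipeline it would be the merged block correlation):

```python
import numpy as np

def best_in_bank(bank, score_fn):
    """Scan the image bank serially, keeping the image name, score,
    and location of the highest combined correlation seen so far."""
    best = (-np.inf, None, None)
    for name, img in bank.items():
        score_map = score_fn(img)
        loc = np.unravel_index(np.argmax(score_map), score_map.shape)
        if score_map[loc] > best[0]:
            best = (score_map[loc], name, loc)
    return best

# With an identity "score function", the winner is simply the image
# containing the largest value.
bank = {"a": np.array([[0.2, 0.9], [0.1, 0.3]]),
        "b": np.array([[0.4, 0.5], [0.95, 0.2]])}
print(best_in_bank(bank, lambda img: img))  # (0.95, 'b', (1, 0))
```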

Because of the block-based nature of the search algorithm, one simple improvement we made was to give the blocks different weights based on their importance to the continuity of the image.  The human eye notices lines and edges more than muted textures, so we gave more weight (by multiplying their correlation matrices by a factor before the final correlation sum) to blocks that contained more edges.  This modification helped ensure that arms stayed continuous and helped with the hand problem (the frequent presence of hands over the cup).
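One simple way to derive such a weight (our illustration, not necessarily the original edge measure) is the block's mean gradient magnitude, which is near zero for flat textures and grows with edge content:

```python
import numpy as np

def edge_weight(block):
    """Weight a block by its edge content: 1 plus the mean gradient
    magnitude, so a perfectly flat block gets weight exactly 1.0 and
    blocks with strong lines/edges get proportionally more say."""
    gy, gx = np.gradient(block.astype(float))
    return 1.0 + np.hypot(gx, gy).mean()

print(edge_weight(np.zeros((4, 4))))   # 1.0 (flat block)
step = np.zeros((4, 4))
step[:, 2:] = 1.0                      # vertical step edge
print(edge_weight(step))               # 1.25 (edge raises the weight)
```

Each block's correlation matrix would then be multiplied by its `edge_weight` before the shifted maps are summed, so edge-heavy blocks (an arm crossing the region, a hand over the cup) dominate the final placement.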

Combination

After finding a region to suitably replace the excised region of the original image, the new image is blended with the original.  We used a conditional blend to completely replace the red cup, and then gradually blended the surrounding buffer regions together with the original image.  Our blend algorithm used a linear intensity blend (a scaled sum of the two images), but could be quickly improved with a bicubic blur (taking blur information from above and below as well) and a more consistent merger (angled corners).
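The conditional blend plus linear intensity blend can be sketched in one expression, assuming a boolean cup mask and a per-pixel (or scalar) blend weight for the buffer ring; the names are ours:

```python
import numpy as np

def conditional_linear_blend(original, replacement, cup_mask, buffer_alpha):
    """Inside the cup mask the replacement wins outright (alpha = 1);
    elsewhere a linear intensity blend applies:
    out = alpha * replacement + (1 - alpha) * original."""
    alpha = np.where(cup_mask, 1.0, buffer_alpha)
    return alpha * replacement + (1.0 - alpha) * original

orig = np.zeros((2, 2))
rep = np.ones((2, 2))
mask = np.array([[True, False], [False, False]])
out = conditional_linear_blend(orig, rep, mask, 0.5)
print(out)  # [[1.  0.5]
            #  [0.5 0.5]]
```

Passing a distance-based ramp array as `buffer_alpha` instead of a constant gives the gradual falloff through the buffer region described above.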
