Connexions

Determining the ROI

Incorporating Motion, Edge, and Focus Detection

To determine the ROI for each frame of a movie, we need to combine the results of motion, edge, and focus detection into a single system. We accomplish this by analyzing each frame in five separate regions.

Frame Region Analysis

For motion and edge detection, the entire frame is processed at once, and the resulting matrix is then broken into five regions, numbered 1 through 5 from left to right.

For motion detection, recall that subtracting two frames produces a “difference” matrix. The mean of the difference values in each region is divided by the mean of the difference values for the whole frame, and these region ratios are then normalized by the one with the largest magnitude. The result is a number between 0 and 1 for each region, with 1 indicating the region of maximum relative change and 0 indicating no change.
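The motion scoring described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `split_columns` and `motion_scores` are hypothetical, the five regions are assumed to be near-equal vertical strips, and the difference matrix is treated as a list of rows whose magnitudes are what matter.

```python
def split_columns(width, n_regions=5):
    """Column bounds (start, stop) for n_regions near-equal vertical strips.

    Assumption: the five regions are side-by-side strips of the frame.
    """
    return [(i * width // n_regions, (i + 1) * width // n_regions)
            for i in range(n_regions)]


def motion_scores(diff, n_regions=5):
    """Score each region between 0 and 1 from a frame-difference matrix."""
    height, width = len(diff), len(diff[0])
    # Mean difference magnitude over the whole frame.
    frame_mean = sum(abs(v) for row in diff for v in row) / (height * width)
    if frame_mean == 0:
        return [0.0] * n_regions  # identical frames: no motion anywhere

    ratios = []
    for start, stop in split_columns(width, n_regions):
        total = sum(abs(v) for row in diff for v in row[start:stop])
        region_mean = total / (height * (stop - start))
        # Region mean relative to the whole-frame mean.
        ratios.append(region_mean / frame_mean)

    peak = max(ratios)
    # Normalize so the most active region scores 1.
    return [r / peak for r in ratios]


diff = [[0, 0, 1, 1, 2, 2, 4, 4, 0, 0],
        [0, 0, 1, 1, 2, 2, 4, 4, 0, 0]]
print(motion_scores(diff))
```

In this toy frame the fourth strip changes the most, so it receives a score of 1, and the strips with no change receive 0.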

For edge detection, recall that the processing results in an edge matrix, in which a value of 1 marks a pixel that is part of an edge and a value of 0 marks a pixel that is not. The pixel values in each region are therefore summed, and each sum is divided by the highest region sum. The result is a value between 0 and 1 for each region, with 1 indicating the region with the most edges and 0 indicating a region with no edges.
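A minimal sketch of the edge scoring, again assuming (hypothetically) that the five regions are near-equal vertical strips and that the edge matrix is a list of rows of 0s and 1s. The function name `edge_scores` is illustrative, not taken from the original project.

```python
def edge_scores(edges, n_regions=5):
    """Score each region between 0 and 1 from a binary edge matrix."""
    width = len(edges[0])
    # Assumption: regions are side-by-side vertical strips of near-equal width.
    bounds = [(i * width // n_regions, (i + 1) * width // n_regions)
              for i in range(n_regions)]
    # Count the edge pixels that fall in each strip.
    sums = [sum(sum(row[start:stop]) for row in edges)
            for start, stop in bounds]
    peak = max(sums)
    if peak == 0:
        return [0.0] * n_regions  # no edges detected anywhere
    # Normalize to the region with the most edge pixels.
    return [s / peak for s in sums]


edges = [[1, 1, 0, 0, 1, 0, 0, 0, 1, 1],
         [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]]
print(edge_scores(edges))
```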

For focus detection, recall that the processing results in a value for the slope of the linear regression of the log-log plot of the power spectrum. Because this processing requires a square matrix for the 2D Fourier transform, the frame is divided into five semi-overlapped square regions.

The focus detection processing is performed on each of these regions, and then the most in-focus region is identified (by its falloff rate), and the remaining regions are assigned a normalized value corresponding to how close they come to having the best focus value. The result is a value between 0 and 1 for each region, with 1 indicating the region of best focus and 0 indicating the region of worst focus.
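The region layout and the focus normalization might look like the sketch below. Several details are assumptions rather than facts from the text: the squares are taken to be as tall as the frame and spread evenly across its width, a larger slope value is taken to mean sharper focus, and the scores are rescaled linearly between the worst and best regions. The names `square_region_bounds` and `focus_scores` are hypothetical, and the slope computation itself is not shown.

```python
def square_region_bounds(height, width, n_regions=5):
    """Column bounds of n_regions height-by-height squares across the frame.

    Assumption: the squares span the full frame height and their left
    edges are spaced evenly, so neighbouring squares overlap.
    """
    span = width - height
    lefts = [round(i * span / (n_regions - 1)) for i in range(n_regions)]
    return [(left, left + height) for left in lefts]


def focus_scores(slopes):
    """Rescale per-region focus values so best = 1 and worst = 0.

    Assumption: a larger value means better focus; flip the sign of the
    inputs if the opposite convention applies.
    """
    best, worst = max(slopes), min(slopes)
    if best == worst:
        return [1.0] * len(slopes)  # all regions equally sharp
    return [(s - worst) / (best - worst) for s in slopes]


print(square_region_bounds(100, 300))
print(focus_scores([-3.0, -2.0, -1.0, -2.5, -3.0]))
```

With a 300-by-100 frame, each square shares half of its width with its neighbour, which is one way to read "semi-overlapped".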

Assigning the ROI

After converting the results of motion, edge, and focus detection into values between 0 and 1 for each region, the values are averaged for each region. This gives one value for each region, with higher values indicating that there are more elements of interest in that region.

To translate this into an ROI, a horizontal midpoint is defined within the widescreen frame, and each region’s interest value is mapped to a weighted deviation from this midpoint. The net deviation from midpoint is then found by summing these deviations, and the fullscreen ROI midpoint is defined to be at this deviation from widescreen center.

Thus, interest in regions 1 and 2 acts to pull the fullscreen ROI to the left, interest in regions 4 and 5 acts to pull it to the right, and interest in region 3 acts to keep it at the center.
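The averaging and midpoint steps can be illustrated as follows. The weights and the `gain` factor are hypothetical placeholders, since the text does not give the actual values; they are chosen only so that regions 1 and 2 pull left, regions 4 and 5 pull right, and region 3 contributes no deviation.

```python
def combine_scores(motion, edge, focus):
    """Average the three normalized scores for each region."""
    return [(m + e + f) / 3 for m, e, f in zip(motion, edge, focus)]


def roi_midpoint(interest, frame_width, weights=(-2, -1, 0, 1, 2), gain=20.0):
    """Horizontal midpoint of the fullscreen ROI, in pixels.

    Assumptions: `weights` and `gain` (pixels per unit of weighted
    interest) are illustrative values, not the ones used in the project.
    """
    center = frame_width / 2
    # Each region contributes a signed deviation; the net deviation is their sum.
    deviation = sum(gain * w * v for w, v in zip(weights, interest))
    return center + deviation


interest = combine_scores([1, 0, 0, 0, 0], [1, 0, 0, 0, 0], [1, 0, 0, 0, 0])
print(roi_midpoint(interest, frame_width=640))                     # pulled left of 320
print(roi_midpoint([0.5, 0.5, 1.0, 0.5, 0.5], frame_width=640))    # stays at 320
```

A symmetric distribution of interest cancels out, which matches the description of region 3 holding the ROI at the center.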

The final midpoint value is filtered with a moving average (half-width of 30 frames, or approximately 1 second of video) to eliminate jerky ROI movements.
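One possible form of this filter is a centered moving average over the per-frame midpoints. The sketch below assumes the window is centered on each frame and simply shrinks near the start and end of the clip; the text does not say how the edges are handled, and the name `smooth_midpoints` is hypothetical.

```python
def smooth_midpoints(midpoints, half_width=30):
    """Centered moving average with a window of 2 * half_width + 1 frames.

    Assumption: near the ends of the sequence the window is truncated
    rather than padded.
    """
    n = len(midpoints)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        window = midpoints[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


# A single-frame jump is spread over its neighbours instead of jerking the ROI.
print(smooth_midpoints([0, 0, 3, 0, 0], half_width=1))
```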
