DSP Tools

Throughout this project, we experimented with many different DSP tools to amplify tiny motions. Here's a discussion of the two tools we used from class and the one tool we used from outside of class!

Bandpass Filtering: In-Class Tool 1

              The first DSP skill that we used, and used heavily throughout our project, was bandpass filtering. For motion amplification of any kind, filtering is how you determine which motion is amplified and which motion is ignored. This is important because any analysis of motion in a video will pick up some noise from pixel movement in the background that has nothing to do with the motion actually being tracked. In our algorithm, we used a Butterworth bandpass filter as the main filter to select the motion we wanted, a decision we made based on its balance of simplicity and accuracy. We needed a bandpass filter that was relatively easy to implement, because we would be passing in many different types of motion with different passband ranges, and these needed to be easily adaptable to each video. Our Butterworth bandpass function is built on the Python SciPy toolbox, which provides the filter-design routines we needed. Using this type of filter let us keep our inputs simple and few: the data we wanted processed, the low and high cutoff frequencies, the sampling frequency, and the order of the filter (which we hardcoded to 5, based on guidance from the MIT paper we were following for this project).
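A minimal sketch of what that filter design might look like with SciPy (the function name and cutoffs are illustrative, not our exact code):

```python
from scipy.signal import butter

def butter_bandpass(lowcut, highcut, fs, order=5):
    """Design a Butterworth bandpass filter.

    Cutoffs are normalized to the Nyquist frequency (fs / 2);
    order is fixed at 5, following the MIT paper's guidance.
    Returns the numerator (b) and denominator (a) coefficients.
    """
    nyquist = 0.5 * fs
    b, a = butter(order, [lowcut / nyquist, highcut / nyquist], btype="band")
    return b, a
```

For example, `butter_bandpass(0.5, 2.0, fs=30.0)` would build a filter passing roughly 0.5–2 Hz motion in a 30 fps video.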


              In our algorithm, we iterate through a tensor list (a list of frames) created by the Laplacian pyramid function, and these frames are processed according to the low and high cutoff frequencies we set. Each frame is compared against the rest of the tensor list to find pixel movement of similar value between frames, and motion that falls within the required frequency range is passed out of the bandpass filter to be amplified.

Fast Fourier Transform: In-Class Tool 2

              The second DSP tool that we use in our algorithm is the fast Fourier transform. We use it in two forms: the FFT, to move image information from the time domain to the frequency domain, and the IFFT, to bring the information back to the time domain for reassembly into the .mp4 output format. We specifically use the FFT in our color amplification, which is how we make the amplified motion more visible. Using the SciPy toolbox again, we simplified and optimized our code with a preexisting FFT implementation that only requires us to pass in the data to be transformed. This mattered because processing speed and memory usage were two of our biggest concerns: when processing large videos, the program ran out of memory and errored out, so relying on an existing FFT from a well-maintained library was the more efficient choice for our purpose.
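As a small illustrative example (the signal here stands in for one pixel's intensity over time, not our actual video data), the SciPy FFT/IFFT pair makes the round trip between domains trivial:

```python
import numpy as np
from scipy.fft import fft, ifft, fftfreq

fs = 30.0                                # frames per second
t = np.arange(0, 4, 1 / fs)              # 4 seconds of samples
signal = np.sin(2 * np.pi * 1.0 * t)     # a 1 Hz intensity variation

spectrum = fft(signal)                   # time domain -> frequency domain
freqs = fftfreq(len(signal), 1 / fs)     # frequency bin centers in Hz
recovered = ifft(spectrum).real          # frequency domain -> time domain

# the round trip reproduces the original samples
assert np.allclose(recovered, signal)
```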


              In our algorithm, we use another type of filter not discussed in class, called the temporal ideal filter. It is another example of a bandpass filter, but one implemented with the FFT so that frequency components can be extracted from a time signal. For a video, this makes it easy to pull out the frequency content of the pixel motion between frames. Within this filter, we call the FFT to break down the frame information into a list of frequencies for the pixel motion between frames. After processing through the filter (which acts as a brick-wall bandpass, rejecting all frequencies outside the input range), the IFFT is applied to the data, and the result is passed on to be further processed and color amplified in a later part of the algorithm.
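A minimal sketch of a temporal ideal filter under these assumptions (the function name and array shape are illustrative): FFT along the time axis, zero every frequency bin outside the band, then IFFT back.

```python
import numpy as np
from scipy.fft import fft, ifft, fftfreq

def temporal_ideal_filter(stack, low, high, fs):
    """Brick-wall temporal bandpass over a stack of frames.

    `stack` has shape (n_frames, height, width). FFT along axis 0 gives
    each pixel's temporal spectrum; bins outside [low, high] Hz are zeroed
    (for both positive and negative frequencies), then the IFFT returns
    the band-limited motion signal in the time domain.
    """
    spectrum = fft(stack, axis=0)
    freqs = fftfreq(stack.shape[0], d=1.0 / fs)
    keep = (np.abs(freqs) >= low) & (np.abs(freqs) <= high)
    spectrum[~keep] = 0
    return ifft(spectrum, axis=0).real
```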

Image Pyramids: Out-of-Class Tool 1

              The out-of-class DSP concept used in our algorithm is image pyramids. An image pyramid is a series of images (or frames of a video) that are repeatedly downsampled to create a "pyramid" of the same frame at decreasing resolutions. These images are then compared with the corresponding levels for the next frame of the video, using matrix subtraction to take the difference of the two images. By subtracting the two image arrays, we can easily find the differences between them. These differences can then be run through the filters described above to determine whether they are motion, and whether that motion meets the characteristics we are looking for.
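The subtraction step is simple array arithmetic; a toy example (values are made up for illustration):

```python
import numpy as np

# Two consecutive "frames": only the bottom-right pixel changes.
frame_a = np.array([[10.0, 10.0],
                    [10.0, 10.0]])
frame_b = np.array([[10.0, 10.0],
                    [10.0, 12.0]])

# Matrix subtraction isolates the pixels that moved between frames.
diff = frame_b - frame_a
```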


               In our algorithm, we use two different types of image pyramids. The first is the Gaussian pyramid, created by repeatedly blurring the image with a Gaussian filter and downsampling it, producing a series of increasingly smoothed images at decreasing resolutions. Its role is to set up the creation of the Laplacian pyramid and to begin object/motion identification. The Laplacian pyramid is the second type of image pyramid used in our algorithm. It is created by upsampling each level of a given Gaussian pyramid and subtracting it from the level above, which yields a series of images that each capture a different band of frequency information. From these, the levels can be separated into those that meet the frequency requirements of the bandpass filter and those that do not. After this separation, the frequencies that fall within the passband are amplified and recombined with the rest of the video to create a motion-amplified video.
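The two pyramids above can be sketched with SciPy's image tools (function names, level count, and blur width are illustrative assumptions, not our exact code):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def gaussian_pyramid(image, levels=3, sigma=1.0):
    """Repeatedly blur and halve the image to build a Gaussian pyramid."""
    pyramid = [image.astype(np.float64)]
    for _ in range(levels):
        blurred = gaussian_filter(pyramid[-1], sigma)
        pyramid.append(blurred[::2, ::2])            # downsample by 2
    return pyramid

def laplacian_pyramid(image, levels=3, sigma=1.0):
    """Each Laplacian level = Gaussian level minus the upsampled level below it,
    so each level holds one band of spatial-frequency detail."""
    gauss = gaussian_pyramid(image, levels, sigma)
    laplace = []
    for i in range(levels):
        up = zoom(gauss[i + 1], 2, order=1)          # upsample the coarser level
        up = up[: gauss[i].shape[0], : gauss[i].shape[1]]
        laplace.append(gauss[i] - up)
    laplace.append(gauss[-1])                        # low-pass residual level
    return laplace
```

Because each level stores only the detail lost between two Gaussian levels, the original frame can be rebuilt exactly by upsampling and adding the levels back together, which is how the amplified video is reassembled.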
