Group testing and data stream algorithms

Module by: Mark A. Davenport. E-mail the author

Summary: This module provides an overview of the relationship between compressive sensing and problems in theoretical computer science including combinatorial group testing and computation on data streams.

Compressive sensing and sparse recovery algorithms can also be useful in group testing and in the related problem of computation on data streams.

Group testing

Some of the oldest sparse recovery algorithms were developed in the context of combinatorial group testing [2], [3], [5]. In this problem we suppose that there are $N$ total items, $K$ of which are anomalous elements that we wish to find. For example, we might wish to identify defective products in an industrial setting, or identify a subset of diseased tissue samples in a medical context. In both of these cases the vector $x$ indicates which elements are anomalous, i.e., $x_i \neq 0$ for the $K$ anomalous elements and $x_i = 0$ otherwise. Our goal is to design a collection of tests that allows us to identify the support (and possibly the values of the nonzeros) of $x$ while also minimizing the number of tests performed. In the simplest practical setting these tests are represented by a binary matrix $\Phi$ whose entries $\phi_{ij}$ are equal to 1 if and only if the $j$th item is used in the $i$th test. If the output of the test is linear with respect to the inputs, then the problem of recovering the vector $x$ is essentially the same as the standard sparse recovery problem in compressive sensing.
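As a rough illustration of the setup above, the following Python sketch builds a small binary test matrix, computes linear test outputs, and recovers the support by elimination. Everything specific here is an assumption made for illustration and is not taken from the module: the bit-based pooling design, the function names, the requirement that the anomalous values are nonnegative, and the simple rule "an item that appears in a test with zero output is normal". With this particular design the rule identifies a single anomalous item exactly; with more anomalous items it generally returns only a superset of the support.

```python
def build_pooling_matrix(num_items):
    """Binary matrix Phi with Phi[i][j] == 1 iff item j is used in test i.

    Illustrative design (an assumption): for each bit position b, one test
    pools the items whose index has bit b set, and a second test pools the
    items whose index has bit b clear.
    """
    num_bits = max(1, (num_items - 1).bit_length())
    phi = []
    for b in range(num_bits):
        phi.append([(j >> b) & 1 for j in range(num_items)])
        phi.append([1 - ((j >> b) & 1) for j in range(num_items)])
    return phi


def run_tests(phi, x):
    """Linear test outputs y = Phi x."""
    return [sum(p * xj for p, xj in zip(row, x)) for row in phi]


def decode_support(phi, y):
    """Rule out every item that appears in a test whose output is zero.

    Assumes x is nonnegative, so a zero output means that every item
    pooled in that test is normal.
    """
    num_items = len(phi[0])
    candidates = set(range(num_items))
    for row, yi in zip(phi, y):
        if yi == 0:
            candidates -= {j for j in range(num_items) if row[j] == 1}
    return sorted(candidates)


N = 8
x = [0] * N
x[5] = 3  # a single anomalous item (K = 1) with nonzero value 3

phi = build_pooling_matrix(N)   # 6 tests for 8 items
y = run_tests(phi, x)           # [3, 0, 0, 3, 3, 0]
support = decode_support(phi, y)  # [5]
```

The saving is small at this size (6 tests for 8 items), but the number of tests in this design grows like $2 \log_2 N$ rather than $N$.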

Computation on data streams

Another application area in which ideas related to compressive sensing have proven useful is computation on data streams [1], [4]. As an example of a typical data streaming problem, suppose that $x_i$ represents the number of packets passing through a network router with destination $i$. Simply storing the vector $x$ is typically infeasible since the total number of possible destinations (represented by a 32-bit IP address) is $N = 2^{32}$. Thus, instead of attempting to store $x$ directly, one can store $y = \Phi x$, where $\Phi$ is an $M \times N$ matrix with $M \ll N$. In this context the vector $y$ is often called a sketch. Note that in this problem $y$ is computed in a different manner than in the compressive sensing context. Specifically, in the network traffic example we never observe $x_i$ directly; rather, we observe increments to $x_i$ (when a packet with destination $i$ passes through the router). Thus we construct $y$ iteratively by adding the $i$th column of $\Phi$ to $y$ each time we observe an increment to $x_i$, which we can do since $y = \Phi x$ is linear. When the network traffic is dominated by traffic to a small number of destinations, the vector $x$ is compressible, and thus the problem of recovering $x$ from the sketch $\Phi x$ is again essentially the same as the sparse recovery problem in compressive sensing.
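The incremental construction of the sketch can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a method taken from the module: the sketch length `M`, the seed, the choice of random $\pm 1$ entries for $\Phi$, and the helper names `phi_column` and `update_sketch` are all hypothetical. Each column of $\Phi$ is regenerated on demand from its index, so neither $\Phi$ nor $x$ is ever stored, and the final lines check that the streamed sketch equals $\Phi x$ computed in one batch.

```python
import random
from collections import Counter

M = 16        # sketch length (hypothetical; M << N in practice)
SEED = 12345  # hypothetical seed that fixes the implicit matrix Phi


def phi_column(i, m=M):
    """Column i of an implicit M x N matrix Phi with +/-1 entries.

    The column is regenerated from (SEED, i) every time it is needed,
    so Phi itself is never stored.
    """
    rng = random.Random(SEED * 2**32 + i)
    return [rng.choice((-1, 1)) for _ in range(m)]


def update_sketch(y, i, increment=1):
    """Apply x_i += increment to the sketch: y += increment * Phi[:, i]."""
    for row, entry in enumerate(phi_column(i, len(y))):
        y[row] += increment * entry


# One destination index per observed packet (32-bit address space).
stream = [7, 42, 7, 2**32 - 1, 7, 42]

# Streaming construction: only the M-entry sketch is kept in memory.
y = [0] * M
for destination in stream:
    update_sketch(y, destination)

# Batch check, feasible here only because the toy stream is tiny:
# accumulate x explicitly and compute Phi x column by column.
x = Counter(stream)
y_batch = [0] * M
for i, count in x.items():
    column = phi_column(i)
    for row in range(M):
        y_batch[row] += count * column[row]
```

Because the update is linear, the order in which packets arrive does not affect the final sketch.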

References

  1. Cormode, G. and Hadjieleftheriou, M. (2009). Finding the frequent items in streams of data. Comm. ACM, 52(10), 97–105.
  2. Erlich, Y., Shental, N., Amir, A., and Zuk, O. (2009, Sept.). Compressed sensing approach for high throughput carrier screen. In Proc. Allerton Conf. Communication, Control, and Computing. Monticello, IL.
  3. Kainkaryam, R., Breux, A., Gilbert, A., Woolf, P., and Schiefelbein, J. (2010). poolMC: Smart pooling of mRNA samples in microarray experiments. BMC Bioinformatics, 11(1), 299.
  4. Muthukrishnan, S. (2005). Data Streams: Algorithms and Applications. Found. Trends in Theoretical Comput. Science, Vol. 1. Boston, MA: Now Publishers.
  5. Shental, N., Amir, A., and Zuk, O. (2009). Identification of rare alleles and their carriers using compressed se(que)nsing. Nucleic Acids Research, 38(19), e179.
