
Data Analysis

Module by: Ewa Paszek.

Summary: This course is a short series of lectures on Statistical Bioinformatics. Topics covered are listed in the Table of Contents. The notes were prepared by Ewa Paszek, Lukasz Wita and Marek Kimmel. The development of this course has been supported by NSF grant 0203396.

Data Analysis.

After scanning, a grid must be placed on the image and the spots representing the arrayed genes must be identified. The background fluorescence is calculated locally for each spot and is subtracted from the hybridization intensities. Differentially expressed genes are identified by comparing the fluorescence intensities of the control and experimental probes hybridized to each spot (Freeman et al., 2000; Bowtell, 1999; Knudsen, 2002). Typically, the experimental target sequences are labeled with Cy5, which fluoresces red (667 nm), and the control targets are labeled with Cy3, which fluoresces green (568 nm). The ratio of red to green signal can then be used as a measure of the effect of the experimental treatment on the expression of each gene. A ratio of 1 (yellow spot) indicates no change in the expression level between experimental and control samples, while a ratio greater than 1 (red spot) indicates increased transcription in the experimental sample, and a ratio less than 1 (green spot) indicates decreased transcription in the experimental sample. A scatter plot is a very useful representation of the expression data: the signal intensities of the experimental and control samples are plotted along the x- and y-axes, so that the ratio for each spot appears as its distance from the diagonal (Schena, 2003). The diagonal separates spots with higher activity in the experimental sample than in the control from spots with lower activity than in the control. From this plot one can easily choose points that represent a several-fold increase or decrease in gene expression and focus additional analyses on these genes.
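The spot-level arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the spot values, the gene names, the 2-fold cutoff and the clamping of background-corrected signals to a minimum of 1.0 are all hypothetical choices, not part of the original notes.

```python
import math

def background_corrected(foreground, background):
    """Subtract the local background from a spot intensity.

    Assumption: the result is clamped to 1.0 so that ratios and
    logarithms stay defined for very weak spots.
    """
    return max(foreground - background, 1.0)

def red_green_ratio(cy5_fg, cy5_bg, cy3_fg, cy3_bg):
    """Ratio of experimental (Cy5, red) to control (Cy3, green) signal."""
    return background_corrected(cy5_fg, cy5_bg) / background_corrected(cy3_fg, cy3_bg)

def classify(ratio, fold=2.0):
    """Label a spot by fold change; `fold` is an illustrative cutoff."""
    if ratio >= fold:
        return "up"         # red spot
    if ratio <= 1.0 / fold:
        return "down"       # green spot
    return "unchanged"      # close to a yellow spot

# Hypothetical spots: (gene, Cy5 foreground, Cy5 background,
#                      Cy3 foreground, Cy3 background)
spots = [
    ("geneA", 4100.0, 100.0, 1100.0, 100.0),
    ("geneB", 1100.0, 100.0, 1100.0, 100.0),
    ("geneC", 350.0, 100.0, 2100.0, 100.0),
]

results = {}
for gene, r_fg, r_bg, g_fg, g_bg in spots:
    ratio = red_green_ratio(r_fg, r_bg, g_fg, g_bg)
    # log2(ratio) is the signed offset from the diagonal when both
    # intensities are plotted on log2 axes with the experimental
    # sample on the y-axis.
    results[gene] = (ratio, math.log2(ratio), classify(ratio))

for gene, (ratio, log_ratio, label) in results.items():
    print(f"{gene}: ratio={ratio:.3f} log2={log_ratio:+.2f} {label}")
```

Working with the log2 ratio rather than the raw ratio makes a 2-fold increase (+1) and a 2-fold decrease (-1) symmetric about zero, which is why the cutoff is applied as `fold` and `1 / fold`.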

Figure 1: A hybridized microarray printed by the AECOM robot (Cheung et al., 1999). A 5550-gene mouse cDNA microarray was printed, hybridized to Cy3-dUTP and Cy5-dUTP probes from wild-type and mutant mouse cell lines, and imaged using the AECOM laser scanner. Shown is one of the four pen-tip printing areas of the array.

With just one experimental condition and a control, the data analysis is limited to a list of regulated genes ranked by fold change or by the significance of the change as determined by a t-test. Normalization of the data must be performed before separate arrays can be compared. With multiple experimental conditions (e.g. time points or drug doses), the genes are often grouped into clusters that behave similarly under the different conditions. Computational methods such as hierarchical clustering or k-means are used to analyze the massive amounts of data generated by these experiments. Gene clusters are visualized with trees or color-coded matrices by placing genes with similar patterns of expression into a clustered group (Figure 2). Image processing and analysis software is commercially available, and several packages are available as freeware: http://www.bio.davidson.edu/projects/GCAT/GCATprotocols.html, http://www.tigr.org/software/, http://research.nhgri.nih.gov/microarray/main.html, http://www.bio.davidson.edu/projects/magic/magic.html.
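As a rough illustration of the single-condition analysis, the sketch below centers the log2 ratios of one array on their median and then ranks genes by a two-sample t statistic. The notes do not say which normalization or which form of the t-test is meant, so median centering and Welch's unequal-variance statistic are assumptions made here, and all gene names and values are hypothetical.

```python
import math
from statistics import mean, median, variance

def median_normalize(log_ratios):
    """Center one array's log2 ratios on zero by subtracting the median.

    Assumption: most genes on the array are not differentially expressed.
    """
    m = median(log_ratios)
    return [x - m for x in log_ratios]

def welch_t(a, b):
    """Welch's t statistic for two groups of replicate measurements.

    Requires at least two replicates per group and a nonzero variance
    in at least one group; otherwise the standard error is zero.
    """
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

# Hypothetical log2 intensities:
# gene -> (control replicates, experimental replicates)
data = {
    "geneA": ([8.0, 8.2, 7.8], [10.1, 9.9, 10.0]),
    "geneB": ([9.0, 9.1, 8.9], [9.0, 9.2, 8.8]),
    "geneC": ([11.0, 10.8, 11.2], [9.0, 9.3, 8.7]),
}

# Each row: (gene, mean log2 fold change, t statistic),
# sorted so the most significant changes come first.
ranked = sorted(
    ((gene, mean(exp) - mean(ctl), welch_t(exp, ctl))
     for gene, (ctl, exp) in data.items()),
    key=lambda row: abs(row[2]),
    reverse=True,
)

for gene, log_fc, t in ranked:
    print(f"{gene}: log2 fold change={log_fc:+.2f} t={t:+.2f}")

# Normalizing a hypothetical array of log2 ratios.
print(median_normalize([1.5, 0.5, 0.5, -0.5, 2.5]))
```

Turning the t statistic into a p-value needs the t distribution, which the Python standard library does not provide; a package such as SciPy (`scipy.stats.ttest_ind` with `equal_var=False`) would normally be used for that step.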

Figure 2: Clustering of gene expression patterns. (a) The ratio of gene expression in the experimental sample relative to the control is displayed for individual genes using a color scale. Black indicates no change in expression, while an increase in the experimental relative to the control is shown as red, and a decrease in the experimental relative to the control is shown as green. Genes displaying similar patterns of induction or repression are clustered together. (b) Clustering of thousands of genes by patterns of gene induction or repression following a treatment (Campbell and Heyer, 2003).
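The grouping shown in Figure 2 can be imitated on a toy scale. The sketch below is a minimal agglomerative (hierarchical) clustering with average linkage and a distance of 1 minus the Pearson correlation, plus a three-color simplification of the red/black/green scale. The linkage rule, the distance measure, the color threshold and the four expression profiles are all assumptions chosen for illustration; they are not taken from the figure or from the cited software.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two profiles (undefined for flat profiles)."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def cluster_distance(c1, c2, profiles):
    """Average linkage: mean of (1 - correlation) over cross-cluster pairs."""
    pairs = [(g1, g2) for g1 in c1 for g2 in c2]
    return sum(1.0 - pearson(profiles[g1], profiles[g2]) for g1, g2 in pairs) / len(pairs)

def agglomerate(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[gene] for gene in profiles]
    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]], profiles),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters

def heatmap_color(log_ratio, threshold=1.0):
    """Three-level stand-in for the continuous color scale of Figure 2."""
    if log_ratio >= threshold:
        return "red"      # induced in the experimental sample
    if log_ratio <= -threshold:
        return "green"    # repressed in the experimental sample
    return "black"        # no change

# Hypothetical log2 ratios at four time points after a treatment.
profiles = {
    "geneA": [0.1, 1.0, 2.0, 2.9],
    "geneB": [0.0, 0.9, 2.1, 3.0],
    "geneC": [0.0, -1.1, -2.0, -3.1],
    "geneD": [0.2, -0.9, -2.2, -2.8],
}

clusters = agglomerate(profiles, n_clusters=2)
for cluster in clusters:
    for gene in cluster:
        print(gene, [heatmap_color(v) for v in profiles[gene]])
    print("---")
```

This brute-force version recomputes every pairwise distance at each merge, so it is only suitable for a handful of genes; for thousands of genes a library implementation would be the practical choice.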

Microarray analysis of gene expression does have limitations that researchers must consider. Induced levels of mRNA do not always correlate well with induced levels of protein. Translational and post-translational regulatory mechanisms that affect the activity of various cellular proteins are not examined by DNA microarrays, though the emerging field of proteomics is beginning to address this issue. Other limitations of microarray analysis include the impact of alternative splicing during transcript processing and the limited detectability of unstable mRNAs. Differential gene expression results must be confirmed through direct examination of selected genes. These analyses are typically performed at the RNA level, using RNA blots or quantitative RT-PCR to examine transcripts of a specific gene, and/or at the protein level, using immunoblots to measure protein concentration. Additional studies often include alteration of gene function with targeted mutations, antisense technology, or protein inhibition.

References

  1. Bowtell, D.L. (1999). Options available—From start to finish—for obtaining expression data by microarray. Nat. Genet. Suppl., 21, 25–32.
  2. Campbell, A.M., Heyer, L.J. (2003). Discovering Genomics, Proteomics and Bioinformatics. CSHL Press and Benjamin Cummings, San Francisco, CA.
  3. Cheung, V.G., Morley, M., Aguilar, F., Massimi, A., Kucherlapati, R., Childs, G. (1999). Making and reading microarrays. Nat. Genet. Suppl., 21, 15–19.
  4. Freeman, W.M., Robertson, D.J., Vrana, K.E. (2000). Fundamentals of DNA hybridization arrays for gene expression analysis. BioTechniques, 29, 1042–1055.
  5. Knudsen, S. (2002). A Biologist’s Guide to Analysis of DNA Microarray Data. Wiley-Liss, New York.
  6. Schena, M. (2003). Microarray Analysis. Wiley-Liss, Hoboken, NJ.
