<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE document PUBLIC "-//CNX//DTD CNXML 0.5 plus MathML//EN" "http://cnx.rice.edu/cnxml/0.5/DTD/cnxml_mathml.dtd">
<document xmlns="http://cnx.rice.edu/cnxml" xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:bib="http://bibtexml.sf.net/" xmlns:m="http://www.w3.org/1998/Math/MathML" id="m10995">
  <name xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">Introduction to NCBI</name>
  <metadata xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">
  <md:version xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">2.0</md:version>
  <md:created xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">2003/01/12</md:created>
  <md:revised xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">2003/01/12</md:revised>
  <md:authorlist xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">
      <md:author xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="mscates">
      <md:firstname xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">Susan</md:firstname>
      
      <md:surname xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">Cates</md:surname>
      <md:email xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">mscates@bioc.rice.edu</md:email>
    </md:author>
  </md:authorlist>

  <md:maintainerlist xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">
    <md:maintainer xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="mscates">
      <md:firstname xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">Susan</md:firstname>
      
      <md:surname xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">Cates</md:surname>
      <md:email xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">mscates@bioc.rice.edu</md:email>
    </md:maintainer>
  </md:maintainerlist>
  
  <md:keywordlist xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">
    <md:keyword xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">bioinformatics</md:keyword>
    <md:keyword xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">biological databases</md:keyword>
    <md:keyword xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">database mining</md:keyword>
    <md:keyword xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">NCBI</md:keyword>
    <md:keyword xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">sequence alignment</md:keyword>
  </md:keywordlist>

  <md:abstract xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">This is an introduction to the bioinformatics website provided by the National Center for Biotechnology Information (NCBI).  It includes an overview of the basic mission of NCBI and an introduction to the most commonly used biological databases available on the website and the tools for viewing and analyzing the data.</md:abstract>
</metadata>



 <content xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/">
    <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="intro">
      The National Center for Biotechnology Information (NCBI)
      provides a comprehensive website for biologists that includes
      biology-related databases and tools for viewing and analyzing
      the data they contain.  A division of the National
      Library of Medicine at the National Institutes of Health, NCBI
      is the agency responsible for creating automated systems for
      storing and analyzing the rapidly growing volume of genetic
      and molecular data.  One of the most difficult challenges faced
      in the field of bioinformatics is how to store, in an easily
      accessible manner, the overwhelming abundance of new
      information, including the sequences of entire genomes, the
      ongoing discoveries of new genes and gene products, and the
      determinations of their functions and structures.  NCBI was
      established as the government's response to the need for more
      and better information processing methods to deal with this
      challenge.
</para>
    <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="para2">
      View the <link xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" src="http://www.ncbi.nlm.nih.gov/">NCBI home
	page</link>.  The list along the left border of the home page
	gives a good overview of the tools and databases that can be
	accessed through NCBI.  Clicking on the
	link entitled "About NCBI" produces a second menu containing
	the topics "A Science Primer" and "Databases and Tools",
	among others.  Selecting "A Science Primer" gives access to
	general definitions and introductory information about the
	branches of science included in bioinformatics.  Many
	bioinformatics terms are defined in this section in
	clear, basic language, making the Primer an excellent
	first resource.  Selecting "Databases and Tools" from the
	"About NCBI" menu yields a complete and well-ordered
	listing of the available databases and tools.  This page
	is a good one to bookmark.
    </para>
    <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="para3">
      The first item under the "Databases and Tools" menu is
      "Literature Databases".  PubMed is the most heavily used of the
      literature databases and can be used to access MEDLINE
      citations from biological and medical journals, dating back
      to articles written in the mid-1960s.  The second item under
      the "Databases and Tools" menu is "Entrez Databases".  <cite xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" src="#entrez">Entrez</cite> is a search and retrieval system
      developed by NCBI that is capable of accessing integrated
      information by searching many of the NCBI databases with just
      one query (instead of searching only one database per query,
      then having to repeat the query to find information on the same
      topic from another NCBI database).  Clicking on the "Entrez
      Databases" link shows which NCBI databases are included in the
      search when you launch an Entrez query.  The "Nucleotide Databases" link
      under the "Databases and Tools" menu lists all the sequence
      databases available through NCBI.  These sequence databases
      contain annotated collections of publicly available DNA, RNA and
      protein sequences.  The evolution of bioinformatics data mining
      methods has been largely driven by the prodigious amount of
      sequence information collected by scientists in recent years.
      New sequences of unknown function can be compared with sequences
      of well-characterized genes and proteins.  Similarities
      between the new sequences and the well-characterized ones can
      then be used to form hypotheses about the function or
      structure of the new sequences.</para>

    <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="para4">
      One of the items listed under the NCBI "Databases and Tools"
      menu is "Tools for Data Mining".  Selecting this topic
      shows a list of data retrieval tools,
      including Entrez, mentioned above, and BLAST, the <cite xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" src="#altetc">Basic Local Alignment Search Tool</cite>.  BLAST
      is the predominant sequence alignment tool for performing rapid
      searches of nucleotide and protein sequence databases.  It
      detects regions of local similarity between the query sequence
      and the database sequences, rather than aligning the sequences
      globally along their full length.
    </para>
    <para xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="conclusion">
      This is a brief glimpse at some of the more widely used tools
      and databases provided by NCBI, intended to
      help the novice get a feel for the number and types of
      bioinformatics tools that are available on the Internet today.
      Several of these tools are covered in more detail in subsequent
      modules of this bioinformatics course.  Before
      proceeding to the next module, take a moment to return to the
      "About NCBI" webpage menu and glance through some of the
      interesting web pages linked under the topics "A Science
      Primer", "Outreach and Education", and "News".
    </para>

  </content>

  <bib:file>
    <bib:entry id="entrez">
      <bib:article>
	<bib:author>Benson D.A., Boguski M.S., Lipman D.J., Ostell J.</bib:author>
  
	<bib:title>GenBank</bib:title>
	<bib:journal>Nucleic Acids Res.</bib:journal>
	<bib:year>1994</bib:year>
	<bib:volume>22</bib:volume>
	<bib:pages>3441-3444</bib:pages>
      </bib:article>
    </bib:entry>

    <bib:entry id="altetc">
      <bib:article>
	<bib:author>Altschul S.F., Gish W., Miller W., Myers E.W., Lipman D.J.</bib:author>
	<bib:title>Basic local alignment search tool</bib:title>
	<bib:journal>J. Mol. Biol.</bib:journal>
	<bib:year>1990</bib:year>
	<bib:volume>215</bib:volume>
	<bib:pages>403-410</bib:pages>
      </bib:article>
    </bib:entry>
  </bib:file>
</document>
