<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE document PUBLIC "-//CNX//DTD CNXML 0.5 plus MathML//EN" "http://cnx.rice.edu/cnxml/0.5/DTD/cnxml_mathml.dtd">
<document xmlns="http://cnx.rice.edu/cnxml" xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:bib="http://bibtexml.sf.net/" id="m10142">
  
  <name>Graphing Methods Introduction</name>
  
  <metadata>
  <md:version>2.12</md:version>
  <md:created>2001/06/27</md:created>
  <md:revised>2003/07/18 14:49:49.149 GMT-5</md:revised>
  <md:authorlist>
    <md:author id="dmlane">
      <md:firstname>David</md:firstname>
      
      <md:surname>Lane</md:surname>
      <md:email>lane@rice.edu</md:email>
    </md:author>
  </md:authorlist>

  <md:maintainerlist>
    <md:maintainer id="dmlane">
      <md:firstname>David</md:firstname>
      
      <md:surname>Lane</md:surname>
      <md:email>lane@rice.edu</md:email>
    </md:maintainer>
    <md:maintainer id="jago">
      <md:firstname>Adan</md:firstname>
      
      <md:surname>Galvan</md:surname>
      <md:email>jago@rice.edu</md:email>
    </md:maintainer>
    <md:maintainer id="meyer">
      <md:firstname>Eileen</md:firstname>
      
      <md:surname>Meyer</md:surname>
      <md:email>meyer@rice.edu</md:email>
    </md:maintainer>
  </md:maintainerlist>
  
  <md:keywordlist>
    <md:keyword>graphs</md:keyword>
    <md:keyword>statistics</md:keyword>
  </md:keywordlist>

  <md:abstract>Introduction to graphs.</md:abstract>
</metadata>
  
  <content>
    
    <section id="into">
      <name>Clearing up the Draft with Graphs</name>
      
      <para id="intro1">
	In 1969 the war in Vietnam was at its height.  An agency
	called the <term>Selective Service</term> was charged with
	finding a fair procedure to determine which young men would be
	conscripted ("drafted") into the U.S. military.  The procedure
	was supposed to be fair in the sense of not favoring any
	culturally or economically defined subgroup of American men.
	Choosing "draftees" solely on the basis of a person's birth
	date was judged to be fair, so a birthday lottery was devised.
	Pieces of paper representing the 366 days of the year
	(including February 29) were placed in plastic capsules, poured
	into a rotating drum, and then drawn one at a time.  The order
	in which a birth date was drawn became its draft number.  The
	lower the draft number, the sooner the person would be drafted.
	Men with high enough numbers were not drafted at all.
      </para>
      
      <list id="links">
	<item>
	  <link src="http://www.dartmouth.edu/%7Echance/chance_news/recent_news/chance_news_6.10.html#draftlottery">
	    New York Times article about bias in the lottery</link>
	</item>
	<item>
	  <link src="http://www.landscaper.net/draft.htm">Article
	    about the draft itself</link>
	</item>
	<item><link src="http://psych.rice.edu/online_stat/chapter2/data/draft70yr.txt">Raw Data</link></item>
      
      </list>
      
      <para id="second">
	The draft table below shows the order in which birth dates
	were drawn from the drum (from left to right).  The first
	number selected was 258, which meant that someone born on the
	258th day of the year (September 14th) got a draft number of
	"1" and was among the first to be drafted.  The second number
	was 115, so someone born on the 115th day (April 24th) got a
	draft number of "2".  All 366 birth dates were assigned draft
	numbers in this way.  Someone born on the 160th day of the
	year (June 8th, the last birth date drawn) got a draft number of 366.
      </para>
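      <para id="second-sketch">
	The numbering rule just described can be sketched in a few
	lines of Python.  This is only an illustration: the names
	first_draws, day_to_date, and draft_number are hypothetical,
	and 1972 is an arbitrary leap year chosen so that February 29
	is counted.  The three values come from the start of the draft
	table.
      </para>
      <code type="block">
```python
from datetime import date, timedelta

# First three birth dates drawn, copied from the start of the draft table.
# Each value is a day of the year (1 to 366).
first_draws = [258, 115, 365]


def day_to_date(day_of_year, leap_year=1972):
    """Convert a day number (1 to 366) to a calendar date in a leap year.

    The default year is an assumption made only so that February 29 exists.
    """
    return date(leap_year, 1, 1) + timedelta(days=day_of_year - 1)


# The draft number of a birth date is its position in the drawing,
# counting from 1 for the first date drawn.
draft_number = {day: position
                for position, day in enumerate(first_draws, start=1)}

for day, number in draft_number.items():
    print(day_to_date(day).strftime("%B %d"), "has draft number", number)
```
      </code>
      <para id="second-sketch-note">
	Feeding all 366 values from the draft table into the same
	dictionary would give every birth date its draft number.
      </para>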
      
      <section id="dat1">
	<name>Draft Table</name>
	
	<para id="table1">
	  The birth dates (as days of the year, 1 to 366) drawn in the
	  1970 draft lottery, in order of appearance.
	</para>

  <code type="block">
    258 115 365 45 292 250 300 251 327 341 244 342 190 102 
    194 364 15 270 306 156 223 178 206 279 50 349 203 157 62 
    91 145 92 77 307 128 237 132 304 346 124 345 195 344 229 
    215 316 332 221 247 189 312 25 357 218 137 340 54 19 24 
    173 242 112 264 179 131 317 207 43 165 356 254 286 169 
    118 140 311 28 362 305 314 95 249 94 360 159 32 280 210 
    46 109 38 26 183 302 359 351 313 199 334 366 5 228 151 
    171 343 222 321 61 175 158 214 138 259 219 185 236 296 
    23 267 198 16 67 363 104 276 318 319 353 336 136 320 330 
    133 163 355 71 177 287 66 18 231 225 322 33 217 323 98 
    107 269 42 273 44 204 230 127 326 338 255 2 266 246 358 
    348 30 339 76 241 220 75 86 289 205 361 335 257 299 263 
    135 56 167 39 328 141 252 325 21 202 187 48 200 120 294 
    213 9 268 298 130 227 8 79 297 278 324 265 58 162 260 
    121 182 35 31 47 68 36 4 41 90 101 100 284 12 180 88 6 
    245 150 201 154 303 329 105 248 271 281 17 55 285 14 80 
    354 293 256 295 277 239 262 174 193 153 142 3 114 97 290 
    261 83 272 84 73 108 216 119 253 301 82 309 63 87 96 211 
    93 164 106 168 64 125 191 139 186 20 333 315 282 192 60 
    238 212 291 209 53 234 49 65 288 134 148 34 123 59 72
    155 51 208 352 1 7 226 149 331 310 232 99 152 347 274 
    113 69 13 144 350 129 197 70 224 10 143 188 337 11 122 
    196 78 243 81 161 110 22 40 235 117 170 283 85 233 111 
    103 37 308 29 184 116 240 181 74 27 166 147 176 275 172 
    146 89 52 126 57 160
  </code>
      
      </section>
    </section>
    
    <para id="third">
      The intention was for every birth date to have the same chance
      of coming up first as coming up second, or third, etc.  Was this
      reasonable expectation met, or were some times of year more
      likely to get lower numbers than others?  Look at the data above
      and see if you can discern the answer to this question.  You'll
      see that staring at the numbers in the table provides little
      idea of the overall pattern, and thus does not help to decide
      whether the birth dates were drawn randomly.
    </para>
    
    <para id="fourth">
      Things are much clearer if we graph the relation between birth
      dates and draft number.  There are many ways of creating such a
      graph. Let's proceed as follows.  First, we'll divide the 366
      birth dates into thirds (122 days each).  The first third goes
      from January 1 to May 1, the second from May 2 to August 31, and
      the last from September 1 to December 31.  The three groups of
      birth dates yield three groups of draft numbers.  The draft
      number for each birth date is its position in the order of the
      drawing.
    </para>
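    <para id="fourth-sketch">
      As a rough sketch of the grouping step, the Python below
      assigns each day of the year to a third and collects the draft
      numbers by group.  The names DAYS_PER_THIRD, third_of_year,
      and groups are hypothetical, and only the first five draws from
      the draft table are used to keep the example short.
    </para>
    <code type="block">
```python
DAYS_PER_THIRD = 122  # 366 days divided into three equal groups


def third_of_year(day_of_year):
    """Return 1, 2, or 3: the third of a 366-day year containing the day."""
    if day_of_year not in range(1, 367):
        raise ValueError("day_of_year must be between 1 and 366")
    return (day_of_year - 1) // DAYS_PER_THIRD + 1


# First five birth dates drawn, copied from the start of the draft table.
first_draws = [258, 115, 365, 45, 292]

# Collect the draft numbers (positions in the drawing) by third of the year.
groups = {1: [], 2: [], 3: []}
for draft_number, day in enumerate(first_draws, start=1):
    groups[third_of_year(day)].append(draft_number)

print(groups)
```
    </code>
    <para id="fourth-sketch-note">
      Day 122 (May 1) falls in the first third and day 123 (May 2)
      in the second, matching the boundaries given above.
    </para>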
    
    <para id="p4b">
      <link src="http://psych.rice.edu/online_stat/chapter2/data/draft3.html">View
	the three sets of 122 draft numbers</link>.
    </para>
    
    <para id="five">
      Next, from each group of draft numbers we'll pick six numbers to
      summarize all 122 of them.  Specifically, in each group, we
      determine:
      
      <list id="list2" type="enumerated">
	<item>The minimum draft number of the group</item>
	<item>The draft number at the 25th percentile of the group</item>
	<item>The draft number at the 50th percentile of the group</item>
	<item>The draft number at the 75th percentile of the group</item>
	<item>The maximum draft number of the group</item>
	<item>The mean of the 122 draft numbers</item>
      </list>
      
      Each set of six numbers (one such set for each group of birthdays)
      is then used to draw a box along a vertical scale running from 1
      to 400.  (We go beyond 366 just to stop at a nice, round
      number.)  The bottom of the box is drawn at the draft number
      corresponding to the 25th percentile.  The top is drawn at the
      draft number corresponding to the 75th percentile.  The draft
      number corresponding to the 50th percentile is drawn as a line
      inside the box.  Lines outside of the box mark the minimum and
      maximum draft numbers.  Finally, a plus sign is used to mark the
      mean.  The three boxes are then set side-by-side starting with
      the earlier birth dates and finishing with the latest.  This
      procedure gives us the three boxes shown in <cnxn strength="9" target="draft1"/>.  For example, we see from the first box that
      the 25th percentile of the first group is the draft number 122
      whereas the 75th percentile is 298.  The 50th percentile of the
      first group is 217, the mean is 210, and the minimum and maximum
      draft numbers are 2 and 365.
    </para>
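    <para id="five-sketch">
      The six summary values can be computed along the following
      lines.  This is a minimal sketch, not the procedure used to
      produce the figure: the function six_number_summary and the
      list toy_group are hypothetical, the toy values are not the
      lottery data, and the "inclusive" interpolation rule is just
      one of several percentile definitions, so results for real
      data may differ slightly from the values quoted above.
    </para>
    <code type="block">
```python
import statistics


def six_number_summary(draft_numbers):
    """Return the six values used to draw one box for a group."""
    # quantiles(..., n=4) returns the 25th, 50th, and 75th percentiles.
    # The "inclusive" method is an assumption; other definitions exist.
    q1, median, q3 = statistics.quantiles(
        draft_numbers, n=4, method="inclusive"
    )
    return {
        "min": min(draft_numbers),
        "p25": q1,      # bottom of the box
        "p50": median,  # line inside the box
        "p75": q3,      # top of the box
        "max": max(draft_numbers),
        "mean": statistics.mean(draft_numbers),  # plus sign
    }


# Hypothetical toy group for illustration only, NOT the real lottery data.
toy_group = [2, 40, 85, 130, 170, 215, 260, 300, 365]

summary = six_number_summary(toy_group)
for label, value in summary.items():
    print(label, value)
```
    </code>
    <para id="five-sketch-note">
      Calling the function once per third of the year would give the
      three sets of six numbers needed to draw the three boxes.
    </para>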
    
    <figure id="draft1">
      <name/>
      <media type="image/png" src="draft.png"/>
      <caption>
	Draft numbers as a function of the part of the year the person
	was born.
      </caption>
    </figure>
    
    <para id="para7">
      If the draft numbers had been chosen randomly, the three boxes
      would look about the same.  However, they differ
      systematically.  The later in the year someone was born, the
      lower their draft number was likely to be.  In other words,
      the box representing those born in the first third of the
      year is higher than the box representing those born in the
      second third, which is, in turn, higher than the box for those
      born in the last third.  Had there been no relationship between
      birth date and draft number, the three boxes in <cnxn strength="9" target="draft1"/> would sit at about the same height.
      Apparently the plastic capsules holding the birth dates were not
      mixed sufficiently by the rotating drum.  The last ones put
      in tended to be the first ones pulled out.  (Which young men went
      to war was thus partly determined by a premature decision to stop
      turning the drum.)
    </para>
    
    <para id="para8">
      The important point is that <cnxn strength="9" target="draft1"/>
      brings order to a confusing array of data.  Specifically, it
      makes clear the relationship between birth date and draft
      number.  Although not everyone born late in the year was
      assigned a low draft number, draft numbers tended to decrease
      as birth dates fell later in the year.  This relation is not easy to
      detect from the numbers in <cnxn target="dat1" strength="9">Draft Table</cnxn> above, but the visual
      representation in <cnxn strength="9" target="draft1"/> makes it
      easy to see.
    </para>
    
    <para id="para9">
      Choosing what to graph and how to graph it is often the most
      important part of a statistical analysis.  Even sophisticated
      statistical analyses are often less revealing than a
      well-constructed graph.
    </para>
  
    
  </content>
  
</document>
