<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE document PUBLIC "-//CNX//DTD CNXML 0.5//EN" "http://cnx.rice.edu/technology/cnxml/schema/dtd/0.5/cnxml_plain.dtd">
<document xmlns="http://cnx.rice.edu/cnxml" xmlns:md="http://cnx.rice.edu/mdml/0.4" xmlns:bib="http://bibtexml.sf.net/" id="new">
  <name>Import from Elsewhere</name>
  <metadata>
  <md:version>1.1</md:version>
  <md:created>2007/08/30 00:23:26.286 GMT-5</md:created>
  <md:revised>2007/08/30 01:08:40.326 GMT-5</md:revised>
  <md:authorlist>
      <md:author id="jgrimes">
      <md:firstname>Joseph</md:firstname>
      <md:othername>E.</md:othername>
      <md:surname>Grimes</md:surname>
      <md:email>joe_grimes@sil.org</md:email>
    </md:author>
  </md:authorlist>

  <md:maintainerlist>
    <md:maintainer id="jgrimes">
      <md:firstname>Joseph</md:firstname>
      <md:othername>E.</md:othername>
      <md:surname>Grimes</md:surname>
      <md:email>joe_grimes@sil.org</md:email>
    </md:maintainer>
  </md:maintainerlist>
  
  <md:keywordlist>
    <md:keyword>collection</md:keyword>
    <md:keyword>converter</md:keyword>
    <md:keyword>import</md:keyword>
    <md:keyword>spreadsheet</md:keyword>
    <md:keyword>Wordcorr</md:keyword>
    <md:keyword>WordSurv</md:keyword>
    <md:keyword>XML</md:keyword>
  </md:keywordlist>

  <md:abstract>Data can be morphed into Wordcorr's XML structure by using converters.</md:abstract>
</metadata>
  <content>
    <para id="delete_me">Many linguists have already put data into a computer, often at a time when they had no access to any standardized scheme for showing the structure of their data set. At present there are two paths for getting data into Wordcorr from another computer-readable form: spreadsheets and WordSurv. People will probably invent more paths as they are needed.</para>
    <para id="element-418">If you have comparative word lists in a spreadsheet, chances are good that you can copy them en masse into a special spreadsheet constructed by Maria Faehndrich. A Python script then transforms that spreadsheet into a Wordcorr XML file. The special spreadsheet, the Python script, and complete instructions are in the "utilities" package on Wordcorr's <link src="http://sourceforge.net/project/showfiles.php?group_id=63303">download site</link>.</para>
    <para id="element-833">The idea is this: word lists are usually displayed as a table, with the names or codes of the speech varieties across the top and the meanings of the forms down the left side. The forms themselves fill in the boxes.</para>
    <para id="element-187">You copy your word lists in a block from your spreadsheet to the place indicated on the special spreadsheet. You also fill in all the metadata for your collection on the special spreadsheet. Then you save that spreadsheet as a tab-delimited file (you may need to get a computer guru to show you how) and run the Python script on the result (the same guru can probably handle it). Finally, you import the resulting file into Wordcorr.</para>
    <para id="element-614">A program called WordSurv has been in use for some time. Its original aim was to provide lexical similarity counts for possibly related speech varieties. It continues to be used as a way of storing word lists in a computer, even though few linguists give credence to its conclusions about relationships, which have been shown to be statistically unreliable.</para>
    <para id="element-981">Wordcorr makes use of WordSurv's very sensible plan for data storage, but applies widely tested and agreed-upon best practices to produce its results. Wordcorr contains a converter for the first versions of WordSurv, with instructions in Wordcorr Help. More recent versions can be converted by the <link src="http://www.palmsurv.com">Palmsurv</link> converter.</para>
    <para id="element-574">WordSurv and older spreadsheets were unable to handle phonetic symbols, so linguists used their imagination: N for angma, E for epsilon, 7 for glottal stop, and so on, usually differently for each collection. One of the spreadsheet utilities includes a table in which you can enter the Unicode characters of the International Phonetic Alphabet that you want to convert to. For imports from WordSurv, however, hand editing on the Data panel seems to be the only feasible solution so far.</para>
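    <para id="remap-sketch-intro">To make the idea of a substitution table concrete, here is a minimal sketch in Python of how ad hoc codes such as N, E, and 7 might be turned into Unicode IPA characters in a tab-delimited word list. It is not the table that ships with the spreadsheet utilities and not part of Wordcorr. The names <code>SUBSTITUTIONS</code>, <code>remap_form</code>, and <code>remap_row</code> are invented for illustration, and the sketch assumes that every code is a single character and that the first column of each row holds the meaning.</para>
    <code type="block" id="remap-sketch"><![CDATA[
```python
# Hypothetical ad hoc codes taken from the examples in the text.
# Every collection will probably need its own table.
SUBSTITUTIONS = {
    "N": "\u014b",  # velar nasal
    "E": "\u025b",  # epsilon, open-mid front vowel
    "7": "\u0294",  # glottal stop
}


def remap_form(form, substitutions=SUBSTITUTIONS):
    """Replace each single-character ad hoc code in one form."""
    return form.translate(str.maketrans(substitutions))


def remap_row(line, substitutions=SUBSTITUTIONS):
    """Remap the forms in one tab-delimited row.

    The first column is assumed to hold the meaning, so it is left
    untouched; otherwise the N in a gloss like "Night" would change too.
    """
    gloss, *forms = line.rstrip("\n").split("\t")
    remapped = [remap_form(form, substitutions) for form in forms]
    return "\t".join([gloss] + remapped)


# One row: a meaning followed by made-up forms from two speech varieties.
print(remap_row("Night\tNa7E\tNaE"))
```
]]></code>
    <para id="remap-sketch-note">A character-by-character table like this is simple, but it cannot tell a capital N that stands for angma from one that really is a capital N, so the result still needs a careful look before you import it.</para>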
  </content>
  
</document>
