Import and Export

Page 4 of 5

Import / Export a pedigree from / as a LINKAGE file

The LINKAGE file format

The LINKAGE file format is essentially the same as the CSV file format shown on the previous pages, except that the items are separated by spaces rather than by commas as in CSV (Comma Separated Values).

The first 6 columns are

  1. Family ID
  2. Individual ID
  3. Father ID
  4. Mother ID
  5. Gender
  6. Phenotype or affection status:
    Integer 0 .. 2
    1 = unaffected, 2 = affected, 0 = unknown
    Phenotypes outside 0..2 will be treated as '0' (unknown).
Columns 7 onward hold the marker genotypes:
    Each marker is represented by two columns (one for each allele, separated by a space)

All items are integers - individual IDs must be unique integers

With PED, only columns 2 - 6 must be integers >= 0

The family ID may be a string (or may even be omitted if there is only one family in the file). The marker genotypes must be integers if you use the data for linkage analysis, but PED does not test whether these alleles are integers.
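The column rules above can be sketched in a few lines of Python. This is only an illustration of the format as described on this page, not PED's own import code: the names `Member` and `parse_linkage_line` are made up for the example, and the sketch assumes that the family ID column is present on every line.

```python
from typing import List, NamedTuple, Tuple


class Member(NamedTuple):
    """One record of a LINKAGE file (hypothetical helper type)."""
    family: str                      # column 1, may be a string
    ind: int                         # column 2, individual ID
    father: int                      # column 3
    mother: int                      # column 4
    gender: int                      # column 5
    phenotype: int                   # column 6, always 0, 1 or 2 after parsing
    markers: List[Tuple[str, str]]   # columns 7.., one allele pair per marker


def parse_linkage_line(line: str) -> Member:
    """Split one space-separated record into its fields."""
    fields = line.split()
    if len(fields) < 6:
        raise ValueError(f"expected at least 6 columns, got {len(fields)}")

    # Columns 2 - 6 must be integers >= 0.
    numbers = [int(f) for f in fields[1:6]]
    if any(n < 0 for n in numbers):
        raise ValueError("columns 2 - 6 must be integers >= 0")
    ind, father, mother, gender, phenotype = numbers

    # Phenotypes outside 0..2 are treated as 0 (unknown).
    if phenotype not in (0, 1, 2):
        phenotype = 0

    # Alleles are kept as text, because they are not required to be integers
    # unless the data is used for linkage analysis.
    alleles = fields[6:]
    if len(alleles) % 2:
        raise ValueError("each marker needs two allele columns")
    markers = list(zip(alleles[0::2], alleles[1::2]))

    return Member(fields[0], ind, father, mother, gender, phenotype, markers)


record = parse_linkage_line("fam1 3 1 2 1 7 1 2 3 3")
print(record.phenotype)  # 0, because 7 lies outside 0..2
print(record.markers)    # [('1', '2'), ('3', '3')]
```

The sample line is invented for the example. A file without a family ID column would need a different split, which this sketch does not attempt.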

Use CSV format whenever possible

The original linkage file format uses a space as the item separator. So does PED when a pedigree is exported in linkage file format. CSV is more flexible when the pedigree information has to be imported into a spreadsheet or a database. Missing values will be recognized and will pose no problem if the pedigree has to be re-imported into PED.

Examples from the web

As long as you import CSV or linkage files into PED that have previously been exported from PED, there is nothing special to consider. In real life, you probably have data that does not adhere strictly to the linkage file format. Two examples taken from the web show how to avoid possible stumbling blocks.

1. Rearrange columns in data files

If you work mainly on linkage analysis, you have probably come across the following examples.

The first one is taken from http://linkage.rockefeller.edu/soft/linkage/. Select chapter 2.7 Pedigree Information (PEDFILE), and scroll down to the very last example on that page (just beneath the line The following data refer to a larger pedigree, taken from a coronary heart disease study, in PEDFILE form:).

PEDFILE

Here we are interested only in columns 1 (pedigree number), 2 (id number), 3 (father id), 4 (mother id), 8 (sex), and 10 (disease status).

Now prepare the data as follows (work from the original web page; here we have only a PNG file):

  1. Copy the data from the screen to a text file.
  2. Replace spaces by commas, then replace double and triple commas by a single one.
  3. Save the result as a CSV file and import it in a spreadsheet program.
  4. Delete all columns except those mentioned above.
  5. Save as a CSV file again.
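If you prefer a script to the manual clean-up, the same column selection could be done roughly as follows. This is a minimal sketch, not part of PED: the function name `pedfile_to_csv` and the sample rows are invented for illustration, and the sketch assumes that every data line has at least 10 whitespace-separated columns.

```python
import csv
import io

# 1-based column numbers named in the text:
# pedigree, id, father id, mother id, sex, disease status
KEEP = (1, 2, 3, 4, 8, 10)


def pedfile_to_csv(text: str, keep=KEEP) -> str:
    """Turn whitespace-separated PEDFILE text into CSV with selected columns."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for number, line in enumerate(text.splitlines(), start=1):
        # split() without arguments treats any run of spaces as one separator,
        # which replaces the "double and triple commas" step.
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) < max(keep):
            raise ValueError(f"line {number}: only {len(fields)} columns")
        writer.writerow([fields[k - 1] for k in keep])
    return out.getvalue()


# Invented sample rows, NOT the data from the web page.
sample = "1  1 0 0 3 0 0 1 0 1\n1  2 0 0 3 0 0 2 0 2\n"
print(pedfile_to_csv(sample))
```

The returned text can be written to a file with a .csv extension and imported in PED as described in the next step.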

This is the resulting CSV file:

CSV file

We launch PED, press the Import button, use the first five columns, and select column 6 as the 'phenotype'.

This is the resulting pedigree:

pedigree, imported from CSV

To adjust the layout, select Input Window - Options (or press [Ctrl]+O), then More, choose a symbol size and a font size of 99 %, press ID Position and choose top right, and close all dialogs.

If you take a closer look at, say, members number 19 - 21, you will see that an unknown phenotype results in a "?" inside the symbol, and affected members are displayed as black circles or squares, depending on the current phenotype file.

2. Missing links in data files

Sometimes you will get a pedigree much smaller than expected. For an example, please take a look at http://hg.wustl.edu/info/linkage/dataprep.html.

Scroll down to the bottom of the page until you see the line .pre File (in 'pre-MAKEPED', or 'pre-LINKAGE' format). On the right side a small file is displayed:

Linkage file

Importing this file in PED will result in the following pedigree:

imported pedigree - members missing

Only 5 members were imported. The IDs are at the top right position, so it is easy to discover the missing members:

missing members

There are three more "founders" (members with no parents of their own) in this small pedigree, and two more members (103, 105) that are children of these founders (30, 31, 32). We do not know how they are linked to the members of the previously shown pedigree. We add a pair of grandparents (9 and 10), and mark members 31 and 104 as their children:

a new pair of grandparents

On the right side we see the pedigree from above, on the left we have the newly added members. On top reside the grandparents:

a modified pedigree
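A data file can be checked for such missing links before it is imported. The sketch below groups the members of a pedigree into connected parts, where two members count as connected if one is listed as the parent of the other. It is an independent illustration and says nothing about how PED itself decides which members to import; the function name `family_components` and the sample IDs are invented.

```python
from collections import defaultdict


def family_components(members):
    """Group member IDs into connected parts of the pedigree.

    members maps an individual ID to (father ID, mother ID);
    0 stands for an unknown parent. Returns a list of sets, largest first.
    """
    graph = defaultdict(set)
    for ind, (father, mother) in members.items():
        graph[ind]  # make sure founders without children appear too
        for parent in (father, mother):
            if parent:
                graph[ind].add(parent)
                graph[parent].add(ind)

    seen = set()
    components = []
    for start in list(graph):
        if start in seen:
            continue
        # Depth-first walk over parent/child links.
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        components.append(group)
    return sorted(components, key=len, reverse=True)


# Invented example: two nuclear families with no link between them.
members = {
    1: (0, 0), 2: (0, 0), 3: (1, 2),
    4: (0, 0), 5: (0, 0), 6: (4, 5),
}
parts = family_components(members)
print(len(parts))  # 2
```

More than one part within a single family suggests a missing link of the kind shown above. Note that spouses are only connected through a shared child, so a childless couple also shows up as separate parts.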

Finally, two rules of thumb

  1. If the imported pedigree is smaller than expected, there is probably a missing link between the members of the pedigree.
  2. If the imported pedigree is higgledy-piggledy (did you ever see The Thin Man Goes Home?), a chaos of lines, circles, and squares, there is probably a loop or a consanguinity in your data file. Break that loop, and your pedigree will be properly imported.
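For the second rule, a rough way to look for consanguinity in a data file is to list every pair of parents who share a known ancestor. The sketch below does only that. It is an assumption-laden illustration, not PED's algorithm: the function names and sample IDs are invented, and marriage loops without a common ancestor (for example two brothers marrying two sisters) are not detected.

```python
def ancestors(ind, parents):
    """Return the set of all known ancestors of ind (0 = unknown parent)."""
    found = set()
    stack = [p for p in parents.get(ind, (0, 0)) if p]
    while stack:
        node = stack.pop()
        if node in found:
            continue  # already visited; also guards against faulty cyclic data
        found.add(node)
        stack.extend(p for p in parents.get(node, (0, 0)) if p)
    return found


def consanguineous_matings(parents):
    """Return sorted (father, mother) pairs that share at least one ancestor.

    parents maps an individual ID to (father ID, mother ID).
    """
    pairs = {(f, m) for f, m in parents.values() if f and m}
    result = []
    for father, mother in sorted(pairs):
        # Each partner is included in their own set so that a mating between
        # a parent and a descendant is reported as well.
        line_f = ancestors(father, parents) | {father}
        line_m = ancestors(mother, parents) | {mother}
        if line_f & line_m:
            result.append((father, mother))
    return result


# Invented example: 7 and 8 are first cousins and have a child, 9.
parents = {
    1: (0, 0), 2: (0, 0), 3: (1, 2), 4: (1, 2),
    5: (0, 0), 6: (0, 0), 7: (3, 5), 8: (6, 4),
    9: (7, 8),
}
print(consanguineous_matings(parents))  # [(7, 8)]
```

A pair reported here marks a place where the loop could be broken before the file is imported.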
