Polymorphic region predictions (PRPs) generated with the mPPR method (Zeller G, Clark RM, Schneeberger K, Bohlen A, Weigel D, Raetsch G. Detecting polymorphic regions in Arabidopsis thaliana with resequencing microarrays. Genome Research 2008, in press, PMID: 18323538).

For continuity with earlier studies, we have annotated PRPs relative to both the TAIR6 and TAIR7 genome annotations.

----------------------------------
Directory: 'whole_genome_predictions_txt'.

Archives included: _PRPs_txt.tar.bz2

Each bz2-compressed archive contains flat text files, one per accession per chromosome, that list the PRPs.

Column definitions:
  1. Accession
  2. Chromosome
  3. Start position of the PRP
  4. End (i.e. last) position of the PRP
  5. Number of repetitive positions within the PRP

----------------------------------
Directory: 'whole_genome_predictions_gff'.

Archives included: _PRPs_gff.tar.bz2

Each bz2-compressed archive contains one GFF file per accession per chromosome. The files adhere to the conventions of the GFF3 file format (see http://song.sourceforge.net/gff3.shtml).

Column definitions:
  1. Chromosome
  2. Prediction source (always "mPPR")
  3. Feature type (always "variation")
  4. Start position of the PRP
  5. End (i.e. last) position of the PRP
  6. Fraction of repetitive probes within the PRP
  7. Strand (always ".")
  8. Phase (always ".")
  9. Attributes (always ".")

-----------------------------------------------
Directories: 'prps_in_genes_tair6' and 'prps_in_genes_tair7'.

Note: Both directories contain the same sets of files; only the annotation used differs, as indicated in the file names (TAIR6 vs. TAIR7; see also below).

Files: _PRPs_in_genes_tair.txt.bz2

bz2-compressed, tab-delimited flat files, one per accession, that contain one line for each PRP overlapping a gene present in the TAIR annotation version (6 or 7).

Column definitions:
  1. Gene ID (ATG code)
  2. Chromosome
  3. Gene start position
  4. Gene end position
  5. PRP start position
  6. PRP end position
  7. Number of repetitive probes within the PRP

Files: _PRPs_in_locus_sequence_tair.fasta.bz2

bz2-compressed multi-FASTA files (one per accession) that contain one sequence per locus (as annotated in the respective TAIR version). Nucleotides within PRPs are masked with 'x' or 'X'. Upper-case letters correspond to exons and lower-case letters to introns. For genes on the Crick (-) strand, sequences are the reverse complement of the genome sequence.

Files: _PRPs_in_transcript_tair.fasta.bz2

bz2-compressed multi-FASTA files (one per accession) that contain one sequence entry per transcript (as annotated in the respective TAIR version), comprising only the exonic portions. Nucleotides within PRPs are masked with 'x' or 'X'. Here, upper-case letters correspond to coding sequence and lower-case letters to untranslated regions. For genes on the Crick (-) strand, sequences are the reverse complement of the genome sequence.
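
The file layouts described above can be read with a few lines of Python. The following is a minimal sketch, not part of the original data release. It assumes that the gene-overlap files contain exactly the seven tab-separated columns listed above, with no header line, and that the positions and probe counts are integers. The names GenePRP, parse_gene_prp, read_fasta and masked_fraction are hypothetical helpers introduced here for illustration, and the example values are made up.

```python
from typing import Iterable, Iterator, NamedTuple, Tuple


class GenePRP(NamedTuple):
    """One line of a PRPs-in-genes file (column order as in this README)."""
    gene_id: str
    chromosome: str
    gene_start: int
    gene_end: int
    prp_start: int
    prp_end: int
    repetitive_probes: int


def parse_gene_prp(line: str) -> GenePRP:
    """Parse one tab-delimited record; assumes seven columns, no header."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 7:
        raise ValueError(f"expected 7 columns, got {len(fields)}")
    # Chromosome is kept as text; the remaining columns are assumed integers.
    return GenePRP(fields[0], fields[1], *(int(v) for v in fields[2:]))


def prp_length(rec: GenePRP) -> int:
    """Length in bases; the end column is the last position of the PRP."""
    return rec.prp_end - rec.prp_start + 1


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from the lines of a multi-FASTA file."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def masked_fraction(seq: str) -> float:
    """Fraction of a sequence masked as PRP ('x' or 'X')."""
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "xX") / len(seq)


if __name__ == "__main__":
    # Made-up record and sequences, for illustration only.
    rec = parse_gene_prp("AT1G01010\t1\t3631\t5899\t4000\t4049\t3\n")
    print(rec.gene_id, prp_length(rec))
    for name, seq in read_fasta([">locus1", "ACGTxxXX", "ac", ">locus2", "ACGT"]):
        print(name, masked_fraction(seq))
```

To apply this to the compressed files directly, the standard-library call bz2.open(path, "rt") returns a text-mode handle that can be passed to read_fasta or iterated line by line for parse_gene_prp; path stands for whichever file is being read.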