Drosophila melanogaster genome annotation
release 4.1
date 20050207

DATA CONTENTS

Feature counts in release 4, 3
  compared annotation features (r40 nov04, r322 oct 04, r320 March 04)

Table of D. mel. genome feature counts per release.

Feature                              410     400     322     320
--------------------------------------------------------------
BAC                                  674      --     949     949  << reduced BACs (removed obsoletes?)
CDS                                18941   18715   18747   18746
DNA_motif                              5       5       5       5
EST                                   --      --      --  304257
RNA_motif                              0       0       1       0
aberration_junction                   86      86      86      87
cDNA_clone                            --      --      --   10204
chromosome_arm                         6       6       6      --
chromosome_band                     5770      --    5715      --
enhancer                              27      27      27      27
five_prime_UTR                     14641   14360   15769   13608
gene                               13449   13472   13472   13473  << reduced due to merges
insertion_site                       457     457     457     424
intron                             16362   16135   16153   16199
mRNA                               19572   19301   19302   18810  == all transcripts (see below)
mRNA_genscan                          --      --   19189      --
mRNA_piecegenie                       --      --   13794      --
match_HDP                            139     139    2448      --
match_RNAiHDP                        110     110      40      --
match_assembly_path                  434     434      --      --
match_blastx_aa_SP.hyp.dros           --      --     354      --
match_blastx_aa_SP.real.dros          --      --   22163      --
match_blastx_aa_SPTR.dmel         207911  207911      --      --
match_blastx_aa_SPTR.dros             --      --   68846      --
match_blastx_aa_SPTR.insect        16610   16610    7492      --
match_blastx_aa_SPTR.othinv        21451   21451   12471      --
match_blastx_aa_SPTR.othvert       18036   18036   11774      --
match_blastx_aa_SPTR.plant         11997   11997    9609      --
match_blastx_aa_SPTR.primate       20850   20850   16345      --
match_blastx_aa_SPTR.rodent        21644   21644   16081      --
match_blastx_aa_SPTR.worm          13765   13765   12679      --
match_blastx_aa_SPTR.yeast          5593    5593    5211      --
match_blastx_aa_TR.real.dros          --      --   43823      --
match_blastx_aa_users_i.dros          --      --    4633      --
match_fgenesh                         --      --   14837      --
match_genie                        11063   11063      --      --
match_genscan                      17811   17811      --      --
match_repeat_runner_seg               --    9198      --      --
match_repeatmasker                 11758   11758      --      --
match_sim4_na_ARGs.dros             1062    1062      --      --
match_sim4_na_ARGsCDS.dros           984     984      --      --
match_sim4_na_DGC.dros                --      --   15270      --
match_sim4_na_DGC_dros              6458    5159      --      --
match_sim4_na_EST.all_nr.dros         --      --  267828      --
match_sim4_na_adh.cDNAs.dros          --      --      51      --
match_sim4_na_cDNA.dros               --      --   10319      --
match_sim4_na_dbEST.diff.dmel      85910   82910      --      --
match_sim4_na_dbEST.same.dmel     169078  159793      --      --
match_sim4_na_gadfly.dros.RE..        --      --   14389      --
match_sim4_na_gadfly_dmel_r2       14249   14249      --      --
match_sim4_na_gb.dmel              26531   26531      --      --
match_sim4_na_gb.dros                 --      --   14977      --
match_sim4_na_gb.tpa.dmel           2214    2214      --      --
match_sim4_na_pe.dros                 --      --    3201      --
match_sim4_na_smallRNA.dros           98      98      --      --
match_sim4_na_transcript_dme..     19001   19001      --      --
match_sim4_na_transcript_dme..     18799   18799      --      --
match_sim4tandem_na_gb.dmel        23748   23748      --      --
match_tRNAscan-SE                    295     295      --      --
match_tblastx_na_agambiae         101190  101190      --      --
match_tblastx_na_dbEST.insect      34107   34107   16818      --
match_tblastx_na_dpse             263465  263465      --      --
match_tblastx_na_unigene.rod..        --      --   11707      --
mature_peptide                         7       7       7       8
ncRNA                                130      70      70      65  << new data
oligo                             197525      --  197726  193813  << added back old data
orthologous_region                    --      --   12101      --
point_mutation                       485     485     485     476
polyA_site                           107     107     107     101
processed_transcript                  --      --      --   16748
protein                               --      --      --  233812
protein_binding_site                  90      90      90      85
pseudogene                            39      40      40      39
rRNA                                  96      96      96      85
region                                30      30      30      28
regulatory_region                    137     137     137     136
repeat_region                       9199       1    4652    3390
rescue_fragment                      136     136     136     135
scaffold                             437     437     437     437
sequence_variant                     232     232     232     225
signal_peptide                         0       0       0       1
snRNA                                 29      28      28      28
snoRNA                                28      28      28      28
syntenic_region                       --      --    1230      --
tRNA                                 295     288     288     288
tRNA_trnascan                         --      --     297      --
three_prime_UTR                    15019   14683   16777   15493
transcription_start_site           36921   36921   35737   16997
transposable_element                1571    1571    1572    1567
transposable_element_inserti..     16404    4680    3257    4566  << new data
transposable_element_pred             --      --    1572      --
--------------------------------------------------------------
  -- == data not available for this feature

Table of D. mel. cytological feature counts per release.

Feature                              410     400     322     320
--------------------------------------------------------------
cyto_Inversion-derived_defic..        24      --      --      --
cyto_assortment-derived_aneu..        11      --      --      --
cyto_assortment-derived_defi..         1      --      --      --
cyto_assortment-derived_defi..        11      --      --      --
cyto_assortment-derived_dupl..        16      --      --      --
cyto_bipartite_duplication             9      --      --      --
cyto_bipartite_inversion             116      --      --      --
cyto_chromosomal_deletion          11476      --      --      --
cyto_chromosomal_duplication           5      --      --      --
cyto_chromosomal_inversion          4442      --      --      --
cyto_chromosomal_translocation      4070      --      --      --
cyto_complex_chromosomal_mut..        22      --      --      --
cyto_compound_chromosome_arm           4      --      --      --
cyto_cyclic_translocation            140      --      --      --
cyto_deficient_interchromoso..         1      --      --      --
cyto_deficient_intrachromoso..         1      --      --      --
cyto_deficient_inversion              73      --      --      --
cyto_deficient_translocation          86      --      --      --
cyto_dexstrosynaptic_chromos..       127      --      --      --
cyto_free_chromosome_arm               2      --      --      --
cyto_free_duplication                 44      --      --      --
cyto_free_ring_duplication             4      --      --      --
cyto_hetero-compound_chromos..         1      --      --      --
cyto_interchromosomal_duplic..       250      --      --      --
cyto_interchromosomal_transp..      1539      --      --      --
cyto_intrachromosomal_duplic..        88      --      --      --
cyto_intrachromosomal_transp..       338      --      --      --
cyto_inversion-cum-transloca..       465      --      --      --
cyto_inversion-derived_bipar..         7      --      --      --
cyto_inversion-derived_bipar..         9      --      --      --
cyto_inversion-derived_defic..        15      --      --      --
cyto_inversion-derived_dupli..        15      --      --      --
cyto_inverted_intrachromosom..        59      --      --      --
cyto_inverted_ring_chromosome          2      --      --      --
cyto_laevosynaptic_chromosome        130      --      --      --
cyto_ring_chromosome                   8      --      --      --
cyto_tandem_duplication              364      --      --      --
cyto_transposition                    16      --      --      --

OLD cytology feature types
cyto_insertion                        --   16363   16363   21379
cytobreakpoint_inv                    --    4565    4565    4565
cytobreakpoint_other                  --     791     791     791
cytobreakpoint_ttp                    --    6243    6243    6243
cytodeleted_segment                   --   11073   11073   11073
cytoduplicated_segment                --     880     880     880
cytogene                              --    5671    5671    6683
------------------------------------------------------------

Category clarification:
  gene = protein coding gene; other features with gene-models
         (and transcripts) are pseudogene, rRNA, snRNA, snoRNA, tRNA, ncRNA
  mRNA = all transcript types, including from
         pseudogene, rRNA, snRNA, snoRNA, tRNA, ncRNA
  Cytological feature types have been reclassified.

-------
Data are from Postgres Chado database, release 4.1, 7 feb 2005
Copy at
  ftp://flybase.net/genomes/Drosophila_melanogaster/dmel_r4.1_20050207/pgsql/chado*.gz

BULK FILE SET

See ftp://flybase.net/genomes/Drosophila_melanogaster/dmel_r4.1_20050207/

  blast/  - NCBI blast database set for selected fasta/ feature sets.
  dna/    - contains dna raw format files per chromosome-arm
  fasta/  - dna and protein data per chromosome and feature type;
            and -all- files which catenate each chromosome set.
            chromosome dna in fasta format
  gff/    - GFF v3 standard feature files per chromosome
  gnomap/ - Gnomap standard feature files per chromosome
            (drive genome map views)
            These two contain chromosome locations of above listed features
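
EXAMPLE: COUNTING FEATURE TYPES IN A GFF FILE

The gff/ files carry the chromosome locations of the features counted
above, so a per-type tally can be approximated from them. The Python
sketch below is illustrative only and is not the procedure used to build
the tables (those counts come from the Chado database). It assumes
standard 9-column, tab-separated GFF v3 with the feature type in column 3;
the function name and the sample records are hypothetical. Counts from a
GFF file may differ from the database counts, for example where a feature
spans several lines.

```python
import io
from collections import Counter


def count_gff3_features(handle):
    """Tally GFF v3 feature types (column 3) from an open text handle."""
    counts = Counter()
    for line in handle:
        # An optional ##FASTA directive starts embedded sequence data.
        if line.startswith("##FASTA"):
            break
        # Skip blank lines, comments and other directives.
        if not line.strip() or line.startswith("#"):
            continue
        columns = line.rstrip("\n").split("\t")
        if len(columns) < 9:
            continue  # not a well-formed feature line
        counts[columns[2]] += 1
    return counts


# Hypothetical records, made up for illustration (not real release data).
sample_rows = [
    ["4", "example", "gene", "100", "900", ".", "+", ".", "ID=g1"],
    ["4", "example", "mRNA", "100", "900", ".", "+", ".", "ID=t1;Parent=g1"],
    ["4", "example", "CDS", "150", "400", ".", "+", "0", "Parent=t1"],
    ["4", "example", "CDS", "500", "800", ".", "+", "2", "Parent=t1"],
]
sample_text = "##gff-version 3\n" + "\n".join("\t".join(r) for r in sample_rows) + "\n"

feature_counts = count_gff3_features(io.StringIO(sample_text))
for feature_type in sorted(feature_counts):
    print(f"{feature_type:<32}{feature_counts[feature_type]:>8}")
```

To run this on a downloaded file, pass an open handle instead of the
sample text, e.g. open(path) for a plain file or gzip.open(path, "rt")
for a compressed one; the actual file names under gff/ are not listed
here and should be taken from the FTP directory.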