Tag: getfasta

[SOLVED] Special .bed to .fa conversion (GenomicCoordinates/DNAsequence) ~ Linux Fixes

My aim is to create a custom protein sequence reference file (protein.fa) from genomic coordinates (origin.bed).

origin.bed (columns: chromosome, start, end, transcript ID, strand, gene ID):

chr1 109202569 109202584 ENST00000370031.1_uORF_0 - ENSG00000162639.11
chr1 109203584 109203617 ENST00000370031.1_uORF_0 - ENSG00000162639.11
chr11 102188276 102188302 ENST00000263464.3_uORF_0 + ENSG00000023445.9
chr11 10830291 10830306 ENST00000530211.1_uORF_1 - ENSG00000110321.11
chr11 10830400…
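One possible way to approach this kind of conversion, sketched in Python under several assumptions: the BED rows use 0-based, half-open coordinates, rows sharing a transcript ID belong to one coding sequence, and the reference is available as an in-memory dictionary. The names `bed_to_proteins` and `genome` are hypothetical and only serve the illustration; a real workflow would read the sequence from an indexed FASTA file instead.

```python
from collections import defaultdict

# Standard genetic code, built from the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate DNA codon by codon; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def bed_to_proteins(bed_lines, genome):
    """Group BED blocks by transcript, join them, and translate.

    genome is a hypothetical dict mapping chromosome name to sequence.
    """
    blocks = defaultdict(list)
    for line in bed_lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or incomplete rows
        chrom, start, end, name, strand = fields[:5]
        blocks[(name, chrom, strand)].append((int(start), int(end)))

    proteins = {}
    for (name, chrom, strand), intervals in blocks.items():
        intervals.sort()  # genomic order, lowest start first
        dna = "".join(genome[chrom][s:e] for s, e in intervals)
        if strand == "-":
            # Reverse-complement the joined sequence, not each block.
            dna = reverse_complement(dna)
        proteins[name] = translate(dna)
    return proteins


if __name__ == "__main__":
    toy_genome = {"chrT": "ATGGCCGTGTTAATTAGGCCAT"}  # made-up sequence
    toy_bed = [
        "chrT 0 6 tx_plus + g1",
        "chrT 10 13 tx_plus + g1",
        "chrT 13 22 tx_minus - g2",
    ]
    for name, protein in bed_to_proteins(toy_bed, toy_genome).items():
        print(f">{name}\n{protein}")
```

The sketch joins the blocks before reverse-complementing, so minus-strand transcripts come out in the 5' to 3' orientation. It does not check reading frame or stop codons.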

bam – Detect mutation context in a read of a sam file

That kind of custom fiddling with reads and variants is cumbersome, non-standard, and error-prone. Run a standard variant-calling pipeline and then filter for the mutations that you want. Then extract the variant position (the coordinates) and get the variant context from the reference genome. Using individual…
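The last step, looking up the sequence context around a called variant in the reference, might look roughly like this. It is a minimal sketch that assumes a 1-based position (as in VCF) and a reference held as a plain string; the function name `variant_context` and the `flank` parameter are hypothetical.

```python
def variant_context(reference, pos, ref, alt, flank=1):
    """Return the context of a variant, e.g. 'C[G>A]T'.

    reference: reference sequence of one chromosome (plain string)
    pos:       1-based position of the first reference base
    ref, alt:  reference and alternate alleles
    flank:     number of bases to report on each side
    """
    i = pos - 1  # convert to 0-based index
    observed = reference[i:i + len(ref)].upper()
    if observed != ref.upper():
        raise ValueError(
            f"reference mismatch at {pos}: expected {ref}, found {observed}"
        )
    left = reference[max(0, i - flank):i].upper()
    right = reference[i + len(ref):i + len(ref) + flank].upper()
    return f"{left}[{ref.upper()}>{alt.upper()}]{right}"


if __name__ == "__main__":
    toy_reference = "ACGTACGT"  # made-up sequence
    print(variant_context(toy_reference, 3, "G", "A"))
```

Checking the reference allele against the sequence first catches off-by-one errors and mismatched genome builds before any context is reported.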

Trouble with bedtools getfasta

I am trying to extract sequences from a .fasta file based on a BED file using bedtools getfasta, and I am getting the following error. The command I ran was:

bedtools getfasta -fi genomic.fasta -bed bedfile.bed -fo output.fasta

WARNING. chromosome (chr1) was not found…
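One common cause of this kind of warning is that the chromosome names in the BED file do not match the sequence names in the FASTA file (for example `chr1` versus `1`). Whether that applies here cannot be told from the excerpt, but a quick check could be sketched as follows. The helper names are hypothetical, and the sketch assumes the sequence name is the first whitespace-delimited token of each FASTA header.

```python
def fasta_names(fasta_lines):
    """Collect sequence names: first token after '>' on header lines."""
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                names.add(tokens[0])
    return names


def bed_chroms(bed_lines):
    """Collect the first column of every BED data row."""
    names = set()
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines, comments, and track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        names.add(line.split()[0])
    return names


def missing_chroms(bed_lines, fasta_lines):
    """Return BED chromosome names that are absent from the FASTA."""
    return sorted(bed_chroms(bed_lines) - fasta_names(fasta_lines))


if __name__ == "__main__":
    # File names taken from the command in the question.
    with open("bedfile.bed") as bed, open("genomic.fasta") as fasta:
        for name in missing_chroms(bed, fasta):
            print("not in FASTA:", name)
```

If names are reported, renaming one side consistently (for instance adding or stripping a `chr` prefix) is usually the simplest fix.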

organizing a Bed file for bedtools getfasta

I am trying to use bedtools getfasta on some BED files, but the columns of the peaks BED file are mixed up: the first column, which should hold only the chromosome names, also contains the peak location for some of…

bedtools getfasta concatenating sequences

Hi, I have a BED file containing the exons of genes. The name field holds the gene name (like ENSG***). When I run bedtools getfasta, I get the sequence of each exon separately. Is there a standard way to concatenate…
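One way to post-process such output is to merge FASTA records that share a name. The sketch below is not a built-in bedtools feature. It assumes each header starts with the BED name field, optionally followed by `::` and coordinates (header layout differs between bedtools versions), and that exons appear in the file in the order they should be joined. The function name `merge_fasta_by_name` is hypothetical.

```python
def merge_fasta_by_name(fasta_lines):
    """Concatenate sequences of FASTA records that share a name.

    Records are joined in file order; strand is not taken into account.
    """
    merged = {}
    current = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Keep only the name part before an optional '::' suffix.
            current = line[1:].split("::")[0]
            merged.setdefault(current, [])
        elif current is not None:
            merged[current].append(line)
    return {name: "".join(parts) for name, parts in merged.items()}


if __name__ == "__main__":
    toy_records = [  # made-up records
        ">geneA::chr1:0-3", "ACG",
        ">geneB", "TT",
        ">geneA::chr1:10-12", "TA",
    ]
    for name, seq in merge_fasta_by_name(toy_records).items():
        print(f">{name}\n{seq}")
```

For genes on the minus strand, the exons would need to be ordered and oriented appropriately before merging.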

Exon coordinates and sequence

I did it like this:

1- Download refGene.txt.gz and hg19.fasta from the UCSC goldenpath (note: convert hg19.2bit to hg19.fa using twoBitToFa).
2- Create a BED file with exon coordinates using my awk script:

# to_transcript.awk
BEGIN { OFS = "\t" }
{
name = $2
name2 = $13
sens = $4 == "+" ?…

extract entire header from BED file to FASTA

Hi, is there any way to extract the entire header from a BED file when using the bedtools getfasta command and write it to the FASTA output? I have tried bedtools getfasta -fi hg19.fa -bed file.bed -fo test.fasta -fullHeader but…
