Tag: SAMtools
Accepted samtools 1.14-1~exp1 (source) into experimental
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 Format: 1.8 Date: Tue, 07 Dec 2021 09:13:32 +0100 Source: samtools Architecture: source Version: 1.14-1~exp1 Distribution: experimental Urgency: medium Maintainer: Debian Med Packaging Team <debian-med-packag…@lists.alioth.debian.org> Changed-By: Andreas Tille <ti…@debian.org> Changes: samtools (1.14-1~exp1) experimental; urgency=medium . * New upstream version * Cleanup Breaks/Replaces Checksums-Sha1: f497f3ec80eec25fedfdf3595b126d5ef89aeea0…
Parallel genomic responses to historical climate change and high elevation in East Asian songbirds
Extreme environments present profound physiological stress. The adaptation of closely related species to these environments is likely to invoke congruent genetic responses resulting in similar physiological and/or morphological adaptations, a process termed “parallel evolution” (1). Existing evidence shows that parallel evolution is more common at the phenotypic level than at…
Doubt samtools flagstat
I’d like to see what percentage of my sequences align with my decrementing sequence, and I’ve arrived at this sample table. Which figure should I use: the mapped (80.94% : N/A) or the properly paired (0.06% : N/A) percentage? 1036193 + 0 in total (QC-passed reads…
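To make the difference between the two percentages in this question concrete, here is a minimal sketch that recomputes them from flagstat text. It assumes the usual flagstat line wording ("in total", "mapped (", "paired in sequencing", "properly paired"); the function name flagstat_pct is made up for illustration. The mapped figure is taken relative to all reads, the properly paired figure relative to reads that were paired in sequencing.

```shell
# flagstat_pct: read `samtools flagstat` text on stdin and recompute two ratios.
# Hypothetical helper; relies only on the first column (QC-passed counts).
flagstat_pct() {
    awk '
        / in total /                   { total  = $1 }   # all reads
        /^[0-9]+ \+ [0-9]+ mapped \(/  { mapped = $1 }   # skips "primary mapped"
        / paired in sequencing/        { paired = $1 }
        / properly paired /            { proper = $1 }
        END {
            if (total  > 0) printf "mapped\t%.2f\n", 100 * mapped / total
            if (paired > 0) printf "properly_paired\t%.2f\n", 100 * proper / paired
        }'
}
```

With samtools installed, it could be used as: samtools flagstat in.bam | flagstat_pct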
VCF samtools
Hello, I am having trouble when doing variant calling with samtools. I am getting only the header and no variants. If I instead use FreeBayes, I do get a lot of variants, and with GATK, I get just a few. What can the problem be? Do…
Making consensus sequence for each haplotype
I’m dealing with paired-end amplicon sequencing data. I’ve produced a GVCF file with haplotype calls using: gatk HaplotypeCaller -R $REF -I "$BAM" -O "$OUT"/results/variants/${SN}_HaplotypeCallerPGT.vcf -ERC GVCF The VCF file it produces contains the PGT flag, and variants are called in the format…
Extract human, mouse and pig aligned reads from .bam file alignment to concatenated genomes
I aligned an NGS read dataset to a concatenated reference consisting of the human, mouse and pig genomes. Now I would like to extract only the human (GRCh38) aligned reads into a separate .bam file…
Help with picard
Can anyone help me? I have no idea why my Picard is not working. It used to work, but now when I run a pipeline, this Picard step fails. Thank you so much! (R-4) $ java -jar picard.jar CollectRawWgsMetrics -R /ref/hs37d5.fa -I CPM00002066-PL-D-20191015_20211104-dragen-somatic_tumor.bam -O…
Alignment fastq files
I have a question. I need to align unpaired fastq files and convert them into a BAM file. To be sure, I need to ask first whether the commands below are enough to do this or whether I forgot something. bowtie2 -x input.index.hg19 -U input.fastq -S {output}…
Rapid evolutionary dynamics of an expanding family of meiotic drive factors and their hpRNA suppressors
1. Zanders, S. E. & Unckless, R. L. Fertility costs of meiotic drivers. Curr. Biol. 29, R512–R520 (2019). 2. Agren, J. A. & Clark, A. G. Selfish genetic elements. PLoS Genet. 14, e1007700 (2018). 3. Lindholm, A. K. et…
Getting errors trying to run rmats
Hi, I am trying to use rmats for splice variation analysis through ssh using slurm. After loading the rmats module, these are the commands that I tried and the errors they produced: rmats --s1 $PWD/control.txt --s2 $PWD/pdac.txt --gtf mm10/mm10.refGene.gtf Python programming language version 3.6.8 loaded. GNU…
Run multiple times samtools and sed for a big number of bam files in folder
How can we execute the following commands with bash for a large number of BAM files in a folder? samtools view -H in.bam > header.sam sed -i s/SN:/SN:chr/ header.sam sed -i s/SN:chrMT/SN:chrM/ header.sam samtools…
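One way the commands in this question could be wrapped in a loop is sketched below. The two sed substitutions are taken from the excerpt; the final step is truncated in the excerpt, so the use of samtools reheader here is an assumption, as are the function names and the .chr.bam output suffix.

```shell
# fix_sq_names: rewrite sequence names in header text on stdin
# (e.g. "SN:1" -> "SN:chr1", "SN:MT" -> "SN:chrM"), as in the excerpt.
fix_sq_names() {
    sed -e 's/SN:/SN:chr/' -e 's/SN:chrMT/SN:chrM/'
}

# reheader_all DIR: apply fix_sq_names to every *.bam in DIR.
# Assumes samtools is on PATH; the .chr.bam suffix is a made-up naming choice.
reheader_all() {
    for bam in "$1"/*.bam; do
        [ -e "$bam" ] || continue                  # no BAM files matched
        case "$bam" in *.chr.bam) continue ;; esac # skip outputs of earlier runs
        samtools view -H "$bam" | fix_sq_names > "$bam.header.sam" &&
        samtools reheader "$bam.header.sam" "$bam" > "${bam%.bam}.chr.bam" &&
        rm -f "$bam.header.sam"
    done
}
```

It would be called as reheader_all /path/to/bams. Writing to a new file per input keeps the originals untouched until the results have been checked.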
Add or reveal read groups on .sam file aligned by BWA
Hi, I’m trying to use GATK HaplotypeCaller, but every time I run it, it says: A USER ERROR has occurred: Argument emit-ref-confidence has a bad value: Can only be used in single sample mode currently. Use the --sample-name argument to…
How To Open Bam Files Without Software?
BAM files can be opened remotely (ftp, http) or locally. To view a BAM file, its index file must be in the same directory, and the index should be named by appending “.bai” to the BAM file name. How Do…
Insert size histogram from Picard for Illumina paired end 150 bp: FR, TANDEM, and both
I’ve got some low coverage skim-seq BAM files (1x) and was doing QC on them and got some strange results. I ran Picard CollectInsertSizeMetrics. The sequencing was Illumina paired end, so the orientation should be F-R as usual. But I got insert size histograms showing FR, TANDEM, and…
Problem with bowtie2 alignment – libtbb.so.2
In a nutshell, I have 44 folders of different samples/species that each have paired reads for those samples/species. I’m doing bowtie alignment with the same reference genome, and then outputting it to BAM and sorting it using samtools. Since alignment takes a while, I’ve written a script and passed it…
Bioinformatics Specialist for projects analyzing bacterial & human genomic & transcriptomic data
HHMI is currently seeking a talented, teamwork-oriented Bioinformatics Specialist for a full-time opportunity to join the lab of Professor and Nobel Laureate Jennifer Doudna at the University of California, Berkeley. The California Institute for Quantitative Biosciences (QB3) is one of four Governor Gray Davis Institutes for Science and Innovation established in…
Variant Calling with Bcftools Call instead of Bcftools View
I’m revisiting variant calling from when I used an older version of bcftools, but found the newer sam/bcftools have deprecated some commands and parameters. This led me to retaining what I feel are too many loci (100’s of thousands vs….
Genome Size Estimation with Jellyfish and Genome Scope is Unexpectedly Small
I used Jellyfish to count k-mers and the obtained histograms were analyzed using GenomeScope (qb.cshl.edu/genomescope/) in order to estimate effective genome size. However, the resulting estimate of 61,351 bp is unexpectedly small for human data. The paired-end data was obtained from a ChIP-seq experiment on GEO (GSE72141). Additionally,…
Extracting matching reads by read ID
What tool would you recommend to compare two BAM files and extract matching reads by read ID? samtools view file1.bam | awk -F "\t" '{print $1}' | sort | uniq > names_in_file1 filterbyname.sh -Xmx4g in=file2.bam names=names_in_file1 out=file.fq.gz include=t file.fq.gz…
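As an alternative illustration of the name-matching step in this answer, the sketch below filters SAM text by a list of read names using only awk. It is not the filterbyname.sh approach from the excerpt; the function name reads_in_list is hypothetical, and the names file is assumed to hold one read ID per line.

```shell
# reads_in_list NAMES: keep SAM header lines and records on stdin whose
# QNAME (column 1) appears in the file NAMES (one read ID per line).
reads_in_list() {
    awk -v names="$1" '
        BEGIN {
            FS = "\t"
            # Load the wanted read IDs into a lookup table before reading stdin.
            while ((getline line < names) > 0) keep[line] = 1
        }
        /^@/ || ($1 in keep)'
}
```

With samtools installed, a possible use is: samtools view -h file2.bam | reads_in_list names_in_file1 | samtools view -b -o matching.bam -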
samtools vs libdna – compare differences and reviews?
The number of mentions indicates the total number of mentions that we’ve tracked plus the number of user suggested alternatives. Stars – the number of stars that a project has on GitHub. Growth – month over month growth in stars. Activity is a relative number indicating how actively a project…
Read counts for peak file
Hi, I need some help. I need a peak call file with read counts for some downstream analysis. I have my BAM file and used MACS2 to get the peak calls; next I used samtools bedcov, but I am not familiar with these tools, therefore…
Bioinformatics Engineer Job Opening in St. Louis, MO at Benson Hill
About Benson Hill: Benson Hill empowers innovators to develop more healthy, tasty and sustainable food by unlocking the natural genetic diversity of plants. Benson Hill’s CropOS™ platform combines machine learning and big data with advanced breeding techniques and plant biology to drastically accelerate and simplify the product development process. The CropOS…
samtools mpileup error
I am using the samtools mpileup function to get reads at sites. The command I used is: samtools mpileup -l hg19.position -f hg19.fa -Q 20 -q 20 -I file.bam In the command, hg19.position is a position-based site file. When running, an error is printed on screen: ……
How to assess structural variation in your genome, and identify jumping transposons
Prerequisites. Data: an annotated genome, long reads, repeat annotation. Software: minimap2, samtools, bedtools (for comparisons only), tabix (for visualization only). Installation: /work/gif/remkv6/USDA/04_TEJumper conda create -n svim_env --channel bioconda svim source activate svim_env Map your long reads to your genome with minimap. My directory locale 1…
LeftAlignIndels error
Hello! I gave a sorted and indexed BAM file to LeftAlignIndels: ~/Soft/gatk-4.1.9.0/gatk LeftAlignIndels -I bam_fin/Exome_dups.bam -R /mnt/lapd/Index_hum/dna2/GRCh_2021.fa -O bam_fin/Exome.bam and got this error: ‘java.lang.IllegalArgumentException: Alignments added out of order in SAMFileWriterImpl.addAlignment for file:///mnt/lapd/Vika_data/RNF_raw/exome/bam_fin/Exome.bam. Sort order is coordinate. Offending records are at [1:152985370] and [1:152985347] at htsjdk.samtools.SAMFileWriterImpl.assertPresorted(SAMFileWriterImpl.java:197) at htsjdk.samtools.SAMFileWriterImpl.addAlignment(SAMFileWriterImpl.java:184)…
Bioinformatics Scientist at Infectious Disease Institute
IDI seeks to hire a Bioinformatics Scientist (BS) for the centre. The BS will be a fulltime staff who is familiar with the application of computational and biotechnology capabilities to biomedical and public health problems like genetics, clinical and medical research, as well as other data intensive analyses. By coordinating…
Index of /~psgendb/doc/local/script/samtools/birchpylib/pydocs
Name              Last modified      Size
Parent Directory                     –
birchlib.html     2011-08-29 15:06   17K
csh2sh.html       2010-07-27 13:27   6.5K
customdoc.html    2010-07-27 13:27   6.4K
dbsout.html       2010-07-27 13:27   8.7K
flat2list.html    2010-07-27 13:27   5.3K
flat2tree.html    2010-07-27 13:27   4.9K
flatcnv.html      2010-07-27 13:27   7.4K
getbirch.html     2010-07-27 13:27   3.7K
…
Consensus sequence calling with normalisation of indels
I’m following the workflow suggested by samtools here to produce a fasta with the consensus sequence. samtools.github.io/bcftools/howtos/consensus-sequence.html The workflow goes like this: # call variants bcftools mpileup -Ou -f reference.fa alignments.bam | bcftools call -mv -Oz -o calls.vcf.gz bcftools index calls.vcf.gz #…
how to remove host reads from other microbe reads
Hi wonderful people, I have been analyzing data from a paper, where I used GEM3Mapper to get SAM files. Now I have to remove host reads. I got two pieces of advice from Biostars: $ samtools view -buSh -f 4 x.sam | samtools fastq - | cat ->…
bcftools consensus -m file.bed error
Hi everyone, with bcftools consensus I’m trying to replace the regions with low coverage with ‘N’. I made a BED file from the BAM file with bedtools: bedtools genomecov -bga -ibam [file.bam] | awk '$4<5' > low_coverage.bed output: 1 0 10001 0 1 10001…
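The awk filter in this question can be written as a small reusable function, sketched below. It assumes tab-separated bedGraph-style input (chrom, start, end, depth) as produced by bedtools genomecov -bga; the function name low_cov_bed is hypothetical, and dropping the depth column so that only three BED columns remain is an illustrative choice, not something the excerpt states.

```shell
# low_cov_bed MIN: from bedGraph lines on stdin (chrom, start, end, depth),
# print chrom/start/end for intervals whose depth is below MIN.
low_cov_bed() {
    awk -v min="$1" '
        BEGIN { FS = OFS = "\t" }
        $4 + 0 < min + 0 { print $1, $2, $3 }'   # "+ 0" forces numeric comparison
}
```

For example: bedtools genomecov -bga -ibam file.bam | low_cov_bed 5 > low_coverage.bed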
Header error while converting sam file to bam
Hi, I am trying to convert SAM files to BAM files after bowtie alignment. I checked previous posts but still get the same error. I have miRNA-Seq data. The code from the script I used is given below. For alignment: bowtie…
Creating SNP index
Hello, I’m having a problem in creating SNP index of Brassica allopolyploids using gmap. Where can I find the SNP data of Brassica? In order to create a pseudo genome I need to have the SNP index. I have tried the snpindex command of gmap with…
Senior Bioinformatics Scientist in Cambridge, Cambridgeshire | The Tec Recruitment Group Limited
Senior Bioinformatics Scientist – Cambridge Remote/hybrid working option Role overview: You will be part of an industry leading Genomics company, who are working in the development and accessibility of sequencing products to push the boundaries of drug discovery and therapy development. You will be part of a global team of…
sam to bam then delete sam file
Hi, how can I make a loop that converts SAM to BAM and then deletes the SAM file in the same loop? As others have suggested, do you have a process that generates SAM output that can be piped directly…
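A minimal sketch of the loop asked about here follows. It assumes samtools is on PATH; the function name sam_to_bam_dir is hypothetical. The SAM file is deleted only when samtools exits successfully and the BAM file is non-empty, so a failed conversion does not lose data.

```shell
# sam_to_bam_dir DIR: convert each DIR/*.sam to DIR/*.bam, then delete the SAM
# only if the conversion succeeded and produced a non-empty BAM.
sam_to_bam_dir() {
    for sam in "$1"/*.sam; do
        [ -e "$sam" ] || continue            # no SAM files matched the glob
        bam="${sam%.sam}.bam"
        if samtools view -b -o "$bam" "$sam" && [ -s "$bam" ]; then
            rm -- "$sam"
        else
            echo "conversion failed, keeping $sam" >&2
        fi
    done
}
```

As the reply in the excerpt hints, piping the aligner's SAM output straight into samtools avoids writing the SAM file at all, which makes a cleanup loop unnecessary.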
featureCounts for paired-end reads
Below is the samtools flagstat output for my BAM file, which only contains mapped reads (as seen from 100.00% mapped). I know featureCounts by default does not consider/assign multimappers (i.e., secondary). Does it also skip over singletons/improperly paired reads? Is it correct to say they…
Error after STAR mapping
Hi, I’m doing STAR mapping, but the BAM files I get have some problems. When I use the command samtools flagstat SRR7195620_2.fastq.gz_Aligned.sortedByCoord.out.bam to see the details of the BAM file, it shows this: 3266075 + 0 in total (QC-passed reads + QC-failed reads) 1044500 + 0…
Is it the same to use a .bam file or a .sam file?
The .sam file was generated by the following code: samtools sort -n Untreated-3/accepted_hits.bam > Untreated-3_sn.bam samtools view -o Untreated-3_sn.sam Untreated-3_sn.bam samtools sort Untreated-3/accepted_hits.bam > Untreated-3_s.bam samtools index Untreated-3_s.bam The .gtf file was downloaded by: wget ftp.ensembl.org/pub/release-70/gtf/drosophila_melanogaster/Drosophila_melanogaster.BDGP5.70.gtf.gz gunzip Drosophila_melanogaster.BDGP5.70.gtf.gz When I use htseq-count: htseq-count -s no -a 10 Untreated-3_sn.sam Drosophila_melanogaster.BDGP5.70.gtf > Untreated-3.count an error…
Name of a Specific read
Hello! I have a very simple question, I think, as I am only now learning and studying NGS. Supposing we have a BAM file, how do I know the name of a specific read? For example, what is the name of read 313? Thank…
Exception type: ValueError, raised in libcalignmentfile.pyx:990
The .sam file was generated by the following code: samtools sort -n Untreated-3/accepted_hits.bam > Untreated-3_sn.bam samtools view -o Untreated-3_sn.sam Untreated-3_sn.bam samtools sort Untreated-3/accepted_hits.bam > Untreated-3_s.bam samtools index Untreated-3_s.bam The .gtf file was downloaded by: wget ftp.ensembl.org/pub/release-70/gtf/drosophila_melanogaster/Drosophila_melanogaster.BDGP5.70.gtf.gz gunzip Drosophila_melanogaster.BDGP5.70.gtf.gz When I use htseq-count: htseq-count -s…
featureCounts fragment count
I can’t figure out where the 17999315 fragments reported by featureCounts come from. Below is the output for the BAM file given to featureCounts. samtools flagstat output for BAM input to featureCounts: 35483648 + 0 in total (QC-passed reads + QC-failed reads) 24895287 + 0 primary 10588361…
GATK4 stripping header from .bam???? What the heck? : bioinformatics
Hi all. I have a problem. Code posted below for those who want to take a look. I have a series of 167 .bam files I need to variant call for my project. Aside from them being an absolute nightmare to work with on other grounds, a new problem has…
CBIIT Giga Galaxy A Galaxy-based Platform for Large-scale
CBIIT Giga. Galaxy – A Galaxy-based Platform for Large-scale Genomics Analysis Tin-Lap, LEE School of Biomedical Sciences, CUHK-BGI Innovation Institute of Trans-omics, The Chinese University of Hong Kong, Hong Kong SAR, China. CBIIT • Jointly established between The Chinese University of Hong Kong (CUHK) and BGI. • “We aim to…
BAM file reads mapping to multiple genes
I am unfamiliar with BAM files and pretty new to the linux command line. I have what I suspect is a fairly simple problem to solve. I have a dataset where I believe there are a large number of reads mapping to…
Empty BAM file in DANPOS3
Hi everyone, I am currently trying to run a command in DANPOS3, which is software for analysing nucleosome positions and calling peaks. This is the command: $ python3 danpos.py dpos <filename.bam> [optional parameters] The BAM file I am using was created using…
samtools – How to analyze IGV alignment
I’m working on a project where I am analyzing the performance of an alignment workflow. My goal is to find regions in the resulting BAM file where there are outstanding discrepancies or anything that indicates my assembly/alignment has “mistakes”. My workflow so far: input paired end FASTQ files (human) into…
Interploidy gene flow involving the sexual-asexual cycle facilitates the diversification of gynogenetic triploid Carassius fish
1. Muller, H. J. The relation of recombination to mutational advance. Mutat. Res. Mol. Mech. Mutagen. 1, 2–9 (1964). 2. Maynard Smith, J. The Evolution of Sex (Cambridge University Press, 1978). 3. Avise, J. C. Clonality (Oxford University Press, 2008). 4. Hamilton, W. D.,…
Add Cigar string and Template Length to Read Name
Hi all, I need to convert a BAM file to FASTQ format, but I don’t want to lose the CIGAR and TLEN information. My idea is to edit each read name in the BAM file, by appending both CIGAR and…
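The renaming idea described in this question could look roughly like the sketch below, which works on SAM text. It relies on the SAM column layout (QNAME in column 1, CIGAR in column 6, TLEN in column 9); the function name tag_read_names and the underscore separator are made up for illustration.

```shell
# tag_read_names: for SAM text on stdin, pass header lines through unchanged
# and append CIGAR (column 6) and TLEN (column 9) to the read name (column 1).
tag_read_names() {
    awk '
        BEGIN { FS = OFS = "\t" }
        /^@/ { print; next }                     # header line
        { $1 = $1 "_" $6 "_" $9; print }'        # e.g. r1 -> r1_5M_0
}
```

With samtools installed, the SAM text would come from samtools view -h in.bam and the renamed records could then be handed to a FASTQ converter. One caveat: mates of a pair usually have different CIGAR strings, so their names no longer match after this renaming, which can matter for tools that pair reads by name.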
terminal – Issues installing samtools on MacOs with M1 with Conda
I’m trying to install samtools in a conda environment, but I’m having some issues. This is the output when I run conda install -c bioconda samtools: Collecting package metadata (current_repodata.json): done Solving environment: failed with initial frozen solve. Retrying with flexible solve. Collecting package metadata (repodata.json): done Solving environment: failed…
Personalis Senior Bioinformatics Pipeline Development Engineer
Senior Bioinformatics Pipeline Development Engineer (Remote option available) at Personalis, Inc (View all jobs) Menlo Park Personalis is a rapidly growing cancer genomics company transforming the development of next-generation therapies by providing more comprehensive molecular data about each patient’s cancer and immune response. Our ImmunoID NeXT Platform® is enabling the…
Quantitative assessment reveals the dominance of duplicated sequences in germline-derived extrachromosomal circular DNA
Significance Extrachromosomal circular DNA (eccDNA) plays a role in human diseases such as cancer, but little is known about the impact of eccDNA in healthy human biology. Since eccDNA is a tiny fraction of nuclear DNA, artificial amplification has been employed to increase eccDNA amounts, resulting in the loss of…
How do I get bcftools to rename some samples in a .vcf file?
I am trying to use the reheader tool from bcftools in order to rename some samples in a .vcf.gz file, but it does not work, and is giving some pretty strange errors. Before trying to rename…
GATK HaplotypeCaller works without GVCF option, but errors with GVCF
I’ve extracted chromosome 4 from a whole genome BAM file as follows: samtools view -h "$BAM" chr4 > "$EXT/temp/"$PREFIX"_chr4.sam" samtools view -bS "$EXT"/temp/$PREFIX"_chr4.sam" > "$EXT"/temp/$PREFIX"_chr4.bam" Then added read groups, as required by GATK: picard AddOrReplaceReadGroups I="$BAM" O="$EXT"/temp/$PREFIX"_chr4_rg.bam" RGID=4 RGLB=lib1 RGPL=ILLUMINA RGPU=unit1 RGSM=20 Index the bam: samtools index "$BAM" Download the…
Senior Bioinformatics Pipeline Development Engineer at Personalis
Senior Bioinformatics Pipeline Development Engineer (Remote option available) at Personalis, Inc (View all jobs) Menlo Park Personalis is a rapidly growing cancer genomics company transforming the development of next-generation therapies by providing more comprehensive molecular data about each patient’s cancer and immune response. Our ImmunoID NeXT Platform® is enabling the…
samtools pid process 7845 is still there… invoking kill -9 on 7845 … Closing samtools process : 7845
My command : ../../script/breakdancer-test-compile/lib/breakdancer-max1.4.5-unstable-66-4e44b43/bam2cfg.pl libextract-TTN_1aTTN_mapped_read_aln.sorted.bam This error appeared : > Processing bam: libextract-TTN_1aTTN_mapped_read_aln.sorted.bam > Closing BAM file > Send TERM signal for 7845 > samtools pid process 7845 is…
GATK Mutect2 errors during basic variant calling
I’ve just installed GATK and am trying to do some basic variant calling. However, when I try to run this line: gatk Mutect2 -R $REF -I "$BAM" -O "$DIR"/gatk/$PREFIX"_bwa_gatk_unfiltered.vcf" I get the error below. Reading the output, it looks like this is…
Understanding MarkDuplicates and featureCounts results
I am trying to understand the effects of sorting on MarkDuplicates and how to interpret the numbers in the featureCounts log. According to the MarkDuplicates documentation, it handles coordinate-sorted and query-sorted BAM inputs slightly differently: The program can take either coordinate-sorted or query-sorted inputs, however the behavior is slightly different….
Linked supergenes underlie split sex ratio and social organization in an ant
Significance Some social insects exhibit split sex ratios, wherein a subset of colonies produce future queens and others produce males. This phenomenon spawned many influential theoretical studies and empirical tests, both of which have advanced our understanding of parent–offspring conflicts and the maintenance of cooperative breeding. However, previous studies assumed…
Genome of the estuarine oyster provides insights into climate impact and adaptive plasticity
1. Hoegh-Guldberg, O. & Bruno, J. F. The impact of climate change on the world’s marine ecosystems. Science 328, 1523–1528 (2010). 2. Chou, C. et al. Increase in the range between wet and dry season precipitation. Nat. Geosci. 6, 263–267 (2013). 3. Li,…
convert multiple files simultaneously using samtools
Hello everyone, I am new to bioinformatics. I have several files in BAM format and I want to convert them to fastq using samtools. Is there any way to convert them all at once? I tried: samtools fastq * .bam > *…
VCF file generation from multiple samples for PCA
I am trying to generate a VCF file for 80 samples (human) and use it for PCA. But when trying to get eigenvectors using plink, it says the genotyping rate is 0.12, and when I remove SNPs with a missing data threshold all data…
PASA pipeline updating annotation results in no changes
I’m running these commands leaving the numbers as default in config and cleaned transcripts already. module load Miniconda3/4.9.2 # create database $PASAHOME/scripts/create_sqlite_cdnaassembly_db.dbi -c alignAssembly.config -S /home/data/pest_genomics/pasa/opt/pasa-2.4.1/schema/cdna_alignment_sqliteschema # Upload annotations $PASAHOME/scripts/Load_Current_Gene_Annotations.dbi -c alignAssembly.config -g Chilo_suppressalis_v2_genome_220620_correct2.fasta -P chilo_transfer2_corrected2.gff3 module load SAMtools/1.12-GCC-10.2.0 $PASAHOME/Launch_PASA_pipeline.pl -c…
Frontiers | Accelerating Complete Phytoplasma Genome Assembly by Immunoprecipitation-Based Enrichment and MinION-Based DNA Sequencing for Comparative Analyses
Introduction Phytoplasmas are wall-less bacterial pathogens that are known to infect numerous plant species and lead to significant agricultural losses (Gurr et al., 2016; Kumari et al., 2019; Pierro et al., 2019). They are parasitic bacteria multiplying exclusively in phloem sieve elements and are transmitted between plants by phloem-feeding insects…
Piping bowtie2 output directly into BAM
I am using bowtie2, and due to the size of the SAM output, I would like to entirely avoid the SAM file, and pipe the results into samtools to convert the output to a BAM file. I am getting a little bit hung…
How To Split Multiple Samples In Vcf File Generated By Gatk?
There now also is a plugin in bcftools which does the split in a single pass over the multi-sample VCF/BCF file. It does not seem to be very fast, but looks correct and there are options to do the split in custom ways. You do need to install bcftools with…
Picard vs Samtools converting CRAM to FASTQ
I need to convert my CRAM files to FASTQ to complete an analysis. I have been trying to do this via GATK and Picard, but I have repeatedly been getting an “out of memory” error even as I have increased allocated memory…
Diverse alterations associated with resistance to KRAS(G12C) inhibition
1. Ostrem, J. M., Peters, U., Sos, M. L., Wells, J. A. & Shokat, K. M. K-Ras(G12C) inhibitors allosterically control GTP affinity and effector interactions. Nature 503, 548–551 (2013). 2. Lito, P., Solomon, M., Li, L. S., Hansen, R. & Rosen, N. Allele-specific inhibitors inactivate…
Fast way to extract specific sequences from large fasta
Hi all! I have ~2k text files, each with ~1k protein names (one protein name per line) and I need to extract the sequences of these proteins from a large master fasta file which contains ~5.5 million sequences. I wrote…
Contig assembly and consensus sequence
Hello everyone, I’m new to bioinformatics and my objective is to extract the 2 allele sequences of a target gene from RNA-seq data. What I did: I mapped the reads to a reference transcriptome from NCBI using bowtie2, then extracted the target gene using…
running trinity align_and_estimate_abundance.pl on multiple files
Hello, I am fairly new to comp bio. This is a novice question, and I’d appreciate any advice (or points in the right direction to get the info I need). I am attempting to run Trinity’s align_and_estimate_abundance on multiple libraries. I have paired-end…
HISAT2 no properly paired alignments
Hi All! I’m a wetlab guy quite new to data analysis and would appreciate some help if possible! Slowly I’m getting into the command line and understanding some of the workflow behind analysis, but I’ve hit a bit of a wall. Following hisat2-build on the human…
How to convert GEN or .gen format from impute.me to vcf on windows 10?
I tried for days to convert a gen file to VCF but it did not work. I am a beginner, so I don’t know what is in VCF files and gen files or how they…
Bioinformatics Scientist – Job at DAWSON in Bethesda, MD
Bioinformatics Scientist Full Time Prof-Entry Bethesda, MD, US DAWSON is a Native Hawaiian Organization 8(a) small business that brings the Spirit of Aloha to our employees. As part of the DAWSON “Ohana”, you will be provided a best-in-class benefits program that strives to ensure our great people have peace of…
Add option “-g” to samtools view to keep reads with *any* of the specified bitflags
Yeah agreed, though you could choose to look at it as: -f keep if all -g keep if any -F exclude if any (opposite of -f) -G exclude if all (opposite of -g) The manual says that -G is the opposite of -f, but I don’t really agree, I think…
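The four rules listed in this comment can be written out as bit tests, as in the sketch below. It follows the wording of the comment (-f keep if all, -g keep if any, -F exclude if any, -G exclude if all); -g is the option proposed in the thread, so this illustrates the proposed semantics and makes no claim about how any samtools release names or implements them. The function name flag_passes is hypothetical.

```shell
# flag_passes FLAG f g F G: return 0 if a read with this FLAG would be kept.
# A mask of 0 disables the corresponding filter. Sets plain shell variables.
flag_passes() {
    flag=$1 f=$2 g=$3 F=$4 G=$5
    [ $(( flag & f )) -eq "$f" ] || return 1                     # -f: keep only if all bits set
    [ "$g" -eq 0 ] || [ $(( flag & g )) -ne 0 ] || return 1      # -g: keep only if any bit set
    [ $(( flag & F )) -eq 0 ] || return 1                        # -F: exclude if any bit set
    [ "$G" -eq 0 ] || [ $(( flag & G )) -ne "$G" ] || return 1   # -G: exclude if all bits set
    return 0
}
```

For instance, flag_passes 99 3 0 4 0 succeeds (flag 99 has both bits of mask 3 set and bit 4 clear), while flag_passes 4 12 0 0 0 fails because only one of the two bits in mask 12 is set.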
Comparing SNP coordinates from alignments to different reference genomes
I have a list of SNP coordinates of interest based on a study that used Drosophila reference genome 5.18. I would like to compare these to some data that we have based on a different reference genome (5.57). What would…
Help for extraction of fasta sequences
Hello everyone, I hope you are well. I am writing this post because I have a question or rather I have a problem with my workflow. Perform a workflow for RNA-seq processing as follows: quality control – Hisat2 – Stringtie – Deseq2 A simple, normal workflow that threw me important…
samtools view not removing unwanted headers
Hi, I assume I am doing something really thick but cannot for the life of me restrict my bam files to chromosome only fields and remove scaffolds. I have run samtools as suggested in other posts but every time I check the file,…
Samtools Samtools Statistics & Issues
Issue Title | State | Comments | Created Date | Updated Date | Closed Date
samtools 1.10 csi vs 1.13 generation/reading | closed | 2 | 2021-11-01 | 2021-10-27 | 2021-11-03
mpileup: --output-BP-5 switches output column 7 and 8 | open | 2 | 2021-11-01 | 2021-10-27 | -
ampliconclip bug when hardclipping odd-length sequences from the left | closed | 1 | 2021-10-23 | 2021-10-27 | 2021-11-03
Feature Request: Flag in ampliconclip to clip out everything except target sequence | open | 2 | 2021-10-22 | 2021-10-27 | -
samtools markdup slow in docker | open…
bcftools merge GP format issues
Hello, I am trying to merge VCF files from several samples from different sequencing runs. I ran bcftools merge on the VCF files and after ten hours I got the error message “Incorrect number of FORMAT/GP values at chr_Y:216795, cannot merge. The tag is defined as Number=G, but found 2…
Issues with CollectAlignmentSummaryMetrics Picard tool
Hi everyone, I was running the following command line: java -jar /home/Picard/picard.jar CollectAlignmentSummaryMetrics -R GRCh38.fna -I 2H.bam -O output2H.txt but I got the following error: Exception in thread “main” htsjdk.samtools.util.SequenceUtil$SequenceListsDifferException: Sequence dictionaries are not the same size (194, 639) at htsjdk.samtools.util.SequenceUtil.assertSequenceListsEqual(SequenceUtil.java:259) at htsjdk.samtools.util.SequenceUtil.assertSequenceDictionariesEqual(SequenceUtil.java:342) at…
creating a consensus fasta that is same length as the reference
Hi all, I have been trying to create a same-length consensus fasta by mapping reads to a reference genome for input into angsd as the ancestral sequence. However, the scaffolds of the resulting consensus always seem to be…
Exclude reads with mate mapped to a different chr
Hi, using samtools flagstat I have 14228966 + 0 in total (QC-passed reads + QC-failed reads) 0 + 0 secondary 0 + 0 supplementary 12102440 + 0 duplicates 14228966 + 0 mapped (100.00% : N/A) 14228966 + 0 paired in…
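One possible way to do the filtering named in this post's title is sketched below on SAM text. It relies on the SAM convention that RNEXT (column 7) is "=" when the mate is on the same reference as the read; the explicit-name comparison is a defensive assumption for files that spell the name out, and the function name same_chr_only is hypothetical.

```shell
# same_chr_only: for SAM text on stdin, keep header lines and records whose
# mate reference (RNEXT, column 7) is the same as the read's own (RNAME, column 3).
same_chr_only() {
    awk '
        BEGIN { FS = "\t" }
        /^@/                      { print; next }   # header line
        $7 == "="                 { print; next }   # usual SAM shorthand
        $7 == $3 && $3 != "*"     { print }         # explicit, identical name
    '
}
```

With samtools installed, a possible use is: samtools view -h in.bam | same_chr_only | samtools view -b -o same_chr.bam -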
Assigning variables programmatically for bwa-mem
I have the following script: bwa mem -t 10 -R "@RG\tID:xxx\tSM:xxxx\tLB:LB-1\tPU:xxx\tPL:ILLUMINA" ref_genome.fa sample_1_1.fastq sample_1_2.fastq | samtools view -@ 10 -b - | samtools sort -@ 10 -o sample_1.bam I also have a spreadsheet with a column for the forward reads (sample 1, sample…
Samtools depth error in bash script
Hi! I wrote a bash script to run Samtools depth, but it gives the following error: /storage1/kaman/onceta/cover/brca12_uniq.bed /storage1/kaman/onceta/cover/ENIGMA_uniq.bed /storage1/kaman/onceta/cover/bam/26m_s67_l001_r1_001_alignment [bed_read] Parse error reading “26m_s67_l001_r1_001_alignment.bam” at line 1 samtools depth: Could not read file “26m_s67_l001_r1_001_alignment.bam” The script looks like this brca12_bed= realpath brca12_uniq.bed enigma_bed=…
Providence hiring Bioinformatics Scientist 1 in Portland, Oregon, United States
Description: Providence is calling a Bioinformatics Scientist 1 to the Molecular Genomics Lab at Providence Office Park in Portland, OR. This is a full-time (1.0 FTE), day shift position. This position is a hybrid role between working in the lab and working from home. Apply today! Applicants that meet qualifications will…
Parasite genome assembly
Parasite genome assembly 0 Hi All, (1) I have been working with a parasite genome assembly using the BWA tool. I used the following command to execute assembly (paired-end Illumina short reads). module load bwa/0.7.15 bwa mem -t 1 -M -R "@RG\tID:reads\tSM: AA_genome" reference_genome.fasta AA_genome1.fasta.gz AA_genome2.fasta.gz > AA_genome_aln-pe.sam (2) I…
bwa mem -T (alignment score) not doing anything
bwa mem -T (alignment score) not doing anything 0 I’m trying to control which reads are mapped to a reference genome by adjusting the -T parameter in bwa mem, which is the minimum acceptable alignment score. The only problem is that reads with an alignment score below the threshold…
Command line .sam file processing
Hi, I have done some alignments using bowtie2 and I am having trouble with the processing of the .sam file. As explained in the bowtie2 Manual “When Bowtie 2 prints a SAM alignment for a pair, it prints two records (i.e. two lines of output), one for each mate. The…
How to correctly calculate Alternative Allele Frequency (AAF) from Samtools mpileup
I am having a hard time figuring out how to correctly calculate the Alternative Allele Frequency (AAF) from Samtools mpileup. I received code that uses samtools 1.8 mpileup (see the following code) to get the following format: …. samtools mpileup -l … capture_targets.bed -t DP,AD,ADF,ADR,SP,INFO/AD,INFO/ADF,INFO/ADR -d100000000 --output-BP --output-MQ --output-QNAME … Later in…
Extracting specific regions from bam file as FASTA format
Extracting specific regions from bam file as FASTA format 0 Hello, first time asking, so sorry in advance for beginner’s mistakes. I have an indexed bam file, and I want to extract the FASTA format of all the reads that appear within the defined region (or partially appear) for example:…
Need explain about number of mapped reads and total number reads
Need explain about number of mapped reads and total number reads 0 Hi, I’m studying the article about the NIFTY test as a literature review (www.ncbi.nlm.nih.gov/pmc/articles/PMC3544640/), and there are some points that I cannot understand. It’s about the formula for k-mer coverage: the k-mer coverage for each chromosome is calculated as (number…
Heterozygous Variants On Male X/Y Chromosome (Exome Data)
Heterozygous Variants On Male X/Y Chromosome (Exome Data) 0 Hello, I am analyzing the Whole Exome Sequencing (WES) data of a male patient. When looking at the variants on the X and Y chromosomes, I find many heterozygous variants. I think they should all be hemizygous variants. Shouldn’t they? What…
Running Hisat2 but it seems not working at all
Running Hisat2 but it seems not working at all 0 I’m running Hisat2 with the code below: hisat2 -p 9 -f ../0.Reference/CHO-PICRH -1 ../2.Trimmomatic/ERR2593198_1_trimmed.fastq.gz -2 ../2.Trimmomatic/ERR2593198_2_trimmed.fastq.gz 2> ../3.HISAT/ERR25593198.log | samtools view -Sbo ../3.HISAT/ERR2593198.bam It’s been running for over 24 hours but it doesn’t seem to be working. Can you please check the code…
Identify and annotate mutations from genome editing assays
Here we propose CRISPR-detector to facilitate the analysis of CRISPR-edited amplicon and whole genome sequencing data, with functions that existing tools are not able to provide. CRISPR-detector brings the following four key innovations: optimized processing time allowing for hundreds of amplicons or whole genome sequencing data; integrated structural variation…
Bam File Average Coverage Depth Changes After Fixmate
Bam File Average Coverage Depth Changes After Fixmate 0 I am sure this is a silly question. I have a BAM file and when I compute average coverage using samtools depth -a *BAM file* | awk '{sum+=$3} END { print "Average = ",sum/NR}' I get 13.4499 Then I sort by…
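The awk one-liner in this excerpt sums the third column of the samtools depth output and divides by the number of lines. The same arithmetic as a small Python sketch, assuming the usual three tab-separated columns (reference name, position, depth); the function name is hypothetical.

```python
def mean_depth(depth_lines):
    """Average the depth column of samtools depth output (third column)."""
    total = 0
    positions = 0
    for line in depth_lines:
        if not line.strip():
            continue  # ignore blank lines
        depth = line.rstrip("\n").split("\t")[2]
        total += int(depth)
        positions += 1
    # Matches awk's sum/NR, but avoids dividing by zero on empty input.
    return total / positions if positions else 0.0
```

Because the denominator is the number of reported positions, the average depends on which positions appear in the output: with -a, zero-depth positions are included and lower the mean.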
dyld: Library not loaded: @rpath/libcrypto.1.0.0.dylib
Samfile error: dyld: Library not loaded: @rpath/libcrypto.1.0.0.dylib 1 I’ve downloaded samfiles using conda. To convert SAM to BAM I used the command samtools view -S -b /Users/dhee/Desktop/SRR12304924.s > /Users/dhee/Desktop/SRR12304924.bam I’m getting the following error: dyld: Library not loaded: @rpath/libcrypto.1.0.0.dylib Referenced from: /Users/dhee/miniconda3/bin/samtools Reason: image not found zsh: abort samtools view…
DuplicateSetIterator
DuplicateSetIterator java.lang.Object htsjdk.samtools.DuplicateSetIterator Method Summary Methods inherited from class java.lang.Object clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait Methods inherited from interface java.util.Iterator forEachRemaining Constructor Detail DuplicateSetIterator public DuplicateSetIterator(CloseableIterator<SAMRecord> iterator, SAMFileHeader header, boolean preSorted) Allows the user of this iterator to skip the sorting of the input…
samtools view extract from .bed list to .bam file
samtools view extract from .bed list to .bam file 0 Hi – I am attempting to extract certain regions from a large .bam file into a smaller subsetted .bam file using samtools view. I’m doing so in order to have a smaller file to down/upload for viewing in IGV. I…
Check if BAM is derived from pair-end or single-end reads?
Check if BAM is derived from pair-end or single-end reads? 3 I’m automating a pipeline and currently a user inputs as a parameter if the input BAM is from pair-end or single-end reads. I want to automatically check this. How can I do this?
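One way to approach this, sketched under the assumption that the first records of the BAM are available as SAM text (for example from samtools view piped through head): in the SAM format, bit 0x1 of the FLAG column marks a read as paired in sequencing. The function name and the max_records default are hypothetical choices for illustration.

```python
def looks_paired(sam_lines, max_records=1000):
    """Return True if any of the first records has the 'paired' flag (0x1) set."""
    checked = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        flag = int(line.split("\t")[1])
        if flag & 0x1:
            return True
        checked += 1
        if checked >= max_records:
            break
    return False
```

Sampling only the first records is a heuristic: a file that mixes paired and single-end reads could be misclassified, so counting over the whole file is safer when that is a concern.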
output as input new function
output as input new function 0 Hey, I’m working with samtools view to check a specific region from my bam file and I would like to convert that region into fasta by piping it to angsd. However, I’m not able to do it. How should I indicate I want the…