Tag: BWA

Read counts an order of magnitude higher on one chromosome

Hi, I am having an issue with a sequencing run: when demultiplexed, aligned, and filtered, each individual has 1-2 million reads, but these reads are predominantly on one chromosome. For background, these are Oncorhynchus mykiss and O. clarki…

Detailed differences between sambamba and samtools

In March, my first post in the beginners' group, "Do false-positive mutations appear because duplicate marking is not enough?", described the problem that supplementary reads are not marked as duplicates by GATK MarkDuplicates. Afterwards, in response to this question, I began…

Genetic and chemotherapeutic influences on germline hypermutation

DNM filtering in 100,000 Genomes Project We analysed DNMs called in 13,949 parent–offspring trios from 12,609 families from the rare disease programme of the 100,000 Genomes Project. The rare disease cohort includes individuals with a wide array of diseases, including neurodevelopmental disorders, cardiovascular disorders, renal and urinary tract disorders, ophthalmological…

On a reference pan-genome model (Part II)

12 July 2019 I wrote a blog post on a potential reference pan-genome model. I had more thoughts in my mind. I didn’t write about them because they are immature. Nonetheless, a few readers raised questions related to my immature thoughts, so I decide to add this “Part II” as…

Postdoctoral Research Fellow in Bioinformatics/Computational Biology

Details Posted: 27-Apr-22 Location: Boston, Massachusetts Salary: Open Categories: Staff/Administrative Internal Number: 2022-27118 Located in Boston and the surrounding communities, Dana-Farber Cancer Institute brings together world renowned clinicians, innovative researchers and dedicated professionals, allies in the common mission of conquering cancer, HIV/AIDS and related diseases. Combining extremely talented people with…

Bioinformatics Analyst II – Remote in Danville, PA for Geisinger

Details Posted: 22-Apr-22 Location: Danville, Pennsylvania Type: Full Time Salary: Open Categories: Operations Job Summary Primary accountability is to leverage the organization’s data assets exome sequencing data (>180,000 individuals) from MyCode Community Health Initiative to improve quality, efficiency and generate knowledge specifically in the field of bioinformatics within health research….

long run-time and low CPU usage

I’m trying to run Pindel on some 30x Illumina WGS data. I aligned reads with BWA-MEM, then sorted by co-ordinates and indexed them with Samtools. I also tried filtering the bam files with samtools -F 0x800 as suggested by another post. I…
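
For readers unfamiliar with the filter mentioned above: 0x800 (decimal 2048) is the SAM FLAG bit for supplementary alignments, and `-F` excludes records that have it set. The following is a minimal Python sketch of the same test on plain SAM text. The function name and the example records are made up for illustration, and unlike `samtools view` without `-h`, this sketch keeps header lines.

```python
SUPPLEMENTARY = 0x800  # SAM FLAG bit for supplementary alignments (decimal 2048)


def drop_supplementary(sam_lines):
    """Yield header lines and every record whose FLAG lacks the 0x800 bit."""
    for line in sam_lines:
        if line.startswith("@"):
            # Header line: pass through unchanged.
            yield line
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if not flag & SUPPLEMENTARY:
            yield line


# Hypothetical records: the second r1 line has FLAG 2147 = 2048 + 99.
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",
    "r1\t2147\tchr1\t5000\t60\t20M30H\t=\t300\t0\t*\t*",
]
kept = list(drop_supplementary(example))
```

Here `kept` holds the header and the primary record only; the supplementary record is dropped.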

FastQC per base sequence content

I’m running FastQC on some paired-end fastq files. I have a warning on per-base sequence content, as the first 5 to 6 bases show significant bias towards T and G, as shown below. I was wondering what the sequence in the first 5 or…

Bioinformatics Pipeline Development Engineer II at Personalis, Inc

Personalis, Inc. is a leader in advanced cancer genomics for enabling the next generation of precision cancer therapies and diagnostics. The Personalis NeXT Platform® is designed to adapt to the complex and evolving understanding of cancer, providing its biopharmaceutical customers and clinicians with information on all of the approximately 20,000 human genes,…

Sam file is not written

Dear all, the log file shows the following: [08-02 01:26:25] Running Step 2: BWA … bwa_wrap /work/pathology/s206442/dbet_project/hg19/hg19.fa Output3/out_1.valid.fastq 6 Output3/out_1.valid.sam 0 Running BWA on trimmed reads … bwa mem -t 6 /work/pathology/s206442/dbet_project/hg19/hg19.fa Output3/out_1.valid.fastq | samtools view -h -F 2048 - > Output3/out_1.valid.sam However, the sam file size is…

Building custom hg38 – alt contigs

I am exploring modifications of hg38 like these: github.com/mebbert/Dark_and_Camouflaged_genes Starting from the regular bcbio hg38 data installation Masking hg38.fa using bedtools maskfasta Generating indexes using bcbio_setup_genome.py for seq and bwa as described in the manual The bwa directory then contains ├── bwa │   ├── hg38_masked.fa.amb │   ├── hg38_masked.fa.ann │   ├──…

Color hiring Software Engineer, Bioinformatics in Remote

About Color Color’s mission is to help people lead the healthiest lives that science and medicine can offer. We launched in April 2015 with a simple, affordable genetic test to help people understand their risk for hereditary cancer. In 2017, we added coverage for hereditary heart conditions. Between them, cancer…

Frontiers | Machine Learning and Deep Learning Applications in Metagenomic Taxonomy and Functional Annotation

Introduction The study of the microbial environments has benefited from the sequencing revolution, where technology improvement decreased the DNA sequencing cost and increased the number of sequenced nucleic bases. For approximately 20 years (depending on how we define the term metagenomics), it has allowed the decryption of the microbial composition…

BTG2 gene predicts poor outcome in PT-DLBCL

Introduction Primary testicular diffuse large B-cell lymphoma (PT-DLBCL) is a rare and aggressive form of mature B-cell lymphoma.1–3 PT-DLBCL was the most common type of testicular tumor in men aged over 60 and characterized by painless uni- or bilateral testicular masses with infrequent constitutional symptoms.4–6 PT-DLBCL shows significant extranodal tropism,…

sorting – indexing sorted alignment file with samtools index gives “Exec format error”

I am struggling with samtools index. I already did the alignment using “bwa mem reference.fa seq.fastq > alg.sam”. The resulting sam file was converted to bam format using “samtools view -S -h -b alg.sam > alg.bam”. Next, the files were sorted by using “sort -h alg.bam >sorted.bam”. And now we…

bwa-mem2/mm2-fast: Accelerated version of minimap2; up to 1.8x faster

HRJOB7442 Bioinformatics Scientist 2 (Various Locations) in Nether Alderley, Macclesfield (SK10) | Almac Group (Uk) Ltd

Bioinformatics Scientist 2 Hours: 37.5 hours per week Salary: Competitive Ref No: HRJOB7442 Business Unit: Diagnostic Services Location: Craigavon or Manchester Open To: Internal and External Applicants The Company Almac Diagnostic Services is a leading stratified medicine business, specialising in biomarker-driven clinical trials. We are incredibly proud to be involved…

samtools markdup

I’m deduplicating reads in a merged bam file, and I get this error. What is going on? What is the solution? (base) javier@iMac-de-JAVIER BWA % samtools markdup -r -S 1merged.bam 2merged.bam [tmp_file] Error: tmp file write data failed. [markdup] error: writing temp output failed. [E::bgzf_close] File…

nf-core/circrna

circRNA quantification, differential expression analysis and miRNA target prediction of RNA-Seq data Introduction nf-core/circrna is a best-practice analysis pipeline for the quantification, miRNA target prediction and differential expression analysis of circular RNAs in paired-end RNA sequencing data. The pipeline is built using Nextflow, a workflow tool to run tasks across…

Cell Strain-Derived Induced Pluripotent Stem Cells as an Isogenic Approach To Investigate Age-Related Host Response to Flaviviral Infection

INTRODUCTION Dengue is the most common mosquito-borne viral disease globally (1). This acute disease, which can be life-threatening, is caused by four different dengue viruses (DENVs) (DENV-1, DENV-2, DENV-3, and DENV-4). An estimated 390 million people are infected with these DENVs annually (2), and populations throughout the tropics face frequent…

[MonashBioinformaticsPlatform/RSeQC] junction_saturation not suit for bam/sam file generated by minimap or pbmm2

The CIGAR strings in bam/sam files generated by minimap2 contain “=”, which represents a match with the reference, and “X”, which represents a mismatch. The bam_cigar.py in ./lib/qcmodule/bam_cigar.py, however, only handles bam/sam files generated by aligners such as BWA or bowtie, whose CIGAR strings contain only “M”, which represents either a match or a mismatch. So I modified the bam_cigar.py 77…
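
The excerpt is cut off before the author's patch, so the following is not that modification. It is a hedged Python sketch of a different workaround under the same observation: rewrite “=” and “X” operations as “M” before handing the CIGAR string to code that only understands “M”. The function name `collapse_eqx` is hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def collapse_eqx(cigar):
    """Rewrite '=' and 'X' as 'M' and merge adjacent runs of the same operation."""
    if cigar == "*":  # '*' means the CIGAR is unavailable
        return cigar
    merged = []
    for length, op in CIGAR_OP.findall(cigar):
        op = "M" if op in "=X" else op
        if merged and merged[-1][1] == op:
            # Same operation as the previous run: extend it.
            merged[-1][0] += int(length)
        else:
            merged.append([int(length), op])
    return "".join(f"{n}{op}" for n, op in merged)
```

For example, `collapse_eqx("5=1X4=2I3=")` gives `"10M2I3M"`: the first three operations merge into a single 10-base “M” run, and the insertion is left untouched.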

bwa , 2 files fastq to 1 sam

I have this problem, please help me. I’m trying this on Mac OS Catalina. I am creating a sam file from 2 fastq files using bwa. I apply the following command: bwa mem -t 2 GRCh38.primary_assembly.genome.fa.gz V350019555_L03_B5GHUMqcnrRAABA-556_1.fq.gz V350019555_L03_B5GHUMqcnrRAABA-556_2.fq.gz > V350019555_L03_B5GHUMqcnrRAABA-556.sam…

Senior Bioinformatics Software Developer – Bethesda

Medical Science & Computing, (MSC), a Dovel company, is seeking skilled Senior Bioinformatics Software Developers to join our team supporting our client, NCBI at the National Institutes of Health, (NIH) in Bethesda, MD. The National Center for Biotechnology Information (NCBI) is part of the National Library of Medicine (NLM) at…

Samtools flagstat confusing result of a merged bam file

Hi, I am a bioinformatics student and I am struggling with an issue, I had paired-end fastq files for one sample with some low-quality bases at the end and adapter contamination, so I went and I trimmed my reads with trimmomatic, it gave me 4 files that I used for…

Failure to detect mutations in U2AF1 due to changes in the GRCh38 reference sequence

Materials and Methods Genomic data was collected as part of the MDS National History Study or The Cancer Genome Atlas project and consented appropriately under those protocols 8 Sekeres M.A. Gore S.D. Stablein D.M. DiFronzo N. Abel G.A. DeZern A.E. Troy J.D. Rollison D.E. Thomas J.W. Waclawiw M.A. Liu J.J….

samtools sort

I am converting sam files to bam. To sort them I use this command: % cd /Volumes/GENOMA/BWA % samtools sort -n -O V350019555_L03_B5GHUMqcnrRAABA-551.sam | samtools fixmate -m -O bam V350019555_L03_B5GHUMqcnrRAABA-551.bam but it gives me the following error: As elsewhere in samtools, use '-' as the filename…

Bwa on multiple processor

Hi Guys, When I try to run bwa mem on multiple processors, I get this error: > mpirun -np 16 bwa mem hg19-agilent.fasta R1.fastq R2.fastq | samtools sort -o aln.bam [M::bwa_idx_load_from_disk] read 0 ALT contigs [M::bwa_idx_load_from_disk] read 0 ALT contigs [M::bwa_idx_load_from_disk] read 0 ALT contigs [M::bwa_idx_load_from_disk] read…

Alignment report

Hi Guys, I aligned R1 and R2 fastq files to the reference genome using bwa mem and got a bam file. Now I want to check whether the alignment was done correctly, and get the alignment percentage, coverage, etc. I ran the following command: bwa mem hg19.fasta R1.fastq R2.fastq | samtools…

sequence alignment – MarkDuplicatesSpark failing with cryptic error message. MarkDuplicates succeeds

I have been trying to follow the GATK Best Practice Workflow for ‘Data pre-processing for variant discovery’ (gatk.broadinstitute.org/hc/en-us/articles/360035535912). This has all been run on Windows Subsystem for Linux 2 on the Bash shell. I started off with FASTQ files from IGSR (www.internationalgenome.org/data-portal) and performed alignment with Bowtie2 (instead of…

Systems biology analysis of human genomes points to key pathways conferring spina bifida risk

Significance Genetic investigations of most structural birth defects, including spina bifida (SB), congenital heart disease, and craniofacial anomalies, have been underpowered for genome-wide association studies because of their rarity, genetic heterogeneity, incomplete penetrance, and environmental influences. Our systems biology strategy to investigate SB predisposition controls for population stratification and avoids…

Benchmarking the NVIDIA Clara Parabricks germline pipeline on AWS

This blog post was contributed by Ankit Sethia, PhD, and Timothy Harkins, PhD, at NVIDIA Parabricks, and Olivia Choudhury, PhD,  Sujaya Srinivasan, and Aniket Deshpande at AWS. This blog provides an overview of NVIDIA’s Clara Parabricks along with a guide on how to use Parabricks within the AWS Marketplace. It…

Towards the biogeography of prokaryotic genes

Attempting to generate a bam.bai file but the output is not readable

Hi, I am new to exome sequencing, and have tried to follow tutorials on the subject. I am stuck at the samtools index stage because the output files are in a non-human readable format and I believe…

Padding out a GVCF file with 1000G exomes to get gatk VariantRecalibrator working with a small sample

I’ve got sequencing data for a small 500 bp amplicon from a few samples. GATK best principles suggest running VariantRecalibrator on the GVCF files I generate. I’m trying to get this working, but I get an error about “Found annotations with zero variances”. Reading the gatk manual and other posts…

Mapping multiples

Hi, I am coming to you for help. I am mapping short and long read files with BWA and MINIMAP2. My problem is that I want to write an if statement that would let me choose either BWA if I work with short…
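
The question is truncated, so the rest of the condition is an assumption: the sketch below guesses that minimap2 is wanted for long reads. It is a minimal Python illustration of such a branch that only builds the command line; `mapping_command` and the file names are hypothetical, and the default `map-ont` preset assumes Nanopore reads (`map-pb` would be the PacBio CLR counterpart).

```python
def mapping_command(read_type, reference, reads, preset="map-ont"):
    """Return the aligner command (as an argument list) for the given read type."""
    if read_type == "short":
        # Short reads: bwa mem <reference> <reads...>
        return ["bwa", "mem", reference, *reads]
    if read_type == "long":
        # Long reads: minimap2 with SAM output (-a) and a preset (-x)
        return ["minimap2", "-ax", preset, reference, *reads]
    raise ValueError(f"unknown read type: {read_type!r}")


# Hypothetical file names, for illustration only.
short_cmd = mapping_command("short", "ref.fa", ["r_1.fq", "r_2.fq"])
long_cmd = mapping_command("long", "ref.fa", ["long.fq"])
```

The returned list can be passed to `subprocess.run(cmd, stdout=out_file)` to write the SAM output to a file.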

Strange speed up in GATK LeftAlignIndels

Hi! I noticed a strange thing. I have been running a DNA-seq pipeline like this: reads -> bwa-mem2 -> picard SortSam -> picard MergeSamFiles -> picard MarkDuplicates -> gatk LeftAlignIndels … gatk LeftAlignIndels has always taken around 4 hours to complete with the…

Single-cell DNA and RNA sequencing reveals the dynamics of intra-tumor heterogeneity in a colorectal cancer model | BMC Biology

Organoid culture of small intestinal cells and lentiviral transduction C57BL/6J mice and BALB/cAnu/nu immune-deficient nude mice were purchased from CLEA Japan (Tokyo, Japan). The small intestine was harvested from wild-type male C57BL/6J mice at 3–5 weeks of age (Additional file 1: Figure S9A). Crypts were purified and dissociated into single cells,…

Why are my Nextflow processes not executing in parallel?

I have written a Nextflow script with three processes. The first process takes a pair of fastq files and aligns them to the reference genome. The process writes the resulting SAM file into the sam channel. The second process takes input from the sam channel and creates a BAM file from it, and writes…

Weird error from BWA and BOWTIE2

Hi community, Recently I have used BWA and Bowtie2 to align simulated DNA sequencing data to test our sequencing simulator. I got some errors from both aligners: BWA: submit.sh: line 48: 6881 Segmentation fault (core dumped) BOWTIE2: terminate called after throwing an instance…

Transposition and duplication of MADS-domain transcription factor genes in annual and perennial Arabis species modulates flowering

Annual and perennial species occur in many plant families. Annual plants and some perennials are monocarpic (flowering once in their life cycle), characterized by a massive flowering and typically produce many seeds before the whole plant senesces. By contrast, most perennials live for many years, show delayed reproduction, and are…

iCOMIC: a graphical interface-driven bioinformatics pipeline for analyzing cancer omics data

Abstract Despite the tremendous increase in omics data generated by modern sequencing technologies, their analysis can be tricky and often requires substantial expertise in bioinformatics. To address this concern, we have developed a user-friendly pipeline to analyze (cancer) genomic data that takes in raw sequencing data (FASTQ format) as input…

Dissemination of Mycobacterium abscessus via global transmission networks

Dataset construction, cluster identification and definition of DCCs Whole genome sequencing of two collections of isolates from Manchester, UK, and the Netherlands was carried out as previously described2. Briefly, DNA was extracted from colony sweeps of subcultured samples before paired-end sequencing using the Illumina HiSeq platform. These samples were…

Genome-wide analysis reveals associations between climate and regional patterns of adaptive divergence and dispersal in American pikas

converting Bam to fastq while removing clipping(hard/soft clip bases)

Hello, I want to do some analysis and my raw data is paired-end fastq files. So far, I have used BWA mem to align them into a SAM file, then used samtools to convert it to a BAM file. My next step is…

Haplotype divergence supports long-term asexuality in the oribatid mite Oppiella nova

Significance Putatively ancient asexual species pose a challenge to theory because they appear to escape the predicted negative long-term consequences of asexuality. Although long-term asexuality is difficult to demonstrate, specific signatures of haplotype divergence, called the “Meselson effect,” are regarded as strong support for long-term asexuality. Here, we provide evidence…

The sardine run in southeastern Africa is a mass migration into an ecological trap

INTRODUCTION Large-scale annual migrations occur in an extraordinary range of animals, from insects to the great whales. While the driving mechanisms of these migrations are varied and sometimes poorly understood, they often represent a way of optimizing conditions for breeding and adult fitness when these are in conflict. Often, populations…

Align fastq SOLiD data

Hello everyone, I have downloaded some data from the short read archive using the sratoolkit. The data is SOLiD data. I have seen people using the Lifescope (Life Technologies) to align the reads, as I presume it works for this type of data. But unfortunately,…

bamdst gives error “EOF marker is absent. The input is probably truncated.”

I created a set of bam files from Poolseq data using bwa -aln, and all of the output files gave the following error when I ran bamdst to get summary statistics on read depth: “EOF marker is…

High tumor mutation burden and DNA repair gene mutations

Introduction Anaplastic lymphoma kinase (ALK)‑fusion genes represent a small but important part of oncogenic driver mutations in NSCLC, accounting for approximately 3%‑7% of all cases worldwide.1,2 Small molecule tyrosine kinase inhibitors (TKIs) are the standard therapy for ALK-rearranged NSCLC. Crizotinib, a first-generation TKI, is the most widely used targeted drug…

Command-line alternative to Geneious assemble for Sanger sequencing data

I am doing Sanger sequencing of a construct ~2Kb using 4 primer pairs. I get back 4 .ab1 files, each with generally around 1Kb of high quality sequence and given the relatively small size of the construct these overlap significantly. The goal is to assemble these 4 sequences into a…

Bioinformatics Support Specialist (Remote) at Agilent Technologies, Inc.

Agilent inspires and supports discoveries that advance the quality of life. We provide life science, diagnostic, and applied market laboratories worldwide with instruments, services, consumables, applications, and expertise. Agilent enables customers to gain the answers and insights they seek so they can do what they do best: improve the world…

High frequency of an otherwise rare phenotype in a small and isolated tiger population

Significance Small and isolated populations have low genetic variation due to founding bottlenecks and genetic drift. Few empirical studies demonstrate visible phenotypic change associated with drift using genetic data in endangered species. We used genomic analyses of a captive tiger pedigree to identify the genetic basis for a rare trait,…

MAPQ (Mapping quality) of 0 for most reads from BWA-MEM2 (with no secondary alignment or other apparent reason)

Hello, I got a very weird output from BWA-mem2 – most of the reads have mapping quality of 0, even though there is no secondary alignment or anything else suspicious. I got sequencing data that was aligned with Novoalign to hg18, the data was bam files. I needed to realign…

Biocept, Inc. hiring Bioinformatics Scientist in San Diego, California, United States

Tasks and Responsibilities Develop and maintain analysis pipelines for next generation sequencing data. Deep dive analysis of targeted and UMI… Required Skills & Experience MS/PhD in bioinformatics, mathematics, computer science, biology, chemistry, or similar, with 3+ years experience in an industrial environment. Experience in clinical diagnostics desired but not required…

Cancer Mutation Detection Depends on Choices at Each Step of Sequencing, Analysis Pipeline

NEW YORK — An international team of researchers has examined how variations in sequencing approaches can influence the ability to accurately detect cancer mutations, providing guidance for the wider community. The team additionally developed a set of reference samples for benchmarking efforts. Next-generation sequencing approaches are increasingly being adopted to…

Assistant Research Professor – Genomics and Bioinformatics job with City of Hope

About City of Hope City of Hope, an innovative biomedical research, treatment and educational institution with over 6000 employees, is dedicated to the prevention and cure of cancer and other life-threatening diseases and guided by a compassionate, patient-centered philosophy. Founded in 1913 and headquartered in Duarte, California, City of Hope…

ENHANCED GRAVITROPISM 2 encodes a STERILE ALPHA MOTIF–containing protein that controls root growth angle in barley and wheat

Significance To date, the potential of utilizing root traits in plant breeding remains largely untapped. In this study, we cloned and characterized the ENHANCED GRAVITROPISM2 (EGT2) gene of barley that encodes a STERILE ALPHA MOTIF domain–containing protein. We demonstrated that EGT2 is a key gene of root growth…

Mapping digested synthetic oligos back to original sequences.

Hi, I have several synthetic dsDNA of 70bp and I digest them with some enzyme. I am interested in seeing the exact cut site of the enzyme, so I had the products sequenced using MiSeq. The reads are single-end. What is…

Aligning Multiple paired end files together

Hi All, I have 72 paired-end .fastq files that I need to align using BWA. Since it is paired-end data, my files are named sam_001_1.fastq sam_001_2.fastq sam_002_1.fastq sam_002_2.fastq and so on. Since it is paired-end data…
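
Given the naming scheme quoted above, one way to approach this is to group the files by sample prefix and emit one `bwa mem` command per pair. The Python sketch below is an illustration under assumptions, not an answer taken from the thread: the helper names and the `reference.fa` placeholder are hypothetical, and it only builds the command lists rather than running them.

```python
import re
from collections import defaultdict

# Matches names like sam_001_1.fastq: sample prefix, then mate number 1 or 2.
PAIR_NAME = re.compile(r"^(?P<sample>.+)_(?P<mate>[12])\.fastq$")


def pair_fastqs(filenames):
    """Map each sample prefix to its (mate 1, mate 2) files; skip incomplete pairs."""
    mates = defaultdict(dict)
    for name in filenames:
        match = PAIR_NAME.match(name)
        if match:
            mates[match["sample"]][match["mate"]] = name
    return {
        sample: (files["1"], files["2"])
        for sample, files in sorted(mates.items())
        if len(files) == 2
    }


def bwa_mem_commands(filenames, reference="reference.fa"):
    """Build one 'bwa mem <reference> <R1> <R2>' argument list per complete pair."""
    return [
        ["bwa", "mem", reference, r1, r2]
        for r1, r2 in pair_fastqs(filenames).values()
    ]
```

In practice the input list could come from `sorted(os.listdir("."))`, and each command could be run with `subprocess.run`, redirecting stdout to a per-sample SAM file.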

Gene mutation analysis in papillary thyroid carcinoma

Introduction Thyroid tumors are the most common malignant tumors of the endocrine system, and their incidence has been increasing in the recent decades. Currently, there are some target drugs that can effectively treat PTC, and next-generation sequencing (NGS) can be used for targeted therapy. In order to make better informed…

pseudogenes and their parent gene common regions

Hi, I have a list of gene names and their corresponding pseudogenes. I want to figure out which regions of a pseudogene and its parent gene are common. I think one way would be first extracting their sequences, then aligning them to each…

How to analyze the generated VCF file, what to do if you have multiple VCF file for the same gene?

I submitted 40 tumor samples for NGS analysis and gave them a list of specific genes, so that only those would be sequenced. Let's call that gene…

Twist Bioscience hiring Bioinformatics Scientist, Production Bioinformatics in South San Francisco, California, United States

Twist is looking for a Bioinformatics Scientist to join our Production Bioinformatics Team. You will work alongside research scientists, software engineers and data scientists to further deliver on our mission to expand access to best-in-class synthetic biology and next-generation sequencing applications. You will be developing and engineering tools to better…

Error in pipe output to samblaster from bwa-mem2

Hi, I am trying to upgrade my command from bwa to bwa-mem2. This command usually works: bwa mem -M -R "@RG\tID:id\tSM:sample\tLB:lib" human_g1k_v37.fasta sample.1.fq sample.2.fq | samblaster -M --excludeDups --addMateTags --maxSplitCount 2 --minNonOverlap 20 | samtools view -S -b - > sample.bam…

the Genomic Rearrangement IDentification Software Suite

GRIDSS is typically used for detecting structural variation breakpoints from short read sequencing data but is a modular software suite containing a number of tools useful for the detection of genomic rearrangements including: A structural variant caller. The GRIDSS caller uses break-end…

Bioinformatics Analyst II in Danville, PA for Geisinger

Job Summary Primary accountability is to leverage the organization’s data assets exome sequencing data (>180,000 individuals) from MyCode Community Health Initiative to improve quality, efficiency and generate knowledge specifically in the field of bioinformatics within health research. Performs and supervises complex data extraction, transformation, visualization, and summarization to support Research…

How to align and visualize data with .fasta and .gff3 files in IGV?

Hi everyone, I have an issue aligning and visualizing my data in IGV. As I read in the IGV manual, to align and visualize data I need to prepare a .BAM/.SAM or other input format…

Bowtie2 hg19 reference for gatk MuTect

Hello, I understand that the suggested aligner to use with GATK is bwa. If I want to use Bowtie2 as the aligner, which reference file should I be using? The reference in the GATK bundle (Homo_sapiens_assembly19.fasta) does not seem to work with Bowtie2 and…

Bwa sampe error 999

25-08-2021 I’m getting the following error message when I try to import into 1aa.vremenagoda54.ru file (using samtools import). [samopen] SAM header I’m using bwa aln to find coordinates and bwa sampe to…

Extracting amino acid substitutions

Good day, I’m trying to develop a pipeline to determine mutations which are responsible for amino acid changes in genes associated with antibiotic resistance. I have roughly 300 bacterial isolates. My approach so far has not been fruitful; in short, this is what I tried:…

Snakemake-Alignment using BWA-MEM2

Hello, I have started using snakemake 6.5.2 to align fastq files to a reference file. I have pasted the error below in this question. How do I allocate memory in the snakefile and read the header from the SAM file, '-'? This is the snakefile (wrapper for running alignment): rule bwa_mem2_mem: input: reads=["/scicore/home/cichon/GROUP/test_workflow/samples/{sample}.1.fq", "/scicore/home/cichon/GROUP/test_workflow/samples/{sample}.2.fq"]…

How to solve this error when I want to install HISAT2? “E: Unable to locate package hisat2”

Dear all, I need to install the HISAT2 aligner for my study. My Linux version is Ubuntu 16.04 (Xenial Xerus), so I used the command below: sudo apt-get install -y hisat2 but I…

How to calculate the Average Insert Size after mapping the reads to the reference genome using BWA

Hi, having mapped the reads to the reference genome using BWA, I am trying to calculate their average insert size. I then converted the BAM file to a SAM file in order to…
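
One possible way to approach the question above, as a minimal sketch rather than the poster's method: average the template length (TLEN, column 9 of a SAM record) over properly paired reads. TLEN is only an approximation of the insert size, and the function name and the example records are hypothetical.

```python
from statistics import mean

PROPER_PAIR = 0x2  # SAM flag bit: both mates aligned as a proper pair


def mean_insert_size(sam_lines):
    """Return the mean positive TLEN over properly paired records, or None."""
    sizes = []
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        flag, tlen = int(fields[1]), int(fields[8])
        # Each pair reports +TLEN on one mate and -TLEN on the other,
        # so keeping only positive values counts every pair once.
        if flag & PROPER_PAIR and tlen > 0:
            sizes.append(tlen)
    return mean(sizes) if sizes else None


# Tiny hand-written SAM fragment (hypothetical reads, not real data).
example = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t99\tchr1\t100\t60\t50M\t=\t250\t200\tACGT\tFFFF",
    "r1\t147\tchr1\t250\t60\t50M\t=\t100\t-200\tACGT\tFFFF",
    "r2\t99\tchr1\t300\t60\t50M\t=\t550\t300\tACGT\tFFFF",
    "r2\t147\tchr1\t550\t60\t50M\t=\t300\t-300\tACGT\tFFFF",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tFFFF",
]
print(mean_insert_size(example))
```

In practice the lines could be streamed from `samtools view` output instead of a list, which avoids writing a full SAM file to disk.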

Missing read group in BAM files

Hello everyone, I have processed PE reads through the HybPiper pipeline to align them to a reference genome with GATK. However, inspecting the output BAM files with the GATK tool ValidateSamFile, I found a very common error in the error report: WARNING::RECORD_MISSING_READ_GROUP…

MarkDuplicatesSpark: how to speed up?

Hello all, I would like to know if there is any good option to speed up MarkDuplicatesSpark. I work with a human genome with around 900 million reads (151 bp). I work on a cluster (with Slurm). The command that I used is (with…

What is the difference between GRCh37 and hs37? And hg19?

This is what I have found so far. Please correct me if I am wrong. GRCh37 w/o patches includes the primary assembly (22 autosomes, X, Y, and non-chromosomal supercontigs) and alternate scaffolds, but not a reference mitogenome. Non-chromosomal supercontigs are the unlocalized and unplaced scaffolds. The rCRS reference mitogenome in…

Base recalibration - Java runtime error and no sequence dictionary

Hello, I am stuck at the base recalibration step in NGS analysis. I used this command for the base recalibration step: gatk BaseRecalibrator -I sample1.bam -R gch38.fa --known-sites GCF_000001405.39 -O recal_data.table I got the following warning: WARN IndexUtils - Feature file…

Vacancy for Bioinformatics Analyst in the USA – OYA Opportunities

Apply for Vacancy for Bioinformatics Analyst at Weill Cornell Medicine in the USA. The deadline for this job is 30th September 2021. About: Weill Cornell Medicine, officially the Joan & Sanford I. Weill Medical College of Cornell University, is the biomedical research unit and medical school of Cornell University, a…

So many variants detected.

Dear all, I have done variant calling on germline data that has a single sample for each individual and two genes. I did the following steps, but after checking the results I found too many variants. After HaplotypeCaller (step 6) I found 140900 known variants, and the…

CROP-seq data analysis

Hi, I am a newbie to single-cell sequencing analysis. I have to analyze CROP-seq data, and I am going through the following paper: www.nature.com/articles/nmeth.4177. I have to use Cell Ranger (instead of the Drop-seq software) as the first step to process the single-cell data. I wanted…

Alignment using bwa-mem2

Hello, I need help aligning sequences to a reference using bwa-mem2. I used the following command: bwa-mem2 mem -t 8 gch38.fa DE98NGSUKBD117612_1_1.fq DE98NGSUKBD117612_1_2.fq > d3_align.sam I got the following error: ERROR! Unable to open the file: gch38.fa.bwt.2bit.64 There is no gch38.fa.bwt.2bit.64 file. I have the…
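
The error above names a missing index file, so a quick pre-flight check may help narrow things down. The sketch below is an assumption-laden illustration: only the `.bwt.2bit.64` suffix is confirmed by the error message, the other suffixes are what recent bwa-mem2 releases are assumed to write, and `missing_index_files` is a hypothetical helper.

```python
from pathlib import Path

# Suffixes that recent bwa-mem2 releases are assumed to write next to the
# reference; only ".bwt.2bit.64" is confirmed by the error message above.
ASSUMED_INDEX_SUFFIXES = (".0123", ".amb", ".ann", ".bwt.2bit.64", ".pac")


def missing_index_files(reference, suffixes=ASSUMED_INDEX_SUFFIXES):
    """Return the expected index files that do not exist for `reference`."""
    return [reference + s for s in suffixes if not Path(reference + s).exists()]


missing = missing_index_files("gch38.fa")
if missing:
    print("Missing:", ", ".join(missing))
    # An index built with classic bwa is not sufficient for bwa-mem2.
    print("Try building the index first: bwa-mem2 index gch38.fa")
```

If files are reported missing, building the index with `bwa-mem2 index` on the same reference is the usual first thing to try.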

align using file.ht2

I downloaded the indexed UCSC hg19 file in my terminal, and when I uncompressed it I found two files, genome.5.ht2 and genome.8.ht2. Every time I try to align my samples against the indexed file, this error shows up: [E::bwa_idx_load_from_disk] fail to locate the index files…

I am converting the fq.gz. files (which are the results of the mgi study) to bam files to view on igv.

Hey everyone, before I start, apologies for any inconvenience caused by my wrong or inappropriate use of terms. I have been getting some bwa mem failures lately. As I…

Map Entire Directory of Paired-End Reads at Once

Is there a way to map an entire directory of reads at once? Would I just have to write a script for this specific to my directory structure and data? I’m using BWA MEM to map 49 paired-end reads and have…
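
A small script is the usual answer to a question like this. The sketch below pairs mate files by name and builds one `bwa mem` command per sample. The `<sample>.1.fq` / `<sample>.2.fq` naming, the `reads` directory, and `ref.fa` are assumptions for illustration, since the poster's actual layout is not shown.

```python
from pathlib import Path


def pair_fastqs(directory, r1_suffix=".1.fq", r2_suffix=".2.fq"):
    """Map sample name -> (R1 path, R2 path) for every complete pair."""
    pairs = {}
    for r1 in sorted(Path(directory).glob("*" + r1_suffix)):
        sample = r1.name[: -len(r1_suffix)]
        r2 = r1.with_name(sample + r2_suffix)
        if r2.exists():  # skip samples whose mate file is missing
            pairs[sample] = (r1, r2)
    return pairs


def bwa_mem_command(reference, r1, r2, threads=4):
    """Build an argument list for one paired-end `bwa mem` run."""
    return ["bwa", "mem", "-t", str(threads), str(reference), str(r1), str(r2)]


if __name__ == "__main__":
    # Hypothetical directory and reference names; adjust to the real layout.
    for sample, (r1, r2) in pair_fastqs("reads").items():
        print(sample, " ".join(bwa_mem_command("ref.fa", r1, r2)))
```

Each argument list could be passed to `subprocess.run` with `stdout` redirected to a per-sample SAM file, because `bwa mem` writes alignments to standard output.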

Read group info

Hello, I need help getting read group info for performing alignment using BWA-MEM2. I read a previous post (bwa mem: Passing a variable to read group) on read group info, where a shell script is used to get the read group info from the fastq file. Can someone…
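
As a rough illustration of deriving read group fields from a FASTQ header, the sketch below assumes Illumina-style read names (`@instrument:run:flowcell:lane:tile:x:y`). The function name, the default library and platform values, and the example header are hypothetical, and this is not the shell script from the post mentioned above.

```python
def read_group_from_header(header, sample, library="lib1", platform="ILLUMINA"):
    """Build an @RG string for the -R option from a FASTQ header line.

    Assumes the Illumina layout @instrument:run:flowcell:lane:tile:x:y
    """
    fields = header.lstrip("@").split()[0].split(":")
    if len(fields) < 4:
        raise ValueError("header does not look like an Illumina read name")
    flowcell, lane = fields[2], fields[3]
    tags = [
        "@RG",
        f"ID:{flowcell}.{lane}",
        f"SM:{sample}",
        f"LB:{library}",
        f"PL:{platform}",
        f"PU:{flowcell}.{lane}.{sample}",
    ]
    # Join with the two characters backslash + t, the escaped form that
    # bwa mem -R accepts between tags.
    return "\\t".join(tags)


header = "@A00123:45:HXXXXXXXX:2:1101:1000:1000 1:N:0:ACGT"  # made-up read name
print(read_group_from_header(header, sample="sampleA"))
```

The resulting string can be passed as the value of `-R`. The sample name has to come from elsewhere, such as the file name, because the FASTQ header does not carry it.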

VCF Filter On Small Genomes

Hi guys, I am working on NGS data from a yeast species (Candida glabrata) to find any mutations related to drug resistance. I am new to bioinformatics, so I am using Galaxy.eu to get used to the algorithms. There is literature about some genes that mutations…
