Tag: BAM
How do I specify the Recalibration_data.table for “PrintReads” in GATK 4.3, if the tool does not admit -BQSR as an option anymore?
I executed BaseRecalibrator on my .bam and obtained a Recalibration_data.table file. Now, when I apply PrintReads, the option -BQSR is not acknowledged by PrintReads. Even online or…
What’s the correct way to store bam record in a vector and free them?
Here’s my demo code: #include <stdio.h> #include "htslib/sam.h" #include <vector> #include <fstream> #include <string> #include <iostream> int main(int argc, char *argv[]) { samFile *fp = sam_open(argv[1], "r"); hts_idx_t *idx = sam_index_load(fp, argv[1]); bam_hdr_t *h =…
No @HD header returned in SAM file when running bwa mem
Hello, I produced SAM files with the command below: bwa mem -M -t 10 IndexedReference ${sample}_R1.fastq.gz ${sample}_R2.fastq.gz 2> ${sample}_bwa.err > ${sample}.sam The resulting SAM file doesn’t have an @HD header. Example output of samtools view: samtools view -H…
If I execute “AddOrReplaceReadGroups” on a sorted and duplicate-marked .bam file, do I have to re-sort and re-mark duplicates?
I’m brand new to bioinformatics and am building my first variant calling pipeline for human genomes. I got stuck using BaseRecalibrator, as it needs read groups in my .bam file,…
Bioconductor Unevensamplesizes
Answer: multiple filters in biomaRt, by RuBBiT0. First, yes, one gene can be related to several GO IDs, and the result depends on the annotation package you use. Second, could you pro…
Comment: Diffbind “No genome detected”, by kyliecode. Hi Dr. Stark, Thanks very…
Integrated multi-omics for rapid rare disease diagnosis on a national scale
Ethics The Australian Genomics Acute Care study has Human Research Ethics Committee approval (HREC/16/MH/251). Parents provided informed consent for participation in the study, following genetic counseling. Study design and participants The Acute Care Genomics program is a national multi-site study delivering ultra-rapid genomic testing to critically ill pediatric patients with…
Next-Generation Sequencing (NGS)- Definition, Types, Applications, Limitations
What is Next-Generation Sequencing (NGS)? Next-Generation Sequencing (NGS), also known as high-throughput sequencing, has revolutionized the field of genomics and molecular biology by allowing the sequencing of thousands to millions of DNA molecules simultaneously. It encompasses a range of different sequencing technologies, all aimed at producing large amounts of sequence…
Epigenetic dysregulation from chromosomal transit in micronuclei
Cell culture Cell lines (MDA-MB-231, 4T1 and RPE-1) were purchased from the American Type Culture Collection (ATCC). TP53-knockout MCF10A, TP53-knockout RPE-1 and Trex1 knockout 4T1 cells were gifts from the Maciejowski laboratory at the Memorial Sloan Kettering Cancer Center (MSKCC). OVCAR-3 cells were a gift from J. D. Gonzales. All…
Heritable transcriptional defects from aberrations of nuclear architecture
Cell culture and cell line construction Cells were cultured at 37 °C in 5% CO2 atmosphere with 100% humidity. Telomerase-immortalized RPE-1 retinal pigment epithelium cells (CRL-4000, American Type Culture Collection), U2OS osteosarcoma cells (HTB-96, American Type Culture Collection) and derivative cell lines were grown in DMEM/F12 (1:1) medium without phenol red…
Masking before RNA-Seq Alignment and Gene Prediction in Plant Genomes
Hello Experts, I’m interested in conducting gene prediction for a plant genome using RNA sequencing data. To achieve this, I intend to perform RNA sequencing alignment and employ BRAKER2 for gene prediction. Before running BRAKER2, I’m considering applying soft-masking…
Reading same BAM file twice with htslib
I would like to read iteratively the same BAM file twice using htslib. My option is to use hts_open, sam_hdr_read, then sam_read1 as much as needed to read the file until EOF. Is there a possibility to “rewind”, and use sam_read1 again from the beginning of the file, without closing…
Gviz Coverage Plots
I have two BAM files from single-cell RNA sequencing, mapped to the reference genome using CellRanger. I can view them in IGV, and there is a particular region where the pattern of reads mapped to the reference genome is different between the two BAM files, but when I try the…
Bioinformatics Analyst III job with US Tech Solutions
Job Description and title: Bioinformatics analyst. 3 month contract with possibility of extension. Typically between 9AM-5PM Central Time. We have an exciting contract opportunity for a Bioinformatics analyst/programmer to support the Emerging Technology Group in GRC. Primary responsibilities include providing analysis support for running standard pooled CRISPR screening…
How to split a scRNA reads BAM or FASTQ file to a separate file for each cell by cell barcode?
Hello everyone, I have a sample of scRNA-seq data (A. thaliana) generated by 10X Genomics. The data is composed of R1 (cell barcodes and UMIs) and R2 (actual reads)…
Convert Bam file to Fastq
I used samtools to convert Bam paired files to Fastq, and I’d like to see if the conversion was successful. I attempted to check the read names with the following commands: comm -1 -2 bam_read_names.txt (paste fastq1_read_ids.txt fastq2_read_ids.txt | sort) > common_read_names.txt However, the…
How to calculate TPM from featureCounts output
I would like to find the TPM counts for the GSE102073 study. When I downloaded the raw data from GEO, the raw data were featureCounts output. First part of the file: # Program:featureCounts v1.4.3-p1; Command:"/data/NYGC/Software/Subread/subread-1.4.3-p1-Linux-x86_64/bin/featureCounts" "-s" "2" "-a" "/data/NYGC/Resources/ENCODE/Gencode/gencode.v18.annotation.gtf" "-o" "/data/analysis/LevineD/Project_LEV_01204_RNA_2014-01-30/Sample_JB4853/featureCounts/Sample_JB4853_counts.txt" "/data/analysis/LevineD/Project_LEV_01204_RNA_2014-01-30/Sample_JB4853/STAR_alignment/Sample_JB4853_Aligned.out.WithReadGroup.sorted.bam"…
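For readers with the same question, the usual TPM calculation can be sketched as below. This is a minimal illustration, assuming the featureCounts table supplies a per-gene length (its Length column) alongside the raw count; the function name `tpm` and the toy numbers are hypothetical and not taken from the post.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to TPM, given feature lengths in base pairs."""
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    # Step 1: reads per kilobase (RPK), i.e. count divided by length in kb.
    rpk = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]
    total = sum(rpk)
    if total == 0:
        return [0.0] * len(counts)
    # Step 2: rescale so that all values in the sample sum to one million.
    return [r / total * 1e6 for r in rpk]


# Toy example with made-up counts and lengths (illustrative only).
example = tpm([100, 200, 0], [1000, 4000, 500])
print(example)
```

Because the length normalisation happens before the per-sample scaling, TPM values always sum to one million within a sample, which is what distinguishes TPM from FPKM.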
Filtering mitochondrial reads from ATAC-Seq aligned reads- what to do with reads that have MT in RNEXT field
Hi all, I am trying to filter mitochondrial reads from my ATAC-seq data after trimming with Trimmomatic and then aligning with Bowtie2. After searching through many pipelines, I have found 2 ways that people often do this (both using inputs that are sorted and indexed BAM files): 1) with samtools…
“index/Trinity.fa” does not exist or is not a Bowtie 2 index Exiting now …
Hi, I am new to using the Trinity tool, and I went through a tutorial to run and assess the assembly. I already generated the Trinity.fasta file, then # Build Index using this: bowtie2-build ~/workdir/trinity/trinity_out_dir/Trinity.fasta index/Trinity.fa…
Why is coordinate sort required before finding read depths?
I have a WGS dataset, and when I attempt to use it with the sambamba depth command, it gives the error sambamba-depth: All files must be coordinate-sorted. What is the reason for this, and why is coordinate sorting required?
hardfilter error
I generated the VCF file without adding read groups with Picard beforehand, and I realized this was wrong while applying the hard filter. I did some research and was told to re-call from the BAM, but I searched and couldn’t find any information about it. Can you…
Reads with highest MAPQ values from SAM files are showing mismatches to reference sequence and IGV classified them as supplementary reads
Hi all, I am expressing a GFP synonymous variant library in human cells and sequencing its RNA on the nanopore and I am having some trouble analysing the data. Initially, I basecalled all the fast5 files using the super accuracy model in the guppy basecaller, then I discarded the reads…
Not annotated metagenome-assembled genomes recovered from rumen samples from cows
Protozoa comprise a major fraction of the microbial biomass in the rumen microbiome, of which the genus Entodinium has been consistently observed to be dominant across a diverse genetic and geographical range of ruminant hosts. Despite the apparent core role that species such as Entodinium caudatum exert, their greater…
RNASeq gene labeling and mRNA filter from bulkRNA data.
Hello, Currently, I have BAM files sent to me by a sequencing company (I have access to the fastq files as well, if those are required), and I generated a count matrix using the Rsubread package function featureCounts(). I have also…
BAM creation – vg surject vs vg mpmap output
I have a graph that I am mapping RNA seq reads to and I want to create a BAM for a comparison study. Is there any difference between using vg surject, and selecting BAM as the output format when using…
Bioinformatics Analyst II (Remote) Position In North Chicago , IL
Job Description To discuss more about this job opportunity, please reach out to Chitrank Rastogi (LinkedIn URL – www.linkedin.com/in/chitrank-rastogi-55119a102/), email your updated resume at Email – chitrank.rastogi@collabera.com or give me a call at (425) 523-1648. Thank you! Job Description:Job Roles & Responsibilities: We have an exciting contract opportunity for a…
What Are The Most Common Stupid Mistakes In Bioinformatics?
While I of course never have stupid mistakes…ahem…I have many “friends” who: forget to check both strands; generate random genomic sites without avoiding masked (NNN) gaps; confuse genome freezes and even species; but I’m sure there are some other very…
Yersinia pestis genomes reveal plague in Britain 4000 years ago
All radiocarbon dates were calibrated in OxCal 4.4 using the IntCal20 calibration curve (refs. 18, 19). There is no stable carbon and nitrogen isotopic evidence for any detectable input of marine or freshwater foods that would require a correction for reservoir effects. Charterhouse Warren: Archaeological context. Charterhouse Warren is a natural shaft in…
PacBio Pipeline and Tools for Variant Call
Hi, I am new to long-read sequencing. I am trying to call variants on GIAB trio samples from PacBio data. Initially I aligned the reads with the pbmm2 tool, then called variants with DeepVariant 1.5 and phased them with WhatsHap. My queries are as follows…
How to get a comparative result of 2 bed files?
I had a bed file, the output of a CNV prediction software, containing the columns name, chromosome, starting_index, ending_index and prediction (duplicate or deletion). I need to give a numerical result for this output. I have the bam, bam.bai and bed…
Tom York on Business: InventHelp Pitches Inventions by San Diego-Area Residents
The San Diego office of InventHelp, an invention service company that offers patent services to inventors, has announced a number of innovations by local residents that have commercial potential. The company reported in a news release that a Santee inventor has…
sorting BAM file
I tried sorting the BAM file I created using gatk-package-4.2.5.0-local.jar SortSam. The code is: java -jar gatk-package-4.2.5.0-local.jar SortSam -I {path.bam} -O {path.sorted.bam} -SO coordinate but I encountered this error: INFO: Failed to detect whether we are running on Google Compute Engine. [Tue May 30 16:08:32…
weird behaviour on bedtools
Hi All, I want to extract the counts on defined regions (my.bed), but I’m not able to understand the results. The command is meant to extract the counts from the BAM files using the BED file, and the overlap with each BED region has to be >=80%. multiBamCov…
htslib-1.15.1-2.fc38 – Fedora Packages
Package Info (Data from x86_64 build). Changelog: 2023-01-19, Fedora Release Engineering <releng at fedoraproject dot org>, 1.15.1-2: Rebuilt for fedoraproject.org/wiki/Fedora_38_Mass_Rebuild. 2022-08-15, John Marshall <jmarshall at hey…
failed to find the gene identifier attribute
Hello, I made my own GTF file from HMMER results and used it to calculate the abundance of genes from the annotated features in my GTF file with the featureCounts program. The error message that I got is the following: featureCounts…
Demultiplexing bam file
Hi everyone, I have NIPT data from multiple samples which were sequenced on Ion Torrent. All of them were aligned together using bowtie. I want to separate every sample from the aligned BAM file, which contains all samples together. Part of the header of the…
DownsampleSam
I am trying to run DownsampleSam with Picard version 2.26.5 using the following script, and I get an error about Provider GCS. Code: import os,sys from multiprocessing import Pool # the original depth of NA12878 and YH-1 nadepth=812.40 yhdepth=407.25 work_dph=int(sys.argv[1]) napercent = [0.99,0.95,0.90,0.80,0.70,0.60] yhpercent = [0.01,0.05,0.10,0.20,0.30,0.40] #yhpercent = [0.05] NAtotal="/media/marina/marina2TB/BIOFILES/bams/na.addRG.mdup.bam"…
Phylogenomic analysis supports Mycobacterium tuberculosis transmission between humans and elephants
1. Introduction Tuberculosis (TB) is a significant global burden and is widely reported to be a major public health and economic problem, costing the world $617 billion between 2000 and 2015 and projected to cost $1 trillion between 2015 and 2030 (1). It is the second leading cause of death…
Differentially expressed gene with high log2foldchange by DESeq2, but not meaningful at the individual level
Hi all, I am working with RNA-Seq data from humans (24 cases, 20 controls) to find differentially expressed genes. My RNA-Seq data is unstranded. Here are the commands that I used to align the fastq files: ls *_1P.fastq.gz | parallel --bar -j8 'R2=$(echo {} | sed s/_1/_2/) && out=$(echo {} |…
ExomeDepth error in getBamCounts when adding a fasta reference
Whenever I try to add a reference fasta file for computing the GC content in the getBamCounts function: my.countsV6 <- getBamCounts(bed.frame = AgilentV6, bam.files = BAMFiles, include.chr = TRUE, referenceFasta = "data/hg19.fa") I get an error like this: Reference fasta…
Ubuntu Manpage: removeDup – Remove duplicated reads
Provided by: subread_2.0.0+dfsg-1_amd64 NAME removeDup – Remove duplicated reads USAGE removeDup [options] -i <input_file> -o <output_file> Required arguments: -i <string> Name of input file in SAM/BAM format. -o <string> Name of output SAM file including filtered reads. The format is BAM unless ‘-S’ is specified. Optional arguments: -S Generate the…
Mindlance hiring Bioinformatics Analyst in United States
Title: Bioinformatics Analyst. Duration: 3 months with extension. Location: Remote. Description: We have an exciting contract opportunity for a Bioinformatics analyst/programmer to support the Emerging Technology Group in GRC. Primary responsibilities include providing analysis support for running standard pooled CRISPR screening data and differential expression analysis of bulk RNAseq data, and performing…
Chipseq data peak calling issue
Hi, I’m trying to analyse ChIP-seq data. I have 3 samples: sample1, sample2 and input. I have done QC and then alignment using Bowtie. After that I used samtools to get BAM files. Then I used Picard for duplicate removal. Now I…
samtools installed but error message: Library not loaded: @rpath/libcrypto.1.0.0.dylib
I would like to convert input.sam to output.bam but I get the following error message: kk$ samtools view -bS input.sam > output.bam dyld[88970]: Library not loaded: @rpath/libcrypto.1.0.0.dylib Referenced from: /Users/kk/opt/anaconda3/bin/samtools Reason: tried: '/Users/kk/opt/anaconda3/bin/../lib/libcrypto.1.0.0.dylib' (no such file), '/Users/kk/opt/anaconda3/bin/../lib/libcrypto.1.0.0.dylib' (no such file), '/usr/local/lib/libcrypto.1.0.0.dylib'…
VS-Bioinformatics Analyst (Remote) – Rangam Infotech Private Limited
“Applicants must be authorized to work for ANY employer in the U.S. We are unable to sponsor or take over sponsorship of an employment Visa at this time.” Remote 3 month contract with possibility of extension We have an exciting contract opportunity for a Bioinformatics analyst/programmer to support the Emerging…
snakemake restricting resources for specific rule
I am trying to generate several snakemake rules. However, I would like to include some restrictions due to storage limitations. I would first like to copy the BAM files, 1 for each sample, to a staging area. However, while I have over 50 samples, I can only process three samples…
Genes’ fpkm values through cufflink
Hi, I am a newbie to RNA-seq data analysis. I have to identify differentially expressed genes (DEGs) between human and chimpanzee in a tissue type. I have comparable RNA-seq experiment data (reads/fastq) for the two species. Each species has 2 biological replicates(each with three technical replicates) so six runs per…
Samtools index and sambamba-depth error?
I have the following script: samtools index dataset01.bam basename dataset01.bam f="$(basename -- dataset01.bam)" sambamba depth base -L genomic.bed dataset01.bam > ./read_depths/"$f.txt" I run this script with a whole genome dataset and after that it gives the following error: samtools index: failed to create or…
Sr. Bioinformatics Engineer II Job Opening in Cambridge, MA at ModernaTX, Inc.
Job Description: The Role In this role, you will develop bioinformatics pipelines and implement the latest algorithms. You will work closely with computational biologists, statisticians, bioinformaticians, and scientists within Oncology and software engineers in our Digital organization to develop our next-generation bioinformatics pipelines. Here’s What You’ll Do Develop, test, and…
how to run FacetsSuite wrapper scripts on command line
Hi, I am very new to using FACETS. I was wondering if there is a way to run the Rscript snp-pile-wrapper.R directly on the command line without installing facets using conda or from GitHub. Is this possible? I’ve tried running…
Inquiry Regarding Somatic Analysis and Normal Sample Requirement
Dear All, I have two questions: I plan to perform somatic analysis on 10 different tumor BAM files to determine the SNPs, indels, and CNVs. Is it necessary to have 10 distinct normal files corresponding to each tumor BAM file, or…
Off-target % for whole-exome sequencing panel
Hi all, My samples have been sequenced with the Twist Exome 2.0 sequencing panel, and I want to assess how efficient the sequencing has been. By efficient, I mean examining metrics such as the uniformity (fold-80 score), coverage, on/off-target % etc. as documented…
The wheat stem rust resistance gene Sr43 encodes an unusual protein kinase
Mutant collection development We mutagenized 2,700 seeds of the wheat–Th. elongatum introgression line RWG34 containing Sr43 (ref. 29). Dry seeds were incubated for 16 h with 200 ml of a 0.8% (w/v) EMS solution with constant shaking on a Roller Mixer (Model SRT1, Stuart Scientific) to ensure maximum homogenous exposure of the…
BED file showing an error while performing the FPKM count in Galaxy Europe
When I run the FPKM Count program on the Galaxy Europe website, I get the following error: [W::hts_idx_load3] The index file is older than the data file: input.bam.bai Extract exon regions from /data/dnb08/galaxy_db/files/f/b/4/dataset_fb483fc1-0b18-4aa6-aa5c-c2b9fd8047da.dat……
An unusual tandem kinase fusion protein confers leaf rust resistance in wheat
Plant material Bread wheat accessions Transfer (TA5524), WL711, TA5605, Ae. umbellulata accession TA1851 and Ae. triuncialis accession TA10438 were obtained from the Wheat Genetics Resource Center (WGRC). TcLr9 (Transfer/6*Thatcher) is a near-isogenic line carrying Lr9 from Transfer in the genetic background of the susceptible wheat line Thatcher. TcLr9 and TA5605…
Concatenating fastq for the same sample or doing it separately and merge at the BAM stage ?
Dear All, I am confused about one issue I have encountered. I have samples that were sequenced 3 times on 3 lanes to attain the required depth. I am running a pipeline to check fastq…
ChIP-Seq
ChIP-Seq Input Data (Reference Feature). LiftOver: We provide on-the-fly lift-over of reference data sets between different genome assemblies for broader comparison among annotations. Upload custom Data, File Format: All ChIP-seq tools use SGA (Simplified Genome Annotation) files as an internal working format. SGA input…
Generating variant read count matrix, total read count matrix and binary/ternary mutation matrix for SNV from scDNAseq FASTQ files
Fig. 1 of the Leung et al., 2017 paper describes the data processing for CRC patients, who were sequenced at the single-cell level for both SNV (with MDA WGA) and CNA (with DOP-PCR) in parallel….
Index of /Atumefaciens/20230426-pgen-HISAT2-stringtie-gffcompare-RNAseq/heart
Name                      Last modified     Size
Parent Directory                            -
e2t.ctab                  2023-04-28 14:22  2.7M
e_data.ctab               2023-04-28 14:22  14M
heart-hisat2_stats.txt    2023-04-28 13:58  647
heart.cov_refs.gtf        2023-04-28 14:22  5.4M
heart.gtf                 2023-04-28 14:22  38M
heart.sorted.bam          2023-04-28 14:07  12G
heart.sorted.bam.bai      2023-04-28 14:11  2.0M
heart_checksums.md5       2023-04-28 14:23  484
…
Principal component analysis using plotPCA of deepTools
Hello everyone, I want to generate a PCA plot for multiple BAM files. I used deepTools’ multiBamSummary to find signal coverage over genomic bins. Then, I supplied the output compressed numpy array (.npz) file to plotPCA from deepTools. I got the required…
BAMboozle
Hi, I am running BAMboozle to anonymize variant sequences using the GRCh37 human reference genome on my BAM files. My BAM files are originally 2-3 GB, but the output BAM file I get from BAMboozle is 500-600 KB. Does BAMboozle decrease the size of the bam…
Extract paired-end reads with one end falling within a given region from a BAM file
I have paired-end sequencing reads, and the two ends of a read may not come from the same region. I want to extract those that have at least one end mapping to a certain region…
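One way to approach this kind of question is to test both the read's own position and its mate's position, which the SAM format stores in the RNAME/POS and RNEXT/PNEXT columns. The sketch below is only an illustration under simplifying assumptions: it works on plain SAM text lines, compares 1-based leftmost start positions rather than full alignment spans (so it ignores the CIGAR), and the function name and toy records are hypothetical.

```python
def pair_touches_region(sam_line, chrom, start, end):
    """Return True if the read or its mate starts inside chrom:start-end (1-based, inclusive)."""
    fields = sam_line.rstrip("\n").split("\t")
    rname, pos = fields[2], int(fields[3])
    rnext, pnext = fields[6], int(fields[7])
    # In SAM, "=" in RNEXT means the mate is on the same reference as the read.
    if rnext == "=":
        rnext = rname
    read_hit = rname == chrom and start <= pos <= end
    mate_hit = rnext == chrom and start <= pnext <= end
    return read_hit or mate_hit


# Toy alignment records (made up for illustration).
records = [
    "r1\t99\tchr1\t150\t60\t4M\t=\t900\t754\tACGT\tFFFF",   # read inside region
    "r2\t99\tchr2\t500\t60\t4M\tchr1\t180\t0\tACGT\tFFFF",  # only the mate inside
    "r3\t99\tchr2\t500\t60\t4M\t=\t700\t204\tACGT\tFFFF",   # neither end inside
]
selected = [r.split("\t")[0] for r in records if pair_touches_region(r, "chr1", 100, 200)]
print(selected)
```

Selecting by read name afterwards (keeping every record whose name was selected) would recover both ends of each pair, including ends that map elsewhere.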
An integrated tumor, immune and microbiome atlas of colon cancer
Samples used in this observational cohort study (tumor tissue and matched healthy colon tissue, AC-ICAM cohort) are from patients with colon cancer diagnosed at Leiden University Medical Center, the Netherlands, from 2001 to 2015 who did not object to future use of human tissues for scientific research and who were…
Paired-end reads were detected in single-end read library
Hi all, I run a pretty similar command for two RNA-seq data sets: featureCounts -a Homo_sapiens.GRCh38.107.gtf -o count.out -T8 /mapped/*.bam I only change the bam files and both RNA-seq data are paired-end reads. However, one command got the error…
Babraham Bioinformatics – FastQC A Quality Control tool for High Throughput Sequence Data
FastQC. Function: A quality control tool for high throughput sequence data. Language: Java. Requirements: A suitable Java Runtime Environment; the Picard BAM/SAM Libraries (included in download). Code Maturity: Stable. Mature code, but feedback is appreciated. Code Released: Yes, under GPL v3 or later. Initial Contact: Simon Andrews…
filter reads in BAM having a tag
Does anyone have a simple solution for filtering reads in a BAM/SAM file that have a certain TAG? This came up when trying to filter out reads from 10x without a proper CB tag defined (which is causing trouble in downstream analysis tools). I’m surprised…
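The idea in this question can be sketched on SAM text, where optional tags follow the 11 mandatory columns as TAG:TYPE:VALUE fields. The snippet below is a minimal illustration and not the answer given in the thread; it assumes the input is uncompressed SAM lines (for example, decoded from a BAM file beforehand), and the function names and toy records are hypothetical.

```python
def has_tag(sam_line, tag="CB"):
    """Return True if an alignment line carries the given optional tag."""
    fields = sam_line.rstrip("\n").split("\t")
    # Columns 1-11 are mandatory; optional TAG:TYPE:VALUE fields come after them.
    return any(field.startswith(tag + ":") for field in fields[11:])


def filter_sam(lines, tag="CB"):
    """Yield header lines unchanged, plus only the alignments that carry `tag`."""
    for line in lines:
        if line.startswith("@") or has_tag(line, tag):
            yield line


# Toy SAM content (made up for illustration).
records = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tFFFF\tCB:Z:AAAC-1",
    "r2\t0\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tFFFF\tNM:i:0",
]
kept = list(filter_sam(records))
print("\n".join(kept))
```

Header lines are passed through so that the filtered output can still be converted back to BAM.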
Fast BAM header editing
I would like to change some information in the BAM header (specifically, chromosome size information). Is there a smarter/faster way to do this without extracting to an intermediate SAM file and remaking the BAM file from the edited SAM file?
CABANA workshop: Advanced RNAseq and network analysis in genomics
This course will provide training on RNAseq data production and interpretation. The course starts with a brief introduction to RNA-seq and discusses quality control issues. Next, we will present the alignment step, quantification of expression and differential expression analysis. We will dedicate some time to analysing and constructing networks from…
Answer: Estimate sizes of repeats in a specific Gene
Tell me if I’m in the way. I have the CRAM file and the respective CRAI (index). So I just ran the SAM like this, clipping my area of interest: > $ samtools view -b NG1PSZ7BE9.mm2.sortdup.bqsr.cram “chrX:147912050-147912110” > result.bam Then I indexed the .bam file: > $ samtools index result.bam…
The Glue of Genomics: Will Science’s Unsung Data Heroes Abandon Academia?
Advances in the last twenty years of genomics have turbocharged our ability to decode DNA and other nucleic acids. At the turn of the century, the Human Genome Project completed a 13-year journey to produce the first complete human genome at a great financial cost. In 2023, genome sequencing is a routine…
Estimate sizes of repeats in a specific Gene
Amateur problem here: We know that it is possible to use the ExpansionHunter tool to estimate sizes of such repeats by performing a targeted search through a BAM/CRAM file for reads that span, flank, and are fully contained in each repeat….
Help with Diffbind result
Hi all, I am trying to find which genes differ in chromatin accessibility, so I used the nf-core/ATAC pipeline and then fed the BAM files and broadPeak files it output into Diffbind. I got around 100k peaks, 21k genes…
multi-mapping reads settings in Rsubread or Rsubjunc
Hi All, I am using Rsubjunc to process my RNA-seq data for DESeq2 and differential splicing analysis. I have a question about how to set multi-mapping read alignment in the Rsubjunc R package. The command I used is attached to the end…
Kallisto Pseudoalignment vs. STAR Alignment
Accuracy of Methods for Gene Variant Detection and Quantification: Kallisto Pseudoalignment vs. STAR Alignment. I am analyzing the RNAseq data to quantify transcript variants for a specific gene. Ensembl database shows that it has 11 transcripts (splice variants). First I aligned the RNAseq data using Kallisto to quantify transcripts….
Methanol fixation is the method of choice for droplet-based single-cell transcriptomics of neural cells
hiPSC cell culture and differentiation. hiPSCs were maintained on 1:40 matrigel (Corning, #354277) coated dishes in supplemented mTeSR-1 medium (StemCell Technologies, #85850) with 500 U ml−1 penicillin and 500 mg ml−1 streptomycin (Gibco, #15140122). For the differentiation of cortical neurons the protocol described previously (ref. 21) was followed with slight modifications. Briefly, hiPSC colonies were seeded…
Paternity Testing from WGS Trio
It is definitely possible to assess paternity from whole genome sequence (WGS) data. Paternity can probably be established with as little as a few dozen or maybe hundreds of well-chosen single nucleotide polymorphisms (SNPs). If you have decent WGS data you can expect to genotype millions of SNPs. So, paternity…
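The reasoning above can be illustrated with a simple Mendelian-consistency count over SNP genotypes. This is a rough sketch under stated assumptions, not the method from the answer: genotypes are assumed to be already called as unphased allele pairs for child, mother and alleged father, the function names are hypothetical, and the toy genotypes are made up. Real data contain genotyping errors, so a small number of inconsistent sites does not by itself exclude paternity.

```python
def mendelian_consistent(child, mother, father):
    """True if one child allele can come from the mother and the other from the father."""
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)


def inconsistency_rate(trio_sites):
    """Fraction of SNP sites where the trio genotypes violate Mendelian inheritance."""
    sites = list(trio_sites)
    if not sites:
        return 0.0
    bad = sum(1 for child, mother, father in sites
              if not mendelian_consistent(child, mother, father))
    return bad / len(sites)


# Toy genotypes per site: (child, mother, alleged father); illustrative only.
sites = [
    (("A", "G"), ("A", "A"), ("G", "G")),  # consistent: A from mother, G from father
    (("C", "C"), ("C", "T"), ("T", "T")),  # inconsistent: father cannot supply C
    (("T", "T"), ("T", "T"), ("T", "C")),  # consistent
]
rate = inconsistency_rate(sites)
print(rate)
```

With millions of genotyped SNPs, a true father should show an inconsistency rate close to the genotyping error rate, while an unrelated man should show a clearly higher one.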
Edit and re-head BAM file
Hi there, I have a BAM file which needs to be edited and re-headed. I’m aware of how to do so; the problem is that, for some reason, the sed command I’m using does not catch the sequence I have to remove… Below,…
Intersecting transcriptome bam file with GTF file
Hello, I aligned artificial RNA-seq reads against the genome and transcriptome using STAR. STAR also generated a transcriptome.bam file. I want to know which ENST transcripts intersect a specific part of the GTF, so I have subset the GTF file. I have tried bedtools intersect, and…
Predictive network analysis identifies JMJD6 and other potential key drivers in Alzheimer’s disease
Maximum Read Depth
Hi, I wish to get the variants from my sequencing data using Samtools (vcfutils). For this I need to specify the maximum read depth in the VarFilter function. I have no information about this from my data. Can anybody tell me how I should approach its…
Cost-effective Whole Exome Sequencing discovers pathogenic variant causing Neurofibromatosis type 1 in a family from Jammu and Kashmir, India
Element Biosciences Accelerates Data Discoveries with Amazon Omics
Element Biosciences leverages Amazon Omics to democratize genomic data by offering an innovative federated compute model SAN DIEGO, May 15, 2023 /PRNewswire/ — Element Biosciences, Inc.— developer of the Element AVITI™ System, an innovative DNA sequencing platform that is disrupting the genomics industry — announced it is now leveraging a…
Haplotype specific alleles
Dear all, Do you know of any tool that provides allelic frequency information within a haplotype? > Input: Bam files > Output: allele1 – pos A/T AF=0.5, pos A/C AF=0.3, pos C/G AF=0.98 allele2 – pos A/C AF=0.3, pos………. I tried Shorah tool, but if I…
How to extract regions from BAM file on remote server and visualize long-read alignments in IGV?
Hello, I have code that indicates there are a number of SNP mutations at specific locations for some cell types and none for others. I need to align all the respective long reads at…
Prevalence of BRCA homopolymeric indels in an ION Torrent-based tumour-to-germline testing workflow in high-grade ovarian carcinoma
Patients cohort Among consecutive patients who underwent BRCA tumour testing through ION Torrent-based sequencing between August 2017 and February 2022, we retrospectively selected 222 high-grade ovarian cancer (HGOC) patients with the following histological subtypes: 203 serous (HGSOC), seven endometrioid, five clear-cell and seven with mixed histotypes. Since NGS BRCA1/2 tumour…
Integrated microbiome-metabolome-genome axis data of Laiwu and Lulai pigs
Animal rearing and sample collection Our experiment was designed to compare eight female Laiwu pigs (LW) with eight female Lulai pigs (LU), which are a crossbreed between the LW and Yorkshire breeds. All pigs were born and raised for approximately two years (715 ± 33 days, Table 1) under uniform housing and feeding conditions at Jing-Qi-Shen…
BBmap params for accurate MAG relative abundance estimation?
hi folks, Recently, I’ve been working on developing molecular and computational methods to improve the accuracy of relative abundance estimation of metagenome assembled genomes (MAGs) from NGS datasets. Most of this work is focused on host-associated viral metagenomics, where quite a bit of nucleic acid manipulation and amplification is needed…
Whole-genome sequencing in B-cell lymphomas for circulating tumor DNA analysis by multiplex digital PCR for disease monitoring
WGS data of paired tumor and normal samples in 9 patients (6 diffuse large B-cell lymphomas, 1 transformed follicular lymphoma and 2 follicular lymphoma) with a median depth of 37X (range: 27-50X). The study was approved by the Stockholm Regional Ethical committee (2017/2538-31) and conducted in accordance with the Declaration…
samtools and Rsamtools in R
samtools and Rsamtools in R 2 Hello, How can I use Rsamtools to import the results from a bam file? E.g. if I use samtools view sample.bam > sample.sam I can import this sam file easily into R. However, if I run Rsamtools, and do something like input_bam…
bash – CNV Kit `from . import commands ImportError: cannot import name 'commands' from '__main__'`
I am trying to run some code for my colleague in bash: /path/to/cnvkit.py batch /path/to/my/folder/with/bams/*.bam \ --normal --targets ${bed_file} \ --fasta path/to/my/resources_broad_hg38_v0_Homo_sapiens_assembly38.fasta \ --output-reference /path/to/my/CD_BATCH1_reference.cnn \ --output-dir /path/to/my/Group_1 --scatter However, I keep getting this truly peculiar error: Traceback (most recent call last): File "/path/to/cnvkit.py", line 4, in <module> from ….
GATK4 ASEReadCounter Function of VCF
GATK4 ASEReadCounter Function of VCF 1 Hi all, I am using GATK4 ASEReadCounter to create the inputs for an ASE analysis. I understand that the inputs are pre-processed + mapped .bams, reference genome and a VCF for specific sites to be processed (As per GATK documentation). I was wondering what…
Removing multi-variant records from vcf file
Removing multi-variant records from vcf file 3 I am using gatk ASEReadCounter to get the read counts per allele. To do so, I used the following command: gatk ASEReadCounter -R /path_to_genome/hg38_genome/GRCh38.p13.genome.fa -I sample.sorted.bam -V sample.vcf.gz -O output.table I used GATK4, but I realized that in my VCF at position chr1:1574033, there…
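The question above is truncated, so the following is only a minimal sketch under an assumption: that "multi-variant records" means either a multi-allelic record (a comma in the ALT column) or several records sharing the same CHROM and POS. The function name `drop_multi_variant_records` and the example records are hypothetical, and the sketch works on plain VCF text lines rather than on a compressed `.vcf.gz` file.

```python
def drop_multi_variant_records(vcf_lines):
    """Keep header lines, and keep only records that are biallelic and
    whose (CHROM, POS) occurs exactly once in the input.

    Assumption: 'multi-variant' means a multi-allelic ALT or a position
    that appears in more than one record.
    """
    lines = list(vcf_lines)

    # First pass: count how many records share each (CHROM, POS).
    site_counts = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos = line.split("\t")[:2]
        site_counts[(chrom, pos)] = site_counts.get((chrom, pos), 0) + 1

    # Second pass: keep headers and the records that pass both checks.
    kept = []
    for line in lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        if not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, alt = fields[0], fields[1], fields[4]
        if "," in alt:                      # multi-allelic record
            continue
        if site_counts[(chrom, pos)] > 1:   # position seen more than once
            continue
        kept.append(line)
    return kept


# Hypothetical example records (not taken from the original post).
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\n",
    "chr1\t200\t.\tC\tT,G\t50\tPASS\t.\n",
    "chr1\t300\t.\tG\tA\t50\tPASS\t.\n",
    "chr1\t300\t.\tG\tC\t50\tPASS\t.\n",
]
filtered = drop_multi_variant_records(example)
```

Here only the record at position 100 survives alongside the two header lines. Whether dropping such sites is appropriate depends on the downstream analysis, which the excerpt does not show.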
Could not get first alignment from target
Can you share some of the image as text for easier understanding? It seems like there might be an issue with your BAM file or the region you are trying to call variants on. To help diagnose the issue, please follow these steps: 1. Check if your BAM file is…
bcftools get allele abundance
I’m using bcftools to extract variants from a bam file, but I have reference data that tells me whether the patient is homozygous or heterozygous. For a particular sample, I see a high proportion of the alternate allele (87%) and a lower proportion of the reference allele (13%), yet according…
Multimodal perturbation analyses of cyclin-dependent kinases reveal a network of synthetic lethalities associated with cell-cycle regulation and transcriptional regulation
Phylogenetic tree construction Tree diagram showing relationships between CDK proteins was constructed from a multi-sequence alignment (MSA) using Geneious95. The “Geneious Aligner” was used to generate the MSA, and the neighbor joining method was used to construct the tree. All default parameters were used except where otherwise indicated. Combinatorial CRISPR…
A draft human pangenome reference
Sample selection We identified parent–child trios from the 1KG in which the child cell line banked within the NHGRI Sample Repository for Human Genetic Research at the Coriell Institute for Medical Research was listed as having zero expansions and two or fewer passages, and rank-ordered representative individuals as follows. Loci…
Sub-sampling a BAM to a fixed number of reads
Sub-sampling a BAM to a fixed number of reads 0 Hi all, I have recently been trying to write a simple Python script that can produce contaminated synthetic paired end reads for a tool my group are creating. However, in writing this script I’ve been using the following: samtools view…
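One way to draw a fixed number of reads, rather than a fraction, is reservoir sampling over read names. The sketch below is not the poster's script: it assumes the read names can be streamed once (for example from the first column of `samtools view` output) and that sampling by name is acceptable, so that both mates of a pair are kept together. The function name `reservoir_sample` and the example names are hypothetical.

```python
import random


def reservoir_sample(items, k, seed=None):
    """Return k items chosen uniformly at random from an iterable of
    unknown length, using a single pass (Algorithm R).

    If the iterable holds fewer than k items, all of them are returned.
    """
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(items):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Hypothetical read names standing in for the names in a BAM file.
read_names = [f"read{i}" for i in range(1000)]
chosen = reservoir_sample(read_names, 100, seed=1)
```

A second pass over the alignments would then keep only the records whose name is in `set(chosen)`. Fixing `seed` makes the selection reproducible between runs.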
No differentially expressed genes after multiple testing correction in mice
No differentially expressed genes after multiple testing correction in mice 0 Hi all, I am working with the RNA-seq data on mice (group A N=3 vs group B N=3). Mice are littermates, of which group A overexpresses a human transgene which I verified. I have had .cram files from mouse…
Help with error in GATK variant calling
Help with error in GATK variant calling 1 Hi all, I tried several reference genomes, such as Homo_sapiens_assembly38.fasta and Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa, but I still get the error below. Do you have any suggestions? Thank you so much. The link in the error message doesn’t work. gatk BaseRecalibrator -I Library_1Aligned.out.sorted.bam -R…
Average bigwig files (not sum)
Average bigwig files (not sum) 1 Hello, I have bigwig (RPKM) files of a ChIP-seq experiment for treatment and control conditions which I am trying to compare. I have 3 replicates for control and 5 replicates for treatment condition. To show the average difference in signal, I merged the replicates…
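The distinction between summing and averaging matters here because the groups have different numbers of replicates (3 versus 5): a per-bin sum grows with the number of replicates, while a per-bin mean does not. The sketch below illustrates only that arithmetic on plain Python lists. It assumes each replicate has already been reduced to signal values over the same bins; reading the bigwig files themselves would need a separate library and is not shown. The function name `average_tracks` and the values are hypothetical.

```python
def average_tracks(tracks):
    """Return the per-bin mean across replicate tracks.

    Each track is a list of signal values over the same bins, so all
    tracks must have the same length.
    """
    if not tracks:
        raise ValueError("at least one track is required")
    n_bins = len(tracks[0])
    if any(len(track) != n_bins for track in tracks):
        raise ValueError("all tracks must cover the same bins")
    # zip(*tracks) groups the values of every replicate bin by bin.
    return [sum(values) / len(tracks) for values in zip(*tracks)]


# Hypothetical per-bin RPKM values for three control replicates.
control = [
    [1.0, 2.0, 3.0],
    [3.0, 2.0, 1.0],
    [2.0, 2.0, 2.0],
]
control_mean = average_tracks(control)
```

Applying the same function to the five treatment replicates gives a track on the same scale as `control_mean`, so the two means can be compared directly.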
In-depth Temporal Transcriptome Profiling of Monkeypox and Host Cells using Nanopore Sequencing
Figure 1 shows the detailed workflow of the study. Fig. 1 General overview of the study. Briefly, MPXV was isolated from a skin lesion and then was used to infect CV-1 cells. After the designated infection times, total RNA was isolated and sequenced using direct cDNA sequencing protocol on ONT’s MinION platform….