Tag: cram
Giants’ Brian Daboll treating OTAs like a ‘teaching camp’
This week, New York Giants head coach Brian Daboll began his second round of organized team activities (OTAs) with the team. There’s a big difference in the air from this time last year when everything and almost everyone was new to one another and their surroundings. The Giants entered Phase…
How to Split 3000 WGS CRAM files into 1Mbp length chunks
Hello, I have 3000 WGS CRAM files and I want to split them into 1 Mbp chunks. I want to split at exact genomic coordinates, e.g. 1 to 1000000 bp, 1000001 to 2000000 bp, 2000001 to 3000000 bp, etc.…
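The windowing described in this question can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `one_mbp_windows` and the example contig length are hypothetical, and real contig lengths would come from the CRAM header (the @SQ lines). Each yielded string uses the 1-based, inclusive `contig:start-end` form that samtools accepts as a region on an indexed file.

```python
from typing import Iterator


def one_mbp_windows(contig: str, length: int, size: int = 1_000_000) -> Iterator[str]:
    """Yield 1-based, inclusive region strings that tile a contig of `length` bp."""
    if length < 1 or size < 1:
        raise ValueError("length and size must be positive")
    for start in range(1, length + 1, size):
        # The last window is shorter when length is not a multiple of size.
        end = min(start + size - 1, length)
        yield f"{contig}:{start}-{end}"


if __name__ == "__main__":
    # Hypothetical contig of 2,500,000 bp, for illustration only.
    for region in one_mbp_windows("chr1", 2_500_000):
        print(region)
```

Because the windows are generated rather than hard-coded, the same list of regions can be reused for every one of the 3000 files, which keeps chunk boundaries identical across samples.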
Answer: Estimate sizes of repeats in a specific Gene
Tell me if I’m on the right track. I have the CRAM file and its CRAI index, so I ran samtools like this to extract my region of interest:
$ samtools view -b NG1PSZ7BE9.mm2.sortdup.bqsr.cram "chrX:147912050-147912110" > result.bam
Then I indexed the .bam file:
$ samtools index result.bam…
Estimate sizes of repeats in a specific Gene
Amateur problem here: We know that it is possible to use the ExpansionHunter tool to estimate sizes of such repeats by performing a targeted search through a BAM/CRAM file for reads that span, flank, and are fully contained in each repeat.…
Predictive network analysis identifies JMJD6 and other potential key drivers in Alzheimer’s disease
Cerejeira, J., Lagarto, L. & Mukaetova-Ladinska, E. B. Behavioral and psychological symptoms of dementia. Front. Neurol. 3, 73 (2012). Murphy, M. P. & LeVine, H. III Alzheimer’s disease and the amyloid-beta peptide. J. Alzheimers Dis. 19, 311–323 (2010).…
A draft human pangenome reference
Sample selection. We identified parent–child trios from the 1KG in which the child cell line banked within the NHGRI Sample Repository for Human Genetic Research at the Coriell Institute for Medical Research was listed as having zero expansions and two or fewer passages, and rank-ordered representative individuals as follows. Loci…
No differentially expressed genes after multiple testing correction in mice
Hi all, I am working with RNA-seq data from mice (group A N=3 vs group B N=3). The mice are littermates, and group A overexpresses a human transgene, which I verified. I have had .cram files from mouse…
Missing columns in meta table from SRA Selector
Unfortunately, there is no enforced standard for what metadata must make it into the SRA. It is very frustrating, actually, and makes reproducing any analysis needlessly complicated. You can look at what EBI fields are there, and sometimes they provide more fields than SRA: pip install bio then look at the…
Best Practices for CRAM <-> BAM
Hi, I am looking for advice about transitioning from BAM/BAI to CRAM for archival purposes. General advice is appreciated, but I’m specifically looking for answers to these two questions: Does samtools offer the best performance for converting to and from CRAMs? Do…
Comparing Alignment Files (CRAM)
Hello all, I checked different forums, and the general advice seems to be to use samtools or picard-tools for comparing alignment files. Here I want to compare the aligned output files from two different alignment algorithms. In this case, I had some general…
Issue With CRAM -> BAM -> FASTQ Conversion
Please help! I am trying to obtain FASTQ files from the GDSC; all we have in the lab is CRAM files. Unfortunately, the reference genome seems not to exist when pulled from an online source. I have attempted to use the…
Supported Tools – MultiQC
Removes adapter sequences and trims low quality bases from the 3′ end of reads. Overlapping paired-ended reads can be merged into consensus sequences and adapter sequence can be found for paired-ended data if not known.
Automatic Filtering, Trimming, Error Removing and Quality Control for fastq data.…
Newest ‘cram?tab=Active’ Questions – Bioinformatics Stack Exchange
storage – Good / recommended way to archive fastq and bam files?
The only free and open source tool I know that can help is zstd. Their github repository’s README describes it as: Zstandard, or zstd as short version, is a fast lossless compression algorithm, targeting real-time compression scenarios at zlib-level and better compression ratios. It’s backed by a very fast entropy…
The Biostar Herald for Monday, April 03, 2023
The Biostar Herald publishes user submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here. This edition of the Herald was brought to you by contribution from Istvan Albert, and was edited by Istvan…
Manta and alignment name collision
Dear community members, I received hundreds of CRAM files which I have to run through Manta SV calling and they fail due to “Unexpected alignment name collision” – this file contains tens (out of millions) of reads which were multi-mapped, so they have 2…
Running accurate, comprehensive, and efficient genomics workflows on AWS using Illumina DRAGEN v4.0
Introduction: The reduced cost of DNA sequencing technology has led to an exponential growth of raw sequencing data. To keep pace with this development, secondary analysis tools that can provide fast and accurate results in a cost-effective manner are needed to extract actionable genomic insights. Illumina’s DRAGEN™ (Dynamic Read Analysis for GENomics) addresses…
bwa-mem2 vs htslib – compare differences and reviews?
What are some alternatives? When comparing bwa-mem2 and htslib you can also consider the following projects:
minimap2 – A versatile pairwise aligner for genomic and spliced nucleotide sequences
bowtie2 – A fast and sensitive gapped read aligner
genozip – A modern compressor for genomic files (FASTQ, SAM/BAM/CRAM, VCF, FASTA, GFF/GTF/GVF,…
converting cram to ubam
How can I convert a CRAM to an unmapped BAM file with samtools? Is this correct?
samtools view -b -T ref.fasta input.cram > output.bam
Answer:
samtools collate -O -u input.cram | \
samtools reset -O BAM -o out.bam
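For anyone scripting the collate-then-reset pipeline quoted in this answer over many files, a minimal Python sketch follows. It only assembles the command line and does not run samtools. The helper names `cram_to_ubam_pipeline` and `as_shell` are hypothetical; the samtools subcommands and flags are exactly those shown in the answer, and nothing else about samtools is assumed.

```python
import shlex
from typing import List


def cram_to_ubam_pipeline(cram: str, out_bam: str) -> List[List[str]]:
    """Return the argv lists for the two stages of the pipeline quoted above."""
    collate = ["samtools", "collate", "-O", "-u", cram]
    reset = ["samtools", "reset", "-O", "BAM", "-o", out_bam]
    return [collate, reset]


def as_shell(pipeline: List[List[str]]) -> str:
    """Join the stages with a pipe, quoting each argument for a POSIX shell."""
    return " | ".join(shlex.join(argv) for argv in pipeline)


if __name__ == "__main__":
    print(as_shell(cram_to_ubam_pipeline("input.cram", "out.bam")))
```

Keeping each stage as an argument list means file names containing spaces are quoted correctly when the command is rendered, which is easy to get wrong with string concatenation.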
Unravelling microalgal-bacterial interactions in aquatic ecosystems through 16S rRNA gene-based co-occurrence networks
Croft, M. T., Lawrence, A. D., Raux-Deery, E., Warren, M. J. & Smith, A. G. Algae acquire vitamin B12 through a symbiotic relationship with bacteria. Nature doi.org/10.1038/nature04056 (2005). Kazamia, E. et al. Mutualistic interactions between vitamin B12-dependent algae and heterotrophic bacteria exhibit regulation. Environ. Microbiol. doi.org/10.1111/j.1462-2920.2012.02733.x…
Everything You need to know about the CRAM Format
This tutorial teaches everything you need to know about the CRAM format, BAM-to-CRAM compression ratio, cramtools, etc. 1. What are the BAM, SAM, and CRAM formats? BAM, SAM, and CRAM are file formats used to store and exchange alignment data in bioinformatics. BAM (Binary Alignment/Map) format is a…
SAMtools – PACE Cluster Documentation
Updated 2023-01-06. Overview: SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM, and CRAM formats. This guide will cover how to run SAMtools on the Cluster. This is the link to the SAMtools Homepage. Summary: SAMtools has a set…
CYP2D6: Phase I Oxidative Metabolism Enzyme – 474 Words
CYP2D6 is a Phase I oxidative metabolism enzyme that is clinically important because about 20-25% of clinically used drugs are metabolized by the CYP2D6 enzyme. CYP2D6 substrates are typically lipophilic and…
Ubuntu Manpage: samtools index – indexes SAM/BAM/CRAM files
Provided by: samtools_1.10-3_amd64
NAME
samtools index – indexes SAM/BAM/CRAM files
SYNOPSIS
samtools index [-bc] [-m INT] aln.bam|aln.cram [out.index]
DESCRIPTION
Index a coordinate-sorted BGZIP-compressed SAM, BAM or CRAM file for fast random access. (Note that this does not work with uncompressed SAM files.) This index is needed when region arguments are…
Cost-effective and accurate genomics analysis with Sentieon on AWS
This blog post was contributed by Don Freed, Senior Bioinformatics Scientist, and Brendan Gallagher, Head of Business Development at Sentieon; and Olivia Choudhury, PhD, Senior Partner Solutions Architect, Sujaya Srinivasan, Genomics Solutions Architect, and Aniket Deshpande, Senior Specialist, HPC HCLS at AWS. The year 2022 was an exciting one for genomics…
Plink duplicate ID
Hi, I’ve converted the Reich dataset to PLINK format, along with the VCF file from my full genome. I merged the two, which led to an error and two output files, .fam and .missnp. Now it tried…
find tandem repeats in DNA
I want to find tandem repeats in DNA. I have access to CRAM file and the VCF file. I initially tried to get the insertions from the VCF file, but I am not sure if the…
Remote Visualization of Local Genome Alignments Aids Pathogenic Variant Evaluation for Rare Disease
CHICAGO – A group at Spain’s National Center for Genomic Analysis-Center for Genomic Regulation (CNAG-CRG) in Barcelona has harnessed a protocol for accessing sequencing and variant data to help assess potentially pathogenic genetic variants within the context of a European Union-funded program to improve diagnosis of rare diseases. The CNAG-CRG…
find tandem repeats in DNA from CRAM/VCF file
I want to find tandem repeats in DNA. I have access to CRAM file and the VCF file. I initially tried to get the insertions from the VCF file, but I am not sure if the variant caller has included all…
Standards, Regulation, Funding Move Bioinformatics in 2022, But Hurdles to Precision Medicine Remain
CHICAGO – Although the US Food and Drug Administration (FDA) provided some long-sought clarity in 2022 on how it would regulate clinical decision support and in vitro diagnostic software, technology developers and healthcare organizations still struggled with how to integrate genomics data into clinical practice. It will likely take more…
Compressing BAM, SAM, CRAM | Genozip
How good is Genozip at compressing BAM files? See Benchmarks. Compressing a BAM, SAM or CRAM file In the rest of this page we will give examples of BAM files. Genozip is also capable of compressing SAM files, and with some limitations, CRAM files as well. …
Getting information on CRAM files from headers inside the files
Hello. I wish to know whether the following information can be found in CRAM files’ headers: 1) Whether the sequencing data in the CRAM files is from WGS or WES, and if so, where? and 2) In case one…
Samtools Convert Sam To Bam With Code Examples
In this session, we’ll work through converting SAM to BAM with samtools. The code that follows illustrates this.
# Basic syntax:
samtools view -S -b sam_file.sam > bam_file.bam
# Where:…
Index of /~psgendb/doc/pkg/samtools-1.7/htslib-1.7/cram
Name            Last modified      Size
Parent Directory                   –
cram.h          2015-06-24 11:00   2.4K
cram_codecs.c   2017-09-26 09:28   50K
cram_codecs.h   2016-03-17 07:48   6.0K
cram_codecs.o   2018-03-04 16:57   175K
cram_decode.c   2018-01-26 05:33   84K
cram_decode.h   2013-10-16 06:15   3.4K
cram_decode.o   2018-03-04 16:57   236K
cram_encode.c   2017-07-03 16:45   87K
…
CNV Pipeline Options
The following are the top-level options that are shared with the DRAGEN Host Software to control the CNV pipeline. You can input a BAM or CRAM file into the CNV pipeline. If you are using the DRAGEN mapper and aligner, you can use FASTQ files. …
How to trim the length of reads in a CRAM file?
I have a CRAM file with paired reads which looks like this:
im13@node-13-21:~/scratch_im13_projects/im13_basespace_runs$ samtools view ./walkup_194_repeat/CRAM/A01_FR_KAPA_25x_1ug_SR_1ngx4rxns_S1.cram | head
D00586:937:HVCWGBCX3:1:1101:1485:1803 77 * 0 0 * * 0 0 NCAGAGGAAGCGGAACGCATGTTTC #<GGGIIGIGGGIIGIGIIGGG.<<
D00586:937:HVCWGBCX3:1:1101:1485:1803 141 * 0 0 * * 0 0…
Index of /~psgendb/birchhomedir/public_html/doc/pkg/samtools-1.7/htslib-1.7/htslib
Name            Last modified      Size
Parent Directory                   –
bgzf.h          2018-01-10 07:45   14K
cram.h          2015-09-25 05:36   15K
faidx.h         2017-02-07 11:06   5.6K
hfile.h         2018-01-26 05:33   9.6K
hts.h           2017-11-24 09:46   29K
hts_defs.h      2017-08-10 11:07   3.3K
hts_endian.h    2017-09-27 10:40   11K
hts_log.h       2017-06-03 15:45   3.8K
…
How To Install libhts-dev on Kali Linux
In this tutorial we learn how to install libhts-dev on Kali Linux. libhts-dev provides the development files for HTSlib. What is libhts-dev? HTSlib is an implementation of a unified C library for accessing common file formats, such…
Samtools Htslib Issues
Issue Title | State | Comments | Created Date | Updated Date
How to get a specific chromosome | open | 1 | 2022-07-14 | 2022-07-18
tabix returns row from VCF file multiple times | open | 4 | 2022-07-11 | 2022-07-18
Modified base parsing failure failure | closed | 0 | 2022-07-01 | 2022-07-18
extract genotype information | open | 1 | 2022-06-24 | 2022-07-18
sam_hdr_remove_lines is inefficient if…
Ubuntu Manpage: alleleCounts.pl – Generate tab separated file with allelic counts and depth for each
Provided by: liballelecount-perl_4.2.1-1_all
NAME
alleleCounts.pl – Generate tab separated file with allelic counts and depth for each specified locus.
SYNOPSIS
Where possible use the C version for large data (it’s also more configurable).
alleleCounts.pl
Required:
-bam -b BAM/CRAM file (expects co-located index) – if CRAM see '-ref'
-output -o Output…
Ubuntu Manpage: bamfillquery – fill query sequences into BAM files
Provided by: biobambam2_2.0.179+ds-1_amd64
NAME
bamfillquery – fill query sequences into BAM files
SYNOPSIS
bamfillquery [options] <in.bam queries.fasta >out.bam
DESCRIPTION
bamfillquery reads a SAM/BAM/CRAM file and a FastA file, copies the sequences found in the FastA file into the query sequence field of the SAM/BAM/CRAM file and writes the resulting data…
[SpotBugs] htsjdk.samtools.cram.structure.CramHeader defines clone() but doesn’t implement Cloneable
Cloneable is not used very much so maybe deprecate and remove the clone() method? /cc @jmthibault79, @cmnbroad
See spotbugs.readthedocs.io/en/stable/bugDescriptions.html#cn-class-defines-clone-but-doesn-t-implement-cloneable-cn-implements-clone-but-not-cloneable
Part of #1267
Report: In class htsjdk.samtools.cram.structure.CramHeader, in method htsjdk.samtools.cram.structure.CramHeader.clone(), at CramHeader.java:[lines 80-85]
bioconductor – Trouble installing Rhtslib in R/R studio
I’m using RStudio on Ubuntu 18 and I’m trying to install the Rhtslib package from the Bioconductor repo, but I’m stuck now. This is what I get:
* installing *source* package ‘Rhtslib’ ...
** using non-staged installation via StagedInstall field
** libs
cd "htslib-1.7" && make -f "/usr/lib/R/etc/Makeconf" -f "Makefile.Rhtslib"…
Read bam/cram file with IGV from aws s3
Hi all, We store our alignment files on aws s3. I would like to be able to open them with IGV without needing to download them completely, but I can’t find an optimal solution. If I get a pre-signed url it works but it’s not convenient. I try to follow…
Ubuntu Manpage: samtools reheader – replaces the header in the input file
Provided by: samtools_1.13-2_amd64
NAME
samtools reheader – replaces the header in the input file
SYNOPSIS
samtools reheader [-iP] [-c CMD | in.header.sam ] in.bam
DESCRIPTION
Replace the header in in.bam with the header in in.header.sam. This command is much faster than replacing the header with a BAM→SAM→BAM conversion. By default…
The Biostar Herald for Tuesday, September 21, 2021
The Biostar Herald publishes user submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here. This edition of the Herald was brought to you by contribution from Istvan Albert, and was edited by Istvan…
Bedtools: Merging Many Bed Files
I am using the algorithm CookHLA for my research. As part of its preparation, I need to feed it a bed file representing at least 100 of my samples. I have made the bed files for 500 samples using samtools and bedtools in a…
Best Omic file compressor?
Our team has been having storage space issues; we predicted that we will not have enough available memory to store the files generated by our pipelines. Standard file compressors (gzip, bzip2, 7zip) weren’t cutting it and I started experimenting with file-specific compressors. This is where…
[main_samview] fail to read the header from "-".
I am attempting to run a file through an algorithm I have been using, HLA*LA. On running the samtools command within the algorithm, I have unfortunately been getting this error. After trying to debug this following other guides, I am seeking…
How to extract all sequences mapped to a transcript from Kallisto output
I ran Kallisto with the --pseudobam option. How do I extract all the short reads that are mapped to a single transcript (e.g. ENST00000367969.8)? As a person without any previous SAM/BAM experience, I tried the following things…
install GenomicFeatures fail
BiocManager::install('GenomicFeatures')
results show:
'getOption("repos")' replaces Bioconductor standard repositories, see '?repositories' for details
replacement repositories: CRAN: mirrors.tuna.tsinghua.edu.cn/CRAN/
Bioconductor version 3.14 (BiocManager 1.30.16), R 4.1.0 (2021-05-18)
Installing package(s) 'GenomicFeatures'
also installing the dependencies ‘Rhtslib’, ‘Rsamtools’, ‘GenomicAlignments’, ‘rtracklayer’
Packages which are only…
BAM compression: .tar.gz = same size as before?
I tried to compress 5 BAM files using:
tar -czvf original_bams.tar.gz *.bam
The resulting file sizes (from ll --block-size=M) are:
8067M file1.bam
6962M file2.bam
10662M file3.bam
7794M file4.bam
7346M file5.bam
40828M original_bams.tar.gz
There’s a difference of 3MB between the archive and the…
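The arithmetic behind that observation can be checked with a short Python sketch using only the sizes quoted in the post (variable names are illustrative). A likely explanation, not stated in the excerpt itself, is that BAM files are already compressed in gzip-compatible BGZF blocks, so a second gzip pass through tar -z has almost nothing left to remove.

```python
# Sizes in MB, copied from the post above.
bam_sizes_mb = {
    "file1.bam": 8067,
    "file2.bam": 6962,
    "file3.bam": 10662,
    "file4.bam": 7794,
    "file5.bam": 7346,
}
archive_mb = 40828

total_mb = sum(bam_sizes_mb.values())  # combined size of the five inputs
saved_mb = total_mb - archive_mb       # space actually saved by tar -z
ratio = archive_mb / total_mb          # archive size as a fraction of the input

print(f"inputs: {total_mb} MB, archive: {archive_mb} MB")
print(f"saved: {saved_mb} MB ({(1 - ratio) * 100:.3f}% smaller)")
```

The saving works out to well under a hundredth of a percent, which matches the 3MB difference reported in the post.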