Tag: cram

Giants’ Brian Daboll treating OTAs like a ‘teaching camp’

This week, New York Giants head coach Brian Daboll began his second round of organized team activities (OTAs) with the team. There’s a big difference in the air from this time last year when everything and almost everyone was new to one another and their surroundings. The Giants entered Phase…


How to Split 3000 WGS CRAM files into 1Mbp length chunks

Hello, I have 3000 WGS CRAM files and I want to split them into 1 Mbp chunks. I want to split at exact genomic coordinates, e.g. 1 to 1000000 bp, 1000001 to 2000000 bp, 2000001 to 3000000 bp, etc.…
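One way to get exact 1 Mbp coordinate windows is to generate the region strings first and pass each one to `samtools view`. The sketch below is only illustrative: the contig lengths, file names, and output naming scheme are hypothetical, and the commands are built as argument lists rather than executed. Keep in mind that a region query returns every read overlapping the window, so a read spanning a boundary would appear in both neighbouring chunks.

```python
def one_mbp_regions(contig, length, window=1_000_000):
    """Yield samtools-style regions (1-based, inclusive) covering a contig."""
    for start in range(1, length + 1, window):
        end = min(start + window - 1, length)
        yield f"{contig}:{start}-{end}"


def split_commands(cram, reference, contig_lengths, window=1_000_000):
    """Build one `samtools view` argument list per window (nothing is run)."""
    stem = cram.rsplit(".", 1)[0]
    commands = []
    for contig, length in contig_lengths.items():
        for region in one_mbp_regions(contig, length, window):
            # Hypothetical naming scheme: <stem>.<contig>_<start>-<end>.cram
            out = f"{stem}.{region.replace(':', '_')}.cram"
            commands.append(
                ["samtools", "view", "-C", "-T", reference, "-o", out, cram, region]
            )
    return commands


# Hypothetical contig length, used only to show the window boundaries.
for region in one_mbp_regions("chr1", 2_500_000):
    print(region)
```

In practice the contig lengths would come from the `@SQ` lines of the CRAM header or from the reference `.fai` index rather than being typed in by hand.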


Answer: Estimate sizes of repeats in a especific Gene

Tell me if I’m on the right track. I have the CRAM file and its CRAI index. I ran samtools like this, restricting the output to my region of interest: $ samtools view -b NG1PSZ7BE9.mm2.sortdup.bqsr.cram "chrX:147912050-147912110" > result.bam Then I indexed the .bam file: $ samtools index result.bam…


Estimate sizes of repeats in a especific Gene

Amateur problem here: We know that it is possible to use the ExpansionHunter tool to estimate sizes of such repeats by performing a targeted search through a BAM/CRAM file for reads that span, flank, and are fully contained in each repeat.…


Predictive network analysis identifies JMJD6 and other potential key drivers in Alzheimer’s disease

Cerejeira, J., Lagarto, L. & Mukaetova-Ladinska, E. B. Behavioral and psychological symptoms of dementia. Front. Neurol. 3, 73 (2012). Murphy, M. P. & LeVine, H. III Alzheimer’s disease and the amyloid-beta peptide. J. Alzheimers Dis. 19, 311–323 (2010).…


A draft human pangenome reference

Sample selection We identified parent–child trios from the 1KG in which the child cell line banked within the NHGRI Sample Repository for Human Genetic Research at the Coriell Institute for Medical Research was listed as having zero expansions and two or fewer passages, and rank-ordered representative individuals as follows. Loci…


No differentially expressed genes after multiple testing correction in mice

Hi all, I am working with RNA-seq data from mice (group A N=3 vs group B N=3). The mice are littermates; group A overexpresses a human transgene, which I verified. I have .cram files from mouse…


Missing columns in meta table from SRA Selector

Unfortunately there is no enforced standard for what metadata must make it into the SRA; it is very frustrating and makes reproducing any analysis needlessly complicated. You can look at which EBI fields are there, since EBI sometimes provides more fields than SRA: pip install bio then look at the…


Best Practices for CRAM <-> BAM

Hi, I am looking for advice about transitioning from BAM/BAI to CRAM for archival purposes. General advice is appreciated, but I’m specifically looking for answers to these two questions: Does samtools offer the best performance for converting to and from CRAM? Do…


Comparing Alignment Files (CRAM)

Hello all, I checked different forums and generally see samtools or picard-tools recommended for comparing alignment files. Here I want to compare the aligned output files from two different alignment algorithms. In this case, I had some general…


Issue With CRAM -> BAM -> FASTQ Conversion

Please help! I am trying to obtain FASTQ files from the GDSC; all we have in the lab is CRAM files. Unfortunately, the reference genome seems not to exist when pulled from an online source. I have attempted to use the…


Supported Tools – MultiQC

Removes adapter sequences and trims low quality bases from the 3′ end of reads. Overlapping paired-ended reads can be merged into consensus sequences and adapter sequence can be found for paired-ended data if not known. Automatic Filtering, Trimming, Error Removing and Quality Control for fastq data.…


storage – Good / recommended way to archive fastq and bam files?

The only free and open-source tool I know of that can help is zstd. Its GitHub repository’s README describes it as: Zstandard, or zstd as short version, is a fast lossless compression algorithm, targeting real-time compression scenarios at zlib-level and better compression ratios. It’s backed by a very fast entropy…


The Biostar Herald for Monday, April 03, 2023

The Biostar Herald publishes user submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here. This edition of the Herald was brought to you by contribution from Istvan Albert, and was edited by Istvan…


Manta and alignment name collision

Dear community members, I received hundreds of CRAM files which I have to run through Manta SV calling, and they fail due to “Unexpected alignment name collision”. The file contains tens (out of millions) of reads which were multi-mapped, so they have 2…


Running accurate, comprehensive, and efficient genomics workflows on AWS using Illumina DRAGEN v4.0

Introduction: The reduced cost of DNA sequencing technology has led to an exponential growth of raw sequencing data. To keep pace with this development, secondary analysis tools that can provide fast and accurate results in a cost-effective manner are needed to extract actionable genomic insights. Illumina’s DRAGEN™ (Dynamic Read Analysis for GENomics) addresses…


bwa-mem2 vs htslib – compare differences and reviews?

What are some alternatives? When comparing bwa-mem2 and htslib you can also consider the following projects: minimap2 – A versatile pairwise aligner for genomic and spliced nucleotide sequences bowtie2 – A fast and sensitive gapped read aligner genozip – A modern compressor for genomic files (FASTQ, SAM/BAM/CRAM, VCF, FASTA, GFF/GTF/GVF,…


converting cram to ubam

How can I convert a CRAM to an unmapped BAM file with samtools? samtools view -b -T ref.fasta input.cram > output.bam Is this correct? Reply: samtools collate -O -u input.cram | samtools reset -O BAM -o out.bam


Unravelling microalgal-bacterial interactions in aquatic ecosystems through 16S rRNA gene-based co-occurrence networks

Croft, M. T., Lawrence, A. D., Raux-Deery, E., Warren, M. J. & Smith, A. G. Algae acquire vitamin B12 through a symbiotic relationship with bacteria. Nature doi.org/10.1038/nature04056 (2005). Kazamia, E. et al. Mutualistic interactions between vitamin B12-dependent algae and heterotrophic bacteria exhibit regulation. Environ. Microbiol. doi.org/10.1111/j.1462-2920.2012.02733.x…


Everything You need to know about the CRAM Format

This tutorial covers what you need to know about the CRAM format, the BAM-to-CRAM compression ratio, cramtools, etc. 1. What are the BAM, SAM, and CRAM formats? BAM, SAM, and CRAM are file formats used to store and exchange alignment data in bioinformatics. BAM (Binary Alignment/Map) format is a…


SAMtools – PACE Cluster Documentation

Updated 2023-01-06 Overview SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM, and CRAM formats. This guide will cover how to run SAMtools on the Cluster. This is the link to the SAMtools Homepage. Summary SAMtools has a set…


CYP2D6: Phase I Oxidative Metabolism Enzyme – 474 Words

CYP2D6 is a Phase I oxidative metabolism enzyme that is clinically important because about 20-25% of clinically used drugs are metabolized by it. CYP2D6 substrates are typically lipophilic and…


Ubuntu Manpage: samtools index – indexes SAM/BAM/CRAM files

Provided by: samtools_1.10-3_amd64 NAME samtools index – indexes SAM/BAM/CRAM files SYNOPSIS samtools index [-bc] [-m INT] aln.bam|aln.cram [out.index] DESCRIPTION Index a coordinate-sorted BGZIP-compressed SAM, BAM or CRAM file for fast random access. (Note that this does not work with uncompressed SAM files.) This index is needed when region arguments are…


Cost-effective and accurate genomics analysis with Sentieon on AWS

This blog post was contributed by Don Freed, Senior Bioinformatics Scientist, and Brendan Gallagher, Head of Business Development at Sentieon; and Olivia Choudhury, PhD, Senior Partner Solutions Architect, Sujaya Srinivasan, Genomics Solutions Architect, and Aniket Deshpande, Senior Specialist, HPC HCLS at AWS. The year 2022 was an exciting one for genomics…


Plink duplicate ID

Hi, I’ve converted the Reich dataset to PLINK format, along with the VCF file from my whole genome. I merged the two, which produced an error and two output files, .fam and .missnp. Now it tried…


find tandem repeats in DNA

I want to find tandem repeats in DNA. I have access to the CRAM file and the VCF file. I initially tried to get the insertions from the VCF file, but I am not sure if the…


Remote Visualization of Local Genome Alignments Aids Pathogenic Variant Evaluation for Rare Disease

CHICAGO – A group at Spain’s National Center for Genomic Analysis-Center for Genomic Regulation (CNAG-CRG) in Barcelona has harnessed a protocol for accessing sequencing and variant data to help assess potentially pathogenic genetic variants within the context of a European Union-funded program to improve diagnosis of rare diseases. The CNAG-CRG…


find tandem repeats in DNA from CRAM/VCF file

I want to find tandem repeats in DNA. I have access to the CRAM file and the VCF file. I initially tried to get the insertions from the VCF file, but I am not sure if the variant caller has included all…


Standards, Regulation, Funding Move Bioinformatics in 2022, But Hurdles to Precision Medicine Remain

CHICAGO – Although the US Food and Drug Administration (FDA) provided some long-sought clarity in 2022 on how it would regulate clinical decision support and in vitro diagnostic software, technology developers and healthcare organizations still struggled with how to integrate genomics data into clinical practice. It will likely take more…


Compressing BAM, SAM, CRAM | Genozip

How good is Genozip at compressing BAM files? See Benchmarks. Compressing a BAM, SAM or CRAM file: In the rest of this page we will give examples of BAM files. Genozip is also capable of compressing SAM files, and with some limitations, CRAM files as well.…


Getting information on CRAM files from headers inside the files

Hello. I wish to know if one can find the following information in CRAM files’ headers: 1) whether the sequencing data in the CRAM files is from WGS or WES, and if so, where? and 2) In case one…
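A CRAM file carries the same SAM-style text header as a BAM file, which `samtools view -H` prints. The sketch below parses such header text into its record types so the free-text fields can be inspected. The example header is hypothetical. As far as I know, the standard header tags have no dedicated field for WGS versus WES, so any such hint would have to come from descriptive fields (for example `DS` or `LB` in `@RG`) or from `@PG` command lines, if it is present at all.

```python
from collections import defaultdict


def parse_sam_header(text):
    """Group SAM-style header lines by record type (@HD, @SQ, @RG, @PG, @CO)."""
    records = defaultdict(list)
    for line in text.splitlines():
        if not line.startswith("@"):
            continue
        record_type, *fields = line.split("\t")
        if record_type == "@CO":
            # Comment lines are free text, not TAG:VALUE pairs.
            records["@CO"].append({"text": "\t".join(fields)})
            continue
        # Every other field is TAG:VALUE; split on the first colon only.
        records[record_type].append(dict(f.split(":", 1) for f in fields))
    return dict(records)


# Hypothetical header text, as `samtools view -H` might print it.
example = "\n".join([
    "@HD\tVN:1.6\tSO:coordinate",
    "@SQ\tSN:chr1\tLN:248956422",
    "@RG\tID:rg1\tSM:sampleA\tPL:ILLUMINA\tLB:lib1",
])
header = parse_sam_header(example)
for read_group in header.get("@RG", []):
    print(read_group)
```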


Samtools Convert Sam To Bam With Code Examples

In this session, we’ll work through converting SAM to BAM with samtools. The code that follows illustrates this. # Basic syntax: samtools view -S -b sam_file.sam > bam_file.bam # Where:…


Index of /~psgendb/doc/pkg/samtools-1.7/htslib-1.7/cram

Name              Last modified      Size
Parent Directory                     –
cram.h            2015-06-24 11:00   2.4K
cram_codecs.c     2017-09-26 09:28   50K
cram_codecs.h     2016-03-17 07:48   6.0K
cram_codecs.o     2018-03-04 16:57   175K
cram_decode.c     2018-01-26 05:33   84K
cram_decode.h     2013-10-16 06:15   3.4K
cram_decode.o     2018-03-04 16:57   236K
cram_encode.c     2017-07-03 16:45   87K
…


CNV Pipeline Options

The following are the top-level options that are shared with the DRAGEN Host Software to control the CNV pipeline. You can input a BAM or CRAM file into the CNV pipeline. If you are using the DRAGEN mapper and aligner, you can use FASTQ files. …


How to trim the length of reads in a CRAM file?

I have a CRAM file with paired reads which looks like this:
im13@node-13-21:~/scratch_im13_projects/im13_basespace_runs$ samtools view ./walkup_194_repeat/CRAM/A01_FR_KAPA_25x_1ug_SR_1ngx4rxns_S1.cram | head
D00586:937:HVCWGBCX3:1:1101:1485:1803 77 * 0 0 * * 0 0 NCAGAGGAAGCGGAACGCATGTTTC #<GGGIIGIGGGIIGIGIIGGG.<<
D00586:937:HVCWGBCX3:1:1101:1485:1803 141 * 0 0 * * 0 0…
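The second column of the records shown above is the SAM FLAG field, a bit mask. As a small illustration (the short bit names are my own labels; the bit values follow the SAM specification), decoding the two values in the excerpt shows both reads are unmapped mates of one pair:

```python
# SAM FLAG bits; the short names are illustrative labels, not official ones.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


for value in (77, 141):
    print(value, decode_flag(value))
```

Here 77 = 1 + 4 + 8 + 64 and 141 = 1 + 4 + 8 + 128, which is the usual pattern for an unaligned pair stored in an alignment container.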


Index of /~psgendb/birchhomedir/public_html/doc/pkg/samtools-1.7/htslib-1.7/htslib

Name              Last modified      Size
Parent Directory                     –
bgzf.h            2018-01-10 07:45   14K
cram.h            2015-09-25 05:36   15K
faidx.h           2017-02-07 11:06   5.6K
hfile.h           2018-01-26 05:33   9.6K
hts.h             2017-11-24 09:46   29K
hts_defs.h        2017-08-10 11:07   3.3K
hts_endian.h      2017-09-27 10:40   11K
hts_log.h         2017-06-03 15:45   3.8K
…


How To Install libhts-dev on Kali Linux

In this tutorial we learn how to install libhts-dev on Kali Linux. libhts-dev provides the development files for HTSlib. What is libhts-dev? HTSlib is an implementation of a unified C library for accessing common file formats, such…


Samtools Htslib Issues

Issue Title | State | Comments | Created Date | Updated Date
How to get a specific chromosome | open | 1 | 2022-07-14 | 2022-07-18
tabix returns row from VCF file multiple times | open | 4 | 2022-07-11 | 2022-07-18
Modified base parsing failure failure | closed | 0 | 2022-07-01 | 2022-07-18
extract genotype information | open | 1 | 2022-06-24 | 2022-07-18
sam_hdr_remove_lines is inefficient if…


Ubuntu Manpage: alleleCounts.pl – Generate tab seperated file with allelic counts and depth for each

Provided by: liballelecount-perl_4.2.1-1_all NAME alleleCounts.pl – Generate tab seperated file with allelic counts and depth for each specified locus. SYNOPSIS Where possible use the C version for large data (it’s also more configurable). alleleCounts.pl Required: -bam -b BAM/CRAM file (expects co-located index) – if CRAM see ‘-ref’ -output -o Output…


Ubuntu Manpage: bamfillquery – fill query sequences into BAM files

Provided by: biobambam2_2.0.179+ds-1_amd64 NAME bamfillquery – fill query sequences into BAM files SYNOPSIS bamfillquery [options] <in.bam queries.fasta >out.bam DESCRIPTION bamfillquery reads a SAM/BAM/CRAM file and a FastA file, copies the sequences found in the FastA file into the query sequence field of the SAM/BAM/CRAM file and writes the resulting data…


[SpotBugs] htsjdk.samtools.cram.structure.CramHeader defines clone() but doesn’t implement Cloneable

Cloneable is not used very much so maybe deprecate and remove the clone() method? /cc @jmthibault79, @cmnbroad See spotbugs.readthedocs.io/en/stable/bugDescriptions.html#cn-class-defines-clone-but-doesn-t-implement-cloneable-cn-implements-clone-but-not-cloneable Part of #1267 Report: In class htsjdk.samtools.cram.structure.CramHeader In method htsjdk.samtools.cram.structure.CramHeader.clone() At CramHeader.java:[lines 80-85]


bioconductor – Trouble installing Rhtslib in R/R studio

I’m using RStudio on Ubuntu 18 and I’m trying to install the htslib package from the Bioconductor repo, but I’m stuck. This is what I get: * installing *source* package ‘Rhtslib’ … ** using non-staged installation via StagedInstall field ** libs cd "htslib-1.7" && make -f "/usr/lib/R/etc/Makeconf" -f "Makefile.Rhtslib"…


Read bam/cram file with IGV from aws s3

Hi all, We store our alignment files on aws s3. I would like to be able to open them with IGV without needing to download them completely, but I can’t find an optimal solution. If I get a pre-signed url it works but it’s not convenient. I try to follow…


Ubuntu Manpage: samtools reheader – replaces the header in the input file

Provided by: samtools_1.13-2_amd64 NAME samtools reheader – replaces the header in the input file SYNOPSIS samtools reheader [-iP] [-c CMD | in.header.sam ] in.bam DESCRIPTION Replace the header in in.bam with the header in in.header.sam. This command is much faster than replacing the header with a BAM→SAM→BAM conversion. By default…


The Biostar Herald for Tuesday, September 21, 2021

The Biostar Herald publishes user submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here. This edition of the Herald was brought to you by contribution from Istvan Albert, and was edited by Istvan…


Bedtools: Merging Many Bed Files

I am using the algorithm CookHLA for my research. As part of its preparation, I need to feed it a bed file representing at least 100 of my samples. I have made the bed files for 500 samples using samtools and bedtools in a…


Best Omic file compressor?

Our team has been having storage space issues; we predicted that we will not have enough available memory to store the files generated by our pipelines. Standard file compressors (gzip, bzip2, 7zip) weren’t cutting it and I started experimenting with file-specific compressors. This is where…


[main_samview] fail to read the header from "-".

I am attempting to run a file through an algorithm I have been using, HLA*LA. On running the samtools command within the algorithm, I have unfortunately been getting this error. After trying to debug this following other guides, I am seeking…


How to extract all sequences mapped to a transcript from Kallisto output

I ran Kallisto with the --pseudobam option. How do I extract all the short reads that are mapped to a single transcript (e.g. ENST00000367969.8)? As a person without any previous SAM/BAM experience, I tried the following things…


install GenomicFeatures fail

BiocManager::install('GenomicFeatures') results show: 'getOption("repos")' replaces Bioconductor standard repositories, see '?repositories' for details replacement repositories: CRAN: mirrors.tuna.tsinghua.edu.cn/CRAN/ Bioconductor version 3.14 (BiocManager 1.30.16), R 4.1.0 (2021-05-18) Installing package(s) 'GenomicFeatures' also installing the dependencies 'Rhtslib', 'Rsamtools', 'GenomicAlignments', 'rtracklayer' Packages which are only…


BAM compression: .tar.gz = same size as before?

I tried to compress 5 BAM files using: tar -czvf original_bams.tar.gz *.bam
The resulting file sizes ("ll --block-size=M") are:
8067M file1.bam
6962M file2.bam
10662M file3.bam
7794M file4.bam
7346M file5.bam
40828M original_bams.tar.gz
There’s a difference of 3MB between the archive and the…
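The result in the question is what one would expect: BAM files are already BGZF-compressed (gzip-compatible DEFLATE blocks), so `tar -z` applies DEFLATE a second time to data with little redundancy left. The sketch below illustrates the effect on stand-in data only. The random DNA string is an assumption made for the demonstration and does not imitate a real BAM layout.

```python
import gzip
import random
import zlib

random.seed(0)  # deterministic stand-in data
raw = "".join(random.choices("ACGT", k=200_000)).encode("ascii")

# First pass: roughly what BGZF already does inside a BAM file (DEFLATE).
once = zlib.compress(raw, 6)
# Second pass: what `tar -z` adds on top (gzip, i.e. DEFLATE again).
twice = gzip.compress(once, compresslevel=6)

first_ratio = len(once) / len(raw)
second_ratio = len(twice) / len(once)
print(f"first pass:  {first_ratio:.2f} of the original size")
print(f"second pass: {second_ratio:.2f} of the first-pass size")
```

The first pass shrinks the data substantially while the second leaves it essentially unchanged, which is why a plain `tar` without `-z` is usually enough when bundling BAM files.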
