Creating a variation graph for Giraffe alignment from assemblies
I have a collection of ~100 haploid assemblies of about 4.5 megabases each that I would like to map to using Giraffe. However, I am not completely clear on the best practices for constructing the graph from the assemblies. I…
integrative multiomic data
I have to integrate cancer patient cohort data (protein, mRNA, methylation, mutation profile) to identify genes associated with survival. What would be the best strategy? I have looked at iNMF, iClusterPlus, etc., which are based on different methods, but could not find out what…
Inquiry related to vcf file and formatting
Hello everyone, I am trying to run the PrediXcan software, but it fails with a segmentation fault, implying that there is something wrong with my VCF files. I am sharing the header of the VCF file.
##fileformat=VCFv4.1
##INFO=<ID=LDAF,Number=1,Type=Float,Description="MLE Allele Frequency Accounting for LD">
##INFO=<ID=AVGPOST,Number=1,Type=Float,Description="Average posterior probability from MaCH/Thunder">
##INFO=<ID=RSQ,Number=1,Type=Float,Description="Genotype imputation quality from…
Difference in alignment length between FASTA and HitTable
Hello all, I’ve a horrible feeling this is going to have a stupidly obvious answer, but I’ve had no luck finding a similar question on the forum or in the BLAST manual. I’ve used BLAST on some sequences. I’ve then downloaded…
Data Stewardship Community Manager
We are looking for a new Community Manager to join ELIXIR-UK and lead our new data steward training Fellowship programme and manage the community that develops. Do you have great communication skills, want to help others and enjoy working in an interdisciplinary team? Come and…
Are subtelomeric and pericentromeric regions defined in the human genome?
I’ve been trying to see if there are any coordinates for these but haven’t had much luck. I saw a bunch of people defining them as ±2 Mb around the centromere gap and 30 kb away from the telomere. I was wondering if…
Can someone explain the differences between various 1000 genome project and gnomad call sets? Also any straightforward PCA implementation on them?
I’ve been trying to delve into the data from whole genome sequencing, specifically by looking at the already existing data in the 1000 genome project and gnomad, and I have a lot of questions. Does gnomAD contain the 1000gp samples? I’ve found many vcf including these: ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/supporting/hd_genotype_chip/ ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/integrated_sv_map/ ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase1/analysis_results/integrated_call_sets/ gnomad.broadinstitute.org/downloads…
Ten simple rules for biologists initiating a collaboration with computer scientists
I think it depends. It is probably quite a good guide for bioinformaticians who want to collaborate with “proper” computer scientists when they need new computer science to solve their problems. I think it’s less good for biologists who need to collaborate with someone to give more computational…
How to create a BED12 file defining UTR sequences
Hello, I am doing an experiment and I need to build a BED12 file for some UTR sequences that I have. I have done a blast for those sequences and with that I was able to build a successful BED6 file, like this: 19 20752377 20758767 ENSDARG00000062634_Kat2b_Tscan 0 + 15…
What does samtools mean by ‘orientation’ when marking duplicates?
Hello Biostars, It is my understanding that samtools marks duplicates on the basis of the 5′ position of reads and also the orientation of reads. This is based on my reading of the following: www.htslib.org/algorithms/duplicate.html However, I am not sure…
Tools for Alternative Splicing Events in RNA-Seq analysis
Hello community, I’ve been doing a bit of a literature search on alternative splicing analysis using RNA-Seq data. I have come across quite a few tools, which I will list below. As well as some of these tools…
heatmap column(sample) names disordered
Hi dear Biostars community, I have a problem with my heatmap output. I want to draw a heatmap of the 45 top-variance genes for 52 samples, but my column (sample) names are shown out of order (e.g. f1d#num is female-1dpi-disease, and f1c#num is female-1dpi-control). As you can see,…
Filtering MSA by similarity to a consensus sequence/motif
Dear all, does anyone know a good way of filtering or sorting a large multiple sequence alignment (~8000 sequences) by similarity to a given consensus sequence? A solution using Python/Biopython would be optimal. Any help is appreciated! Best, Jonathan
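A minimal sketch of one way to approach the filtering question above, written in plain Python so it does not depend on any particular Biopython call. It assumes the sequences are already aligned to the same columns as the consensus; the function names, the 0.8 threshold, and the choice to skip consensus gap columns are illustrative assumptions, not something stated in the post.

```python
def identity(seq, consensus):
    """Fraction of columns where seq matches the consensus.

    Columns where the consensus has a gap ('-') are skipped (an assumption).
    """
    pairs = [(a, b) for a, b in zip(seq.upper(), consensus.upper()) if b != "-"]
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)


def filter_by_consensus(records, consensus, min_identity=0.8):
    """Keep (name, aligned_seq) records at or above min_identity, best first.

    Returns a list of (identity, name, aligned_seq) tuples.
    """
    scored = [(identity(seq, consensus), name, seq) for name, seq in records]
    kept = [row for row in scored if row[0] >= min_identity]
    return sorted(kept, key=lambda row: row[0], reverse=True)


if __name__ == "__main__":
    example = [("a", "ACGT"), ("b", "ACGA"), ("c", "TTTT")]
    for score, name, seq in filter_by_consensus(example, "ACGT", min_identity=0.7):
        print(name, seq, round(score, 2))
```

The records could come from any alignment parser; only the (name, aligned sequence) pairs are needed here.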
blastpgp -b parameter?
Hey, do you know what the -b parameter in blastpgp is for? blastpgp -d db -a 6 -e 0.005 -h 0.005 -j 5 -v 50 -b 750 Cheers
Upcoming online training courses
Physalia-courses will be hosting numerous online training courses and workshops throughout the rest of 2021! Please visit our website for the specific topics: www.physalia-courses.org/courses-workshops/ Should you have any questions, please do not hesitate to get in touch with us. Best regards, Carlo
Sequencing file conversion
Hi, friends, I downloaded a set of scATAC-seq BAM files from an article database, and the author said that each BAM file holds the information for one cell. However, after a few days of analysing the script given by the author, I found that a CSV file…
GSVA R packages
Hello everyone, I’m trying to do a gene set variation analysis in R to detect a specific gene set signature of a specific pathway in 20 RNA-seq samples. I have these files in BAM format but I don’t know what to do in order to…
PLINK ASSOC understanding the results
Hello to all, I have 10 VCF files: 5 female fish and 5 male fish. I have merged all 10 fish into one VCF file (all_fish.vcf). I performed the PLINK association analysis on all 10 fish with the command: -noweb --const-fid --allow-no-sex --allow-extra-chr --pheno…
BMC issues passes to fully vaccinated people ahead of Mumbai local trains' resumption
Ahead of resumption of Mumbai local trains, Brihanmumbai Municipal Corporation (BMC) began issuance of local train passes to fully vaccinated …
how to demultiplex paired end reads when R1 and R2 are identified by two different substrings?
I am struggling with finding a solution to a problem which seems easy but it’s not. I found many many questions that seems to be related (and I believe they are) but they are confusing and you never know which one fits your case. So there we go. I’ll try…
Where do I get a WES dataset of size <1GB
Can someone please tell me where I can get a WES or WGS dataset of size <1GB? Just browse sra-explorer.info for datasets. I doubt you can meaningfully query for file size as…
Replace multiple text with corresponding text
Hi, I ran an analysis and the software replaced the bacteria names with codes. I have a txt file as below:
Order  Original Name  Code
1  Allostreptomyces_psammosilenae_DSM_42178  S1_f1
2  Embleya_hyalina_NBRC_13850  S2_f2
3  Embleya_scabrispora_DSM_41855  S3_f3
Because the analysis involved a few hundred bacteria, it would…
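One possible way to map the codes back to the original names is sketched below in Python. It assumes the lookup table is whitespace-separated with one header line and the columns Order, Original Name, Code as shown in the post; the function names are hypothetical, and the word-boundary matching is a design choice so that a code such as S1_f1 is not replaced inside a longer code such as S1_f10.

```python
import re


def load_mapping(table_text):
    """Build {code: original_name} from the whitespace-separated table text."""
    mapping = {}
    for line in table_text.strip().splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) >= 3:
            _order, name, code = fields[0], fields[1], fields[2]
            mapping[code] = name
    return mapping


def replace_codes(text, mapping):
    """Replace every whole-word occurrence of a code with its original name."""
    if not mapping:
        return text
    alternatives = "|".join(re.escape(code) for code in mapping)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b")
    return pattern.sub(lambda match: mapping[match.group(0)], text)


if __name__ == "__main__":
    table = """Order Original Name Code
1 Allostreptomyces_psammosilenae_DSM_42178 S1_f1
2 Embleya_hyalina_NBRC_13850 S2_f2
3 Embleya_scabrispora_DSM_41855 S3_f3"""
    print(replace_codes("(S1_f1:0.1,S2_f2:0.2);", load_mapping(table)))
```

Compiling all codes into a single pattern means the text is scanned once, which matters little for a few hundred names but avoids one replacement feeding into another.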
Antibody Matching Transcripts
Hello everyone, I am interested in finding the isoforms of a gene that my HPA antibody is able to recognize. When I check the Human Protein Atlas website, it gives me a list of 4 transcripts my antibody can recognize. However, when I open Ensembl…
How to set variant FILTER in a VCF file based on overlap with regions in a BED file
I figured out how to do the annotation using BCFtools. 2 steps are needed. The input BED file requires a 1 for each region where the annotation should be set:
Chr_01 1000 2000 1
Chr_05 5000 6000 1
Input header file:
##INFO=<ID=BAD_REGION,Number=0,Type=Flag,Description="My bad region for some reason">
bgzip and tabix the bed…
print only columns with data from every line
Hi, I have a VCF file with about 60 000 columns. Here is an example of the first three lines: #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 10022-20416-17 10024-34469-18A 10025-34469-18B 10034-31625-18A 10035-31625-18B 10036-31625-18C 10042-29083-18 10044-34485-18A 10045-34485-18B 10046-34485-18C 10069-33802-18 10070-20895-17…
Sortmerna error
Sortmerna error 0 Hello, I am facing this error in one of my files during sortmerna [split:646] ERROR: Failed deflating readstring: @A00489:986:HGYHKDRXY:1:2123:9824:27242 1:N:0:CAGTGCTT+ACCTGGAA CGGCTGCCTCTCAGGGGCGGTGGGGGGCGCGGCCGGCAGCGGCCCGCGGGGCGCGGGGGGCACCGAGTCGCTGCTGAAGTCCAGCAGCGGTGCGGCGGCGGGGGGCACCGGAGCCGCGGACAGCCCGGCTGCGGGCTTCCTCTCCAGCAC + FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF::FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFF:F:FFFFFFFFF I have used the tool before many times and in this run all the other samples are also working without errors. The md5sum doesn’t…
Postdoc position in phylogenomics and evolution of beetles
The research group led by Dr Dagmara Żyła at the Museum and Institute of Zoology, Polish Academy of Sciences (MIZ, PAS) is looking for candidates for a postdoc position to work within the project entitled: “The Impact of the Paleocene-Eocene Thermal Maximum on diversification dynamics in Paederinae rove beetles” funded…
How to select dataset from 1000 genome
I am new to genomics and want to start practicing. For that, I would like to select a dataset (.vcf file) from the 1000 Genomes Project. On what basis should I choose?
How To Perform Peptide-Protein Docking
Hi, could anybody give me an example of how to perform peptide-protein docking: the software and steps? Here is my problem: (1) a peptide of about 10 amino acids; (2) a target protein (an enzyme with a known PDB structure file and known active sites); (3) how to find the most…
Mapping transcripts to mitogenome
I have performed de novo assembly of a plant mitogenome (non-model plant). In addition, I have also performed de novo transcriptome assembly for this plant species. How can I verify that the mitogenome assembly is correct using the RNA-Seq data? Should I align my…
%>% error in RStudio
dc.markers %>% group_by(cluster) %>% top_n(2, wt = avg_logFC) The above code gives an error in RStudio during a Seurat analysis, even after loading the dplyr and Matrix libraries. Error: Problem with filter() input ..1. i Input ..1 is top_n_rank(2, avg_logFC). x object 'avg_logFC' not found…
scRNAseq STAR create index how to set --sjdbOverhang
Hello everyone, I want to create an index before aligning reads with STAR. The sjdbOverhang is described as ReadLength-1, so does that mean I should set it to 149 if the data is from Illumina 2×150 bp? Thanks
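The ReadLength-1 rule quoted in the post can be checked against actual data. The sketch below, in Python, scans FASTQ records and returns the longest read length minus one; using the maximum for reads of varying length follows the STAR manual's description of the ideal value, while the function name and the in-memory example are hypothetical. Which mate's length is relevant for a particular scRNA-seq protocol is not settled by the post and is left as an open assumption.

```python
def suggested_sjdb_overhang(fastq_lines):
    """Return max(read length) - 1 for an iterable of FASTQ lines.

    Assumes standard 4-line FASTQ records, with the sequence on the
    second line of each record.
    """
    max_len = 0
    for index, line in enumerate(fastq_lines):
        if index % 4 == 1:  # sequence line of each record
            max_len = max(max_len, len(line.strip()))
    if max_len == 0:
        raise ValueError("no reads found")
    return max_len - 1


if __name__ == "__main__":
    # Two made-up 150 bp reads, standing in for one mate of a 2x150 bp run.
    read = "A" * 150
    lines = ["@r1", read, "+", "F" * 150, "@r2", read, "+", "F" * 150]
    print(suggested_sjdb_overhang(lines))
```

For a real file, the lines could be supplied by opening the FASTQ with `gzip.open(path, "rt")` and passing the file object directly.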
Seurat to Trajectory Analysis
Analysing RNA seq data
Hello all! I am new to analysing RNA-seq data. I was given a directory containing 1 normal sample and 15 tumor samples, each with its own cel file, exp file, and dcl file. I have been asked to do a reanalysis, but I think I need more…
Signac CallPeaks from multiple fragment files
I am attempting to run MACS2 CallPeaks on some multiome data and am running into a problem when running the CallPeaks command on multiple fragment file paths in a Seurat object. peaks<-CallPeaks(DataCombined, macs2.path = "/anaconda3/bin/macs2") FileNotFoundError: [Errno 2] No such file or directory: '/Users/Desktop/multiome/sc291/atac_fragments.tsv.gz…
How can I get the exact 3D structure of the protein to use the PDB file for PPI docking
I am working on an uncharacterized protein, and I need to know its PPI with RNA polymerase in humans. Please, how can I get the exact 3D structure of the…
How to write a function (Python 3.8) to find FASTA entry for DNA sequence
Prompt: FASTA files can contain an unlimited number of sequences, but are too large for most text editors to open and manipulate. One that can load a large FASTA file and extract a sequence of…
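Since the prompt above is cut off, the exact requirement is unknown. The following is a hedged sketch, compatible with Python 3.8, of one reading of the title: stream through a FASTA file record by record (so a large file never has to be loaded whole) and report the headers of entries whose sequence contains a given DNA query. The function names and the case-insensitive substring match are assumptions made for illustration.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs one record at a time."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def find_entries(lines: Iterable[str], query: str) -> List[str]:
    """Return headers of records whose sequence contains query (case-insensitive)."""
    query = query.upper()
    return [header for header, seq in read_fasta(lines) if query in seq.upper()]


if __name__ == "__main__":
    example = [">seq1 test", "ACGT", "TTGA", ">seq2", "GGGG"]
    print(find_entries(example, "gttt"))
```

With a real file, `find_entries(open(path), query)` would work the same way, because the file object is itself an iterable of lines. Sequence lines are joined before searching so that a match spanning a line break is still found.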
CWL capture multiple output files with prefix
I have a CommandLineTool demo, and run it as: demo --AA file1.txt --BB file2.txt --CC file3.txt --nthreads 4 input file: --AA output file: --BB --CC The following CWL doesn’t work. error: ("Error collecting output for parameter 'ofile1':\ndemo.cwl:32:7: Did not find output file with…
What is the best way to publish data from LIMS/ELN to DataLake
What is the best strategy/technique (DB sync, push using APIs, etc.) to publish experiment/study data from a LIMS/ELN to a data lake/data warehouse?
Sequence alignment in Subio on personal PC
Is the Subio platform a good and trusted platform for performing sequence alignment? I have FASTQ files that I downloaded from NCBI, and now I need to get raw counts to perform further analysis such as differential gene expression analysis.
Codon usage with unknown/unspecified Nucleotides
Hi, I am performing codon usage analysis with the coRdon package, and I am wondering how this package or other bioinformatics tools handle sequences with a significant number of “Ns”. I failed to find any tip about it or even a deep discussion….
Alignment using bwa-mem2
Hello, I need help aligning sequences to a reference using bwa-mem2. I used the following command: bwa-mem2 mem -t 8 gch38.fa DE98NGSUKBD117612_1_1.fq DE98NGSUKBD117612_1_2.fq > d3_align.sam I got the following error: ERROR! Unable to open the file: gch38.fa.bwt.2bit.64 There is no gch38.fa.bwt.2bit.64 file. I have the…
Multiple data scientist positions in computational biomedicine
Job Description Several NIH-funded data scientist positions are available in Prof. Gaurav Pandey’s (research.mssm.edu/gpandey/) lab at the Icahn School of Medicine at Mount Sinai in New York City. The overall project for these positions is the design and implementation of novel machine/deep…
Is it normal for RCorrector to remove millions of reads?
I’m trying to build de novo transcriptomes for unsequenced plants to do sequence analysis. I’m trying to choose a tool for my first pass of quality filtering after running FastQC on my raw reads. I’ve tried AfterQC and RCorrector….
How to find the coordinates of long reads (simulated by Badreads) with respect to the reference genome
Hi, I have simulated a set of ONT long reads (10x) of E. coli using the Badreads simulator tool. I was wondering whether there is any way I can know the co…
Downloading WGBS sequencing reads for cancer sample and healthy sample
Hello, I am completely new to WGBS data analysis. I want to learn how to analyse differentially methylated regions between cancer and normal samples. How can I find such data easily? I’m looking for WGBS read data. Your help is highly…
Comment: Direct – indirect binding of a transcription factor in chip-seq analysis
Try different tools such as HOMER, MEME, etc. Maybe try +/- 10 bp from the summit of the peak. If the same motif comes out, then the probability of direct binding increases, although this approach only gives an idea of whether there is a direct interaction.
Annotating cell types via integrating a query dataset with a reference dataset and then cluster
Mostly because it’s typically unnecessary given that reference-based classification should yield a similar result without being subjected to potential biases introduced during the integration process. SingleR (and presumably Seurat, I don’t know as I don’t use it) uses a reference dataset and asks “Which reference sample’s expression profile is…
Help speeding up HMMER’s HMMSearch algorithm for large fasta file with GNU Parallel
I’ve seen that HMMER can be sped up with GNU Parallel: Speed of hmmsearch I have around 100,000 sequences and a HMMER database of around 300 HMM profiles. I’m running everything at once but I’m wondering if it’ll be faster to split up the sequences and/or split up the jobs….
Survival Analysis Cut-off
Hello guys, I am doing a survival analysis using TCGA-BRCA project data. I am trying different cut-offs to separate my samples into high and low risk groups, but since it is my first time I would like to ask a question just to be fully sure…
antiSMASH output
Hello. Can someone help me interpret the antiSMASH output and count the number of BGCs using the command line? It would be great to receive your help. Thank you
Linkage disequilibrium and haplotype analysis of GWAS
Hi all, I have GWAS data in 22 chromosome files in PLINK format. I have imputed genotypes with the Sanger imputation server. I use PLINK for my analysis, but because PLINK 1.9 no longer supports --hap…
How to visualise a phylogenetic tree with amino acids (double letter repeat) multiple-sequence alignment?
I have a fasta file as shown below, rvd.fasta
>t1
NI-NG-NR-NN-NG-HD-HD
>t_temp5
NG-NG-NI-N*-NR-NI-NN-NG-NG-HD
>tal8
NG-NG-NI-N*-ND-NI-NN-NG-NG-H*-NH-NI
I have a newick file as follows, tree.newick
(tal8:0.49999997,t_temp5:0.47298786,t1:28.37858179);
I need to visualise both the tree and rvd.fasta file (multiple-sequence…
How to get the assembly larger than 3.5Gb in NextDenovo?
Hey there! NextDenovo assembles only 3.5 Gb. Is it possible to somehow make NextDenovo go to 17 Gb? Thanks, Ural
Understanding bcftools command
I need to perform the following action to combine multiple VCF files into one:
BCF=/path_to_bcftools
export BCFTOOLS_PLUGINS=$BCF/plugins
DIR=/path_to_normal_vcf_file
$BCF/bcftools merge -m all -f PASS,. --force-samples $DIR/*.vcf.gz | $BCF/bcftools plugin fill-AN-AC | $BCF/bcftools filter -i 'SUM(AC)>1' > panel_of_normal.vcf
I don’t have access to command-line bcftools, and since…
Positive preliminary data on CRISPR treatment for blood diseases
Stephan Grupp, MD, Ph.D., Chief and Medical Director of the Cell Therapy and Transplantation Section of the Institute for Cell and Gene Therapy at the Children’s Hospital of Philadelphia (CHOP) and the first pioneer in cell immunotherapy in childhood. A collaborative team of researchers, including Cancer, recently…
Allele frequency calculation
Hello everyone, I use VCFtools to find AF values with this command: vcftools --gzvcf $SUBSET_VCF --freq2 --out $OUT --max-alleles 2 The output I got from this is: chr pos nalleles nchr a1 a2 <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> 1 22 16050408 2 846…
Fact Check-mRNA vaccines are distinct from gene therapy, which alters recipient’s genes
Vaccines that use mRNA technology are not gene therapy because they do not alter your genes, experts have told Reuters after contrary claims were posted online. Thousands of social media users have shared such posts since the rollout of COVID-19 vaccines began (here) – and have continued to do so…
Here’s Why Fulcrum Therapeutics Stock Is Shooting Higher Today
What happened Shares of Fulcrum Therapeutics (NASDAQ: FULC), an early clinical-stage biopharmaceutical company, are rocketing higher after an encouraging clinical trial readout. Investors excited about the company’s sickle-cell disease candidate pushed the stock 76% higher as of 10:35 a.m. EDT on Monday. So what This morning Fulcrum Therapeutics shared data…
From EC terms to GO terms to GO enrichment
Hi, I did a KEGG annotation of RNA-seq data. I want to do GO enrichment next. I have the EC numbers of the genes of interest, so I converted the EC numbers to GO terms using EC2GO. But I…
Error in DESeqDataSetFromMatrix Function in DESeq Library
I am trying to create a DESeq dataset object from a dataset in GEO as follows: deseq2_142731 <- DESeqDataSetFromMatrix(countData = GSE142731[,2:ncol(GSE142731)],colData = labels_gse142731,design = ~V1) However, I get an error: Error in DESeqDataSet(se, design = design, ignoreRank) : some values in assay…
Environmental DNA: Seeing the Unseen
We produce antibodies after vaccination, but how many do we need to prevent infection and the spread of COVID? …
What’s the difference between enrichKEGG and gseKEGG
Hi, I was wondering what the difference is between enrichKEGG and gseKEGG in the R package clusterProfiler. Thanks!
Manual annotation of cell types in single cell RNA-seq
I have recently started working with scRNA-seq data. I am following the tutorials by the creators of Seurat. In the final section, titled “Assigning cell type identity to clusters”, the authors mention that: Fortunately in the case of this dataset,…
Am I Sabotaging Myself By Getting A Masters Instead Of A Phd?
Hello everyone! I realize that this question has been asked before and I have read through some of the other threads but I figured I’d see if there are any more perspectives out there. I am currently…
The usage of sed
sed -e 's/_scATAC_hg19_noDup_noMT.bam//g' -e 's/\/directory\/to\/singleCell\///g' bamlist.txt | sed -e 's/\//\t/g' | awk 'OFS="\t"{print $2}' | tr '\n' '\t' > header.txt This replacement command is too complex. Can someone explain what it means?
SRA splitting for each metagenome-assembled genome
Hi everybody, we obtained viruses from water and sequenced them with Illumina. We built several metagenome-assembled genomes and obtained a BioProject number and BioSample numbers (one for each of them). Now I need to do the SRA submission, but I cannot submit for all my genomes…
Samtools difference between Mapped and Unmapped read
Hello, I am wondering what the difference is between a mapped and an unmapped read in samtools. I am extremely new to this whole process and just trying to learn my way through, so if you could “dumb” it down or try to explain…
align using file.ht2
I downloaded the indexed files for UCSC hg19 in my terminal, and when I uncompressed the archive I found two files, genome.5.ht2 and genome.8.ht2. Every time I try to align my samples against the index, this error shows up: [e::bwa_idx_load_from_disk] fail to locate the index files…
question about running CIRI-full
I’m using CIRI-full to reconstruct the full-length sequences of circRNAs. I can run the test data set successfully, but I can’t run my own data. Running the test data set: java -jar ../CIRI-full.jar Pipeline -1 test_1.fq.gz -2 test_2.fq.gz -a test_anno.gtf -r test_ref.fa -d test_output/…
DESeq2 design question
I have a count matrix from an RNA-seq experiment that I’d like to normalize using DESeq2 and perform DE analysis on. My code is below: dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata, design= ~ condition) My experiment is performed over two time periods, week1 (with treated vs untreated) and…
R Programming – how to make a simple heat map
Hi, can anyone guide me on how to make a simple heat map in R? There is github.com/XiaoLuo-boy/ggheatmap which is fully ggplot in case you feel more comfortable with it rather than the suggested pheatmap/ComplexHeatmap…
Caribou Biosciences Raises $304M in Potentially the Largest Gene Editing IPO | Rothwell, Figg, Ernst & Manbeck, P.C.
Caribou Biosciences, Inc., a Berkeley, California-based CRISPR genome-editing biopharmaceutical company, raised $304M in an initial public offering, one of the most lucrative IPOs in gene-editing. In June 2021, Gene editing biotech Verve raised $267M in IPO proceeds and later added another $40 million after its financial underwriters opted to buy…
Trimming of adapters and indexes
I am investigating a protein that binds small DNA (<30 nt) and have a library of these small DNAs. I know that the adapters and indexes are from this site (the 5′ adapter has T instead of U). [To reach the page I want to show click…
EOF marker absent in VCF – can this be safely ignored?
Hi, I generated a VCF file using a bcftools mpileup | bcftools call pipeline. I have done this before, and the file produced then looked fine. However, the log for this one had [W::bgzf_read_block] EOF marker is absent….
Error while subsetting VCF – error doesn’t check out with (z)grep
I’m using bcftools view -s to subset a VCF.gz file. I ran into an error: [E::vcf_parse_format] Number of columns at chr9:44897051 does not match the number of samples (90 vs 99) To look at this site, I ran…
Low assignment rate with featureCounts
I used STAR to align my reads (brain samples) to the human reference genome and am getting good unique mapping rates (~70-90%). However, when I use featureCounts I get really low assignment rates. Here is an example command: featureCounts -p -t exon -g gene_id -s 2 -T…
seurat `<obj>@assays$RNA@counts` vs `<obj>@assays$RNA@data`?
I have two matrices called <object>@assays$RNA@counts and <object>@assays$RNA@data that are both real and non-negative. What is the difference between these?
k-mer counters – presence/absence matrix
Hi lizabe, You’re right that this tutorial is out of date. The --matrix option is no longer valid as an option to jellyfish count. However, I don’t think its original intent was to do what you wanted anyway. It doesn’t write out a binary…
remove effect of latent variables from log fold change
The findMarkers function of seurat allows users to specify latent variables to be adjusted for when finding differentially expressed genes. I am testing for differences in gene expression between 2 groups – disease vs normal. For the statistical test, I am using LR, described below: LR: disease state is modelled…
Python fast way to get ONLY MAIN metadata for GSE ? (not walking through thousands underlying GSM-samples : slow or even endless)
Not Python but using EntrezDirect you can get: $ esearch -db bioproject -query "GSE118723" | esummary | xtract -pattern DocumentSummary -element Project_Description Quantification of gene expression levels at the single cell level has revealed that gene expression can vary substantially even across a population of homogeneous cells. However, it…
STAR rna-seq for bacterial genomes
Hi, I’m willing to use STAR for bacterial genomes. I wanted to ask if this is strongly unadvised or if there is a way to manage the main challenges of mapping reads to prokaryotes. (I know there are specific tools for this purpose, i.e. EdgePro, but I’m a beginner in…
Help with finding p value
Hello, I’m doing a study to compare 3 groups with different n. I was wondering if anyone could guide me to the best way to get a p-value between these groups using Excel.
I am converting the fq.gz files (which are the results of the MGI study) to BAM files to view in IGV.
Hey everyone, before I start, apologies for any inconvenience caused by my wrong or inappropriate use of terms. I have had some failures with bwa mem lately. As i…
Map Entire Directory of Paired-End Reads at Once
Is there a way to map an entire directory of reads at once? Would I just have to write a script for this specific to my directory structure and data? I’m using BWA MEM to map 49 sets of paired-end reads and have…
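A small script is one common way to handle this. The Python sketch below pairs R1/R2 files by name and builds one `bwa mem` command per sample. The file naming pattern (`<sample>_R1.fastq.gz`, `<sample>_1.fq`, and similar), the function names, the reference path, and the thread count are all assumptions for illustration; names with a trailing chunk such as `_001` would need a different pattern.

```python
import re

# Matches e.g. sample_R1.fastq.gz, sample_2.fq (assumed naming convention).
READ_PATTERN = re.compile(r"(.+)_R?([12])\.(?:fastq|fq)(?:\.gz)?$")


def pair_reads(filenames):
    """Group file names into {sample: (read1, read2)}, dropping incomplete pairs."""
    by_sample = {}
    for name in sorted(filenames):
        match = READ_PATTERN.match(name)
        if match:
            sample, mate = match.group(1), match.group(2)
            by_sample.setdefault(sample, {})[mate] = name
    return {
        sample: (mates["1"], mates["2"])
        for sample, mates in by_sample.items()
        if len(mates) == 2
    }


def bwa_commands(pairs, reference, threads=4):
    """Build one 'bwa mem' argument list per sample, in sample-name order."""
    return [
        ["bwa", "mem", "-t", str(threads), reference, read1, read2]
        for _sample, (read1, read2) in sorted(pairs.items())
    ]


if __name__ == "__main__":
    files = ["a_R1.fastq.gz", "a_R2.fastq.gz", "b_1.fq", "b_2.fq", "notes.txt"]
    for command in bwa_commands(pair_reads(files), "ref.fa"):
        print(" ".join(command))
```

In practice the file list could come from `os.listdir(directory)`, and each argument list could be passed to `subprocess.run(command, stdout=sam_file, check=True)` with `sam_file` opened per sample, since `bwa mem` writes SAM to standard output.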
some values in assay are negative
Hi, We are trying to analyze information and we have this problem. Our data does not contain any negative values at all.
What Galaxy tools add Ns to variable length FASTQ sequences to get uniform length? (FASTA if needed)
Hello! I am attempting to perform alignments for a variety of FASTQ files. I need the sequences to be the same length, 250 bp. That being said, I do not want to…
DESeq2 analysis result differences
Hello, I performed patch-seq for 2 sets of neurons and then used DESeq2 to look for transcriptomic differences between the groups. One group consists of 7 neurons and the second group consists of 9 neurons. Genes that meet a threshold criteria of L2FC of more than 1.5 and adjusted p-value…
Converting mouse Gene IDs to Human while keeping genes that don’t convert
Hi there, I am using biomaRt to convert some gene IDs from mouse to human for some data I generated through RNA-seq. I am currently mapping using the following function: convertMouseGeneList <- function(x){ require("biomaRt") human = useMart("ensembl", dataset = "hsapiens_gene_ensembl") mouse = useMart("ensembl", dataset = "mmusculus_gene_ensembl") genesV2 = getLDS(attributes =…
Problems with fragment length distribution output with Salmon
Hi all, New to RNA-Seq and I’m struggling with my Salmon alignment output. I tried to find an answer to this question on older posts but I couldn’t locate any other discussions, so apologies in advance if this has been covered…
Histone marks enrichment analysis
Hello everyone, here’s my question: I have a bed file of human genomic coordinates (hg19), and I would like to know whether ChIP-seq peaks for specific histone marks (such as those from ENCODE) are significantly more represented within my test regions compared to a background…
PyMol – molecule export problem
Dear all, I have a molecule in PyMOL which I want to open in ChimeraX. But after exporting the molecule as .pdb, some structural information seems to be lost: regions that are part of helices in the PyMOL session file are now loops…
MinION Data Examples (FAST5) Database
Hello everyone, I am constructing a pipeline to analyze Oxford Nanopore MinION data. I have to start from FAST5 files, and for some optimizations I will try multiple tools for each step, so I will need several datasets to try. As I see most of…
Relative abundance of differentially abundant ASVs after DESeq2
Hello, I used DESeq2 to see which ASVs were differentially abundant between different treatments in 16S metabarcoding data. I now want to plot the relative abundance (in %) of those ASVs. However, I am unsure which data would make the most…
R Programming
Hi, can anyone guide me on how to make a simple heat map in R? There is github.com/XiaoLuo-boy/ggheatmap which is fully ggplot in case you feel more comfortable with it rather than the suggested pheatmap/ComplexHeatmap packages and want to have a consistent…
GEO submission when I have raw data in SRA
I am trying to submit my scRNA-seq data to GEO. The GEO submission guidelines state that I should upload metadata, raw data and processed data, and that GEO submits the raw data to SRA on my behalf. But I already submitted my raw…
Phasing with SHAPEIT for non-human species
I am trying to phase genotyping data in a non-human organism. I have a reference for my plant species only in FASTA format, but the required inputs for SHAPEIT are phased.gz, legend.gz and .sample files. How can I do phasing with my reference panel? Any…
PDRA in Computational Biophysics and Cancer Research, University of Manchester, UK
Research Associate in Computational Biophysics, University of Manchester Job reference: BMH-017047 Location: Oxford Road, Manchester, UK Closing date: 19/08/2021 Salary: £32,816 per annum Employment type: Fixed Term Faculty/Organisation: Biology, Medicine & Health School/ Directorate: Molecular & Cellular Function Hours per week: Full Time Contract Duration: 01 August 2021 until 31…
Analyzing TCRseq Data
Hi everyone, I am new to TCRseq and I have some data that I would like to start analyzing. I was hoping to get people’s input on what the best package for analyzing the 10x V(D)J output would be. I am currently debating between immunarch…
Correlate TCR data to clusters/GEX/CITEseq data
Hello everyone, I just added my TCR VDJ data as metadata to my Seurat object (as described in the tutorial here). So, I basically ended up with two different columns of metadata where my barcodes are assigned to the clonotypes and the cdr3…