PDRA in Computational Biophysics and Cancer Research, University of Manchester, UK

Research Associate in Computational Biophysics, University of Manchester
Job reference: BMH-017047
Location: Oxford Road, Manchester, UK
Closing date: 19/08/2021
Salary: £32,816 per annum
Employment type: Fixed Term
Faculty/Organisation: Biology, Medicine & Health
School/Directorate: Molecular & Cellular Function
Hours per week: Full Time
Contract Duration: 01 August 2021 until 31…


Analyzing TCRseq Data

Hi everyone, I am new to TCRseq and I have some data that I would like to start analyzing. I was hoping to get people’s input on the best package for analyzing the 10x V(D)J output. I am currently debating between immunarch…


Correlate TCR data to clusters/GEX/CITEseq data

Hello everyone, I just added my TCR VDJ data as metadata to my Seurat object (as described in the tutorial here). So, I basically ended up with two different columns of metadata where my barcodes are assigned to the clonotypes and the cdr3…

blastx versus tblastx

Hello everyone, I have a question about BLAST. I admit that I do not understand everything. I have been asked to blastx an fsa file of Arabidopsis thaliana sequences against an oak gene model, in order to see if there are any matching sequences between the two species: My…


Integrated Dimension Reduction Plot for CD4/CD8 sorted Feedback

Hello, I recently adopted the Harvard Chan Bioinformatics Core guidelines for SC QC/Normalization/Clustering (hbctraining.github.io/scRNA-seq_online/schedule/links-to-lessons.html). I have integrated CD4+/CD8+ T cells from two time points. I recently received feedback that the clustering in my integrated dimension reduction plot looked problematic. Specifically, the…


computeMatrix: how to set certain TSS

How do I set certain TSSs in deepTools computeMatrix? I want to visualize the regions upstream and downstream of certain genes’ TSSs instead of the whole gene, but how can I do it? Thanks.

You probably want the reference-point mode as described here. You will also…
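As a rough illustration of the reference-point approach mentioned in the reply, the sketch below filters a BED file down to the genes of interest and assembles a `computeMatrix reference-point` command centred on their TSSs. The file names, gene names and window sizes are hypothetical, and the BED records are assumed to carry the gene name in column 4.

```python
from typing import Iterable, List, Set


def select_genes(bed_lines: Iterable[str], wanted: Set[str]) -> List[str]:
    """Keep only BED records whose name column (column 4) is in `wanted`."""
    kept = []
    for line in bed_lines:
        record = line.rstrip("\n")
        fields = record.split("\t")
        if len(fields) >= 4 and fields[3] in wanted:
            kept.append(record)
    return kept


def compute_matrix_command(regions_bed: str, bigwig: str, out_matrix: str,
                           upstream: int = 2000, downstream: int = 2000) -> List[str]:
    """Assemble a deepTools computeMatrix call anchored on the TSS."""
    return [
        "computeMatrix", "reference-point",
        "--referencePoint", "TSS",
        "-b", str(upstream),    # bases before the TSS
        "-a", str(downstream),  # bases after the TSS
        "-R", regions_bed,      # regions (here: the selected genes only)
        "-S", bigwig,           # signal track
        "-o", out_matrix,       # output matrix for plotHeatmap/plotProfile
    ]


# Hypothetical records and file names, for illustration only.
bed = ["chr1\t100\t900\tGENE_A\t0\t+", "chr2\t50\t700\tGENE_B\t0\t-"]
selected = select_genes(bed, {"GENE_B"})
command = compute_matrix_command("selected_genes.bed", "signal.bw", "matrix.gz")
```

After writing `selected` to `selected_genes.bed`, the command list could be passed to `subprocess.run(command, check=True)`.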


Advice for a computational neophyte

Hello, Biostars! After earning a B.Sc. in Biotechnology (in Russia it’s a pretty weird combination of biology, engineering and almost all branches of chemistry), I moved to a bigger city to pursue an M.Sc. degree. In our research lab we have 2 professors for…

Pooling annotations from different databases in InterProScan

Is it acceptable to pool the annotations from the various sources InterProScan offers, and annotate a sequence with a subset of these? For example, if I have something like so:

id    annot  src   start  stop
seq1  dom1   Pfam  100    120
seq1  dom1a…


Error in haploview data format

Hi everyone, I’m trying to use the Haploview software to plot LD for each chromosome, but I have TASSEL-format data (HapMap). I converted the data to PLINK format via TASSEL (write plink) and I have two files, a ped file and a map file. After…


Read group info

Hello, I need help getting read group info for performing alignment using BWA-MEM2. I read a previous post (bwa mem: Passing a variable to read group) on read-group info, where a shell script is used to get the read group info from the fastq file. Can someone…
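One way to approach this, sketched here in Python rather than shell, is to parse the first header line of the FASTQ file and build the string passed to the aligner's `-R` option. The sketch assumes the common Illumina header layout (instrument:run:flowcell:lane:… followed by a comment ending in the index sequence); the sample and library names are hypothetical, and using flowcell.lane as the ID is a widespread convention rather than a requirement.

```python
def read_group_from_header(header: str, sample: str, library: str = "lib1") -> str:
    """Build an @RG string for `bwa mem -R` from an Illumina FASTQ header.

    Assumed layout: @instrument:run:flowcell:lane:tile:x:y read:filter:control:index
    """
    name, _, comment = header.lstrip("@").partition(" ")
    fields = name.split(":")
    if len(fields) < 4:
        raise ValueError(f"unexpected FASTQ header layout: {header!r}")
    flowcell, lane = fields[2], fields[3]
    barcode = comment.split(":")[-1] if comment else ""
    rg_id = f"{flowcell}.{lane}"
    platform_unit = f"{rg_id}.{barcode}" if barcode else rg_id
    # The literal two-character "\t" is intentional: bwa converts it to a TAB.
    return r"\t".join([
        "@RG",
        f"ID:{rg_id}",
        f"SM:{sample}",      # hypothetical sample name supplied by the caller
        f"LB:{library}",     # hypothetical library name
        "PL:ILLUMINA",
        f"PU:{platform_unit}",
    ])


# Hypothetical header, for illustration only.
example = read_group_from_header(
    "@INSTR:1:FLOWCELL:2:1101:1000:2000 1:N:0:ACGT", sample="sample1"
)
```

The resulting string can then be quoted and passed as the `-R` argument on the alignment command line.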


How to search dbSNP using a list of SNPs and retrieve Gene name (hgnc symbol if existing, otherwise just whatever is in there)

I have a list of 500,000 SNPs for which I want to obtain the gene names. I tried searching with biomaRt: library(data.table) library(biomaRt) rs <-…


Functional enrichment analysis of bacteria

Hello, I have gene lists from an RNA-seq experiment on E. coli bacteria. So far, I have only worked with model organisms, which are supported by biomaRt, so conversion of gene IDs and functional enrichment analysis within R was easy. Now that I am working…


Making a living with bioinformatics

Please do not find this offensive in any way. Please do not get discouraged by this post. I mean nothing bad by this post. I just need some advice as lately (for a year now) I am finding it very difficult to “see the light at the end of the…


Variant annotation using Illumina Basespace Variant Interpreter

Hello, I am working on whole exome data (Illumina data) and now I want to do variant annotation. I used several methods for annotation but am not getting enough information. I heard about the Illumina BaseSpace Variant Interpreter, but I don’t know how to…


microarray miRNA expression data analysis

I wrote a script on how to analyze the microarray-based miRNA expression data. Here is my code:

    # general config
    baseDir <- '.'
    annotfile <- 'mirbase_genelist.tsv'
    setwd(baseDir)
    options(scipen = 99)
    require(limma)
    # read in the data
    targets <- read.csv("/media/mdrcubuntu/46B85615B8560439/microarray_text_files/targets.txt", sep="")
    # retain information about background via gIsWellAboveBG
    project <- read.maimages(targets,source="agilent.median",green.only…


Using joiningdata/lollipops to make lollipop chart

I’m having trouble downloading and using pbnjay’s lollipops. I downloaded the Mac version and opened “lollipops” from the command line. I closed the terminal, opened it again, and tried: lollipops TP53 R273C R175H T125 R248Q but got the…


Aligning 23andme to reference genome

I’ve got some 23andMe data that I’m playing around with and was wondering if the SNPs could be aligned to a reference genome to subsequently be turned into a VCF. The txt file has the genomic positions so I guess it’s possible? I just…


Beginner level projects for bioinformatics.

Hi, I am a first-year master’s student in bioinformatics with skills such as Python, and I am looking for some beginner-level projects to add to my CV. Can anyone suggest some good projects?


Bioperl SeqIO.pm cannot be found

Hello, I am trying to run this command line in the terminal from Mac OS, which is copied from a protocol: FEELnc_codpot.pl –i <unannotated_transcript.fa> -a <mRNA_sequence.fa> -g <reference_genome.fa> -m ’shuffle’ –o <unannotated_transcript> However, I get the following error: command not found, probably because I am not using perl, so I…


Allele-Specific analysis for human WGBS data

Hi everyone, I need to perform allele-specific methylation analysis for human whole genome bisulfite sequencing data. As allele-specific analysis depends on SNPs/polymorphic sites, I am stuck with two questions: Is there a way to get these polymorphic sites from WGBS data? (I…


OP-ED: 6 vaccination myths to put to rest

There’s a lot of scare-mongering out there, but don’t believe everything you hear

Myth 1: The Covid-19 vaccine will affect a woman’s fertility

This myth was sparked when a social media post was shared in December 2020 by Dr Wolfgang Wodarg, a physician and former chief scientist for allergy and…


How to check which genes affect a continuous phenotype

I want to test which genes affect a change in a condition. The condition is measured on a continuous scale. The data come from microarrays and there are two batch effects to be accounted for. I thought of performing a…


BUSCO installation and run

That unil.ch link looks a bit too complicated; these are the commands I ran to get BUSCO via Singularity:

Download the image:

    singularity pull

Run the image with example genome.fasta, assuming it’s a genome and a plant:

    singularity run -B $(pwd):/busco_wd/ busco_v5.2.2_cv1.sif busco -l…


Maryland woman dies while hiking in Maine’s western mountains

Aug. 7—A Maryland woman died Friday while hiking in Maine’s western mountains. Barbara Goldberg, 78, of Potomac, Maryland, was dropped off about 9 a.m. in Stow, where she intended to hike Blueberry Mountain, according to Mark Latti, a spokesperson for the Maine Department of Inland Fisheries and Wildlife. Her partner,…


Strange difference in the order of probe ID between my matrix from cel and the series matrix the author uploaded

Hi all, I encountered a very strange error when reading raw CEL files and running RMA on them. When I use the following code to do the background correction and normalization of GSE18997 (platform GPL570), the order of some probe IDs in the final results seems to be different…


A chord plot for miRNA IDs by GOplot package

Dear all, I hope you are all fine and healthy. I want to create a chord plot with the GOplot package. My input data includes miRNA IDs. After running the following command, I encounter this error: chord <- chord_dat(circ,…


Gene read count-level batch correction in scRNA-seq?

Hi, I’m working on the integration of several scRNA-seq datasets. After trying Seurat v3 and Harmony, I realized they output a dimension reduction matrix rather than corrected read counts, and are therefore not suitable for some downstream analyses at the gene-expression level. I wonder if there…


how to plot genetic structure of bacterial genes?

How to interpret bimodal distribution of GC-content for RNAseq and can it be remedied?

A colleague of mine has obtained the following distribution of GC-content for RNAseq. How should a bimodal distribution of GC-content for RNAseq be interpreted? Does it mean some contamination? Is there any method to…

VCF to 23andMe format and changing Ensembl reference: help needed for understanding VCF

Hello, I am trying to convert my Nebula Genomics report to 23andMe format. I have two problems: Nebula uses human build 38 (Ensembl) and 23andMe uses build 37. I was thinking of writing a Python script, but I have some doubts. My plan was to change the genotype according…

Linux command to delete empty fastq.gz files

Hi all, I am wondering how to use a Linux command to remove all the empty fastq.gz files within a folder. As a compressed fastq.gz has a size of 20 bytes even if it’s empty after decompressing, I am not sure how to delete…
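Because an empty gzip file still has a non-zero size on disk, one option is to test the decompressed content instead of the file size. The following is a minimal Python sketch of that idea, not a Linux one-liner; the function names are hypothetical, and it defaults to a dry run so that nothing is deleted by accident.

```python
import gzip
from pathlib import Path
from typing import List


def is_empty_gzip(path: Path) -> bool:
    """Return True if the gzip file decompresses to zero bytes."""
    with gzip.open(path, "rb") as handle:
        return handle.read(1) == b""


def remove_empty_fastq_gz(folder: str, dry_run: bool = True) -> List[str]:
    """List (and, unless dry_run is True, delete) empty *.fastq.gz files in `folder`."""
    empty = []
    for path in sorted(Path(folder).glob("*.fastq.gz")):
        if is_empty_gzip(path):
            empty.append(path.name)
            if not dry_run:
                path.unlink()
    return empty
```

Calling `remove_empty_fastq_gz("reads/")` first shows which files would be removed; passing `dry_run=False` performs the deletion.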


Karyotyping variation by ethnicity

Does the human karyotype vary based on ethnicity? If yes, then do clinical investigations of genetic abnormalities via karyotyping need to be adjusted for such variations? If yes, how is this adjustment performed – some software I assume? If not, then why not? I was…


Calculate population allele frequencies from a vcf file including multiple populations

I have a VCF file with about 800 individuals (diploids) and millions of SNPs. The individuals can be divided into 15 to 25 populations. I would like to calculate the allele frequencies for each SNP in each population…
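The per-SNP calculation can be sketched as follows for a single site. This is only an illustration in plain Python, assuming the genotypes have already been read from the VCF as GT strings and that a sample-to-population mapping is available; the sample and population names are hypothetical, and any non-reference allele is counted as the alternate allele.

```python
from collections import defaultdict
from typing import Dict


def population_allele_frequencies(genotypes: Dict[str, str],
                                  populations: Dict[str, str]) -> Dict[str, float]:
    """Alternate-allele frequency per population for one SNP.

    genotypes:   sample -> GT string such as "0/1", "1|1" or "./."
    populations: sample -> population label
    """
    alt_count = defaultdict(int)
    called_count = defaultdict(int)
    for sample, gt in genotypes.items():
        pop = populations[sample]
        for allele in gt.replace("|", "/").split("/"):
            if allele == ".":          # missing call: skip this allele
                continue
            called_count[pop] += 1
            if allele != "0":          # any non-reference allele counts as ALT
                alt_count[pop] += 1
    return {pop: alt_count[pop] / n for pop, n in called_count.items() if n > 0}


# Hypothetical genotypes at one SNP, for illustration only.
site = {"s1": "0/1", "s2": "1/1", "s3": "0/0", "s4": "./."}
pops = {"s1": "popA", "s2": "popA", "s3": "popB", "s4": "popB"}
frequencies = population_allele_frequencies(site, pops)
```

Dividing by the number of called alleles, rather than twice the number of individuals, keeps missing genotypes from deflating the frequency.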


Download COG Database

How to extract the COG database for Gammaproteobacteria?


Answer: AnnotationHub::mapIds() cannot find existing ENSG (GEO supplemental data cross-r

Hi, a quick check on NCBI Gene reveals that the official symbol for this is *PRXL2C*, not *AAED1*. In this way, I would not have expected `org.Hs.eg.db` (using ‘recent’ annotation) to have it. However, I can see that `EnsDb.Hsapiens.v86` (older version) does [have it]. So, there must have been an…


hisat2 compatibility for long read

Hi, I am trying to align PacBio transcriptome reads against the genome to count the gene number. For paired-end reads I used the following workflow:

    # convert gff to gtf
    /home/software/cufflinks-2.2.1/gffread xxx.gff -T -o xxx.gtf
    # build index
    /home/software/hisat2-2.2.1/hisat2_extract_exons.py xxx.gtf > xxx.exon
    /home/software/hisat2-2.2.1/hisat2_extract_splice_sites.py…


AnnotationHub::mapIds() cannot find existing ENSG (GEO supplemental data cross-referenced with ensembl.org)

Anyone know why I’m not getting ENSG ids for some of these symbols? The example below retrieves `NA` for multiple symbols, including AAED1 [whose ENSG is ENSG00000158122][1].

```
> library(AnnotationHub)
> library(org.Hs.eg.db)
> library(GEOquery)
> temp download.file(getGEO("GSM4430459")@header$supplementary_file_1,temp)
> genes unlink(temp)
> ensids = mapIds(org.Hs.eg.db, keys=genes, column="ENSEMBL", keytype="SYMBOL", multiVals="first")
> ensids["AAED1"]…
```


How to identify mutations from FASTA sequences?

I have two full genome sequences (in FASTA format) plus an annotation file (in GFF format) from the same organism. One sequence is the reference genome and the other is my test sequence. Could you please suggest some pipeline or tools (preferably R…


Calculate LD matrix from bgen file

Hello, I am new to plink and am learning as I go. I am trying to calculate an LD matrix for a list of variants while using a bgen file as my reference population. See the command below: ./plink2/plink2 --r2 bin…


extract list of SNPs from multiple chr{1:22}.bgen files using plink2

Hello, I have extracted a list of SNPs based on the MAF cutoffs 0,,0.0001, 0.001,0.01,0.1,.55,1.0. I am running plink2 to extract this list from the .bgen files for individual chromosomes using the following code: plink2 --chr{1:22}.bgen --extract maf1_snps for imputed…


illumina adapter specifying and removing using fastp

Dear all, Recently, I have been asked to preprocess some fastq files produced by Illumina (I don’t know which machine produced the data). This is the information from a fastq file (forward):

@A00957:111:H5MTHDSX2:3:1101:2718:1063 1:N:0:TCCGCGAA+AGGCTATA
CTGACCTCAAGTGATCTACCCACCTCGGTCTCCCAAAGTGCTGGGATTACAGGCAGGAGCCACTGCCCCTGGCCCTAATCATAGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCCGCGAAATCTCGTATGCCGGCGTCTGCTTGAAA

When I asked the company for the adapter sequences, they provided them as D710-501 TCCGCGAATATAGCCT…


plotting AF in vcf files

Hello everyone, I want to plot the AF distribution for all my VCF files. Is there any software to do that? I tried using vcfstats but it shows an error with my VCF files.


Default CNV call thresholds for haplotype chromosomes

Hi, I am confused about CNV calling. The default thresholds are -1.1 => 0, -0.25 => 1, 0.2 => 2, 0.7 => 3 for discrete copy number. But these thresholds don’t work for chrY and chrX. What is…
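To make the quoted mapping concrete, here is a small Python sketch that converts a log2 ratio into a discrete copy number for a diploid reference. It assumes each threshold is an inclusive upper bound (a log2 ratio up to -1.1 gives 0 copies, up to -0.25 gives 1, and so on); the rule used above the last threshold (rounding up `ploidy * 2**log2`) is likewise an assumption, as the excerpt does not state it.

```python
import math
from bisect import bisect_left

# Thresholds quoted in the post: -1.1 => 0, -0.25 => 1, 0.2 => 2, 0.7 => 3
DIPLOID_THRESHOLDS = (-1.1, -0.25, 0.2, 0.7)


def copy_number_from_log2(log2_ratio: float,
                          thresholds=DIPLOID_THRESHOLDS,
                          ploidy: int = 2) -> int:
    """Map a log2 copy ratio to an integer copy number.

    Assumption: thresholds are inclusive upper bounds, so the copy number is
    the index of the first threshold that is >= the log2 ratio.
    """
    index = bisect_left(thresholds, log2_ratio)
    if index < len(thresholds):
        return index
    # Assumed rule above the last threshold: round up the absolute copy number.
    return math.ceil(ploidy * 2 ** log2_ratio)


calls = [copy_number_from_log2(x) for x in (-2.0, -0.5, 0.0, 0.5, 1.0)]
```

These cut-offs are centred on two copies, so a chromosome present in a single copy would need its own set of thresholds, which the excerpt does not provide.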


Why does Txdb transcript length not always match to transcript end-start position?

I have just found an example where biomart’s transcript_length is not identical to transcript_end - transcript_start.

    ensembl_gene_id mgi_symbol chromosome_name strand start_position end_position gene_biotype transcript_start transcript_end strand.1 transcript_length
    128537 ENSMUSG00000037860 Aim2 1 1 173178445 173293606 protein_coding 173178445 173293606…
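A likely reason for such a mismatch is that a transcript length reported for the spliced transcript sums the exons only, whereas end minus start also spans the introns. The toy Python example below illustrates the difference; the exon coordinates are hypothetical and are not those of Aim2.

```python
from typing import List, Tuple

Exon = Tuple[int, int]  # (start, end), 1-based inclusive coordinates


def spliced_length(exons: List[Exon]) -> int:
    """Length of the mature transcript: the sum of its exon lengths."""
    return sum(end - start + 1 for start, end in exons)


def genomic_span(exons: List[Exon]) -> int:
    """Distance covered on the genome, introns included."""
    starts = [start for start, _ in exons]
    ends = [end for _, end in exons]
    return max(ends) - min(starts) + 1


# Hypothetical two-exon transcript separated by one intron.
exons = [(1000, 1199), (5000, 5299)]
lengths = (spliced_length(exons), genomic_span(exons))
```

Only a single-exon transcript would have the two values agree.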


Mixed model analysis using lme4

Hi all, I am a novice in using mixed models. I am not sure how to set up the fixed and random effects in a mixed model. Your expertise would be much appreciated. My experimental setup is as follows: Three years of experiment (Experiment),…


De novo genome assembly

Howdy, I have recently been tasked as the “bioinformatics guy” in my lab and am having trouble with a de novo genome assembly of Mother of a Thousand. I am working with Nanopore reads and have run my reads through Canu. I have all of…


bcftools consensus still returns “Could not parse the header” error

I attempted to create a consensus fasta file using bcftools, i.e.

    bgzip -c All_SRR_SNP_Clean.vcf > All_SRR_SNP_Clean.vcf.gz
    tabix All_SRR_SNP_Clean.vcf.gz
    cat $ref| bcftools consensus $vcf_dir/All_SRR_SNP_Clean.vcf.gz > consensus.fasta

where $ref is the path to a Drosophila reference genome fa and the vcf…


fastq-dump: command not found, what is wrong?

I have downloaded a tar of the SRA Toolkit, unzipped and installed it. I have also done the binary installation where you specify the path, and I think I’ve done it correctly. Now, I try to use fastq-dump and it runs when I…

Extremely low number of variants in VCF file after filtering MIN(FORMAT/DP)>10

I’m doing microbiome analysis where I’m looking for SNPs in a large number of microbe species’ genomes. I ran my bcftools pipeline on around 15 bacterial and viral species from which the end result produced a number of…


variant filtration with gene names or position

Hey, I have a question about filtration of variants: which would be better or more correct, filtering variants by gene name or by position? Thank you so much.


Variant calling from 5 MB regions coming from contrasting cultivars

Hi, I would like to compare ~5 MB genomic (QTL) regions across two groups (resistant and susceptible) and identify variants that might majorly influence resistance. I was thinking of the following pipeline; use susceptible cultivar as the reference (since…


install ensembl-vep

Hello, I want to install ensembl-vep on my Ubuntu 18.04.2. I have already installed LWP::Simple. What should I do next? Thanks in advance for the great help! Best, Yue

    Inspiron-3670:~$ perl -MCPAN -e'install "LWP::Simple"'
    Reading '/home/jing/.cpan/Metadata'
    Database was generated on Sat, 07 Aug 2021 06:55:53…


News and Insights | Nasdaq

Now Playing China Will Have Difficult Second Half: UBS’s Zuercher 1 day ago Now Playing In the Money: Can’t Stop, Won’t Stop 2 days ago Now Playing Biden Aims for 50% Clean Car Goal for U.S. by 2030 2 days ago Now Playing Delta Cases Threaten China Recovery 3 days…


How to separate sub-families from transposons sequence based fasta files?

I’m working on the classification of transposable elements. I want to retrieve the sequences of their sub-classes in separate files. Is there any code or tool available to separate their sub-families, because the dataset contains thousands of sequence entries for different…


calculating fold change from dataframe

Hey, is there code or a way to calculate the fold change? All I could find is a calculation that includes only two variables, for example: log2FC=Log2(B)-Log2(A), but I have a data frame and I want to find the value. My data…
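The two-variable formula extends to a table by applying it row by row. Below is a minimal Python sketch of that idea; the gene names, the column labels `A` and `B`, and the optional pseudocount (to avoid taking the log of zero) are all illustrative assumptions, since the layout of the poster's data frame is not shown.

```python
import math
from typing import Dict


def log2_fold_changes(table: Dict[str, Dict[str, float]],
                      baseline: str = "A",
                      condition: str = "B",
                      pseudocount: float = 0.0) -> Dict[str, float]:
    """Apply log2FC = log2(B) - log2(A) to every row of a gene -> columns table."""
    return {
        gene: math.log2(values[condition] + pseudocount)
              - math.log2(values[baseline] + pseudocount)
        for gene, values in table.items()
    }


# Hypothetical expression values, for illustration only.
expression = {
    "gene1": {"A": 2.0, "B": 8.0},
    "gene2": {"A": 4.0, "B": 1.0},
}
fold_changes = log2_fold_changes(expression)
```

With replicates, the values in each column would typically be group means computed beforehand.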


Highly used R packages with no Python equivalent

The biggies are obviously DESeq2, limma and edgeR, but they are massive packages doing some very complex statistics, and also have dependency trees that would need to be considered. Depending on your background, you might want to look into the rtracklayer/GenomicRanges eco-system. While I personally am not a fan, I…


Converting an S4 object into a dataframe in R

I have an S4 object named ‘res’ which I got while using an R package called RDAVIDWebService. I can’t seem to find a way to convert this object into a dataframe in R. I tried using the function ‘as.data.frame(res)’ but it throws this error: > as.data.frame(res) Error in as.data.frame.default(res) :…


Where To Find Annotation File For Agilent Microarray?

An easier way that has [probably] only come about since this question was posted is via biomaRt in R. You can build annotation tables for Agilent 4×44 arrays for mouse and human as follows:

    require(biomaRt)

Homo sapiens

    # agilent_wholegenome_4x44k_v1
    mart <- useMart('ENSEMBL_MART_ENSEMBL')
    mart <- useDataset('hsapiens_gene_ensembl', mart)
    annotLookup <- getBM( mart…


VCF Filter On Small Genomes

Hi guys, I am working on NGS data from a yeast species (Candida glabrata) to find any mutations related to drug resistance. I am new to bioinformatics, so I am using Galaxy.eu to get used to the algorithms. There is literature about some genes that mutations…
