Tag: query

[lammps-users] moving graphene as rigid – LAMMPS Mailing List Mirror

Hello LAMMPS users, I am using the Windows version of 30 July 2021; units are real. I have a query regarding the rigid command. I want to move graphene as a rigid body. For this I have created two groups: one group is fixedatoms (purple) and the second group is rigidcarbonatoms (grey)….

How to convert transcript-relative coordinates to genomic coordinates?

I have queried using Entrez Utilities (efetch: www.ncbi.nlm.nih.gov/books/NBK25499/) and obtained annotations for transcripts like the following: >Feature ref|NM_152486.3| 1 2557 gene gene SAMD11 gene_syn MRS gene_desc sterile alpha motif domain containing 11 db_xref GeneID:148398 db_xref HGNC:HGNC:28706 db_xref MIM:616765 How/what database should…
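
The arithmetic behind such a conversion can be sketched in Python as follows. This is only an illustration, assuming the exon boundaries of the transcript are already known (for example from a GFF/GTF annotation); the exon coordinates and the function name `transcript_to_genomic` are hypothetical and do not come from the post.

```python
from typing import List, Tuple


def transcript_to_genomic(pos: int, exons: List[Tuple[int, int]], strand: str = "+") -> int:
    """Map a 1-based transcript position to a 1-based genomic position.

    exons: genomic (start, end) intervals, 1-based and inclusive, in any order.
    """
    if pos < 1:
        raise ValueError("transcript positions are 1-based")
    # Walk the exons in transcription order: ascending on '+', descending on '-'.
    ordered = sorted(exons, reverse=(strand == "-"))
    remaining = pos
    for start, end in ordered:
        length = end - start + 1
        if remaining <= length:
            # The position falls inside this exon.
            return start + remaining - 1 if strand == "+" else end - remaining + 1
        remaining -= length
    raise ValueError("position lies beyond the end of the transcript")


# Hypothetical two-exon transcript (made-up coordinates, for illustration only).
exons = [(1000, 1099), (2000, 2199)]
print(transcript_to_genomic(1, exons))       # first base of exon 1
print(transcript_to_genomic(101, exons))     # first base of exon 2
print(transcript_to_genomic(1, exons, "-"))  # last genomic base on the minus strand
```

The sketch ignores chromosome names and assumes the exons do not overlap.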

taxonomy – Assign multiple taxids to a sequence when constructing a local BLAST database

I recently had a script fail due to poor handling of BLAST output. The BLAST -outfmt staxids field usually returns a single taxid, but occasionally it returns two or more taxids separated by a semicolon, such as 556514;701533. Fixing the script to handle this should be fairly straightforward. But the…
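
A minimal Python sketch of handling such a field, assuming the value has already been extracted from the tabular output as a string. The helper name `parse_staxids` is hypothetical, and skipping non-numeric tokens such as `N/A` is an assumption about how missing values should be treated.

```python
from typing import List


def parse_staxids(field: str) -> List[int]:
    """Split a staxids field such as '556514;701533' into integer taxids."""
    taxids = []
    for token in field.split(";"):
        token = token.strip()
        # Assumption: anything that is not purely digits (e.g. 'N/A') is skipped.
        if token.isdigit():
            taxids.append(int(token))
    return taxids


print(parse_staxids("556514;701533"))
print(parse_staxids("9606"))
```

Always returning a list means downstream code does not need a special case for single-taxid hits.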

Install CUDA on NVIDIA Jetson Nano

Hardware prerequisites: a Jetson Nano, a 5V 4A charger, and a 64GB SD card. Software: Preparing Your Raspberry Pi. Flashing Jetson SD Card Image: unzip the SD card image, then insert the SD card into your system. Bring up the Etcher tool and select the target SD card to which you want to flash the image….

Convert list of Accession Numbers to Full Taxonomy

Using NCBI Entrez Direct. $ esearch -db assembly -query "GCA_000005845" | elink -target taxonomy | efetch -format native -mode xml | grep ScientificName | awk -F ">|<" 'BEGIN{ORS=", ";}{print $3;}' Escherichia coli str. K-12 substr. MG1655, cellular organisms, Bacteria, Proteobacteria, Gammaproteobacteria, Enterobacterales, Enterobacteriaceae, Escherichia, Escherichia coli, Escherichia coli K-12, If…

downloading RNA seq data

Hi friends, I am using the following code to get data from TCGA. I want only one aliquot for each person, so that I have unique patient IDs. Is there any line of code that I should add to this to get…

Bioconductor – GSE13015

DOI: 10.18129/B9.bioc.GSE13015 GEO accession data GSE13015_GPL6106 as a SummarizedExperiment. Bioconductor version: Release (3.14). Microarray expression matrix platform GPL6106 and clinical data for 67 septicemic patients and made them available as GEO accession [GSE13015](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE13015). GSE13015 data have been parsed into a SummarizedExperiment object available in ExperimentHub. This data…

[lh3/minimap2] Memory leak when using Python and threads

The program align.py uses mappy to align reads in Python using multiple worker threads. After loading the index the memory usage jumps up quickly to >20Gb and then continues to climb steadily through 40Gb and beyond. This issue was first discovered in bonito and isolated to mappy. The data flow…

Bwa on multiple processor

Hi guys, when I try to run bwa mem on multiple processors, I get the following error: > mpirun -np 16 bwa mem hg19-agilent.fasta R1.fastq R2.fastq | samtools sort -o aln.bam [M::bwa_idx_load_from_disk] read 0 ALT contigs [M::bwa_idx_load_from_disk] read 0 ALT contigs [M::bwa_idx_load_from_disk] read 0 ALT contigs [M::bwa_idx_load_from_disk] read…

Introducing CreateAPI | kean.blog

If you’ve tried OpenAPI spec generators, you know how it goes. They get you about 60-80% there, but you end up having to modify the code by hand. For one of the specs (GitHub REST API spec), a popular code generator I tried produced more than 300 compile-time errors. With…

VEP issue: ERROR: Cache assembly version (GRCh37) and database or selected assembly version (GRCh38) do not match

Describe the issue: VEP gives errors even though my query and reference have the same assembly version. Command: $ ./vep -i examples/homo_sapiens_GRCh37.vcf --cache --refseq Cache reference details while running install.pl: ? 458 NB: Remember to use --refseq when running the VEP with this cache! downloading ftp.ensembl.org/pub/release-104/variation/indexed_vep_cache/homo_sapiens_refseq_vep_104_GRCh37.tar.gz unpacking homo_sapiens_refseq_vep_104_GRCh37.tar.gz converting cache, this may…

BLAST | ICGRC

In bioinformatics, BLAST (Basic Local Alignment Search Tool) is an algorithm for comparing primary biological sequence information, such as the amino-acid sequences of different proteins or the nucleotides of DNA sequences. A BLAST search enables a researcher to compare a query sequence with a library or database of sequences, and…

makeblastdb creating multiple files of unexpectedly large sizes

I have a set of 100 amino acid sequences and I want to perform a BLASTP search against the refseq_protein database. Accordingly I had set up the standalone version of BLAST (Version 2.11.0+) and downloaded the refseq_protein database from NCBI using the following code: wget ftp.ncbi.nlm.nih.gov/refseq/release/complete/*.faa.gz The database gets downloaded…

sra toolkit

Hello, I cannot download SRA files. I tried prefetch SRR17055838 and it gives this error: 2021-12-26T13:48:20 prefetch.2.11.2 err: error unexpected while resolving query within virtual file system module - failed to resolve accession 'SRR17055838' - The object is not available from your location. ( 406 ) 2021-12-26T13:48:20 prefetch.2.11.2:…

The Hot Topic In Probabilistic Programming

One of the biggest challenges of this decade is solving uncertainty, ethical and explainable problems in the thousands of machine learning models we interact with daily. Meta, formerly Facebook, announced the release of their supplement to aid this developing sphere. Bean Machine, Meta’s probabilistic programming system, is a PyTorch-based model…

How do I “flush” data to my RSQLite disk database?

You’re not using the pattern suggested by the RSQLite documentation. That documentation uses dbWriteTable to copy a data frame into a SQLite table: dbWriteTable(con, "mtcars", mtcars) According to this documentation, your full code would look something like this: con <- dbConnect(RSQLite::SQLite(), "./mtcars.db") data(mtcars) dbWriteTable(con, "mtcars", mtcars) dbListTables(con) # Fetch all…

GSA – Galaxy Community Hub

Gene Set Analysis (GSA) can be defined as the comparison of a query gene set (a list or a rank of differentially expressed genes, for example) to a reference database of annotated gene sets, in order to interpret the initial query…

Delightful code generation for OpenAPI specs for Swift written in Swift

Delightful code generation for OpenAPI specs for Swift written in Swift. Fast: processes specs with 100K lines of YAML in less than a second. Smart: generates Swift code that looks like it’s written by hand. Reliable: tested on 500K lines of publicly available OpenAPI specs producing correct code every time…

Automation Hub Open API Power Query Data Parsing

Parsing the data from the Automation Hub API can sometimes prove to be challenging, especially if you are consolidating a very complex report. In this page, we are presenting a couple of tips and tricks that can be used to improve the overall data parsing process. The page contains: The…

RStudio AI Blog: Image segmentation with U-Net

Sure, it’s nice when I have a picture of some object, and a neural network can tell me what kind of object that is. More realistically, there might be several salient objects in that picture, and it tells me what they are, and where they are. The latter task (known as…

bioinformatics – Local BLAST NCBI C++ Exception

I’m getting an error trying to use blast v2.12 against a local nt database. I’ve downloaded nt twice from the ftp server thinking the first time it was corrupt but that didn’t change anything. My command is: blastn -db nt -num_threads 8 -outfmt "6 qseqid sacc stitle ssciname nident…

SWI-Prolog as a Semantic Web Tool for semantic querying in Bioclipse: Integration and performance benchmarking

SWI-Prolog as a Semantic Web Tool for semantic querying in Bioclipse: Integration and performance benchmarking Samuel Lampa June 2, 2010 Abstract The huge amounts of data produced in new high-throughput techniques in the life sciences, and the need for integration of heterogeneous data from disparate sources in…

Postdoctoral Position in Structural Bioinformatics job with National Institute of Allergy and Infectious Diseases (NIAID)

  Postdoctoral Position in Structural Bioinformatics Department of Health and Human Services National Institutes of Health National Institute of Allergy and Infectious Diseases The Structural Bioinformatics Core Section (SBIS) at the National Institute of Allergy and Infectious Diseases (NIAID), Vaccine Research Center (VRC), located on the main National Institutes of…

python – Directly referring to database tables in DataSpell (?)

I downloaded DataSpell, configured Jupyter Notebook (works) and connected to the database which I’m using (works). Is there any way now how I can directly refer to chosen tables in the database (via DataSpell environment) or do I still need to write whole connection code inside Jupyter Notebook? E. g….

Single-cell delineation of lineage and genetic identity in the mouse brain

STICR lentiviral library preparation and validation We synthesized a high-complexity lentivirus barcode library that encodes approximately 60–70 million distinct oligonucleotide RNA sequences (STICR barcodes). STICR barcodes comprised three distinct oligonucleotide fragments cloned sequentially into a multicloning site within the 3′ UTR of an enhanced green fluorescent protein (eGFP) transgene under…

How to differentiate between 16S hypervariable regions using QIIME2? – User Support

M_F: May I search NCBI for sequences corresponding, for example, to the V4 domain? No, NCBI probably would not have such sequences in an easily indexed form, but I could be wrong. Rather, grab some reference sequences (can be a random subsample, do not need all of them) and use…

KINNEY_DNMT1_METHYLATION_TARGETS

Standard name: KINNEY_DNMT1_METHYLATION_TARGETS
Systematic name: M2508
Brief description: Hypomethylated genes in prostate tissue from mice carrying hypomorphic alleles of DNMT1 [GeneID=1786].
Full description or abstract: Previous studies have shown that tumor progression in the transgenic adenocarcinoma of mouse prostate (TRAMP) model is characterized by global DNA hypomethylation initiated during early-stage…

laboratory jobs in germany

We wish you a good luck and have a prosperous career. Working at Labcorp | Jobs and Careers at Labcorp 15 GNeuS Postdoc Positions in Neutron Science of 24 Months Each (Full-time Job) FZJ – Forschungszentrum Jülich. What other similar jobs are there to Laboratory jobs in Germany? Clinical Laboratory…

A pandemic-scale phylogenetic analysis tool

Phylogenetics is an analytical tool that quickly analyzes genomic data to provide invaluable insights into the evolution and spread of a pathogen, thereby allowing public health officials and governments to respond to it in a timely fashion. During the coronavirus disease 2019 (COVID-19) pandemic, phylogenetics, like many other pre-pandemic tools,…

Blast command line pipeline not working

Hello, I am now running a local BLAST pipeline on macOS. The goal here is to take the intervals of the 5 best hits and then extract the SNP variants from multiple vcf.gz files. But I am facing an error which I cannot solve….

Easy OpenAPI specs and Swagger UI for your Flask API

Easy Swagger UI for your Flask API Flasgger is a Flask extension to extract OpenAPI-Specification from all Flask views registered in your API. Flasgger also comes with SwaggerUI embedded so you can access localhost:5000/apidocs and visualize and interact with your API resources. Flasgger also provides validation of the incoming data,…

Piranha Peak-Calling with multiple replicates

I am trying to call RNA-protein interaction peaks using the Piranha software. I have multiple replicates for each experiment and the control data, and I can’t seem to understand how to combine them into one Piranha query. For example, if I was to call…

Dedupe array of database results

A result set from a PDO query is as follows… Array ( [0] => Array ( [activity_link_type] => Category [data_id] => 1 ) [1] => Array ( [activity_link_type] => Category [data_id] => 38 ) [2] => Array ( [activity_link_type] => PData [data_id] => 108 ) [3] => Array ( [activity_link_type]…
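
The general idea of de-duplicating such rows can be sketched in Python (the question itself concerns PHP, so this only illustrates the technique). It assumes two rows count as duplicates when both `activity_link_type` and `data_id` match; the fourth row below is a hypothetical duplicate added for the demonstration.

```python
from typing import Dict, List


def dedupe_rows(rows: List[Dict]) -> List[Dict]:
    """Return rows with duplicates removed, keeping the first occurrence."""
    seen = set()
    unique = []
    for row in rows:
        # Assumption: the pair (activity_link_type, data_id) identifies a row.
        key = (row["activity_link_type"], row["data_id"])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


rows = [
    {"activity_link_type": "Category", "data_id": 1},
    {"activity_link_type": "Category", "data_id": 38},
    {"activity_link_type": "PData", "data_id": 108},
    {"activity_link_type": "Category", "data_id": 1},  # hypothetical duplicate
]
print(dedupe_rows(rows))
```

Tracking seen keys in a set preserves the original order and takes a single pass. When the rows come from a database, selecting distinct rows in the query itself is often simpler.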

alphafold2: HHblits failed – githubmemory

I’ve tried using the standard alphafold2 setup via docker (converted to a singularity container) via the setup described at github.com/kalininalab/alphafold_non_docker, and both result in the following error: […] E1210 12:01:01.009660 22603932526400 hhblits.py:141] – 11:49:18.512 INFO: Iteration 1 E1210 12:01:01.009703 22603932526400 hhblits.py:141] – 11:49:19.070 INFO: Prefiltering database E1210 12:01:01.009746 22603932526400 hhblits.py:141]…

What is the single nucleotide polymorphism database ( dbsnp )?

The Single Nucleotide Polymorphism Database (dbSNP) is a free public archive for genetic variation within and across different species developed and hosted by the National Center for Biotechnology Information (NCBI) in collaboration with the National Human Genome Research Institute (NHGRI). Furthermore, are there any databases for single nucleotide polymorphisms? As there…

Malaysian Genomics dives to lowest in nine months after hitting limit down

KUALA LUMPUR (Dec 9): Malaysian Genomics Resource Centre Bhd’s (MGRC) share price hit limit down on Thursday (Dec 9, 2021) after the health technology company’s stock price fell as much as 35 sen or 29.91% to 82 sen. At 82 sen, MGRC’s share price is at its lowest in about…

r – Is there a way to do a negative match using regex sub?

Say I have a vector of strings, g <- c("bunchofstuff>query=true/fun/weird>bunchofstuff", "bunchofstuff>query=animals/octopus/weird>bunchofstuff", "bunchofstuff>query=flowers/sunshine/fun>bunchofstuff", "bunchofstuff>query=fun/true/sunshine>bunchofstuff") and I want to essentially use sub to erase anything after query=, until the end of the string, IF query= is not followed by true (ideally in any position). As far as I can tell, there isn’t a…
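
One common way to express such a negative match is a negative lookahead. The sketch below uses Python's `re` module rather than R's `sub`, purely to illustrate the pattern. It assumes that "in any position" means anywhere in the slash-separated segment before the next `>`; the function name `erase_unless_true` is hypothetical.

```python
import re

# Match 'query=' only when 'true' does NOT appear as a whole word before the
# next '>', then consume the rest of the string so it can be erased.
PATTERN = re.compile(r"query=(?![^>]*\btrue\b).*$")


def erase_unless_true(s: str) -> str:
    """Erase everything after 'query=' unless the query segment contains 'true'."""
    return PATTERN.sub("query=", s)


g = [
    "bunchofstuff>query=true/fun/weird>bunchofstuff",
    "bunchofstuff>query=animals/octopus/weird>bunchofstuff",
    "bunchofstuff>query=flowers/sunshine/fun>bunchofstuff",
    "bunchofstuff>query=fun/true/sunshine>bunchofstuff",
]
for s in g:
    print(erase_unless_true(s))
```

R's `sub` accepts the same lookahead syntax only with `perl = TRUE`.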

Help needed for Ensembl Gene ID conversion for RNA-seq data

Hello all, I am new to the RNA-seq world and especially new to the bioinformatics side. We recently completed an RNA-seq experiment (total RNA) on human samples and we used Illumina’s DRAGEN RNA pipeline, which generated salmon gene count (.sf) output files. In the files, the gene ID is in…

Bash script to help with printing names of reads that only have query subsequence or its reverse complement and position of first occurrence of this subsequence in read and output of all this in tab separat

Create a bash script that receives the name of a fastq file and a query subsequence…
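
The core logic of such a task, sketched in Python rather than bash to illustrate the idea. It assumes plain four-line FASTQ records and reports the 1-based position of the first occurrence of either the query or its reverse complement; the function names and the example reads are hypothetical.

```python
import io
from typing import Iterator, TextIO, Tuple

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_hits(fastq: TextIO, query: str) -> Iterator[Tuple[str, int]]:
    """Yield (read name, 1-based position) for reads containing the query
    subsequence or its reverse complement."""
    query = query.upper()
    rc = reverse_complement(query)
    while True:
        header = fastq.readline()
        if not header:
            break  # end of file
        seq = fastq.readline().strip().upper()
        fastq.readline()  # '+' separator line
        fastq.readline()  # quality line
        positions = [p for p in (seq.find(query), seq.find(rc)) if p != -1]
        if positions:
            # Read name = header without the leading '@', up to the first space.
            yield header[1:].split()[0], min(positions) + 1


# Hypothetical three-read FASTQ used only for the demonstration.
example = io.StringIO(
    "@r1\nAAACGTAA\n+\nIIIIIIII\n"
    "@r2\nTTTTTTTT\n+\nIIIIIIII\n"
    "@r3 desc\nGGACGTTT\n+\nIIIIIIII\n"
)
hits = list(find_hits(example, "AACG"))
for name, pos in hits:
    print(f"{name}\t{pos}")
```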

Curio Genomics Joins the International Wheat Genome Sequencing Consortium

Newswise — The International Wheat Genome Sequencing Consortium (IWGSC) is pleased to announce that the bioinformatics company Curio Genomics has joined the organization as a sponsoring partner. The IWGSC is an international, collaborative consortium of wheat growers, plant scientists, and public and private breeders dedicated to the development of genomic…

r – RSQLite – Find values with most occurrences in group

I’m using RSQLite to import datasets from an SQLite database. There are multiple millions of observations within the database. Therefore I’d like to do as much as possible of the data selection and aggregation within the database. At some point I need to aggregate a character variable. I want to get the…
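
Going by the title, the goal is the most frequent value of a character column within each group. A sketch of doing that inside SQLite follows, using Python's built-in `sqlite3` module instead of RSQLite; the SQL string itself could equally be passed to `dbGetQuery`. The table `obs` and its columns `grp` and `label` are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE obs (grp TEXT, label TEXT)")
con.executemany(
    "INSERT INTO obs VALUES (?, ?)",
    [("a", "x"), ("a", "x"), ("a", "y"), ("b", "z"), ("b", "w"), ("b", "w")],
)

# Count each (group, value) pair, then keep the pairs whose count equals the
# maximum count within their group. Ties yield more than one row per group.
QUERY = """
WITH counts AS (
    SELECT grp, label, COUNT(*) AS n
    FROM obs
    GROUP BY grp, label
)
SELECT c.grp, c.label, c.n
FROM counts AS c
WHERE c.n = (SELECT MAX(n) FROM counts WHERE grp = c.grp)
ORDER BY c.grp, c.label
"""
most_common = con.execute(QUERY).fetchall()
print(most_common)
```

Aggregating in SQL means only one row per group (plus ties) is transferred, not millions of observations.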

How to call variant by --max-depth for RNAseq

Hi everyone! I have a query regarding variant calling from a high coverage site on the basis of the maximum likelihood variant. I have an RNA-seq BAM file of mapped reads. I called variants using the command below. "bcftools mpileup --max-depth 10000 -Oz -f ref.fa sample.bam | bcftools call -mv -Oz -o…

query sequence is input sequence or its reverse complement

>sp|O14920.1|IKKB_HUMAN RecName: Full=Inhibitor of nuclear factor kappa-B kinase subunit beta; Short=I-kappa-B-kinase beta; Short=IKK-B; Short=IKK-beta; Short=IkBKB; AltName: Full=I-kappa-B kinase 2; Short=IKK2; AltName: Full=Nuclear factor NF-kappa-B inhibitor kinase beta; Short=NFKBIKB; AltName: Full=Serine/threonine protein kinase IKBKB MSWSPSLTTQTCGAWEMKERLGTGGFGNVIRWHNQETGEQIAIKQCRQELSPRNRERWCLEIQIMRRLTH PNVVAARDVPEGMQNLAPNDLPLLAMEYCQGGDLRKYLNQFENCCGLREGAILTLLSDIASALRYLHENR IIHRDLKPENIVLQQGEQRLIHKIIDLGYAKELDQGSLCTSFVGTLQYLAPELLEQQKYTVTVDYWSFGT LAFECITGFRPFLPNWQPVQWHSKVRQKSEVDIVVSEDLNGTVKFSSSLPYPNNLNSVLAERLEKWLQLM LMWHPRQRGTDPTYGPNGCFKALDDILNLKLVHILNMVTGTIHTYPVTEDESLQSLKARIQQDTGIPEED QELLQEAGLALIPDKPATQCISDGKLNEGHTLDMDLVFLFDNSKITYETQISPRPQPESVSCILQEPKRN LAFFQLRKVWGQVWHSIQTLKEDCNRLQQGQRAAMMNLLRNNSCLSKMKNSMASMSQQLKAKLDFFKTSI…

igBLAST query/options error

When I try to run this command: igblastn -germline_db_V $GERMLINE_DB"/human_gl_HV" -germline_db_J $GERMLINE_DB"/human_gl_HJ" -germline_db_D $GERMLINE_DB"/human_gl_HD" -organism human -domain_system imgt -query $WORKDIR"/"$FILE".fasta" -auxiliary_data $IGBLASTDIR"/optional_file/human_gl.aux" -outfmt 7 -num_threads 4 -num_alignments_V 5 -out $FILE"_tab.igblast" I get this error: BLAST query/options error: Germline annotation database human/human_V could not be found in…

NCBI’s Efetch not working

Any help would be much appreciated. My goal is to run the following for loop to generate a list of sample_id (which is actually isolation site) for a list of SRAs. However I get an error (see below) for each and every SRA. for sra in `awk 'NR>1{print $1}' metadata.txt`…

increasing word size extremely slows down the search

Hello, I need to blastp a genome (15,000 seqs) against another genome (12,000 seqs) using Biopython. I decided to use local BLAST and query the genome 1 FASTA file against a genome 2 database (made by the makeblastdb command with the second genome…

Proper way(s) to perform enrichment analysis in R

I am not sure what the proper way is to carry out over-representation analysis (and also gene set enrichment analysis) for RNA-seq data. Ideally, the analysis can be performed in R; otherwise, a software tool or platform that can export the output file (including all the terms that are not statistically significant) will also be…

Is it possible to isolate and merge overlapping hits from a BLAST search Aligned Fasta file?

Hello all, Hope you’re well. I started a PhD in Jan this year and frankly I am struggling. I’m coming into the second week of trying to figure out how to attempt this problem and it’s honestly starting to get to me a bit – the current task I’m working…

Ttc30a affects tubulin modifications in a model for ciliary chondrodysplasia with polycystic kidney disease

Significance Cilia are tubulin-based cellular appendages, and their dysfunction has been linked to a variety of genetic diseases. Ciliary chondrodysplasia is one such condition that can co-occur with cystic kidney disease and other organ manifestations. We modeled skeletal ciliopathies by mutating two established disease genes in Xenopus tropicalis frogs. Bioinformatic…

How to extract genomic upstream region of a protein identified by its NCBI accession number?

I have a list of NCBI protein accession numbers. I would like to extract out the upstream genomic region of the corresponding gene’s nucleotide sequence. I will be thankful to you if you can…

command for isolate unique reads with unique subject id

As you can see below, there is one file that contains different columns. Because some reads align to multiple subject IDs, I want to isolate only the one read which has the highest bit score, but…
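
Keeping the best-scoring alignment per read can be sketched in Python as follows. The sketch assumes tab-separated BLAST output in the default `-outfmt 6` column order, where the query ID is column 1 and the bit score is column 12; the post does not show its columns, so the indices would need adjusting. The function name `best_hits` and the sample lines are hypothetical.

```python
from typing import Dict, Iterable, List


def best_hits(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Return, for each query ID, the fields of its highest-bit-score hit."""
    best: Dict[str, List[str]] = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        qseqid, bitscore = fields[0], float(fields[11])
        # Keep the first hit seen when bit scores tie.
        if qseqid not in best or bitscore > float(best[qseqid][11]):
            best[qseqid] = fields
    return best


# Hypothetical alignments: read1 hits two subjects, read2 hits one.
sample = [
    "read1\tsubjA\t98.0\t100\t2\t0\t1\t100\t5\t104\t1e-40\t180.0",
    "read1\tsubjB\t99.0\t100\t1\t0\t1\t100\t9\t108\t1e-45\t190.5",
    "read2\tsubjA\t95.0\t100\t5\t0\t1\t100\t7\t106\t1e-30\t150.0",
]
for fields in best_hits(sample).values():
    print("\t".join(fields))
```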

BLAST Annotating Sub-sequences of Features with Not Normalized Identity Scores

I’m using BLAST annotation to annotate protein features using the online tool. Below is a screenshot of the entered queries. Below is a screenshot of the result. My question is whether there is a parameter that lets BLAST normalize the…

hypothetical protein DAPPUDRAFT_213302, maker-scaffold2255_size18018-snap-gene-0.6 (gene) Tigriopus kingsejongensis

Associated RNAi Experiments Homology BLAST of hypothetical protein DAPPUDRAFT_213302 vs. L. salmonis genes Match: EMLSAG00000000401 (supercontig:LSalAtl2s:LSalAtl2s1063:86108:87342:-1 gene:EMLSAG00000000401 transcript:EMLSAT00000000401 description:"maker-LSalAtl2s1063-snap-gene-0.46") HSP 1 Score: 149.443 bits (376), Expect = 4.121e-44 Identity = 91/196 (46.43%), Positives = 119/196 (60.71%), Query Frame = 0 Query: 14 MDKITDLQVEPLT--NSRFVKPLRLRFKQDGKVKVWDLIQCHASVAVVIFNQTTQKFVFVRQFRPAVYFSALRRAQGDVEPGTQFKGDEIDPKVGITLELCAGIVD-KSKSLIEIAHEEILEETGYDVPMNLIEEIQTFPVGVGVGGENMTLFCAEVTEAMRKGPGGGLAEEGEMIDVIEMGVEETRTLMRAKSVT 206 MDK+ VEPL +SRFV P R+ ++Q+G…

Helix, Medical University of South Carolina Partner for Population Genomics Program

NEW YORK – The Medical University of South Carolina and Helix said Monday that they have partnered to develop a large-scale population genomics initiative in South Carolina called In Our DNA SC. Beginning this fall, the program will enroll a total of 100,000 patients, providing them with no-cost genetic testing,…

Unable to download fastq files in parallel / SOS

Hi! Very new to all this so bear with me if I’m using incorrect terminology. Also English is my second language. I’m trying to download my fastq files in parallel but it doesn’t work and I keep receiving this error:…

MUSC and Helix launch In Our DNA SC, first-of-its-kind population genomics program to drive preventive, precision health care for South Carolinians | MUSC

Large-scale initiative will advance innovative research, improved health outcomes. CHARLESTON, S.C. and SAN MATEO, Calif. (Sept. 20, 2021) – The Medical University of South Carolina (MUSC) and Helix have announced a strategic collaboration to develop a first-of-its-kind population genomics initiative in South Carolina called In Our DNA SC. The large-scale program is designed…

Submit sequence data to NCBI

Data provision and standards. GEO sequence submission procedures are designed to encourage provision of MINSEQE elements: Thorough descriptions of the biological samples under investigation, and procedures to which they were subjected. Thorough descriptions of the protocols used to generate and process the data. Request updates to accessioned records per the…

Kaggle Training

Listing results for Kaggle training. Learn Python, Data Viz, Pandas & More | Tutorials | Kaggle Learn (Kaggle.com, All Courses). Python: Learn the most important language for data science. Intro to Machine Learning: Learn the core ideas in machine learning, and build your first models. Intermediate Machine Learning: Handle missing values,…

New Gmod Server Downtime History

“Downtime” refers to our ability to communicate with the server. If our system can’t communicate with the server, we mark it as offline and register the time below. In most cases, the server’s failure to respond to a query is due to the server being offline, but may be due…

Is the Ensembl GRCh38 genome assembly more up to date than the UniProtKB online database?

Dear all, I am working with a list of Ensembl accession codes for a desired group of proteins. I have downloaded the protein annotations related to the genome assembly GRCh38. I fetched the genomic coordinates from the UniProtKB API service using the Ensembl accession codes. The service provides a protein annotation…

protti source: R/fetch_alphafold_prediction.R

#' Fetch AlphaFold prediction
#'
#' Fetches atom level data for AlphaFold predictions either for selected proteins or whole
#' organisms.
#'
#' @param uniprot_ids optional, a character vector of UniProt identifiers for which predictions
#' should be fetched. This argument is mutually exclusive to the \code{organism_name} argument.
#' @param…

FASTA Programs and Algorithm – Subject:- Bioinformatics FASTA Programs and Algorithm FASTA Programs

Subject:- Bioinformatics FASTA Programs and Algorithm FASTA Programs FASTA: Compares the protein sequence to another protein sequence in a database or compares nucleotide sequence to another nucleotide sequence in a database. FASTX, FASTY: It performs a search for comparing the nucleotide sequence to a protein sequence database. SSEARCH: It performs…

How to pipe awk of bed file into samtools to extract fasta sequences?

I have a bed file (seq.bed) that contains "queryID queryStart queryEnd". Following is the example (the content of seq.bed file). SRR5892231.6 28 178 SRR5892231.7 4 307 SRR5892231.7 16 307 SRR5892231.9 216 408 I would like to…

org.intermine.sql.query.Query.addHaving java code examples | Tabnine

public void testHavingConstraintSet() throws Exception { q1 = new Query("select table1.field1 from table1 group by table1.field1 having (table1.field1 = table1.field2 or table1.field1 = table1.field3)"); q2 = new Query(); Table t1 = new Table("table1"); Field f1 = new Field("field1", t1); Field f2 = new Field("field2", t1); Field f3 = new Field("field3",…

GATK HaplotypeCaller – Shutting down engine

00:32:48.224 INFO  HaplotypeCaller – Shutting down engine [September 17, 2021 12:32:48 AM CST] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 0.04 minutes. Runtime.totalMemory()=2398617600 java.nio.BufferUnderflowException         at java.nio.ByteBuffer.get(ByteBuffer.java:688)         at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:285)         at java.nio.ByteBuffer.get(ByteBuffer.java:715)         at htsjdk.samtools.MemoryMappedFileBuffer.readBytes(MemoryMappedFileBuffer.java:34)         at…

Genomic features of a carbapenem-resistant K. oxytoca strain

Introduction: Antimicrobial resistance is a global issue associated with an increased and often unrestricted antibiotic use in clinical settings, which leads to the dissemination of carbapenem-resistant Enterobacterales (CRE) in healthcare facilities (World Health Organization, 2017). CRE constitutes a large group of bacteria with different mechanisms for drug resistance. Among them,…

Continue Reading Genomic features of a carbapenem-resistant K. oxytoca strain

Why some SNP’s are not assigned to any gene?

Hi everyone, I am doing polygenic risk score (PRS) analysis. As you may know, PRS is based on SNPs. On the other hand, I would like to do some visualizations at the gene level. Thus, I used the rsnps library (ncbi_snp_query…

Continue Reading Why some SNP’s are not assigned to any gene?

3D-Beacons Network: protein structure data, all in one place

3D-Beacons Network acts as a one-stop shop for protein structures by combining and standardising data from several providers. (Image: cryo-EM structure of the BRCA1-UbcH5c/BARD1 E3-E2 module bound to a nucleosome, obtained from PDBe.) A new platform called 3D-Beacons Network brings together experimentally determined and predicted protein structure models and related…

Continue Reading 3D-Beacons Network: protein structure data, all in one place

BLASTP version 4 database error *Help

I am having trouble running the blastp command. When I run this command against the Swiss-Prot database installed from NCBI, I get this error: $ blastp -db swissprot -query Ecoli_rpoB.fasta -out TEST15.txt BLAST Database error: Error: Not a valid version 4 database. However, when I create my own database…

Continue Reading BLASTP version 4 database error *Help

User friendly (visual&interactive) VCF/BCF mining tools (2021)

What is currently the best user-friendly (visual and interactive) VCF/BCF mining tool in 2021, for VCF/BCF files similar in size to, or even larger than, the 1000 Genomes VCF? I guess most organizations do not have a visual and interactive VCF mining tool but use either: A website front-end…

Continue Reading User friendly (visual&interactive) VCF/BCF mining tools (2021)

compare two vcf files

Hi. I have a problem: I want to compare the rs numbers in two VCF files, so I want to check which of the rs numbers are in the top 10 percent. I don’t know what to do. Can you help me if I have…

Continue Reading compare two vcf files

format specifier associated to “description”

I’m using blastx version 2.2.27+ and the following command: blastx -db nr -query fasta -outfmt '6 qseqid sgi sacc sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore stitle' -evalue 1e-10 -num_alignments 1 -num_threads 24 -out blast_farm_50_20.txt that gives me this…

Continue Reading format specifier associated to “description”
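The `-outfmt '6 …'` string in the excerpt above defines the column order of BLAST's tab-separated output. As a minimal Python sketch, one row of that output can be mapped back to the named specifiers as follows. The field list is copied from the quoted command; `parse_hit` is a hypothetical helper name, and the example row is invented for illustration rather than real BLAST output.

```python
# Column names, in the order given to -outfmt in the quoted command.
FIELDS = (
    "qseqid sgi sacc sseqid pident length mismatch gapopen "
    "qstart qend sstart send evalue bitscore stitle"
).split()


def parse_hit(line):
    """Split one tab-separated hit line into a {specifier: value} dict."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    return dict(zip(FIELDS, values))


# Invented example row (placeholder values, not real BLAST output).
example = "\t".join([
    "query1", "0", "ACC0001", "ref|ACC0001|", "98.5", "200", "3", "0",
    "1", "600", "10", "209", "1e-50", "400", "hypothetical protein [Example organism]",
])
hit = parse_hit(example)
print(hit["sacc"], hit["stitle"])
```

Because `stitle` is the last column and fields are split on tabs only, spaces inside the title are kept intact.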

Regarding Error query

My name is Ruddhida Vidwans, I am a Ph.D. Research Scholar at Jain University, Bengaluru, India. My research area is Forensic Microbiology. Currently, apart from my Ph.D., I am focussing on learning next-generation sequencing analysis. For that, I am practicing with the publicly available data. I am using the “GSE163207”…

Continue Reading Regarding Error query

Bioinformatics Scientist 2 (Internal Only)

Job Summary We have an exciting opportunity in our US office for an experienced bioinformatician who is interested in working at the forefront of the gene editing (CRISPR knock-out, HDR, base editing) and gene modulation (CRISPRa, CRISPRi, RNAi) fields. As a Bioinformatics Scientist 2, you will have the opportunity to…

Continue Reading Bioinformatics Scientist 2 (Internal Only)

Error in merged bam files

Hello, I am trying to merge unmapped and mapped BAM files. I merged the BAM files using the Picard tool (gatk.broadinstitute.org/hc/en-us/articles/360036883871-MergeBamAlignment-Picard). I checked the merged BAM using the ValidateSamFile command (gatk.broadinstitute.org/hc/en-us/articles/360036854731-ValidateSamFile-Picard-) and it showed the errors below:

Error Type                            Count
ERROR:MATES_ARE_SAME_END              5496
ERROR:MISMATCH_FLAG_MATE_NEG_STRAND   5478
ERROR:MISMATCH_MATE_CIGAR_STRING…

Continue Reading Error in merged bam files

Mark duplicates the bam files sorted by coordinates

Hello. As mentioned in the documentation (gatk.broadinstitute.org/hc/en-us/articles/360037224932?page=1#comment_4406762304155), it is ideal to submit queryname-sorted BAM files. Will it be a computationally intensive process to submit coordinate-sorted BAM files instead? First, I sorted the unmapped…

Continue Reading Mark duplicates the bam files sorted by coordinates

Sql Server Import From Excel With Sql, Duplicate Column Names

Explore LabKey Server’s specialized tools for assay data management below and read additional documentation on the LabKey support and documentation portal. Flow. Using SQL Search you can search for the column name and find all the stored procedures where it is used. Work faster. Finding anything in the Object Explorer….

Continue Reading Sql Server Import From Excel With Sql, Duplicate Column Names

probable dimethyladenosine transferase-like, maker-scaffold153_size302544-snap-gene-2.18 (gene) Tigriopus kingsejongensis

Associated RNAi Experiments Homology BLAST of probable dimethyladenosine transferase-like vs. L. salmonis genes
Match: EMLSAG00000006273 (supercontig:LSalAtl2s:LSalAtl2s341:673186:674124:1 gene:EMLSAG00000006273 transcript:EMLSAT00000006273 description:"augustus_masked-LSalAtl2s341-processed-gene-6.3")
HSP 1 Score: 484.567 bits (1246), Expect = 2.083e-174
Identity = 227/310 (73.23%), Positives = 259/310 (83.55%), Query Frame = 0
Query: 9 KVRKTGSGMSTVEAAGSGGGGQQGMVFNTGLGQHILKNPLVVQSIIDKAALRSTDVVLEIGPGTGNLTVRALEKCKKLIACEVDPRMVAELQKRVQGTHFQSKLQIMVGDVIKTDLPFFDACVANVPYQISSPLVFKLLLHRPFFRCAVLMFQREFAQRLVAKPGDKLYCRLSINTQLLARVDHVMKVGKGNFRPPPKVESSVVRIEPRNPPPPINFKEWDGLTRVAFVRKNKTLGAAFNQTTVLMMLEKNYRVHLSLADEPVPEKIDIKSIIETVLAEIAFKEKRARSMDIDDFMKLLHAFNAKGIHFV 318
KV+ T + GG+QG+VFNT LGQHILKNP VV…

Continue Reading probable dimethyladenosine transferase-like, maker-scaffold153_size302544-snap-gene-2.18 (gene) Tigriopus kingsejongensis

Bioconductor – rols

This package is for version 2.11 of Bioconductor; for the stable, up-to-date release version, see rols. An R interface to the Ontology Lookup Service. Bioconductor version: 2.11. This package allows querying EBI’s Ontology Lookup Service (OLS) using Simple Object Access Protocol (SOAP). Author: Laurent Gatto <lg390 at…

Continue Reading Bioconductor – rols

How to retrieved protein Fasta sequence from accession number by Entrez

AF131201 AF326487 AF326488 AF326489 AF326490. These are some of the protein accession numbers. First, I don't know what kind of accession numbers these are, because protein accession numbers usually start with XP_/NP_. The problem arises when I try the following command using…

Continue Reading How to retrieved protein Fasta sequence from accession number by Entrez

Exec format error in unmapped bam file

Hello, I created an unmapped BAM file from a FASTQ file (sample 1). When I tried to search the BAM file by query name, I got the ‘Exec format error’. #1_ucheck.bam: unmapped BAM file from the Sample 1 FASTQ file. Code: samtools view 1_ucheck.bam |…

Continue Reading Exec format error in unmapped bam file

DNA Sequence Classification Based on Milvus

Introduction DNA sequencing is a popular concept in both academic research and practical applications, such as gene traceability, species identification, and disease diagnosis. Whereas all industries starve for a more intelligent and efficient research method, artificial intelligence has attracted much attention, especially from the biological and medical domains. More and…

Continue Reading DNA Sequence Classification Based on Milvus

Drug/small molecule databases with SMILES and chemical properties?

Dear all, My lab has recently screened a library of compounds. The compounds are meant to represent chemical diversity across chemical space. We would like to supplement our experiments with in silico work. Using the pubchemy python API, I have systematically…

Continue Reading Drug/small molecule databases with SMILES and chemical properties?

tabix for ID column

Hello, I’m looking for something similar to tabix. But instead of looking up information within a given region, I would like to use the values in the ID column for quick lookup. So, for example, I would like to take the compressed dbSNP file, index…

Continue Reading tabix for ID column
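The idea in the excerpt above, looking records up by ID instead of by region, can be illustrated with a small Python sketch. This is not tabix and not an existing tool: it assumes an uncompressed, tab-delimited file with the ID in the third column (as in VCF), and it keeps an in-memory dictionary of byte offsets, which may not be practical for a file the size of dbSNP. The helper names `build_id_index` and `fetch_by_id` and the demo rows are invented for illustration.

```python
import os
import tempfile


def build_id_index(path, id_column=2):
    """Map each record ID to the byte offset where its line starts."""
    index = {}
    with open(path, "rb") as handle:
        while True:
            offset = handle.tell()
            line = handle.readline()
            if not line:
                break
            if line.startswith(b"#"):
                continue  # skip header lines
            fields = line.rstrip(b"\r\n").split(b"\t")
            if len(fields) > id_column:
                index[fields[id_column].decode()] = offset
    return index


def fetch_by_id(path, index, record_id):
    """Seek straight to the stored offset and return that line."""
    with open(path, "rb") as handle:
        handle.seek(index[record_id])
        return handle.readline().decode().rstrip("\r\n")


# Demo on a tiny file with invented rows (not real dbSNP data).
demo_rows = "#CHROM\tPOS\tID\n1\t100\trsA\n1\t200\trsB\n"
with tempfile.NamedTemporaryFile("w", suffix=".tsv", delete=False) as tmp:
    tmp.write(demo_rows)
    demo_path = tmp.name
demo_index = build_id_index(demo_path)
demo_hit = fetch_by_id(demo_path, demo_index, "rsB")
os.remove(demo_path)
print(demo_hit)
```

For a compressed file, the same approach would need a seekable compression format and an on-disk index, which this sketch does not attempt.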

Kaggle Mini Courses

Kaggle mini courses: Learn Python, Data Viz, Pandas & More Tutorials (Kaggle.com). Python: Learn the most important language for data science. Intro to Machine Learning: Learn the core ideas in machine learning, and build your first models. Intermediate Machine Learning: Handle missing…

Continue Reading Kaggle Mini Courses

Annotate Structural variants with population specific allele frequency values

Hi, Has anyone tried filtering structural variants based on population-specific allele frequency (AF) values (for example gnomAD-SV or phase 3 1000 Genomes SV)? I have a set of SVs that I detected using a multipronged approach. For prioritising variants,…

Continue Reading Annotate Structural variants with population specific allele frequency values

Senior Machine Learning/Bioinformatics Software Engineer, Research – Invitae

Invitae is dedicated to bringing comprehensive genetic information into mainstream medicine to improve healthcare for billions of people. Our team is driven to make a difference for the patients we serve. We are leading the transformation of the genetics industry, by making genetic testing affordable and accessible for everyone to…

Continue Reading Senior Machine Learning/Bioinformatics Software Engineer, Research – Invitae

Replace fasta header using bash : bioinformatics

Hello people, I got stuck with my new script and perhaps you can help me. Its goal is to take an input table with queries and subjects (produced by a local BLAST) and replace query names with subject names in the corresponding FASTA file. In detail, the table input file…

Continue Reading Replace fasta header using bash : bioinformatics

Still Not Use QIIME2 for Amplicon Analysis? You’re OUT!

I want to share with you a powerful tool named QIIME 2. It’s handy for Amplicon analysis. What is QIIME 2? The Quantitative Insights Into Microbial Ecology (QIIME) microbiome bioinformatics platform has supported many microbiome studies and gained a broad user and developer community. QIIME 2™(qiime2.org) is a new version…

Continue Reading Still Not Use QIIME2 for Amplicon Analysis? You’re OUT!

Error: NCBI C++ Exception – Invalid choice selection: NCBI-Seqalign::Score.value.real

Hi, I am having issues running BLAST from the command line, accessing remote databases. This is what I am running: blastn -db nt -query {input} -out {output} -evalue 0.001 -max_target_seqs 100 -remote Which returns this: Error: NCBI C++ Exception: T0 "/opt/conda/conda-bld/blast_1559335677723/work/c++/src/objects/seq/../seqalign/Score_.cpp",…

Continue Reading Error: NCBI C++ Exception – Invalid choice selection: NCBI-Seqalign::Score.value.real

MEME suite compare motifs between species

Hello, I have a set of 5kb sequences upstream of a gene from different primates, and I would like to know what motifs are in enhancers and promoters of primates vs the 5kb upstream sequences of a set of the same gene but…

Continue Reading MEME suite compare motifs between species

kaggle datasets for tableau

The homepage is full of small visualizations telling stories about each data set. This is the default Tableau location (if you’ve not changed) so far. Transformation processes can also be referred to as data wrangling, or data munging, transforming and mapping data from one raw data form into another format…

Continue Reading kaggle datasets for tableau

Remote blast query limit

Hello! How many BLAST queries can be processed by remote BLAST calls with Biopython’s Bio.Blast.NCBIWWW.qblast or BLAST+ with the -remote flag? When I go above 1 sequence I get the following message near the top of my XML results file (and no results): internal_error: (Severe Error)…

Continue Reading Remote blast query limit

kegg pathway database

Pathways that include all genes in gene_ids. Here the KEGG API operations are explained in comparison to these web tools. MODULE — modules or functional units of genes, BRITE — hierarchical classifications of biological entities, This page was last edited on 22 October 2020, at 18:43. The list can be…

Continue Reading kegg pathway database

Commandline BLAST – errors?

Hi, I’m running command line blastx and blastp against a number of databases. However, running the exact same script on the exact same input files against the exact same databases occasionally seems to output different filesizes. I can only assume that this is because the…

Continue Reading Commandline BLAST – errors?

Top Trends Shaping the Global Biotechnology Industry in 2021

Global Biotechnology Market Analysis, with a forecast period of 2020 to 2025, provides an in-depth analysis of market growth factors, future assessment, country-level analysis, Biotechnology industry distribution, and competitive landscape analysis of major industry players. The global Biotechnology market research report offers extensive information about the top makers…

Continue Reading Top Trends Shaping the Global Biotechnology Industry in 2021

Download intergenic spacers between specific genes for all findings in genbank

Good afternoon, I want to download FASTA sequences from GenBank for intergenic spacers between certain genes for a taxon (for example, atpB-rbcL from the chloroplast genome). If I were downloading FASTA for a single gene (GENE), I would do…

Continue Reading Download intergenic spacers between specific genes for all findings in genbank

Can you make Machine Learning models to Retrieve Landmarks?

Google is organizing the 4th iteration of its Landmark Retrieval Challenge on Kaggle: the Google Landmark Retrieval 2021 Challenge. With the help of this competition, Google aims to leverage the community of machine learning practitioners on Kaggle to help retrieve images that have the same landmark as that in the queried image….

Continue Reading Can you make Machine Learning models to Retrieve Landmarks?

Output of samtools view, what does the third column actually represent?

The samtools view command outputs information from SAM and BAM files in SAM format. You can find a description of the SAM format here: samtools.github.io/hts-specs/SAMv1.pdf Section 1.4 deals with the meaning of each of the mandatory columns. It includes the following table (columns: Col, Field, Type, Regexp/Range, Brief description): 1 QNAME…

Continue Reading Output of samtools view, what does the third column actually represent?
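To make the column question in the excerpt above concrete, here is a minimal Python sketch that labels the eleven mandatory fields of one SAM alignment line, using the field names from the SAM specification. The third column is RNAME, the reference sequence name. `parse_sam_line` is a hypothetical helper name, and the example line is invented for illustration.

```python
# The eleven mandatory SAM fields, in column order (SAM specification, section 1.4).
SAM_FIELDS = [
    "QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
    "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL",
]


def parse_sam_line(line):
    """Return the mandatory fields as a dict, plus optional tags as a list."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < len(SAM_FIELDS):
        raise ValueError("alignment line has fewer than 11 columns")
    record = dict(zip(SAM_FIELDS, cols[:11]))
    record["TAGS"] = cols[11:]  # optional TAG:TYPE:VALUE fields, if any
    return record


# Invented alignment line (not taken from a real file).
example = "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0"
record = parse_sam_line(example)
print(record["RNAME"])
```

Header lines, which start with `@`, are not alignment records and would need to be skipped before calling the helper.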

Blast locally on multiple fasta files with multiple database

Hi all, I need to run BLAST locally on multiple FASTA files contained in a directory. Previously, I have used the command below: ls *.fasta | parallel -a – blastp -query {} -db my_database -evalue 0.00001 -qcov_hsp_perc 50 -outfmt 6 -max_target_seqs…

Continue Reading Blast locally on multiple fasta files with multiple database