Tag: hg38
Index of /goldenPath/macEug2/vsHg38/reciprocalBest
This directory contains reciprocal-best netted chains for macEug2-hg38. – macEug2.hg38.rbest.net.gz: macEug2-referenced recip.best net to hg38. – macEug2.hg38.rbest.chain.gz: chains extracted from the recip.best net. These can be passed to the liftOver program to translate coords from macEug2 to hg38 through the recip.best net. – hg38.macEug2.rbest.net.gz: hg38-referenced recip.best net. – hg38.macEug2.rbest.chain.gz: recip.best…
YP5260 – YFull YTree Info
Sample ID Country / Language Info Ref File Testing company Statistics Status I7021 Mongolia (Bulgan) C-F15910 C-F15910*, C-Y507 Hg19 .BAM Ancient 3X, 20.2 Mbp, 40 bp NEO249 Russia (Chukotskiy avtonomnyy okrug) C-F15910* —— Hg19 .BAM Ancient 1X, 7.2 Mbp, 81 bp I11696 Mongolia (Bulgan) C-Y507 —— Hg19 .BAM Ancient 2X,…
08 Comparing visualization results of different annotation software
In the first two sections, we compared the use of different VCF annotation software and converted the annotated results into MAF file format. Because snpEff annotation results cannot be converted to MAF, we will later compare the results of ANNOVAR, VEP, and GATK Funcotator…
BY3 – YFull YTree Info
J-BY3 – YFull YTree Info SNPs currently defining J-BY3 BY3 / FGC15184 Sample ID Country / Language Info Ref File Testing company Statistics Status YF016315 —— J-FGC15174 J-FGC15174*, J-FGC15168*, J-FT258574 Hg38 .BAM FTDNA (Y500) 23X, 12.0 Mbp, 151 bp YF068400 Sudan (Janūb Kurdufān) J-FGC38453* —— Hg38 .BAM FTDNA (Y700)…
Allelic expression imbalance of PIK3CA mutations is frequent in breast cancer and prognostically significant
Subjects Normal breast and tumor samples were obtained with the written informed consent from donors and appropriate approval from local ethical committees, with the detailed information described in the respective original publications: normal tissue9, METABRIC14, TCGA35. Differential allelic expression analysis DNA and total RNA from 64 samples of normal breast…
YP3952 – YFull YTree Info
Q-YP3952 – YFull YTree Info Sample ID Country / Language Info Ref File Testing company Statistics Status YF073154 Russia (Chechenskaya Respublika) / Chechen Q-YP3952* —— Hg38 .BAM FTDNA (Y700) 33X, 18.2 Mbp, 151 bp YF092378 Russia (Chechenskaya Respublika) / Chechen Q-BZ87 —— Hg38 .BAM FTDNA (Y700) 55X, 18.5 Mbp, 151…
GeneActivity without Fragments file in Seurat for Integrating scRNA-seq and scATAC-seq
Hi all, I am new to R and Seurat, and I am following Seurat tutorials to find anchors between RNA-seq and ATAC-seq data according to: Combining the two tutorials is difficult for a cell line data set I am using for SNARE-seq Human here. I managed to run the following…
Variant #0000255165 (NC_000010.10:g.123278248A>G, FGFR2(NM_000141.4):c.939+1245T>C) – Global Variome shared LOVD
Variant #0000255165 (NC_000010.10:g.123278248A>G, FGFR2(NM_000141.4):c.939+1245T>C) Chromosome 10 Allele Unknown Affects function (as reported) Probably does not affect function Affects function (by curator) Not classified Classification method – Clinical classification likely benign DNA change (genomic) (Relative to hg19 / GRCh37) g.123278248A>G DNA change (hg38) g.121518734A>G Published as FGFR2(NM_022970.3):c.1035T>C (p.Y345=) ISCN – DB-ID FGFR2_000119 Variant remarks VKGL data sharing initiative Nederland Reference – ClinVar ID – dbSNP ID – Origin CLASSIFICATION record Segregation –…
Parse a file of strings in python separated by newline into a json array
I don’t see where you’re actually reading from the file in the first place. You have to read your path_text.txt before you can format it correctly, right? with open('path_text.txt', 'r', encoding='utf-8') as myfile: content = myfile.read().splitlines() This will give you ['/gp/oi/eu/gatk/inputs/NA12878_24RG_med.hg38.bam', '/gp/oi/eu/gatk/inputs/NA12878_24RG_small.hg38.bam'] in content. Now if you want to write this…
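A minimal sketch of the step this answer describes, assuming the goal is a JSON array with one string per input line. The helper names `lines_to_json` and `file_to_json` are hypothetical and do not come from the original answer; only `path_text.txt` is taken from the excerpt.

```python
import json


def lines_to_json(text):
    """Turn newline-separated strings into a JSON array (as a string)."""
    # splitlines() removes the line endings; blank lines are skipped.
    items = [line for line in text.splitlines() if line.strip()]
    return json.dumps(items)


def file_to_json(path):
    """Read a text file (e.g. path_text.txt) and return its lines as a JSON array."""
    with open(path, "r", encoding="utf-8") as handle:
        return lines_to_json(handle.read())
```

Calling `file_to_json('path_text.txt')` would return the two BAM paths from the excerpt as a single JSON array string, which can then be written to a file.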
Z697 – YFull YTree Info
R-Z697 – YFull YTree Info SNPs currently defining R-Z697 Z697 Sample ID Country / Language Info Ref File Testing company Statistics Status YF009397 Sweden (Västra Götalands län) R-Z697* —— Hg19 .BAM FTDNA (Y500) 81X, 14.4 Mbp, 165 bp YF084333 Italy (Chieti) R-FT285492 —— Hg38 .BAM Dante Labs 14X, 23.4…
Y140591 – YFull YTree Info
R-Y140591 – YFull YTree Info Sample ID Country / Language Info Ref File Testing company Statistics Status YF067865 Germany R-Y140591* —— Hg38 .BAM FTDNA (Y700) 52X, 18.7 Mbp, 151 bp YF076495 Germany R-FT167842 —— Hg38 .BAM FTDNA (Y700) 49X, 18.3 Mbp, 151 bp YF067633 Germany R-FT167842 —— Hg38 .BAM FTDNA…
CTS1346 – YFull YTree Info
Sample ID Country / Language Info Ref File Testing company Statistics Status HGDP01351 China, People’s Republic of O-F3607* —— Hg38 .BAM Scientific 16X, 23.6 Mbp, 151 bp YF079316 —— O-Y224790 —— Hg19 .BAM 23mofang 58X, 21.3 Mbp, 150 bp HG00583 China, People’s Republic of O-Y224790 —— Hg19 .BAM Scientific ——…
A114 – YFull YTree Info
R-A114 – YFull YTree Info SNPs currently defining R-A114 FGC78244 A114(H) H Sample ID Country / Language Info Ref File Testing company Statistics Status YF067576 France (Ille-et-Vilaine) R-A114* —— Hg19 .BAM Dante Labs 12X, 23.0 Mbp, 151 bp YF088360 United States (Virginia) R-CTS4466* —— Hg38 .BAM FTDNA (Y700)…
F13864 – YFull YTree Info
Sample ID Country / Language Info Ref File Testing company Statistics Status ERS5240131 Singapore C-F13864* —— Hg19 .BAM Scientific 7X, 22.9 Mbp, 150 bp YF076683 China, People’s Republic of (Shandong) C-F13864* —— Hg19 .BAM 23mofang 57X, 21.2 Mbp, 150 bp YF071813 —— C-F13864* —— Hg19 .BAM 23mofang 21X, 21.8 Mbp,…
‘No genomes installed!’ error from getREF
I was trying to use the getPlotSetArray() function, but I got the error ‘No genomes installed!’ from the getREF function. I dug into the problem and it turns out that in the latest version of the BSgenome package the output of the function BSgenome::installed.genomes(splitNameParts=TRUE) changed from: pkgname organism provider provider_version…
Building custom hg38 – alt contigs
I am exploring modifications of hg38 like these: github.com/mebbert/Dark_and_Camouflaged_genes Starting from the regular bcbio hg38 data installation Masking hg38.fa using bedtools maskfasta Generating indexes using bcbio_setup_genome.py for seq and bwa as described in the manual The bwa directory then contains ├── bwa │ ├── hg38_masked.fa.amb │ ├── hg38_masked.fa.ann │ ├──…
L1193 – YFull YTree Info
I-L1193 – YFull YTree Info SNPs currently defining I-L1193 L1193 FGC87558 Y72031 Sample ID Country / Language Info Ref File Testing company Statistics Status ASH1 Ireland (Tipperary) I-L1193* —— Hg19 .BAM Ancient 1X, 10.5 Mbp, 101 bp PB581 Ireland (Clare) I-L1193* —— Hg19 .BAM Ancient 2X, 15.8…
3 -tag XM” failed! when running rsem-calculate-expression
Dear sir, When I ran “rsem-calculate-expression --paired-end --alignments -p 8 input.bam gencodev22 ./out”, I got the error message: rsem-parse-alignments ../bowtie2/hg38 ./rsem-out.temp/rsem-out ./rsem-out.stat/rsem-out /NGS_Storage/Debbie/RNA-seq/variant_calling_20210602/RNA-leukemia002A-906.para.bam 3 -tag XM Read A00355:209:H3KTLDSX2:2:2606:24677:17425: The adjacent two lines do not represent the two mates of a paired-end read! (RSEM assumes the two mates of a paired-end read should…
At ABRF Meeting, T2T Consortium Describes Improvements of Complete Human Genome
PALM SPRINGS, Calif. — Researchers from the Telomere-to-Telomere (T2T) Consortium have generated an assembly of a complete human reference genome that could lead to better variant calling in the clinic and inform new studies of cell biology. The results of the project were presented by Karen Miga, an investigator at…
Y18411 – YFull YTree Info
J-Y18411 – YFull YTree Info Sample ID Country / Language Info Ref File Testing company Statistics Status YF072520 Albania J-BY111710 —— Hg19 .BAM Dante Labs 10X, 22.8 Mbp, 151 bp YF067307 Palestine (Nablus) J-BY111710 —— Hg38 .BAM FTDNA (Y700) 34X, 18.7 Mbp, 151 bp NA20827 Italy (Firenze) J-CTS3330 —— Hg19…
Difference between knownGene and wgEncodeGencodeCompV39
Hi: I am a bit confused about the relationship/difference between knownGene and wgEncodeGencodeCompV39 in the UCSC Table Browser. Does anyone know the precise difference between them? They can both be downloaded from the goldenPath page. knownGene: The schema is here, which does NOT match the file (knownGene.txt.gz) I downloaded. According to…
Bioconductor Package Installation
When I try to install the gtf for hg38 BiocManager::install("TxDb.Hsapiens.UCSC.hg38.knownGene") I get the following error: 'getOption("repos")' replaces Bioconductor standard repositories, see '?repositories' for details replacement repositories: CRAN: cran.rstudio.com/ Bioconductor version 3.14 (BiocManager 1.30.16), R 4.1.2 (2021-11-01) Installing package(s) 'TxDb.Hsapiens.UCSC.hg38.knownGene' Error in readRDS(dest) : error reading from connection Per stackoverflow.com/questions/67455984/getoptionrepos-replaces-bioconductor-standard-repositories-see-reposito I…
M8498 – YFull YTree Info
B-M8498 – YFull YTree Info Sample ID Country / Language Info Ref File Testing company Statistics Status YF004283 Saudi Arabia B-M8498* —— Hg19 .BAM FTDNA (Y500) 43X, 13.7 Mbp, 165 bp HGDP00992 Namibia B-M7650* —— Hg38 .BAM Scientific 18X, 23.5 Mbp, 151 bp YF013963 —— B-Y82361 —— Hg38 .BAM FTDNA…
FGC15109 – YFull YTree Info
I-FGC15109 – YFull YTree Info SNPs currently defining I-FGC15109 FGC15109 Sample ID Country / Language Info Ref File Testing company Statistics Status SZ43 Hungary (Somogy) I-BY138* —— Hg19 .BAM Ancient 8X, 22.8 Mbp, 32 bp YF010533 —— I-BY138* —— Hg19 .BAM FTDNA (Y500) 73X, 14.9 Mbp, 165 bp YF019250…
bedtools -u not giving unique files
These are the steps I’m following. The first step, extracting the sample using a BED file, is this (here the BED file is the input BED file converted to hg38): tabix -h -R Hg19_to_Hg38_sorted.bed.gz gnomad.genomes.v{g_version}.hgdp_tgp.chr{chr}.vcf.bgz | perl {vcftools} -c {sample_name} > {sample_name}_out.vcf Output ({sample_name}_out.vcf): chr2 113982416 rs56177103 TATAAAATAAAATAAA…
Pathway analysis of RNAseq data using goseq package
Hello, I have finished the RNA seq analysis and I am trying to perform some pathway analysis. I have used the gage package and I was looking online about another package called goseq that takes into account length bias. However, when I run the code I get an error. How…
FGC19851 – YFull YTree Info
R-FGC19851 – YFull YTree Info SNPs currently defining R-FGC19851 FGC19851 Sample ID Country / Language Info Ref File Testing company Statistics Status YF072967 United States (Georgia) R-FGC19851* —— Hg38 .BAM FTDNA (Y700) 34X, 18.7 Mbp, 151 bp YF009427 —— R-FGC65264* —— Hg19 .BAM FTDNA (Y500) 38X, 12.8 Mbp, 165…
Transcriptional kinetics and molecular functions of long noncoding RNAs
Ethical compliance The research carried out in this study has been approved by the Swedish Board of Agriculture, Jordbruksverket: N343/12. Cell culture Mouse primary fibroblasts were derived from adult (>10 weeks old) CAST/EiJ × C57BL/6J or C57BL/6J × CAST/EiJ mice by skinning, mincing and culturing tail explants (for at least 10 d) in DMEM high…
FGC35106 – YFull YTree Info
Sample ID Country / Language Info Ref File Testing company Statistics Status YF016938 Saudi Arabia (Ar Riyāḍ) J-FGC35106 YF081770 | J-FGC35106*, J-FGC58682* Hg38 .BAM FTDNA (Y500) 30X, 11.5 Mbp, 151 bp YF016937 Saudi Arabia (Ar Riyāḍ) J-FGC35106 YF081769 | J-FGC35106*, J-FGC58682* Hg38 .BAM FTDNA (Y500) 37X, 12.5 Mbp, 151 bp…
Bioconductor – TAPseq
DOI: 10.18129/B9.bioc.TAPseq This package is for version 3.12 of Bioconductor; for the stable, up-to-date release version, see TAPseq. Targeted scRNA-seq primer design for TAP-seq Bioconductor version: 3.12 Design primers for targeted single-cell RNA-seq used by TAP-seq. Create sequence templates for target gene panels and design gene-specific primers using…
YP4024 – YFull YTree Info
Sample ID Country / Language Info Ref File Testing company Statistics Status ERS2478532 Turkmenistan Q-YP4024* —— Hg19 .BAM Scientific 17X, 16.7 Mbp, 151 bp YF006625 Russia (Tomskaya oblast’) / Selkup Q-YP4024* —— Hg19 .BAM FTDNA (Y500) 67X, 14.8 Mbp, 165 bp DA162 Russia (Severnaya Osetiya-Alaniya, Respublika) Q-BZ5214* —— Hg19 .BAM…
Bioconductor – branchpointer
DOI: 10.18129/B9.bioc.branchpointer Prediction of intronic splicing branchpoints Bioconductor version: Release (3.14) Predicts branchpoint probability for sites in intronic branchpoint windows. Queries can be supplied as intronic regions; or to evaluate the effects of mutations, SNPs. Author: Beth Signal Maintainer: Beth Signal <b.signal at garvan.org.au> Citation (from within R,…
Y570 – YFull YTree Info
Sample ID Country / Language Info Ref File Testing company Statistics Status AF2 —— Q-Y570 Q-Y570*, Q-F746* Hg19 .BAM Ancient 1X, 1.3 Mbp, 94 bp YF093124 —— Q-M120* —— Hg38 .BAM Nebula Genomics 57X, 23.6 Mbp, 150 bp Kolyma1 Russia (Sakha, Respublika [Yakutiya]) Q-Y222276* —— Hg19 .BAM Ancient 7X, 13.4…
Using the TCGAbiolinks package to download TCGA data
For downloading TCGA data, the RTCGA package is probably the better choice in terms of ease of use, and because its data are already downloaded, it is relatively stable to use. For the same reason, however, there is no guarantee that the data are up to date. The TCGAbiolinks package is…
PF6747 – YFull YTree Info
E-PF6747 – YFull YTree Info Sample ID Country / Language Info Ref File Testing company Statistics Status YF010216 Azerbaijan (Qəbələ) E-PF6747* —— Hg19 .BAM FTDNA (Y500) 50X, 13.7 Mbp, 165 bp YF064736 Egypt (Al Minūfīyah) E-FT97857* —— Hg38 .BAM FTDNA (Y700) 35X, 18.5 Mbp, 151 bp YF093064 Yemen (Tā’izz) E-Y280593…
Comprehensive circRNA Analyses in Human Vertebrae of GIOP and Its Molecular Mechanism
Circular RNAs (circRNAs) are a novel class of noncoding RNAs that play important roles in human diseases. However, the regulation of circRNAs in glucocorticoid-induced osteoporosis (GIOP) has not been reported. In this study, we performed high-throughput sequencing to identify altered circRNAs in the vertebrae from GIOP patients. A total of…
Variant #0000726648 (NC_000017.10:g.7100169G>A, ACADVL(NM_000018.3):c.-23135G>A) – Global Variome shared LOVD
Variant #0000726648 (NC_000017.10:g.7100169G>A, ACADVL(NM_000018.3):c.-23135G>A) Chromosome 17 Allele Unknown Affects function (as reported) Effect unknown Affects function (by curator) Not classified Classification method – Clinical classification VUS DNA change (genomic) (Relative to hg19 / GRCh37) g.7100169G>A DNA change (hg38) – Published as DLG4(NM_001321075.2):c.990C>T (p.G330=) ISCN – DB-ID DLG4_000038 Variant remarks VKGL data sharing initiative Nederland Reference – ClinVar ID – dbSNP ID – Origin CLASSIFICATION record Segregation – Frequency – Re-site –…
Variant #0000803285 (NC_000007.13:g.92730753A>G, SAMD9(NM_017654.3):c.4658T>C) – Global Variome shared LOVD
Variant #0000803285 (NC_000007.13:g.92730753A>G, SAMD9(NM_017654.3):c.4658T>C) Chromosome 7 Allele Unknown Affects function (as reported) Effect unknown Affects function (by curator) Not classified Classification method – Clinical classification VUS DNA change (genomic) (Relative to hg19 / GRCh37) g.92730753A>G DNA change (hg38) – Published as SAMD9(NM_017654.3):c.4658T>C (p.I1553T), SAMD9(NM_017654.4):c.4658T>C (p.I1553T) ISCN – DB-ID SAMD9_000024 See all 3 reported entries Variant remarks VKGL data sharing initiative Nederland Reference – ClinVar ID – dbSNP ID – Origin CLASSIFICATION…
Z2039 – YFull YTree Info
Sample ID Country / Language Info Ref File Testing company Statistics Status YF003382 Finland (Länsi-Suomen lääni) I-Z2040* —— Hg19 .BAM FTDNA (Y500) 47X, 13.3 Mbp, 165 bp YF067917 Ireland I-FGC69701* —— Hg19 .BAM Dante Labs 9X, 22.9 Mbp, 151 bp YF078735 Belarus (Vicebskaja voblasc’) / Polish I-FGC69702 —— Hg38 .VCF…
BY7447 – YFull YTree Info
E-BY7447 – YFull YTree Info SNPs currently defining E-BY7447 BY7447 Sample ID Country / Language Info Ref File Testing company Statistics Status YF075635 Yemen (Al Bayḑā’) E-FT183181 —— Hg38 .BAM FTDNA (Y700) 39X, 18.2 Mbp, 151 bp YF067501 Yemen (Şan’ā’) E-FT183181 —— Hg38 .BAM FTDNA (Y700) 44X, 18.8 Mbp,…
DF109 – YFull YTree Info
Sample ID Country / Language Info Ref File Testing company Statistics Status YF016926 Ireland R-DF109 R-DF109*, R-A18726* Hg38 .BAM FTDNA (Y500) 27X, 12.7 Mbp, 165 bp YF016394 United States (Ohio) R-DF109 R-DF109*, R-A18726* Hg38 .BAM FTDNA (Y500) 34X, 11.9 Mbp, 151 bp YF011566 Ireland (Mayo) R-DF109 R-DF109*, R-A18726*, R-FGC23742* Hg38…
ZP77 – YFull YTree Info
R-ZP77 – YFull YTree Info SNPs currently defining R-ZP77 ZP77 / FGC6562 Sample ID Country / Language Info Ref File Testing company Statistics Status YF008362 —— R-ZP77* —— Hg19 .BAM FTDNA (Y500) 41X, 13.8 Mbp, 165 bp YF067652 Unknown R-BY40744 —— Hg38 .BAM FTDNA (Y700) 36X, 18.7 Mbp, 151…
Download full list of SNPs and their coordinates in hg38
What is the best / standard place to get a full list of SNPs and their coordinates in hg38? I downloaded the SNPsnap database, but just realized that those coordinates are in hg19. I’m trying to figure out how…
htseq-count -t gene not working
I found a small problem. When I set “-t gene”, the read is marked “__no_feature”. But when I set “-t exon”, the read is marked “ENSG00000276104”. The gene “ENSG00000276104” is a single-exon gene. I don’t know why this happens. Read: “TGTCTGTGGCGGTGGGATCCCGCGGCCGTGTTTTCCTGGTGGCCCGGCCGTGCCTGAGGTTTCTCCCCGAGCCGCCGCCTCTGCGGGCTCCCGGGTGCCCTTGCCCTCGCGGTCCCCGGCCCTCGCCCGTCTGTGCCCTCTTCCCCGCCCGCCGATCCTCTTCTTCCCCCCGAGCGGCTCACCGGCTTCACGTCCGTTGGTGGCCCCGCCTGGGAC”. I had aligned to hg38 by…
Bioconductor – ChIPQC
This package is for version 3.1 of Bioconductor; for the stable, up-to-date release version, see ChIPQC. Quality metrics for ChIPseq data Bioconductor version: 3.1 Quality metrics for ChIPseq data Author: Tom Carroll, Wei Liu, Ines de Santiago, Rory Stark Maintainer: Tom Carroll <tc.infomatics at gmail.com>, Rory Stark <rory.stark…
hg38 Import custom reference upload error
Our version of TS is 5.12.2. When trying to upload a new custom reference FASTA (downloaded from NCBI, ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.15_GRCh38/seqs_for_alignment_pipelines.ucsc_ids/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.gz, gunzipped and renamed to hg38.fasta) through “Import custom reference” in the interface, an error occurs: “uploaded file size is incorrect” (to be honest, the error was not shown in the logs, because of TypeError…
help with CrossMap
Hello all, I would really appreciate your help, as I am new to working with different file builds and have hit a setback lifting a VCF file from build hg38 to hg19. In essence, when using CrossMap the chromosome value gets altered. For example, below is the…
Systems biology analysis of human genomes points to key pathways conferring spina bifida risk
Significance Genetic investigations of most structural birth defects, including spina bifida (SB), congenital heart disease, and craniofacial anomalies, have been underpowered for genome-wide association studies because of their rarity, genetic heterogeneity, incomplete penetrance, and environmental influences. Our systems biology strategy to investigate SB predisposition controls for population stratification and avoids…
Padding out a GVCF file with 1000G exomes to get gatk VariantRecalibrator working with a small sample
I’ve got sequencing data for a small 500 bp amplicon from a few samples. GATK best principles suggest running VariantRecalibrator on the GVCF files I generate. I’m trying to get this working, but I get an error about “Found annotations with zero variances”. Reading the gatk manual and other posts…
computeMatrix in deeptool is Running with no result
Hi all, I wonder if someone can help me by explaining what to input for the -R <bed file> argument of the command below? computeMatrix scale-regions -S <bigwig file(s)> -R <bed file> -b 1000 What I did, for example: I downloaded…
NoClassDefFoundError: htsjdk/samtools/util/IntervalTree
When I run circm6A (github.com/canceromics/circm6a) example code: cd ../.. java -Xmx16g -jar circm6a.jar -ip test_data/HeLa_eluate_rep_1.chr22.bam -input test_data/HeLa_input_rep_1.chr22.bam -r test_data/gencode_chr22.gtf -g test_data/hg38_chr22.fa -o test_data/example_Hela The following error occurred: Start at 2021-12-12 16:33:26 Exception in thread "main" java.lang.NoClassDefFoundError: htsjdk/samtools/util/IntervalTree at main.Method.loadGenes(Method.java:200) at main.Method.run(Method.java:66) at main.Main.main(Main.java:9) Caused by: java.lang.ClassNotFoundException: htsjdk.samtools.util.IntervalTree…
transcripts are not true in TxDb.Hsapiens.UCSC.hg38.knownGene
Hello, I used TxDb.Hsapiens.UCSC.hg38.knownGene/GenomicFeatures to retrieve gene promoters and other genomic features. Here is the code: library('TxDb.Hsapiens.UCSC.hg38.knownGene') txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene PR <- promoters(txdb, upstream=2000, downstream=0) But when I take a look at the PR results: it…
gatk VariantRecalibrator positional argument error
I’m trying to recalibrate my VCF using gatk VariantRecalibrator, but I keep getting the error “Illegal argument value: Positional arguments were provided”. I don’t know what this means or how to correct it! Here’s my call: gatk VariantRecalibrator -R "/Volumes/Seagate Expansion Drive/refs/hg38/gatk download/Homo_sapiens_assembly38.fasta" -V "$OUT"/results/variants/"$SN".norm.vcf.gz -AS --resource hapmap,known=false,training=true,truth=true,prior=15.0: "/Volumes/Seagate…
What is the single nucleotide polymorphism database ( dbsnp )?
The Single Nucleotide Polymorphism Database (dbSNP) is a free public archive for genetic variation within and across different species developed and hosted by the National Center for Biotechnology Information (NCBI) in collaboration with the National Human Genome Research Institute (NHGRI). Furthermore, are there any databases for single nucleotide polymorphisms? As there…
Removing uncovered transcripts from multi FASTA reference file
Hi everyone 🙂 For RNA-Seq analyses, I have a reference file containing multiple transcript sequences (it’s a subset of the NCBI human hg38 transcriptome). I found that some of the transcripts are not covered by even a single read (especially if…
The Biostar Herald for Tuesday, September 21, 2021
The Biostar Herald publishes user submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here. This edition of the Herald was brought to you by contribution from Istvan Albert, and was edited by Istvan…
I can’t get a dosage file using PLINK
Hi, I have been trying to get a dosage file from VCF, map and fam files. For that, I have written this bash script: plink --fam plink.fam --map plink.map --dosage one.vcf --write-dosage However, I got this error: --dosage: Reading from one.vcf. Error: Line 1 of one.vcf has fewer tokens…
What is the codification in genestrand 1 and 2?
Hi there, I’m doing some peak annotation using ChIPseeker: library(ChIPseeker) library(TxDb.Hsapiens.UCSC.hg38.knownGene) library(clusterProfiler) library(annotables) library(org.Hs.eg.db) txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene peaks = readPeakFile("peaks_", header = F) peakAnno <- annotatePeak(peaks, tssRegion=c(-3000, 3000), TxDb=txdb, annoDb="org.Hs.eg.db") peaks_annot <- as.data.frame(peakAnno) In my annotation file “geneStrand” is codified as…
Best tools for calling structural variants from 2 assemblies?
Dear community, I have the fasta files of 2 assemblies of the human genome (for example hg19 and hg38). What would be the best tools to call structural variants from these 2 fasta files? Most of the tools I know…
python – snakemake multiple parameters for multiple input and single output in snakemake. CombineGVCFs gatk problem
I have written a rule for CombineGVCFs in gatk4. The rule is as follows: all_gvcf = get_all_gvcf_list() rule cohort: input: all_gvcf_list = all_gvcf, ref="/data/refgenome/hg38.fa", interval_list = prefix+"/bedfiles/hg38.interval_list", params: extra = "--variant", output: prefix+"/vcf/cohort.g.vcf", shell: "gatk CombineGVCFs -R {input.ref} {params.extra} {input.all_gvcf_list} -O {output} --tmp-dir=/data/tmp -L {input.interval_list}" all_gvcf is the dataset for…
Alternate nucleotide is more frequent than reference nucleotide. OMG I’m dizzy. How do I stop the twirl?
This is due to the fact that the very reference genomes that we use for re-alignment are themselves based on individuals who carry rare risk alleles. Thus, when we call variants against these genomes, we are, at many loci, comparing against rare disease risk alleles. As the best/worst example (depending…
mixing hg38 and GRCh38 during variant calling
Hello everyone! I’ve been working on a variant calling pipeline for WES data and used a mix of hg38 and GRCh38 reference files after reading that hg38 is just an abbreviation of GRCh38, and that they refer to the same thing. But…
SNP exon region UCSC
How can I get SNPs in only the exonic regions of the genome with UCSC? UCSC returns all SNPs in the gene region, and there is no filter option to get only the exon regions. Thanks.
ZhaozzReal/SNV_IPA: Detect SNV-associated intronic polyadenylation events from standard RNAseq data
Description Somatic single nucleotide variants (SNVs) in the cancer genome affect gene expression through various mechanisms depending on their genomic location. In this study, we found that somatic SNVs near splice sites are associated with abnormal intronic polyadenylation (IPA). Here we give examples to show how to detect SNV-associated IPA…
Where can I get ?or how can I make a mappability track for hg38 assembly
Lucky you, @manojmumar_bhosale: I worked on a similar problem recently and therefore have a bash script you can use. Required tools: GEM library from here UCSC’s wigToBigWig from here (I chose binary for Linux 64…
How to load user-defined genome in IGV-webapp
I would like to create a session in IGV-webapp using an HTML file. The following works with pre-defined genomes (e.g. genome: "hg38"), but I would like to load my own genome. Is there a way to achieve this? <!DOCTYPE html> <html lang="en">…
UCSC knownCanonical hg19 vs. hg38
Hello, We have an FAQ page that covers this topic (genome.ucsc.edu/FAQ/FAQgenes.html#singledownload). As posted by ATpoint, it boils down to different datasets and different approaches. hg19 knownCanonical was last updated in 2013 and built primarily from RefSeq and GenBank sequences and a few other sources. One isoform was identified from each…
Get rsID for a list of SNPs in an entire GWAS sumstats file
Here is a fairly efficient way to do this, assuming hg38 and that BEDOPS and standard Unix tools are installed. $ bedmap --echo --echo-map-id --delim '\t' <(awk '{n=split($0,a,/[:_]/); print "chr"a[1]"\t"a[2]"\t"a[2]+1"\t"a[3]"/"a[4];}' sumstats.txt | sort-bed -) <(wget -qO- hgdownload.cse.ucsc.edu/goldenPath/hg38/database/snp150.txt.gz | gunzip -c | cut -f2-5 | sort-bed -) > answer.bed This gets around making…
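The awk step in that command can be sketched in Python as follows. This is only an illustration: the input ID layout `chrom:pos_ref_alt` is an assumption inferred from the `[:_]` split pattern, and the function name `variant_id_to_bed` is hypothetical.

```python
import re


def variant_id_to_bed(variant_id):
    """Convert an ID such as '1:12345_A_G' into a tab-separated BED-like line.

    Mirrors the awk step: split on ':' or '_', prefix the chromosome with
    'chr', use pos and pos+1 as the interval, and join the alleles with '/'.
    """
    # Assumed layout: chrom:pos_ref_alt (not confirmed by the excerpt).
    chrom, pos, ref, alt = re.split(r"[:_]", variant_id.strip())[:4]
    start = int(pos)
    return "\t".join(["chr" + chrom, str(start), str(start + 1), ref + "/" + alt])


if __name__ == "__main__":
    print(variant_id_to_bed("1:12345_A_G"))
```

The resulting lines would still need sorting (the command uses `sort-bed`) before being mapped against the SNP table.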
UCSC liftover
Hi, I’m using UCSC liftOver to convert hg19 to hg38. I got a result that I don’t understand. Feb. 2009 (GRCh37/hg19) → Dec. 2013 (GRCh38/hg38) – chr1:120904787 → chr1:143905854 Dec. 2013 (GRCh38/hg38) → Feb. 2009 (GRCh37/hg19) – chr1:143905854 → chr1:149400430 (I didn’t check “Allow multiple output regions”.)…
Paired-end reads reported without mates: how to play matchmaker?
Hi Everyone, I am currently looking at Acute Myeloid Leukemia (AML) paired-end WGS samples from the TARGET data ocg.cancer.gov/programs/target/target-methods#3241. A bioinformatician in our group remapped the samples from hg19 to hg38. Unfortunately, we do not have any copies of the hg19 version anymore. However, when I try to run anything…
Coverage drops in fastq alignment against custom Immunoglobulin reference
I am working on HiSeq 2000/2500 single-end reads from RNA-Seq leukemia samples. I am interested in aligning all the reads belonging to the immunoglobulin (Ig) genes for further analysis. The task is difficult for two main reasons: Final Ig genes…
vcf file analysis
Hello everyone, I have 22 VCF files, one for each chromosome. They were in genome build hg19, so I did a liftover and converted them to genome build hg38. Now I need just the chrom and position values from these VCF files, and to merge them together into a…
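One way to approach the task described in this question is sketched below. It assumes plain-text, tab-separated VCF lines in which the first two columns are CHROM and POS, as the VCF format defines them; the function names `chrom_pos` and `merge_positions` are hypothetical.

```python
def chrom_pos(lines):
    """Yield (chrom, pos) for every data line of a VCF, skipping headers."""
    for line in lines:
        # Header and meta lines start with '#'; blank lines are ignored.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        yield fields[0], int(fields[1])


def merge_positions(per_file_lines):
    """Concatenate the (chrom, pos) pairs of several VCFs into one list."""
    merged = []
    for lines in per_file_lines:
        merged.extend(chrom_pos(lines))
    return merged
```

Each element of `per_file_lines` can be an open file handle, so the 22 per-chromosome files could be passed in chromosome order and the merged list written out as a two-column table.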
Bioconductor – BSgenome.Hsapiens.UCSC.hg38.dbSNP151.major
DOI: 10.18129/B9.bioc.BSgenome.Hsapiens.UCSC.hg38.dbSNP151.major Full genome sequences for Homo sapiens (UCSC version hg38, based on GRCh38.p12) with injected major alleles (dbSNP151) Bioconductor version: Release (3.13) Full genome sequences for Homo sapiens (Human) as provided by UCSC (hg38, based on GRCh38.p12) with major allele injected from dbSNP151, and stored in Biostrings…
tool or database to convert Gene ID to genomic position
Hello. I have lots of pseudogene IDs like LOC100431174, but none of the methods below worked for me to find their genomic positions “offline”. I need a table or package to do it offline without querying a webpage. Methods I…
unable to find chromosome in SAM header
I am using featureCounts to try to create a count table for some RNA-Seq data I collected using an Oxford Nanopore platform. I have .sam files aligned with minimap2, and am running the following command to try to get a count…
miRNAseq analysis not shown adapter sequence and huge N’s content
Hi there, this is my third time doing miRNA sequencing analysis, so I do not have much experience with this… So, I have 18 human semen samples (I also have no experience with this type of sample). I have been reading a lot…
Predicting and characterizing a cancer dependency map of tumors with deep learning
INTRODUCTION The development of novel cancer therapies requires knowledge of specific biological pathways to target individual tumors and eradicate cancer cells. Toward this goal, the landscape of genetic vulnerabilities of cancer, or the cancer dependency map, is being systematically profiled. Using RNA interference (RNAi) loss-of-function screens, Marcotte et al. (1),…
liftover using genome browser
Hello everyone, I have a file in the hg38 build. I want to do a liftover to change it to hg19. I thought of using the liftOver tool from the UCSC Genome Browser. I realise that the input file should be in BED format. My file has only…
VariantRecalibrator no positional argument is defined for this tool.
Hi, I am trying to run the following command: gatk VariantRecalibrator -R genome.fa -V all.Sample.SNP.vcf.gz --trust-all-polymorphic -tranche 100.0 -tranche 99.95 -tranche 99.9 -tranche 99.8 -tranche 99.6 -tranche 99.5 -tranche 99.4 -tranche 99.3 -tranche 99.0 -tranche 98.0 -tranche 97.0 -tranche 90.0 -an MQRankSum -an ReadPosRankSum -an FS -an MQ -an SOR…
Get chromosome sizes from fasta file
Hello, I’m wondering whether there is a program that can calculate chromosome sizes from any FASTA file. The idea is to generate a tab-delimited file like the one expected by bedtools genomecov, for example. I know there’s the fetchChromSizes program from UCSC, but…
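As a rough illustration of what such a program does, here is a minimal Python sketch that counts bases per record. It assumes an uncompressed FASTA whose record name is the first whitespace-delimited token after `>`; the function names `fasta_sizes` and `format_sizes` are hypothetical, and this is not one of the tools named in the thread.

```python
def fasta_sizes(lines):
    """Return {sequence_name: length} for FASTA lines, in file order."""
    sizes = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Keep only the first token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            sizes[name] = 0
        elif name is not None:
            sizes[name] += len(line)
    return sizes


def format_sizes(sizes):
    """Format sizes as 'name<TAB>length' lines, one per sequence."""
    return "\n".join(f"{name}\t{length}" for name, length in sizes.items())
```

Passing an open file handle to `fasta_sizes` and writing the result of `format_sizes` to disk gives a two-column, tab-delimited table of sequence names and lengths.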
Contig chr1 given as location, but this contig isn’t present in the Fasta sequence dictionary
Badly formed genome unclippedLoc: Contig chr1 given as location, but this contig isn’t present in the Fasta sequence dictionary. Hi everyone, I’m trying to run Mutect2 for WES cancer data. However, since their Resource bundle only supports hg19, it seems I cannot proceed (I want to compare it with Strelka2…
Using MACS2 parameters
Trying to reproduce a Galaxy training in the Linux CLI. I’ve come up with the following commands for the peak calling with MACS2. Am I on the right track? The galaxy parameters are- macs2 command can be- macs2 callpeak -t input_file.bed -n macs_output -g 50818468 --nomodel --shift…
Non-repeat human genome dataset
Could anyone please point me to where I could find a dataset of non-repeat sequences for the human ref genome. I’m not sure if it’s still regarded as true, but I saw that possibly 2/3 of the human genome contains repeats. Is there a place…
VCF file phasing by SHAPEIT
Hi everybody, I would like to phase (just phasing, not imputation) a VCF file containing about 1100 individuals (from a given human population) derived from whole-genome sequencing; the VCF file was obtained with GATK. From what I found, SHAPEIT is most commonly used; according to its manual, it requires a genetic map for phasing; however,…
Finding 16 mer not present in GRCh38
Thanks for the question – it has kept me busy this Sunday morning / afternoon. As implied by others, this poses a computational challenge but is not insurmountable. For motif searching generally, I usually use AWK. My approach here was to: generate all possible k-mers of the chosen size (run…
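The first step listed in this answer, enumerating every possible k-mer and checking which ones never occur, can be sketched in Python for small k. This is not the AWK approach the answer mentions: the function name `missing_kmers` is hypothetical, reverse complements are ignored, and holding all 4^16 (about 4.3 billion) 16-mers in memory this way is not practical, so treat it as a toy illustration only.

```python
from itertools import product


def missing_kmers(sequence, k):
    """Return the sorted k-mers over A/C/G/T that do not occur in `sequence`."""
    seq = sequence.upper()
    # Every k-mer that actually occurs, using a sliding window.
    observed = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    # Every possible k-mer over the four-letter alphabet.
    possible = ("".join(p) for p in product("ACGT", repeat=k))
    return sorted(kmer for kmer in possible if kmer not in observed)


if __name__ == "__main__":
    print(missing_kmers("ACGTAC", 2))
```

Windows that contain N or other non-ACGT characters end up in `observed` but never match a possible k-mer, so they do not affect the result.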
question about running CIRI-full
I’m using CIRI-full to calculate the full-length sequences of circRNAs. I can run the test data set successfully, but I can’t run my own data. Running the test data set: java -jar ../CIRI-full.jar Pipeline -1 test_1.fq.gz -2 test_2.fq.gz -a test_anno.gtf -r test_ref.fa -d test_output/…
VCF to 23andMe format and changing Ensembl reference: help needed for understanding VCF
Hello, I am trying to convert my Nebula Genomics report to 23andMe format. I have two problems: Nebula uses human build 38 (Ensembl) and 23andMe uses build 37. I was thinking of writing a Python script, but I have some doubts. My plan was to change the genotype according…