How to get protein ID from gene ID (batch entrez)


Hi

Can someone suggest how to get protein IDs from gene IDs in batch (batch Entrez)?

I have hundreds of gene names like AaeL_AAEL004207, with gene ID 5564359. The protein IDs can be looked up manually one by one, but with hundreds of genes that is obviously not a good idea. Can anyone suggest a better way?

thanks

 

 



With Entrez Direct:

epost -db gene -id 5564359 | elink -target protein | efetch -format uid
157105044

You can include multiple gene IDs (at least 500) in the -id part, separated by commas. Here’s a script:

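#!/bin/bash
# Check that Entrez Direct is installed and on the PATH.
if ! command -v epost > /dev/null 2>&1
then
    echo "Entrez Direct not in $PATH"
    exit 1
fi

if [ -n "$1" ]
then
    # Split the input list (one gene ID per line) into chunks of 500 IDs.
    split -l 500 "$1" input.

    for f in input.*
    do
        # Join the IDs of this chunk with commas, dropping the trailing comma.
        ids=$(tr "\n" "," < "$f" | sed 's/,$//')
        epost -db gene -id "$ids" | elink -target protein | efetch -format uid > "$f.output"
        # paste pairs lines by position, so this assumes one protein ID
        # per gene ID, returned in input order.
        paste "$f" "$f.output" > "$f.result"
        rm "$f" "$f.output"
    done

    cat input.*.result > "$1.output"
    rm input.*.result

else
    printf 'Usage: sh convertGeneIDs listOfGeneIDs\nOutput: geneID\tproteinID\n'
fi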
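
The ID-joining step of the script can be tried on its own, without network access, before sending anything to NCBI. The following is only a minimal sketch: `join_ids` is a hypothetical helper name, not part of Entrez Direct, and the second and third IDs in the example are made-up placeholders.

```shell
#!/bin/bash
# join_ids (hypothetical helper, not an Entrez Direct command):
# read one ID per line on stdin and print them as a single
# comma-separated string with no trailing comma.
join_ids() {
    # Drop blank lines, then join the remaining lines with commas.
    grep -v '^[[:space:]]*$' | paste -sd, -
}

# 5564359 is the gene ID from the question; 111 and 222 are placeholders.
printf '5564359\n111\n222\n' | join_ids

# The joined string could then be passed to the pipeline shown above
# (requires Entrez Direct and network access, so it is left commented out):
# epost -db gene -id "$(join_ids < genes.txt)" | elink -target protein | efetch -format uid
```

Here `genes.txt` stands for any file with one gene ID per line. Checking the joined string first makes it easier to spot stray blank lines or a trailing comma before a batch of 500 IDs is submitted.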


