Use grep to loop a command in a script

Hello,

I am doing a measurement of the HWE per Population. I have done this already without trouble with 10 populations, but now I’m doing it with 89 populations so I’d like to create a script.

I use this command to create a list with all the populations and their subset of individuals:

grep Barcelona (one of the populations) FILENAME.vcf > population.list

The file created is like this (the population name is separated from the individual name with this symbol ‘ _ ‘ )
Barcelona_GRA1 Barcelona_GRA2 Barcelona_GRA3 and so on.

Then I use this command to create a list of the individuals for each population:

cat population.list | tr 't' 'n' | grep PJL > PJL.list

cat population.list | tr 't' 'n' | grep BEB > BEB.list

That create a separate file for each population.

To do this 89 times is doable but it’s quite long, so I’d like to make a script.

The reason why I need these list files is because I’m doing a script to measure HWE per population.
This is my script:

number_of_populations="number"

for num in `seq 1 $number_of_populations`
do

POP=$(sed -n "${num}p" pops.txt)

vcftools --vcf FILENAME.vcf --keep ${POP}.list --hardy --out ${POP}.hwe_pvalue_vcftools

done
for num in `seq 1 $number_of_populations`
do

POP=$(sed -n "${num}p" pops.txt)

sites=$(awk '$6 <= 0.0001' ${POP}.hwe_pvalue_vcftools.hwe|wc -l|awk '{print $1}')
echo -e "${POP} t ${sites}" >> HWE_perpop.txt

done

Any suggestion? Thank you very much

Read more here: Source link