Commit dbb266d3 authored by Marwan ELKREWI

Update fst.md

parent ee90e4e8
**Data**
- RNA reads (125 bp)
- RNA reads (125 bp), from Huylmans et al., 2019
- *A.sinica*
- pool of 10 males: head, thorax, testes
- /nfs/scistore03/vicosgrp/amrnjava/fst2/reads/sinica/male
- pool of 10 females: head, thorax, ovaries
- /nfs/scistore03/vicosgrp/amrnjava/fst2/reads/sinica/female
- *A.franciscana*
- pool of 10 males: head, testes
- pool of 10 females: head, ovaries
- reference genome
- *A.sinica* male (ZZ)
- reference genomes
- *A.sinica* male (ZZ), in-house assembly
- *A.franciscana* male (ZZ), KPI
**Pipeline**
- [ ] map male and female RNA reads to the reference genome
- [ ] calculate Fst between males and females for every SNP
- [ ] plot Fst values for each linkage group
- [ ] calculate Fst between males and females
**Detailed pipeline**
* *adapted from https://sourceforge.net/p/popoolation2/wiki/Tutorial/*
0. Trimming reads with Trimmomatic 0.36 (http://www.usadellab.org/cms/index.php?page=trimmomatic)
+ `java -jar PATH/Trimmomatic-0.36/trimmomatic-0.36.jar PE -phred33 fran_female.1.fastq fran_female.2.fastq fran_female.1_paired.fastq fran_female.1_unpaired.fastq fran_female.2_paired.fastq fran_female.2_unpaired.fastq ILLUMINACLIP:PATH/Trimmomatic-0.36/adapters/TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36`
1. Merge reads
+ male and female forward and reverse reads from different tissues were separately merged
+ example:
@@ -49,7 +57,6 @@ srun STAR --runThreadN 20 --runMode genomeGenerate --genomeDir ~/fst2/ref_new --
srun STAR --runThreadN 20 --genomeDir ~/fst2/ref_new --readFilesIn FILENAME.1.fastq FILENAME.2.fastq --outFileNamePrefix ~/fst2/map_newref/FILENAME.
```
* resulting sam files and mapping summary files in /nfs/scistore03/vicosgrp/amrnjava/fst2/map_newref
- get MAPQ scores: mapq.sh
@@ -96,8 +103,8 @@ perl ~/popoolation2-code/mpileup2sync.pl --input FILENAME --output FILENAME.sync
-
- pop_sync_generic.sh
6. Calculating Fst for every SNP
* once the synchronized files are available, they serve as input for the PoPoolation2 script that calculates Fst for every SNP
6. Calculating Fst for 1000 nt windows
* once the synchronized files are available, they serve as input for the PoPoolation2 script that calculates Fst
```
perl ~/popoolation2-code/fst-sliding.pl --input FILENAME --output FILENAME.fst --suppress-noninformative --min-count 3 --min-coverage 10 --max-coverage 200 --min-covered-fraction 0.5 --window-size 1000 --step-size 1000 --pool-size 10
```
@@ -135,7 +142,7 @@ perl ~/popoolation2-code/fst-sliding.pl --input FILENAME --output FILENAME.fst -
- --suppress-noninformative
- Suppress output for windows with no SNPs or insufficient coverage;
* The output has a line for every SNP:
* The output has a line for every window:
+ col1: reference contig (chromosome)
+ col2: mean position of the sliding window
+ col3: number of SNPs found in the window (not considering sites with a deletion)
@@ -154,7 +161,3 @@ perl ~/popoolation2-code/fst-sliding.pl --input FILENAME --output FILENAME.fst -
populations are averaged and Pi is calculated as shown above
```
-
- fst_generic.sh
7. Plotting Fst values for each linkage group
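
The per-window output described above (col1 contig, col2 window position, col3 SNP count) can be filtered before any plotting. The following is a minimal sketch, not part of the original pipeline: the function name `top_fst_windows` is hypothetical, and it assumes the file is tab-separated with the male/female Fst stored in column 6 as `1:2=VALUE` (or `1:2=na`), as in the PoPoolation2 tutorial. Check this against the actual `.fst` file before relying on it.

```shell
#!/bin/sh
# Hypothetical helper (not in the original scripts): print contig, window
# position and Fst for every window whose Fst is at least a given threshold.
# Assumption: tab-separated fst-sliding.pl output, pairwise Fst in column 6
# written as "1:2=VALUE", with "na" when no value could be computed.
top_fst_windows() {
    # $1: minimum Fst to report; the .fst file is read from stdin
    awk -F '\t' -v min="$1" '
        {
            split($6, pair, "=")    # pair[1] = "1:2", pair[2] = Fst or "na"
            if (pair[2] != "na" && pair[2] + 0 >= min + 0)
                printf "%s\t%s\t%s\n", $1, $2, pair[2]
        }'
}
```

Example use, keeping the placeholder file name from the commands above: `top_fst_windows 0.1 < FILENAME.fst > FILENAME.fst.top`.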