Commit d384a946 authored by Uladzislava KHAURATOVICH

Update Z_chrom_heterozygosity_raremale_perf_genome.Rmd

parent 26fbb1b4
@@ -2,7 +2,7 @@
author: "Uladzislava Khauratovich based on Julian's and Saren's scripts"
date: <h4> 2 August 2021 </h4>
created: "2021-07-08"
-updated: "2021-07-08"
+updated: "2022-02-17"
output:
pdf_document:
highlight: default
@@ -20,8 +20,6 @@ invisible(lapply(Packages, library, character.only = TRUE))
## Empty workspace ##
rm(list=ls())
```
### Julian's part
## Import Data
#It's just one file, so we do not need the file name
@@ -57,7 +55,6 @@ Sdata <- Sdata1 %>%
SdataChr13<-subset(Sdata,Chromosome=="Chromosome_13")
```
#### Binning
Adjusted binning
@@ -81,7 +78,7 @@ SBindataChr13 <- SdataChr13 %>%
droplevels()
```
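
The binning chunk itself is not shown in this diff, so the following is only a minimal sketch of how SNP positions could be assigned to 1,000,000 nt bins, the bin width named in the plot labels. The toy data frame and the `Position` column name are assumptions for illustration, and base R is used here instead of the script's dplyr pipeline to keep the example dependency-free.

```r
# Hypothetical toy data: the Position column name is an assumption,
# it does not come from the script.
snps <- data.frame(
  Chromosome = "Chromosome_13",
  Position   = c(150, 999999, 1000000, 2500000, 2750000)
)

bin_size <- 1e6  # 1,000,000 nt, the bin width named in the plot labels

# Integer-divide each position by the bin width:
# bin 0 covers [0, 1e6), bin 1 covers [1e6, 2e6), and so on.
snps$Bins <- snps$Position %/% bin_size

# Number of SNPs that fall into each bin
snps_per_bin <- table(snps$Bins)
print(snps_per_bin)
```

Integer division keeps the bin edges fixed regardless of where the first SNP falls, which makes bins comparable between samples.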
### just to categorize snps
```{r }
# Custom function by Saren
interpret = function(female, male){
@@ -107,7 +104,7 @@ data = SBindataChr13 %>% mutate(state = if_else(FemaleState == 1 & MaleState ==
#assigned SNPs for Chr1
data1 = data %>% mutate(state = interpret(FemaleState, MaleState))
```
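
The body of `interpret()` is elided in this diff, so the sketch below is not Saren's function. It only illustrates how a pair of per-SNP states might be mapped to a category, assuming that `1` encodes a heterozygous site and `0` a homozygous one. The function name `classify_snp` and the category labels are hypothetical.

```r
# Hypothetical classifier; NOT the body of interpret(), which the diff
# does not show. Assumed encoding: 1 = heterozygous, 0 = homozygous.
classify_snp <- function(female, male) {
  ifelse(female == 1 & male == 0, "lost_in_male",
  ifelse(female == 1 & male == 1, "het_in_both",
  ifelse(female == 0 & male == 1, "het_in_male_only",
                                  "other")))
}

# Toy input covering each combination once
female <- c(1, 1, 0, 0)
male   <- c(0, 1, 1, 0)

state <- classify_snp(female, male)
print(table(state))
```

Because `ifelse()` is vectorised, the function can be applied to whole columns, in the same way the script calls `interpret(FemaleState, MaleState)` inside `mutate()`.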
### to calculate the number of heterosites in female and raremale
```{r }
data = SBindataChr13 %>% mutate(FHetstate = if_else(FemaleState == 1, "1",
if_else(FemaleState == 0, "0",
@@ -146,7 +143,7 @@ ggplot(data_new2, aes(x = Bins, y = fractionM)) + geom_bar(stat = "identity", co
ggplot(data2, aes(x = Bins, y = binnedFhetstate)) + geom_bar(stat = "identity", color="pink", width = 0.5) + xlab("1000000nt bins on Chr1") + ylab("number of heterosites in the female") + theme(axis.text.x=element_blank())
```
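
Most of this chunk is hidden between the hunks, so here is a minimal sketch of one way to count heterozygous sites per bin for both samples. It assumes the state columns hold only `0` and `1` (with `1` meaning heterozygous), which may not match the real data. The toy table is made up, and `binnedMhetstate` is a hypothetical name chosen to mirror `binnedFhetstate` from the plot call above.

```r
# Hypothetical toy table; 1 = heterozygous, 0 = homozygous is an
# assumed encoding, and only these two values occur here.
snps <- data.frame(
  Bins        = c(0, 0, 0, 1, 1, 2),
  FemaleState = c(1, 1, 0, 1, 1, 1),
  MaleState   = c(1, 0, 0, 0, 1, 0)
)

# Sum the 0/1 flags within each bin to get heterozygous-site counts
het_counts <- aggregate(cbind(FemaleState, MaleState) ~ Bins,
                        data = snps, FUN = sum)

# binnedMhetstate is a hypothetical column name
names(het_counts) <- c("Bins", "binnedFhetstate", "binnedMhetstate")
print(het_counts)
```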
### plotting fraction of snps of each category along a chromosome
```{r, fig.width=9,fig.height=5}
ggplot(data1, aes(x = state, fill = state)) + geom_bar()
@@ -165,13 +162,11 @@ abline <- mean(data_new2$fraction)
#old_abline=19.39713
#new_abline=13.77778
#write.table(data_new2, file="/Users/ukhaurat/Documents/full_assembly_analysis/heterozygosity_bins_million_frac_nq.txt", append = FALSE, quote = F, sep = " ", eol = "\n", na = "NA", dec = ".", row.names = TRUE, col.names = TRUE)
ggplot(data_new2, aes(x = Bins, y = fraction)) + geom_bar(stat = "identity", color="light green", width = 0.3) + xlab("Chr 13 Bins of 1000000nt") + ylab("% SNPs that lost heterozygosity in the raremale") + geom_hline(yintercept = 13.77778) + theme(axis.text.x=element_blank())
```
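
How `data_new2$fraction` is built is not visible in this diff. As a hedged sketch, the percentage of SNPs per bin that lost heterozygosity, and the mean across bins that the script stores in `abline`, could be computed as follows. The toy table, the `"lost"`/`"kept"` labels, and the variable `mean_fraction` are assumptions for illustration.

```r
# Hypothetical per-SNP table: Bins and state values are made up.
snps <- data.frame(
  Bins  = c(0, 0, 0, 0, 1, 1, 2, 2, 2, 2),
  state = c("lost", "kept", "kept", "kept",
            "lost", "kept",
            "lost", "lost", "lost", "kept")
)

# Percentage of SNPs in each bin whose state is "lost"
fraction <- tapply(snps$state == "lost", snps$Bins, mean) * 100
per_bin  <- data.frame(Bins = names(fraction),
                       fraction = as.numeric(fraction))

# Mean across bins, the quantity the script stores in `abline`
mean_fraction <- mean(per_bin$fraction)

print(per_bin)
print(mean_fraction)
```

With ggplot2 loaded, passing the computed value as `geom_hline(yintercept = mean_fraction)` would keep the reference line in sync with the data instead of relying on a typed-in constant.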
### we probably won't use it: just the difference and log2 F/M number of heterosites
```{r }
# Custom function by Saren
interpret = function(female, male){
......