SNPolisher: axiom to fitpoly format

Make sure you have R package ‘SNPolisher’ installed

  • input file is “AxiomGT1.summary.txt” from the Axiom analysis suite
library(SNPolisher)

fitTetra_Input(summaryFile="AxiomGT1.summary.txt",
               output.file="AxiomGT1.summary.fitTetra.txt")

fitPoly: genotype calling

Depending if you need to call dosages on families or just cultivars you may need too set the ‘pop.parents’ and ‘population’ functions

Since this example does not contain mapping populations, there is no added benefit of adding the population nor pop.parents files.

Load in Data

fitPoly package used for calling dosage

doParallel package used for using multiple cores. (caution: Windows utilizes RAM differently than linux and macOS). doParallel works better for linux and macOS

  • Possible solutions for running large datasets on Windows machines and on machines with lower amounts of RAM is to divide your data into chunks and run them separately. After running all the data separately, you can combine the dosage calls.
library(fitPoly)
library(doParallel)
data1<-read.delim("AxiomGT1.summary.fitTetra.txt",
                 sep="\t",stringsAsFactors=F)

Run ‘saveMarkerModels’

What is important

  • ploidy was set at 4

  • pop.parents and population set at null

  • depending if you need plots you can change that however 130000+ images takes up a lot of space if you are not going to use it

  • ncores = do not set at more cores than you have available

  • the resulting file will be called “plate13_scores.dat”

NOTES:

  • This step takes a very long time. There is no progress bar this is how to estimate how much longer or percentage of progress done.

    • There are 137786 probes that fitpoly has to run.

    • There will be a line in the code after ‘saveMarkerModels’ begins

      saveMarkerModels: batchsize = somenumber
  • Divide the 137786 by the batchsize number to see how many batches it will take to finish the run.

  • Go to your working directory in explorer and you should see which batch has just finished.

saveMarkerModels(4, markers=NA, data=data1, diplo=NULL, select=TRUE,
                 diploselect=TRUE, pop.parents=NULL, population=NULL, parentalPriors=NULL,
                 samplePriors=NULL, startmeans=NULL, maxiter=40, maxn.bin=200, nbin=200,
                 sd.threshold=0.1, p.threshold=0.99, call.threshold=0.6, peak.threshold=0.85,
                 try.HW=TRUE, dip.filter=1, sd.target=NA,
                 filePrefix=paste0(getwd(),"/plate13"), rdaFiles=F, allModelsFile=T,
                 plot ="none", plot.type="png", ncores=6)

Run custom script

This custom script has been packaged into a function available as an R-package
This link takes you to github for the R-package RoseArrayTools
RoseArrayTools vignette

The function compare_probes is now defunct due to an error it is now a function called compare_probes2

  • will result in two files output “compared_calls.csv” and “compared_calls_kind_counts.csv”

    • compared_calls.csv - contains the dosage calls that have been compared to keep the calls that are consistent, have only one probe, and discards the probes that differ

    • compared_calls_kind_counts.csv gives a matrix which shows which markers have

      • D - different calls (discarded)

      • S - “same” calls that have both probes in agreement

      • O - “one” single probe (one probe is called, other is NA)