UAF in CGLM mixture model: DNAmixtures package, checked by direct calculation using nlminb
DNAmixtures package
http://dnamixtures.r-forge.r-project.org/
Portable preamble for using DNAmixtures package
require(DNAmixtures)
data(USCaucasian,SGMplusDyes)
USCaucasian2<-rbind(USCaucasian,data.frame(marker="D18S51",allele=9,frequency=0.001659950))
ff<-c(paste(Sys.getenv('HOME'),'/Research/Perturb/Sep2013',sep=''),
paste(Sys.getenv('HOME'),'/Peter-Julia/mixtures/UAF',sep=''))
for(f in ff) if(file.exists(f)) {setwd(f); break}
source('DNAmixtureUAF.R')
DNAmixtureUAF.R source code is here works for 1 or 2 individuals only
Document setting out Beta-Binomial algebra
New version, not limited to 2 individuals
The '190' example (check190.txt) - where WoE is less under UAF
nlminb approach works on 190 example, if you set lower bound of 0.001 for 2nd parameter
Table of log10(lik) and WoE for Hp versus Hd
UAF | UAF M=100000 | vanilla | |
---|---|---|---|
DNAmixture Hp: suspect | -2.679907 | -2.679907 | -2.679907 |
DNAmixture Hd: U1 | -4.304199 | -4.514895 | -4.441308 |
"Algebra" Hd: U1 | -3.937663 | - 3.933346 | - 3.933341 |
DNAmixture WoE | 1.624292 | 1.834988 | 1.761401 |
The '1400' example (check.txt)
DNAmixtures
data | input | output
log10(lik) UAF: -6.822838 vanilla: -6.818517
Identical answers from Direct numerical minimisation of CGLM likelihood (3 pars) but with all 0 peak heights explicitly included
input | output
log10(lik) UAF: -6.822838 vanilla: -6.818517
Direct numerical minimisation of CGLM likelihood (3 pars)
input | output
log10(lik) UAF: -6.631515 vanilla: -6.626129
The mle's are very close indeed under the two formulations, however.
TWO TRACES with '190' and '1400' examples
data 1400 data 190 | input M=100 | output M=100 | input M=100000 | output M=100000
Two traces Table of log10(lik) and WoE for Hp versus Hd
Equal stutter proportion $\xi$'s in both tracesUAF | UAF M=100000 | vanilla | |
---|---|---|---|
DNAmixture Hp: suspect | -8.421626 | -8.421626 | -8.421626 |
DNAmixture Hd: U1 | -11.18351 | -10.03488 | -10.04862 |
"Algebra" Hd: U1 | |||
DNAmixture WoE | 2.761884 | 1.613256 | 1.626997 |
Very big difference here between UAF and not but direction is more incriminating for UAF! J
EXAMPLE 5 (Table 3) Puch-Solis (10 markers)
EPG of Puch data for unbalanced proportion, with "true" $\phi=0.9$.
UAF
No UAF
Table WoE Example 5 unbalanced case
with "true" $\phi=0.9$ for M=49 - Balding's $\theta=0.02$and Balding's likeLD (which uses allele presence and indirect information from peak heights)
Puch-Solis | UAF | No UAF | likeLD | |
---|---|---|---|---|
Hp donor 1 & U1 vs. Hd U1 & U2 | 9.2304 | 13.2295 | 13.1732 | 6.0792 |
Hp donor 2 & U1 vs. Hd U1 & U2 | 6.6812 | 7.5063 | 7.4704 | 3.7924 |
Hp donor 1 & 2 vs. Hd U1 & U2 | 20.7123 | 20.6436 |
Table WoE Example 5 log LR markerwise analysis
For Hp donor 1 & U1 vs. Hd U1 & U2
Markers | D16S539 | D18S51 | D19S433 | D21S11 | D2S1338 9 | D3S1358 | D8S117 | FGA | TH01 | VWA |
---|---|---|---|---|---|---|---|---|---|---|
UAF | 0.97624 | 2.67441 | 0.73008 | 1.49473 | 1.36186 | 0.96982 | 1.01293 | 1.86647 | 0.79019 | 1.35280 |
NO UAF | 0.98648 | 2.66034 | 0.72807 | 1.49024 | 1.35042 | 0.96242 | 1.03241 | 1.84206 | 0.76857 | 1.35220 |
Ratio | 0.98963 | 1.00529 | 1.00276 | 1.00301 | 1.00847 | 1.00769 | 0.98112 | 1.01325 | 1.02811 | 1.00045 |
Markers | D16S539 | D18S51 | D19S433 | D21S11 | D2S1338 9 | D3S1358 | D8S117 | FGA | TH01 | VWA |
---|---|---|---|---|---|---|---|---|---|---|
UAF | 0.7591942 | 1.2021318 | 0.3174868 | -0.3427511 | 1.0207615 | 0.4505369 | 1.1366912 | 1.5285545 | 0.9519488 | 0.4817313 |
NO UAF | 0.7503234 | 1.1952612 | 0.3267055 | -0.3368666 | 1.0173818 | 0.4526932 | 1.1389388 | 1.4985151 | 0.9350099 | 0.4924191 |
Ratio | 1.0118225 | 1.0057482 | 0.9717830 | 1.0174685 | 1.0033220 | 0.9952368 | 0.9980266 | 1.0200461 | 1.0181163 | 0.9782954 |
Example 5 (Table 3) Puch-Solis (10 markers)
EPG of Puch data for balanced proportion $\phi=0.5$
UAF
No UAF
Table WoE Example 5 balanced case
with $\phi=0.5$ for M=49 - Balding's $\theta=0.02$and Balding's likeLD (which uses allele presence and indirect information from peak heights)
Puch-Solis | UAF | No UAF | likeLD | |
---|---|---|---|---|
Hp donor 1 & U1 vs. Hd U1 & U2 | 8.5052 | 9.1037 | 9.1709 | 5.5441 |
Hp donor 2 & U1 vs. Hd U1 & U2 | 8.4150 | 8.3665 | 8.3797 | 6.1462 |
Hp donor 1 & 2 vs. Hd U1 & U2 | 21.5716 | 21.5520 |
Interesting now in the balanced case $\phi=0.5$ UAF is less incriminating as expected in first two cases but not in the last J
TWO TRACES Puch-Solis Table 3
with "true" $\phi=0.9$ and $\phi=0.9$
input UAF output UAF
input no UAF output no UAF
Note the computations take about 15 min J
Table WoE Example 5 for Two traces
with "true" $\phi=0.9$ and $\phi=0.9$ for M=49 - Balding's $\theta=0.02$
UAF | No UAF | |
---|---|---|
Hp donor 1 & U1 vs. Hd U1 & U2 | 13.10579 | 13.17318 |
Hp donor 2 & U1 vs. Hd U1 & U2 | 12.37039 | 12.38362 |
Hp donor 1 & 2 vs. Hd U1 & U2 | 25.57633 | 25.55674 |
Here UAF is sligtly less incriminating in first two cases J
EVETT EXAMPLE with original allele frequencies (6 markers)
UAF
No UAF
Evett example with USCaucasian allele frequencies (6 markers)
UAF
No UAF
Evett data: WoE for Hp suspect & U1 vs. Hd U1 & U2 under both CGLM and CLM models with M=100
Evett | CGLM UAF | CGLM no UAF | CLM UAF | CLM no UAF | |
---|---|---|---|---|---|
Evett allele frequencies | 7.3 | 8.4471 | 8.4243 | 7.9219 | 8.2331 |
USCaucasian allele frequencies | 8.4842 | 8.4601 |
CLM results based on theta = 0,.1,.2,…,1 w.p. 1/11 each
See source code (mixEvettnew3.R)
WoE: 7.9250 8.2331 if use theta on 0.2 grid instead
MC15 and MC18 EXAMPLE as in CGLM paper
One trace: MC15
Hp: K1 & K3 & U1 Hd: K1 & U1 & U2
NO UAF | M=1 | M=2 | M=3 | M=4 | M=5 | M=10 | M=100 | M=1000 | M=10^4 | M=10^5 | M=10^6 | M=10^7 | M=10^8 |
9,596595 | 12,58927 | 11.6203 | 11.13769 | 10.84472 | 10,64682 | 10,18518 | 9,663195 | 9,603347 | 9,597271 | 9,596662 | 9,596603 | 9,596603 | 9,596595 |
Hp (3): K1& K2 & K3 Hd(3): K1 & K2 & U1
NO UAF | M=1 | M=2 | M=3 | M=4 | M=5 | M=10 | M=100 | M=1000 | M=10^4 | M=10^5 |
12,11822 | 13,2335 | 12.8780 | 12.6901 | 12.5756 | 12,4989 | 12,3248 | 12.1405 | 12,1205 | 12,1184 | 12,11824 |
Hp(4) : K1 & K2 & K3 & U1 Hd(4) : K1 & K2 & U1 & U2.
NO UAF | M=1 | M=2 | M=3 | M=4 | M=5 | M=10 | M=100 | M=1000 | M=10^4 | M=10^5 |
11,2926 | 12,9975 | 12.5247 | 12.2563 | 12.0868 | 11,9686 | 11,6816 | 11,33794 | 11,2972 | 11,2931 | 11,2926 |
+++TWO TRACES MC15 and MC18
3 contributors
Hp(3) : K1 & K2 & K3 Hd(3) : K1 & K2 & U1
NO UAF | M=1 | M=2 | M=3 | M=4 | M=5 | M=10 | M=100 | M=1000 | M=10^4 | M=10^5 |
14,10439 | 16,3918 | 15.4200 | 15.0266 | 14.8140 | 14,6806 | 14,4010 | 14,1348 | 14,1074 | 14,1047 | 14,10442 |
4 contributors
Hp(4) : K1 & K2 & K3 & U1 Hd(4) : K1 & K2 & U1 & U2
NO UAF | M=1 | M=2 | M=3 | M=4 | M=5 | M=10 | M=100 | M=1000 | M=10^4 | M=10^5 |
14,08295 | 16,4556 | 15.47283 | 15.06958 | 14.8488 | 14,7092 | 14,4114 | 14,1175 | 14.08642 | 14,0833 | 14,08298 |