% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/summix.R
\name{summix}
\alias{summix}
\title{summix}
\usage{
summix(
  data,
  reference,
  observed,
  pi.start = NA,
  goodness.of.fit = TRUE,
  override_removeSmallRef = FALSE,
  network = FALSE,
  N_reference = NA,
  reference_colors = NA
)
}
\arguments{
\item{data}{A data frame of the observed and reference allele frequencies for N genetic variants. See the data formatting document at \href{https://github.com/hendriau/Summix}{https://github.com/hendriau/Summix} for more information.}

\item{reference}{A character vector of the column names for the reference groups.}

\item{observed}{A character value that is the column name for the observed group.}

\item{pi.start}{A numeric vector of length K giving the starting guess for the reference group proportions. If not specified, each element defaults to 1/K, where K is the number of reference groups.}

\item{goodness.of.fit}{Default value is TRUE. If set to FALSE, the function returns the raw objective loss from slsqp instead of the default scaled goodness-of-fit measure.}

\item{override_removeSmallRef}{Default value is FALSE. If set to TRUE, reference groups with <1\% global proportions are not automatically removed; this is not recommended.}

\item{network}{Default value is FALSE. If set to TRUE, the function returns a network diagram whose nodes represent the estimated substructure proportions and whose edges represent the degree of similarity between each pair of nodes.}

\item{N_reference}{A numeric vector of the sample sizes for each of the K reference groups; must be specified if network = TRUE.}

\item{reference_colors}{A character vector of length K that specifies the color of each reference group node in the network plot. If not specified, this defaults to K random colors.}
}
\value{
A data frame with the following columns:

goodness.of.fit: scaled objective loss from slsqp() reflecting the fit of the reference data. Values between 0.5 and 1.5 are considered moderate fit and should be used with caution. Values greater than 1.5 indicate poor fit, and users should not perform further analyses using Summix.

iterations: number of iterations of the SLSQP algorithm.

time: run time, in seconds, of the SLSQP algorithm.

filtered: number of genetic variants not used in the reference group mixture proportion estimation due to missing values.

K columns containing the estimated mixture proportions, one for each reference group input into the function.
}
\description{
Estimates mixture proportions of reference groups from large (N SNPs > 10,000) genetic allele frequency (AF) data.
}
\examples{
# load the data
data("ancestryData")

# Estimate 5 reference ancestry proportion values for the gnomAD African/African American group
# using a starting guess of .2 for each ancestry proportion.
summix(data = ancestryData,
    reference=c("reference_AF_afr",
        "reference_AF_eas",
        "reference_AF_eur",
        "reference_AF_iam",
        "reference_AF_sas"),
    observed="gnomad_AF_afr",
    pi.start = c(.2, .2, .2, .2, .2),
    goodness.of.fit=TRUE)
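
# A minimal sketch (not part of the original example): build the uniform
# starting guess 1/K from the reference column names instead of typing it out,
# then inspect the goodness.of.fit column of the returned data frame.
# ref_cols, pi_start, and res are illustrative names only.
ref_cols <- c("reference_AF_afr",
    "reference_AF_eas",
    "reference_AF_eur",
    "reference_AF_iam",
    "reference_AF_sas")
K <- length(ref_cols)
pi_start <- rep(1 / K, K)  # uniform guess, matching the documented default
res <- summix(data = ancestryData,
    reference = ref_cols,
    observed = "gnomad_AF_afr",
    pi.start = pi_start)

# Per the Value section, values above 1.5 indicate poor fit and values
# between 0.5 and 1.5 indicate moderate fit.
if (res$goodness.of.fit > 1.5) {
  message("Poor fit: do not use these estimates for further analyses.")
} else if (res$goodness.of.fit >= 0.5) {
  message("Moderate fit: interpret the estimates with caution.")
}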

}
\references{
\url{https://github.com/hendriau/Summix}
}
\seealso{
\url{https://github.com/hendriau/Summix} for further documentation, and \url{https://github.com/hendriau/Summix2_manuscript} for a larger sample data set and a description of the simulations in the Summix2 manuscript. See the \code{\link[nloptr]{slsqp}} function in the nloptr package for further details on Sequential Quadratic Programming: \url{https://www.rdocumentation.org/packages/nloptr/versions/1.2.2.2/topics/slsqp}
}
\author{
Adelle Price, \email{adelle.price@cuanschutz.edu}

Hayley Wolff, \email{hayley.wolff@cuanschutz.edu}

Audrey Hendricks, \email{audrey.hendricks@cuanschutz.edu}
}
\keyword{admixture}
\keyword{distribution}
\keyword{genetics}
\keyword{mixture}
\keyword{population}
\keyword{stratification}
