Package 'breakaway' reference manual

Title:	Species Richness Estimation and Modeling
Description:	Understanding the drivers of microbial diversity is an important frontier of microbial ecology, and investigating the diversity of samples from microbial ecosystems is a common step in any microbiome analysis. 'breakaway' is the premier package for statistical analysis of microbial diversity. 'breakaway' implements the latest and greatest estimates of species richness, described in Willis and Bunge (2015) <doi:10.1111/biom.12332>, Willis et al. (2017) <doi:10.1111/rssc.12206>, and Willis (2016) <arXiv:1604.02598>, as well as the most commonly used estimates, including the objective Bayes approach described in Barger and Bunge (2010) <doi:10.1214/10-BA527>.
Authors:	Amy D Willis [aut, cre], Bryan D Martin [aut], Pauline Trinh [aut], Sarah Teichman [aut], David Clausen [aut], Kathryn Barger [aut], John Bunge [aut]
Maintainer:	Amy D Willis <[email protected]>
License:	GPL-2
Version:	4.8.4
Built:	2025-02-23 04:19:23 UTC
Source:	https://github.com/adw96/breakaway

alpha_estimates

Description

Build objects of class alpha_estimates from their components. alpha_estimates() is a constructor method

Usage

alpha_estimates(...)
alpha_estimates(...)

Arguments

...

Objects of class alpha_estimate, or a list of objects of class alpha_estimate

Value

An object of class alpha_estimates

(Data) Frequency count table of soil microbes in an apples orchard.

Description

(Data) Frequency count table of soil microbes in an apples orchard.

Usage

apples
apples

Format

A data frame with 88 rows and 2 variables:

index: an index variable
frequency: number of taxa that were observed with this frequency

...

References

Willis, A. and Bunge, J. (2015). Estimating diversity via frequency ratios. Biometrics, 71(4), 1042–1049. doi:10.1111/biom.12332

Walsh, F. et al. (2014). (2014). Restricted streptomycin use in apple orchards did not adversely alter the soil bacteria communities. Frontiers in Microbiology 4, 383.

Modelling total diversity with betta

Description

This function tests for heterogeneity of total diversity (observed plus unobserved) across multiple sites. It can account or test for fixed effects that may explain diversity. It returns the significance of the covariates in explaining diversity and a hypothesis test for heterogeneity.

Usage

betta(
  chats = NULL,
  ses,
  X = NULL,
  initial_est = NULL,
  formula = NULL,
  data = NULL,
  p.digits = 3
)
betta(
  chats = NULL,
  ses,
  X = NULL,
  initial_est = NULL,
  formula = NULL,
  data = NULL,
  p.digits = 3
)

Arguments

`chats`	A vector of estimates of total diversity at different sampling locations. ‘⁠breakaway⁠’ estimates are suggested in the high-diversity case but not enforced.
`ses`	The standard errors in `chats`, the diversity estimates. This can either be a vector of standard errors (with the arguments `chats` and `X`), or the name of the variable in the dataframe `data` that contains the standard errors (with the arguments `formula` and `data`).
`X`	A numeric matrix of covariates. If not supplied, an intercept-only model will be fit. This is optional with the `chats` argument.
`initial_est`	(Optional) A vector of length 1 + ncol(X) giving the starting values for the likelihood maximisation search. The first element is the starting estimate for sigma^2_u, and the remaining elements are the starting elements for beta. Defaults to NULL, in which case the starting values outlined in the paper are used.
`formula`	A formula object of the form $y ~ x \| group$ . Required with the `data` argument.
`data`	A dataframe containing the response, response standard errors, covariates, and grouping variable. Required with the `formula` argument.
`p.digits`	(Optional) A number that specifies the number of digits to which p-values will be rounded. The default value is 3 digits.

Value

`table`	A coefficient table for the model parameters. The columns give the parameter estimates, standard errors, and p-values, respectively. This model is only as effective as your diversity estimation procedure; for this reason please confirm that your estimates are appropriate and that your model is not misspecified. betta_pic may be useful for this purpose.
`cov`	Estimated covariance matrix of the parameter estimates.
`ssq_u`	The estimate of the heterogeneity variance.
`homogeneity`	The test statistic and p-value for the test of homogeneity.
`global`	The test statistic and p-value for the test of model explanatory power.
`blups`	The conditional expected values of the diversity estimates (conditional on the random effects). The authors propose that if the practitioner believes that information from one diversity estimator may inform the others, then using the ‘⁠condfits⁠’ as estimators of total diversity rather than ‘⁠Chats⁠’ may reduce variance of diversity estimates by “sharing strength” across the samples.
`blupses`	The estimated standard deviation (standard errors) in the blups.
`loglikelihood`	The log likelihood of the fitted model.
`aic`	The Akaike information criterion for the fitted model.
`aicc`	The finite sample correction of the Akaike information criterion for the fitted model.
`r_squared_wls`	The weighted R^2 statistic, appropriate for heteroskedastic linear models.
`function.args`	A list containing values initially passed to betta_random.

Note

Ecologists who are interested in the way species richness varies with covariate information often run a regression-type analysis on the observed diversity using their covariate information as predictors. However, in many settings (especially microbial), rare and unobserved taxa play a hugely important role in explaining the subtleties of the ecosystem, however, a regression analysis on the observed diversity level fails to account for these unobserved taxa. By predicting the total level of diversity (for example, via breakaway) and estimating the standard error in the estimate, one can take account of these unobserved, but important, taxa. In order to account for the estimated nature of the response, a mixed model approach is taken, whereby the varying levels of confidence in the estimates contributes to a diagonal but heteroscedastic covariance matrix. Given covariates constitute the fixed effects in the mixed model, and significance of the random effect term “sigsq_u” reflects heterogeneity in the sample, that is, variability that cannot be explained by only the covariates. The authors believe this to be the first attempt at modelling total diversity in a way that accounts for its estimated nature.

Author(s)

Amy Willis

References

Willis, A., Bunge, J., and Whitman, T. (2015). Inference for changes in biodiversity. arXiv preprint.

Willis, A. and Bunge, J. (2015). Estimating diversity via frequency ratios. Biometrics.

Examples


















df <- data.frame(chats = c(2000, 3000, 4000, 3000), ses = c(100, 200, 150, 180),
                 Cont_var = c(100, 150, 100, 50))

# formula notation
betta(formula = chats ~ Cont_var, ses = ses, data = df)

# direct input
betta(c(2000, 3000, 4000, 3000), c(100, 200, 150, 180), cbind(1, c(100, 150, 100,
    50)))

## handles missing data
betta(c(2000, 3000, 4000, 3000), c(100, 200, 150, NA))

## A test for heterogeneity of apples diversity estimates vs butterfly estimates
betta(c(1552, 1500, 884), c(305, 675, 205), cbind(1, c(0, 0, 1)))






















df <- data.frame(chats = c(2000, 3000, 4000, 3000), ses = c(100, 200, 150, 180),
                 Cont_var = c(100, 150, 100, 50))

# formula notation
betta(formula = chats ~ Cont_var, ses = ses, data = df)

# direct input
betta(c(2000, 3000, 4000, 3000), c(100, 200, 150, 180), cbind(1, c(100, 150, 100,
    50)))

## handles missing data
betta(c(2000, 3000, 4000, 3000), c(100, 200, 150, NA))

## A test for heterogeneity of apples diversity estimates vs butterfly estimates
betta(c(1552, 1500, 884), c(305, 675, 205), cbind(1, c(0, 0, 1)))

Confidence intervals and testing for linear combinations of fixed effects estimated via betta() or betta_random()

Description

This function provides point estimates, standard errors, and equal-tailed confidence intervals for linear combinations of fixed effects estimated via betta() or betta_random(). A p-value for a Wald test of the null that the linear combination of effects is equal to zero (against a general alternative) is also returned.

Usage

betta_lincom(fitted_betta, linear_com, signif_cutoff = 0.05)
betta_lincom(fitted_betta, linear_com, signif_cutoff = 0.05)

Arguments

`fitted_betta`	A fitted betta object – i.e., the output of either betta() or betta_random() – containing fixed effect estimates of interest.
`linear_com`	The linear combination of fixed effects for which a point estimate, confidence interval, and hypothesis test are to be produced.
`signif_cutoff`	The type-I significance threshold for confidence intervals. Defaults to 0.05.

Value

table

A table containing a point estimate, standard error, lower and upper confidence bounds, and a p-value for the linear combination of fixed effects specified in input. The p-value is generated via a two-sided Wald test of the null that the linear combination of fixed effects is equal to zero.

Author(s)

David Clausen

References

Willis, A., Bunge, J., and Whitman, T. (2015). Inference for changes in biodiversity. arXiv preprint.

Examples


# generate example data
df <- data.frame(chats = c(2000, 3000, 4000, 3000), ses = c(100, 200, 150, 180),
                 Cont_var = c(100, 150, 100, 50))

# fit betta()
example_fit <- betta(formula = chats ~ Cont_var, ses = ses, data = df)

# generate point estimate and 95% CI for mean richness at Cont_var = 125

betta_lincom(fitted_betta = example_fit,
linear_com = c(1, 125)) # this tells betta_lincom to estimate value of beta_0 + 125*beta_1,
# where beta_0 is the intercept, and beta_1 is the (true value of the) coefficient on Cont_var


# generate example data
df <- data.frame(chats = c(2000, 3000, 4000, 3000), ses = c(100, 200, 150, 180),
                 Cont_var = c(100, 150, 100, 50))

# fit betta()
example_fit <- betta(formula = chats ~ Cont_var, ses = ses, data = df)

# generate point estimate and 95% CI for mean richness at Cont_var = 125

betta_lincom(fitted_betta = example_fit,
linear_com = c(1, 125)) # this tells betta_lincom to estimate value of beta_0 + 125*beta_1,
# where beta_0 is the intercept, and beta_1 is the (true value of the) coefficient on Cont_var

function for plotting total diversity

Description

A simple plotting interface for comparing total diversity across samples or a covariate gradient.

Usage

betta_pic(
  y,
  se,
  x = 1:length(y),
  ylimu = NULL,
  myy = NULL,
  mymain = NULL,
  mycol = NULL,
  labs = NULL,
  mypch = NULL,
  myxlim = NULL
)
betta_pic(
  y,
  se,
  x = 1:length(y),
  ylimu = NULL,
  myy = NULL,
  mymain = NULL,
  mycol = NULL,
  labs = NULL,
  mypch = NULL,
  myxlim = NULL
)

Arguments

`y`	A vector of estimates of total diversity. Other parameter estimates are accessible; this method may be used for plotting any parameter estimates..
`se`	The standard errors in ‘⁠y⁠’, the diversity (or other parameter's) estimates.
`x`	A vector of covariates to form the x-coordinates of the intervals. If no argument is given, defaults to the order.
`ylimu`	The upper endpoint of the y-axis.
`myy`	Deprecated, for backwards compatibility
`mymain`	Deprecated, for backwards compatibility
`mycol`	Deprecated, for backwards compatibility
`labs`	Deprecated, for backwards compatibility
`mypch`	Deprecated, for backwards compatibility
`myxlim`	Deprecated, for backwards compatibility

Value

A ggplot object.

Author(s)

Amy Willis

References

Willis, A., Bunge, J., and Whitman, T. (2015). Inference for changes in biodiversity. arXiv preprint.

Examples

betta_pic(c(1552, 1500, 884), c(305, 675, 205), mymain = "Example title")

betta_pic(c(1552, 1500, 884), c(305, 675, 205), mymain = "Example title")

modelling total diversity with random effects

Description

This function extends betta() to permit random effects modelling.

Usage

betta_random(
  chats = NULL,
  ses,
  X = NULL,
  groups = NULL,
  formula = NULL,
  data = NULL,
  p.digits = 3
)
betta_random(
  chats = NULL,
  ses,
  X = NULL,
  groups = NULL,
  formula = NULL,
  data = NULL,
  p.digits = 3
)

Arguments

`chats`	A vector of estimates of total diversity at different sampling locations. Required with the `groups` argument, and optionally with the `X` argument.
`ses`	The standard errors in `chats`, the diversity estimates. This can either be a vector of standard errors (with the arguments `chats` and `X`), or the name of the variable in the dataframe `data` that contains the standard errors (with the arguments `formula` and `data`).
`X`	A numeric matrix of covariates corresponding to fixed effects. If not supplied, an intercept-only model will be fit. Optional with the `chats` and `groups` arguments.
`groups`	A categorical variable representing the random-effects groups that each of the estimates belong to. Required with the `chats` argument and optionally with the `X` argument.
`formula`	A formula object of the form $y ~ x \| group$ . Required with the `data` argument.
`data`	A dataframe containing the response, response standard errors, covariates, and grouping variable. Required with the `formula` argument.
`p.digits`	(Optional) A number that specifies the number of digits to which p-values will be rounded. The default value is 3 digits.

Value

`table`	A coefficient table for the model parameters. The columns give the parameter estimates, standard errors, and p-values, respectively. This model is only as effective as your diversity estimation procedure; for this reason please confirm that your estimates are appropriate and that your model is not misspecified. betta_pic may be useful for this purpose.
`cov`	Estimated covariance matrix of the parameter estimates.
`ssq_u`	The estimate of the heterogeneity variance.
`ssq_g`	Estimates of within-group variance. The estimate will be zero for groups with only one observation.
`homogeneity`	The test statistic and p-value for the test of homogeneity.
`global`	The test statistic and p-value for the test of model explanatory power.
`blups`	The conditional expected values of the diversity estimates (conditional on the random effects). Estimates of variability for the random effects case are unavailable at this time; please contact the maintainer if needed.
`loglikelihood`	The log likelihood of the fitted model.
`aic`	The Akaike information criterion for the fitted model.
`aicc`	The finite sample correction of the Akaike information criterion for the fitted model.
`function.args`	A list containing values initially passed to betta_random.

Author(s)

Amy Willis

References

Willis, A., Bunge, J., and Whitman, T. (2015). Inference for changes in biodiversity. arXiv preprint.

Examples


df <- data.frame(chats = c(2000, 3000, 4000, 3000),
ses = c(100, 200, 150, 180),
Cont_var = c(100, 150, 100, 50),
groups = c("a", "a", "a", "b"))

# formula notation
betta_random(formula = chats ~ Cont_var| groups,
ses = ses,
 data = df)

# direct input
betta_random(c(2000, 3000, 4000, 3000), c(100, 200, 150, 180),
X = cbind(Int = 1, Cont_var = c(100, 150, 100, 50)),
groups = c("a", "a", "a", "b"))

## handles missing data
betta_random(c(2000, 3000, 4000, 3000), c(100, 200, 150, NA),
groups = c("a", NA,
    "b", "b"))

df <- data.frame(chats = c(2000, 3000, 4000, 3000),
ses = c(100, 200, 150, 180),
Cont_var = c(100, 150, 100, 50),
groups = c("a", "a", "a", "b"))

# formula notation
betta_random(formula = chats ~ Cont_var| groups,
ses = ses,
 data = df)

# direct input
betta_random(c(2000, 3000, 4000, 3000), c(100, 200, 150, 180),
X = cbind(Int = 1, Cont_var = c(100, 150, 100, 50)),
groups = c("a", "a", "a", "b"))

## handles missing data
betta_random(c(2000, 3000, 4000, 3000), c(100, 200, 150, NA),
groups = c("a", NA,
    "b", "b"))

Species richness estimation with breakaway

Description

breakaway is a wrapper for modern species richness estimation for modern datasets

Usage

breakaway(
  input_data,
  cutoff = NA,
  output = NULL,
  plot = NULL,
  answers = NULL,
  print = NULL,
  ...
)
breakaway(
  input_data,
  cutoff = NA,
  output = NULL,
  plot = NULL,
  answers = NULL,
  print = NULL,
  ...
)

Arguments

`input_data`	An input type that can be processed by `convert()`
`cutoff`	The maximum frequency count to use for model fitting
`output`	Deprecated; only for backwards compatibility
`plot`	Deprecated; only for backwards compatibility
`answers`	Deprecated; only for backwards compatibility
`print`	Deprecated; only for backwards compatibility
`...`	Additional arguments will be ignored; this is for backward compatibility

Value

An object of class alpha_estimate

Note

‘⁠breakaway⁠’ presents an estimator of species richness that is well-suited to the high-diversity/microbial setting. However, many microbial datasets display more diversity than the Kemp-type models can permit. In this case, the log-transformed WLRM diversity estimator of Rocchetti et. al. (2011) is returned. The authors' experience suggests that some datasets that require the log-transformed WLRM contain “false” diversity, that is, diversity attributable to sequencing errors (via an inflated singleton count). The authors encourage judicious use of diversity estimators when the dataset may contain these errors, and recommend the use of breakaway_nof1 as an exploratory tool in this case.

Author(s)

Amy Willis

References

Willis, A. and Bunge, J. (2015). Estimating diversity via frequency ratios. Biometrics, 71(4), 1042–1049.

Examples


breakaway(apples)
breakaway(apples, plot = FALSE, output = FALSE, answers = TRUE)

breakaway(apples)
breakaway(apples, plot = FALSE, output = FALSE, answers = TRUE)

species richness estimation without singletons

Description

This function permits estimation of total diversity based on a sample frequency count table. Unlike breakaway, it does not require an input for the number of species observed once, making it an excellent exploratory tool for microbial ecologists who believe that their sample may contain spurious singletons. The underlying estimation procedure is similar to that of breakaway and is outlined in Willis & Bunge (2014). The diversity estimate, standard error, estimated model coefficients and plot of the fitted model are returned.

Usage

breakaway_nof1(
  input_data,
  output = NULL,
  plot = NULL,
  answers = NULL,
  print = NULL
)
breakaway_nof1(
  input_data,
  output = NULL,
  plot = NULL,
  answers = NULL,
  print = NULL
)

Arguments

`input_data`	An input type that can be processed by `convert()`
`output`	Deprecated; only for backwards compatibility
`plot`	Deprecated; only for backwards compatibility
`answers`	Deprecated; only for backwards compatibility
`print`	Deprecated; only for backwards compatibility

Value

An object of class alpha_estimate

`code`	A category representing algorithm behaviour. ‘⁠code=1⁠’ indicates no nonlinear models converged and the transformed WLRM diversity estimate of Rocchetti et. al. (2011) is returned. ‘⁠code=2⁠’ indicates that the iteratively reweighted model converged and was returned. ‘⁠code=3⁠’ indicates that iterative reweighting did not converge but a model based on a simplified variance structure was returned (in this case, the variance of the frequency ratios is assumed to be proportional to the denominator frequency index). Please peruse your fitted model before using your diversity estimate.
`name`	The “name” of the selected model. The first integer represents the numerator polynomial degree and the second integer represents the denominator polynomial degree. See Willis & Bunge (2014) for details.
`para`	Estimated model parameters and standard errors.
`est`	The estimate of total (observed plus unobserved) diversity.
`seest`	The standard error in the diversity estimate.
`full`	The chosen nonlinear model for frequency ratios.

Note

It is common for microbial ecologists to believe that their dataset contains false diversity. This often arises because sequencing errors result in classifying already observed organisms as new organisms. ‘⁠breakaway_nof1⁠’ was developed as an exploratory tool in this case. Practitioners can run ‘⁠breakaway⁠’ on their dataset including the singletons, and ‘⁠breakaway_nof1⁠’ on their dataset excluding the singletons, and assess if the estimated levels of diversity are very different. Great disparity may provide evidence of an inflated singleton count, or at the very least, that ‘⁠breakaway⁠’ is especially sensitive to the number of rare species observed. Note that ‘⁠breakaway_nof1⁠’ may be less stable than ‘⁠breakaway⁠’ due to predicting based on a reduced dataset, and have greater standard errors.

Author(s)

Amy Willis

References

Willis, A. (2015). Species richness estimation with high diversity but spurious singletons. arXiv.

Willis, A. and Bunge, J. (2015). Estimating diversity via frequency ratios. Biometrics.

Examples


breakaway_nof1(apples[-1, ])
breakaway_nof1(apples[-1, ], plot = FALSE, output = FALSE, answers = TRUE)

breakaway_nof1(apples[-1, ])
breakaway_nof1(apples[-1, ], plot = FALSE, output = FALSE, answers = TRUE)

Build frequency count tables from an OTU table

Description

Build frequency count tables from an OTU table

Usage

build_frequency_count_tables(the_table)
build_frequency_count_tables(the_table)

Arguments

the_table

An OTU table as a data frame or a matrix. Columns are the samples and rows give the taxa.

Value

A list of frequency count tables corresponding to the columns.

Chao-Bunge species richness estimator

Description

This function implements the species richness estimation procedure outlined in Chao & Bunge (2002).

Usage

chao_bunge(input_data, cutoff = 10, output = NULL, answers = NULL)
chao_bunge(input_data, cutoff = 10, output = NULL, answers = NULL)

Arguments

`input_data`	An input type that can be processed by `convert()` or a `phyloseq` object
`cutoff`	The maximum frequency to use in fitting.
`output`	Deprecated; only for backwards compatibility
`answers`	Deprecated; only for backwards compatibility

Value

An object of class alpha_estimate, or alpha_estimates for phyloseq objects

Author(s)

Amy Willis

Examples


chao_bunge(apples)

chao_bunge(apples)

The Chao-Shen estimate of Shannon diversity

Description

The Chao-Shen estimate of Shannon diversity

Usage

chao_shen(input_data)
chao_shen(input_data)

Arguments

input_data

An input type that can be processed by convert()

Value

An object of class alpha_estimate

Chao1 species richness estimator

Description

This function implements the Chao1 richness estimate, which is often mistakenly referred to as an index.

Usage

chao1(input_data, output = NULL, answers = NULL)
chao1(input_data, output = NULL, answers = NULL)

Arguments

`input_data`	An input type that can be processed by `convert()` or a `phyloseq` object
`output`	Deprecated; only for backwards compatibility
`answers`	Deprecated; only for backwards compatibility

Value

An object of class alpha_estimate, or alpha_estimates for phyloseq objects

Note

The authors of this package strongly discourage the use of this estimator. It is only valid when you wish to assume that every taxa has equal probability of being observed. You don't really think that's possible, do you?

Author(s)

Amy Willis

Examples



chao1(apples)


chao1(apples)

Bias-corrected Chao1 species richness estimator

Description

This function implements the bias-corrected Chao1 richness estimate.

Usage

chao1_bc(input_data, output = NULL, answers = NULL)
chao1_bc(input_data, output = NULL, answers = NULL)

Arguments

`input_data`	An input type that can be processed by `convert()` or a `phyloseq` object
`output`	Deprecated; only for backwards compatibility
`answers`	Deprecated; only for backwards compatibility

Value

An object of class alpha_estimate, or alpha_estimates for phyloseq objects

Note

Author(s)

Amy Willis

Examples



chao1_bc(apples)


chao1_bc(apples)

Run some basic checks on a possible frequency count table

Description

Run some basic checks on a possible frequency count table

Usage

check_format(output_data)
check_format(output_data)

Arguments

output_data

A matrix to test

Value

A checked frequency count table

convert between different inputs for alpha-diversity estimates

Description

Inputs slated for development include phyloseq and otu_table

Usage

convert(input_data)
convert(input_data)

Arguments

input_data

Supported types include filenames, data frames, matrices, vectors...

Value

Frequency count able

Find a cut-off for estimates relying on contiguous counts

Description

Find a cut-off for estimates relying on contiguous counts

Usage

cutoff_wrap(my_data, requested = NA)
cutoff_wrap(my_data, requested = NA)

Arguments

`my_data`	Frequency count table
`requested`	The user-requested cutoff

Value

Cutoff value

Conduct F test of null hypothesis LB = 0 using output from betta() or betta_random()

Description

This function performs an F-test of a null hypothesis LB = 0 where B is a vector of p fixed effects returned by betta() or betta_random() and L is an m x p matrix with linearly independent rows.

Usage

F_test(fitted_betta, L, method = "bootstrap", nboot = 1000)
F_test(fitted_betta, L, method = "bootstrap", nboot = 1000)

Arguments

`fitted_betta`	A fitted betta object – i.e., the output of either betta() or betta_random() – containing fixed effect estimates of interest.
`L`	An m x p matrix defining the null LB = 0. L must have full row rank.
`method`	A character variable indicating which method should be used to estimate the distribution of the test statistic under the null.
`nboot`	Number of bootstrap samples to use if method = "bootstrap". Ignored if method = "asymptotic".

Value

A list containing

`pval`	The p-value
`F_stat`	The calculated F statistic
`boot_F`	A vector of bootstrapped F statistics if bootstrap has been used. Otherwise NULL.

Author(s)

David Clausen

References

Willis, A., Bunge, J., and Whitman, T. (2015). Inference for changes in biodiversity. arXiv preprint.

Examples


# generate example data
df <- data.frame(chats = c(2000, 3000, 4000, 3000,
2000, 3000, 4000, 3000), ses = c(100, 200, 150, 180,
100, 200, 150, 180),
                 Cont_var = c(100, 150, 100, 50,
                 100, 150, 100, 50),
                 Cont_var_2 = c(50,200,25,125,
                 50,200,25,125))

# fit betta()
example_fit <- betta(formula = chats ~ Cont_var + Cont_var_2, ses = ses, data = df)


# construct L for hypothesis that B_cont_var = B_cont_var_2 = 0
L <- rbind(c(0,1,0),
           c(0,0,1))

F_test_results <- F_test(example_fit, L, nboot = 100)

# generate example data
df <- data.frame(chats = c(2000, 3000, 4000, 3000,
2000, 3000, 4000, 3000), ses = c(100, 200, 150, 180,
100, 200, 150, 180),
                 Cont_var = c(100, 150, 100, 50,
                 100, 150, 100, 50),
                 Cont_var_2 = c(50,200,25,125,
                 50,200,25,125))

# fit betta()
example_fit <- betta(formula = chats ~ Cont_var + Cont_var_2, ses = ses, data = df)


# construct L for hypothesis that B_cont_var = B_cont_var_2 = 0
L <- rbind(c(0,1,0),
           c(0,0,1))

F_test_results <- F_test(example_fit, L, nboot = 100)

Calculate F statistic under null hypothesis LB = 0 using output from betta() or betta_random()

Description

This function calculates an F statistic for a test of null hypothesis LB = 0 (against an unrestricted alternative) where B is a vector of p fixed effects returned by betta() or betta_random() and L is an m x p matrix with linearly independent rows.

Usage

get_F_stat(fitted_betta, L)
get_F_stat(fitted_betta, L)

Arguments

`fitted_betta`	A fitted betta object – i.e., the output of either betta() or betta_random() – containing fixed effect estimates of interest.
`L`	An m x p matrix defining the null LB = 0. L must have full row rank.

Value

A list containing

F_stat

The calculated F statistic

The Good-Turing estimate of species richness

Description

The Good-Turing estimate of species richness

Usage

good_turing(input_data)
good_turing(input_data)

Arguments

input_data

An input type that can be processed by convert() or a phyloseq object

Value

An object of class alpha_estimate, or alpha_estimates for phyloseq objects

(Data) Frequency count table of soil microbes in Hawaii.

Description

(Data) Frequency count table of soil microbes in Hawaii.

Usage

hawaii
hawaii

Format

A data frame with 198 rows and 2 variables:

index: an index variable
frequency: number of taxa that were observed with this frequency

...

References

Willis, A. and Bunge, J. (2015). Estimating diversity via frequency ratios. Biometrics, 71(4), 1042–1049. doi:10.1111/biom.12332

Species richness estimation with Kemp-type models

Description

This function implements the species richness estimation procedure outlined in Willis & Bunge (2015). The diversity estimate, standard error, estimated model coefficients, model details and plot of the fitted model are returned.

Usage

kemp(
  input_data,
  cutoff = NA,
  output = NULL,
  plot = NULL,
  answers = NULL,
  print = NULL,
  ...
)
kemp(
  input_data,
  cutoff = NA,
  output = NULL,
  plot = NULL,
  answers = NULL,
  print = NULL,
  ...
)

Arguments

`input_data`	An input type that can be processed by `convert()`
`cutoff`	The maximum frequency count to use for model fitting
`output`	Deprecated; only for backwards compatibility
`plot`	Deprecated; only for backwards compatibility
`answers`	Deprecated; only for backwards compatibility
`print`	Deprecated; only for backwards compatibility
`...`	Additional arguments will be ignored; this is for backward compatibility

Value

An object of class alpha_estimate

`code`	A category representing algorithm behaviour. ‘⁠code=1⁠’ indicates no nonlinear models converged and the transformed WLRM diversity estimate of Rocchetti et. al. (2011) is returned. ‘⁠code=2⁠’ indicates that the iteratively reweighted model converged and was returned. ‘⁠code=3⁠’ indicates that iterative reweighting did not converge but a model based on a simplified variance structure was returned (in this case, the variance of the frequency ratios is assumed to be proportional to the denominator frequency index). Please peruse your fitted model before using your diversity estimate.
`name`	The “name” of the selected model. The first integer represents the numerator polynomial degree and the second integer represents the denominator polynomial degree of the model for the frequency ratios. See Willis & Bunge (2015) for details.
`para`	Estimated model parameters and standard errors.
`est`	The estimate of total (observed plus unobserved) diversity.
`seest`	The standard error in the diversity estimate.
`full`	The chosen nonlinear model for frequency ratios.
`ci`	An asymmetric 95% confidence interval for diversity.

Author(s)

Amy Willis

References

Willis, A. and Bunge, J. (2015). Estimating diversity via frequency ratios. Biometrics, 71(4), 1042–1049.

Rocchetti, I., Bunge, J. and Bohning, D. (2011). Population size estimation based upon ratios of recapture probabilities. Annals of Applied Statistics, 5.

Examples


kemp(apples)
kemp(apples, plot = FALSE, output = FALSE, answers = TRUE)

kemp(apples)
kemp(apples, plot = FALSE, output = FALSE, answers = TRUE)

Make design matrix

Description

Make design matrix

Usage

make_design_matrix(phyloseq_object, variables)
make_design_matrix(phyloseq_object, variables)

Arguments

`phyloseq_object`	A phyloseq object
`variables`	variable names

Value

A matrix object giving the design matrix from the desired variables.

Draw frequency count subtables from an OTU table

Description

Draw frequency count subtables from an OTU table

Usage

make_frequency_count_table(labels)
make_frequency_count_table(labels)

Arguments

labels

A vector of counts of the taxa; i.e. a vector giving the number of times each taxon was observed.

Value

A frequency count table.

Estimate species richness with an objective Bayes method using a geometric model

Description

Estimate species richness with an objective Bayes method using a geometric model

Usage

objective_bayes_geometric(
  data,
  output = TRUE,
  plot = TRUE,
  answers = FALSE,
  tau = 10,
  burn.in = 100,
  iterations = 2500,
  Metropolis.stdev.N = 75,
  Metropolis.start.theta = 1,
  Metropolis.stdev.theta = 0.3
)
objective_bayes_geometric(
  data,
  output = TRUE,
  plot = TRUE,
  answers = FALSE,
  tau = 10,
  burn.in = 100,
  iterations = 2500,
  Metropolis.stdev.N = 75,
  Metropolis.start.theta = 1,
  Metropolis.stdev.theta = 0.3
)

Arguments

`data`	TODO(Kathryn)(Kathryn)
`output`	TODO(Kathryn)(Kathryn)
`plot`	TODO(Kathryn)(Kathryn)
`answers`	TODO(Kathryn)(Kathryn)
`tau`	TODO(Kathryn)
`burn.in`	TODO(Kathryn)
`iterations`	TODO(Kathryn)
`Metropolis.stdev.N`	TODO(Kathryn)
`Metropolis.start.theta`	TODO(Kathryn)
`Metropolis.stdev.theta`	TODO(Kathryn)

Value

A list of results, including

est

the median of estimates of N

`ci`	a confidence interval for N

mean

the mean of estimates of N

semeanest

the standard error of mean estimates

dic

the DIC of the model

fits

fitted values

, and

diagnostics

model diagonstics

Objective Bayes species richness estimate with the mixed-geometric model

Description

Objective Bayes species richness estimate with the mixed-geometric model

Usage

objective_bayes_mixedgeo(
  data,
  output = TRUE,
  plot = TRUE,
  answers = FALSE,
  tau = 10,
  burn.in = 100,
  iterations = 2500,
  Metropolis.stdev.N = 100,
  Metropolis.start.T1 = 1,
  Metropolis.stdev.T1 = 2,
  Metropolis.start.T2 = 3,
  Metropolis.stdev.T2 = 2,
  bars = 3
)
objective_bayes_mixedgeo(
  data,
  output = TRUE,
  plot = TRUE,
  answers = FALSE,
  tau = 10,
  burn.in = 100,
  iterations = 2500,
  Metropolis.stdev.N = 100,
  Metropolis.start.T1 = 1,
  Metropolis.stdev.T1 = 2,
  Metropolis.start.T2 = 3,
  Metropolis.stdev.T2 = 2,
  bars = 3
)

Arguments

`data`	TODO(Kathryn)
`output`	TODO(Kathryn)
`plot`	TODO(Kathryn)
`answers`	TODO(Kathryn)
`tau`	TODO(Kathryn)
`burn.in`	TODO(Kathryn)
`iterations`	TODO(Kathryn)
`Metropolis.stdev.N`	TODO(Kathryn)
`Metropolis.start.T1`	TODO(Kathryn)
`Metropolis.stdev.T1`	TODO(Kathryn)
`Metropolis.start.T2`	TODO(Kathryn)
`Metropolis.stdev.T2`	TODO(Kathryn)
`bars`	TODO(Kathryn)

Value

A list of results, including

est

the median of estimates of N

`ci`	a confidence interval for N

mean

the mean of estimates of N

semeanest

the standard error of mean estimates

dic

the DIC of the model

fits

fitted values

, and

diagnostics

model diagonstics

Objective Bayes species richness estimate with the Negative Binomial model

Description

Objective Bayes species richness estimate with the Negative Binomial model

Usage

objective_bayes_negbin(
  data,
  output = TRUE,
  plot = TRUE,
  answers = FALSE,
  tau = 10,
  burn.in = 1000,
  iterations = 5000,
  Metropolis.stdev.N = 100,
  Metropolis.start.T1 = -0.8,
  Metropolis.stdev.T1 = 0.01,
  Metropolis.start.T2 = 0.8,
  Metropolis.stdev.T2 = 0.01,
  bars = 5
)
objective_bayes_negbin(
  data,
  output = TRUE,
  plot = TRUE,
  answers = FALSE,
  tau = 10,
  burn.in = 1000,
  iterations = 5000,
  Metropolis.stdev.N = 100,
  Metropolis.start.T1 = -0.8,
  Metropolis.stdev.T1 = 0.01,
  Metropolis.start.T2 = 0.8,
  Metropolis.stdev.T2 = 0.01,
  bars = 5
)

Arguments

`data`	TODO(Kathryn)
`output`	TODO(Kathryn)
`plot`	TODO(Kathryn)
`answers`	TODO(Kathryn)
`tau`	TODO(Kathryn)
`burn.in`	TODO(Kathryn)
`iterations`	TODO(Kathryn)
`Metropolis.stdev.N`	TODO(Kathryn)
`Metropolis.start.T1`	TODO(Kathryn)
`Metropolis.stdev.T1`	TODO(Kathryn)
`Metropolis.start.T2`	TODO(Kathryn)
`Metropolis.stdev.T2`	TODO(Kathryn)
`bars`	TODO(Kathryn)

Value

A list of results, including

est

the median of estimates of N

`ci`	a confidence interval for N

mean

the mean of estimates of N

semeanest

the standard error of mean estimates

dic

the DIC of the model

fits

fitted values

, and

diagnostics

model diagonstics

Objective Bayes species richness estimate with the Poisson model

Description

Objective Bayes species richness estimate with the Poisson model

Usage

objective_bayes_poisson(
  data,
  output = TRUE,
  plot = TRUE,
  answers = FALSE,
  tau = 10,
  burn.in = 100,
  iterations = 2500,
  Metropolis.stdev.N = 75,
  Metropolis.start.lambda = 1,
  Metropolis.stdev.lambda = 0.3,
  bars = 5
)
objective_bayes_poisson(
  data,
  output = TRUE,
  plot = TRUE,
  answers = FALSE,
  tau = 10,
  burn.in = 100,
  iterations = 2500,
  Metropolis.stdev.N = 75,
  Metropolis.start.lambda = 1,
  Metropolis.stdev.lambda = 0.3,
  bars = 5
)

Arguments

`data`	TODO(Kathryn)
`output`	TODO(Kathryn)
`plot`	TODO(Kathryn)
`answers`	TODO(Kathryn)
`tau`	TODO(Kathryn)
`burn.in`	TODO(Kathryn)
`iterations`	TODO(Kathryn)
`Metropolis.stdev.N`	TODO(Kathryn)
`Metropolis.start.lambda`	TODO(Kathryn)
`Metropolis.stdev.lambda`	TODO(Kathryn)
`bars`	TODO(Kathryn)

Value

A list of results, including

est

the median of estimates of N

`ci`	a confidence interval for N

mean

the mean of estimates of N

semeanest

the standard error of mean estimates

dic

the DIC of the model

fits

fitted values

, and

diagnostics

model diagonstics

(Data) Data frame of covariate information about pasolli_et_al.

Description

(Data) Data frame of covariate information about pasolli_et_al.

Usage

pasolli_et_al
pasolli_et_al

Format

A data frame with 4930 rows and 9 variables:

SGB.ID: sample ID
AverageAbundance: average abundance in the sample
std: standard deviation
q1: first quartile
MedianAbundance: median abundance
q3: third quartile
min: minimum
max: maximum
#.Samples: number of samples

...

Wrapper for phyloseq objects

Description

Wrapper for phyloseq objects

Usage

physeq_wrap(fn, physeq, ...)
physeq_wrap(fn, physeq, ...)

Arguments

`fn`	alpha diversity estimator function with `breakaway` to be applied to `physeq`
`physeq`	A `phyloseq` object, or an object of class `otu_table`
`...`	Additional arguments for fn

Value

Object of class alpha_estimates

Plot function for alpha_estimates class

Description

Plot function for alpha_estimates class

Usage

## S3 method for class 'alpha_estimates'
plot(
  x,
  physeq = NULL,
  measure = NULL,
  color = NULL,
  shape = NULL,
  title = NULL,
  trim_plot = FALSE,
  ...
)
## S3 method for class 'alpha_estimates'
plot(
  x,
  physeq = NULL,
  measure = NULL,
  color = NULL,
  shape = NULL,
  title = NULL,
  trim_plot = FALSE,
  ...
)

Arguments

`x`	Object of class `alpha_estimates`.
`physeq`	(Optional). Default `NULL`. Required object of class `phyloseq` if including a `sample_data` variable for `color` or `shape`.
`measure`	(Optional). If there are multiple richness measures included in `x`, this can be set to the the desired measure to be plotted. Defaults to the measure of the first alpha diversity estimate.
`color`	(Optional). Default `NULL`. The sample variable to map to different colors. Can be a single character string of the variable name in `sample_data` or a custom supplied vector with length equal to the number of samples.
`shape`	(Optional). Default `NULL`. The sample variable to map to different shapes. Can be a single character string of the variable name in `sample_data` or a custom supplied vector with length equal to the number of samples.
`title`	(Optional). Default NULL. Character string. The main title for the graphic.
`trim_plot`	(Optional). Default `FALSE`. Boolean indicator for whether you want the plot to include the full confidence intervals.
`...`	See details

Details

... does not currently have any implemented options. Optional arguments currently include "trim_plot", a Optional

Value

A ggplot object.

Examples


library(phyloseq)
data(GlobalPatterns)
alphas <- breakaway(GlobalPatterns)
plot(alphas)

library(phyloseq)
data(GlobalPatterns)
alphas <- breakaway(GlobalPatterns)
plot(alphas)

PoissonModel

Description

A model to estimate the number of missing taxa under a Poisson Model

Usage

poisson_model(input_data, cutoff = 10)
poisson_model(input_data, cutoff = 10)

Arguments

`input_data`	A frequency count table
`cutoff`	The largest frequency to use for predicting f0. Defaults to 10.

Value

An object of class alpha_estimate.

PoissonModelNof1

Description

A model to estimate the number of missing taxa under a zero- and one-truncated Poisson Model

Usage

poisson_model_nof1(input_data, cutoff = 10)
poisson_model_nof1(input_data, cutoff = 10)

Arguments

`input_data`	A frequency count table
`cutoff`	The largest frequency to use for predicting f0 and f1. Defaults to 10.

Value

An object of class alpha_estimate.

OTU table to relative abundances

Description

OTU table to relative abundances

Usage

proportions_instead(the_table)
proportions_instead(the_table)

Arguments

the_table

An OTU table

Value

A proportion table or vector.

Negative binomially distributed frequency count tables.

Description

Simulate a frequency count table based on a negative binomial model. Zero-truncated, obviously.

Usage

rnbinomtable(C, size, probability)
rnbinomtable(C, size, probability)

Arguments

`C`	species richness
`size`	size parameter for the negative binomial distribution
`probability`	probability parameter for the negative binomial distribution

Value

A simulated frequency count table.

Author(s)

Amy Willis

beta version: Zero-truncated negative binomially distributed frequency count tables.

Description

Simulate a frequency count table based on a negative binomial model. Zero-truncated, obviously.

Usage

rztnbinomtable(C, size, probability)
rztnbinomtable(C, size, probability)

Arguments

`C`	species richness
`size`	size parameter for the negative binomial distribution
`probability`	probability parameter for the negative binomial distribution

Value

A simulated frequency count table.

Author(s)

Amy Willis

Plug-in Inverse Simpson diversity

Description

This function implements the plug-in Inverse Simpson diversity

Usage

sample_inverse_simpson(input_data)
sample_inverse_simpson(input_data)

Arguments

input_data

An input type that can be processed by convert() or a phyloseq object

Value

An object of class alpha_estimate, or alpha_estimates for phyloseq objects

Examples

sample_inverse_simpson(apples)

sample_inverse_simpson(apples)

Sample richness estimator

Description

This function implements the sample richness estimate, which is the number of non-zero taxa per sample.

Usage

sample_richness(input_data)
sample_richness(input_data)

Arguments

input_data

An input type that can be processed by convert() or a phyloseq object

Value

An object of class alpha_estimate, or alpha_estimates for phyloseq objects

Examples



sample_richness(apples)


sample_richness(apples)

Plug-in Shannon diversity

Description

This function implements the plug-in Shannon diversity

Usage

sample_shannon(input_data)
sample_shannon(input_data)

Arguments

input_data

An input type that can be processed by convert() or a phyloseq object

Value

An object of class alpha_estimate, or alpha_estimates for phyloseq objects

Examples

sample_shannon(apples)

sample_shannon(apples)

Plug-in Shannon's E ("Equitability")

Description

This function implements the plug-in Shannon's E

Usage

sample_shannon_e(input_data)
sample_shannon_e(input_data)

Arguments

input_data

An input type that can be processed by convert() or a phyloseq object

Value

An object of class alpha_estimate, or alpha_estimates for phyloseq objects

Examples

sample_shannon_e(apples)

sample_shannon_e(apples)

Plug-in Simpson diversity

Description

This function implements the plug-in Simpson diversity

Usage

sample_simpson(input_data)
sample_simpson(input_data)

Arguments

input_data

An input type that can be processed by convert() or a phyloseq object

Value

An object of class alpha_estimate, or alpha_estimates for phyloseq objects

Examples

sample_simpson(apples)

sample_simpson(apples)

Estimate the sample size needed to do an unpaired one-way test using betta

Description

Estimate the sample size needed to do an unpaired one-way test using betta

Usage

sample_size_estimate(
  control_group_est,
  se_est,
  diff = 5,
  alpha = 0.05,
  prop = 0.8,
  samples = 20,
  precision = 5
)
sample_size_estimate(
  control_group_est,
  se_est,
  diff = 5,
  alpha = 0.05,
  prop = 0.8,
  samples = 20,
  precision = 5
)

Arguments

`control_group_est`	An estimate of the alpha diversity parameter for the control group
`se_est`	An estimate of the (common) standard deviation
`diff`	An estimate of the difference between the control and non-control groups
`alpha`	Minimum significance level desired
`prop`	Desired power
`samples`	Number of bootstrap resamples used to estimate the sample size. Increase for a more accurate estimate.
`precision`	How much to increment the sample size as we try to increase the power

Value

An estimate of the necessary sample size and some details

Plot the power obtained with sample size

Description

Plot the power obtained with sample size

Usage

sample_size_figure(control_group_est, se_est, diff = 5, samples = 20)
sample_size_figure(control_group_est, se_est, diff = 5, samples = 20)

Arguments

`control_group_est`	An estimate of the alpha diversity parameter for the control group
`se_est`	An estimate of the (common) standard deviation
`diff`	An estimate of the difference between the control and non-control groups
`samples`	Number of bootstrap resamples used to estimate the sample size. Increase for a more accurate estimate.

Value

A plot of the power with the sample size

Simulate from a fitted betta model

Description

Simulate from a fitted betta model

Usage

simulate_betta(fitted_betta, nsim)
simulate_betta(fitted_betta, nsim)

Arguments

`fitted_betta`	A fitted betta object
`nsim`	Number of times to simulate

Value

A list of length nsim, each element of which is a vector of simulated Y-values under the fitted betta model

Simulate from a fitted betta_random model

Description

Simulate from a fitted betta_random model

Usage

simulate_betta_random(fitted_betta, nsim)
simulate_betta_random(fitted_betta, nsim)

Arguments

`fitted_betta`	A fitted betta_random object
`nsim`	Number of times to simulate

Value

A list of length nsim, each element of which is a vector of simulated Y-values under the fitted betta model

(Data) Data frame of soil data from whitman_et_al.

Description

A phyloseq object with an OTU table and sample data from a soil microbiome study.

Usage

soil_phylo
soil_phylo

Format

A phyloseq-class experiment-level object with an OTU table and sample data.

otu_table

OTU table with 7,770 taxa and 119 samples

tax_table

taxonomy table

sam_data

sample data with the following covariates:

Plants, values 0 and 1. Index for different plants
Day, values 0 (initial sampling point), 1 (12 days after treatment additions), and 2 (82 days after treatment additions). Index for different days of measurement
Amdmt, values 0 (no additions), 1 (biochar additions), and 2 (fresh biomass additions). Index for different soil additives.
DayAmdmt, values 00, 01, 02, 10, 11, 12, 20, 21, and 22. A single index for the combination of Day and Amdmt with Day as the first digit and Amdmt as the second digit.
ID, values A, B, C, D, and F. Index for different soil plots.

...

References

Whitman, T., Pepe-Ranney, C., Enders, A., Koechli, C., Campbell, A., Buckley, D. H., Lehmann, J. (2016). Dynamics of microbial community composi-tion and soil organic carbon mineralization in soil following addition of pyrogenic andfresh organic matter. The ISME journal, 10(12):2918. <doi: 10.1038/ismej.2016.68>.

Conduct F test of null hypothesis LB = 0 using output from betta() or betta_random()

Description

This function performs an F-test of a null hypothesis LB = 0 where B is a vector of p fixed effects returned by betta() or betta_random() and L is an m x p matrix with linearly independent rows.

Usage

test_submodel(
  fitted_betta,
  submodel_formula,
  method = "bootstrap",
  nboot = 1000
)
test_submodel(
  fitted_betta,
  submodel_formula,
  method = "bootstrap",
  nboot = 1000
)

Arguments

`fitted_betta`	A fitted betta object – i.e., the output of either betta() or betta_random() – containing fixed effect estimates of interest.
`submodel_formula`	A formula defining which submodel to treat as the null. It is not necessary to include random effects in this formula (they will be ignored if included – the submodel will be fit with the same random effect structure as the full model regardless of input.)
`method`	A character variable indicating which method should be used to estimate the distribution of the test statistic under the null.
`nboot`	Number of bootstrap samples to use if method = "bootstrap". Ignored if method = "asymptotic".

Value

A list containing

`pval`	The p-value
`F_stat`	The calculated F statistic
`boot_F`	A vector of bootstrapped F statistics if bootstrap has been used. Otherwise NULL.

Author(s)

David Clausen

References

Willis, A., Bunge, J., and Whitman, T. (2015). Inference for changes in biodiversity. arXiv preprint.

Examples


# generate example data
df <- data.frame(chats = c(2000, 3000, 4000, 3000,
2000, 3000, 4000, 3000), ses = c(100, 200, 150, 180,
100, 200, 150, 180),
                 Cont_var = c(100, 150, 100, 50,
                 100, 150, 100, 50),
                 Cont_var_2 = c(50,200,25,125,
                 50,200,25,125))

# fit betta()
example_fit <- betta(formula = chats ~ Cont_var + Cont_var_2, ses = ses, data = df)


# construct L for hypothesis that B_cont_var = B_cont_var_2 = 0
L <- rbind(c(0,1,0),
           c(0,0,1))

F_test_results <- F_test(example_fit, L, nboot = 100)

# generate example data
df <- data.frame(chats = c(2000, 3000, 4000, 3000,
2000, 3000, 4000, 3000), ses = c(100, 200, 150, 180,
100, 200, 150, 180),
                 Cont_var = c(100, 150, 100, 50,
                 100, 150, 100, 50),
                 Cont_var_2 = c(50,200,25,125,
                 50,200,25,125))

# fit betta()
example_fit <- betta(formula = chats ~ Cont_var + Cont_var_2, ses = ses, data = df)


# construct L for hypothesis that B_cont_var = B_cont_var_2 = 0
L <- rbind(c(0,1,0),
           c(0,0,1))

F_test_results <- F_test(example_fit, L, nboot = 100)

(Data) Data frame of covariate information about toy_otu_table.

Description

(Data) Data frame of covariate information about toy_otu_table.

Usage

toy_metadata
toy_metadata

Format

A data frame with 143 rows and 4 variables:

Years: Year of sampling
bloom2: Did the sample correspond to a bloom event?
Period: What season was sampled?
Site: Where was the sample taken from?

...

(Data) A toy OTU table.

Description

Covariate info available in 'toy_metadata'. A data frame with 448 rows and 143 columns. Rows give the abundance of each taxon; columns give the samples

Usage

toy_otu_table
toy_otu_table

Format

An object of class data.frame with 448 rows and 143 columns.

(Data) The taxonomy of the OTUs in 'toy_otu_table'.

Description

(Data) The taxonomy of the OTUs in 'toy_otu_table'.

Usage

toy_taxonomy
toy_taxonomy

Format

An object of class factor of length 448.

Calculate the true Gini-Simpson index

Description

Calculate the true Gini-Simpson index

Usage

true_gini(input)
true_gini(input)

Arguments

input

A vector of proportions.

Value

The Gini-Simpson index of the population given by input.

Note

This function is intended for population-level data. If you are dealing with a microbial sample, use DivNet instead.

Calculate the true Hill numbers

Description

Calculate the true Hill numbers

Usage

true_hill(input, q)
true_hill(input, q)

Arguments

`input`	A vector of proportions.
`q`	The Hill number of interest. q = 0 corresponds to species richness, q = 2 corresponds to inverse Simpson, etc.

Value

The Hill number of the population given by input.

Calculate the true Inverse Simpson index

Description

Calculate the true Inverse Simpson index

Usage

true_inverse_simpson(input)
true_inverse_simpson(input)

Arguments

input

A vector of proportions.

Value

The inverse-Simpson index of the population given by input.

Note

This function is intended for population-level data. If you are dealing with a microbial sample, use DivNet instead.

Calculate the true Shannon index based on proportions

Description

Calculate the true Shannon index based on proportions

Usage

true_shannon(input)
true_shannon(input)

Arguments

input

A vector of proportions.

Value

The Shannon index of the population given by input.

Note

This function is intended for population-level data. If you are dealing with a microbial sample, use DivNet instead.

Calculate the true Shannon's equitability index

Description

Calculate the true Shannon's equitability index

Usage

true_shannon_e(input)
true_shannon_e(input)

Arguments

input

A vector of proportions.

Value

The Shannon E's of the population given by input.

Note

This function is intended for population-level data. If you are dealing with a microbial sample, use DivNet instead.

Calculate the true Simpson index

Description

Calculate the true Simpson index

Usage

true_simpson(input)
true_simpson(input)

Arguments

input

A vector of proportions.

Value

The Simpson index of the population given by input.

Note

This function is intended for population-level data. If you are dealing with a microbial sample, use DivNet instead.

The transformed weighted linear regression estimator for species richness estimation

Description

This function implements the transformed version of the species richness estimation procedure outlined in Rocchetti, Bunge and Bohning (2011).

Usage

wlrm_transformed(
  input_data,
  cutoff = NA,
  print = NULL,
  plot = NULL,
  answers = NULL
)
wlrm_transformed(
  input_data,
  cutoff = NA,
  print = NULL,
  plot = NULL,
  answers = NULL
)

Arguments

`input_data`	An input type that can be processed by `convert()` or a `phyloseq` object
`cutoff`	Maximum frequency count to use
`print`	Deprecated; only for backwards compatibility
`plot`	Deprecated; only for backwards compatibility
`answers`	Deprecated; only for backwards compatibility

Value

An object of class alpha_estimate, or alpha_estimates for phyloseq objects

Note

While robust to many different structures, model is almost always misspecified. The result is usually implausible diversity estimates with artificially low standard errors. Extreme caution is advised.

Author(s)

Amy Willis

References

Rocchetti, I., Bunge, J. and Bohning, D. (2011). Population size estimation based upon ratios of recapture probabilities. Annals of Applied Statistics, 5.

Examples


wlrm_transformed(apples)
wlrm_transformed(apples, plot = FALSE, print = FALSE, answers = TRUE)

wlrm_transformed(apples)
wlrm_transformed(apples, plot = FALSE, print = FALSE, answers = TRUE)

The untransformed weighted linear regression estimator for species richness estimation

Description

This function implements the untransformed version of the species richness estimation procedure outlined in Rocchetti, Bunge and Bohning (2011).

Usage

wlrm_untransformed(
  input_data,
  cutoff = NA,
  print = NULL,
  plot = NULL,
  answers = NULL
)
wlrm_untransformed(
  input_data,
  cutoff = NA,
  print = NULL,
  plot = NULL,
  answers = NULL
)

Arguments

`input_data`	An input type that can be processed by `convert()` or a `phyloseq` object
`cutoff`	Maximum frequency count to use
`print`	Deprecated; only for backwards compatibility
`plot`	Deprecated; only for backwards compatibility
`answers`	Deprecated; only for backwards compatibility

Value

An object of class alpha_estimate, or alpha_estimates for phyloseq objects

Note

This estimator is based on the negative binomial model and for that reason generally produces poor fits to microbial data. The result is usually artificially low standard errors. Caution is advised.

Author(s)

Amy Willis

References

Rocchetti, I., Bunge, J. and Bohning, D. (2011). Population size estimation based upon ratios of recapture probabilities. Annals of Applied Statistics, 5.

Examples


wlrm_untransformed(apples)

wlrm_untransformed(apples)

`estimate`	The estimate
`error`	The standard error in the estimate
`estimand`	What is the estimate trying to estimate? (richness, Shannon...)
`name`	The name of the method
`interval`	An interval estimate
`interval_type`	Type of interval estimate
`type`	TODO(Amy): Deprecate?
`model`	What model is fit
`warnings`	Any warnings?
`frequentist`	Logical. Frequentist or Bayesian?
`parametric`	Logical. Parametric or not?
`plot`	A ggplot associated with the estimate
`reasonable`	Is the estimate likely to be reasonable?
`other`	Any other relevant objects
`...`	Any other objects

Package 'breakaway'

Help Index

alpha_estimate

Description

Usage

Arguments

Value

alpha_estimates

Description

Usage

Arguments

Value

(Data) Frequency count table of soil microbes in an apples orchard.

Description

Usage

Format

References

Modelling total diversity with betta

Description

Usage

Arguments

Value

Note

Author(s)

References

See Also

Examples

Confidence intervals and testing for linear combinations of fixed effects estimated via betta() or betta_random()

Description

Usage

Arguments

Value

Author(s)

References

See Also

Examples

function for plotting total diversity

Description

Usage

Arguments

Value

Author(s)

References

See Also

Examples

modelling total diversity with random effects

Description

Usage

Arguments

Value

Author(s)

References

See Also

Examples

Species richness estimation with breakaway

Description

Usage

Arguments

Value

Note

Author(s)

References

See Also

Examples

species richness estimation without singletons

Description

Usage

Arguments

Value

Note

Author(s)

References

See Also

Examples

Build frequency count tables from an OTU table

Description

Usage

Arguments

Value

Chao-Bunge species richness estimator