
This wrapper runs differential expression (DE) analysis with DESeq2, edgeR, limma, or DESeq2 and edgeR together (method = "both"), and returns a fully initialized VISTA object. The object stores expression matrices and annotations in the SummarizedExperiment core, while all DE outputs and configuration live in metadata(vista):

  • $de_results: named SimpleList of per-contrast DE tables

  • $de_summary: named SimpleList of summary tables

  • $de_cutoffs: list of thresholds/method options

  • $group: list with column, palette, colors

Usage

create_vista(
  counts,
  sample_info,
  column_geneid,
  group_column,
  group_numerator,
  group_denominator,
  method = c("deseq2", "edger", "limma", "both"),
  min_counts = 10,
  min_replicates = 1,
  log2fc_cutoff = 1,
  pval_cutoff = 0.05,
  p_value_type = "padj",
  covariates = NULL,
  design_formula = NULL,
  consensus_mode = c("intersection", "union"),
  consensus_log2fc = c("mean", "deseq2", "edger"),
  result_source = NULL,
  group_palette = "Dark 2",
  comparison_palette = "Dark 3",
  validate = TRUE
)

Arguments

counts

Raw counts (matrix/data.frame) with a gene-id column and sample columns.

sample_info

Data frame with sample metadata. Must contain sample_names (or have rownames equal to sample columns in counts) and the group_column.
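
The matching requirement between counts and sample_info can be checked before calling create_vista(). The following is a minimal base R sketch on toy data; the objects toy_counts and toy_info and the exact check are illustrative assumptions, not the package's internal validation.

```r
# Toy inputs (hypothetical): a gene-id column plus three sample columns
toy_counts <- data.frame(
  gene_id = c("g1", "g2"),
  S1 = c(5, 0), S2 = c(7, 3), S3 = c(2, 9),
  stringsAsFactors = FALSE
)
toy_info <- data.frame(
  sample_names = c("S1", "S2", "S3"),
  cond = c("a", "a", "b"),
  stringsAsFactors = FALSE
)

# Sample columns are everything except the gene-id column
sample_cols <- setdiff(colnames(toy_counts), "gene_id")

# Use sample_names when present, otherwise fall back to rownames
ids <- if ("sample_names" %in% colnames(toy_info)) {
  toy_info$sample_names
} else {
  rownames(toy_info)
}

# Samples present in counts but missing from the metadata
unmatched <- setdiff(sample_cols, ids)
length(unmatched) == 0
```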

column_geneid

Column name in counts that contains gene identifiers.

group_column

Column in sample_info used to group samples.

group_numerator

Character vector of numerator groups for pairwise comparisons.

group_denominator

Character vector of denominator groups. Must have the same length and order as group_numerator; the two vectors are paired element by element.

method

DE method: "deseq2", "edger", "limma", or "both" (runs DESeq2 and edgeR and derives consensus calls; see consensus_mode).

min_counts

Minimum total counts per gene to retain (default: 10).

min_replicates

Minimum samples per group meeting min_counts (default: 1).

log2fc_cutoff

Absolute LFC threshold for DEG calling (default: 1).

pval_cutoff

p-value (raw or adjusted) threshold (default: 0.05).

p_value_type

Which p-value column to use ("padj" or "pvalue"). Default: "padj".
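
As a rough illustration of how log2fc_cutoff, pval_cutoff, and p_value_type could combine into a DEG call, here is a base R sketch on a toy table. The column names (log2fc, pvalue, padj), the labels, and the use of a strict < on the p-value are assumptions made for illustration; the package's actual rule may differ.

```r
# Toy DE table; column names are assumed, not the package's guaranteed schema
de <- data.frame(
  log2fc = c(1.8, -2.3, 0.4, 1.2),
  pvalue = c(0.001, 0.0004, 0.03, 0.04),
  padj   = c(0.01, 0.004, 0.20, 0.09),
  row.names = c("geneA", "geneB", "geneC", "geneD")
)

log2fc_cutoff <- 1
pval_cutoff   <- 0.05
p_value_type  <- "padj"   # switch to "pvalue" to threshold on raw p-values

# Pick the requested p-value column, then apply both thresholds
p <- de[[p_value_type]]
significant <- !is.na(p) & p < pval_cutoff & abs(de$log2fc) >= log2fc_cutoff

# Label direction for significant genes, "Other" for the rest
de$regulation <- ifelse(!significant, "Other",
                        ifelse(de$log2fc > 0, "Up", "Down"))
de$regulation
#> [1] "Up"    "Down"  "Other" "Other"
```

With p_value_type set to "pvalue", geneD (raw p = 0.04, log2fc = 1.2) would also pass in this sketch, which is why the choice of column matters.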

covariates

Optional character vector of additional sample_info columns to adjust for. These are included as additive terms in the DE design.

design_formula

Optional model formula (or formula string) overriding automatic construction from group_column + covariates. Must include group_column.
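
To make the relationship between group_column, covariates, and design_formula concrete, the sketch below builds an additive formula in base R and checks that a user-supplied formula mentions the grouping column. The covariate names batch and sex are hypothetical, and the term order (covariates first, group last) is an assumption rather than documented behavior.

```r
group_column <- "cond_long"
covariates   <- c("batch", "sex")   # hypothetical sample_info columns

# Additive design built from covariates plus the grouping column
design <- stats::reformulate(c(covariates, group_column))
all.vars(design)
#> [1] "batch"     "sex"       "cond_long"

# A user-supplied formula string must still include the grouping column
user_formula <- stats::as.formula("~ batch + cond_long")
has_group <- group_column %in% all.vars(user_formula)
has_group
#> [1] TRUE
```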

consensus_mode

When method = "both", how to define consensus calls: "intersection" (both methods significant in same direction) or "union" (either method significant; discordant directions excluded).

consensus_log2fc

When method = "both", how to populate consensus log2fc: "mean", "deseq2", or "edger".
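
The two consensus modes and the "mean" log2fc option can be sketched in base R as follows. This is only an illustration of the rules described above on toy per-method calls; the helper consensus_call() and all column names are hypothetical and do not reflect the package's internals.

```r
# Toy per-method calls for five genes (illustrative values only)
calls <- data.frame(
  deseq2_call = c("Up", "Up",    "Down", "Other", "Up"),
  edger_call  = c("Up", "Other", "Down", "Other", "Down"),
  deseq2_lfc  = c(2.0, 1.4, -1.8, 0.2, 1.1),
  edger_lfc   = c(1.6, 0.9, -2.2, 0.1, -1.3),
  row.names   = paste0("gene", 1:5),
  stringsAsFactors = FALSE
)

# Hypothetical helper combining two vectors of calls
consensus_call <- function(a, b, mode = c("intersection", "union")) {
  mode <- match.arg(mode)
  agree      <- a == b & a != "Other"
  discordant <- a != "Other" & b != "Other" & a != b
  if (mode == "intersection") {
    # keep a call only when both methods agree on direction
    ifelse(agree, a, "Other")
  } else {
    # keep a call from either method, dropping conflicting directions
    ifelse(discordant, "Other", ifelse(a != "Other", a, b))
  }
}

calls$intersection <- consensus_call(calls$deseq2_call, calls$edger_call, "intersection")
calls$union        <- consensus_call(calls$deseq2_call, calls$edger_call, "union")

# consensus_log2fc = "mean": average the two method-specific estimates
calls$mean_lfc <- rowMeans(calls[, c("deseq2_lfc", "edger_lfc")])

calls$intersection
#> [1] "Up"    "Other" "Down"  "Other" "Other"
calls$union
#> [1] "Up"    "Up"    "Down"  "Other" "Other"
```

In this toy table, gene2 is called only by one method, so it survives under "union" but not "intersection", while gene5 is excluded in both modes because the directions disagree.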

result_source

Active DE source used in metadata(v)$de_results. For method = "both", one of "consensus", "deseq2", "edger". For single-method runs, this must match method.

group_palette

Qualitative palette name for colorspace::qualitative_hcl(). One of c("Pastel 1","Dark 2","Dark 3","Set 2","Set 3","Warm","Cold","Harmonic","Dynamic"). Default: "Dark 2".
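
For orientation, this sketch shows how a named qualitative palette yields one color per group. The group labels are hypothetical, and the fallback to grDevices::hcl.colors() (which provides the same named HCL palettes in base R) is only there so the snippet runs without colorspace installed; how the package stores the result is not shown here.

```r
groups <- c("control", "treatment1", "treatment2")  # hypothetical group levels

# One color per group from the "Dark 2" qualitative palette
group_colors <- if (requireNamespace("colorspace", quietly = TRUE)) {
  colorspace::qualitative_hcl(length(groups), palette = "Dark 2")
} else {
  grDevices::hcl.colors(length(groups), palette = "Dark 2")
}
names(group_colors) <- groups
group_colors
```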

comparison_palette

Qualitative palette name used to assign colors per comparison (stored in metadata(v)$comparison$colors). Defaults to "Dark 3".

validate

Logical; if TRUE (default), run full validate_vista() checks before returning the object.

Value

A VISTA object:

  • assays(v): norm_counts (matrix)

  • colData(v): sample_info (DataFrame)

  • rowData(v): row_data (DataFrame)

  • metadata(v): de_results, de_summary, de_cutoffs, group, comparison, provenance

Details

Contrast names follow "numerator_VS_denominator". Each DE table must have rownames identical to the final norm_counts rownames. When method = "both", method-specific and consensus DE tables are stored in metadata(v)$de_results_by_method and metadata(v)$de_summary_by_method, and the active source is tracked in metadata(v)$de_active_source.
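
The contrast naming convention can be reproduced in base R, which is handy for predicting the names of de_results entries before running the analysis. The group labels below are taken from the multiple-comparisons example; the snippet itself is an illustration, not the package's code.

```r
group_numerator   <- c("N052611", "N080611")
group_denominator <- c("N61311", "N61311")

# The two vectors are paired by position, so their lengths must match
stopifnot(length(group_numerator) == length(group_denominator))

contrast_names <- paste0(group_numerator, "_VS_", group_denominator)
contrast_names
#> [1] "N052611_VS_N61311" "N080611_VS_N61311"
```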

Examples

# Load example data
data("count_data", package = "VISTA")
data("sample_metadata", package = "VISTA")

# Create VISTA object with DESeq2 (default method)
vista <- create_vista(
  counts = count_data[1:100, ],
  sample_info = sample_metadata[1:6, ],
  column_geneid = "gene_id",
  group_column = "cond_long",
  group_numerator = "treatment1",
  group_denominator = "control",
  log2fc_cutoff = 0.6,
  pval_cutoff = 0.05
)
#> estimating size factors
#> estimating dispersions
#> gene-wise dispersion estimates
#> mean-dispersion relationship
#> final dispersion estimates
#> fitting model and testing

# Examine the VISTA object
vista
#> class: SummarizedExperiment 
#> dim: 85 6 
#> metadata(12): de_results de_summary ... design comparison
#> assays(1): norm_counts
#> rownames(85): ENSG00000000003 ENSG00000000419 ... ENSG00000005469
#>   ENSG00000005471
#> rowData names(1): baseMean
#> colnames(6): SRR1039508 SRR1039509 ... SRR1039516 SRR1039517
#> colData names(14): SampleName cell ... sizeFactor sample_names

# Access comparisons
names(comparisons(vista))
#> [1] "treatment1_VS_control"

# View DEG summary
deg_summary(vista)
#> $treatment1_VS_control
#>   regulation  n
#> 1       Down  1
#> 2      Other 82
#> 3         Up  2
#> 

# View cutoffs used
cutoffs(vista)
#> $log2fc
#> [1] 0.6
#> 
#> $pval
#> [1] 0.05
#> 
#> $p_value_type
#> [1] "padj"
#> 
#> $method
#> [1] "deseq2"
#> 
#> $min_counts
#> [1] 10
#> 
#> $min_replicates
#> [1] 1
#> 
#> $covariates
#> character(0)
#> 
#> $design_formula
#> NULL
#> 
#> $consensus_mode
#> NULL
#> 
#> $consensus_log2fc
#> NULL
#> 
#> $active_source
#> [1] "deseq2"
#> 

# Multiple comparisons example
if (FALSE) { # not run
vista_multi <- create_vista(
  counts = count_data,
  sample_info = sample_metadata,
  column_geneid = "gene_id",
  group_column = "cell",
  group_numerator = c("N052611", "N080611"),
  group_denominator = c("N61311", "N61311"),
  method = "edger",
  log2fc_cutoff = 1.0,
  pval_cutoff = 0.01
)
}