For most downstream analyses, genes were deemed statistically significant if the multiple-testing-adjusted probability that they were falsely deemed altered by TCDD (i.e. α, the false-positive probability) was below that of our positive control gene, Cyp1a1, in any of the rat strains. Thus our effective p-value threshold was the maximum adjusted p-value observed for the well-characterized dioxin-responsive gene Cyp1a1. All statistical analyses were performed using the limma package (v3.6.9) for the R statistical environment (v2.12.2). Unsupervised agglomerative hierarchical clustering with complete linkage was employed to visualize patterns in mRNA expression
across rat strains, using Pearson’s correlation as the similarity metric. The lattice and latticeExtra packages (v0.19–24 and v0.6–15, respectively) were used for visualization in the R statistical environment (v2.12.2). Venn diagrams were created using the VennDiagram R package (v1.0.0) (Chen and Boutros, 2011). We applied the hypergeometric test to assess the statistical significance of gene overlaps. Pathway analysis was conducted using GOMiner software (Zeeberg et al., 2003). We used build 269 of the GOMiner application, with database build 2009-09. We checked our genes of interest against a randomly drawn sample from the dataset with a false discovery rate (FDR) threshold of 0.1, 1000 randomizations, all rat
databases and look-up options, all GO evidence codes and ontologies (molecular function, cellular component and biological process) and a minimum of five genes for a GO term. Separate ontological analyses were run for genes differentially expressed in each rat strain. Subsequently, RedundancyMiner (Zeeberg et al., 2011) was used to de-replicate enriched GO categories and to refine the pathway analysis. A CIM file generated by GOMiner was loaded into the R statistical environment (v2.13.1). Input files for RedundancyMiner were created by concatenating categories when
FDR ≤ 0.20 in at least 4 strains. This relaxed threshold was chosen to allow for biological variability between strains; requiring at least 4 strains allowed the genetic model to form the primary filter, while retaining flexibility for biological variability and false negatives. Two parameters were considered when collapsing the matrix: compression and biological interpretation. Generally, more permissive p-values offer greater compression but can concatenate many of the same GO categories into different groups, thereby producing another type of redundancy. For each dataset, p-values were chosen empirically to ensure sufficient compression for GO categories with biological functions to be interpreted correctly. Based on these selection criteria, 32 GO categories were chosen. The input matrix was collapsed to obtain 20 final categories, giving a compression ratio of 1.60. Visualization of RedundancyMiner results was done using the lattice package (v0.19-31) for R (v2.13.1).
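The category filter and compression ratio described above can be illustrated with a minimal Python sketch. It is not the code used in this study, which was run in R with RedundancyMiner. The function names, category labels, strain labels and FDR values are hypothetical and chosen only for illustration. The sketch also assumes that the compression ratio is the number of input categories divided by the number of collapsed categories, which is consistent with the reported values (32 / 20 = 1.60).

```python
# Illustrative sketch only: all names and FDR values below are hypothetical.
FDR_THRESHOLD = 0.20  # relaxed FDR cut-off described in the text
MIN_STRAINS = 4       # category must pass the cut-off in at least this many strains


def select_categories(fdr_by_category, fdr_threshold=FDR_THRESHOLD,
                      min_strains=MIN_STRAINS):
    """Return categories whose FDR is <= fdr_threshold in >= min_strains strains."""
    selected = []
    for category, fdr_by_strain in fdr_by_category.items():
        n_passing = sum(1 for fdr in fdr_by_strain.values() if fdr <= fdr_threshold)
        if n_passing >= min_strains:
            selected.append(category)
    return selected


def compression_ratio(n_input, n_collapsed):
    """Assumed definition: input categories divided by collapsed categories."""
    return n_input / n_collapsed


# Hypothetical per-strain FDR values for two made-up GO categories.
example = {
    "category_A": {"strain_1": 0.05, "strain_2": 0.10, "strain_3": 0.20,
                   "strain_4": 0.15, "strain_5": 0.60},
    "category_B": {"strain_1": 0.05, "strain_2": 0.30, "strain_3": 0.25,
                   "strain_4": 0.15, "strain_5": 0.60},
}

print(select_categories(example))   # category_A passes in 4 strains, category_B in 2
print(compression_ratio(32, 20))    # counts reported in the text
```

In this toy input only category_A is retained, because it meets the cut-off in four strains whereas category_B meets it in two.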