Syllabus: Applied Statistics for High-Throughput Biology

Instructor

Levi Waldron, PhD
Associate Professor of Biostatistics
City University of New York Graduate School of Public Health and Health Policy
New York, NY, U.S.A.

Email: lwaldron.research@gmail.com
Hangouts: lwaldron.research
Skype: levi.waldron

Preparation

Please come to the first class with the following installed:

Bioconductor: www.bioconductor.org/install
RStudio: https://www.rstudio.com/products/rstudio/download3/

Please create an account at www.github.com, and use it to introduce yourself at https://github.com/waldronlab/AppStatBio/issues.

Summary

This course will provide biologists and bioinformaticians with practical statistical and data analysis skills to perform rigorous analysis of high-throughput biological data. The course assumes some familiarity with genomics and with R programming, but does not assume prior statistical training. It covers the statistical concepts necessary to design experiments and analyze high-dimensional data generated by genomic technologies, including exploratory data analysis, linear modeling, analysis of categorical variables, principal components analysis, and batch effects.

Textbook

Biomedical Data Science by Irizarry and Love (ePub version)
Source repository

Related Resources

http://waldronlab.io/teaching/resources/

Labs

Each day will include a hands-on lab session that students should attempt in full.

Session detail by day

All course materials will be available from https://github.com/waldronlab/AppStatBio/.

introduction
- random variables
- distributions
- hypothesis testing for one or two samples (t-test, Wilcoxon test, etc.)
- hypothesis testing for categorical variables (Fisher's exact test, chi-square test)
- data manipulation using dplyr
linear modeling
- linear and generalized linear modeling
- model matrix and model formulae
- multiple testing
unsupervised analysis
- graphics for exploratory data analysis
- distance in high dimensions
- principal components analysis and multidimensional scaling
- unsupervised clustering
- batch effects
multi'omic data analysis lab session
- core data classes in Bioconductor: GRanges, SummarizedExperiment, RaggedExperiment, MultiAssayExperiment
- creating a MultiAssayExperiment
- subsetting, reshaping, growing, and extraction of a MultiAssayExperiment
- plotting, correlation, and other statistical analyses
- multi'omics lab code and html
