# Biostatistics

### Unit of study descriptions

**BSTA5001 Mathematics Background for Biostatistics**

Credit points: 6 Teacher/Coordinator: Professor Judy Simpson Session: Semester 1, Semester 2 Classes: 8-12 hours total study time per week, distance learning Assessment: 3x assignments (20%, 40% and 40%) Mode of delivery: Distance education

The aim of this unit is to provide students with the mathematics required for studying biostatistics at a more rigorous level. On completion of this unit students should be able to follow the mathematical demonstrations and proofs used in biostatistics at Masters degree level, and to understand the mathematics behind statistical methods introduced at that level. The intention is to allow students to concentrate on statistical concepts in subsequent units, and not be distracted by the mathematics employed. Content: basic algebra and analysis; exponential functions; calculus; series, limits, approximations and expansions; linear algebra, matrices and determinants; numerical methods.

Textbooks

Compulsory: 1) Anton H, Bivens I, Davis S. Calculus: early transcendentals combined, 10th edition. Wiley, 2012. ISBN 978-0-470-64769-1. 2) Anton, Howard. Elementary Linear Algebra. 10th edition, Wiley 2010. Recommended reference book (not compulsory): Healy, MJR. Matrices for Statistics, 2nd edition. Oxford University Press, 2000, ISBN 978-0-470-45821-1. Notes supplied.

**BSTA5002 Principles of Statistical Inference**

Credit points: 6 Teacher/Coordinator: Ms Liz Barnes, University of Sydney (semester 1); A/Prof Patrick Kelly, University of Sydney (semester 2) Session: Semester 1, Semester 2 Classes: 8-12 hours total study time per week, distance learning Prerequisites: BSTA5023 Assessment: 2 written assignments (40% each) and module exercises (20%) Mode of delivery: Distance education

The aim of this unit is to provide a strong mathematical and conceptual foundation in the methods of statistical inference, with an emphasis on practical aspects of the interpretation and communication of statistically based conclusions in health research. Content covered includes: review of the key concepts of estimation and construction of Normal-theory confidence intervals; frequentist theory of estimation including hypothesis tests; methods of inference based on likelihood theory, including use of Fisher and observed information and likelihood ratio; Wald and score tests; an introduction to the Bayesian approach to inference; an introduction to distribution-free statistical methods.

Textbooks

Marschner IC. Inference Principles for Biostatisticians. Chapman and Hall / CRC Pr, 2014. ISBN 978-1-48222-223-4. Notes supplied.

**BSTA5003 Health Indicators and Health Surveys**

Credit points: 6 Teacher/Coordinator: Associate Professor Armando Teixeira-Pinto, University of Sydney Session: Semester 1 Classes: 8-12 hours total study time per week, distance learning Corequisites: BSTA5001 Assessment: 4 written assignments (25%, 25%, 25%, 25%) Mode of delivery: Distance education

On completion of this unit students should be able to derive and compare population measures of mortality, illness, fertility and survival, be aware of the main sources of routinely collected health data and their advantages and disadvantages, and be able to collect primary data by a well-designed survey and analyse and interpret it appropriately. Content covered in this unit includes: routinely collected health-related data; quantitative methods in demography, including standardisation and life tables; health differentials; design and analysis of population health surveys including the roles of stratification, clustering and weighting.

Textbooks

Paul S. Levy, Stanley Lemeshow, Sampling of Populations: Methods and Applications, 4th edition, Wiley Interscience 2008.

**BSTA5004 Data Management and Statistical Computing**

Credit points: 6 Teacher/Coordinator: Professor Judy Simpson Session: Semester 1, Semester 2 Classes: 8-12 hours total study time per week, distance learning Assessment: 3x written assignments (30%, 35%, 35%) Mode of delivery: Distance education

The aim of this unit is to provide students with the knowledge and skills required to undertake moderate to high level data manipulation and management in preparation for statistical analysis of data typically arising in health and medical research. Students will: gain experience in data manipulation and management using two major statistical software packages (Stata and SAS); learn how to display and summarise data using statistical software; become familiar with the checking and cleaning of data; learn how to link files through use of unique and non-unique identifiers; acquire fundamental programming skills for efficient use of software packages; and learn key principles of confidentiality and privacy in data storage, management and analysis. The topics covered are: Module 1 - Stata and SAS: The basics (importing and exporting data, recoding data, formatting data, labelling variable names and data values; using dates, data display and summary presentation); Module 2 - Stata and SAS: graphs, data management and statistical quality assurance methods (including advanced graphics to produce publication-quality graphs); Module 3 - Data management using Stata and SAS (using functions to generate new variables, appending, merging, transposing longitudinal data; programming skills for efficient and reproducible use of these packages, including loops, arguments and programs/macros).

Textbooks

Recommended if you have not used SAS or Stata before: Lora D. Delwiche and Susan J. Slaughter. SAS: The Little SAS Book, 5th edition. SAS Institute Inc., 2012.

**BSTA5005 Clinical Biostatistics**

Credit points: 6 Teacher/Coordinator: Professor Judy Simpson Session: Semester 1 Classes: 8-12 hours total study time per week, distance learning Prerequisites: BSTA5002 and BSTA5006 Corequisites: BSTA5007 Assessment: 3 written assignments (40%, 30%, 30%) Mode of delivery: Distance education

The aim of this unit is to enable students to use correctly statistical methods of particular relevance to evidence-based health care and to advise clinicians on the application of these methods and interpretation of the results. Content: Clinical trials (equivalence trials, cross-over trials); Clinical agreement (Bland-Altman methods, kappa statistics, intraclass correlation); Statistical process control (special and common causes of variation; quality control charts); Diagnostic tests (sensitivity, specificity, ROC curves); Meta-analysis (systematic reviews, assessing heterogeneity, publication bias, estimating effects from randomised controlled trials, diagnostic tests and observational studies).

Textbooks

Notes supplied

**BSTA5006 Design of Randomised Controlled Trials**

Credit points: 6 Teacher/Coordinator: Professor Judy Simpson Session: Semester 2 Classes: 8-12 hours total study time per week, distance learning Prerequisites: BSTA5001 and (BSTA5011 or PUBH5010) Assessment: 3 written assignments (30%, 30%, 40%) Mode of delivery: Distance education

The aim of this unit is to enable students to understand and apply the principles of design and analysis of experiments, with a particular focus on randomised controlled trials (RCTs), to a level where they are able to contribute effectively as a statistician to the planning, conduct and reporting of a standard RCT. This unit covers: ethical considerations; principles and methods of randomisation in controlled trials; treatment allocation, blocking, stratification and allocation concealment; parallel, factorial and crossover designs including n-of-1 studies; practical issues in sample size determination; intention-to-treat principle; phase I dose-finding studies; phase II safety and efficacy studies; interim analyses and early stopping; multiple outcomes/endpoints, including surrogate outcomes, multiple tests and subgroup analyses, including adjustment of significance levels and P-values; missing data; reporting trial results and use of the CONSORT statement.

Textbooks

Matthews JNS. Introduction to Randomised Controlled Clinical Trials, 2nd edition. Chapman and Hall/CRC Press 2006. ISBN P/back: 978154886242, eBook: 9781420011302

**BSTA5007 Linear Models**

Credit points: 6 Teacher/Coordinator: Professor Judy Simpson (semester 1), Dr Timothy Schlub (semester 2) Session: Semester 1, Semester 2 Classes: 8-12 hours total study time per week, distance learning Prerequisites: BSTA5023 and (BSTA5011 or PUBH5010) Corequisites: BSTA5002 Assessment: 2x written assignments (30% each), 4 shorter assignments including brief online quizzes (40%) Mode of delivery: Distance education

The aim of this unit is to enable students to apply methods based on linear models to biostatistical data analysis, with proper attention to underlying assumptions and a major emphasis on the practical interpretation and communication of results. This unit will cover: the method of least squares; regression models and related statistical inference; flexible nonparametric regression; analysis of covariance to adjust for confounding; multiple regression with matrix algebra; model construction and interpretation (use of dummy variables, parametrisation, interaction and transformations); model checking and diagnostics; regression to the mean; handling of baseline values; the analysis of variance; variance components and random effects.

NOTE: LMR is an important foundation unit. Students who do not develop a strong grasp of this material will struggle to become successful biostatisticians.

Textbooks

Notes supplied.

**BSTA5008 Categorical Data and GLMs**

Credit points: 6 Teacher/Coordinator: Professor Judy Simpson Session: Semester 2 Classes: 8-12 hours total study time per week, distance learning Corequisites: BSTA5007 Assessment: 3x written assignments (35%, 35%, 30%) Mode of delivery: Distance education

The aim of this unit is to enable students to use generalised linear models (GLMs) and other methods to analyse categorical data, with proper attention to underlying assumptions. There is an emphasis on the practical interpretation and communication of results to colleagues and clients who might not be statisticians. This unit covers: Introduction to and revision of conventional methods for contingency tables especially in epidemiology; odds ratios and relative risks, chi-squared tests for independence, Mantel-Haenszel methods for stratified tables, and methods for paired data. The exponential family of distributions; generalised linear models (GLMs), and parameter estimation for GLMs. Inference for GLMs - including the use of score, Wald and deviance statistics for confidence intervals and hypothesis tests, and residuals. Binary variables and logistic regression models - including methods for assessing model adequacy. Nominal and ordinal logistic regression for categorical response variables with more than two categories. Count data, Poisson regression and log-linear models.

Textbooks

Notes supplied

**BSTA5009 Survival Analysis**

Credit points: 6 Teacher/Coordinator: Professor Judy Simpson Session: Semester 1 Classes: 8-12 hours total study time per week, distance learning Prerequisites: BSTA5007 Assessment: 3x assignments (30%, 30%, 40%) Mode of delivery: Distance education

The aim of this unit is to enable students to analyse data from studies in which individuals are followed up until a particular event occurs, e.g. death, cure, relapse, making use of follow-up data also for those who do not experience the event, with proper attention to underlying assumptions and a major emphasis on the practical interpretation and communication of results. The content covered in this unit includes: Kaplan-Meier life tables; logrank test to compare two or more groups; Cox's proportional hazards regression model; checking the proportional hazards assumption; time-dependent covariates; multiple or recurrent events; sample size calculations for survival studies.

Textbooks

Compulsory: Hosmer DW, Lemeshow S, May S. Applied Survival Analysis: Regression Modeling of Time to Event Data, 2nd edition. Wiley Interscience 2008. ISBN 978-0-471-75499-2; Recommended: Cleves M, Gould W, Gutierrez R, Marchenko Y. An Introduction to Survival Analysis Using Stata, 3rd edition. Stata Press 2010. ISBN 978-1-59718-074-0. Order online at www.survey-design.com.au or www.stata.com/bookstore/bios.html. Notes supplied.

**BSTA5011 Epidemiology for Biostatisticians**

Credit points: 6 Teacher/Coordinator: Professor Judy Simpson Session: Semester 2 Classes: 8-12 hours total study time per week, distance learning Prohibitions: PUBH5010 or CEPI5100 Assessment: 3x written assignments (25%, 50%, 25%) Mode of delivery: Distance education

Note: Department permission required for enrolment

On completion of this unit students should be familiar with the major concepts and tools of epidemiology, the study of health in populations, and should be able to judge the quality of evidence in health-related research literature.

This unit covers: historical developments in epidemiology; sources of data on mortality and morbidity; disease rates and standardisation; prevalence and incidence; life expectancy; linking exposure and disease (e.g. relative risk, attributable risk); main types of study designs - case series, ecological studies, cross-sectional surveys, case-control studies, cohort or follow-up studies, randomised controlled trials; sources of error (chance, bias, confounding); association and causality; evaluating published papers; epidemics and epidemic investigation; surveillance; prevention; screening; the role of epidemiology in health services research and policy.

Textbooks

Bain C, Webb P. Essential Epidemiology: An Introduction for Students and Health Professionals, 2nd edition. Cambridge University Press, 2011.

**BSTA5012 Longitudinal and Correlated Data**

Credit points: 6 Teacher/Coordinator: Professor Judy Simpson Session: Semester 1 Classes: 8-12 hours total study time per week, distance learning Prerequisites: BSTA5008 Assessment: 2x major written assignments (30% each), 5x shorter written assignments (8% each) Mode of delivery: Distance education

This unit aims to enable students to apply appropriate methods to the analysis of data arising from longitudinal (repeated measures) epidemiological or clinical studies, and from studies with other forms of clustering (cluster sample surveys, cluster randomised trials, family studies) that will produce non-exchangeable outcomes. Content covered in this unit includes: paired data; the effect of non-independence on comparisons within and between clusters of observations; methods for continuous outcomes: normal mixed effects (hierarchical or multilevel) models and generalised estimating equations (GEE); role and limitations of repeated measures ANOVA; methods for discrete data: GEE and generalised linear mixed models (GLMM); methods for count data.

Textbooks

Recommended: Fitzmaurice G, Laird N, Ware J. Applied Longitudinal Analysis. John Wiley and Sons, 2011. ISBN 978-0-471-21487-8.

**BSTA5013 Bioinformatics**

*This unit of study is not available in 2018*

Credit points: 6 Teacher/Coordinator: Dr Nicola Armstrong, Murdoch University Session: Semester 2 Classes: 8-12 hours total study time per week Prerequisites: BSTA5007 Assessment: 3 written assignments (20% each); final at-home examination (40%) Mode of delivery: Distance education

Note: This unit of study is only offered in odd numbered years. It is available in 2017.

The aim of this unit is to provide students with an introduction to the field of Bioinformatics. Bioinformatics is a multidisciplinary field that combines biology with quantitative methods to help understand biological processes, such as disease progression. Content: basic notions in biology; basic principles of statistical genetics; web-based tools, data sources and retrieval; analysis of single and multiple DNA or protein sequences; hidden Markov models and their applications; evolutionary models; phylogenetic trees; transcriptomics (gene expression microarrays and RNA-seq); use of R in bioinformatics applications.

Textbooks

Durbin R, Eddy S, Krogh A, Mitchison G. Biological Sequence Analysis: Probabilistic models of proteins and nucleic acids. Cambridge University Press, 1998. ISBN 978-0-521-62971-3. Notes supplied.

**BSTA5014 Bayesian Statistical Methods**

Credit points: 6 Teacher/Coordinator: Prof Judy Simpson Session: Semester 2 Classes: 8-12 hours total study time per week, distance learning Prerequisites: BSTA5008 and (PUBH5010 or BSTA5011 or CEPI5100) Assessment: Assignments 60% (2x 30%) and submitted exercises (40%) Mode of delivery: Distance education

Note: This unit of study is only offered in even numbered years. It is available in 2018.

The aim of this unit is to achieve an understanding of the logic of Bayesian statistical inference, i.e. the use of probability models to quantify uncertainty in statistical conclusions, and acquire skills to perform practical Bayesian analysis relating to health research problems. This unit covers: simple one-parameter models with conjugate prior distributions; standard models containing two or more parameters, including specifics for the normal location-scale model; the role of non-informative prior distributions; the relationship between Bayesian methods and standard "classical" approaches to statistics, especially those based on likelihood methods; computational techniques for use in Bayesian analysis, especially the use of simulation from posterior distributions, with emphasis on the WinBUGS package as a practical tool; application of Bayesian methods for fitting hierarchical models to complex data structures.

Textbooks

Gelman A, Carlin JB, Stern HS, Rubin DB, Dunson DB, Vehtari A. Bayesian Data Analysis, 3rd edition. Chapman and Hall, 2003. ISBN 978-1-58488-388-3; Notes provided.

**BSTA5020 Biostatistics Research Project Part A**

Credit points: 6 Teacher/Coordinator: A/Prof Patrick Kelly, University of Sydney Session: Semester 1, Semester 2 Classes: Supervision by an experienced biostatistician Prerequisites: 24 credit points including BSTA5004 and BSTA5007 Prohibitions: BSTA5022 Assessment: There is no assessment for Part A. For Part B, the portfolio will be examined by two examiners, at least one of whom will be internal to the University of Sydney. (100%) Mode of delivery: Normal (lecture/lab/tutorial) day

Note: Department permission required for enrolment

This unit is for master's students who intend to do two workplace projects and will therefore enrol in BSTA5021 as well. The aim of the unit is to give master's students practical experience, usually in workplace settings, in the application of knowledge and skills learnt during the coursework of the master's program. Students will provide evidence of having met this goal by presenting a portfolio made up of a preface and two project reports. The projects should not all be of the same type and must involve the use of different statistical methods and concepts. At least one project should involve complex multivariable analysis of data. Students should enrol in both Workplace Project Portfolio A and Workplace Project Portfolio Part B, either in semesters 1 and 2 respectively, or both in the same semester.

Textbooks

There are no essential readings for this unit.

**BSTA5021 Biostatistics Research Project Part B**

Credit points: 6 Teacher/Coordinator: A/Prof Patrick Kelly, University of Sydney Session: Semester 1, Semester 2 Classes: Supervision by an experienced biostatistician Prerequisites: 24 credit points including BSTA5004 and BSTA5007 Corequisites: BSTA5020 Prohibitions: BSTA5022 Assessment: There is no assessment for Part A. For Part B, the portfolio will be examined by two examiners, at least one of whom will be internal to the University of Sydney. (100%) Mode of delivery: Normal (lecture/lab/tutorial) day

Note: Department permission required for enrolment

This unit is for master's students who wish to do two workplace projects and are also doing BSTA5020. The aim of the unit is to give master's students practical experience, usually in workplace settings, in the application of knowledge and skills learnt during the coursework of the master's program. Students will provide evidence of having met this goal by presenting a portfolio made up of a preface and two project reports. The projects should not all be of the same type and must involve the use of different statistical methods and concepts. At least one project should involve complex multivariable analysis of data. Students should enrol in both Workplace Project Portfolio A and Workplace Project Portfolio Part B, either in semesters 1 and 2 respectively, or both in the same semester.

Textbooks

There are no essential readings for this unit.

**BSTA5022 Biostatistics Research Project Part C**

Credit points: 6 Teacher/Coordinator: A/Prof Patrick Kelly, University of Sydney Session: Semester 1, Semester 2 Classes: supervision by an experienced biostatistician Prerequisites: 24 credit points including BSTA5004 and BSTA5007 Prohibitions: BSTA5020 or BSTA5021 Assessment: the portfolio will be examined by two examiners, at least one of whom will be internal to the University of Sydney (100%) Mode of delivery: Normal (lecture/lab/tutorial) day

Note: Department permission required for enrolment

This unit is for master's students who intend to do only one workplace project. The aim of the unit is to give master's students practical experience, usually in workplace settings, in the application of knowledge and skills learnt during the coursework of the master's program. Students will provide evidence of having met this goal by presenting a portfolio made up of a preface and one project report. The project must involve complex multivariable analysis of data.

**BSTA5023 Probability and Distribution Theory**

Credit points: 6 Teacher/Coordinator: Professor Judy Simpson Session: Semester 1, Semester 2 Classes: 8-12 hours total study time per week, distance learning Prerequisites: BSTA5001 Assessment: practical written exercises (30%) and 2x written assignments (35% each) Mode of delivery: Distance education

This unit will focus on applying the calculus-based techniques learned in Mathematics Background for Biostatistics (MBB) to the study of probability and statistical distributions. These two units, together with the subsequent Principles of Statistical Inference (PSI) unit, will provide the core prerequisite mathematical statistics background required for the study of later units in the Graduate Diploma or Masters degree. Content: This unit begins with the study of probability, random variables, discrete and continuous distributions, and the use of calculus to obtain expressions for parameters of these distributions such as the mean and variance. Joint distributions for multiple random variables are introduced together with the important concepts of independence, correlation and covariance, marginal and conditional distributions. Techniques for determining distributions of transformations of random variables are discussed. The concept of the sampling distribution and standard error of an estimator of a parameter is presented, together with key properties of estimators. Large sample results concerning the properties of estimators are presented with emphasis on the central role of the Normal distribution in these results. General approaches to obtaining estimators of parameters are introduced. Numerical simulation and graphing with Stata are used throughout to demonstrate concepts.

Textbooks

Wackerly DO, Mendenhall W, Scheaffer RL. Mathematical Statistics with Applications, 7th edition, 2007, Brooks/Cole, Cengage Learning, USA. ISBN 978-0-495-11081-1 Notes supplied

**PUBH5010 Epidemiology Methods and Uses**

Credit points: 6 Teacher/Coordinator: Dr Erin Mathieu, Professor Tim Driscoll Session: Semester 1 Classes: 1x 1hr lecture and 1x 2hr tutorial per week for 13 weeks - face to face or their equivalent online Prohibitions: BSTA5011 or CEPI5100 Assessment: 1x 6 page assignment (25%), 10 weekly quizzes (5% in total) and 1x 2.5hr supervised open-book exam (70%). For distance students, it may be possible to complete the exam externally with the approval of the course coordinator. Mode of delivery: Normal (lecture/lab/tutorial) day, Normal (lecture/lab/tutorial) evening, Online

This unit provides students with core skills in epidemiology, particularly the ability to critically appraise epidemiological research literature on public health and clinical issues. This unit covers: study types; measures of frequency and association; measurement bias; confounding/effect modification; randomized trials; systematic reviews; screening and test evaluation; infectious disease outbreaks; measuring public health impact; and use and interpretation of population health data. In addition to formal classes or their online equivalent, students are expected to spend at least 2-3 additional hours each week preparing for their tutorials.

Textbooks

Webb, PW. Bain, CJ. and Pirozzo, SL. Essential Epidemiology: An Introduction for Students and Health Professionals Second Edition: Cambridge University Press 2017.

**PUBH5215 Introductory Analysis of Linked Data**

Credit points: 6 Teacher/Coordinator: Professor Judy Simpson Session: Intensive June, Intensive November Classes: block/intensive mode 5 days 9am-5pm Corequisites: (PUBH5010 or BSTA5011 or CEPI5100) and (PUBH5211 or BSTA5004) Assessment: Reflective journal (30%) and 1x assignment (70%) Mode of delivery: Block mode

This unit introduces the topic of linked health data analysis. It will usually run in late June and late November. The topic is a very specialised one and will not be relevant to most MPH students. The modular structure of the unit provides students with a theoretical grounding in the classroom on each topic, followed by hands-on practical exercises in the computing lab using de-identified linked NSW data files. The computing component assumes a basic familiarity with SAS computing syntax and methods of basic statistical analysis of fixed-format data files. Contents include: an overview of the theory of data linkage methods and features of comprehensive data linkage systems, sufficient to know the sources and limitations of linked health data sets; design of linked data studies using epidemiological principles; construction of numerators and denominators used for the analysis of disease trends and health care utilisation and outcomes; assessment of the accuracy and reliability of data sources; data linkage checking and quality assurance of the study process; basic statistical analyses of linked longitudinal health data; manipulation of large linked data files; writing syntax to prepare linked data files for analysis, derive exposure and outcome variables, relate numerators and denominators and produce results from statistical procedures at an introductory to intermediate level. The main assignment involves the analysis of NSW linked data, which can be done only in the Sydney School of Public Health Computer Lab, and is due 10 days after the end of the unit.

Textbooks

Notes will be distributed in class.