Data Science
Errata
Item | Errata | Date
1. | Data Science Major requirement (ii) should read: (ii) 6 credit points of 1000-level units according to the following rules*: (a) 6 credit points of selective units OR (b) 3 credit points of statistics units and 3 credit points of computations units OR (c) 3 credit points of advanced statistics units and 3 credit points of calculus and linear algebra units | 10/1/2018
2. | Data Science Minor requirement (ii) should read: (ii) 6 credit points of 1000-level units according to the following rules*: (a) 6 credit points of selective units OR (b) 3 credit points of statistics units and 3 credit points of computations units OR (c) 3 credit points of advanced statistics units and 3 credit points of calculus and linear algebra units | 10/1/2018
3. | MATH1021 Calculus Of One Variable: Semester 2 session has been added. | 1/2/2018
4. | ENVX1002 Introduction to Statistical Methods: Prohibitions have changed. They now read: N: ENVX1001, MATH1005, MATH1905, MATH1015, MATH1115, DATA1001, BUSS1020, STAT1021 and ECMT1010 | 1/2/2018
DATA SCIENCE
Advanced coursework and projects will be available in 2020 for students who complete this major.
Data Science major
A major in Data Science requires 48 credit points from this table including:
(i) 6 credit points of 1000-level core units
(ii) 6 credit points of 1000-level units according to the following rules*:
(a) 6 credit points of selective units
(b) 3 credit points of statistics units and 3 credit points of computations units
(c) 3 credit points of advanced statistics units and 3 credit points of calculus and linear algebra units
(iii) 12 credit points of 2000-level core units
(iv) 6 credit points of 2000-level selective units
(v) 6 credit points of 3000-level interdisciplinary project units
(vi) 6 credit points of 3000-level methodology-focussed units
(vii) 6 credit points of 3000-level methodology or application and discipline-focussed units
Additional application and discipline-focussed units will be available in 2019 in discipline areas including functional genomics, animal genetics/bioinformatics, and geoscience/hydrology modelling.
*Students not enrolled in the BSc may substitute ECMT1010 or BUSS1020
Data Science minor
A minor in Data Science requires 36 credit points from this table including:
(i) 6 credit points of 1000-level core units
(ii) 6 credit points of 1000-level units according to the following rules*:
(a) 6 credit points of selective units
(b) 3 credit points of statistics units and 3 credit points of computations units
(c) 3 credit points of advanced statistics units and 3 credit points of calculus and linear algebra units
(iii) 12 credit points of 2000-level core units
(iv) 6 credit points of 2000-level selective units
(v) 6 credit points of 3000-level methodology-focussed units
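As a quick arithmetic check, the component requirements listed above can be tallied to confirm that they add up to the 48 credit points stated for the major and the 36 credit points stated for the minor. The following is a minimal Python sketch; the dictionary labels and the function name are informal shorthand chosen for illustration, not official handbook terms.

```python
# Credit points per requirement, copied from the major and minor lists above.
# The labels are informal shorthand for this sketch, not official terms.
MAJOR_REQUIREMENTS = {
    "(i) 1000-level core": 6,
    "(ii) 1000-level units under rule (a), (b) or (c)": 6,
    "(iii) 2000-level core": 12,
    "(iv) 2000-level selective": 6,
    "(v) 3000-level interdisciplinary project": 6,
    "(vi) 3000-level methodology-focussed": 6,
    "(vii) 3000-level methodology or application and discipline-focussed": 6,
}

MINOR_REQUIREMENTS = {
    "(i) 1000-level core": 6,
    "(ii) 1000-level units under rule (a), (b) or (c)": 6,
    "(iii) 2000-level core": 12,
    "(iv) 2000-level selective": 6,
    "(v) 3000-level methodology-focussed": 6,
}


def total_credit_points(requirements):
    """Sum the credit points across all listed requirements."""
    return sum(requirements.values())


print("Major:", total_credit_points(MAJOR_REQUIREMENTS))
print("Minor:", total_credit_points(MINOR_REQUIREMENTS))
```

Each option under rule (ii) also totals 6 credit points: one 6 credit point selective unit under (a), or two 3 credit point units under (b) or (c).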
Units of study
The units of study are listed below.
1000-level units of study
Core
DATA1002 Informatics: Data and Computation
Credit points: 6 Session: Semester 2 Classes: Lectures, Laboratories, Project Work - own time Prohibitions: INFO1903 Assessment: through semester assessment (50%), final exam (50%) Mode of delivery: Normal (lecture/lab/tutorial) day
This unit covers computation and data handling, integrating sophisticated use of existing productivity software, e.g. spreadsheets, with the development of custom software using the general-purpose Python language. It will focus on skills directly applicable to data-driven decision-making. Students will see examples from many domains, and be able to write code to automate the common processes of data science, such as data ingestion, format conversion, cleaning, summarization, creation and application of a predictive model.
Selective
DATA1001 Foundations of Data Science
Credit points: 6 Teacher/Coordinator: Dr Di Warren Session: Semester 1,Semester 2 Classes: lecture 3 hrs/week; computer tutorial 2 hr/week Prohibitions: MATH1005 or MATH1905 or MATH1015 or MATH1115 or ENVX1001 or ENVX1002 or ECMT1010 or BUSS1020 or STAT1021 Assessment: assignments, quizzes, presentation, exam Mode of delivery: Normal (lecture/lab/tutorial) day
DATA1001 is a foundational unit in the Data Science major. The unit focuses on developing critical and statistical thinking skills for all students. Does mobile phone usage increase the incidence of brain tumours? What is the public's attitude to shark baiting following a fatal attack? Statistics is the science of decision making; it is essential in every industry and undergirds all research that relies on data. Students will use problems and data from the physical, health, life and social sciences to develop adaptive problem-solving skills in a team setting. Taught interactively with embedded technology, DATA1001 develops critical thinking and skills to problem-solve with data. It is the prerequisite for DATA2002.
Textbooks
Freedman, Pisani and Purves, Statistics, Fourth Edition
ENVX1002 Introduction to Statistical Methods
Credit points: 6 Teacher/Coordinator: A/Prof Thomas Bishop Session: Semester 1 Classes: Two 1-hour lectures per week, one 1-hour tutorial per week, one 2-hour computer practical per week Prohibitions: ENVX1001 Assessment: One exam during the exam period (50%), three reports (10% each), ten online quizzes (2% each) Mode of delivery: Normal (lecture/lab/tutorial) day
Note: Available as a degree core unit only in the Agriculture, Animal and Veterinary Bioscience, and Food and Agribusiness streams
This is an introductory statistics unit for students in the agricultural, life and environmental sciences. It provides the foundation for statistics and data science skills that are needed for a career in science and for further study in applied statistics and data science. In the first portion of the unit the emphasis is on describing data using statistical and graphical summaries, and probability models. In the second part the focus is on formal hypothesis testing on experimental data using statistical tests. The final part of the unit is on finding patterns in biological and environmental data, through the use of linear and non-linear functions. In the practicals the emphasis is on applying theory to analysing real datasets using the spreadsheet package Excel and the statistical software package R. A key feature of the unit is using R to develop coding skills that have become essential in science for processing and analysing datasets of ever-increasing size.
Textbooks
No textbooks are recommended but useful reference books are:
Statistics units
MATH1005 Statistical Thinking with Data
Credit points: 3 Session: Semester 2,Summer Main,Winter Main Classes: Lectures 2 hrs/week; Practical 1 hr/week Prohibitions: MATH1015 or MATH1905 or STAT1021 or STAT1022 or ECMT1010 or ENVX1001 or ENVX1002 or BUSS1020 Assumed knowledge: HSC Mathematics. Students who have not completed HSC Mathematics (or equivalent) are strongly advised to take the Mathematics Bridging Course (offered in February). Assessment: One 1.5 hour examination, assignments and quizzes (100%) Mode of delivery: Normal (lecture/lab/tutorial) day
In a data-rich world, global citizens need to problem solve with data, and evidence-based decision-making is essential in every field of research and work.
This unit equips you with the foundational statistical thinking to become a critical consumer of data. You will learn to think analytically about data and to evaluate the validity and accuracy of any conclusions drawn. Focusing on statistical literacy, the unit covers foundational statistical concepts, including the design of experiments, exploratory data analysis, sampling and tests of significance.
Textbooks
Freedman, Pisani and Purves, Statistics, Norton, 2007
MATH1015 Biostatistics
Credit points: 3 Session: Semester 1 Classes: Two 1 hour lectures and one 1 hour tutorial per week. Prohibitions: MATH1005 or MATH1905 or STAT1021 or STAT1022 or ECMT1010 or BIOM1003 or ENVX1001 or ENVX1002 or BUSS1020 Assumed knowledge: HSC Mathematics. Students who have not completed HSC Mathematics (or equivalent) are strongly advised to take the Mathematics Bridging Course (offered in February). Assessment: One 1.5 hour examination, assignments and quizzes (100%) Mode of delivery: Normal (lecture/lab/tutorial) day
MATH1015 is designed to provide a thorough preparation in statistics for students in the Biological and Medical Sciences. It offers a comprehensive introduction to data analysis, probability and sampling, inference including t-tests, confidence intervals and chi-squared goodness of fit tests.
Textbooks
As set out in the Junior Mathematics Handbook
Computation units
MATH1115 Interrogating Data
Credit points: 3 Session: Semester 1,Semester 2,Winter Main Classes: 2-hr lab; and 1x1-hr lecture per week Prerequisites: MATH1005 or MATH1015 Prohibitions: DATA1001 or STAT1021 or ECMT1010 or ENVX1001 or BUSS1020 or ENVX1002 or MATH1905 Assessment: projects/presentations, final exam Mode of delivery: Normal (lecture/lab/tutorial) day
In a data-rich world, global citizens need to problem solve with data, and evidence-based decision-making is essential in every field of research and work. This unit equips you with foundational statistical thinking to interrogate data. Focusing on statistical literacy, the unit covers foundational statistical concepts such as visualising data, the linear regression model, and testing significance using the t and chi-square tests. Based on a flipped learning approach, you will experience most of your learning in weekly collaborative 2-hour labs, supplemented by 1-hour lectures. Working in teams, you will explore three real data stories across different domains, with associated literature. The combination of MATH1005/1015 and MATH1115 is equivalent to DATA1001, providing a pathway to the Data Science, Statistics, or Quantitative Life Sciences majors.
Textbooks
Freedman, Pisani and Purves, Statistics, 2007
Advanced Statistics units
MATH1905 Statistical Thinking with Data (Advanced)
Credit points: 3 Session: Semester 2 Classes: Two 1 hour lectures and one 1 hour tutorial per week. Prohibitions: MATH1005 or MATH1015 or STAT1021 or STAT1022 or ECMT1010 or ENVX1001 or ENVX1002 or BUSS1020 Assumed knowledge: (HSC Mathematics Extension 2) OR (90 or above in HSC Mathematics Extension 1) or equivalent Assessment: One 1.5 hour examination, assignments and quizzes (100%) Mode of delivery: Normal (lecture/lab/tutorial) day
Note: Department permission required for enrolment
This unit is designed to provide a thorough preparation for further study in mathematics and statistics. It is a core unit of study providing three of the twelve credit points required by the Faculty of Science as well as a Junior level requirement in the Faculty of Engineering. This Advanced level unit of study parallels the normal unit MATH1005 but goes more deeply into the subject matter and requires more mathematical sophistication.
Textbooks
As set out in the Junior Mathematics Handbook
Maths units
MATH1021 Calculus Of One Variable
Credit points: 3 Session: Semester 1 Classes: 2x1-hr lectures; 1x1-hr tutorial per week Prohibitions: MATH1011 or MATH1901 or MATH1906 or MATH1111 or ENVX1001 or MATH1001 or MATH1921 or MATH1931 Assumed knowledge: HSC Mathematics Extension 1. Students who have not completed HSC Extension 1 Mathematics (or equivalent) are strongly advised to take the Extension 1 Mathematics Bridging Course (offered in February). Assessment: exam, quizzes, assignments Mode of delivery: Normal (lecture/lab/tutorial) day
Calculus is a discipline of mathematics that finds profound applications in science, engineering, and economics. This unit investigates differential calculus and integral calculus of one variable and the diverse applications of this theory. Emphasis is given both to the theoretical and foundational aspects of the subject, as well as developing the valuable skill of applying the mathematical theory to solve practical problems. Topics covered in this unit of study include complex numbers, functions of a single variable, limits and continuity, differentiation, optimisation, Taylor polynomials, Taylor's Theorem, Taylor series, Riemann sums, and Riemann integrals.
Textbooks
As set out in the Junior Mathematics Handbook.
MATH1921 Calculus Of One Variable (Advanced)
Credit points: 3 Session: Semester 1 Classes: 2x1-hr lectures; and 1x1-hr tutorial per week Prohibitions: MATH1001 or MATH1011 or MATH1906 or MATH1111 or ENVX1001 or MATH1901 or MATH1021 or MATH1931 Assumed knowledge: (HSC Mathematics Extension 2) OR (Band E4 in HSC Mathematics Extension 1) or equivalent. Assessment: exam, quizzes, assignments Mode of delivery: Normal (lecture/lab/tutorial) day
Note: Department permission required for enrolment
Calculus is a discipline of mathematics that finds profound applications in science, engineering, and economics. This unit investigates differential calculus and integral calculus of one variable and the diverse applications of this theory. Emphasis is given both to the theoretical and foundational aspects of the subject, as well as developing the valuable skill of applying the mathematical theory to solve practical problems. Topics covered in this unit of study include complex numbers, functions of a single variable, limits and continuity, differentiation, optimisation, Taylor polynomials, Taylor's Theorem, Taylor series, Riemann sums, and Riemann integrals. Additional theoretical topics included in this advanced unit include the Intermediate Value Theorem, Rolle's Theorem, and the Mean Value Theorem.
Textbooks
As set out in the Junior Mathematics Handbook
MATH1931 Calculus Of One Variable (SSP)
Credit points: 3 Session: Semester 1 Classes: 2x1-hr lectures; 1x1-hr seminar; and 1x1-hr tutorial per week Prohibitions: MATH1001 or MATH1011 or MATH1901 or MATH1111 or ENVX1001 or MATH1906 or MATH1021 or MATH1921 Assumed knowledge: Band E4 in HSC Mathematics Extension 2 or equivalent. Assessment: exam, quizzes, assignments, seminar participation Mode of delivery: Normal (lecture/lab/tutorial) day
Note: Department permission required for enrolment
Note: Enrolment is by invitation only.
The Mathematics Special Studies Program is for students with exceptional mathematical aptitude, and requires outstanding performance in past mathematical studies. Students will cover the material of MATH1921 Calculus of One Variable (Adv), and attend a weekly seminar covering special topics not available elsewhere in the Mathematics and Statistics program.
MATH1023 Multivariable Calculus and Modelling
Credit points: 3 Session: Semester 2 Classes: 2x1-hr lectures; 1x1-hr tutorial per week Prohibitions: MATH1013 or MATH1903 or MATH1907 or MATH1003 or MATH1923 or MATH1933 Assumed knowledge: HSC Mathematics Extension 1. Students who have not completed HSC Extension 1 Mathematics (or equivalent) are strongly advised to take the Extension 1 Mathematics Bridging Course (offered in February). Assessment: exam, quizzes, assignments Mode of delivery: Normal (lecture/lab/tutorial) day
Calculus is a discipline of mathematics that finds profound applications in science, engineering, and economics. This unit investigates multivariable differential calculus and modelling. Emphasis is given both to the theoretical and foundational aspects of the subject, as well as developing the valuable skill of applying the mathematical theory to solve practical problems. Topics covered in this unit of study include mathematical modelling, first order differential equations, second order differential equations, systems of linear equations, visualisation in 2 and 3 dimensions, partial derivatives, directional derivatives, the gradient vector, and optimisation for functions of more than one variable.
Textbooks
As set out in the Junior Mathematics Handbook
MATH1923 Multivariable Calculus and Modelling (Adv)
Credit points: 3 Session: Semester 2 Classes: 2x1-hr lectures; and 1x1-hr tutorial per week Prohibitions: MATH1003 or MATH1013 or MATH1907 or MATH1903 or MATH1023 or MATH1933 Assumed knowledge: (HSC Mathematics Extension 2) OR (Band E4 in HSC Mathematics Extension 1) or equivalent. Assessment: exam, quizzes, assignments Mode of delivery: Normal (lecture/lab/tutorial) day
Note: Department permission required for enrolment
Calculus is a discipline of mathematics that finds profound applications in science, engineering, and economics. This unit investigates multivariable differential calculus and modelling. Emphasis is given both to the theoretical and foundational aspects of the subject, as well as developing the valuable skill of applying the mathematical theory to solve practical problems. Topics covered in this unit of study include mathematical modelling, first order differential equations, second order differential equations, systems of linear equations, visualisation in 2 and 3 dimensions, partial derivatives, directional derivatives, the gradient vector, and optimisation for functions of more than one variable. Additional topics covered in this advanced unit of study include the use of diagonalisation of matrices to study systems of linear equations and optimisation problems, limits of functions of two or more variables, and the derivative of a function of two or more variables.
Textbooks
As set out in the Junior Mathematics Handbook
MATH1933 Multivariable Calculus and Modelling (SSP)
Credit points: 3 Session: Semester 2 Classes: 2x1-hr lectures; 1x1-hr seminar; and 1x1-hr tutorial per week Prohibitions: MATH1003 or MATH1903 or MATH1013 or MATH1907 or MATH1023 or MATH1923 Assumed knowledge: Band E4 in HSC Mathematics Extension 2 or equivalent. Assessment: exam, quizzes, assignments, seminar participation Mode of delivery: Normal (lecture/lab/tutorial) day
Note: Department permission required for enrolment
Note: Enrolment is by invitation only.
The Mathematics Special Studies Program is for students with exceptional mathematical aptitude, and requires outstanding performance in past mathematical studies. Students will cover the material of MATH1923 Multivariable Calculus and Modelling (Adv), and attend a weekly seminar covering special topics not available elsewhere in the Mathematics and Statistics program.
MATH1002 Linear Algebra
Credit points: 3 Session: Semester 1,Summer Main Classes: Two 1 hour lectures and one 1 hour tutorial per week. Prohibitions: MATH1012 or MATH1014 or MATH1902 Assumed knowledge: HSC Mathematics or MATH1111. Students who have not completed HSC Mathematics (or equivalent) are strongly advised to take the Mathematics Bridging Course (offered in February). Assessment: One 1.5 hour examination, assignments and quizzes (100%) Mode of delivery: Normal (lecture/lab/tutorial) day
MATH1002 is designed to provide a thorough preparation for further study in mathematics and statistics. It is a core unit of study providing three of the twelve credit points required by the Faculty of Science as well as a Junior level requirement in the Faculty of Engineering.
This unit of study introduces vectors and vector algebra, linear algebra including solutions of linear systems, matrices, determinants, eigenvalues and eigenvectors.
Textbooks
As set out in the Junior Mathematics Handbook
MATH1902 Linear Algebra (Advanced)
Credit points: 3 Session: Semester 1 Classes: Two 1 hour lectures and one 1 hour tutorial per week. Prohibitions: MATH1002 or MATH1012 or MATH1014 Assumed knowledge: (HSC Mathematics Extension 2) OR (90 or above in HSC Mathematics Extension 1) or equivalent Assessment: One 1.5 hour examination, assignments and quizzes (100%) Mode of delivery: Normal (lecture/lab/tutorial) day
Note: Department permission required for enrolment
This unit is designed to provide a thorough preparation for further study in mathematics and statistics. It is a core unit of study providing three of the twelve credit points required by the Faculty of Science as well as a Junior level requirement in the Faculty of Engineering. It parallels the normal unit MATH1002 but goes more deeply into the subject matter and requires more mathematical sophistication.
Textbooks
As set out in the Junior Mathematics Handbook
*Students not enrolled in BSc – Substitute units
2000-level units of study
Core
DATA2001 Data Science: Big Data and Data Diversity
Credit points: 6 Session: Semester 1 Classes: Lectures, Laboratories, Project Work - own time Prerequisites: DATA1002 OR INFO1110 OR INFO1903 OR INFO1103 Assessment: through semester assessment (50%), final exam (50%) Mode of delivery: Normal (lecture/lab/tutorial) day
This course focuses on methods and techniques to efficiently explore and analyse large data collections. Where are hot spots of pedestrian accidents across a city? What are the most popular travel locations according to user postings on a travel website? The ability to combine and analyse data from various sources and from databases is essential for informed decision making in both research and industry.
Students will learn how to ingest, combine and summarise data from a variety of data models which are typically encountered in data science projects, such as relational, semi-structured, time series, geospatial, image and text. As well as reinforcing their programming skills through experience with relevant Python libraries, this course will also introduce students to the concept of declarative data processing with SQL, and to the analysis of data in relational databases. Students will be given data sets from, e.g., social media, transport, health and social sciences, and be taught basic explorative data analysis and mining techniques in the context of small use cases. The course will further give students an understanding of the challenges involved with analysing large data volumes, such as the idea of partitioning and distributing data and computation among multiple computers for processing of 'Big Data'.
DATA2002 Data Analytics: Learning from Data
Credit points: 6 Teacher/Coordinator: Jean Yang Session: Semester 2 Classes: lecture 3 hrs/week; computer tutorial 2 hr/week Prerequisites: [DATA1001 or ENVX1001 or ENVX1002] or [MATH10X5 and MATH1115] or [MATH10X5 and STAT2011] or [MATH1905 and MATH1XXX (except MATH1XX5)] or [BUSS1020 or ECMT1010 or STAT1021] Prohibitions: STAT2012 or STAT2912 Assumed knowledge: (Basic Linear Algebra and some coding) or QBUS1040 Assessment: written assignment, presentation, exams Mode of delivery: Normal (lecture/lab/tutorial) day
Technological advances in science, business and engineering have given rise to a proliferation of data from all aspects of our life. Understanding the information presented in these data is critical as it enables informed decision making in many areas including market intelligence and science. DATA2002 is an intermediate course in statistics and data sciences, focusing on learning data analytic skills for a wide range of problems and data. How should the Australian government measure and report employment and unemployment? Can we tell the difference between decaffeinated and regular coffee? In this course, you will learn how to ingest, combine and summarise data from a variety of data models which are typically encountered in data science projects, as well as reinforcing your programming skills through experience with a statistical programming language. You will also be exposed to the concept of statistical machine learning and develop the skill to analyse various types of data in order to answer a scientific question. From this unit, you will develop knowledge and skills that will enable you to embrace data analytic challenges stemming from everyday problems.
Selective
COMP2123 Data Structures and Algorithms
Credit points: 6 Session: Semester 1 Classes: Lectures, Tutorials Prerequisites: INFO1110 OR INFO1113 OR DATA1002 OR INFO1103 OR INFO1903 Prohibitions: INFO1105 OR INFO1905 OR COMP2823 Assessment: through semester assessment (50%), final exam (50%) Mode of delivery: Normal (lecture/lab/tutorial) day
This unit will teach some powerful ideas that are central to solving algorithmic problems in ways that are more efficient than naive approaches. In particular, students will learn how data collections can support efficient access, for example, how a dictionary or map can allow key-based lookup that does not slow down linearly as the collection grows in size. The data structures covered in this unit include lists, stacks, queues, priority queues, search trees, hash tables, and graphs. Students will also learn efficient techniques for classic tasks such as sorting a collection. The concept of asymptotic notation will be introduced, and used to describe the costs of various data access operations and algorithms.
COMP2823 Data Structures and Algorithms (Adv)
Credit points: 6 Session: Semester 1 Classes: lectures, tutorials Prerequisites: Distinction level result in at least one of INFO1110 OR INFO1113 OR DATA1002 OR INFO1103 OR INFO1903 Prohibitions: INFO1105 OR INFO1905 OR COMP2123 Assessment: through semester assessment (50%), final exam (50%) Mode of delivery: Normal (lecture/lab/tutorial) day
Note: Department permission required for enrolment
This unit will teach some powerful ideas that are central to solving algorithmic problems in ways that are more efficient than naive approaches. In particular, students will learn how data collections can support efficient access, for example, how a dictionary or map can allow key-based lookup that does not slow down linearly as the collection grows in size. The data structures covered in this unit include lists, stacks, queues, priority queues, search trees, hash tables, and graphs. Students will also learn efficient techniques for classic tasks such as sorting a collection. The concept of asymptotic notation will be introduced, and used to describe the costs of various data access operations and algorithms.
ISYS2120 Data and Information Management
Credit points: 6 Session: Semester 2 Classes: Lectures, Tutorials, Laboratories, Project Work - own time Prerequisites: INFO1113 OR INFO1103 OR INFO1105 OR INFO1905 OR INFO1003 OR INFO1903 OR DECO1012 Prohibitions: INFO2120 OR INFO2820 OR COMP5138 Assumed knowledge: Programming skills Assessment: through semester assessment (50%), final exam (50%) Mode of delivery: Normal (lecture/lab/tutorial) day
The ubiquitous use of information technology leaves us facing a tsunami of data produced by users, IT systems and mobile devices. The proper management of data is hence essential for all applications and for effective decision making within organizations.
This unit of study will introduce the basic concepts of database design at the conceptual, logical and physical levels. We will place particular emphasis on introducing integrity constraints and the concept of data normalization, which prevents data from being corrupted or duplicated in different parts of the database. This in turn helps the data remain consistent during its lifetime. Once a database design is in place, the emphasis shifts towards querying the data in order to extract useful information. The unit will introduce the SQL database query language, which is the industry standard. Other topics covered will include the important concept of transaction management, application development with a backend database, and an overview of data warehousing and OLAP.
STAT2011 Probability and Estimation Theory
Credit points: 6 Session: Semester 1 Classes: Three 1 hour lectures, one 1 hour tutorial and one 1 hour computer laboratory per week. Prerequisites: (MATH1X21 or MATH1931 or MATH1X01 or MATH1906 or MATH1011) and (MATH1XX5 or STAT1021 or ECMT1010 or BUSS1020) Prohibitions: STAT2901 or STAT2001 or STAT2911 Assessment: One 2 hour exam, assignments and/or quizzes, and computer practical reports (100%) Mode of delivery: Normal (lecture/lab/tutorial) day
This unit provides an introduction to univariate techniques in data analysis and the most common statistical distributions that are used to model patterns of variability. Common discrete random models like the binomial, Poisson and geometric, and continuous models including the normal and exponential, will be studied along with elementary regression models. The method of moments and maximum likelihood techniques for fitting statistical distributions to data will be explored. The unit will have weekly computer classes where candidates will learn to use a statistical computing package to perform simulations and carry out computer intensive estimation techniques like the bootstrap method.
STAT2911 Probability and Statistical Models (Adv)
Credit points: 6 Session: Semester 1 Classes: Three 1 hour lectures, one 1 hour tutorial and one 1 hour computer laboratory per week. Prerequisites: [MATH19X3 or MATH1907 or (a mark of 65 in MATH1023 or MATH1003)] and [MATH1905 or MATH1904 or (a mark of 65 in MATH1005 or ECMT1010 or BUSS1020)] Prohibitions: STAT2001 or STAT2901 or STAT2011 Assessment: One 2 hour exam, assignments and/or quizzes, and computer practical reports (100%) Mode of delivery: Normal (lecture/lab/tutorial) day
This unit is essentially an advanced version of STAT2011, with an emphasis on the mathematical techniques used to manipulate random variables and probability models. Common distributions including the Poisson, normal, beta and gamma families as well as the bivariate normal are introduced. Moment generating functions and convolution methods are used to understand the behaviour of sums of random variables. The method of moments and maximum likelihood techniques for fitting statistical distributions to data will be explored. The notions of conditional expectation and prediction will be covered, as will distributions related to the normal: chi^2, t and F. The unit will have weekly computer classes where candidates will learn to use a statistical computing package to perform simulations and carry out computer intensive estimation techniques like the bootstrap method.
QBUS2830 Actuarial Data Analytics
Credit points: 6 Session: Semester 2 Classes: 1x 2hr lecture per wk and 1x 1hr tutorial per wk Prerequisites: QBUS2810 or DATA2002 or ECMT2110 Assumed knowledge: BUSS1020 or ECMT1010 or ENVX1001 or ENVX1002 or ((MATH1005 or MATH1015) and MATH1115) or 6 credit points in MATH 1000-level units including MATH1905. Assessment: assignments (30%), mid-semester exam (20%), final exam (50%) Mode of delivery: Normal (lecture/lab/tutorial) day
The unit covers a range of statistical models and methods for analysing quantitative actuarial data in general insurance. Both maximum likelihood estimation and Bayesian estimation methods are adopted for statistical inferences with the use of modern software tools such as the R and OpenBUGS packages. Topics covered include probability distributions for actuarial modelling, claim size modelling, claim frequency modelling, loss reserve forecasting, pure premium calculation, premium rates reviewing and revising (credibility theory), linear and generalised linear models, Poisson process and Markov process in actuarial modelling. Upon the completion of this unit and other relevant business analytics units, students may undertake professional examinations for actuaries or may get exemptions in some professional examination papers.
3000-level units of study
Interdisciplinary project unit
DATA3001 to be developed for offering in 2019.
Methodology-focussed units
DATA3404 Data Science Platforms
Credit points: 6 Session: Semester 1 Classes: lectures, tutorials Prerequisites: DATA2001 OR ISYS2120 OR INFO2120 OR INFO2820 Prohibitions: INFO3504 OR INFO3404 Assumed knowledge: This unit of study assumes that students have previous knowledge of database structures and of SQL. The prerequisite material is covered in DATA2001 or ISYS2120. Familiarity with a programming language (e.g. Java or C) is also expected. Assessment: through semester assessment (40%), final exam (60%) Mode of delivery: Normal (lecture/lab/tutorial) day
This unit of study provides a comprehensive overview of the internal mechanisms of data science platforms and of systems that manage large data collections. These skills are needed for successful performance tuning and to understand the scalability challenges faced when processing Big Data. This unit builds upon the second-year unit DATA2001 'Data Science: Big Data and Data Diversity' and correspondingly assumes a sound understanding of SQL and data analysis tasks.
The first part of this subject focuses on mechanisms for large-scale data management. It provides a deep understanding of the internal components of a data management platform. Topics include: physical data organization and disk-based index structures, query processing and optimisation, and database tuning.
The second part focuses on the large-scale management of big data in a distributed architecture. Topics include: distributed and replicated databases, information retrieval, data stream processing, and web-scale data processing.
The unit will be of interest to students seeking an introduction to data management tuning, disk-based data structures and algorithms, and information retrieval. It will be valuable to those pursuing such careers as Software Engineers, Data Engineers, Database Administrators, and Big Data Platform specialists.
ISYS3401 Information Technology Evaluation
Credit points: 6 Session: Semester 1 Classes: Lectures, Tutorials Prerequisites: (INFO2110 OR ISYS2110) AND (INFO2120 OR ISYS2120) AND (ISYS2140 OR ISYS2160) Assessment: Through semester assessment (35%) and Final Exam (65%) Mode of delivery: Normal (lecture/lab/tutorial) day
Information Systems (IS) professionals in today's organisations are required to play leadership roles in change and development. Your success in this field will be aided by your being able to carry out research-based investigations using suitable methods and mastery over data collection and analysis to assist in managing projects and in decision making. Practical research skills are some of the most important assets you will need in your career.
This unit of study will cover important concepts and skills in practical research for solving and managing important problems. It will also provide you with the skills to undertake the capstone project in the IS project unit of study offered in Semester 2, or other projects, and with hands-on experience of using Microsoft Excel and other tools to perform some of the quantitative analysis.
STAT3X23, STAT3X22, STAT3021, STAT3024 and DATA3406 are to be developed for offering in 2019.
Application and discipline-focussed units
ENVX3001 Environmental GIS
Credit points: 6 Teacher/Coordinator: A/Prof Inakwu Odeh Session: Semester 2 Classes: Three-day field trip, (two lectures and two practicals per week) Prerequisites: 6cp from (ENVI1003, AGEN1002) or 6cp from GEOS1XXX or 6cp from BIOL1XXX Assessment: One 15-minute presentation (10%), 3500wd prac report (35%), 1500wd report on trip excursion (15%), 2-hour exam (40%) Mode of delivery: Normal (lecture/lab/tutorial) day
This unit is designed to impart knowledge and skills in spatial analysis and geographical information science (GISc) for decision-making in an environmental context. The lecture material will present several themes: principles of GISc, geospatial data sources and acquisition methods, processing of geospatial data and spatial statistics. Practical exercises will focus on learning geographical information systems (GIS) and how to apply them to land resource assessment, including digital terrain modelling, land-cover assessment, sub-catchment modelling, ecological applications, and soil quality assessment for decisions regarding sustainable land use and management. A three-day field excursion during the mid-semester break will involve a day of GPS fieldwork at Arthursleigh University farm and two days in Canberra visiting various government agencies which research and maintain GIS coverages for Australia. By the end of this unit of study, students should be able to: differentiate between spatial data and spatial information; source geospatial data from government and private agencies; apply conceptual models of spatial phenomena for practical decision-making in an environmental context; critically analyse situations and apply the concepts of spatial analysis to solving environmental and land resource problems; communicate the results of GIS investigations effectively in oral, written and essay formats; and use a major GIS software package such as ArcGIS.
Textbooks
Burrough, P.A. and McDonnell, R.A. 1998. Principles of Geographic Information Systems. Oxford University Press: Oxford.
ENVX3002 Statistics in the Natural Sciences
Credit points: 6 Teacher/Coordinator: Dr Floris Van Ogtrop Session: Semester 1 Classes: one 2-hour workshop per week, one 3-hour computer practical per week Prerequisites: ENVX2001 or BIOM2001 or STAT2X12 or BIOL2X22 or DATA2002 or QBIO2001 Assessment: One exam during the exam period (50%), five assessment tasks (50%) Mode of delivery: Normal (lecture/lab/tutorial) day
Note: Interdisciplinary Unit
This unit of study is designed to introduce students to the analysis of data they may face in their future careers, in particular data that are not well behaved. The data may be non-normal, there may be missing observations, they may be correlated in space and time or too numerous to analyse with standard models. The unit is presented in an applied context with an emphasis on correctly analysing authentic datasets, and interpreting the output. It begins with the analysis and design of experiments based on the general linear model. In the second part, students will learn about the generalisation of the general linear model to accommodate non-normal data, with a particular emphasis on the binomial and Poisson distributions. In the third part, linear mixed models will be introduced, which provide the means to analyse datasets that do not meet the assumptions of independent and equal errors, for example data that are correlated in space and time. The unit ends with an introduction to machine learning and predictive modelling. A key feature of the unit is using R to develop coding skills that are becoming essential in science for processing and analysing datasets of ever increasing size.
AMED3002 Interrogating Biomedical and Health Data
Credit points: 6 Teacher/Coordinator: Prof Jean Yang Session: Semester 1 Classes: face to face 5 hrs/week; online 2 hrs/week; individual and/or group work 3-6 hrs/week Assumed knowledge: Exploratory data analysis, sampling, simple linear regression, t-tests, confidence intervals and chi-squared goodness of fit tests; familiarity with basic coding and basic linear algebra. Additional information for BMedSc degree students: You must have successfully completed BMED2401 and an additional 12cp from BMED240X before enrolling in this unit. Assessment: in-semester exam, assignments, presentation Mode of delivery: Normal (lecture/lab/tutorial) day
Note: BMedSc degree students: You must have successfully completed BMED2401 and an additional 12cp from BMED240X before enrolling in this unit.
Biotechnological advances have given rise to an explosion of original and shared public data relevant to human health. These data, including the monitoring of expression levels for thousands of genes and proteins simultaneously, together with multiple databases on biological systems, now promise exciting, ground-breaking discoveries in complex diseases. Critical to these discoveries will be our ability to unravel and extract information from these data. In this unit, you will develop analytical skills required to work with data obtained in the medical and diagnostic sciences. You will explore clinical data using powerful, state-of-the-art methods and tools. Using real data sets, you will be guided in the application of modern data science techniques to interrogate, analyse and represent the data, both graphically and numerically. By analysing your own real data, as well as data from large public resources, you will learn and apply the methods needed to find information on the relationship between genes and disease. Leveraging expertise from multiple sources by working in team-based collaborative learning environments, you will develop knowledge and skills that will enable you to play an active role in finding meaningful solutions to difficult problems, creating an important impact on our lives.
QBUS3810 Actuarial Risk Analytics
Credit points: 6 Session: Semester 1 Classes: 1x 2hr lecture and 1x 1hr tutorial per week Prerequisites: QBUS2810 or DATA2002 or ECMT2110 Prohibitions: ECMT3180 Assessment: assignment 1 (10%), assignment 2 (10%), assignment 3 (10%), mid-semester exam (15%), group assignment (15%), final exam (40%) Mode of delivery: Normal (lecture/lab/tutorial) day
Everyone working in business needs to understand and manage risk. This unit provides the basic knowledge and tools needed to do this. It includes material on the risk management strategies that every business needs, as well as specific quantitative and statistical techniques for evaluating risk. Through this unit students learn how different aspects of risk management fit together, including Value-at-Risk (VaR) and tail-VaR calculations, Monte-Carlo simulation, extreme value theory, individual and collective risk models, credibility theory and credit scoring.
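As an informal illustration of two of the techniques named above, the following minimal Python sketch estimates VaR and tail-VaR from Monte-Carlo simulated losses. It is not taken from the unit materials: the lognormal loss model, its parameters, and the function names simulate_losses and var_and_tail_var are assumptions chosen only for the example, and the quantile rule used is one simple empirical convention among several.

```python
import random


def simulate_losses(n, seed=0):
    """Draw n hypothetical losses from a lognormal model (an assumed choice)."""
    rng = random.Random(seed)
    return [rng.lognormvariate(0.0, 1.0) for _ in range(n)]


def var_and_tail_var(losses, level=0.95):
    """Return empirical (VaR, tail-VaR) at the given confidence level.

    VaR is taken as the order statistic at position int(level * n);
    tail-VaR is the mean of all losses at or above that position.
    """
    ordered = sorted(losses)
    k = min(int(level * len(ordered)), len(ordered) - 1)
    var = ordered[k]
    tail = ordered[k:]
    tail_var = sum(tail) / len(tail)
    return var, tail_var


if __name__ == "__main__":
    losses = simulate_losses(10_000)
    var, tail_var = var_and_tail_var(losses, level=0.95)
    print(f"95% VaR: {var:.3f}, 95% tail-VaR: {tail_var:.3f}")
```

Because tail-VaR averages the losses at or beyond the VaR threshold, it is never smaller than VaR at the same level, which is why it is often preferred when the size of extreme losses matters.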