Active Outline
General Information
- Course ID (CB01A and CB01B)
- CISD009
- Course Title (CB02)
- Introduction to Data Science
- Course Credit Status
- Credit - Degree Applicable
- Effective Term
- Fall 2021
- Course Description
- This course is an introduction to data science, covering data analytics and machine learning. Topics include data gathering and data wrangling, data assessment and visualization, supervised and unsupervised machine learning, and natural language processing.
- Faculty Requirements
- Course Family
- Not Applicable
Course Justification
This course is UC/CSU transferable. It is part of the Python Programming Certificate of Achievement. Data science is a fast-growing field within computer science and as a result, several UCs and other 4-year schools have recently created a bachelor's in data science track to meet the interest and demand in the field. This course helps students prepare for their entrance into a data science major.
Foothill Equivalency
- Does the course have a Foothill equivalent?
- No
- Foothill Course ID
Formerly Statement
Course Development Options
- Basic Skill Status (CB08)
- Course is not a basic skills course.
- Grade Options
- Letter Grade
- Pass/No Pass
- Repeat Limit
- 0
Transferability & Gen. Ed. Options
- Transferability
- Transferable to both UC and CSU
Units and Hours
Summary
- Minimum Credit Units
- 4.5
- Maximum Credit Units
- 4.5
Weekly Student Hours
Type | In Class | Out of Class |
---|---|---|
Lecture Hours | 4.0 | 8.0 |
Laboratory Hours | 1.5 | 0.0 |
Course Student Hours
- Course Duration (Weeks)
- 12.0
- Hours per unit divisor
- 36.0
Course In-Class (Contact) Hours
- Lecture
- 48.0
- Laboratory
- 18.0
- Total
- 66.0
Course Out-of-Class Hours
- Lecture
- 96.0
- Laboratory
- 0.0
- NA
- 0.0
- Total
- 96.0
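The hour and unit figures above are consistent with one another. As a minimal sketch of that check (the variable and function names are illustrative, and the formula of total student hours divided by the hours-per-unit divisor is an assumption inferred from the listed values, not quoted from a policy document):

```python
# Values copied from the Units and Hours section above.
WEEKS = 12.0
HOURS_PER_UNIT_DIVISOR = 36.0

weekly_hours = {
    "lecture": {"in_class": 4.0, "out_of_class": 8.0},
    "laboratory": {"in_class": 1.5, "out_of_class": 0.0},
}


def course_hours(kind: str) -> float:
    """Total hours of one kind (in_class or out_of_class) over the whole course."""
    return sum(hours[kind] * WEEKS for hours in weekly_hours.values())


contact_hours = course_hours("in_class")
out_of_class_hours = course_hours("out_of_class")

# Assumed formula: units = total student hours / hours-per-unit divisor.
units = (contact_hours + out_of_class_hours) / HOURS_PER_UNIT_DIVISOR

print(contact_hours, out_of_class_hours, units)
```

Under these assumptions the script reproduces the 66.0 contact hours, 96.0 out-of-class hours, and 4.5 credit units listed in this section.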
Prerequisite(s)
CIS D041A
Corequisite(s)
Advisory(ies)
Limitation(s) on Enrollment
Entrance Skill(s)
General Course Statement(s)
Methods of Instruction
Lecture and visual aids
Discussion of assigned reading
Discussion and problem solving performed in class
In-class exploration of Internet sites
Quiz and examination review performed in class
Homework and extended projects
Collaborative learning and small group exercises
Collaborative projects
Laboratory discussion sessions and quizzes that evaluate the preceding weekly laboratory exercises
Assignments
- Reading: required reading from the textbook and class notes
- Programs: 6-8 programming homework assignments, several with 100 or more lines of code.
Methods of Evaluation
- Evaluation of programming assignments and reports for correctness, use of design principles, documentation and efficiency.
- One or more examinations requiring programming ability to develop an algorithm, evaluate code segments, and write code using theories presented in the course.
- In-class lab problems, group collaborative problems, exam questions and/or online assignments or tutorials demonstrating the ability to read and analyze code through debugging and/or writing snippets of code.
- A final examination requiring programming ability to develop algorithms, evaluate code segments, and write code using theories presented in the course.
Essential Student Materials/Essential College Facilities
Essential Student Materials:
- None.
Essential College Facilities:
- Computer lab with computers running the Python interpreter and Anaconda package
Examples of Primary Texts and References
Author | Title | Publisher | Date/Edition | ISBN |
---|---|---|---|---|
Igual, Laura and Segui, Santi | Introduction to Data Science | Springer | 2017, 1st Edition | 978-3-319-50017-1 |
Saltz, Jeffrey | An Introduction to Data Science | Sage Publishing | 2018, 1st Edition | 978-1506377537 |
Examples of Supporting Texts and References
Author | Title | Publisher | Date/Edition | ISBN |
---|---|---|---|---|
Hastie, Trevor | The Elements of Statistical Learning | Springer | 2017, 2nd Edition | 978-0387848570 |
Learning Outcomes and Objectives
Course Objectives
- Define and describe data science concepts
- Apply mathematics and statistics building blocks
- Apply programming constructs in data science
- Collect data from multiple sources
- Apply data wrangling techniques
- Assess data sets
- Visualize and present data
- Apply a learning framework
- Evaluate supervised learning
- Evaluate unsupervised learning
- Evaluate natural language processing
CSLOs
- Collect, clean, analyze, and visualize data to meet and defend a measured objective.
- Gather data and choose a model to train and tune the machine learning tool and interpret the result.
Outline
- Define and describe data science concepts
- The role of data analytics
- Machine learning application
- Apply mathematics and statistics building blocks
- Linear algebra - notation, vector and matrix operations
- Statistics - measures such as mean, median, mode, standard deviation, outliers, correlation, confidence intervals
- Apply programming constructs in data science
- Data Structures
- Array
- Data frame
- Data input / output
- Operators
- Functions
- Plots
- Collect data from multiple sources
- Define objective
- Web scraping
- Web API
- Importing file
- HTML
- text
- CSV
- JSON
- Apply data wrangling techniques
- Joining data
- Data validation
- Data cleaning
- Assess data sets
- Exploratory data analysis
- Data filtering, sorting
- Searching, retrieving data
- Time series
- Visualize and present data
- Univariate and multivariate plots
- Defense of results
- Apply a learning framework
- Loading dataset
- Learning and predicting
- Saving models
- Evaluate supervised learning
- Training, predicting
- Evaluating model success
- Evaluate unsupervised learning
- Clustering
- Outlier detection
- Evaluate natural language processing
- Rule based and statistical NLP
- Sentiment analysis
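As one illustration of the statistics building blocks listed in the outline above (mean, median, mode, standard deviation, outliers), the following is a minimal sketch using only Python's standard `statistics` module. The sample data are made up, the helper name `iqr_outliers` is hypothetical, and the 1.5 x IQR rule is just one common rule of thumb for flagging outliers, not a method prescribed by this course.

```python
import statistics

# Hypothetical sample: daily page views for a small website (made-up numbers).
views = [120, 135, 128, 135, 142, 131, 400]

mean = statistics.mean(views)
median = statistics.median(views)
mode = statistics.mode(views)
stdev = statistics.stdev(views)  # sample standard deviation


def iqr_outliers(values):
    """Return the values lying more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


outliers = iqr_outliers(views)

print(f"mean={mean:.1f} median={median} mode={mode} stdev={stdev:.1f}")
print("outliers:", outliers)
```

Comparing the mean with the median on this sample shows how a single extreme value pulls the mean upward while leaving the median unchanged.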
Lab Topics
- Present, evaluate, and explain the role and application of data science in modern life.
- Solve mathematical and statistical problems with large data sets.
- Write code to perform mathematical and statistical work on input data to produce and present the result.
- Write code to locate and read data from multiple types of data sources.
- Write code to join, validate, and clean input data to prepare for analysis.
- Write code to filter, search, and sort data to arrive at a conclusion about the data trend.
- Write code to utilize the appropriate plots to visualize data.
- Write code to work with a machine learning model to train and predict data.
- Write code to provide input data for a machine learning model for unsupervised learning.
- Write code to process text and predict an outcome.
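The lab topics on reading, joining, validating, cleaning, filtering, and sorting data can be sketched with the standard library alone. This is a minimal illustration, not a prescribed lab solution: the CSV contents, column names, and helper functions (`read_rows`, `clean_orders`, `join_customers`) are all hypothetical, and a lab would more likely read real files and might use a data frame library from the Anaconda distribution instead.

```python
import csv
import io

# Hypothetical input data (made up for illustration); in a lab these would be files.
ORDERS_CSV = """order_id,customer_id,amount
1,c1,25.50
2,c2,
3,c1,14.00
4,c3,not_a_number
5,c2,60.25
"""
CUSTOMERS_CSV = """customer_id,name
c1,Ada
c2,Grace
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_orders(rows):
    """Validate and clean: keep only rows whose amount parses as a number."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows with a missing or malformed amount
        cleaned.append({**row, "amount": amount})
    return cleaned


def join_customers(orders, customers):
    """Inner join on customer_id, adding the customer name to each order."""
    names = {c["customer_id"]: c["name"] for c in customers}
    return [
        {**o, "name": names[o["customer_id"]]}
        for o in orders
        if o["customer_id"] in names
    ]


orders = clean_orders(read_rows(ORDERS_CSV))
joined = join_customers(orders, read_rows(CUSTOMERS_CSV))

# Filter and sort: orders of at least 20.00, largest first.
large = sorted(
    (o for o in joined if o["amount"] >= 20.0),
    key=lambda o: o["amount"],
    reverse=True,
)
for o in large:
    print(o["order_id"], o["name"], o["amount"])
```

Cleaning before joining keeps the malformed rows from reaching the later steps; the inner join then silently drops orders with no matching customer, which is a design choice a lab exercise might revisit.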