Active Outline

General Information


Course ID (CB01A and CB01B)
CIS D009
Course Title (CB02)
Introduction to Data Science
Course Credit Status
Credit - Degree Applicable
Effective Term
Fall 2021
Course Description
This course is an introduction to data science, covering data analytics and machine learning. Topics include data gathering and data wrangling, data assessment and visualization, supervised and unsupervised machine learning, and natural language processing.
Faculty Requirements
Course Family
Not Applicable

Course Justification


This course is UC/CSU transferable. It is part of the Python Programming Certificate of Achievement. Data science is a fast-growing field within computer science, and as a result several UCs and other four-year schools have recently created bachelor's degree tracks in data science to meet the interest and demand in the field. This course helps students prepare to enter a data science major.

Foothill Equivalency


Does the course have a Foothill equivalent?
No
Foothill Course ID

Course Philosophy


Formerly Statement


Course Development Options


Basic Skill Status (CB08)
Course is not a basic skills course.
Grade Options
  • Letter Grade
  • Pass/No Pass
Repeat Limit
0

Transferability & Gen. Ed. Options


Transferability
Transferable to both UC and CSU

Units and Hours


Summary

Minimum Credit Units
4.5
Maximum Credit Units
4.5

Weekly Student Hours

Type               In Class   Out of Class
Lecture Hours      4.0        8.0
Laboratory Hours   1.5        0.0

Course Student Hours

Course Duration (Weeks)
12.0
Hours per unit divisor
36.0
Course In-Class (Contact) Hours
Lecture
48.0
Laboratory
18.0
Total
66.0
Course Out-of-Class Hours
Lecture
96.0
Laboratory
0.0
NA
0.0
Total
96.0
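
The totals above appear to follow from the weekly hours, the 12-week course duration, and the hours-per-unit divisor. A minimal sketch of that arithmetic, assuming credit units equal total student hours divided by the divisor (the variable names are illustrative and not part of the outline):

```python
# Values copied from the Units and Hours section of this outline.
weeks = 12.0
hours_per_unit_divisor = 36.0
weekly_hours = {
    "lecture": {"in_class": 4.0, "out_of_class": 8.0},
    "laboratory": {"in_class": 1.5, "out_of_class": 0.0},
}

# Course hours = weekly hours * number of weeks.
contact_hours = sum(h["in_class"] for h in weekly_hours.values()) * weeks
out_of_class_hours = sum(h["out_of_class"] for h in weekly_hours.values()) * weeks

# Assumption: units = (contact + out-of-class hours) / divisor.
credit_units = (contact_hours + out_of_class_hours) / hours_per_unit_divisor

print(contact_hours, out_of_class_hours, credit_units)  # 66.0 96.0 4.5
```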

Prerequisite(s)


CIS D041A

Corequisite(s)


Advisory(ies)


Limitation(s) on Enrollment


Entrance Skill(s)


General Course Statement(s)


Methods of Instruction


Lecture and visual aids

Discussion of assigned reading

Discussion and problem solving performed in class

In-class exploration of Internet sites

Quiz and examination review performed in class

Homework and extended projects

Collaborative learning and small group exercises

Collaborative projects

Laboratory discussion sessions and quizzes that evaluate the proceedings of the weekly laboratory exercises

Assignments


  1. Reading: required reading from the textbook and class notes
  2. Programs: 6-8 programming homework assignments, several with 100 or more lines of code.

Methods of Evaluation


  1. Evaluation of programming assignments and reports for correctness, use of design principles, documentation and efficiency.
  2. One or more examinations requiring programming ability to develop an algorithm, evaluate code segments, and write code using theories presented in the course.
  3. In-class lab problems, group collaborative problems, exam questions and/or online assignments or tutorials demonstrating the ability to read and analyze code through debugging and/or writing snippets of code.
  4. A final examination requiring programming ability to develop algorithms, evaluate code segments, and write code using theories presented in the course.

Essential Student Materials/Essential College Facilities


Essential Student Materials: 
  • None.
Essential College Facilities:
  • Computer lab with computers running the Python interpreter and Anaconda package

Examples of Primary Texts and References


Author, Title, Publisher, Date/Edition, ISBN
Igual, Laura and Segui, Santi: Introduction to Data Science, 1st Edition, Springer, ISBN: 978-3-319-50017-1, 2017
Saltz, Jeffrey: An Introduction to Data Science, 1st Edition, Sage Publishing, ISBN: 978-1506377537, 2018

Examples of Supporting Texts and References


Author, Title, Publisher
Hastie, Trevor: The Elements of Statistical Learning, 2nd Edition, Springer, ISBN: 978-0387848570, 2017

Learning Outcomes and Objectives


Course Objectives

  • Define and describe data science concepts
  • Apply mathematics and statistics building blocks
  • Apply programming constructs in data science
  • Collect data from multiple sources
  • Apply data wrangling techniques
  • Assess data sets
  • Visualize and present data
  • Apply a learning framework
  • Evaluate supervised learning
  • Evaluate unsupervised learning
  • Evaluate natural language processing

CSLOs

  • Collect, clean, analyze, and visualize data to meet and defend a measured objective.

  • Gather data, choose a model, train and tune the machine learning tool, and interpret the result.

Outline


  1. Define and describe data science concepts
    1. The role of data analytics
    2. Machine learning application
  2. Apply mathematics and statistics building blocks
    1. Linear algebra - notation, vector and matrix operations
    2. Statistics - measures such as mean, median, mode, standard deviation, outlier, correlation, confidence interval
  3. Apply programming constructs in data science
    1. Data Structures
      1. Array
      2. Data frame
    2. Data input / output
    3. Operators
    4. Functions
    5. Plots
  4. Collect data from multiple sources
    1. Define objective
    2. Web scraping
    3. Web API
    4. Importing files
      1. HTML
      2. text
      3. CSV
      4. JSON
  5. Apply data wrangling techniques
    1. Joining data
    2. Data validation
    3. Data cleaning
  6. Assess data sets
    1. Exploratory data analysis
    2. Data filtering, sorting
    3. Searching, retrieving data
    4. Time series
  7. Visualize and present data
    1. Univariate and multivariate plots
    2. Defense of results
  8. Apply a learning framework
    1. Loading dataset
    2. Learning and predicting
    3. Saving models
  9. Evaluate supervised learning
    1. Training, predicting
    2. Evaluating model success
  10. Evaluate unsupervised learning
    1. Clustering
    2. Outlier detection
  11. Evaluate natural language processing
    1. Rule based and statistical NLP
    2. Sentiment analysis
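
As an illustration of the statistics building blocks listed in the outline (mean, median, mode, standard deviation, outlier), the following is a minimal sketch using only the Python standard library. The sample data and the z-score threshold of 2 are assumptions chosen for the example; the outline does not prescribe a particular outlier rule.

```python
import statistics

# Hypothetical sample; the value 40 is deliberately far from the rest.
data = [2, 4, 4, 5, 7, 9, 40]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
stdev = statistics.stdev(data)  # sample standard deviation

# Assumed rule of thumb: flag values more than 2 standard deviations from the mean.
threshold = 2
outliers = [x for x in data if abs(x - mean) / stdev > threshold]

print(f"mean={mean:.2f} median={median} mode={mode} stdev={stdev:.2f}")
print("outliers:", outliers)
```

The mean is pulled well above the median by the single large value, which is one reason the outline pairs measures of center with outlier detection.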

Lab Topics


  1. Present, evaluate, and explain the role and application of data science in modern life.
  2. Solve mathematical and statistical problems with large data sets.
  3. Write code to perform mathematical and statistical work on input data to produce and present the result.
  4. Write code to locate and read data from multiple types of data sources.
  5. Write code to join, validate, and clean input data to prepare for analysis.
  6. Write code to filter, search, and sort data to reach a conclusion about trends in the data.
  7. Write code to utilize the appropriate plots to visualize data.
  8. Write code to work with a machine learning model to train and predict data.
  9. Write code to provide input data for a machine learning model for unsupervised learning.
  10. Write code to process text and predict an outcome.
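
Lab topics 4 through 6 (reading from more than one type of source, then joining, validating, cleaning, filtering, and sorting) might look roughly like the sketch below. It uses only the Python standard library; the inline CSV and JSON strings, the field names, and the score cutoff of 90 are all hypothetical stand-ins for real lab data.

```python
import csv
import io
import json

# Hypothetical inputs standing in for files from two different sources.
csv_text = "id,name,score\n1,Ana,88\n2,Ben,\n3,Cy,abc\n4,Dee,95\n"
json_text = '[{"id": 1, "major": "CS"}, {"id": 2, "major": "Math"}, {"id": 4, "major": "Stats"}]'

rows = list(csv.DictReader(io.StringIO(csv_text)))
majors = {item["id"]: item["major"] for item in json.loads(json_text)}

clean = []
for row in rows:
    # Validate and clean: drop rows whose score is missing or not a whole number.
    if not row["score"].isdigit():
        continue
    student_id = int(row["id"])
    # Join: attach the major from the JSON source, if one exists.
    clean.append({
        "id": student_id,
        "name": row["name"],
        "score": int(row["score"]),
        "major": majors.get(student_id, "unknown"),
    })

# Filter and sort: scores of at least 90, highest first.
top = sorted((r for r in clean if r["score"] >= 90),
             key=lambda r: r["score"], reverse=True)
print(top)
```

In a lab setting the same steps would more likely use a data frame library from the Anaconda distribution; the standard-library version is shown only to keep the sketch self-contained.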