Welcome to the Nanodegree program

The Skills That Set You Apart

The Data Science Process

Learn the data science process, including how to build effective data visualizations and how to communicate with various stakeholders

Communicating to Stakeholders

Project: Write a Data Science Blog Post

In this project, learners will choose a dataset, identify three questions, and analyze the data to find answers to these questions. They will create a GitHub repository with their project, and write a blog post to communicate their findings to the appropriate audience. This project will help learners reinforce and extend their knowledge of machine learning, data visualization, and communication.

Introduction to Software Engineering

In this lesson, you’ll write production-level code and practice object-oriented programming, which you can integrate into machine learning projects.

Software Engineering Practices Pt I

Software Engineering Practices Pt II

OOP

Portfolio Exercise: Upload a Package to PyPI

Web Development

Portfolio Exercise: Deploy a Data Dashboard

Introduction to Data Engineering

ETL Pipelines

Introduction to NLP

Learn Natural Language Processing, one of the fields with the most real-world applications of deep learning

Machine Learning Pipelines

Disaster Response Pipeline

Project 1: Disaster Response Pipeline

Concepts in Experiment Design

Statistical Considerations in Testing

A/B Testing Case Study

Portfolio Exercise: Starbucks

Introduction to Recommendation Engines

Matrix Factorization for Recommendations

Recommendation Engines

Sentiment Prediction RNN

Convolutional Neural Networks

Transfer Learning

Weight Initialization

Autoencoders

Job Search

Find your dream job with continuous learning and constant effort

Refine Your Entry-Level Resume

Craft Your Cover Letter

Optimize Your GitHub Profile

Develop Your Personal Brand

Introduction

Data Pipelines: ETL vs ELT

A data pipeline is a generic term for any process that moves data from one place to another. For example, it could move data from one server to another.

ETL

An ETL pipeline is a specific and very common kind of data pipeline. ETL stands for Extract, Transform, Load. Imagine that you have a database containing web log data. Each entry contains the IP address of a user, a timestamp, and the link that the user clicked.

What if your company wanted to run an analysis of links clicked by city and by day? You would need another data set that maps IP addresses to cities, and you would also need to extract the day from each timestamp. With an ETL pipeline, you could run code once per day that would extract the previous day's log data, map each IP address to a city, aggregate link clicks by city, and then load the results into a new database. That way, a data analyst or scientist would have access to a table of log data by city and day, which is far more convenient than rerunning the same complex transformations on the raw web log data every time.
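To make that daily job concrete, here is a minimal sketch in Python using pandas and SQLite. The table and column names (web_logs, ip_to_city, clicks_by_city_day) and the database files are hypothetical placeholders; a production pipeline would point at your actual servers and would likely use an IP-geolocation service rather than a static lookup table.

```python
import sqlite3

import pandas as pd

# Hypothetical source and target databases; a real pipeline would point at
# your actual log store and analytics warehouse.
source = sqlite3.connect("web_logs.db")
target = sqlite3.connect("analytics.db")

# Extract: pull the previous day's raw log entries (ip, timestamp, link).
logs = pd.read_sql_query(
    "SELECT ip, timestamp, link FROM web_logs "
    "WHERE date(timestamp) = date('now', '-1 day')",
    source,
)

# Transform: map each IP address to a city via a (hypothetical) lookup
# table, extract the day from the timestamp, and count clicks per city/day.
ip_to_city = pd.read_sql_query("SELECT ip, city FROM ip_to_city", source)
logs = logs.merge(ip_to_city, on="ip", how="left")
logs["day"] = pd.to_datetime(logs["timestamp"]).dt.date
clicks_by_city_day = (
    logs.groupby(["city", "day"]).size().reset_index(name="click_count")
)

# Load: write the aggregated results into a new table for analysts to query.
clicks_by_city_day.to_sql(
    "clicks_by_city_day", target, if_exists="append", index=False
)
```

The key point is that the expensive transformation runs once, on a schedule, and analysts only ever touch the small pre-aggregated table.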

Before cloud computing, businesses stored their data on large, expensive, private servers. Running queries on large data sets, like raw web log data, could be expensive both economically and in terms of time. But data analysts might need to query a database multiple times even in the same day; hence, pre-aggregating the data with an ETL pipeline makes sense.

ELT

ELT (Extract, Load, Transform) pipelines have gained traction since the advent of cloud computing. Cloud computing has lowered the cost of storing data and running queries on large, raw data sets. Many cloud services, like Amazon Redshift, Google BigQuery, or IBM Db2, can be queried using SQL or a SQL-like language. With these tools, the data gets extracted, then loaded directly, and finally transformed at the end of the pipeline.
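For contrast, here is a hedged sketch of the same city-by-day analysis done ELT-style, using SQLite as a stand-in for a cloud warehouse client. The raw logs are assumed to have already been loaded unchanged into a hypothetical raw_web_logs table; the transformation is expressed in SQL at query time rather than in a scheduled pre-aggregation job.

```python
import sqlite3  # stand-in for a cloud warehouse client (Redshift, BigQuery, Db2)

# Extract + Load have already happened: the raw log rows were copied into
# the warehouse as-is, with no aggregation job run ahead of time.
warehouse = sqlite3.connect("warehouse.db")

# Transform at query time: the city/day aggregation is written in SQL
# against the raw table whenever an analyst needs it.
query = """
    SELECT c.city,
           date(l.timestamp) AS day,
           COUNT(*)          AS click_count
    FROM raw_web_logs AS l
    LEFT JOIN ip_to_city AS c ON c.ip = l.ip
    GROUP BY c.city, date(l.timestamp)
"""
for city, day, click_count in warehouse.execute(query):
    print(city, day, click_count)
```

Because the warehouse is cheap to query at scale, the raw data can stay raw, and each analyst shapes it on demand.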

However, ETL pipelines are still used even with these cloud tools. Oftentimes, it still makes sense to run ETL pipelines and store data in a more readable or intuitive format. This can help data analysts and scientists work more efficiently as well as help an organization become more data driven.