Learn Data Science Tutorial – Full Course for Beginners
August 2, 2024
Learn data science in this full tutorial course for absolute beginners. Data science is considered the “sexiest job of the 21st century.” You’ll learn the important elements of data science. You’ll be introduced to the principles, practices, and tools that make data science the powerful medium for critical insight in business and research. You’ll have a solid foundation for future learning and applications in your work. With data science, you can do what you want to do, and do it better. This course covers the foundations of data science, data sourcing, coding, mathematics, and statistics.
💻 Course created by Barton Poulson from datalab.cc.
🔗 Check out the datalab.cc YouTube channel: https://www.youtube.com/user/datalabcc
🔗 Watch more free data science courses at http://datalab.cc/
⭐️ Course Contents ⭐️
⌨️ Part 1: Data Science: An Introduction: Foundations of Data Science
– Welcome (1.1)
– Demand for Data Science (2.1)
– The Data Science Venn Diagram (2.2)
– The Data Science Pathway (2.3)
– Roles in Data Science (2.4)
– Teams in Data Science (2.5)
– Big Data (3.1)
– Coding (3.2)
– Statistics (3.3)
– Business Intelligence (3.4)
– Do No Harm (4.1)
– Methods Overview (5.1)
– Sourcing Overview (5.2)
– Coding Overview (5.3)
– Math Overview (5.4)
– Statistics Overview (5.5)
– Machine Learning Overview (5.6)
– Interpretability (6.1)
– Actionable Insights (6.2)
– Presentation Graphics (6.3)
– Reproducible Research (6.4)
– Next Steps (7.1)
⌨️ Part 2: Data Sourcing: Foundations of Data Science (1:39:46)
– Welcome (1.1)
– Metrics (2.1)
– Accuracy (2.2)
– Social Context of Measurement (2.3)
– Existing Data (3.1)
– APIs (3.2)
– Scraping (3.3)
– New Data (4.1)
– Interviews (4.2)
– Surveys (4.3)
– Card Sorting (4.4)
– Lab Experiments (4.5)
– A/B Testing (4.6)
– Next Steps (5.1)
⌨️ Part 3: Coding (2:32:42)
– Welcome (1.1)
– Spreadsheets (2.1)
– Tableau Public (2.2)
– SPSS (2.3)
– JASP (2.4)
– Other Software (2.5)
– HTML (3.1)
– XML (3.2)
– JSON (3.3)
– R (4.1)
– Python (4.2)
– SQL (4.3)
– C, C++, & Java (4.4)
– Bash (4.5)
– Regex (5.1)
– Next Steps (6.1)
⌨️ Part 4: Mathematics (4:01:09)
– Welcome (1.1)
– Elementary Algebra (2.1)
– Linear Algebra (2.2)
– Systems of Linear Equations (2.3)
– Calculus (2.4)
– Calculus & Optimization (2.5)
– Big O (3.1)
– Probability (3.2)
⌨️ Part 5: Statistics (4:44:03)
– Welcome (1.1)
– Exploration Overview (2.1)
– Exploratory Graphics (2.2)
– Exploratory Statistics (2.3)
– Descriptive Statistics (2.4)
– Inferential Statistics (3.1)
– Hypothesis Testing (3.2)
– Estimation (3.3)
– Estimators (4.1)
– Measures of Fit (4.2)
– Feature Selection (4.3)
– Problems in Modeling (4.4)
– Model Validation (4.5)
– DIY (4.6)
– Next Step (5.1)
—
Learn to code for free and get a developer job: https://www.freecodecamp.org
Read hundreds of articles on programming: https://www.freecodecamp.org/news
Comments (44)
@iyoleleiyolele2194
Thank you ❤. I'm not a techy person. However the way you handle the material is so smooth, so compelling that my fears have disappeared and I'm ready to start learning
@patilsgaming992
Where do we use math in this? Why do people say you should know math to learn data science?
@Ai4infinity
Subscribe to our YouTube channel for data science tutorials and tips. Drop your email ID in the comments to get a free Data Science Interview Kit! Do it soon! Don't miss out!
@ahemeddidbag-qs3qf
Can anyone be my friend
@Totoro_1
1:37:00
@lowbudgettravelerbd
Finished the full 6-hour course. Very informative. I was thinking of learning R for my research analysis. Before that, I thought I needed some foundation in data science, and I saw this video. Thanks a lot ❤
@kzp8gamer333
bro what about data scientist in india ??
@alexmckinley79
His name, was Barton Poulson.
@AniManiaHaven
This course is geared more toward real, practical advice about the work that needs to be done, all in one video, not toward actually learning ML, DL, LLM, CV, or any specific skill. Not exactly what I was looking for.
@sergiomatiash5977
Low value stuff…only blah blah blah
@PrasThapa
Your voice and way of teaching ❤❤
@subhammakadia
I would say it's worth watching. The best thing to happen in a while.
@leomessi-bq4yx
thank you
@822dudes9
4:43:33
@zahidnisar3857
I lost my recent started data science course
@evin1878
Day 1: 32:11
@evin1878
Day 1; 32:10
@alireza61939
Can i get internship as data scientist just with learning from this tutorial????????
@sanikasalunkhe5637
A hearty thanks,
❤
I just completed my 10th boards,
and was searching for the best career option for myself, with all the clarifications.
And this video made me get it perfectly, with no doubt remaining in me.
And really, the ones who contributed to making this video are psychologists who actually know how we (students) think!
It really meant a lot ❤❤❤
@noziphomahlangu5125
Great content, your voice is soo soothing and interesting to listen to
@vedantraut327
Wonderful voice. Quality information
@JUSTINPCOMEDY
this is amazing lecture, this is my first time on YouTube to come across such amazing course, thank you so much sir
@ghpjoker
Amazing course! I'm making a turn in my career after 10 years of school teaching, and this course is just what I needed to complement my studies in DS. Just one erratum: at 2:13:20 you mentioned "data de novo" as your personal expression for the concept of "new data". I don't know if you meant this expression to come from Portuguese or Latin, but if you took it from Portuguese, the appropriate expression would be "Nova Data" or "Novo Dado". In Portuguese, "de novo" means "again", or to do something one more time.
@lucasmassawe9700
found it useful Thank you
@Ayesha_F
For my use as bookmark 4:52:00
@diswlszka
bookmark: 1:54:22
@ranggayogiswara5148
This video is highly valuable. Unfortunately, this video is old and there is a high probability that many things have changed.
@ricardomiranda9219
"DS is sexy…"❤
@cuneytkaymak4997
dude, you sound exactly like 3Blue1Brown. Are you two the same??
@kaykwanu
🎯 Key Takeaways for quick navigation:
00:02 Data Science Creativity
02:48 Data Inclusivity Insight
03:42 Data Science Demand
08:07 Data Science Ingredients
11:49 Data Science Pathway
19:34 Data Science Roles
Diverse Data Science
Teamwork Makes Unicorns
Data Science vs. BI
Privacy, Anonymity, Proprietary
Copyright, Data Security
Potential Bias, Overconfidence
01:05:08 Statistical models utility.
01:06:01 Machine learning overview.
01:09:08 Clear communication crucial.
01:14:10 Simplify presentation graphics.
01:19:33 Actionable insights importance.
Clear, simple charts
Storytelling with data
Reproducible research
01:46:31 Metrics & Methods Balance
01:48:24 Accuracy Metrics Overview
01:51:00 Social Context Awareness
01:54:14 Data Sourcing Methods
02:01:23 Utilizing APIs in Data Retrieval
APIs simplify web data
Scraping retrieves web data
Mind copyright laws
Experimental Research Benefits: Random assignment minimizes confounds.
Challenges of Experimentation: Training, time-consuming, expensive.
A/B Testing Overview: Compare webpage versions for optimization.
A/B Testing Tools: Optimizely, VWO for statistical analysis.
Data Sourcing: Explore, consider vendors, create new data.
Importance of Spreadsheets: Ubiquitous, versatile, essential for data manipulation.
Tidy Data Concept: Structured format crucial for analysis.
Tableau for Visualization: Powerful, insightful, available in free version.
Download Tableau, Install
Bring in Data
Create Graphs
03:08:10 Collaborative OSF Analysis
03:09:09 Diverse Software Choices
03:18:43 Web Data Basics
Structure Data with JSON
R: Language of Data
Python: General Purpose
SQL: Language of Databases
C/C++/Java: Fast, Reliable
Bash: Command Line
Command line interaction predates monitors.
Shells wrap around computer interaction.
Bash and PowerShell are common shells.
Bash utilities focus on simplicity.
Regular expressions are powerful search tools.
Mathematics is vital for data science.
Algebra is foundational in data science.
Linear algebra is key for manipulating data.
04:10:26 Matrix representation explained.
04:12:14 Linear algebra benefits.
04:17:34 Graphical system solutions.
04:21:10 Derivative calculation.
04:28:14 Maximizing revenue.
04:29:59 Optimize Price Revenue
04:31:41 Big O Growth
04:44:03 Arithmetic Probability
04:49:04 Test result probability: 81.6%
04:49:57 Positive test: 32.1%
04:57:37 Explore data thoroughly
05:07:48 Robust statistics stability
05:09:10 Resampling principle explanation
05:10:06 Transforming variables concept
05:26:55 Hypothesis Testing Basics
05:28:17 False Positive Concept
05:29:13 False Negative Concept
05:31:06 Critiques of Hypothesis Testing
05:31:55 Hypothesis Testing Value
05:32:49 Estimation Introduction
05:33:42 Confidence Intervals Overview
05:36:03 Accuracy vs Precision
05:37:21 Interpreting Confidence Intervals
05:40:52 Estimators Overview
05:46:08 Measures of Fit Explanation
05:47:01 R²: Measure variance.
05:47:30 -2 Log-likelihood: Nested model fit.
05:47:55 Model variations: AIC, BIC.
05:48:24 Chi-squared: Observed vs. expected.
05:48:53 Feature selection: Reduce overfitting.
05:49:19 Multicollinearity: Predictor overlap.
05:50:12 P values: Individual predictor significance.
05:50:40 Betas: Standardized coefficients.
05:51:10 Newer methods: Dominance, Commonality, Relative Importance.
05:51:40 Common modeling problems: Non-Normality, Non-Linearity, Multicollinearity, Missing Data.
05:52:09 Dimensionality: Reducing variables.
05:52:38 Model validation: Bayes, Replication, Holdout, Cross-Validation.
05:53:07 DIY attitude: Start now.
05:53:36 Beware critics: Mistakes happen.
05:53:56 Data value: All data matters.
05:54:05 Continuous improvement mindset.
05:54:42 Explore and analyze.
05:55:01 Domain expertise matters.
05:55:20 Start now.
05:54:05 Additional conceptual courses.
05:54:05 Practical hands-on tutorials.
05:54:05 "Write what you know".
05:54:05 Domain expertise importance.
05:54:05 You don't have to be perfect.
05:54:05 Just get started.
Made with HARPA AI
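The two probability takeaways above (04:49:04 and 04:49:57) belong to the course's probability segment, which the other takeaway list on this page ties to Bayes' theorem. As a minimal sketch, the calculation can be written in a few lines of Python. The function name `posterior` and the three input values (a 5% base rate, a 90% true-positive rate, and a 10% false-positive rate) are assumptions chosen for illustration and are not quoted anywhere on this page. With these assumed inputs the result comes out to about 32.1%.

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """Return P(condition | positive test) using Bayes' theorem."""
    # Share of everyone who has the condition AND tests positive.
    true_pos = prior * true_positive_rate
    # Share of everyone who lacks the condition but still tests positive.
    false_pos = (1 - prior) * false_positive_rate
    # Among all positive tests, the fraction that are true positives.
    return true_pos / (true_pos + false_pos)


# Assumed, illustrative inputs (not taken from this page).
p = posterior(prior=0.05, true_positive_rate=0.90, false_positive_rate=0.10)
print(f"P(condition | positive test) = {p:.1%}")
```

The point of the sketch is that a positive result from a fairly accurate test can still leave the probability of having the condition well below 50% when the condition itself is rare.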
@Thelazyguyulove
I'm in class 11. Can I learn data science? 🥲
@Dongnanjie
Love it. Thank you, Barton!!
@emmanuelokpenkedime8596
This is really good, very well put together, concise and easy to understand. I am totally new to data science and this has seriously got me considering learning about it further. Hopefully with a view of working in this field one day. Thank you
@sobhangh3690
Amaziiingggg
@SomeReal_Diaries
masterpiece … thank uhh
@zawahirmohamed9414
It's very useful. It would be great if you could share this PPT. Thanks again.
@rafael_Reis_rv
Very good content !
@Mari_Selalu_Berbuat_Kebaikan
Let's always do a lot of good ❤️
@user-to9gv3if2b
🎯 Key Takeaways for quick navigation:
02:48 🌐 Data Science is inclusive analysis, involving all data to provide the most insightful answers to research questions.
14:33 🌐 Data Science involves diverse skills and backgrounds, encompassing coding, statistics, math, and domain expertise, making it a compelling career alternative.
23:09 🦄 Data Science Diversity: Data science is diverse, involving people with different goals, skills, and experiences working in various contexts, making it a rich and interconnected field.
24:34 🤖 Unicorn Analogy: The term "unicorn" is used in data science to describe a mythical data scientist with universal abilities in coding, statistics, design, business, and management. However, in reality, such individuals are rare, and collaboration among specialists is more common.
35:25 📊 Data Science vs. Statistics: Data science and statistics share common procedures, but they differ in training backgrounds, goals, and contexts, highlighting their conceptual distinctions despite overlapping elements.
47:08 🧠 Data science analyses are limited simplifications; humans are essential for interpretation and application. Overconfidence in algorithmic results can lead to incorrect conclusions.
48:25 🌐 Data science projects can't be neutral; algorithms reflect the biases of their creators. Good judgment is vital for the quality and success of a data science project.
58:25 🧮 Math is foundational in data science; understanding procedures, addressing issues, and some manual calculations are essential for informed decisions.
01:10:01 📈 When conducting data analysis, focus on maximizing the story to maximize value. Clearly align the narrative with specific goals, especially when answering client queries.
01:13:47 📊 In presenting data analysis, adhere to the principle of being minimally sufficient. Embrace simplicity, as emphasized by quotes like "Everything should be made as simple as possible, but not simpler."
01:19:33 🎯 Data science is goal-focused. When communicating results, provide specific, justifiable next steps based on the analysis. Consider the social, political, and economic context for actionable insights.
[01:34:28](https://youtu.be/ua-CiDNNj30?t=5668s) 📘 Use narrative methods like Jupyter Notebooks or RMarkdown to document and share the data analysis process, allowing for transparency and understanding of conclusions.
[01:36:46](https://youtu.be/ua-CiDNNj30?t=5806s) 🚀 Next steps after the tutorial: Explore coding in R or Python, dive into data visualization, brush up on statistics and math, explore machine learning, and consider community involvement in data science conferences and projects.
[01:39:27](https://youtu.be/ua-CiDNNj30?t=5967s) 🌐 Data science is democratic and essential for everyone. Encourages learning to work with data intelligently and sensitively, emphasizing its fundamental importance for all.
02:12:52 🕵️‍♀️ Data scraping from webpages, PDFs, images, and media when no API is available. Code scraping examples in R and Python. Emphasizes respect for copyright and privacy to avoid legal issues.
[02:16:06](https://youtu.be/ua-CiDNNj30?t=9766s) 🎙️ Interviews are valuable for new situations or audiences, offering open-ended information. Structured interviews have predetermined questions, while unstructured interviews are more conversational and varied.
[02:18:26](https://youtu.be/ua-CiDNNj30?t=11006s) 📊 Surveys are effective for obtaining data quickly, but clarity in question wording and response scales is crucial. Beware of bias, and ensure the questions align with the audience's understanding.
[02:25:42](https://youtu.be/ua-CiDNNj30?t=15442s) ⚗️ Laboratory experiments are crucial for determining cause and effect, requiring specialized training. They offer reliable information but can be time-consuming and expensive.
02:40:02 🧹 Spreadsheets excel in tasks like data browsing, sorting, rearranging, finding/replacing, formatting, transposing, tracking changes, creating pivot tables, and arranging output for consumption.
02:41:51 🔍 When working with spreadsheets, maintaining "Tidy Data" is crucial for easy transfer between programs. Tidy Data involves having equivalent columns for variables and rows for cases, ensuring one sheet per file, and maintaining a consistent level of measurement per file.
02:54:39 📊 SPSS (Statistical Package for the Social Sciences) is a desktop program used in academic and business research, known for its point-and-click interface and drop-down menus. The program is available for free for students, with paid versions for others.
02:59:14 📊 SPSS provides various options for data analysis, including descriptive statistics and visualization tools like stem-and-leaf plots and box plots.
03:02:22 🆕 JASP, a free and open-source alternative to SPSS, offers intuitive features, replicability, and includes Bayesian approaches.
03:09:09 💻 A wide range of data science tools, including SAS, Stata, MATLAB, Wolfram Alpha, RapidMiner, KNIME, SOFA Statistics, and more, are discussed with considerations for functionality, ease of use, community support, and cost.
03:34:45 🐍 Python, a general-purpose language, is popular in data science. It has a vast community, multiple versions (2.x and 3.x), and interfaces like Jupyter. Python's strength lies in numerous packages, including NumPy, Pandas, and scikit-learn.
03:40:12 💽 SQL, the language of databases, is crucial in data science. It excels in relational databases (RDBMS), with popular choices like Oracle, SQL Server, MySQL, and PostgreSQL. SQL minimizes data redundancy and is often used via GUIs like SQL Developer.
03:43:44 📊 SQL commands: Learn essential SQL commands, including SELECT, FROM, WHERE, and ORDER BY, for efficient data extraction and organization from relational databases.
03:44:41 ⚙️ Data Science Languages: C, C++, and Java serve as foundational languages in data science, particularly for the back end, offering speed and reliability.
04:06:51 🧮 Algebra is crucial to data science, enabling the combination of scores and various manipulations. Linear algebra, also known as matrix algebra, is the next step, representing data with vectors and matrices, making computations more efficient.
04:29:59 📈 Lowering the cost by 20%, from $500 to $400 per year, can increase sales by 33%, leading to a 7% increase in total revenue.
04:35:51 📊 Understanding Big O helps optimize algorithms, considering time and space complexity, crucial for efficient data processing.
04:41:19 🎲 Probability calculations involve adding or multiplying probabilities, considering overlaps and conditional probabilities.
04:49:31 🔄 Positive test result doesn't guarantee disease; Bayes theorem crucial for accurate probability calculations.
04:51:23 🎯 Focus on goals in data science; understand procedures, diagnose problems, and prioritize meaning.
05:09:40 📊 Tukey's ladder of powers helps transform skewed data; exploring numerical distributions aids in understanding data stability, outliers, and skewness.
05:11:04 📈 Descriptive statistics involve center, spread, and shape. Common measures include mode, median, mean for center; range, interquartile range, variance, and standard deviation for spread.
05:19:02 🔄 Measures of spread (range, interquartile range, variance, and standard deviation) have pros and cons; variance and standard deviation are less intuitive but more useful in data science.
05:20:47 📊 Understanding the shape of the distribution (symmetrical, skewed, unimodal, bimodal, uniform, u-shaped) is crucial for interpreting numerical summaries like mean and standard deviation.
05:26:01 📊 Inferential statistics involve sampling data from populations, adjusting for sampling error; common approaches include hypothesis testing (null and alternative hypotheses) and estimation.
05:51:10 🔄 Multicollinearity, the association between predictors, poses challenges in regression analysis. Methods like stepwise regression, commonality analysis, dominance analysis, and relative importance weights help address multicollinearity issues.
05:51:40 📊 Non-Normality, Non-Linearity, Multicollinearity, and Missing Data are common problems in modeling. Skewed distributions, outliers, and mixed distributions impact the assumptions of statistical procedures. Strategies include data transformation, polynomial terms, and reducing variables.
05:52:38 📉 Model Validation is crucial for assessing the generalizability of statistical models. Approaches like Bayesian methods, replication, holdout validation, and cross-validation help evaluate model performance on different data sets.
05:53:36 🛠️ Adopt a DIY (Do It Yourself) attitude in data science. Emphasize the importance of getting started, align methods with goals, focus on usability, and beware of trolls and critics. Acknowledge that no analysis is perfect, but the goal is to add value to the understanding of the data.
Made with HARPA AI
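The pricing takeaway at 04:29:59 in the list above is easy to check by hand. The sketch below redoes the arithmetic in Python using the prices given there ($500 and $400 per year). Reading "33%" as exactly one third more sales is an assumption, and the variable names are illustrative.

```python
old_price, new_price = 500, 400   # dollars per year, as stated in the takeaway
sales_multiplier = 4 / 3          # assumption: "33% more sales" read as one third more

# Relative change in price: 400 / 500 - 1, i.e. a 20% cut.
price_change = new_price / old_price - 1

# Revenue is price times units sold, so the relative change in revenue is
# (new price x sales multiplier) / old price - 1.
revenue_change = (new_price * sales_multiplier) / old_price - 1

print(f"price change:   {price_change:+.0%}")
print(f"revenue change: {revenue_change:+.1%}")
```

With one third more sales, revenue rises by about 6.7%, which matches the takeaway's 7% after rounding. Taking 33% literally (a multiplier of 1.33) gives about 6.4% instead.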
@grainnetaggart
photos and videos
@sarthakturakhia2922
1:00:00
@agustinamp2483
Can someone tell me if this course is old? Or is it worth taking now?
@nickbringefer723
17:06
@ankitabafila722
2:00:59