• Data Science with Python Pandas Bootcamp 2020

    Development May 7, 2020
    The Complete Pandas Bootcamp 2020 Data Science with Python

    Data Science with Python

    This study covers Pandas 1.0 and gives guidance on how to transition from older versions to version 1.0. Python is a great platform and environment for data science: it offers powerful tools for data science, statistics, and machine learning, and the Pandas library is the brain of Python data science. Pandas allows you to import, clean, join, merge, concatenate, manipulate, and understand data, and to prepare or process it for data presentation, statistical analysis, and machine learning. All of these tasks require a high level of proficiency in Pandas. Data scientists usually spend up to 80% of their time manipulating data in Pandas. If you have just started learning the Python language, please go to the Python Beginner to Advanced Study.

    Duration:

    • 30+ Hours
    • 150+ Exercises

    Requirements:

    • Machine learning skills
    • Finance skills
    • Seaborn skills
    • A computer capable of storing and running Anaconda
    • Ideally some spreadsheet basics (Microsoft Excel, Google Sheets) or programming at a basic level

    What you’ll learn:

    • Improve data handling & analysis skills to an outstanding level
    • Practice relevant Pandas methods and workflows
    • Learn Pandas based on version 1.0
    • Import, clean and merge raw data and prepare data for machine learning
    • Analyze, visualize, and understand data with Matplotlib and Seaborn
    • Practice and master your Pandas skills with quizzes, 150+ exercises, and detail-oriented projects
    • Import financial/stock data from web sources and analyze it with Pandas
    • Learn how best to transition from old versions to the new Pandas version
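
    As a rough illustration of the import, clean, and merge steps listed above, here is a minimal sketch. The inline csv text, the column names, and the `regions` table are hypothetical stand-ins and do not come from the study materials.

```python
import io

import pandas as pd

# Hypothetical raw data; in practice this would be a file passed to pd.read_csv().
raw_csv = io.StringIO(
    "customer_id,name,amount\n"
    "1, Alice ,10.5\n"
    "2,Bob,\n"
    "3,Carol,7.0\n"
    "3,Carol,7.0\n"
)
orders = pd.read_csv(raw_csv)

# Clean: strip stray whitespace, drop the duplicated row, fill the missing amount.
orders["name"] = orders["name"].str.strip()
orders = orders.drop_duplicates(ignore_index=True)
orders["amount"] = orders["amount"].fillna(0.0)

# Merge with a second (hypothetical) table on the shared key column.
regions = pd.DataFrame({"customer_id": [1, 2, 3], "region": ["EU", "US", "EU"]})
merged = orders.merge(regions, on="customer_id", how="left")

# Aggregate per region as a simple preparation step for later analysis.
totals = merged.groupby("region")["amount"].sum()
print(totals)
```

    Filling the missing amount with 0 is only one possible choice; depending on the data, dropping the row or using a group-specific value may be more appropriate.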

    Purpose:

    The goal is to bring your data handling and analysis skills to the next level and to build your career in data science. This study is divided into 4 parts. It starts with Pandas basics, and it lets you test your skills in a detailed project challenge of the kind frequently used in data science job applications and assessment centres. In the last part of this study, you will learn how to import, analyze, and work with (financial) time series data.
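
    To give a feel for the (financial) time series part, here is a minimal sketch. The prices are made-up numbers rather than real market data, and the variable names are hypothetical.

```python
import pandas as pd

# Hypothetical closing prices; real data would be imported from a web source or csv file.
dates = pd.date_range("2020-01-01", periods=5, freq="D")
prices = pd.Series([50.0, 51.0, 49.0, 52.0, 55.0], index=dates, name="close")

# Normalize the series to a base value of 100 at the first observation.
normalized = prices / prices.iloc[0] * 100

# Simple daily returns; the first value is NaN because there is no prior day.
returns = prices.pct_change()

# Mean and standard deviation of returns as rough performance and risk measures.
mean_return = returns.mean()
risk = returns.std()
print(normalized.round(2), mean_return, risk, sep="\n")
```

    Normalizing to a common base makes several instruments with different price levels comparable on one chart.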

    Advantage:

    The world is getting more and more data-driven every day. New professions such as data scientist are gaining ground, with salaries of $100k+. It’s time to switch from the old environment (Microsoft Excel) to a highly tuned new environment (Pandas). If you want to use Python for data science and to replace old environments such as Microsoft Excel, then this study is a perfect match for you.

    Why take this study?

    • It is the most relevant and detail-oriented study on Pandas
    • It is the most up-to-date study, incorporating all the latest Pandas updates. The Pandas library has seen big improvements in recent months. From my experience, working with and relying on old, outdated code can be painful
    • It can serve as a Pandas dictionary covering all relevant methods, properties, and workflows for real-world projects. If you have a problem with any method or workflow, you will most likely find help and a solution in this study
    • It explains comprehensive real-world data workflows: starting from importing raw, messy data, through cleaning, merging/concatenating, grouping and aggregating data, and exploratory data analysis, to preparing and processing data for statistics, machine learning, and data presentations
    • Pandas is a very powerful technology, but it also has pitfalls that may lead to errors in your data. This study focuses on such mistakes and shows you what to do, and how, to avoid errors in your data
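
    One commonly cited pitfall of this kind is chained indexing, which the topic list also names. The sketch below is only an illustration with a hypothetical DataFrame; it shows the single-step `.loc` assignment that is generally recommended instead.

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"age": [25, 40, 31], "group": ["a", "b", "a"]})

# Chained indexing such as
#     df[df["age"] > 30]["group"] = "senior"
# assigns to a temporary object, so the original DataFrame may stay unchanged.

# A single .loc call selects rows and column in one step and modifies df itself.
df.loc[df["age"] > 30, "group"] = "senior"
print(df)
```

    Whether the chained form silently fails, warns, or raises depends on the Pandas version, which is one more reason to prefer the single `.loc` call.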

    Who this study is for:

    • Everyone who wants to step deeply into data science. Pandas is at the heart of everything
    • Data scientists who want to improve their data handling, data analysis, and manipulation skills
    • Everyone who wants to switch data projects from the old environment (Microsoft Excel) to the new and more powerful environment used in research and science
    • Investment and finance students & professionals

    Author: Alexander Hagmann
    Language: English
    Size: 11.21GB

    Study Topics:

    • 01 Getting Started
      1. 1.01 Overview / Student FAQ
      2. 1.02 Tips: How to get the most out of this course
      3. 1.03 Did you know that
      4. 1.04 Installation of Anaconda
      5. 1.05 Opening a Jupyter Notebook
      6. 1.06 How to use Jupyter Notebooks
      7. 1.07 How to tackle Pandas Version 1.0
    • 02 Part I
      1. 2.01 Intro to Tabular Data / Pandas
      2. 2.02 Download Part 1 Course Materials
    • 03 Pandas Basics (DataFrame Basics I)
      1. 3.01 Create your very first Pandas DataFrame (from csv)
      2. 3.02 Pandas Display Options and the methods head() and tail()
      3. 3.03 First Data Inspection
      4. 3.04 Built-in Functions, Attributes and Methods with Pandas
      5. 3.05 Make it easy: TAB Completion and Tooltip
      6. 3.06 Explore your own Dataset: Coding Exercise 1 (Intro)
      7. 3.07 Explore your own Dataset: Coding Exercise 1 (Solution)
      8. 3.08 Selecting Columns
      9. 3.09 Selecting one Column with the dot notation
      10. 3.10 Zero-based Indexing and Negative Indexing
      11. 3.11 Selecting Rows with iloc (position-based indexing)
      12. 3.12 Slicing Rows and Columns with iloc (position-based indexing)
      13. 3.13 Selecting Rows with loc (label-based indexing)
      14. 3.14 Slicing Rows and Columns with loc (label-based indexing)
      15. 3.15 Indexing and Slicing with reindex()
      16. 3.16 Summary, Best Practices and Outlook
      17. 3.17 Coding Exercise 2 (Intro)
      18. 3.18 Coding Exercise 2 (Solution)
      19. 3.19 Advanced Indexing and Slicing (optional)
    • 04 Pandas Series and Index Objects
      1. 4.01 First Steps with Pandas Series
      2. 4.02 Analyzing Numerical Series with unique(), nunique() and value_counts()
      3. 4.03 Analyzing non-numerical Series with unique(), nunique(), value_counts()
      4. 4.04 Creating Pandas Series (Part 1)
      5. 4.05 Creating Pandas Series (Part 2)
      6. 4.06 Indexing and Slicing Pandas Series
      7. 4.07 Sorting of Series and Introduction to the inplace parameter
      8. 4.08 nlargest() and nsmallest()
      9. 4.09 idxmin() and idxmax()
      10. 4.10 Manipulating Pandas Series
      11. 4.11 Coding Exercise 3 (Solution)
      12. 4.12 First Steps with Pandas Index Objects
      13. 4.13 Creating Index Objects from Scratch
      14. 4.14 Changing Row Index with set_index() and reset_index()
      15. 4.15 Changing Column Labels
      16. 4.16 Renaming Index and Column Labels with rename()
      17. 4.17 Coding Exercise 4 (Solution)
    • 05 DataFrame Basics II
      1. 5.01 Filtering DataFrames by one Condition
      2. 5.02 Filtering DataFrames by many Conditions (AND)
      3. 5.03 Filtering DataFrames by many Conditions (OR)
      4. 5.04 Advanced Filtering with between(), isin() and ~
      5. 5.05 any() and all()
      6. 5.06 Removing Columns
      7. 5.07 Removing Rows
      8. 5.08 Adding new Columns to a DataFrame
      9. 5.09 Creating Columns based on other Columns
      10. 5.10 Adding Columns with insert()
      11. 5.11 Creating DataFrames from Scratch with pd.DataFrame()
      12. 5.12 Adding new Rows (hands-on approach)
      13. 5.13 Coding Exercise 5 (Solution)
    • 06 Manipulating Elements in a DataFrame Slice: Important, know the Pitfalls
      1. 6.01 Best Practice (How you should do it)
      2. 6.02 Chained Indexing: How you should NOT do it (Part 1)
      3. 6.03 Chained Indexing: How you should NOT do it (Part 2)
      4. 6.04 View vs. Copy
      5. 6.05 Simple Rules: what to do when
      6. 6.06 Coding Exercise 6 (Solution)
    • 07 DataFrame Basics III
      1. 7.01 Sorting DataFrames with sort_index() and sort_values() (Version 1.0 Update)
      2. 7.02 Ranking DataFrames with rank()
      3. 7.03 nunique() and nlargest() / nsmallest() with DataFrames
      4. 7.04 Summary Statistics and Accumulations
      5. 7.05 The agg() method
      6. 7.06 Coding Exercise 7 (Solution)
      7. 7.07 User-defined Functions with apply(), map() and applymap()
      8. 7.08 Hierarchical Indexing (Part 1)
      9. 7.09 Hierarchical Indexing (Part 2)
      10. 7.10 String Operations (Part 1)
      11. 7.11 String Operations (Part 2)
      12. 7.12 Coding Exercise 8 (Solution)
    • 08 Visualization with Matplotlib
      1. 8.01 The plot() method
      2. 8.02 Customization of Plots
      3. 8.03 Histograms (Part 1)
      4. 8.04 Histograms (Part 2)
      5. 8.05 Barcharts and Piecharts
      6. 8.06 Scatterplots
      7. 8.07 Coding Exercise 9 (Solution)
    • 09 Importing Data
      1. 9.01 Importing csv-files with pd.read_csv()
      2. 9.02 Importing messy csv-files with pd.read_csv()
      3. 9.03 Importing Data from Excel with pd.read_excel()
      4. 9.04 Importing messy Data from Excel with pd.read_excel()
      5. 9.05 Importing Data from the Web with pd.read_html()
    • 10 Cleaning Data
      1. 10.01 First Inspection and Handling of inconsistent Data
      2. 10.02 String Operations
      3. 10.03 Changing Datatype of Columns with astype()
      4. 10.04 Intro: NA values / missing values
      5. 10.05 Detection of missing Values
      6. 10.06 Removing missing values
      7. 10.07 Replacing missing values
      8. 10.08 Intro: Duplicates
      9. 10.09 Detection of Duplicates
      10. 10.10 Handling / Removing Duplicates
      11. 10.11 The ignore_index parameter (NEW in Pandas 1.0)
      12. 10.12 Detection of Outliers
      13. 10.13 Handling / Removing Outliers
      14. 10.14 Categorical Data
      15. 10.15 Pandas Version 1.0: New dtypes and pd.NA
      16. 10.16 Coding Exercise 11 (Solution)
    • 11 Merging, Joining, and Concatenating Data
      1. 11.01 Adding Rows with append() and pd.concat() (Part 1)
      2. 11.02 Adding Rows with pd.concat() (Part 2)
      3. 11.03 Arithmetic with Pandas Objects / Data Alignment
      4. 11.04 Outer Joins with merge()
      5. 11.05 Inner Joins with merge()
      6. 11.06 Outer Joins (without Intersection) with merge()
      7. 11.07 Left Joins (without Intersection) with merge()
      8. 11.08 Right Joins (without Intersection) with merge()
      9. 11.09 Left Joins with merge()
      10. 11.10 Right Joins with merge()
      11. 11.11 Joining on different Column Names / Indexes
      12. 11.12 Joining on more than one Column
      13. 11.13 pd.merge() and join()
    • 12 GroupBy Operations
      1. 12.01 Intro
      2. 12.02 Understanding the GroupBy Object
      3. 12.03 Splitting with many Keys
      4. 12.04 split-apply-combine explained
      5. 12.05 split-apply-combine applied
      6. 12.06 Advanced aggregation with agg()
      7. 12.07 GroupBy Aggregation with Relabeling (NEW - Pandas Version 0.25)
      8. 12.08 Transformation with transform()
      9. 12.09 Replacing NA Values by group-specific Values
      10. 12.10 Generalizing split-apply-combine with apply()
      11. 12.11 Hierarchical Indexing with Groupby
      12. 12.12 stack() and unstack()
      13. 12.13 Coding Exercise 13 (Solution)
    • 13 Reshaping and Pivoting DataFrames
      1. 13.01 Transposing Rows and Columns
      2. 13.02 Pivoting DataFrames with pivot()
      3. 13.03 Limits of pivot()
      4. 13.04 pivot_table()
      5. 13.05 pd.crosstab()
      6. 13.06 Melting DataFrames with melt()
    • 14 Data Preparation and Feature Creation
      1. 14.01 Arithmetic Operations (Part 1)
      2. 14.02 Arithmetic Operations (Part 2)
      3. 14.03 Transformation / Mapping with map()
      4. 14.04 Conditional Transformation
      5. 14.05 Discretization and Binning with pd.cut() (Part 1)
      6. 14.06 Discretization and Binning with pd.cut() (Part 2)
      7. 14.07 Discretization and Binning with pd.qcut()
      8. 14.08 Floors and Caps
      9. 14.09 Scaling / Standardization
      10. 14.10 Creating Dummy Variables
      11. 14.11 String Operations
    • 15 Advanced Visualization with Seaborn
      1. 15.01 First Steps in Seaborn
      2. 15.02 Categorical Plots
      3. 15.03 Joint Plots / Regression Plots
      4. 15.04 Matrixplots / Heatmaps
    • 16 Part III
      1. 16.01 Olympic Medal Tables (Instruction and Hints)
      2. 16.02 Olympic Medal Tables (Solution Part 1)
      3. 16.03 Olympic Medal Tables (Solution Part 2)
      4. 16.04 Olympic Medal Tables (Solution Part 3)
    • 17 Time Series Basics
      1. 17.01 Importing Time Series Data from csv-files
      2. 17.02 Converting strings to datetime objects with pd.to_datetime()
      3. 17.03 Initial Analysis / Visualization of Time Series
      4. 17.04 Indexing and Slicing Time Series
      5. 17.05 Creating a customized DatetimeIndex with pd.date_range()
      6. 17.06 More on pd.date_range()
      7. 17.07 Downsampling Time Series with resample() (Part 1)
      8. 17.08 Downsampling Time Series with resample() (Part 2)
      9. 17.09 The PeriodIndex object
      10. 17.10 Advanced Indexing with reindex()
    • 18 Time Series Advanced / Financial Time Series
      1. 18.01 Getting Ready (Installing required package)
      2. 18.02 Importing Stock Price Data from Yahoo Finance (it still works!)
      3. 18.03 Initial Inspection and Visualization
      4. 18.04 Normalizing Time Series to a Base Value (100)
      5. 18.05 The shift() method
      6. 18.06 The methods diff() and pct_change()
      7. 18.07 Measuring Stock Performance with MEAN Returns and STD of Returns
      8. 18.08 Financial Time Series - Return and Risk
      9. 18.09 Financial Time Series - Covariance and Correlation
      10. 18.10 Helpful DatetimeIndex Attributes and Methods
      11. 18.11 Filling NA Values with bfill, ffill and interpolation
    • 19 What is new in Pandas Version 1.0
      1. 19.01 Intro and Overview
      2. 19.02 Important Recap: Pandas Display Options (Changed in Version 0.25)
      3. 19.03 info() method - new and extended output
      4. 19.04 NEW Extension dtypes (nullable dtypes): Why do we need them?
      5. 19.05 Creating the NEW extension dtypes with convert_dtypes()
      6. 19.06 NEW pd.NA value for missing values
      7. 19.07 The NEW nullable Int64Dtype
      8. 19.08 The NEW StringDtype
      9. 19.09 The NEW nullable BooleanDtype
      10. 19.10 Addition of the ignore_index parameter
      11. 19.11 Removal of prior Version Deprecations
    • 20 Python Basics
      1. 20.01 Intro
      2. 20.02 First Steps
      3. 20.03 Variables
      4. 20.04 Data Types: Integers and Floats
      5. 20.05 Data Types: Strings
      6. 20.06 Data Types: Lists (Part 1)
      7. 20.07 Data Types: Lists (Part 2)
      8. 20.08 Data Types: Tuples
      9. 20.09 Data Types: Sets
      10. 20.10 Operators and Booleans
      11. 20.11 Conditional Statements (if, elif, else, while)
      12. 20.12 For Loops
      13. 20.13 Keywords break, pass, continue
      14. 20.14 Generating Random Numbers
      15. 20.15 User Defined Functions (Part 1)
      16. 20.16 User Defined Functions (Part 2)
      17. 20.17 User Defined Functions (Part 3)
      18. 20.18 Visualization with Matplotlib
      19. 20.19 Python Basics Quiz: Solution
    • 21 The Numpy Package
      1. 21.01 Introduction to Numpy Arrays
      2. 21.02 Numpy Arrays: Vectorization
      3. 21.03 Numpy Arrays: Indexing and Slicing
      4. 21.04 Numpy Arrays: Shape and Dimensions
      5. 21.05 Numpy Arrays: Indexing and Slicing of multi-dimensional Arrays
      6. 21.06 Numpy Arrays: Boolean Indexing
      7. 21.07 Generating Random Numbers
      8. 21.08 Performance Issues
      9. 21.09 Case Study: Numpy vs. Python Standard Library
      10. 21.10 Summary Statistics
      11. 21.11 Visualization and (Linear) Regression
      12. 21.12 Numpy Quiz: Solution
    • 22 Statistical Concepts
      1. 22.01 Statistics - Overview, Terms and Vocabulary
      2. 22.02 Population vs. Sample
      3. 22.03 Visualizing Frequency Distributions with plt.hist()
      4. 22.04 Relative and Cumulative Frequencies with plt.hist()
      5. 22.05 Measures of Central Tendency (Theory)
      6. 22.06 Coding Measures of Central Tendency - Mean and Median
      7. 22.07 Coding Measures of Central Tendency - Geometric Mean
      8. 22.08 Variability around the Central Tendency / Dispersion (Theory)
      9. 22.09 Minimum, Maximum and Range with Python / Numpy
      10. 22.10 Percentiles with Python / Numpy
      11. 22.11 Variance and Standard Deviation with Python / Numpy
      12. 22.12 Skew and Kurtosis (Theory)
      13. 22.13 How to calculate Skew and Kurtosis with scipy.stats
      14. 22.14 How to generate Random Numbers with Numpy
      15. 22.15 Reproducibility with np.random.seed()
      16. 22.16 Probability Distributions - Overview
      17. 22.17 Discrete Uniform Distributions
      18. 22.18 Continuous Uniform Distributions
      19. 22.19 The Normal Distribution (Theory)
      20. 22.20 Creating a normally distributed Random Variable
      21. 22.21 Normal Distribution - Probability Density Function (pdf) with scipy.stats
      22. 22.22 Normal Distribution - Cumulative Distribution Function (cdf) with scipy.stats
      23. 22.23 The Standard Normal Distribution and Z-Values
      24. 22.24 Properties of the Standard Normal Distribution (Theory)
      25. 22.25 Probabilities and Z-Values with scipy.stats
      26. 22.26 Confidence Intervals with scipy.stats
      27. 22.27 Covariance and Correlation Coefficient (Theory)
      28. 22.28 Cleaning and preparing the Data - Movies Database (Part 1)
      29. 22.29 Cleaning and preparing the Data - Movies Database (Part 2)
      30. 22.30 How to calculate Covariance and Correlation in Python
      31. 22.31 Correlation and Scatterplots - visual Interpretation
      32. 22.32 What is Linear Regression (Theory)
      33. 22.33 A simple Linear Regression Model with Numpy and Scipy
      34. 22.34 How to interpret Intercept and Slope Coefficient
      35. 22.35 Case Study (Part 1): The Market Model (Single Factor Model)
      36. 22.36 Case Study (Part 2): The Market Model (Single Factor Model)
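
    Several topics above concern the nullable extension dtypes and pd.NA introduced with Pandas 1.0. The following is a minimal sketch of what convert_dtypes() does, using a hypothetical two-column DataFrame; the exact name of the resulting string dtype may differ between Pandas versions.

```python
import pandas as pd

# Hypothetical data with one missing value per column.
df = pd.DataFrame({"visits": [1, 2, None], "label": ["x", None, "z"]})

# With the classic dtypes, the missing value forces the integer column to float64.
print(df.dtypes)

# convert_dtypes() switches to the nullable extension dtypes where possible:
# "visits" becomes Int64, and missing entries are represented by pd.NA.
converted = df.convert_dtypes()
print(converted.dtypes)
print(converted)
```

    Keeping integers as integers despite missing values is the main practical benefit here, since the values no longer pick up a float representation.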
