shubham5728/Python_for_DATASCIENCE

Python_for_DATASCIENCE

Numpy

NumPy is the backbone of scientific computing, machine learning, and data science workflows in Python. This notebook showcases array creation, manipulation, broadcasting, linear algebra operations, and performance optimizations such as vectorization.

Link - (https://github.com/shubhamkumawat789/Python_for_DATASCIENCE/blob/main/Numpy.ipynb)

What I Explored:

  1. Introduction to NumPy – Overview of NumPy arrays vs Python lists.

  2. Array Creation – np.zeros, np.ones, np.arange, np.linspace.

  3. Array Attributes – Exploring shape, dimensions, item size, and data types.

  4. Reshaping & Indexing – Transforming arrays with .reshape() and accessing elements.

  5. Broadcasting – Demonstrating element-wise operations without loops.

  6. Mathematical Functions – Applying trigonometric and logarithmic functions.

  7. Linear Algebra – Dot product, QR decomposition, SVD, and matrix operations.

  8. Random Numbers – Using np.random for generating synthetic data.

  9. Vectorization – Comparing loop-based vs vectorized computations.

  10. Saving and Loading – Storing arrays in binary .npy format, in text formats such as .csv with custom delimiters, and bundling multiple arrays in a compressed .npz file.
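
As a rough illustration of several topics from the list above (attributes, broadcasting, linear algebra, vectorization, and saving), here is a minimal sketch. It is not taken from the notebook: the array values, variable names, and the in-memory buffer used in place of a file on disk are all assumptions made for the example.

```python
import io

import numpy as np

# Array creation and attributes: values 0..11 arranged as a 3x4 array.
grid = np.arange(12).reshape(3, 4)
print(grid.shape, grid.ndim, grid.itemsize, grid.dtype)

# Broadcasting: a (4,) row is stretched across all three rows of the
# (3, 4) array, so each column gets its own offset without a loop.
offsets = np.array([10, 20, 30, 40])
shifted = grid + offsets

# Linear algebra: QR decomposition and SVD of a small random matrix
# (illustrative data from a seeded generator).
rng = np.random.default_rng(seed=0)
matrix = rng.normal(size=(4, 3))
q, r = np.linalg.qr(matrix)
u, s, vt = np.linalg.svd(matrix, full_matrices=False)

# Vectorization: the same sum of squares computed with a Python loop
# and with a single dot product.
values = np.linspace(0.0, 1.0, 1001)
loop_total = 0.0
for v in values:
    loop_total += v * v
vector_total = np.dot(values, values)

# Saving and loading: several arrays in one compressed .npz archive.
# A BytesIO buffer stands in for a file path here.
buffer = io.BytesIO()
np.savez_compressed(buffer, grid=grid, shifted=shifted)
buffer.seek(0)
loaded = np.load(buffer)
print(sorted(loaded.files))
```

Passing a filename such as "arrays.npz" instead of the buffer would write the archive to disk; np.save and np.savetxt cover the single-array .npy and delimited text cases.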

Pandas

Pandas is a powerful and flexible open-source data analysis and manipulation library for Python. It provides data structures and functions designed to efficiently handle structured data, including tabular data such as spreadsheets and SQL tables.

Link - (https://github.com/shubhamkumawat789/Python_for_DATASCIENCE/blob/main/Pandas.ipynb)

What I Explored:

1. Introduction to Pandas: Learned what Pandas is and why it is used in data science (handling structured data, especially tables).

2. Core Data Structures: Series → one-dimensional labeled array. DataFrame → two-dimensional labeled table (rows & columns).

3. Data Handling: Creating Series and DataFrames from lists, dictionaries, and other data.

4. Viewing and inspecting datasets (head(), tail(), info(), describe()).

5. Indexing & Selection: Accessing rows/columns using labels (loc) and positions (iloc).

6. Data Cleaning: Handling missing values, replacing data, and dropping unnecessary columns/rows.

7. Basic Operations: Sorting, filtering, applying functions, and simple statistics.
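
The workflow in the list above (build, inspect, select, clean, then sort and filter) can be sketched roughly as follows. The `sales` table, its column names, and its values are hypothetical and exist only for this example; they do not come from the notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a DataFrame built from a dictionary of lists.
sales = pd.DataFrame(
    {
        "city": ["Pune", "Delhi", "Mumbai", "Jaipur"],
        "units": [12, np.nan, 7, 20],   # one missing value on purpose
        "price": [2.5, 4.0, 3.0, 2.0],
        "notes": ["", "", "", ""],      # an unnecessary column
    },
    index=["a", "b", "c", "d"],
)

# Inspecting the dataset.
print(sales.head(2))
sales.info()
print(sales.describe())

# Selection: loc uses labels, iloc uses integer positions.
by_label = sales.loc["c", "city"]
by_position = sales.iloc[0, 2]

# Cleaning: drop the unused column and fill the missing value.
cleaned = sales.drop(columns=["notes"])
cleaned["units"] = cleaned["units"].fillna(0)

# Basic operations: derive a column, filter rows, and sort.
cleaned["revenue"] = cleaned["units"] * cleaned["price"]
top = cleaned[cleaned["revenue"] > 20].sort_values("revenue", ascending=False)
print(top)
```

Filling the missing `units` value with 0 is just one choice; dropping the row with dropna() or filling with the column mean are equally common, depending on the data.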

Matplotlib

These are my learning notes on Matplotlib, one of Python’s most widely used libraries for data visualization. The notebook covers creating high-quality plots for analytics and reporting, customizing visualizations for clarity and storytelling, and applying visualization techniques to real datasets in a professional context.

Link - (https://github.com/shubhamkumawat789/Python_for_DATASCIENCE/blob/main/Matplotlib.ipynb)

What I Explored:

1. Basic Plotting Functions: Line plots, bar charts, histograms, scatter plots, and pie charts.

2. Customization: Adding labels, titles, and legends. Controlling colors, markers, line styles, and grid options. Enhancing readability with annotations and tick formatting.

3. Subplots & Multiple Figures: Managing layouts with multiple charts. Exploring figure sizes and aspect ratios.

4. Advanced Features: Stack plots and 3D plotting. Combining Matplotlib with libraries like NumPy and Pandas.

5. Saving Animations: Exporting animations to video files and to GIF format.
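
A small sketch of the plotting, customization, subplot, and animation-saving steps listed above might look like this. The data, titles, and file name are made up for illustration, the Agg backend is chosen only so the code runs without a display, and the GIF is written with Matplotlib's Pillow-based writer, which may differ from the approach used in the notebook.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; no window is opened

import matplotlib.pyplot as plt
import numpy as np
from matplotlib.animation import FuncAnimation, PillowWriter

x = np.linspace(0, 2 * np.pi, 100)

# Two subplots side by side with a custom figure size.
fig, (ax_line, ax_bar) = plt.subplots(1, 2, figsize=(8, 3))

# Line plot with color, line style, labels, legend, and grid.
ax_line.plot(x, np.sin(x), color="tab:blue", linestyle="--", label="sin(x)")
ax_line.set_title("Line plot")
ax_line.set_xlabel("x")
ax_line.set_ylabel("sin(x)")
ax_line.legend()
ax_line.grid(True)

# Bar chart with illustrative categories and values.
ax_bar.bar(["A", "B", "C"], [3, 7, 5], color="tab:orange")
ax_bar.set_title("Bar chart")
fig.tight_layout()

# Animation: shift the sine wave a little on every frame.
anim_fig, anim_ax = plt.subplots()
(wave,) = anim_ax.plot(x, np.sin(x))


def update(frame):
    wave.set_ydata(np.sin(x + 0.3 * frame))
    return (wave,)


animation = FuncAnimation(anim_fig, update, frames=10)

# Save the animation as a GIF in a temporary directory.
gif_path = os.path.join(tempfile.mkdtemp(), "wave.gif")
animation.save(gif_path, writer=PillowWriter(fps=10))
plt.close("all")
```

Saving to a video format such as .mp4 generally needs an external encoder like ffmpeg, which is why the GIF route through Pillow is often the easier one to set up.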

Seaborn

Seaborn is a Python data visualization library built on top of Matplotlib. It provides a high-level interface for creating attractive, informative statistical graphics with less code.

Link - (https://github.com/shubhamkumawat789/Python_for_DATASCIENCE/blob/main/Seaborn.ipynb)

What I Explored:

1. Built line plots and relational plots to study multi-dimensional relationships.

2. Applied bar, count, and categorical plots to analyze business, demographic, and financial data.

3. Leveraged violin, box, and strip plots to explore variability and detect outliers.

4. Used heatmaps and pair plots for pattern discovery and correlation mapping.

5. Enhanced visual appeal through color palettes, ordering, and styling techniques.
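
To give a flavor of the box plot, heatmap, ordering, and styling items above, here is a minimal sketch. The synthetic `data` table and its column names are assumptions made for the example (the notebook works with its own datasets), and the Agg backend is used only so the code runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; no window is opened

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic, illustrative data: spend and visits for two customer segments.
rng = np.random.default_rng(seed=1)
data = pd.DataFrame(
    {
        "segment": np.repeat(["retail", "online"], 50),
        "spend": np.concatenate(
            [rng.normal(50, 10, 50), rng.normal(70, 15, 50)]
        ),
        "visits": rng.integers(1, 20, 100),
    }
)

# Styling: apply a Seaborn theme to all following plots.
sns.set_theme(style="whitegrid")

fig, (ax_box, ax_heat) = plt.subplots(1, 2, figsize=(9, 4))

# Box plot with an explicit category order to compare variability.
sns.boxplot(
    data=data, x="segment", y="spend", order=["online", "retail"], ax=ax_box
)

# Heatmap of the correlation matrix, annotated and with a chosen colormap.
corr = data[["spend", "visits"]].corr()
sns.heatmap(corr, annot=True, cmap="viridis", ax=ax_heat)

fig.tight_layout()
```

Swapping sns.boxplot for sns.violinplot or sns.stripplot with the same arguments gives the other distribution views mentioned in the list.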

About

A comprehensive learning repository containing Python, mathematics, data visualization, machine learning, deep learning, and R programming code developed while building core data science skills.
