Introduction to Synthetic Data for Machine Learning
Chapter 1: Understanding Synthetic Data
Why Generate Artificial Data?
Real Data vs. Synthetic Data
Chapter 2: Basic Methods for Data Generation
The Idea of Data Generation Models
Generating Data from Statistical Distributions
Introduction to Rule-Based Systems
Generating Simple Numerical Data
Generating Simple Categorical Data
Hands-on Practical: Create Basic Synthetic Data
Chapter 3: Generating Synthetic Tabular Data
Understanding Tabular Data Structure
Independent Column Value Generation
Preserving Basic Column Correlations
Introduction to Data Anonymization Concepts
Hands-on Practical: Generate a Synthetic Table
Chapter 4: Introduction to Synthetic Image Data
Why Synthetic Data for Images?
Basic Image Properties: Pixels and Color
Creating Images with Simple Shapes and Patterns
Applying Noise and Simple Augmentations
Introduction to Rendering Simple Scenes
Challenges in Realistic Image Generation
Hands-on Practical: Generate Simple Synthetic Images
Chapter 5: Evaluating Synthetic Data Quality
Visual Inspection Methods
Basic Statistical Comparisons
Comparing Data Distributions
Concept of Fidelity vs. Utility
Chapter 6: Tools and Libraries Overview
Role of Software in Data Generation
Libraries for Basic Data Manipulation (NumPy, Pandas)
Introduction to the Faker Library
Libraries for Simple Image Manipulation (Pillow, scikit-image)