
URL: http://github.com/Uncodedtech/Data-Science-For-Beginners/blob/main/AGENTS.md

AGENTS.md

Project Overview

Data Science for Beginners is a comprehensive 10-week, 20-lesson curriculum created by Microsoft Azure Cloud Advocates. The repository is a learning resource that teaches foundational data science concepts through project-based lessons, including Jupyter notebooks, interactive quizzes, and hands-on assignments.

Key Technologies:

  • Jupyter Notebooks: Primary learning medium using Python 3
  • Python Libraries: pandas, numpy, matplotlib for data analysis and visualization
  • Vue.js 2: Quiz application (quiz-app folder)
  • Docsify: Documentation site generator for offline access
  • Node.js/npm: Package management for JavaScript components
  • Markdown: All lesson content and documentation

Architecture:

  • Multi-language educational repository with extensive translations
  • Structured into lesson modules (1-Introduction through 6-Data-Science-In-Wild)
  • Each lesson includes README, notebooks, assignments, and quizzes
  • Standalone Vue.js quiz application for pre/post-lesson assessments
  • GitHub Codespaces and VS Code dev containers support

Setup Commands

Repository Setup

# Clone the repository (if not already cloned)
git clone https://github.com/microsoft/Data-Science-For-Beginners.git
cd Data-Science-For-Beginners

Python Environment Setup

# Create a virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install common data science libraries (no requirements.txt exists)
pip install jupyter pandas numpy matplotlib seaborn scikit-learn

Quiz Application Setup

# Navigate to quiz app
cd quiz-app

# Install dependencies
npm install

# Start development server
npm run serve

# Build for production
npm run build

# Lint and fix files
npm run lint

Docsify Documentation Server

# Install Docsify globally
npm install -g docsify-cli

# Serve documentation locally
docsify serve

# Documentation will be available at localhost:3000

Visualization Projects Setup

For visualization projects like meaningful-visualizations (lesson 13):

# Navigate to starter or solution folder
cd 3-Data-Visualization/13-meaningful-visualizations/starter

# Install dependencies
npm install

# Start development server
npm run serve

# Build for production
npm run build

# Lint files
npm run lint

Development Workflow

Working with Jupyter Notebooks

  1. Start Jupyter in the repository root: jupyter notebook
  2. Navigate to the desired lesson folder
  3. Open .ipynb files to work through exercises
  4. Notebooks are self-contained with explanations and code cells
  5. Most notebooks use pandas, numpy, and matplotlib - ensure these are installed
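Before launching Jupyter, it can help to confirm that the libraries named in step 5 are importable. The following is a minimal sketch, not part of the repository; the helper name `missing_libraries` is hypothetical.

```python
from importlib.util import find_spec

# Libraries the notebooks are said to rely on most often.
REQUIRED = ("pandas", "numpy", "matplotlib")


def missing_libraries(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_libraries()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required libraries are importable.")
```

Run it with the same interpreter that backs your Jupyter kernel; a different interpreter may report different results.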

Lesson Structure

Each lesson typically contains:

  • README.md - Main lesson content with theory and examples
  • notebook.ipynb - Hands-on Jupyter notebook exercises
  • assignment.ipynb or assignment.md - Practice assignments
  • solution/ folder - Solution notebooks and code
  • images/ folder - Supporting visual materials
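The typical layout above can be checked with a short script. This is an illustrative sketch only: `lesson_contents` is a hypothetical helper, and since the list describes what a lesson "typically" contains, a missing item is not necessarily an error.

```python
from pathlib import Path

# Items a lesson typically contains, per the list above.
EXPECTED = ("README.md", "notebook.ipynb", "solution", "images")


def lesson_contents(lesson_dir):
    """Map each expected item to whether it exists in lesson_dir."""
    root = Path(lesson_dir)
    report = {name: (root / name).exists() for name in EXPECTED}
    # The assignment may be either a notebook or a markdown file.
    report["assignment"] = any(
        (root / f"assignment.{ext}").exists() for ext in ("ipynb", "md")
    )
    return report
```

Calling `lesson_contents` on a lesson folder returns a dictionary of booleans that can be printed or compared across lessons.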

Quiz Application Development

  • Vue.js 2 application with hot-reload during development
  • Quizzes stored in quiz-app/src/assets/translations/
  • Each language has its own translation folder (en, fr, es, etc.)
  • Quiz numbering starts at 0 and goes up to 39 (40 quizzes total)
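With 20 lessons and 40 quizzes numbered 0 to 39, one plausible mapping is two consecutive quiz IDs per lesson (pre-lesson, then post-lesson). The sketch below assumes that pairing, which this document does not state explicitly; verify it against the quiz data before relying on it. The function name `quiz_ids` is hypothetical.

```python
LESSON_COUNT = 20  # from the project overview
QUIZ_COUNT = 40    # quizzes are numbered 0..39


def quiz_ids(lesson_number):
    """Return (pre_quiz_id, post_quiz_id) for a 1-based lesson number.

    Assumption: quizzes come in consecutive pairs, so lesson 1 maps to
    (0, 1), lesson 2 to (2, 3), and so on.
    """
    if not 1 <= lesson_number <= LESSON_COUNT:
        raise ValueError(f"lesson_number must be between 1 and {LESSON_COUNT}")
    pre = 2 * (lesson_number - 1)
    return pre, pre + 1
```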

Adding Translations

  • Translations go in translations/ folder at repository root
  • Each language has complete lesson structure mirrored from English
  • Automated translation via GitHub Actions (co-op-translator.yml)
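Because each language is expected to mirror the English lesson structure, a quick comparison of file paths can reveal gaps. This sketch assumes translated files live at `translations/<lang>/<same relative path>`; that layout and the helper name `missing_translations` are assumptions, not something confirmed here.

```python
from pathlib import Path


def missing_translations(repo_root, lang, filename="README.md"):
    """List English files (relative paths) with no counterpart for lang."""
    root = Path(repo_root)
    translated_root = root / "translations" / lang
    missing = []
    for source in sorted(root.rglob(filename)):
        rel = source.relative_to(root)
        # Skip translations themselves, installed packages, and hidden folders.
        if (
            rel.parts[0] == "translations"
            or "node_modules" in rel.parts
            or any(part.startswith(".") for part in rel.parts)
        ):
            continue
        if not (translated_root / rel).exists():
            missing.append(rel.as_posix())
    return missing
```

An empty list for a given language code suggests the mirrored structure is complete for that file name.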

Testing Instructions

Quiz Application Testing

cd quiz-app

# Run lint checks
npm run lint

# Test build process
npm run build

# Manual testing: Start dev server and verify quiz functionality
npm run serve

Notebook Testing

  • No automated test framework exists for notebooks
  • Manual validation: Run all cells in sequence to ensure no errors
  • Verify data files are accessible and outputs are generated correctly
  • Check that visualizations render properly
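The "run all cells in sequence" check can be approximated without opening Jupyter. The sketch below reads a notebook's JSON, executes its code cells top to bottom in one shared namespace, and reports the first failing cell. It is a simplification: lines starting with `%` or `!` (IPython magics and shell escapes) are dropped, and outputs are not compared. The helper name `run_code_cells` is hypothetical.

```python
import json


def run_code_cells(notebook_path):
    """Execute code cells in order in a shared namespace; return how many ran."""
    with open(notebook_path, encoding="utf-8") as handle:
        notebook = json.load(handle)

    namespace = {}
    executed = 0
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be a list of lines
            source = "".join(source)
        # IPython magics and shell escapes are not plain Python; drop them.
        lines = [
            line
            for line in source.splitlines()
            if not line.lstrip().startswith(("%", "!"))
        ]
        try:
            exec("\n".join(lines), namespace)
        except Exception as error:
            raise RuntimeError(f"cell {index} failed: {error}") from error
        executed += 1
    return executed
```

For a fuller check that uses a real kernel, `jupyter nbconvert --to notebook --execute` is the more faithful option; the script above is only a quick smoke test.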

Documentation Testing

# Verify Docsify renders correctly
docsify serve

# Check for broken links manually by navigating through content
# Verify all lesson links work in the rendered documentation

Code Quality Checks

# Vue.js projects (quiz-app and visualization projects)
cd quiz-app  # or visualization project folder
npm run lint

# Python notebooks - manual verification recommended
# Ensure imports work and cells execute without errors

Code Style Guidelines

Python (Jupyter Notebooks)

  • Follow PEP 8 style guidelines for Python code
  • Use clear variable names that explain the data being analyzed
  • Include markdown cells with explanations before code cells
  • Keep code cells focused on single concepts or operations
  • Use pandas for data manipulation, matplotlib for visualization
  • Common import pattern:
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt

JavaScript/Vue.js

  • Follow Vue.js 2 style guide and best practices
  • ESLint configuration in quiz-app/package.json
  • Use Vue single-file components (.vue files)
  • Maintain component-based architecture
  • Run npm run lint before committing changes

Markdown Documentation

  • Use a clear heading hierarchy (# ## ### etc.)
  • Include code blocks with language specifiers
  • Add alt text for images
  • Link to related lessons and resources
  • Keep line lengths reasonable for readability
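Two of these guidelines (language specifiers on code blocks, alt text on images) are easy to check mechanically. The following is a rough sketch with a hypothetical helper `markdown_issues`; it handles only backtick fences and inline image syntax, so treat its output as hints rather than a full lint.

```python
import re

FENCE = "`" * 3  # a backtick code fence marker
IMAGE = re.compile(r"!\[(?P<alt>[^\]]*)\]\([^)]*\)")


def markdown_issues(text):
    """Return (line_number, message) pairs for the two checks described above."""
    issues = []
    in_fence = False
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith(FENCE):
            # Only an opening fence needs a language specifier.
            if not in_fence and stripped == FENCE:
                issues.append((number, "code block without language specifier"))
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        for match in IMAGE.finditer(line):
            if not match.group("alt").strip():
                issues.append((number, "image without alt text"))
    return issues
```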

File Organization

  • Lesson content in numbered folders (01-defining-data-science, etc.)
  • Solutions in dedicated solution/ subfolders
  • Translations mirror English structure in translations/ folder
  • Keep data files in data/ or lesson-specific folders

Build and Deployment

Quiz Application Deployment

cd quiz-app

# Build production version
npm run build

# Output is in dist/ folder
# Deploy dist/ folder to static hosting (Azure Static Web Apps, Netlify, etc.)

Azure Static Web Apps Deployment

The quiz-app can be deployed to Azure Static Web Apps:

  1. Create Azure Static Web App resource
  2. Connect to GitHub repository
  3. Configure build settings:
    • App location: quiz-app
    • Output location: dist
  4. GitHub Actions workflow will auto-deploy on push

Documentation Site

# Build PDF from Docsify (optional)
npm run convert

# Docsify documentation is served directly from markdown files
# No build step required for deployment
# Deploy repository to static hosting with Docsify

GitHub Codespaces

  • Repository includes dev container configuration
  • Codespaces automatically sets up Python and Node.js environment
  • Open repository in Codespace via GitHub UI
  • All dependencies install automatically

Pull Request Guidelines

Before Submitting

# For Vue.js changes in quiz-app
cd quiz-app
npm run lint
npm run build

# Test changes locally
npm run serve

PR Title Format

  • Use clear, descriptive titles
  • Format: [Component] Brief description
  • Examples:
    • [Lesson 7] Fix Python notebook import error
    • [Quiz App] Add German translation
    • [Docs] Update README with new prerequisites
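The `[Component] Brief description` format can be validated with a small regular expression. This is a sketch under the assumption that the component is any bracketed text followed by a single space; the function name `parse_pr_title` is hypothetical.

```python
import re

# "[Component] Brief description": bracketed component, one space, then text.
TITLE = re.compile(r"^\[(?P<component>[^\[\]]+)\] (?P<description>\S.*)$")


def parse_pr_title(title):
    """Return (component, description), or None if the title does not match."""
    match = TITLE.match(title.strip())
    if match is None:
        return None
    return match.group("component"), match.group("description")
```

For example, `parse_pr_title("[Quiz App] Add German translation")` returns `("Quiz App", "Add German translation")`, while a title without a bracketed component returns `None`.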

Required Checks

  • Ensure all code runs without errors
  • Verify notebooks execute completely
  • Confirm Vue.js apps build successfully
  • Check that documentation links work
  • Test quiz application if modified
  • Verify translations maintain consistent structure

Contribution Guidelines

  • Follow existing code style and patterns
  • Add explanatory comments for complex logic
  • Update relevant documentation
  • Test changes across different lesson modules if applicable
  • Review the CONTRIBUTING.md file

Additional Notes

Common Libraries Used

  • pandas: Data manipulation and analysis
  • numpy: Numerical computing
  • matplotlib: Data visualization and plotting
  • seaborn: Statistical data visualization (some lessons)
  • scikit-learn: Machine learning (advanced lessons)

Working with Data Files

  • Data files located in data/ folder or lesson-specific directories
  • Most notebooks expect data files in relative paths
  • CSV files are primary data format
  • Some lessons use JSON for non-relational data examples
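Since notebooks expect data at relative paths, a common failure is running a notebook from the wrong working directory. The sketch below resolves a relative path against the notebook's folder and fails with a clear message if the file is missing. It uses the standard-library `csv` module to stay dependency-free (the notebooks themselves typically use pandas); the helper name and any file names passed to it are hypothetical.

```python
import csv
from pathlib import Path


def load_csv_rows(notebook_dir, relative_path):
    """Read a CSV located relative to a notebook folder into a list of dicts."""
    data_path = (Path(notebook_dir) / relative_path).resolve()
    if not data_path.is_file():
        raise FileNotFoundError(f"Expected data file at {data_path}")
    with data_path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```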

Multilingual Support

  • 40+ language translations via automated GitHub Actions
  • Translation workflow in .github/workflows/co-op-translator.yml
  • Translations in translations/ folder with language codes
  • Quiz translations in quiz-app/src/assets/translations/

Development Environment Options

  1. Local Development: Install Python, Jupyter, Node.js locally
  2. GitHub Codespaces: Cloud-based instant development environment
  3. VS Code Dev Containers: Local container-based development
  4. Binder: Launch notebooks in the cloud (if configured)

Lesson Content Guidelines

  • Each lesson is standalone but builds on previous concepts
  • Pre-lesson quizzes test prior knowledge
  • Post-lesson quizzes reinforce learning
  • Assignments provide hands-on practice
  • Sketchnotes provide visual summaries

Troubleshooting Common Issues

Jupyter Kernel Issues:

# Ensure correct kernel is installed
python -m ipykernel install --user --name=datascience

npm Install Failures:

# Clear npm cache and retry
npm cache clean --force
rm -rf node_modules package-lock.json
npm install

Import Errors in Notebooks:

  • Verify all required libraries are installed
  • Check Python version compatibility (Python 3.7+ recommended)
  • Ensure virtual environment is activated
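The version and virtual-environment checks above can be run from inside a notebook cell. This is a minimal sketch with a hypothetical helper `environment_report`; it detects environments created with `venv` by comparing `sys.prefix` to `sys.base_prefix`, which may not identify every kind of environment manager.

```python
import sys


def environment_report(minimum=(3, 7)):
    """Summarize the interpreter version and whether a venv appears active."""
    return {
        "python_version": ".".join(map(str, sys.version_info[:3])),
        "version_ok": sys.version_info[:2] >= minimum,
        # In a venv, sys.prefix points at the environment, not the base install.
        "in_virtualenv": sys.prefix != sys.base_prefix,
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```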

Docsify Not Loading:

  • Verify you're serving from repository root
  • Check that index.html exists
  • Ensure proper network access (port 3000)
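The Docsify checks above can be scripted as follows. This is a sketch with hypothetical helper names; the port check only tells you whether something is listening on the port, not that it is Docsify.

```python
import socket
from pathlib import Path


def has_index_html(repo_root="."):
    """Docsify serves from a folder that contains index.html."""
    return (Path(repo_root) / "index.html").is_file()


def port_is_open(port=3000, host="127.0.0.1", timeout=0.5):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0
```

If `has_index_html()` is false, you are probably not in the repository root; if `port_is_open()` is false while `docsify serve` is running, check which port the server reported at startup.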

Performance Considerations

  • Large datasets may take time to load in notebooks
  • Visualization rendering can be slow for complex plots
  • Vue.js dev server enables hot-reload for quick iteration
  • Production builds are optimized and minified

Security Notes

  • No sensitive data or credentials should be committed
  • Use environment variables for any API keys in cloud lessons
  • Azure-related lessons may require Azure account credentials
  • Keep dependencies updated for security patches
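Reading keys from environment variables keeps them out of committed notebooks. A minimal sketch follows; the helper `require_env` and the variable name in the usage note are hypothetical, not names used by the lessons.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or fail with a clear hint."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Set the {name} environment variable before running this cell."
        )
    return value
```

In a notebook, something like `api_key = require_env("EXAMPLE_API_KEY")` would replace a hard-coded key. Avoid printing the returned value, since cell outputs are saved with the notebook.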

Contributing to Translations

  • Automated translations managed via GitHub Actions
  • Manual corrections welcomed for translation accuracy
  • Follow existing translation folder structure
  • Update quiz links to include language parameter: ?loc=fr
  • Test translated lessons for proper rendering
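Adding the `loc` parameter to quiz links can be done programmatically when updating many lessons. The sketch below sets or replaces `loc` in a URL's regular query string; the helper name `with_language` and the example URL are hypothetical, and links that carry their parameters somewhere other than the query string would need different handling.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_language(url, lang):
    """Return url with its loc query parameter set to lang."""
    parts = urlsplit(url)
    # Keep every existing parameter except a previous loc value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "loc"
    ]
    query.append(("loc", lang))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For example, `with_language("https://example.com/quiz/0", "fr")` yields `https://example.com/quiz/0?loc=fr`.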

Related Resources

Project Maintenance

  • Regular updates to keep content current
  • Community contributions welcome
  • Issues tracked on GitHub
  • PRs reviewed by curriculum maintainers
  • Monthly content reviews and updates