Manipulation
Data-Workbook-6-Python.docx covers Python basics from the bootcamp (December 2024 start), emphasizing control flow, loops, and data exploration. Days 2-4 build foundational scripting via classic challenges and CSV previews.
Control Structures: Solved FizzBuzz by looping over 1 to 100 and printing "fizz" for multiples of 3, "buzz" for multiples of 5, "fizzbuzz" for multiples of both, and the number itself otherwise; tested by pasting the full output.
File Handling: Loaded the nominal GDP per capita CSV into a DataFrame df; printed head(10) and tail(5), and selected the CountryTerritory and UNRegion columns.
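The FizzBuzz rules listed under Control Structures can be sketched as below. This is a minimal illustration, not the workbook's actual solution; the function name fizzbuzz and the results list are hypothetical names chosen for the example.

```python
def fizzbuzz(n: int) -> str:
    """Return the FizzBuzz word for n, or n itself as a string."""
    # Check divisibility by both 3 and 5 first, otherwise "fizz" or
    # "buzz" alone would match and "fizzbuzz" would never be reached.
    if n % 15 == 0:
        return "fizzbuzz"
    if n % 3 == 0:
        return "fizz"
    if n % 5 == 0:
        return "buzz"
    return str(n)


# Collect the results for 1..100 so they can be printed or inspected.
results = [fizzbuzz(i) for i in range(1, 101)]
print("\n".join(results))
```

Testing the combined case before the single cases is the one ordering detail that matters; the rest is a plain loop over the range.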
Explored the basics of the GDP dataset: the first and last rows, plus specific columns for regional insight. Day 4 group work in Day4PythonActivity.ipynb reinforced loops and conditionals through small data experiments and their outputs.
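The GDP preview steps might look roughly like the sketch below. The real CSV file name and its values are not given here, so the example builds a small in-memory stand-in: only the CountryTerritory and UNRegion column names come from the workbook notes, while the GDPPerCapita column and every row are hypothetical placeholders.

```python
import io

import pandas as pd

# Hypothetical placeholder rows standing in for the real GDP CSV.
rows = [
    f"Country {i},{'Europe' if i % 2 else 'Asia'},{1000 * i}"
    for i in range(1, 13)
]
csv_text = "CountryTerritory,UNRegion,GDPPerCapita\n" + "\n".join(rows)

# With a real file this would be pd.read_csv("<path to the CSV>").
df = pd.read_csv(io.StringIO(csv_text))

first_rows = df.head(10)  # first 10 rows
last_rows = df.tail(5)    # last 5 rows

# A list of column names inside the brackets returns a DataFrame
# containing just those columns.
regions = df[["CountryTerritory", "UNRegion"]]

print(first_rows)
print(last_rows)
print(regions)
```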
This work demonstrates Python fundamentals for scripting and automation, a foundation for more advanced tools such as Pandas and SQL in Liverpool data roles.
Data Manipulation and Analysis
Data-Workbook-6-Python.docx features Pandas exercises from the bootcamp (December 2024), building core data analysis skills on a student performance dataset. The Day 3 tasks demonstrate loading, exploration, and transformation workflows.
Data Loading/Exploration: Read CSV into DataFrame, displayed head(5), info() structure, describe() statistics with code/output screenshots.
Indexing/Slicing: Selected name column, name/mark columns, first 3 rows, filtered class=="Four" rows.
Manipulation: Added passed column (mark>=60), renamed mark→score, dropped passed column.
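The loading, slicing, and manipulation steps above can be sketched as follows. The column names name, class, mark, and gender and the class value "Four" come from the notes; the five sample rows are hypothetical and stand in for the real student CSV, whose contents are not reproduced here.

```python
import io

import pandas as pd

# Hypothetical sample rows; the real workbook reads a CSV file instead.
csv_text = """name,class,mark,gender
Alice,Four,75,female
Ben,Three,55,male
Cara,Four,88,female
Dev,Five,62,male
Eve,Three,49,female
"""
df = pd.read_csv(io.StringIO(csv_text))

# Exploration: preview, structure, and summary statistics.
print(df.head(5))
df.info()
print(df.describe())

# Indexing and slicing.
names = df["name"]                       # single column (Series)
name_mark = df[["name", "mark"]]         # two columns (DataFrame)
first_three = df.iloc[:3]                # first 3 rows by position
class_four = df[df["class"] == "Four"]   # boolean filter on class

# Manipulation: add a flag, rename a column, then drop the flag.
df["passed"] = df["mark"] >= 60
passed_flags = df["passed"].tolist()
df = df.rename(columns={"mark": "score"})
df = df.drop(columns=["passed"])
print(df)
```

Reassigning the results of rename and drop, rather than passing inplace=True, keeps each step explicit and easy to undo while experimenting.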
Performed aggregations: grouped by class for mean marks, counted students per class, and averaged marks by gender. Created a pivot table (class as rows, gender as columns, mark as values), added a grade column (A >= 85, B 70-84, C 60-69, D < 60), and sorted by mark in descending order.
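A rough sketch of these aggregation steps is shown below, again on hypothetical sample rows. The pivot uses the mean as its aggregation, which is the pandas default and an assumption here since the notes do not state it; the helper to_grade is a hypothetical name that encodes the grade bands listed above.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for the student CSV.
csv_text = """name,class,mark,gender
Alice,Four,75,female
Ben,Three,55,male
Cara,Four,88,female
Dev,Five,62,male
Eve,Three,49,female
"""
df = pd.read_csv(io.StringIO(csv_text))

# Group-wise aggregations.
mean_by_class = df.groupby("class")["mark"].mean()
count_by_class = df.groupby("class")["name"].count()
mean_by_gender = df.groupby("gender")["mark"].mean()

# Pivot: class as rows, gender as columns, mean mark as values.
# Combinations with no students appear as NaN.
pivot = df.pivot_table(
    index="class", columns="gender", values="mark", aggfunc="mean"
)


def to_grade(mark: float) -> str:
    """Map a mark to a letter grade: A >= 85, B 70-84, C 60-69, D < 60."""
    if mark >= 85:
        return "A"
    if mark >= 70:
        return "B"
    if mark >= 60:
        return "C"
    return "D"


df["grade"] = df["mark"].apply(to_grade)

# Highest marks first.
ranked = df.sort_values("mark", ascending=False)

print(mean_by_class, count_by_class, mean_by_gender, pivot, ranked, sep="\n\n")
```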
Demonstrates practical Pandas skills for ETL, grouping, and reporting, which are essential for Liverpool data pipelines alongside SQL and Power BI.
Kaggle Dataset Cleanup