As a personal passion project, I explore synthetic sustainable fashion survey data in R to see how different variables affect consumers' tendency to shop sustainably. Age and occupation are the main variables investigated, as they bear most directly on the decision to shop sustainably.
Using synthetic data from Salifort Motors, this project examines employee retention rates and analyzes the data to find ways to increase retention. It involves designing a model that predicts whether an employee will leave the company based on their department, number of projects, average monthly hours, and any other data points that may affect that outcome.
This project primarily involves data cleaning and preparation for analysis. I begin by removing duplicates, then standardize the data, adjust data types for visualization, and address or remove null values.
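The cleaning steps described above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the rows, the column names (`name`, `signup`, `spend`), and the `clean` helper are all hypothetical. Text is standardized before duplicates are checked so that rows differing only in whitespace or casing are caught.

```python
from datetime import date, datetime

# Hypothetical raw rows; column names and values are invented for illustration.
raw_rows = [
    {"name": " alice ", "signup": "2023-01-05", "spend": "120.50"},
    {"name": "Alice", "signup": "2023-01-05", "spend": "120.50"},  # duplicate once standardized
    {"name": "bob", "signup": "2023-02-10", "spend": ""},          # missing spend
    {"name": "", "signup": "2023-03-01", "spend": "80"},           # missing required name
]


def clean(rows):
    """Standardize text, drop duplicates, convert types, and handle missing values."""
    seen = set()
    cleaned = []
    for row in rows:
        # Standardize: trim whitespace and use consistent casing.
        name = row["name"].strip().title()
        signup = row["signup"].strip()
        spend = row["spend"].strip()

        # Remove rows that are missing a required field.
        if not name or not signup:
            continue

        # Remove duplicates, compared on the standardized values.
        key = (name, signup, spend)
        if key in seen:
            continue
        seen.add(key)

        # Adjust data types: dates become date objects, amounts become floats.
        # A missing optional amount is kept as None rather than guessed.
        cleaned.append({
            "name": name,
            "signup": datetime.strptime(signup, "%Y-%m-%d").date(),
            "spend": float(spend) if spend else None,
        })
    return cleaned


cleaned = clean(raw_rows)
for row in cleaned:
    print(row)
```

Whether a missing value is dropped or kept as `None` depends on the column; here the assumption is that `name` and `signup` are required while `spend` is optional.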
This project focuses on high cholesterol levels in observations from the Framingham Study, using hypothesis testing to examine and visualize the strong association between cholesterol levels and heart disease. Estrogen levels were also studied through A/B testing to question whether the association observed in the Nurses' Health Study (1976) reflects a causal effect.
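One way to run this kind of two-group comparison is a permutation test, sketched below. This is an assumption about method, not a record of the test used in the project, and the counts are invented for illustration rather than taken from the Framingham Study. The `permutation_test` helper is hypothetical.

```python
import random


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    With 0/1 outcomes, the means are proportions, so this compares
    the rate of the outcome between two groups.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are exchangeable,
        # so shuffle the pooled outcomes and split them again.
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            extreme += 1
    return observed, extreme / n_permutations


# Invented example data: 1 = heart disease, 0 = no heart disease.
high_cholesterol = [1] * 30 + [0] * 70
normal_cholesterol = [1] * 10 + [0] * 90

observed_diff, p_value = permutation_test(high_cholesterol, normal_cholesterol)
print(f"observed difference in rates: {observed_diff:.2f}")
print(f"permutation p-value: {p_value:.4f}")
```

A small p-value here only indicates that the difference is unlikely under random labeling; it does not by itself establish that one variable causes the other.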
Using data from the FCC, I analyzed which areas in California get the most spam calls, when the calls occur, and which demographics are targeted, with the aim of developing an effective pricing strategy.
Analyzing Airbnb data, I focused on listings in Albany, New York, to create visuals showing the average price by neighborhood, the typical number of bedrooms, and the corresponding price ranges. These visuals could help potential travelers understand average Airbnb costs at their destination.