Data-driven graduate student specializing in statistical modeling, machine learning, and LLM integration. Transforming complex datasets into actionable insights through advanced analytics and visualization.
dinakarreddy2027@gmail.com
+1 (856) 652-3960
Glassboro, NJ
MS Data Science @ Rowan University
GPA: 3.70/4.0
I'm a meticulous data science graduate student with a passion for creating statistical models, scaling data pipelines, and leveraging Large Language Models to drive meaningful outcomes. My academic journey has been focused on forecasting, multivariate analysis, end-to-end data processing, and real-time sentiment analysis integration.
What sets me apart is my ability to bridge the gap between complex technical implementation and compelling visual storytelling. I believe in making data-driven decisions by extracting actionable insights from the most complex datasets, whether it's through traditional statistical methods or cutting-edge AI techniques.
My experience ranges from developing user-friendly applications for data structuring to building comprehensive analytical pipelines using modern cloud architectures. I'm particularly excited about the intersection of traditional data science and modern LLM capabilities.
Enterprise-grade data pipeline with medallion architecture
Built a comprehensive Databricks pipeline using medallion architecture (Bronze-Silver-Gold) to process billing and resource usage data. Designed dimensional models and delivered actionable insights through analytical tables.
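As a rough illustration of the Bronze-Silver-Gold layering idea (not the project's actual Databricks code), the sketch below uses plain Python lists in place of Delta tables. The record fields, the sample values, and the function names `to_silver` and `to_gold` are all hypothetical.

```python
from collections import defaultdict

# Bronze: hypothetical raw billing records, kept exactly as ingested.
bronze = [
    {"account": "a1", "resource": "compute", "cost": "12.50"},
    {"account": "a1", "resource": "storage", "cost": "3.00"},
    {"account": "a2", "resource": "compute", "cost": "n/a"},  # malformed cost
    {"account": "a2", "resource": "compute", "cost": "7.25"},
]


def to_silver(rows):
    """Clean and type the raw rows, dropping records whose cost cannot be parsed."""
    cleaned = []
    for row in rows:
        try:
            cost = float(row["cost"])
        except ValueError:
            continue  # skip malformed records instead of failing the whole batch
        cleaned.append(
            {"account": row["account"], "resource": row["resource"], "cost": cost}
        )
    return cleaned


def to_gold(rows):
    """Aggregate cleaned rows into an analytical table: total cost per account."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["account"]] += row["cost"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Each layer reads only from the one before it, so the raw data stays untouched and the cleaning rules can change without re-ingesting.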
Modular PySpark pipeline with Delta Lake integration
Engineered a modular PySpark pipeline to clean and standardize timestamped stock trading data. Applied custom transformers and used Nutter for unit testing. Stored outputs in silver layer Delta tables and created gold-level analytics on Apple trends, return rates, and volumes.
AI-powered stock prediction with news sentiment integration
Forecasted Microsoft stock prices using ARIMA, LSTM, and Linear Regression models. Integrated LLM-powered sentiment analysis from news articles to enrich market predictions with contextual insights.
Statistical analysis enhanced with RAG-powered insights
Performed comprehensive statistical analysis on insurance data using traditional methods. Implemented RAG (Retrieval-Augmented Generation) to query domain knowledge from PDFs and generate LLM-driven insights.
Advanced statistical modeling and hypothesis testing
Conducted sophisticated multivariate analysis including PCA, Factor Analysis, MANOVA, and Hotelling's T² tests. Applied Box-Cox transformations and created 3D visualizations for complex insurance datasets.
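For readers unfamiliar with the Box-Cox transformation, here is a minimal sketch for a fixed, user-supplied lambda. The function name `box_cox` and the sample values are illustrative assumptions. In practice lambda is usually estimated from the data (for example, `scipy.stats.boxcox` does this when no lambda is passed).

```python
import math


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a fixed lambda.

    y = (x**lam - 1) / lam   if lam != 0
    y = log(x)               if lam == 0
    """
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]


# Hypothetical right-skewed values; lam=0.5 is a square-root-like transform.
transformed = box_cox([1.0, 4.0, 9.0], 0.5)
print(transformed)
```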
Multivariate normality, Hotelling's T², MANOVA
Compared dominant vs non-dominant bone mineral content using Hotelling's T². Used MANOVA to identify performance differences in triathlon disciplines (SWIM and BIKE) across age groups. Constructed confidence intervals and validated multivariate normality.
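A paired comparison of this kind can be sketched as a one-sample Hotelling's T² test on the within-subject differences. The version below is a minimal illustration that assumes exactly two measured variables, so the 2x2 covariance matrix can be inverted in closed form. The function name and the sample differences are made up, and it returns only the test statistics, not a p-value.

```python
def hotelling_t2_paired(diffs):
    """One-sample Hotelling's T^2 on 2-variable paired differences.

    Tests H0: the mean difference vector is zero.
    Returns (T^2, F statistic, (df1, df2)), where
    F = (n - p) / (p * (n - 1)) * T^2 follows F(p, n - p) under H0.
    """
    n, p = len(diffs), 2
    if n <= p:
        raise ValueError("need more observations than variables")

    # Mean difference for each variable.
    m0 = sum(d[0] for d in diffs) / n
    m1 = sum(d[1] for d in diffs) / n

    # Sample covariance matrix entries (denominator n - 1).
    s00 = sum((d[0] - m0) ** 2 for d in diffs) / (n - 1)
    s11 = sum((d[1] - m1) ** 2 for d in diffs) / (n - 1)
    s01 = sum((d[0] - m0) * (d[1] - m1) for d in diffs) / (n - 1)

    det = s00 * s11 - s01 ** 2
    if det == 0:
        raise ValueError("covariance matrix is singular")

    # Quadratic form mean' * S^{-1} * mean using the closed-form 2x2 inverse.
    quad = (s11 * m0 * m0 - 2 * s01 * m0 * m1 + s00 * m1 * m1) / det
    t2 = n * quad
    f_stat = (n - p) / (p * (n - 1)) * t2
    return t2, f_stat, (p, n - p)


# Hypothetical differences (dominant minus non-dominant) for two bones.
sample_diffs = [(1.0, 0.0), (0.0, 1.0), (2.0, 1.0), (1.0, 2.0)]
t2, f_stat, dfs = hotelling_t2_paired(sample_diffs)
print(t2, f_stat, dfs)
```

The F statistic can then be compared against an F distribution with the returned degrees of freedom to obtain a p-value.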
Latent structure discovery using PCA, FA, and MANOVA
Conducted PCA and Factor Analysis on daily stock prices to extract 23 latent factors. Interpreted rotated loadings to group companies. Compared smoker/non-smoker and regional groups in insurance data using MANOVA and Hotelling's T².
Spatiotemporal anomaly detection using visual analytics
Analyzed chemical contamination trends in Boonsong Lekagul Wildlife Preserve using time-series data. Linked pollution events to wildlife impact, and built Tableau dashboards to explore pollutant patterns and support policy insights.
Rowan University
GPA: 3.70/4.0
Relevant Coursework: