I'm a Statistics undergraduate student at ENCE/IBGE, focused on Data Analysis, Exploratory Data Analysis, Predictive Modeling, Data Visualization, Machine Learning, and Generative AI applications.
I build projects using Python, R, SQL, FastAPI, Streamlit, LangChain, ChromaDB, LLMs, RAG pipelines, automated tests, and CI/CD workflows, combining statistical thinking with practical software development.
REST API for income prediction and statistical analysis using socioeconomic data, machine learning concepts, automated validation, tests, and continuous integration.
Main features:
- API built with FastAPI
- Predictive modeling workflow with Python
- Input validation with schemas
- Automated tests with Pytest
- CI/CD pipeline with GitHub Actions
- Organized backend structure with routes, schemas, security and prediction logic
- Practical project focused on data, APIs and model deployment concepts
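
As an illustration of the schema-based input validation listed above, here is a minimal sketch using Pydantic, the library FastAPI relies on for request schemas. The model name `IncomeRequest`, its fields, their bounds, and the helper `validate_payload` are hypothetical and are not taken from the project's code.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class IncomeRequest(BaseModel):
    """Hypothetical request schema; field names and bounds are assumptions."""

    age: int = Field(..., ge=16, le=100)
    years_of_education: int = Field(..., ge=0, le=30)
    hours_per_week: float = Field(..., gt=0, le=100)


def validate_payload(payload: dict) -> Optional[IncomeRequest]:
    """Return a validated request, or None if any field is missing or out of range."""
    try:
        return IncomeRequest(**payload)
    except ValidationError:
        return None


valid = validate_payload({"age": 35, "years_of_education": 12, "hours_per_week": 40})
invalid = validate_payload({"age": -1, "years_of_education": 12, "hours_per_week": 40})
```

When a model like this is declared as the body parameter of a FastAPI route, invalid payloads are rejected with a 422 response before the prediction logic runs.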
IBGE RAG Chatbot
Semantic search application for public IBGE datasets using RAG, Gemini API, ChromaDB, Streamlit, local embeddings, metadata filtering, query expansion, and public deployment.
Main features:
- Semantic search over IBGE XLS/CSV tables
- Local embeddings and vector storage with ChromaDB
- Gemini API integration with fallback demo mode
- Dark mode interface with Streamlit
- Query expansion to improve retrieval
- Metadata filtering for indicators and coefficients of variation
- Retriever evaluation with Precision@k, Recall proxy, MRR and NDCG
- Automated tests with Pytest
- CI/CD with GitHub Actions
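
The retriever metrics named above can be sketched in a few lines of plain Python. This is a generic illustration with binary relevance, not the project's evaluation code: the function names and the document IDs are hypothetical, and the "Recall proxy" is left out because its definition is not given here.

```python
import math
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k results that are relevant (divides by k by convention)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Sequence[tuple]) -> float:
    """MRR: average reciprocal rank over (ranked, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """NDCG@k with binary gains: DCG of the ranking over DCG of an ideal ranking."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc in enumerate(ranked[:k], start=1)
        if doc in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical query: two relevant tables, retrieved at ranks 2 and 4.
ranked = ["t3", "t1", "t7", "t2"]
relevant = {"t1", "t2"}
p_at_2 = precision_at_k(ranked, relevant, k=2)
mrr = mean_reciprocal_rank([(ranked, relevant), (["a", "b"], {"a"})])
ndcg_at_4 = ndcg_at_k(ranked, relevant, k=4)
```

Precision@k and MRR reward getting a relevant table near the top, while NDCG also accounts for where the remaining relevant tables land in the ranking.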
Statistical analysis project focused on social data, correlations, data visualization, and reproducible technical documentation using R, R Markdown, and Quarto.
Academic studies applying statistics to public policy problems, social indicators, data visualization, and IBGE microdata analysis.
Personal scripts and small applications built with Python, Pandas, NumPy, Matplotlib, Object-Oriented Programming, and automation workflows.
- Advanced statistical modeling
- Machine learning with Python and R
- API development and backend architecture
- RAG pipelines and LLM applications
- Data engineering fundamentals
- Testing, documentation and CI/CD practices
B.Sc. in Statistics
ENCE/IBGE - Escola Nacional de Ciências Estatísticas
2023 - 2028 (expected)
Production Engineering
UNESA
2022 - 2023
- Back-End Programmer - SENAI
- Object-Oriented Programming with Python - ENEP
- Introduction to Data Science - ENEP
- Concepts and Applications of the Demographic Census in Public Policy - IBGE
- Plain Language Fundamentals - ENEP
- Email: oliveiraggpedro@gmail.com
- LinkedIn: LinkedIn/pedro
- GitHub: github.com/justmetro
Turning data into insight, and insight into useful solutions.