Course - detail

LEB5059 - Data science for Biosystems


Credit hours

In-class work per week: 2
Practice per week: 2
Credits: 8
Duration: 15 weeks
Total: 120 hours

Instructor
Marcos Roberto Benso

Objective
This course presents and discusses the main statistical methods for data analysis, inference, and optimization in agricultural systems engineering. The class content builds an understanding of the fundamentals of scientific discovery under uncertainty. The course projects prepare students to apply data-driven methods for modeling and inference through data analysis, machine learning, and optimization. By the end of the course, students are expected to have developed the following skills: computational thinking, problem solving, programming in languages such as Python and R, and knowledge discovery using data-driven methods.

Content
1. Fundamentals of data science for biosystems engineering. Types of data collection, structures, and best practices for data storage.
2. Statistical inference: maximum likelihood estimator (MLE) for linear models. Properties of estimators and applications to real-world problems.
3. Uncertainty analysis in estimators using bootstrapping.
4. Shannon information theory and applied optimization techniques: heuristic, greedy, and metaheuristic searches.
5. Supervised learning models: logistic regression, gradient descent, and stochastic gradient descent. Neural networks for regression problems.
6. Case studies and applications of inferential models in biosystems engineering.
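
As an illustration of topics 2 and 3, the following is a minimal sketch in Python, not official course material. It fits a straight line by least squares (which coincides with the maximum likelihood estimator when the noise is assumed Gaussian with constant variance) and then uses a percentile bootstrap to attach an uncertainty interval to the slope. The function names, the synthetic data, and the parameter values (n_boot, alpha, seed) are all assumptions made for this example.

```python
import random
import statistics


def fit_line(x, y):
    """Least-squares fit of y = a + b*x.

    Under the assumption of Gaussian noise with constant variance,
    these estimates are also the maximum likelihood estimates.
    """
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def bootstrap_slope_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the slope (resampling data pairs)."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2:
            # Degenerate resample: the slope is undefined, so skip it.
            continue
        slopes.append(fit_line(xs, ys)[1])
    slopes.sort()
    lo = slopes[int((alpha / 2) * len(slopes))]
    hi = slopes[int((1 - alpha / 2) * len(slopes)) - 1]
    return lo, hi


# Synthetic data (hypothetical): true intercept 1.0, true slope 0.5.
data_rng = random.Random(42)
x = [float(i) for i in range(30)]
y = [1.0 + 0.5 * xi + data_rng.gauss(0.0, 1.0) for xi in x]

a_hat, b_hat = fit_line(x, y)
ci_lo, ci_hi = bootstrap_slope_ci(x, y)
print(f"intercept = {a_hat:.3f}, slope = {b_hat:.3f}")
print(f"95% bootstrap interval for the slope: [{ci_lo:.3f}, {ci_hi:.3f}]")
```

Resampling (x, y) pairs rather than residuals is one of several bootstrap schemes; it is chosen here only because it needs the fewest assumptions about the noise.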

Bibliography
HASTIE, Trevor; TIBSHIRANI, Robert; FRIEDMAN, Jerome. The Elements of Statistical Learning: data mining, inference, and prediction. 2. ed. New York: Springer, 2009. 745 p. Available at: https://doi.org/10.1007/978-0-387-84858-7. Accessed: 30 Jul. 2025.
MORETTIN, Pedro Alberto; BUSSAB, Wilton de Oliveira; SILVA, Aldir Coelho Corrêa da. Estatística Básica. 6. ed. São Paulo: Novatec Editora Ltda, 2017. 554 p.
SCAVETTA, Rick J.; ANGELOV, Boyan; SILVA, Aldir Coelho Corrêa da. Python e R para o cientista de dados moderno: o melhor de dois mundos. São Paulo: Novatec Editora Ltda, 2022. 200 p.