60 RBRAS and 16 SEAGRO

Thematic Sessions

 

Thematic Session I (STI): History of Biometry and RBras in Brazil

Coordinator: Clarice Garcia Borges Demétrio (ESALQ/USP)

Participating members:

- Paulo Justiniano Ribeiro Junior  (UFPR)

- Roseli A. Leandro (ESALQ/USP)

 

 

Thematic Session II (STII): Experimental Statistics

Coordinator: Roseli A. Leandro (ESALQ/USP)

Participating members:

- Fabyano Fonseca e Silva (UFV)

- Renata Alcarde Sermarini (ESALQ/USP)

- Eric Batista Ferreira (UNIFAL/MG)

 

Concepts of Experimental Statistics in Gene Expression Studies in Agriculture and Animal Sciences: Fabyano Fonseca e Silva (UFV)

Abstract: The aim is to present and discuss experimental design issues arising in gene expression studies in Agriculture and Animal Sciences. Examples considering quantitative real-time PCR (qPCR) and next-generation RNA sequencing (RNAseq) will be detailed from an Experimental Statistics viewpoint. The relevance and usage of traditional concepts such as biological and technical replicates, blocking, time dependence and factor interactions will also be considered. The idea is to exploit the link between gene expression experiments and classical ANOVA family models in order to recommend models of different complexity levels (including mixed models and Bayesian modeling) for different situations. Program code in SAS proc mixed and R (DEGseq and baySeq packages), as well as real datasets, will also be made available.
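To make the replicate distinction concrete, here is a small illustrative sketch (hypothetical numbers, plain Python rather than the SAS/R code mentioned in the abstract): technical replicates are averaged within each biological replicate before the treatment comparison, so only biological variation enters the F test.

```python
# Illustrative sketch (hypothetical data; not code from the talk):
# technical replicates are collapsed within each biological replicate,
# so the F test compares treatments against biological, not technical,
# variation.
data = [  # (treatment, biological replicate, expression value)
    ("control", 1, 5.1), ("control", 1, 5.3),
    ("control", 2, 4.8), ("control", 2, 5.0),
    ("treated", 1, 6.9), ("treated", 1, 7.1),
    ("treated", 2, 6.4), ("treated", 2, 6.6),
]

# Step 1: average technical replicates; the experimental unit is the
# biological sample, not the individual qPCR well.
bio = {}
for trt, rep, y in data:
    bio.setdefault((trt, rep), []).append(y)
means = {k: sum(v) / len(v) for k, v in bio.items()}

# Step 2: one-way ANOVA across treatments on the collapsed values.
groups = {}
for (trt, _), m in means.items():
    groups.setdefault(trt, []).append(m)

grand = sum(means.values()) / len(means)
ss_trt = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups.values())
ss_err = sum((m - sum(g) / len(g)) ** 2 for g in groups.values() for m in g)
df_trt, df_err = len(groups) - 1, len(means) - len(groups)
F = (ss_trt / df_trt) / (ss_err / df_err)
print(round(F, 2), df_trt, df_err)
```

Treating each of the eight wells as an independent observation would inflate the error degrees of freedom and overstate significance; collapsing first is the standard remedy.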

 

Hasse diagrams and applications to agronomic experimentation: Renata Alcarde Sermarini (ESALQ/USP)

Abstract: The growing application of statistics to the most diverse areas of research has led to complex designs, making their planning and analysis more difficult. The Hasse diagram is a graphical tool whose purpose is to ease understanding of the structure among the experimental factors. Besides giving a better visualization of the experiment, it provides, through rules proposed in the literature, the number of degrees of freedom of each factor. Under the condition of orthogonality of the design, one can also obtain the kernel matrices of the quadratic forms for the sums of squares and the expected mean squares, yielding the appropriate ratio for the F test. To exemplify the rules and the use of this tool, experiments from the agronomic area will be used, detailing the experimental structure and indicating the scheme of the analysis of variance.
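The degrees-of-freedom rule read off a Hasse diagram can be sketched in a few lines (my illustration, not material from the talk): the df of a factor is its number of levels minus the sum of the df of every factor above it in the diagram. The factors below form a hypothetical randomized complete block design.

```python
# Degrees-of-freedom rule from a Hasse diagram (illustrative sketch).
def hasse_df(levels, above):
    """levels: factor -> number of levels; above: factor -> all ancestors."""
    df = {}
    # Any factor has strictly more ancestors than each factor above it,
    # so sorting by ancestor count processes the diagram top-down.
    for f in sorted(levels, key=lambda f: len(above[f])):
        df[f] = levels[f] - sum(df[a] for a in above[f])
    return df

b, t = 4, 3  # blocks and treatments (hypothetical RCBD)
levels = {"Mean": 1, "Blocks": b, "Treatments": t, "Plots": b * t}
above = {
    "Mean": [],
    "Blocks": ["Mean"],
    "Treatments": ["Mean"],
    "Plots": ["Mean", "Blocks", "Treatments"],
}
df = hasse_df(levels, above)
print(df)  # Plots (residual) should receive (b-1)*(t-1) df
```

For this design the rule recovers the familiar ANOVA table: b-1 df for blocks, t-1 for treatments and (b-1)(t-1) for the residual stratum.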

 

Analysis of Sensory Experiments: Eric Batista Ferreira (UNIFAL/MG)

Abstract: The use of human beings as measuring instruments plays an increasingly important role in product development, and consumers increasingly dictate industry innovation. Particularly in the food industry, descriptive and preference sensory data have been used as a basis for decision-making. This scenario demands a broader understanding of data modeling and analysis, as well as of the measuring instrument itself. The scientific basis required for the development of a given industrial sector extends from Psychology to Mathematical and Statistical modeling, combined with specific knowledge of the product at hand, be it a food, a television set, a mobile phone or any other object. In food research, data are produced and used in a way similar to industrial practice, and academic environments specifically devoted to sensory and consumer sciences exist in several countries. The development and application of Statistics and data analysis techniques in these areas is called Sensometrics. This short course will present some basic ways of analyzing sensory experiments, such as triangle, paired and duo-trio tests, and designs commonly analyzed by analysis of variance.
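As a concrete taste of the triangle test mentioned above (a sketch with hypothetical counts, not course material): each assessor picks the odd sample out of three, so a purely guessing assessor is correct with probability 1/3, and significance comes from an exact one-sided binomial test.

```python
# Exact one-sided binomial test for a triangle test (illustrative sketch).
from math import comb

def triangle_test_pvalue(correct, n, p0=1 / 3):
    """One-sided P(X >= correct) for X ~ Binomial(n, p0)."""
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k)
               for k in range(correct, n + 1))

# Hypothetical example: 15 of 30 assessors identify the odd sample.
p = triangle_test_pvalue(15, 30)
print(round(p, 4))
```

A small p-value indicates that assessors distinguish the two products more often than chance guessing would allow.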

 

Thematic Session III (STIII): Survival Analysis: modeling and application

Coordinator: Francisco Louzada Neto (ICMC-USP)

Participating members:

- Sujit K. Ghosh (NC State University & SAMSI, USA)

- Fábio Nogueira Demarqui (UFMG)

- Gleici Castro Perdoná (FMRP – USP)

 

A new lifetime model for multivariate survival data with a cure fraction: Francisco Louzada (ICMC-USP)

Abstract: In this talk I present a new lifetime model for multivariate survival data with a surviving fraction. The model is developed under the presence of m types of latent competing risks and a proportion of surviving individuals. The use of Markov chain Monte Carlo (MCMC) methods is explored to develop a Bayesian analysis for the proposed model. A simulation study is performed in order to analyze the frequentist coverage probabilities of credible intervals derived from the posteriors. The proposed modeling is illustrated through real datasets from the finance and medical areas. This work is co-authored by Vicente G. Cancho, Dipak K. Dey and Gladys D.C. Barriga.
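For context, a standard way to formalize a surviving fraction (a generic sketch, not necessarily the exact model of the talk) is the mixture cure model:

```latex
S_{\mathrm{pop}}(t) \;=\; \pi + (1-\pi)\,S(t),
\qquad \lim_{t\to\infty} S_{\mathrm{pop}}(t) = \pi,
```

where $\pi$ is the cured (long-term survivor) fraction and $S(t)$ is the survival function of the susceptible individuals. With $m$ latent competing risks, promotion-time variants instead take $S_{\mathrm{pop}}(t) = \exp\{-\theta F(t)\}$, whose cure fraction is $e^{-\theta}$.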

 

Nonparametric estimation of the conditional mean residual life function: Sujit K. Ghosh (NC State University and SAMSI)

Abstract: The conditional mean residual life (MRL) function is the expected remaining lifetime of a system given survival past a particular time point and the values of a set of predictor variables. This function is a valuable tool in reliability and actuarial studies when the right tail of the distribution is of interest, and can be more informative than the survivor function. In this talk, we present theoretical limitations of some semi-parametric conditional MRL models, and propose two nonparametric methods of estimating the conditional MRL function. Asymptotic properties such as consistency and normality of the newly proposed estimators are established. The empirical properties of the proposed estimators, including bootstrap pointwise confidence intervals, are illustrated using Monte Carlo simulations, and the results based on the new estimators are compared with two popular semi-parametric methods of analysis for varying types of data. This is joint work with Alexander McLain.
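In standard notation (added here for reference), the conditional MRL function of a lifetime $X$ given covariates $Z = z$ is

```latex
m(x \mid z) \;=\; \mathbb{E}\left[X - x \mid X > x,\; Z = z\right]
\;=\; \frac{\int_x^{\infty} S(u \mid z)\,du}{S(x \mid z)},
```

where $S(\cdot \mid z)$ is the conditional survivor function; the integral form makes clear why the right tail of the distribution drives the estimate.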

 

A fully Bayesian approach for modeling survival data with informative censoring: Fábio Nogueira Demarqui (Departamento de Estatística - UFMG)

Abstract: An important characteristic which distinguishes survival analysis from other areas in statistics is that survival data are usually censored. Censoring occurs when incomplete information is available about the survival time of some individuals. Most of the procedures found in the literature to model survival data are based on the assumption of a non-informative censoring mechanism, i.e., on the assumption that the failure and censoring times are independent. In several real situations, however, this assumption is not valid, and the censoring mechanism is said to be informative. In this talk we present a fully parametric Bayesian approach for modeling survival data with informative censoring. Specifically, we propose a frailty model to account for dependence between failure and censoring times. Conditionally on the frailty term, both failure and censoring times are assumed to be independent and to follow different Weibull distributions. The proposed model enables one to identify the type of association and, consequently, the censoring mechanism of the data. In order to evaluate the performance of our model, we carried out a simulation study taking into account different sample sizes and censoring mechanisms. For comparison purposes, the Weibull regression model based on the assumption of non-informative censoring was also fitted to the simulated datasets. Finally, the usefulness of the proposed model is illustrated through the analysis of the survival times of 72 patients diagnosed with multiple myeloma. This is joint work with Vinícius Diniz Mayrink and Renata Camila de Souza.
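As a generic sketch of the shared-frailty idea (the talk's exact parameterization may differ): given a frailty $W = w$, the failure time $T$ and censoring time $C$ are conditionally independent, so the joint survival function is

```latex
S(t_1, t_2) \;=\; \mathbb{E}_W\!\left[\, S_T(t_1 \mid W)\; S_C(t_2 \mid W) \,\right],
```

and any marginal dependence between $T$ and $C$, i.e. the informativeness of the censoring, is induced entirely by the common frailty.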

 

Exponentiated Modified Weibull Hazard model for comparing the risk of death across different stages of breast cancer: Gleici Castro Perdoná (Department of Social Medicine, School of Medicine - FMRP-USP, Brazil)

Abstract: Nowadays, treatment and diagnosis in cancer are more efficient, reducing the occurrence of poor outcomes such as death or recurrence by the end of follow-up and characterizing the presence of long-term survivors (cure fraction) in the dataset. Because of this improvement in medicine, understanding how cancer evolves over time has become more desirable. However, the assessment of actions and treatments in this area still relies on simple survival analysis techniques. We propose more flexible parametric survival models, capable of incorporating this current phenomenon. The Exponentiated Modified Weibull Hazard (EMW) model is a class that extends several distributions used in the lifetime literature and accommodates non-monotone hazard shapes, such as bathtub and unimodal, as well as the presence of long-term survivors. This flexible formulation includes as particular cases the Exponentiated Weibull (EW), Weibull (W) and Exponentiated (E) models with long-term survivors. In addition to these advantages, one parameter of the model (λ) is directly connected with the intensity of occurrence of the event of interest (death or relapse), which gives the physician a tool for interpretation and discrimination between stages of cancer. Thus the parameter λ can be interpreted as a metric for comparing the risk between stages. The model was applied to real breast cancer problems.
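For reference, one common parameterization of this family (my addition; the authors' exact form may differ) starts from the Modified Weibull cumulative hazard $H(t) = a\,t^{b} e^{\lambda t}$; exponentiation and a long-term survivor fraction then give

```latex
F(t) \;=\; \left[\,1 - \exp\!\left(-a\,t^{b} e^{\lambda t}\right)\right]^{\theta},
\qquad
S_{\mathrm{pop}}(t) \;=\; \pi + (1-\pi)\,\bigl(1 - F(t)\bigr),
```

where $\pi$ is the long-term survivor fraction. Setting $\lambda = 0$ and $\theta = 1$ recovers the Weibull case, which is how the nested W, E and EW submodels arise.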

 

Thematic Session IV (STIV): Embrapa Session

Coordinator: Waldomiro Barioni Júnior (EMBRAPA)

Participating members:

- Renato Limão (Palisade Brasil)

- Fernando Hernandez (Instructor, Palisade Brasil)

Demonstration of the "@RISK" risk analysis software and applications

 

Thematic Session V (STV): Young Doctors Session

Coordinators: Vera Tomazella (UFSCar), Taciana Villela Savian (ESALQ-USP) and Fábio Nogueira Demarqui (UFMG)

Participating members:

- Izabela Regina Cardoso de Oliveira (UFLA)

- Michelle Ferreira Miranda (IME-USP)

- Rafael Izbicki (UFSCar)

- Rafael Pimentel Maia (ESALQ-USP)

 

Modeling strategies for complex hierarchical and overdispersed data in the life sciences: Izabela Regina Cardoso de Oliveira (UFLA)

Abstract: We study the so-called combined models, generalized linear mixed models extended to allow for overdispersion, in the context of genetics and breeding. Such flexible models accommodate cluster-induced correlation and overdispersion through two separate sets of random effects, and contain as special cases the generalized linear mixed models (GLMM) on the one hand and commonly known overdispersion models on the other. We use such models to obtain heritability coefficients for non-Gaussian characters. Heritability is one of the many important concepts that are often quantified upon fitting a model to hierarchical data, and is often of importance in plant and animal breeding. Knowledge of this attribute is useful to quantify the magnitude of improvement in the population. For data where linear models can be used, this attribute is conveniently defined as a ratio of variance components. Matters are less simple for non-Gaussian outcomes. The focus is on time-to-event and count traits, where the Weibull-Gamma-Normal and Poisson-Gamma-Normal models are used. The resulting expressions are sufficiently simple and appealing, in particular in special cases, to be of practical value. The proposed methodologies are illustrated using data from animal and plant breeding. Furthermore, attention is given to the occurrence of negative estimates of variance components in the Poisson-Gamma-Normal model. The occurrence of negative variance components in linear mixed models (LMM) has received a certain amount of attention in the literature, whereas almost no work has been done for GLMM. This phenomenon can be confusing at first sight because, by definition, variances themselves are non-negative quantities. However, it is well understood in the context of linear mixed modeling, where one has to make a choice between a hierarchical and a marginal view. The variance components of the combined model for count outcomes are studied theoretically, and the plant breeding study used as illustration underscores that this phenomenon can be common in applied research. We also call attention to the performance of different estimation methods, because not all available methods are capable of extending the parameter space of the variance components. Hence, when there is a need for inference on such components and they are expected to be negative, the accuracy of the method is not the only characteristic to be considered.
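In the Gaussian linear mixed model case mentioned above, heritability is the familiar variance ratio

```latex
h^2 \;=\; \frac{\sigma^2_g}{\sigma^2_g + \sigma^2_e},
```

with $\sigma^2_g$ the (additive) genetic variance and $\sigma^2_e$ the residual variance. The combined models add an overdispersion component to the denominator, and for non-Gaussian traits the ratio must be evaluated on an appropriate scale; deriving those expressions is the subject of the talk.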

 

Bayesian models for high-dimensional neuroimaging data: Michelle Ferreira Miranda (IME-USP)

Abstract: Advances in medical imaging technology have been allowing researchers and clinicians to gain insights of unprecedented quality into the cerebral anatomical structures, connectivity patterns, and functional properties of the brain. These advances pose the challenge of developing automatic methods for classifying brain responses, identifying abnormalities, and revealing important effects of environmental and genetic factors on brain structure and function. Classical statistical tools are usually insufficient to model these data due to their complexity; data most often take the form of multidimensional arrays with intricate spatial correlation and functional changes that evolve over time. In this work, I present some contributions to the statistical methodology, motivated by the aforementioned challenges. In the first half, I talk about a class of spatial transformation models (STM) to model the varying association between imaging measures in a three-dimensional (3D) volume and a set of low-dimensional covariates. This part is motivated by a dataset on Attention Deficit Hyperactivity Disorder (Miranda, M., Zhu, H., Ibrahim, J. G. (2013). Bayesian spatial transformation models with application in neuroimaging data. Biometrics, v.69(4), p.1074-1083). In the second half, I change the perspective to explore some of the challenges faced by traditional classification methods when using high-dimensional covariates (e.g. neuroimaging data) and propose a tensor partition modeling framework to reduce data dimensionality to a manageable level. This part is followed by an application to data obtained by the Alzheimer's Disease Neuroimaging Initiative (http://adni.loni.usc.edu/).

 

Nonparametric Conditional Density Estimation in a High-Dimensional Regression Setting: Rafael Izbicki (UFSCar)

Abstract: In some applications (e.g., in cosmology and economics), the regression E[Z|x] is not adequate to represent the association between a predictor x and a response Z because of multi-modality and asymmetry of f(z|x); using the full density instead of a single-point estimate can then lead to less bias in subsequent analysis. However, there are currently no effective ways of estimating f(z|x) when x represents high-dimensional, complex data. In this work, we propose a new nonparametric estimator of f(z|x) that adapts to sparse (low-dimensional) structure in x. The method is based on a direct expansion of f(z|x) in the eigenfunctions of a kernel-based operator. These basis functions are orthogonal with respect to the underlying data distribution, allowing fast implementation and tuning of parameters. We derive rates of convergence and show that the method adapts to the intrinsic dimension of the data. We demonstrate the effectiveness of the series method on images, spectra, and an application to photometric redshift estimation of galaxies.
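The expansion referred to above has the generic tensor form (a sketch; the notation is mine, not the speaker's):

```latex
f(z \mid x) \;=\; \sum_{i,j} \beta_{ij}\, \phi_i(z)\, \psi_j(x),
```

where $\{\phi_i\}$ is an orthonormal basis in $z$ and $\{\psi_j\}$ are the eigenfunctions of the kernel-based operator. Because the $\psi_j$ are orthogonal with respect to the data distribution, each coefficient $\beta_{ij}$ can be estimated by a simple sample average, which is what makes the implementation fast.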

 

A general multivariate competing risks mixed model: Rafael Pimentel Maia (ESALQ-USP)

Abstract: We present a class of multivariate mixed survival models for discrete and continuous time variables with a complex covariance structure, in the context of quantitative genetic applications. The discrete-time models used are multivariate variants of the discrete relative risk models, and the continuous-time models are multivariate variants of the piecewise constant hazard models. This class of methods allows for regular parametric likelihood-based inference by exploiting a coincidence between their likelihood functions and the likelihood functions of multivariate binomial mixed models (discrete time) or of a multivariate log-Poisson model (continuous time); see Maia et al. 2014a. The models include a dispersion parameter, which is essential for obtaining a decomposition of the variance of the trait of interest as a sum of components representing the additive genetic effects, environmental effects and unspecified sources of variability, as required in quantitative genetic applications. The methods introduced can be used in many applications in quantitative genetics, although the discussion presented concentrates on longevity studies with competing risks. The problem of competing risks arises when the individuals under study may die from one of two or more possible causes. The methods are exemplified with a longevity study of Danish dairy sows. In this study longevity is measured by two distinct variables: the number of days from the first parity to the culling day, and the number of survived parities (see also Maia et al. 2014b). Additionally, two general causes of culling are observed: death and slaughtering. The goal of this study was to characterize possible genetic aspects of longevity related to each specific cause of culling and their relations.
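The likelihood coincidence exploited for the continuous-time case is the classical piecewise-constant-hazard identity (my summary, in generic notation): with hazard $\lambda_j$ on interval $I_j$, event indicator $d_{ij}$ and exposure time $t_{ij}$ for individual $i$ in $I_j$, the survival likelihood is

```latex
L \;\propto\; \prod_{i,j} \lambda_j^{\,d_{ij}} \exp\!\left(-\lambda_j\, t_{ij}\right),
```

which coincides, up to a factor not involving $\lambda$, with the likelihood of independent Poisson counts $d_{ij}$ with means $\lambda_j t_{ij}$; standard Poisson mixed-model machinery can therefore fit the survival model.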