Computational Graph-based Framework for Integrating Econometric Models and Machine Learning Algorithms in Emerging Data-Driven Analytical Environments
Co-Principal Investigator: Ram M. Pendyala, Director and Professor, School of Sustainable Engineering and the Built Environment
Project Duration: 24 months
Project Budget (Federal UTC Funds): $49,995
Project Budget (Cost-share): $25,000
Abstract
In an era of big data and the emergence of new technologies such as app-based ride services, newly available data sources offer growing opportunities to better understand human mobility patterns. Statistical models have mainly been used to identify significant factors and rigorously calibrate their influence, while machine learning algorithms have been used to explore complex patterns in large datasets with improved computing efficiency. Focusing on discrete choice modeling applications, this research aims to introduce an open-source computational graph (CG)-based modeling framework that integrates the strengths of econometric models and machine learning algorithms. In particular, multinomial logit (MNL), nested logit (NL), and integrated choice and latent variable (ICLV) models are selected to demonstrate the performance of the proposed graph-oriented functional representation. Furthermore, the gradient of the log-likelihood function and the associated Hessian matrix are computed systematically using automatic differentiation (AD). Using the 2017 National Household Travel Survey data and an open-source dataset, we compare estimation results from the proposed methods with those obtained from two open-source packages, namely Biogeme and Apollo. The results indicate that the CG-based choice modeling approach can produce consistent parameter estimates and accurate gradients with respect to the estimated parameters, with substantial computational efficiency.
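To illustrate the idea of writing a choice model as a computational graph and differentiating it automatically, the following is a minimal sketch in pure Python, not the project's actual implementation. It builds a tiny reverse-mode AD engine, evaluates an MNL log-likelihood on it, and compares the AD gradient with the closed-form MNL gradient. The `Node` class, the helper names (`exp`, `log`, `backward`, `mnl_loglik`, `analytic_gradient`), and the toy data are all hypothetical and chosen only for illustration; the Hessian and the NL and ICLV models are omitted for brevity.

```python
import math

class Node:
    """One vertex of the computational graph: a value plus links to its inputs."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # tuples of (parent_node, local_derivative)
        self.grad = 0.0          # d(output)/d(this node), filled in by backward()

    def __add__(self, other):
        other = other if isinstance(other, Node) else Node(other)
        return Node(self.value + other.value, ((self, 1.0), (other, 1.0)))
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Node) else Node(other)
        return Node(self.value * other.value,
                    ((self, other.value), (other, self.value)))
    __rmul__ = __mul__

    def __sub__(self, other):
        return self + (-1.0) * other

def exp(x):
    v = math.exp(x.value)
    return Node(v, ((x, v),))

def log(x):
    return Node(math.log(x.value), ((x, 1.0 / x.value),))

def backward(output):
    """Reverse-mode AD: propagate derivatives from the output back to the inputs."""
    order, seen = [], set()
    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)       # parents are appended before their children
    visit(output)
    output.grad = 1.0
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += node.grad * local

def mnl_loglik(beta, X, chosen):
    """MNL log-likelihood with linear utilities V_nj = sum_k beta_k * x_njk."""
    ll = Node(0.0)
    for alts, c in zip(X, chosen):
        utils = [sum(b * x for b, x in zip(beta, attrs)) for attrs in alts]
        denom = sum(exp(u) for u in utils)
        ll = ll + utils[c] - log(denom)
    return ll

def analytic_gradient(beta_values, X, chosen):
    """Closed-form MNL gradient: sum_n (x_chosen - sum_j P_nj * x_nj)."""
    grad = [0.0] * len(beta_values)
    for alts, c in zip(X, chosen):
        expu = [math.exp(sum(b * x for b, x in zip(beta_values, a))) for a in alts]
        total = sum(expu)
        for k in range(len(beta_values)):
            grad[k] += alts[c][k] - sum(e / total * a[k] for e, a in zip(expu, alts))
    return grad

# Toy data (made up): 3 observations, 3 alternatives, 2 attributes each.
X = [
    [[1.0, 2.0], [2.0, 1.0], [3.0, 0.5]],
    [[0.5, 3.0], [1.5, 1.5], [2.5, 1.0]],
    [[2.0, 2.0], [1.0, 2.5], [3.0, 1.0]],
]
chosen = [0, 1, 2]

beta = [Node(-0.5), Node(-0.2)]
ll = mnl_loglik(beta, X, chosen)
backward(ll)
ad_grad = [b.grad for b in beta]
print("log-likelihood:", ll.value)
print("AD gradient:", ad_grad)
print("analytic gradient:", analytic_gradient([-0.5, -0.2], X, chosen))
```

In this sketch the same graph that evaluates the log-likelihood also yields the gradient in a single backward pass, so no model-specific derivative code is needed. A production framework would more plausibly rely on an established AD library than on a hand-written engine like this one.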