Vitamin A deficiency (VAD) is a public health problem worldwide. For countries with a high per capita consumption of maize, biofortification (breeding varieties with higher provitamin A carotenoid content than normal yellow maize) can be a viable strategy to reduce VAD. Selection for provitamin A carotenoid content relies on molecular markers and on phenotypic data generated through expensive and laborious wet lab analyses. Near-infrared spectroscopy (NIRS) could be a fast and cheap alternative for measuring carotenoids. This dataset contains carotenoid and NIRS data from 1857 tropical maize samples, which served as a training set for Bayesian linear regression models used to predict the provitamin A carotenoid content of an independent set of 650 tropical maize samples. The data include the specific carotenoids measured and the NIRS values recorded at different wavelengths. The results of the analysis are described in the accompanying article.
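
The prediction workflow described above (fit on the training samples, predict an independent set) can be illustrated with a minimal sketch. This is not the model specification used in the accompanying article: it assumes a plain Gaussian prior on the regression weights with known prior and noise precisions (`alpha`, `beta`), and it runs on synthetic stand-in spectra rather than the published measurements. The function names, the number of wavelengths, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_bayes_linear(X, y, alpha=1.0, beta=25.0):
    """Posterior mean and covariance of the weights for y = X w + noise.

    Assumed prior: w ~ N(0, I / alpha). Assumed noise precision: beta.
    Both are treated as known here, which is a simplification.
    """
    n_features = X.shape[1]
    precision = alpha * np.eye(n_features) + beta * X.T @ X
    cov = np.linalg.inv(precision)
    mean = beta * cov @ X.T @ y
    return mean, cov


def predict(X_new, mean, cov, beta=25.0):
    """Predictive mean and variance for each row of X_new."""
    y_mean = X_new @ mean
    # Noise variance plus x^T cov x for every sample x.
    y_var = 1.0 / beta + np.einsum("ij,jk,ik->i", X_new, cov, X_new)
    return y_mean, y_var


# Synthetic stand-in data (NOT the published measurements).
# Only the sample counts mirror the dataset description.
rng = np.random.default_rng(0)
n_train, n_test, n_wavelengths = 1857, 650, 50  # 50 wavelengths is hypothetical
true_w = rng.normal(size=n_wavelengths)
X_train = rng.normal(size=(n_train, n_wavelengths))  # centered "spectra"
X_test = rng.normal(size=(n_test, n_wavelengths))
y_train = X_train @ true_w + rng.normal(scale=0.2, size=n_train)
y_test = X_test @ true_w + rng.normal(scale=0.2, size=n_test)

w_mean, w_cov = fit_bayes_linear(X_train, y_train)
y_pred, y_var = predict(X_test, w_mean, w_cov)

# Correlation between predicted and simulated reference values.
r = np.corrcoef(y_pred, y_test)[0, 1]
print(f"correlation on held-out synthetic samples: {r:.3f}")
```

With real spectra, the NIRS columns would typically be centered or otherwise preprocessed before fitting, and the prior and noise precisions would be estimated rather than fixed; those steps are omitted here.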