Sustainability has become a key concern in process systems engineering. Renewable energy sources such as biofuels and renewable materials such as bioproducts could replace their non-renewable, petroleum-based counterparts. However, producing biofuels and bioproducts economically and efficiently remains challenging. Many different biomass feedstocks, conversion processes, and candidate biofuels and bioproducts are available. Furthermore, the prices and demands of biofuels and bioproducts are uncertain, and a variation in the price or demand of one bioproduct can influence the price or demand of another, further complicating the problem. What is required is an approach that can identify economical, efficient, and sustainable biofuel and bioproduct production processes from the myriad possible options while accounting for both correlated and uncorrelated price and demand uncertainties of the final bioproducts. In this work, a data-driven decision-making framework is proposed for biomass processing network design that directly integrates machine learning with robust optimization. Principal component analysis (PCA) is used to identify the latent uncertainties behind the observed uncertainty data. Kernel density estimation then captures the probability distributions of the uncertainty data projected onto the principal components. This uncertainty data analysis approach is applied to a bioconversion product and process network to identify cost-effective and environmentally friendly biofuel and bioproduct production pathways. Our approach identifies a total annualized cost of $18.3M/y, 6% lower than the cost found with conventional adaptive robust optimization.
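
The uncertainty data analysis step described above (PCA followed by kernel density estimation on the projected data) can be illustrated with a minimal sketch. It is not the implementation used in this work. The synthetic two-product price data, the helper names `pca_project`, `kde_cdf`, and `kde_quantile`, the normal-reference bandwidth rule, and the 5%/95% quantile levels are all assumptions chosen for illustration. The quantile bounds only suggest how the estimated distributions might be turned into limits for an uncertainty set.

```python
import math
import numpy as np

def pca_project(data):
    """Center the data and project it onto its principal components."""
    mean = data.mean(axis=0)
    centered = data - mean
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]           # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return mean, eigvals, eigvecs, centered @ eigvecs

def kde_cdf(samples, x, bandwidth):
    """CDF of a Gaussian KDE: the average of normal CDFs centered at each sample."""
    z = (x - samples) / (bandwidth * math.sqrt(2.0))
    return float(np.mean([0.5 * (1.0 + math.erf(v)) for v in z]))

def kde_quantile(samples, alpha, tol=1e-6):
    """Invert the KDE CDF by bisection to obtain the alpha-quantile."""
    n = len(samples)
    # Normal-reference rule-of-thumb bandwidth (an assumption of this sketch).
    bandwidth = 1.06 * samples.std(ddof=1) * n ** (-0.2)
    lo = samples.min() - 10.0 * bandwidth
    hi = samples.max() + 10.0 * bandwidth
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kde_cdf(samples, mid, bandwidth) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Synthetic, correlated prices for two hypothetical bioproducts.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
noise = rng.normal(scale=0.3, size=(200, 2))
prices = np.hstack([latent, 0.8 * latent]) + noise + np.array([10.0, 5.0])

# Step 1: PCA yields uncorrelated latent components of the price data.
mean, eigvals, eigvecs, scores = pca_project(prices)

# Step 2: a KDE on each component gives distribution-based lower and upper bounds.
bounds = [
    (kde_quantile(scores[:, k], 0.05), kde_quantile(scores[:, k], 0.95))
    for k in range(scores.shape[1])
]

for k, (lower, upper) in enumerate(bounds):
    print(f"component {k}: variance={eigvals[k]:.3f}, bounds=[{lower:.3f}, {upper:.3f}]")
```

Because the principal components are uncorrelated, each one can be given its own one-dimensional density estimate, and the correlation between the original prices is carried by the eigenvector matrix that maps the components back to price space.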