The widespread use of smart meters and other sensors in the smart grid has resulted in unprecedented amounts of data being generated at high spatial and temporal resolutions. This high-volume data is generated at high velocity from a variety of sources, and is designated as “big data” by researchers and practitioners in various domains, including the smart grid domain. Predictive modeling can be used to learn from this data how electricity consumption patterns change over time and when peak demand periods occur. Utilities routinely face the challenge of ensuring uninterrupted electric supply during peak demand periods. The widespread practice for addressing this challenge is demand response (DR), whereby utilities ask customers to reduce their consumption during peak demand periods according to a priori agreements. For DR, utilities need to decide when, by how much, and how to reduce consumption. While day-ahead predictions have long been used to make these decisions, in this dissertation we address the problem of making predictions and decisions at a few hours’ advance notice, whenever necessitated by the changing conditions of the grid. In particular, we formulate and address the problem of dynamic demand response (D2R) in smart grids, which involves balancing supply and demand in real time and adapting to dynamically changing conditions by automating and transforming the DR planning process. We also focus on the requirements and challenges of predictive modeling of electricity consumption data and its evaluation to enable D2R. For example, prediction models for D2R must satisfy the often conflicting requirements of high accuracy and low computational complexity for fast predictions. First, we identify the limitations of existing measures for evaluating the performance of electricity consumption prediction models in the smart grid and propose a suite of performance measures that address accuracy, reliability, and cost. 
For example, while common error measures consider only the absolute difference between the predicted and observed values, the sign of the difference is very useful for determining whether a prediction was an under-prediction or an over-prediction in applications concerned with predicting peaks during D2R. Our application-dependent measures, with parameterized coefficients set by domain experts, allow model comparison that is meaningful for specific smart grid applications. While our measures have been proposed in the context of the smart grid, their scope and the analysis of their use are relevant for applications beyond the smart grid domain. Our analysis of the measures offers deeper insight into models’ behavior and their impact on real applications, enables intelligent cost-benefit trade-offs between models, and offers a comprehensive measure of goodness of fit for picking the “right” model. We formulate and address the partial data problem that arises when real-time data from all sensors is not available at the utilities, which makes it impossible to produce reliable predictions for D2R using traditional time-series-based models. We propose a novel approach that extends the notion of time series dependency to discover a small subset of “influential” sensors, and uses real-time data only from them to enable accurate predictions for all sensors. Next, we address the problem of predicting reduced consumption during DR, which is required for D2R planning, for selecting customers to participate in D2R, and for calculating their compensation. The abrupt change in consumption profiles at the beginning and end of the DR period and the usually short durations of DR make it impossible to reliably use time-series-based models for predictions. To address the unique challenges of reduced consumption prediction, we propose an ensemble model that uses different sequences of daily consumption on DR event days and contextual attributes for prediction. 
In particular, we leverage big data on reduced consumption to learn a single ensemble model for diverse customers over different time intervals, thus greatly reducing cost in terms of the number of models trained. The reduction is of the order of n × L, where n is the number of customers and L is the number of intervals in the DR period. Also, the low computational complexity of our model makes it well suited to the dynamic decision making required for D2R. The prediction modeling problems addressed in this dissertation are motivated by real-world applications within the University of Southern California (USC) campus microgrid. For training and evaluating our models, we use data from a variety of sources: fine-grained electricity consumption data from the USC campus microgrid collected every 15 minutes for over 7 years; hourly weather data collected from the NOAA weather station on campus; and schedule data from the campus. Our models have been implemented and integrated with the USC Facilities Management Services’ (FMS) solution for D2R on the USC campus.
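
As an illustration of the earlier point about the sign of the prediction error, the following minimal Python sketch compares two hypothetical models whose absolute errors are identical but whose signed errors differ. It is not the suite of measures proposed in the dissertation: the function names (`signed_errors`, `mean_absolute_error`, `mean_signed_error`, `under_prediction_share`) and the consumption values are assumptions made up for this example.

```python
# Illustrative sketch only: these helpers and the data are hypothetical,
# not the performance measures defined in the dissertation.

def signed_errors(observed, predicted):
    """Return predicted - observed per interval (positive = over-prediction)."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return [p - o for o, p in zip(observed, predicted)]


def mean_absolute_error(observed, predicted):
    """Average magnitude of the error; the sign is discarded."""
    errors = signed_errors(observed, predicted)
    return sum(abs(e) for e in errors) / len(errors)


def mean_signed_error(observed, predicted):
    """Average error with sign kept; negative means under-prediction on average."""
    errors = signed_errors(observed, predicted)
    return sum(errors) / len(errors)


def under_prediction_share(observed, predicted):
    """Fraction of intervals in which the model predicted below the observed value."""
    errors = signed_errors(observed, predicted)
    return sum(1 for e in errors if e < 0) / len(errors)


# Made-up consumption values (kWh) for four intervals.
observed = [100.0, 120.0, 150.0, 130.0]
model_a = [110.0, 130.0, 160.0, 140.0]  # always 10 kWh too high
model_b = [90.0, 110.0, 140.0, 120.0]   # always 10 kWh too low

for name, predicted in (("A", model_a), ("B", model_b)):
    print(
        name,
        mean_absolute_error(observed, predicted),
        mean_signed_error(observed, predicted),
        under_prediction_share(observed, predicted),
    )
```

Both models have the same mean absolute error, so an absolute-error measure alone cannot separate them, whereas the signed measures show that model B consistently under-predicts. In a peak-prediction setting that distinction may matter more than the error magnitude itself.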