Gaussian processes (GPs) for machine learning have been well studied over the past two decades and are now widely used in many sectors. However, the design of low-complexity GP models remains a challenging research problem. In this paper, we propose a novel scalable GP regression model for processing big datasets using a large number of parallel computation units. In contrast to existing methods, we solve the classic maximum-likelihood-based hyper-parameter optimization problem with a carefully designed distributed alternating direction method of multipliers (ADMM). The proposed method is parallelizable over a large number of computation units. Simulation results confirm the benefits of the proposed scalable GP model over state-of-the-art distributed methods.
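The paper does not give implementation details in the abstract, so the following is only a minimal sketch of the general idea: consensus ADMM applied to GP hyper-parameter estimation, where each computation unit minimizes the negative log-marginal-likelihood of its local data shard plus a quadratic consensus penalty. The data split, RBF kernel, fixed noise level, penalty parameter `rho`, and iteration counts are all illustrative assumptions, not the authors' algorithm.

```python
# Hedged sketch: consensus ADMM for distributed GP hyper-parameter fitting.
# All modeling choices (RBF kernel, fixed noise, rho, bounds) are assumptions.
import numpy as np
from scipy.optimize import minimize

def rbf_kernel(X, theta):
    # theta = [log lengthscale, log signal variance]; small noise term keeps K PD
    ls, sv = np.exp(theta)
    d2 = (X[:, None] - X[None, :]) ** 2
    return sv * np.exp(-0.5 * d2 / ls**2) + 0.01 * np.eye(len(X))

def local_nll(theta, X, y):
    # Negative log-marginal-likelihood of the local shard (constant dropped)
    K = rbf_kernel(X, theta)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * y @ alpha + np.log(np.diag(L)).sum()

def admm_gp(shards, rho=1.0, iters=15):
    J = len(shards)
    z = np.zeros(2)                     # global consensus variable
    thetas = np.zeros((J, 2))           # local hyper-parameter copies
    us = np.zeros((J, 2))               # scaled dual variables
    for _ in range(iters):
        for j, (X, y) in enumerate(shards):
            # Local step: augmented-Lagrangian subproblem per computation unit
            obj = lambda th, j=j: (local_nll(th, X, y)
                                   + 0.5 * rho * np.sum((th - z + us[j]) ** 2))
            thetas[j] = minimize(obj, thetas[j], method="L-BFGS-B",
                                 bounds=[(-3.0, 3.0)] * 2).x
        z = (thetas + us).mean(axis=0)  # consensus (averaging) step
        us += thetas - z                # dual update
    return z

# Toy data split across 3 hypothetical workers
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(-3, 3, 60))
y = np.sin(X) + 0.1 * rng.standard_normal(60)
shards = [(X[i::3], y[i::3]) for i in range(3)]
z = admm_gp(shards)  # shared log-hyper-parameters after consensus
```

The inner local updates are independent across shards, which is what makes the scheme parallelizable over many computation units; only the averaging and dual steps require communication.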