With the rapid development of China's economy, domestic and industrial water consumption has increased significantly, and sewage discharge has risen with it. Sewage sources are widely distributed, discharge times are irregular, and many indicators must be monitored. Traditional solutions suffer from long collection periods, a large number of evaluation indicators, high data dimensionality, and low processing efficiency. To address these problems, this work analyzes the characteristics of wastewater data in depth and, drawing on big data processing technology, designs and implements a pollution index analysis system based on big data. After the dimensionality of the sewage data is reduced, the k-means clustering algorithm is applied to improve the clustering of the sewage data and to identify the main indicators that exceed the discharge standard. The reference indicators for wastewater treatment in this paper are COD (chemical oxygen demand), TN (total nitrogen), TP (total phosphorus), and NH3-N (ammonia nitrogen). The IBR biological treatment process is selected to treat the sewage, and the treatment results are obtained.
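The reduce-then-cluster idea can be illustrated with a minimal sketch. It is not the system described in this paper. The abstract does not name the dimensionality reduction method, so PCA is assumed here. The sample values, the function names `pca_reduce` and `kmeans`, and the choice of two clusters are all hypothetical and serve only to show the flow from indicator readings to cluster labels.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Standardize each column, then project onto the top principal components.

    PCA is an assumed choice; the source does not specify the reduction method.
    """
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal directions of the standardized data.
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    return Xs @ Vt[:n_components].T

def assign(X, centers):
    """Return the index of the nearest center for every row of X."""
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means with random initial centers drawn from the data."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centers)
        # Keep the old center if a cluster happens to be empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(X, centers), centers

# Synthetic readings (hypothetical values, mg/L), columns: COD, TN, TP, NH3-N.
rng = np.random.default_rng(42)
low_mean = np.array([50.0, 15.0, 0.5, 5.0])     # assumed "lightly polluted" group
high_mean = np.array([300.0, 60.0, 6.0, 40.0])  # assumed "heavily polluted" group
low = low_mean * (1 + 0.05 * rng.standard_normal((20, 4)))
high = high_mean * (1 + 0.05 * rng.standard_normal((20, 4)))
X = np.vstack([low, high])

Z = pca_reduce(X, n_components=2)   # 4 indicators -> 2 components
labels, _ = kmeans(Z, k=2)

# Report each cluster's mean indicator values in the original units.
for j in range(2):
    print(j, X[labels == j].mean(axis=0).round(2))
```

Comparing each cluster's mean indicator values against the applicable discharge limits would then show which indicators exceed the standard; the limits themselves are not given in this abstract and are left out of the sketch.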