Predicting urban traffic (e.g., flow and speed) is of great importance to intelligent transportation systems and public safety, yet it is very challenging for two reasons: 1) urban traffic exhibits complex spatio-temporal correlations, comprising spatial correlations between locations and temporal correlations among timestamps; 2) these spatio-temporal correlations are diverse, varying from location to location and depending on the surrounding geographical information, e.g., points of interest and road networks. To tackle these challenges, we propose a deep meta-learning based traffic model, named ST-MetaNet, to collectively predict urban traffic at all locations at once. ST-MetaNet employs a sequence-to-sequence architecture, consisting of an encoder that learns from historical traffic information and a decoder that makes predictions step by step. More specifically, the encoder and decoder have the same network structure, which contains a recurrent neural network (RNN) to encode the urban traffic, a meta graph attention network (Meta-GAT) to capture diverse spatial correlations, and a meta recurrent neural network (Meta-RNN) to capture diverse temporal correlations. Extensive experiments on two real-world datasets illustrate the effectiveness of ST-MetaNet against several state-of-the-art methods.
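
The abstract does not spell out how the meta components obtain location-specific behavior. One common way to realize correlations that "depend on the surrounding geographical information" is to generate a layer's weights from per-location features. The sketch below illustrates that general idea only, under this assumption; it is not ST-MetaNet's actual implementation, and the names `meta_linear`, `w_gen`, and `b_gen` are hypothetical.

```python
def matvec(vec, mat):
    """Row vector times matrix: (d_in,) x (d_in, d_out) -> (d_out,)."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]


def meta_linear(x, meta, w_gen, b_gen, d_out):
    """Per-location linear layer whose weights are generated from meta features.

    x:     list of per-location input vectors, each of length d_in
    meta:  list of per-location geographic feature vectors (assumed given)
    w_gen: (d_meta, d_in * d_out) matrix mapping meta features to flat weights
    b_gen: (d_meta, d_out) matrix mapping meta features to a bias
    """
    outputs = []
    for x_n, m_n in zip(x, meta):
        d_in = len(x_n)
        flat_w = matvec(m_n, w_gen)  # location-specific weights, flattened
        weight = [flat_w[i * d_out:(i + 1) * d_out] for i in range(d_in)]
        bias = matvec(m_n, b_gen)    # location-specific bias
        y = matvec(x_n, weight)
        outputs.append([a + b for a, b in zip(y, bias)])
    return outputs


# Two locations with identical traffic input but different meta features.
x = [[1.0, 2.0], [1.0, 2.0]]
meta = [[1.0, 0.0], [0.0, 1.0]]
w_gen = [[1.0, 0.0], [0.0, 1.0]]
b_gen = [[0.5], [-1.0]]

outputs = meta_linear(x, meta, w_gen, b_gen, d_out=1)
print(outputs)  # [[1.5], [1.0]]
```

Both locations receive the same input, yet their outputs differ because each location's weights come from its own meta features. In a trained model the generator matrices would be learned jointly with the rest of the network.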