We study the problem of modeling joint densities over sets of random variables (next-step movements of multiple agents) conditioned on aligned observations (past trajectories). For this setting, we propose an autoregressive approach to modeling intra-timestep dependencies, in which distributions over joint movements are represented by autoregressive factorizations. In our approach, factors are randomly ordered and estimated with a graph neural network to account for permutation equivariance, while a recurrent neural network encodes past trajectories. We further propose a conditional two-stream attention mechanism that allows efficient training over random factorizations. We experiment on trajectory data from professional soccer matches and find that our approach models low-frequency trajectories better than variational approaches.
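As an illustration of the factorization described above, the joint density over next-step movements can be written with the chain rule under a random ordering of the agents. The notation is an assumption made here for exposition and is not taken from the text: $K$ is the number of agents, $x_t^{k}$ is the movement of agent $k$ at timestep $t$, $h_t$ is the recurrent encoding of the past trajectories, and $\sigma$ is a randomly drawn permutation of $\{1, \dots, K\}$.

```latex
\[
  p\left(x_t^{1}, \dots, x_t^{K} \mid h_t\right)
  = \prod_{k=1}^{K}
    p\left(x_t^{\sigma(k)} \,\middle|\, x_t^{\sigma(1)}, \dots, x_t^{\sigma(k-1)},\, h_t\right)
\]
```

By the chain rule of probability this identity holds for every permutation $\sigma$, so no fixed ordering of the agents has to be imposed, which is consistent with estimating the factors under random orderings.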