Enforcing contact constraints accurately in many-body simulations is extremely challenging yet critically important for achieving high fidelity. Optimization-based discrete element methods enforce contacts geometrically through complementarity constraints leading to a differential variational inequality problem. Compared to force penalty methods, they allow for the use of much larger time steps at the expense of solving a nonlinear complementarity problem each time step. We present our recent work on accelerating the solution of complementarity problems using a novel Tensor Train decomposition approach. We demonstrate that this approach displays sublinear scaling of precomputation costs, may be efficiently updated across Newton iterations as well as across simulation time steps, and leads to a fast, optimal complexity solution of the Newton step. This allows our method to gain an order of magnitude speedups over state-of-the-art preconditioning techniques for moderate to large-scale systems, hence mitigating the computational bottleneck of second order methods. Next, we discuss our MPI implementation of first and second order solvers for complementarity problems and demonstrate scalability on thousands of processors.