We develop a class of nearest-neighbor mixture models that provide direct, computationally efficient, probabilistic modeling for non-Gaussian geospatial data. The class is defined over a directed acyclic graph, which implies conditional independence in representing a multivariate distribution through factorization into a product of univariate conditionals, and is extended to a full spatial process. We model each conditional as a mixture of spatially varying transition kernels, with locally adaptive weights, for each one of a given number of nearest neighbors. The modeling framework emphasizes direct spatial modeling of non-Gaussian data, in contrast with approaches that introduce a spatial process for transformed data, or for functionals of the data probability distribution. We study model construction and properties analytically through specification of bivariate distributions that define the local transition kernels. This provides a general strategy for modeling different types of non-Gaussian data based on the bivariate distribution family, and offers avenues to incorporate spatial association via different dependence measures. Regarding computation, direct spatial modeling facilitates efficient, full simulation-based inference; moreover, the framework leverages its mixture model structure to avoid computational issues that arise from large matrix operations, and thus has the potential to achieve scalability. We illustrate the methodology using synthetic data examples and an analysis of Mediterranean Sea surface temperature observations.