Small-Scale Variability in Bacterial Community Structure in Different Soil Types

Microbial spatial distribution has mostly been studied at field to global scales (i.e., ecosystem scales). However, the spatial organization at small scales (i.e., centimeter to millimeter scales), which can help improve our understanding of the impacts of spatial communities structure on microbial functioning, has received comparatively little attention. Previous work has shown that small-scale spatial structure exists in soil microbial communities, but these studies have not compared soils from geographically distant locations, nor have they utilized community ecology approaches, such as the core and satellite hypothesis and/or abundance-occupancy relationships, often used in macro-ecology, to improve the description of the spatial organization of communities. In the present work, we focused on bacterial diversity (i.e., 16S rRNA gene sequencing) occurring in micro-samples from a variety of locations with different pedo-climatic histories (i.e., from semi-arid, alpine, and temperate climates) and physicochemical properties. The forms of ecological spatial relationships in bacterial communities (i.e., occupancy-frequency and abundance-occupancy) and taxa distributions (i.e., habitat generalists and specialists) were investigated. The results showed that bacterial composition differed in the four soils at the small scale. Moreover, one soil presented a satellite mode distribution, whereas the three others presented bimodal distributions. Interestingly, numerous core taxa were present in the four soils among which 8 OTUs were common to the four sites. These results confirm that analyses of the small-scale spatial distribution are necessary to understand consequent functional processes taking place in soils, affecting thus ecosystem functioning.


Introduction
Microbial diversity is central to soil ecosystem functioning and the ecosystem services soils deliver, as microbial communities are involved in many biogeochemical processes, and their various interactions may affect these activities [1][2][3]. Despite this fundamental role, we still do not have a full understanding of the processes underpinning the emergence and maintenance of soil microbial diversity. Hubbell (2001) defined biodiversity as being synonymous with species richness and relative species abundance in space and time, where relative species abundance refers to the commonness or rarity of a species in relation to others, in a given community. The question of what makes a species common or rare has long been of empirical and theoretical interest in community ecology [4]. Despite a growing number of surveys interested in rare and core microbial communities, only a few recent studies have targeted bacterial spatial distributions, in a range of different ecosystems, such as water, guts, soil, or even the surface of stones [5][6][7][8][9]. These studies have highlighted the fact that the "rare biosphere" is constituted of a myriad of species. The rare communities are of interest as they may be of functional significance. It has been shown that rare taxa are a resource pool for responding to environmental changes [10], maintaining species diversity [11], and promoting functional redundancy [12]. Other studies conducted on aquatic ecosystems at a global scale have shown the presence of a rich and diverse rare biosphere among Archaea and Bacteria [13,14]. Hugoni et al. (2013) reported that the rare archaeal biosphere in the ocean should not solely be characterized as a seed bank of dormant cells; rather, it is a complex association of indigenous and itinerant cell types of different origins and with different fates that might contribute to microbial interaction networks and metabolic processes in the environment.
Although these have yielded a host of information at larger scales, microbial distribution patterns and their consequences for ecosystem functioning are far less documented at small scales [15][16][17], especially in soils. However, it is clear that cell-to-cell, cell-substrate, and cell-substratum interactions that all take place at small scales are likely to have significant effects on overall processes [18,19]. Previous studies have attempted to study bacterial communities and soil properties at sub-millimeter scales, in different contexts as different soil aggregate size classes, aggregate surfaces vs interiors, various soil fractions, or different soil pore distributions [20][21][22][23][24][25]. These studies, using several methods such as fingerprinting [21], clone libraries [24], pyrosequencing [20,22,25], or particulate organic matter characterization (POM) [23], suggested that microaggregates represent micro-environments that select specific bacterial lineages [26]. In order to evaluate the regulatory role of such interactions, the problem of spatial patterns and scale needs to be addressed, and models derived from the community ecology of macro-organisms might give the tools to understand small-scale microbial biodiversity.
The analysis of how biodiversity is distributed across space has employed various spatial approaches in community ecology, based on habitat generalists and specialists [27,28] and on the core-satellite hypothesis [29,30]. Classifying species into habitat generalists and specialists is the first step toward examining the underlying biological and ecological factors leading to the differential distribution of species among habitats [31]. Habitat generalists are defined as broadly distributed microbial taxa, whereas specialists are rare taxa that can be locally abundant [28,32]. The "core-satellite hypothesis" (or "Hanski's hypothesis") proposes that species fall into one of two categories: "core" species, which are widespread and locally abundant, and "satellite" species consisting mostly of rare species that are present in a limited number of sites and at low abundances [29,30]. Core and satellite taxa are classically identified to predict communities' occupancy-frequency distributions (with occupancy defined as the percentage of samples in which a taxon is present and frequency defined as the percentage of taxa found for each occupancy). This characterization of communities' ecology in terms of spatial distribution has often shown a bimodal distribution (i.e., most species are present in either most patches or only a small number of patches). In contrast, generalists and specialists refer to the ability of a taxon to tolerate a range of habitats by considering the spatial distribution of taxa. Previous work focusing on soil bacteria at the small scale showed that an occupancy-frequency relationship was detectable at this small scale, following Hanski's core-satellite hypothesis [8]. This work was done using microarray and 454 pyrosequencing data, using samples of a few milligrams taken along a 22cm-long transect at 1-cm intervals. Here, we extend this study to include a range of soils and use Illumina sequencing in order to derive a more in-depth analysis of the distribution of individual taxa. We chose to include a number of soils with different abiotic properties because it is known that these properties influence the composition and distribution of microbial communities.
Other works have attempted to identify which environmental variables shape microbial diversity. Indeed, previous work reported relationships between microbial distributions and abiotic variables [33][34][35] and aimed to determine the main spatial community driver at a larger scale.
The objective of this study was to investigate in different soils, at the small scale (from 1-cm to 1-m intervals and about 100 to 500 mg weight) whether (1) soil bacterial communities exhibit spatial structures and, if so, whether (2) these structures varied among soils from different environments. We used Hanski's hypothesis (the core and satellite hypothesis) to structure the analysis. The four soils that were used in the study were from four pedo-climatic situations and from three different regions: two Alpine polygonal soils (one from the centers and the other from the sides of polygons), a soil from the Rhône area, and a Sahel soil. Bacterial diversity analyses, evaluated using a 16S rRNA gene metabarcoding approach, were carried out. Community ecology approaches were used to assess the spatial structures using occupancy-frequency distribution patterns (i.e., core or satellite mode distributions) and abundance-occupancy relationships (abundance being calculated as the average percent presence of a taxon in samples). An in-depth analysis of bacterial OTU spatial distribution patterns was conducted by studying core taxa (defined as OTUs present in 100% of the micro-samples for a considered soil) and satellite taxa (defined as present in only 1 micro-sample for a considered soil, these satellite taxa were therefore spatially rare taxa). Moreover, in order to gain a better understanding of the ecological strategies employed by the taxa with regard to their habitat, we classified them according to their distribution across micro-samples (generalist taxa defined as broadly distributed bacterial taxa and specialists taxa were satellite taxa with more than 0.1% of sequences per micro-sample).

Soil Sampling
Soils were sampled in regions presenting four pedo-climatic combinations in three geographic locations: La Vanoise (Van), a loamy polygonal soil under alpine climate, sampled in the center of polygons (Van_PC) and from the sides of polygons (Van_PS) (polygons being typical structures retrieved in soil that experience freezing and thawing alternately); La Dombes (LD), a silty loam soil under temperate climate; and Grande Muraille Verte (GMV), a sandy soil in a semi-arid climate.
Thirty nine micro-samples were collected at La Vanoise The sampling procedure for the Van_PC, Van_PS, and LD soils proceeded as follows: soil cores, 1-cm diameter and 2-cm depth, were taken with a corer along a longitudinal transect, with 1-cm to 10-m lags and 0.2-to 10-m long (Supplementary Table 2). The top 1 mm of the cores (sample depth named a) and a 1-mm thick slice underneath (sample depth named b) were cut with a scalpel blade. The soil water content allowed the sample to remain intact. Samples were freeze-dried and stored at − 80°C for DNA extraction. Two-centimeter depth core was an intermediary size to sample in the field and to bring it back to the lab and subsample the final sample (i.e., the micro-sample) with spatial coordinate. Slice a was analyzed systematically, and slice b was taken in case there was a need to verify data.
In La Vanoise, micro-samples were collected along two 10m transects with lags from 1 cm to 1 m, to ensure that the sampling covered several polygons, from the centers (Van_PC, yellow colored) and sides of polygons (Van_PS, brown colored), respectively (Supplementary Table 2). Micro-sample weights were 119 mg on average (from 70 to 288 mg) for Van_PS and 129 mg on average (from 120 to 150 mg) for Van_PC.
In LD, the micro-samples were collected along a 2-m transect with 1-cm lags, from an area at the edge of a maize field that had not been ploughed for at least 10 years. Microsample weights were 100 mg on average (77 to 144 mg).
For the GMV soil, the sampling procedure of 1-m lags along to a 20-m transect is needed to be adapted as it is a sandy soil, and we could not proceed with cores. As the soil was very sandy, 10 g of soil were sampled every meter at 20 spots in the field and then subsampled in the laboratory (250 to 500 mg), lyophilized, and stored at − 80°C for DNA extraction.

Illumina Sequencing and Sequence Processing
Amplification of the bacterial V4 region of the 16S rRNA genes was performed using the universal primers 515F/806R. High-throughput sequencing was carried out after a multiplexing step using a MiSeq 2 × 300bp PE technology (MR DNA lab, www.mrdnalab.com, Shallowater, TX, USA).
According to the MR DNA analysis pipeline (MR DNA, Shallowater, TX, USA), paired-end sequences were merged, and denoising procedures that consisted in discarding reads containing ambiguous bases (N) or reads that were outside the range of expected length (i.e., < 150 bp) were carried out using Uchime [37]. The remaining sequences were clustered at a 97% similarity threshold [38] with Usearch [39]. Chimeras were detected using Uchime [37] and removed from the dataset. Final OTUs were taxonomically affiliated using BLASTn against a cured database derived from Greengenes, RDPII, and NCBI (www.ncbi.nlm.nih.gov, http://rdp.cme. msu.edu) [40].
Singletons were removed from the dataset. In order to compare micro-samples and to retain the largest possible number of micro-samples, a random resampling normalization step at a depth of 5474 sequences per sample was carried out. It should be noted that this procedure resulted in the exclusion of 1 sample from the Van_PC and 3 samples from the GMV soils as they did not contain 5474 sequences ( Table 1).
The sequencing data of the LD soil have been published in Michelland et al. (2016), and the dataset is available under the accession number PRJEB14534 in the SRA database. Data for other soils have been deposited under the following accession number ERP016178 at EBI.

Statistical Analyses
The soil parameters (Supplementary Table 1, parameters in column) were analyzed by a standardized PCA. Standardization was needed because of the differences in parameter units. The data table was transformed into row percentages (row sums = 1) before doing the PCA, to reduce the impact of the high sequence count variability among OTUs in the soil micro-samples.
The table of diversity data (OTUs in columns) was analyzed by a PCoA on the Jaccard distance matrix.
Computations were performed with the ade4 package [41] for the R Statistical Software [42].

Physico-chemical Characteristics and Bacterial Diversity
Rarefaction curves indicated that the sequencing effort was sufficient to describe bacterial diversity in Van_PC and GMV soils while in Van_PS and LD, some samples do not appear to have reached an asymptote (Supplementary Figure  1). Moreover, it showed that bacterial richness in the Van_PC micro-samples was more variable than in the other soils (Supplementary Figure 1). There were 418 OTUs common to the four soils, equivalent to 16.54% of the total richness (Fig. 1a).
The PCA ordination, based on the physical and chemical properties of the soils (texture, pH, organic carbon, organic matter, C/N, assimilable P, K, Ca, Mg; Supplementary  Table 1), is shown in Fig. 2a. The percentage variability explained by axes 1 and 2 was 59% and 27%, respectively. There was a clear separation among the soils, the LD soil being related to coarse silt (Csi), K, and assimilable P; the GMV soil related to coarse sand; the Van_PC soil to high C/N ratios; and the Van_PS soil to clay and total nitrogen. Both La Vanoise soils (i.e., Van_PC and Van_PS soils) were associated with high fine silt, OM, Mg, and Ca contents. The pH of the La Vanoise soils was close to or above 8 (Supplementary Table 1).
The PCoA ordination graph of bacterial diversity data is shown in Fig. 2b. The first two axes account for 11% and 9% of the total variability. This analysis highlighted that GMV and LD soils exhibited lower intra-diversity compared to Van_PC and Van_PS soils. On the first axis, GMV and LD soils were distinguished from the Van_PC and Van_PS soils, while on the second axis, GMV and LD soils were distinguished.

Abundance-Occupancy and Occupancy-Frequency Relationships
The abundance-occupancy diagrams for each soil showed a positive relationship between the OTU average relative abundances in micro-samples and the proportion of micro-samples in which they were found (Fig. 3). The patterns in the occupancy-frequency diagrams were different however (Fig.  4). The distribution in Van_PC was unimodal with a satellite mode (a large proportion of taxa occupied only one microsample), as highlighted by the ratio of OTUs number core over OTUs number satellite equal to 0.027. Van_PS, GMV and LD presented bimodal distributions with number core OTUs : number satellite OTUs ratios of 0.301, 0.323 and 0.571, respectively. Interestingly, compared to Van_PC, Van_PS, and GMV, a smaller proportion of the taxa that were limited to a small number of micro-samples associated with a larger proportion of taxa and were present in all micro-samples is characteristic of the LD soil (i.e., ratio of OTUs number core over OTUs number satellite of 0.571). Van_PS, GMV, and LD core taxa number are always important, 105, 86, and 155, respectively, but they were less numerous than the satellite taxa. The average abundances (i.e., % of sequences) of satellite OTUs in micro-samples were higher in Van_PC soil than in Van_PS, GMV, and LD soils (Table 1), whereas the average abundances of core OTUs in Van_PC soil (i.e., 21.72%) were much lower compared to other soils (80.96%, 78.80%, and 82.78% for Van_PS, GMV, and LD soils, respectively; Table 1). Core OTUs in Van_PC soil exhibited the lowest frequency compared to the other soils (7.65, 9.11, and 12.86% for Van_PS, GMV, and LD soils, respectively;  Table 1).
The PCoA ordination based on OTU patterns, which explained 11 and 9% of total variability on the first and second axes, respectively, separated soils with positive coordinates for LD and Van_PS soils and negative coordinates for GMV and Van_PC. Moreover, on the second axis (Fig. 2b), it separated LD and GMV. Moreover, the PCoA ordination based on OTU patterns separated the four soils as the PCA ordination of the physico-chemical properties did (Fig. 2). Thus, the spatial distribution type, which separates Van_PC (satellite mode) from Van_PS, GMV, and LD (bimodal), presents a typology that is different from the typology of soil physicochemical characteristics and to the typology based on diversity. The first axis of soil PCA (which explains 59% total variability, Fig. 2a) separates GMV and LD soils from Van_PS and Van_PC. This corresponds to the opposition between semi-arid sandy soil of GMV and organic matter rich soils from La Vanoise, while LD is "positioned" by coarse silt. The second axis (27% explained variability) separates LD and Van_PS from Van_PC and GMV. The community spatial Fig. 1 Venn diagrams presenting the shared and specific OTUs a detected in the four soils, b among core OTUs, and c among satellite OTUs distribution type may thus be driven by a combination of factors different from the ones driving diversity, thus bringing more information about the soil function.

Composition of Core and Satellite Bacterial Communities
Taxonomic investigations, at the phylum_class level (representing > 2% of sequence abundance) on satellite and core OTUs, revealed that some classes, such as Actinobacteria, Alphaproteobacteria, and Betaproteobacteria, were ubiquitous (Fig. 5). Actinobacteria were more abundant among GMV and LD core OTUs than among satellite OTUs, and Deltaproteobacteria were less abundant among core taxa in GMV and LD than among satellite taxa. Globally, Bacilli and Clostridia represented, on average, 10.2 and 0.7% of all sequences, respectively, in the satellite taxa of the four soils, but only 4.1 and 2.8 %, on average in the core taxa (to be noted that Bacilli were enriched in GMV core OTUs representing 13.9% of sequences).

Bacterial Ecological Strategy at the Small-Scale: Habitat Generalists and Specialists
Concerning generalists, among core taxa retrieved in the four studied soils (i.e., 229 OTUs), 34.7% were common to at least two soils, 20.2% common to at least 3 soils, and only 3.5% common to the four soils (Fig. 1b).
Fifty-three specialist OTUs were found in Van_PC, representing 4.8% of the total richness in Van_PC (Table 1). To be noted that in Van_PC, 45.3% of specialist taxa belonged to Proteobacteria and 15.1% belonged to Actinobacteria. In GMV, 9 OTUs were considered specialist, while there were 2 in Van_PS and only 1 in LD (Table 1). Interestingly, specialist OTUs in LD and GMV are specific to these soils, while one specialist OTU was common to Van_PC and Van_PS soils.

Discussion
We use a community ecology approach, drawn from the study of macro-organisms that was used to investigate bacterial community spatial distributions at the micro-scale across four different soils. The micro-sampling (few hundred milligrams: from 100 to 500 mg) was carried out on undisturbed soil so as to take the spatial distribution of taxa in the micro-samples into account (within and between micro-samples). The analysis of bacterial diversity data combined with soil microsampling suggested that bacteria can be classified in each soil as core or satellite taxa, depending on their distribution at the small scale.

Spatial Distribution Patterns
The separation of the soils on the two PCA and PCoA ordinations suggests that satellite mode distribution, retrieved in Van_PC, is related to coarse sand, high C/N, and pH. Unfortunately, we could not directly correlate bacterial taxa with environmental variables as micro-samples were too small to perform physico-chemical analyses. To date, quantifying environmental variables in micro-samples remains a big technical issue which needed to be resolved in further decades to gain insight into relationships between microbial communities and fine soil physico-chemical properties. Moreover, other complementary variables could be drivers of community spatial structure such as bulk density, soil structural heterogeneity, pore size distributions, and connectivity and organic features (i.e., root architecture or soil invertebrate activity).
While the existence of a rare microbial biosphere, across different ecosystems, is now accepted [43,44], there is no widely accepted definition of rare taxa in the literature. Indeed, arbitrary relative abundance cut-offs, depending on the study, range from 0.1% to 0.01% [45]. In the present work, conducted at the small scale, we used a spatial definition of rare taxa based on micro-sample occupancy, where taxa found in only one micro-sample were considered spatially rare, or satellite taxa [32]. This corresponds to an average relative abundance of 0.027% of the total sequences, when including all micro-samples in this study. Compared to the definition of Lynch and Neufeld (2015), our percentage of rare taxa identified is quite conservative. Because rare taxa are substantial contributors to bacterial community ecology [43,45], their spatial distribution may be an important factor impacting ecological processes and thus soil functioning. Studies focusing on spatial ecology of soil bacteria [46] concluded that the relative importance of the underlying processes contributing to the establishment of bacterial distributions can change with soil characteristics. More generally, Lindh et al. (2017) suggested that analyses of occupancy-frequency patterns can be a highly valuable approach allowing the definition of microbial biomes across environmental gradients [7].
Although a positive small-scale abundance-occupancy relationship has already been identified in the LD soil [8], this study confirmed this relationship and extended it to three other soils, suggesting a certain prevalence of the relationship in soil microbial communities. This is hardly surprising as it is one of the most fundamental patterns in ecology [2]. Moreover, the bacterial community spatial distributions also conformed to Hanski's core and satellite hypothesis [29]. However, the present work was conducted in four soils for which samples  The ten most abundant families per conditions italicized differed a bit in the size and/or mass collected as there was a loamy, a silty loam, and a sandy soil that do not allow the strictly same sampling procedure. Soil heterogeneity and, thus, the requirement for different experimental procedures may have an impact on the results. Gleason (1929) showed that quadrats of the most serviceable size should be chosen to measure frequency distributions of plants and that the information is obscured or lost if the quadrats are either too large or too small [47]. The fact that different ecological strategies were identifiable here suggests that the sample size used was appropriate for the study of bacterial spatial distributions in soils [16]. Compared to other studies, the present work showed the relevance of the smallscale approach in spite of the very low organism's size to sample size ratio [48]. Previous studies investigating the type of occupancy-frequency patterns have suggested that unimodal distributions that are characterized by an excess of rare, endemic species occur more frequently than bimodal distributions [49,50]. Even when bimodal distributions are detected, the magnitude of the core mode, representing cosmopolitan species, is generally smaller than that of the rare species [50]. The three bimodal distributions that we observed indicate that it may be a common community spatial distribution of soil bacteria at the small scale. This suggested that OTUs are either adapted to the majority of micro-habitats in this soil or that these OTUs have high dispersal capabilities favoring the colonization of empty ecological niches. Indeed, core species are believed to owe their regional commonness to the rescue effect (i.e., reduction in local extinction probability [51]), while satellite species are maintained by the habitat heterogeneity [52]. Cadotte and Lovett-Douste (2007) suggested that via the rescue effect, widespread and abundant species should have reduced risk of local extinction. In their work conducted on trees, different community spatial structures were observed [53]. Indeed, trees in degraded environments showed a satellite mode distribution, contrary to the other studied environments. In the Van_PC soil, the satellite mode distribution might represent a functional mode (i.e., corresponding to a stressful state) as in Cadotte and Lovett-Douste (2007).

Bacterial Ecological Strategies
For some authors, identifying core taxa is the first step in defining a "healthy" community and predicting community responses to perturbation [54]. Interestingly, it has been suggested that due to the great microbial diversity among ecosystems, the possibility of any species being present at high abundance in a large range of samples is low, and it is possible that the focus should instead be on higher taxonomic levels or on functional genes [55]. The results presented here are in accordance with this theory as only a few core OTUs are common to the four soils. However, when considering the commonness in only two or three soils, their richness increases, suggesting a patchy distribution rather than a ubiquitous one.

Conclusions
There is an increasing recognition that interactions within bacterial communities can have an impact on community functioning [56,57]. In this contribution, we have shown that the types of small-scale distributions of bacterial communities vary across soils with different physico-chemical features. This suggests that the type and range of interactions that might be expected to occur within microbial communities are likely to vary depending on the pedo-climatic context. We have shown that one soil presented a satellite mode, and the three others presented bimodal distributions. We also identify among bacterial richness retrieved in those four soils microsamples that up to 13% can be classified as core taxa that could be responsible for the stability of soil bacterial communities. Thus, the consideration of microbial spatial distribution that could play a major ecological role should further improve predictions and modelling on diversity modification risks on soil bio-conservation and epidemiology.