Submitted: April 25, 2026 | Accepted: May 06, 2026 | Published: May 07, 2026
Citation: Subramaniam M, Aathibagawan M, Karthikeyan V, Amarnath S, Ranjithkumar R. An Attention-Enhanced PSO-CNN-BiLSTM Model for Precise Soil Phosphorus Content Prediction. J Artif Intell Res Innov. 2026; 2(1): 44-52. Available from:
https://dx.doi.org/10.29328/journal.jairi.1001017
DOI: 10.29328/journal.jairi.1001017
Copyright license: © 2026 Subramaniam M, et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Keywords: Attention mechanism; BiLSTM; Deep learning; Particle swarm optimization; Precision agriculture; Soil phosphorus prediction
An Attention-Enhanced PSO-CNN-BiLSTM Model for Precise Soil Phosphorus Content Prediction
Madhan Subramaniam, Aathibagawan M*, Karthikeyan V, Amarnath S and Ranjithkumar R
Department of Computer Science and Engineering, University College of Engineering Thirukkuvalai, (A Constituent College of Anna University Chennai), Nagapattinam, Tamil Nadu, India
*Corresponding author: Aathibagawan M, Department of Computer Science and Engineering, University College of Engineering Thirukkuvalai, (A Constituent College of Anna University Chennai), India, Email: [email protected]
Background: Soil phosphorus, an essential macronutrient, has a significant impact on agricultural productivity and crop growth. Accurate determination of soil phosphorus is crucial for sustainable agriculture and efficient fertilizer use. Standard laboratory analysis of soil is time-consuming and expensive, which makes it inadequate for monitoring large areas of land. Machine learning (ML) and deep learning now make it possible to estimate soil nutrients from locally measured environmental parameters.
Results: An attention-enhanced Particle Swarm Optimization–Convolutional Neural Network–Bidirectional Long Short-Term Memory (PSO-CNN-BiLSTM) model was developed to predict soil phosphorus content. The proposed method uses a convolutional neural network (CNN) to extract spatial characteristics, a BiLSTM to discover temporal patterns related to the target variable, and an attention mechanism to identify the factors that most strongly affect phosphorus dynamics. Particle Swarm Optimization (PSO) was used to tune the model hyperparameters automatically and improve prediction accuracy.
The model was trained and tested on soil and environmental variables from the Hulan soil monitoring dataset. On the experimental data, the proposed model achieved an R² of approximately 0.96 and an RMSE of around 0.20. The reliability of the model was assessed by examining correlation heat maps, residual distributions, and actual-versus-predicted curves.
Conclusion: The proposed hybrid framework provides an intelligent and effective means of predicting soil nutrients, which benefits precision agriculture and data-driven farm management systems.
Soil nutrients are very important for farming because they determine crop yield and keep the soil fertile [1]. Phosphorus is an important macronutrient for plants. It supports energy transfer, root growth, photosynthesis, and metabolism. The amount of phosphorus in the soil is therefore central to the long-term health of agriculture and to crop yields. In modern farming, supplying the right amount of this nutrient matters because it affects both crop yield and the balance of the environment.
Conventionally, soil nutrients are determined by laboratory chemical analysis of soil samples. These techniques are effective and accurate, but they require significant time, involve extensive manual labor, and come with high costs. These practical limitations make them unsuitable for large-scale or continuous field monitoring. Rapid and cost-effective computational methods are therefore needed to estimate soil nutrient levels from readily available data.
Long Short-Term Memory (LSTM) networks are well suited to time-series data and have proved useful for such tasks [2]. They are particularly useful for soil monitoring datasets, where environmental conditions vary over long periods, because they capture long-term dependencies within data sequences. Adding optimization techniques such as Particle Swarm Optimization (PSO) has improved model performance through the automatic adjustment of network parameters. Previous studies combining PSO and LSTM have yielded positive outcomes for soil nutrient prediction.
Despite this progress, accurately forecasting phosphorus levels is still challenging because of the intricate interactions between environmental conditions and soil chemistry. Phosphorus levels depend on several factors, including soil pH, moisture content, temperature, and electrical conductivity, and traditional deep learning methods do not capture these relationships accurately.
Hybrid deep learning architectures have gained popularity in agricultural informatics as a way to tackle these challenges. CNNs and Bidirectional Long Short-Term Memory (BiLSTM) networks work well together, and the BiLSTM can find temporal dependencies in both directions [3,4]. Frameworks with an attention mechanism focus on the variables that have the most effect on the prediction outcome, which makes the results more accurate and easier to interpret.
This research develops a model called Attention-Enhanced PSO-CNN-BiLSTM to predict soil phosphorus content accurately. CNN layers extract spatial relationships from the soil and environmental input variables. The BiLSTM module learns complex temporal patterns from the Hulan soil monitoring dataset. PSO is used to find the best hyperparameter settings, which makes training more efficient and accurate. The attention mechanism highlights the most important factors that affect phosphorus dynamics. This study makes the following contributions:
i. The integration of CNN and BiLSTM into a hybrid deep learning architecture, along with specialized attention mechanisms for soil phosphorus prediction.
ii. The use of PSO for hyperparameter selection, which reduces the need for manual tuning.
iii. Application to the Hulan soil monitoring dataset, taking into account environmental and soil input variables.
iv. Evaluation with statistical metrics and visual diagnostic plots.
The proposed framework offers a reliable and scalable solution for soil phosphorus estimation, supporting more precise fertilizer management and improved crop planning in smart agriculture systems.
Soil nutrient prediction has become an important research topic in precision agriculture in recent years. A range of conventional and computational methods, such as laboratory analysis, spectroscopy, remote sensing, and machine learning (ML) algorithms, have been proposed to estimate the nutrient content of soil.
Early work in this area focused on conventional soil testing methods that measured nitrogen (N), phosphorus (P), and potassium (K). However, these methods are not always practical because they require extensive physical sampling and laboratory equipment. To overcome these limitations, researchers turned to computational models that estimate nutrient content from soil and environmental variables without complete laboratory processing [5].
A range of machine learning techniques has been applied to agriculture. Support vector machines (SVMs), random forest (RF) models, and artificial neural networks (ANNs) have been employed to predict soil properties through various methods [6-8].
Deep learning models have made the estimation of soil properties feasible. CNNs process high-dimensional data efficiently and are particularly useful for extracting spatial features from complex inputs [9,10]. RNNs and LSTM models are well suited to soil monitoring because they handle time-dependent, sequential readings taken at regular intervals.
One notable study used PSO to select LSTM parameters automatically and developed a predictive model for soil phosphorus levels, referred to as PSO-LSTM [2]. Compared with traditional neural network techniques, this model proved highly advantageous for temporal estimation tasks. However, it concentrated primarily on temporal factors and neglected the spatial relationships among the soil input properties [11,12].
Apart from the well-established LSTM approaches, several other deep learning frameworks have explored feature extraction methods. For example, CNN-based architectures applied to hyperspectral data combine spectral and spatial information to estimate concentrations of soil constituents such as nitrates, organic carbon, and organic matter [9,13,14].
Recent research has emphasized the use of hybrid deep learning models that incorporate various architectural elements to capture intricate data patterns [3,4].
Integrating environmental sensing data with other nearby sensor readings has become an attractive research avenue. Incorporating soil electrical conductivity (EC), moisture readings, and ambient environmental measurements can significantly improve the accuracy of nutrient prediction models. EC features have been used in neural network models to predict soil moisture and potassium levels [15,16].
However, because of the complex interactions between soil composition and environmental factors, accurately estimating soil phosphorus remains difficult. Phosphorus availability depends strongly on soil pH, temperature, moisture, and mineral content, which single-component models capture only partially. Accurate predictive models must account for both the spatial correlations among variables and the temporal influence of environmental conditions [7].
The Attention-Enhanced PSO-CNN-BiLSTM model is proposed to address these gaps. The attention mechanism identifies the most influential input variables, CNN layers perform spatial feature extraction, and BiLSTM networks handle temporal modelling with bidirectional context [17].
This section outlines the dataset, pre-processing steps, model architecture, optimization methods, and evaluation metrics utilized in this study.
Dataset description
The dataset used in this study was obtained from the Hulan soil monitoring dataset, which contains time-series observations of soil and environmental parameters collected using agricultural monitoring systems. The dataset includes measurements recorded from sensor-based soil monitoring stations deployed in the agricultural fields. These sensors continuously measure multiple environmental and soil attributes that influence the soil nutrient dynamics.
The dataset contained approximately 2000 observations, each representing a set of environmental and soil parameters measured at regular time intervals. The recorded variables included soil temperature, soil moisture, soil electrical conductivity, soil pH, air temperature, air humidity, and light intensity. These parameters are widely recognized as important indicators of soil nutrient availability and crop growth conditions [2,16] (Table 1).
| Table 1: Dataset feature description. | ||
| Feature | Description | Unit |
| Soil Temperature | Temperature of soil layer | °C |
| Soil Moisture | Water content in soil | % |
| Soil Electrical Conductivity | Soil salinity indicator | dS/m |
| Soil pH | Soil acidity/alkalinity | – |
| Air Temperature | Ambient temperature | °C |
| Air Humidity | Relative humidity | % |
| Light Intensity | Solar radiation | lux |
| Soil Phosphorus | Target nutrient concentration | mg kg⁻¹ |
The dataset includes the following variables:
- Soil temperature
- Soil moisture
- Electrical conductivity
- Soil pH
- Air temperature
- Air humidity
- Light intensity
- Soil phosphorus content (target variable)
Each record in the dataset also contained the soil phosphorus concentration, which is used as the target variable in this study. The data exhibit temporal variations owing to environmental changes and agricultural conditions, making them suitable for time-series modelling and deep learning-based prediction approaches.
Before model development, the dataset was subjected to several pre-processing steps, including data cleaning, normalization, and time-series windowing to ensure data quality and improve model performance. Missing or inconsistent values were handled appropriately, and the input features were normalized to maintain a consistent scale across variables.
This dataset provides a comprehensive representation of soil and environmental conditions, enabling the proposed hybrid PSO-CNN-BiLSTM-Attention model to learn complex spatial and temporal relationships for accurate soil phosphorus prediction.
Data pre-processing
Before training the deep learning model, various pre-processing techniques were applied to ensure data quality and model performance.
Data normalization: All input variables were normalized with a standard scaler, which subtracts the feature mean and scales each feature to unit variance.
This is shown in the following equation:
$$X' = \frac{X - \mu}{\sigma}$$
where:
- $X'$ is the normalized feature value
- $X$ represents the original feature value
- $\mu$ is the mean of the feature
- $\sigma$ is the standard deviation
This ensures that all variables contribute on a comparable scale during training.
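The normalization step can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name `standardize`, the toy values, and the choice to estimate the mean and standard deviation on the training rows only are assumptions made here (the last one is a common way to avoid leaking test statistics into training).

```python
import numpy as np

def standardize(train, test):
    """Z-score each feature using statistics estimated on the training rows.

    Fitting on the training rows only is an assumption of this sketch.
    """
    mu = train.mean(axis=0)                     # per-feature mean
    sigma = train.std(axis=0)                   # per-feature (population) std
    sigma = np.where(sigma == 0, 1.0, sigma)    # guard against constant features
    return (train - mu) / sigma, (test - mu) / sigma

# Toy data (hypothetical): two features, e.g. soil temperature and soil pH
train = np.array([[20.0, 6.0], [22.0, 6.5], [24.0, 7.0], [26.0, 7.5]])
test = np.array([[23.0, 6.75]])
train_z, test_z = standardize(train, test)
```

After the transform, every training feature has zero mean and unit variance, so no single variable dominates because of its measurement unit.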
Time-series windowing: To exploit the ability of the Bidirectional Long Short-Term Memory (BiLSTM) network to learn over time, the input dataset, which comprises soil and environmental measurements from the Hulan soil monitoring dataset, was arranged into sequential windows. Arranging the data as a sequence allows the neural network to learn the correlation between successive observations.
The sequential input samples were generated with the sliding window method. In this method, a single input sequence is formed from a fixed number of consecutive observations, and the associated soil phosphorus value is used as the target for prediction.
A sequential input is represented mathematically as follows:
$$X_t = [x_{t-w+1}, x_{t-w+2}, \ldots, x_t] \rightarrow P_{t+1}$$
where:
- $X_t$ signifies the input sequence at time step $t$
- $w$ indicates the window size
- $x_t$ denotes the vector of soil and environmental variables at time step $t$
- $P_{t+1}$ represents the soil phosphorus value to be predicted
The sliding window mechanism moves sequentially across the dataset to generate several training samples. This method enables the BiLSTM network to capture dependencies among soil and environmental variables, thereby improving the model’s ability to predict soil phosphorus content.
For this study, a temporal window size of w = 10 was chosen. Each input sequence therefore consists of 10 consecutive time steps of soil and environmental measurements, which are used to predict the phosphorus value at the next time step. This window size was selected based on initial experiments that balanced temporal context against computational efficiency on the Hulan dataset.
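The sliding-window construction can be illustrated as follows. This is a sketch under stated assumptions: the function name `make_windows` is hypothetical, and the arrays are random placeholders with the dataset's approximate shape (2000 rows, 7 input variables), not the Hulan data.

```python
import numpy as np

def make_windows(features, target, w=10):
    """Pair each window of w consecutive rows with the next-step target."""
    X, y = [], []
    for t in range(w - 1, len(features) - 1):
        X.append(features[t - w + 1 : t + 1])   # rows t-w+1 .. t
        y.append(target[t + 1])                 # phosphorus at step t+1
    return np.array(X), np.array(y)

# Placeholder data with the approximate dataset shape (values are random)
rng = np.random.default_rng(0)
features = rng.normal(size=(2000, 7))
target = rng.normal(size=2000)
X, y = make_windows(features, target, w=10)
```

With 2000 observations and w = 10, this yields 1990 samples, each of shape 10 × 7, because the first usable window ends at the tenth row and the last one must leave a following row for the target.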
Train-test split: The dataset was divided into the following:
- 80% training data
- 20% testing data
This ensures that the model is evaluated on unseen data to measure the prediction accuracy.
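A minimal sketch of the 80/20 split is shown below. The text does not state whether the split was random or chronological; this example assumes a chronological split (earliest 80% for training), which is a common choice for time-series data, and the name `chronological_split` is hypothetical.

```python
import numpy as np

def chronological_split(X, y, train_fraction=0.8):
    """Keep the earliest samples for training and the latest for testing."""
    cut = int(round(len(X) * train_fraction))
    return X[:cut], X[cut:], y[:cut], y[cut:]

# Placeholder arrays standing in for 1990 windowed samples
X = np.arange(1990 * 2).reshape(1990, 2)
y = np.arange(1990)
X_train, X_test, y_train, y_test = chronological_split(X, y)
```

Splitting by time keeps every test sample later than every training sample, so the test error reflects forecasting on data the model has not seen.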
Proposed PSO-CNN-BiLSTM-attention model
Several deep learning techniques were combined in the proposed model to increase prediction accuracy: a CNN, a bidirectional LSTM network, an attention mechanism, and particle swarm optimization.
Convolutional neural network (CNN): Feature extraction was performed on the input variables using a CNN layer. It detects spatial relationships between soil parameters and environmental factors.
The convolution operation is:
$$F = \sigma(W * X + b)$$
where:
- $F$ is the resulting feature map
- $W$ represents convolution weights
- $X$ is the input sequence
- $b$ is the bias
- $\sigma$ is the activation function
The CNN layer allows the model to identify the important patterns in the input features.
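To make the operation concrete, the sketch below applies a one-dimensional convolution along the time axis of a single input window. The kernel size of 3, the four filters, and the ReLU activation are assumptions chosen for illustration; the text does not specify them. As in most deep learning libraries, the kernel is not flipped, so this is strictly a cross-correlation.

```python
import numpy as np

def conv1d_relu(X, W, b):
    """Valid 1-D convolution over time followed by ReLU.

    X: (T, d) input window, W: (k, d, f) kernels, b: (f,) biases.
    Returns an array of shape (T - k + 1, f).
    """
    T, _ = X.shape
    k, _, f = W.shape
    out = np.empty((T - k + 1, f))
    for t in range(T - k + 1):
        # Sum over k time steps and d input variables for each filter
        out[t] = np.tensordot(X[t : t + k], W, axes=([0, 1], [0, 1])) + b
    return np.maximum(out, 0.0)

# Toy window: 10 time steps, 7 variables, 4 filters of width 3 (all assumed)
X = np.ones((10, 7))
W = np.full((3, 7, 4), 0.1)
b = np.zeros(4)
feature_map = conv1d_relu(X, W, b)
```

Each output row summarizes three consecutive time steps across all seven variables, which is how the layer picks up local patterns in the input.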
Bidirectional long short-term memory (BiLSTM): The Bidirectional LSTM network captures temporal dependencies in both forward and backward directions [4].
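The hidden state of the LSTM at time step $t$ is computed from the current input and the previous hidden state:
$$h_t = \mathrm{LSTM}(x_t, h_{t-1})$$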
BiLSTM combines two LSTM networks:
- forward LSTM
- backward LSTM
This enables the model to learn temporal patterns more effectively.
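The bidirectional idea can be sketched with a plain NumPy LSTM: one pass reads the window forward, a second pass reads it backward, and the two hidden states are concatenated at each time step. The standard LSTM gate equations are used; the sizes, the random weights, and the function names are assumptions for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_pass(X, W, U, b):
    """Run one LSTM over X (T, d) and return hidden states of shape (T, h).

    W: (d, 4h), U: (h, 4h), b: (4h,) stack the input, forget, candidate
    and output gate parameters along the last axis.
    """
    h_dim = U.shape[0]
    h = np.zeros(h_dim)
    c = np.zeros(h_dim)
    states = []
    for x_t in X:
        i, f, g, o = np.split(x_t @ W + h @ U + b, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # cell state update
        h = sigmoid(o) * np.tanh(c)                    # hidden state
        states.append(h)
    return np.array(states)

def bilstm(X, fwd_params, bwd_params):
    """Concatenate forward and backward hidden states at each time step."""
    forward = lstm_pass(X, *fwd_params)
    backward = lstm_pass(X[::-1], *bwd_params)[::-1]   # re-align to time order
    return np.concatenate([forward, backward], axis=1)

rng = np.random.default_rng(0)
T, d, h = 10, 7, 4   # window length, input variables, hidden units (assumed)

def init_params():
    return (rng.normal(scale=0.1, size=(d, 4 * h)),
            rng.normal(scale=0.1, size=(h, 4 * h)),
            np.zeros(4 * h))

fwd_params, bwd_params = init_params(), init_params()
X = rng.normal(size=(T, d))
H = bilstm(X, fwd_params, bwd_params)   # shape (T, 2h)
```

At any time step, the first half of a row of `H` has seen only the past of the window and the second half only its future, which is what gives the BiLSTM context in both directions.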
Attention mechanism: The attention mechanism helps the model focus on the most relevant features when predicting soil phosphorus [18].
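The attention weight is calculated as:
$$\alpha_t = \frac{\exp(e_t)}{\sum_{k=1}^{T} \exp(e_k)}$$
where $e_t$ is the attention score assigned to the hidden state $h_t$ and $T$ is the number of time steps in the sequence.
The final output is computed as a weighted sum of hidden states:
$$c = \sum_{t=1}^{T} \alpha_t h_t$$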
This mechanism improves model interpretability and prediction accuracy.
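A small NumPy sketch shows how softmax attention turns a sequence of hidden states into one weighted summary. The scoring function used here (a dot product between `tanh` of each hidden state and a vector `v`) is an assumption; the text does not say how the scores are computed, and `attention_pool` is a hypothetical name.

```python
import numpy as np

def attention_pool(H, v):
    """Softmax attention over time.

    H: (T, h) hidden states, v: (h,) scoring vector (assumed form).
    Returns the weights alpha (T,) and the context vector (h,).
    """
    scores = np.tanh(H) @ v              # one score per time step
    scores = scores - scores.max()       # subtract max for numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()
    context = alpha @ H                  # weighted sum of hidden states
    return alpha, context

# Toy hidden states for three time steps with two units each
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
v = np.array([1.0, -1.0])
alpha, context = attention_pool(H, v)
```

The weights are positive and sum to one, so they can be read directly as the relative importance the model assigns to each time step.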
Particle swarm optimization (PSO): Particle Swarm Optimization is used to determine optimal hyperparameters for the deep learning model.
In the search space, each particle signifies a potential solution.
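Velocity update equation:
$$v_i^{k+1} = \omega v_i^{k} + c_1 r_1 (p_{best,i} - x_i^{k}) + c_2 r_2 (g_{best} - x_i^{k})$$
Position update equation:
$$x_i^{k+1} = x_i^{k} + v_i^{k+1}$$
where:
- $v_i$ is particle velocity
- $x_i$ is particle position
- $p_{best}$ is the personal best position
- $g_{best}$ is the global best position
- $\omega$ is the inertia weight, $c_1$ and $c_2$ are the cognitive and social coefficients, and $r_1$, $r_2$ are random numbers drawn uniformly from [0, 1]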
PSO optimizes parameters such as:
- Number of CNN filters
- Number of BiLSTM units
- Learning rate
PSO configuration: The PSO algorithm was set up with a swarm of 30 particles and ran for a maximum of 50 iterations. The cognitive coefficient (c1) and social coefficient (c2) were both set to 2.0. The inertia weight (ω) started at 0.9 and decreased linearly to 0.4 to balance global exploration and local exploitation. The search limits for each hyperparameter were set as follows: CNN filters between 16 and 128, BiLSTM units between 32 and 256, and the learning rate between 0.0001 and 0.01.
Optimized hyperparameter values: When PSO reached a solution, it found the best setup for the proposed model: 64 CNN filters, 128 BiLSTM units, and a learning rate of 0.001. These values were utilized for all concluding training and assessment experiments documented in this study.
| PSO configuration and optimized hyperparameter values. | |
| Parameter | Setting |
| Swarm size | 30 particles |
| Maximum iterations | 50 |
| Cognitive coefficient (c1) | 2.0 |
| Social coefficient (c2) | 2.0 |
| Inertia weight (ω) | 0.9 → 0.4 (linear decay) |
| CNN filters (search range) | [16, 128] → Optimized: 64 |
| BiLSTM units (search range) | [32, 256] → Optimized: 128 |
| Learning rate (search range) | [0.0001, 0.01] → Optimized: 0.001 |
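The configuration above can be exercised with a compact inertia-weight PSO. This is a sketch, not the authors' implementation: in the study the fitness of a particle would be the validation error of a trained network, whereas `toy_fitness` below is a stand-in bowl-shaped function with an arbitrarily chosen minimum so that the example runs in milliseconds. Clipping positions to the search bounds is also an assumption.

```python
import numpy as np

def pso(fitness, lower, upper, n_particles=30, n_iter=50,
        c1=2.0, c2=2.0, w_start=0.9, w_end=0.4, seed=0):
    """Minimise `fitness` inside the box [lower, upper] with inertia-weight PSO."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    x = rng.uniform(lower, upper, size=(n_particles, lower.size))  # positions
    v = np.zeros_like(x)                                           # velocities
    pbest = x.copy()
    pbest_val = np.array([fitness(p) for p in x])
    best = int(pbest_val.argmin())
    gbest, gbest_val = pbest[best].copy(), float(pbest_val[best])
    history = [gbest_val]
    for k in range(n_iter):
        # Inertia weight decays linearly from w_start to w_end
        w = w_start - (w_start - w_end) * k / max(n_iter - 1, 1)
        r1 = rng.random(x.shape)
        r2 = rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lower, upper)        # keep particles inside bounds
        vals = np.array([fitness(p) for p in x])
        improved = vals < pbest_val
        pbest[improved] = x[improved]
        pbest_val[improved] = vals[improved]
        best = int(pbest_val.argmin())
        if pbest_val[best] < gbest_val:
            gbest, gbest_val = pbest[best].copy(), float(pbest_val[best])
        history.append(gbest_val)
    return gbest, gbest_val, history

# Search bounds from the text: CNN filters, BiLSTM units, learning rate
LOWER = [16, 32, 1e-4]
UPPER = [128, 256, 1e-2]

def toy_fitness(p):
    # Stand-in objective (NOT a validation loss): minimum at an arbitrary point
    target = np.array([48.0, 96.0, 5e-3])
    scale = np.array(UPPER) - np.array(LOWER)
    return float(np.sum(((p - target) / scale) ** 2))

best_pos, best_val, history = pso(toy_fitness, LOWER, UPPER)
```

`history` records the best fitness after each iteration and never increases, which makes it a convenient convergence check when the toy objective is replaced by a real training run.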
Evaluation metrics
Model performance was evaluated using the following metrics.
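Root Mean Square Error (RMSE)
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}$$
where:
- $y_i$ = actual phosphorus value
- $\hat{y}_i$ = predicted value
- $n$ = number of samples
Coefficient of Determination (R²)
$$R^2 = 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2}$$
where $\bar{y}$ is the mean of the actual values. R² measures how well the model explains the variability in the data.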
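Both metrics are straightforward to compute directly. The sketch below uses the standard definitions of RMSE and R²; the four-value example is made up purely to show the arithmetic and is unrelated to the reported results.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Made-up values: squared errors sum to 0.5, total sum of squares is 5.0
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
example_rmse = rmse(y_true, y_pred)      # sqrt(0.5 / 4)
example_r2 = r2_score(y_true, y_pred)    # 1 - 0.5 / 5.0
```

In this example the RMSE is about 0.354 and R² is 0.9.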
Algorithm flow
Algorithm 1: Attention-enhanced PSO-CNN-BiLSTM
1. Normalize the input features.
2. Create temporal sequences.
3. Establish the PSO population.
4. For each particle, execute the following operations:
A) Develop the CNN-BiLSTM model with attention.
B) Evaluate the fitness.
5. Update the positions and velocities of the particles, and repeat steps 4 and 5 until the maximum number of iterations is reached.
6. Determine the optimal hyperparameters.
7. Train the developed model.
8. Make the soil phosphorus content prediction.
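Step 4 requires turning a particle's continuous position into a concrete model configuration. One way to do this is sketched below, using the search bounds stated earlier. The function name `decode_particle` and the choice to clip and round to the nearest integer are assumptions; the text does not describe how positions are mapped to integer layer sizes.

```python
def decode_particle(position):
    """Map a continuous PSO position to (filters, units, learning_rate)."""
    # Search bounds: CNN filters, BiLSTM units, learning rate
    bounds = [(16, 128), (32, 256), (1e-4, 1e-2)]
    clipped = [min(max(p, lo), hi) for p, (lo, hi) in zip(position, bounds)]
    filters = int(round(clipped[0]))     # layer sizes must be integers
    units = int(round(clipped[1]))
    learning_rate = float(clipped[2])    # learning rate stays continuous
    return filters, units, learning_rate

# A particle that overshoots the upper bound for units and the lower bound
# for the learning rate is pulled back into the search range.
config = decode_particle([63.6, 300.0, 0.00005])
```

The decoded tuple would then be used to build and train one candidate network, whose validation error becomes the particle's fitness.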
This section presents the evaluation of the performance of the developed model, namely the Attention-Enhanced PSO-CNN-BiLSTM, for predicting soil phosphorus content. The model was trained and tested on the Hulan soil monitoring dataset, as explained in the Methodology section.
Model performance evaluation
The performance of the developed model was evaluated using two performance metrics:
- Root Mean Square Error (RMSE)
- Coefficient of Determination (R²)
On the test set, the proposed model with PSO-optimized hyperparameters achieved an RMSE of approximately 0.20 and a coefficient of determination (R²) of 0.96.
Statistical robustness and cross-validation analysis
A 5-fold cross-validation procedure was applied to the Hulan soil monitoring dataset to assess the stability and generalizability of the proposed model. The dataset was split into five equal, non-overlapping parts. Each part was used once as the validation set while the other four parts were used for training. The mean and standard deviation of R² and RMSE were then computed across the five folds (Table 2).
| Table 2: Model performance comparison. | ||
| Model | RMSE (Mean ± SD) | R² (Mean ± SD) |
| LSTM | 0.52 ± 0.04 | 0.81 ± 0.03 |
| CNN-LSTM | 0.39 ± 0.03 | 0.88 ± 0.02 |
| PSO-LSTM | 0.115 ± 0.01 | 0.945 ± 0.01 |
| Proposed PSO-CNN-BiLSTM-Attention | 0.20 ± 0.02 | 0.96 ± 0.01 |
| Results reported as mean ± standard deviation across 5-fold cross-validation. | ||
The proposed PSO-CNN-BiLSTM-Attention model achieved a mean R² of 0.96 ± 0.01 and a mean RMSE of 0.20 ± 0.02 across the five folds, confirming consistent predictive performance with minimal variance between runs. The baseline models were also evaluated under the same cross-validation protocol for a fair comparison. The LSTM model yielded R² = 0.81 ± 0.03 and RMSE = 0.52 ± 0.04, the CNN-LSTM model achieved R² = 0.88 ± 0.02 and RMSE = 0.39 ± 0.03, and the PSO-LSTM model produced R² = 0.945 ± 0.01 and RMSE = 0.115 ± 0.01.
The consistently low standard deviation across all folds indicates that the proposed model is not overfitted to any specific data split and can be applied to unseen soil monitoring observations. These findings suggest that the reported performance metrics are statistically stable and not the outcome of a single favourable train-test split.
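The cross-validation protocol can be sketched as follows. To keep the example fast and self-contained, a least-squares linear model on synthetic data stands in for the deep network; the contiguous fold layout and the use of the sample standard deviation (`ddof=1`) are assumptions, and none of the numbers produced here relate to the reported results.

```python
import numpy as np

def kfold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) for k contiguous, non-overlapping folds."""
    folds = np.array_split(np.arange(n_samples), k)
    for i in range(k):
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, folds[i]

def summarize(scores):
    """Mean and sample standard deviation of per-fold scores."""
    scores = np.asarray(scores, dtype=float)
    return float(scores.mean()), float(scores.std(ddof=1))

# Synthetic regression problem (stand-in for the real dataset and model)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)

fold_rmse = []
for train_idx, val_idx in kfold_indices(len(X), k=5):
    coef, *_ = np.linalg.lstsq(X[train_idx], y[train_idx], rcond=None)
    err = y[val_idx] - X[val_idx] @ coef
    fold_rmse.append(float(np.sqrt(np.mean(err ** 2))))

mean_rmse, sd_rmse = summarize(fold_rmse)
```

Reporting the mean together with the standard deviation, as in Table 2, shows whether performance depends on which fold happened to be held out.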
The model accounts for roughly 96% of the variance in soil phosphorus content, with low forecasting error on unseen data. However, its training efficiency on other datasets was not demonstrated.
In terms of R², the proposed framework surpassed the standalone LSTM, CNN-LSTM (R² = 0.88, RMSE = 0.39), and PSO-LSTM baselines. PSO-LSTM attains a lower RMSE, but the BiLSTM component captures a wider range of temporal patterns, and the proposed model achieves the highest overall R².
By capturing the intricate nonlinearities among soil and environmental parameters, the proposed model offers advantages over traditional deep learning methods, as the results above indicate.
Correlation analysis
A heat map was used to represent the correlations between the soil and environmental parameters that affect phosphorus dynamics. The heat map shows that the phosphorus level is strongly correlated with other parameters, such as humidity, electrical conductivity, and air temperature.
Soil moisture and temperature play important roles in nutrient transformation, which explains their strong correlation with phosphorus levels (Figure 1).
Figure 1: Correlation heat map.
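The matrix behind such a heat map is a table of pairwise Pearson correlations. The sketch below computes it with NumPy on synthetic columns whose relationships are known in advance; the column names are hypothetical and the values are not drawn from the Hulan dataset.

```python
import numpy as np

def correlation_matrix(data):
    """Pearson correlation between the columns of data (n_samples, n_vars)."""
    return np.corrcoef(data, rowvar=False)

# Synthetic columns with known relationships (hypothetical names)
rng = np.random.default_rng(0)
moisture = rng.normal(size=200)
scaled = 2.0 * moisture + 1.0        # perfectly positively correlated
inverted = -moisture                 # perfectly negatively correlated
unrelated = rng.normal(size=200)     # independent noise
data = np.column_stack([moisture, scaled, inverted, unrelated])
corr = correlation_matrix(data)
```

Entries near +1 or -1 indicate a strong linear relationship and entries near 0 indicate none. Passing `corr` to any plotting library's image function produces the heat map itself.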
Actual vs. predicted phosphorus values
To evaluate the predictive accuracy of the proposed model, a plot of the real and predicted phosphorus values was generated. This clearly shows that the proposed model predictions are very similar to the actual values across all of the data in the testing set.
The proposed model tracks variations in phosphorus levels driven by environmental factors, which indicates that the BiLSTM temporal modelling and CNN-based feature extraction function as intended (Figures 2 and 3).
Figure 2: Actual vs. Predicted Phosphorus – Training Set
Figure 3: Actual vs. Predicted Phosphorus – Test set.
Residual error analysis
A residual error analysis method was employed to establish the stability and reliability of the proposed model. The residual error is the difference between the actual and predicted phosphorus values.
The absence of bias in the model is indicated by the fact that most residual error values lie close to zero.
Residual analysis confirms that the proposed hybrid model produces stable predictions and effectively generalizes to unseen data (Figure 4).
Figure 4: Residual curve.
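A simple numerical companion to the residual plot is to summarize the residuals: a mean close to zero suggests no systematic over- or under-prediction. The function name `residual_summary` and the four example values are illustrative assumptions, not results from the study.

```python
import numpy as np

def residual_summary(y_true, y_pred):
    """Summarize residuals (actual minus predicted)."""
    residuals = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return {
        "mean": float(residuals.mean()),            # near 0 means little bias
        "std": float(residuals.std()),              # spread of the errors
        "max_abs": float(np.abs(residuals).max()),  # worst-case error
    }

# Made-up values: residuals are -0.5, 0.5, 0.0, 0.0
summary = residual_summary([10.0, 12.0, 11.0, 13.0], [10.5, 11.5, 11.0, 13.0])
```

Here the residual mean is zero while the spread is not, which illustrates that an unbiased model can still make individual errors.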
Impact of the attention mechanism
The attention mechanism aids model interpretation. The attention layer weighted the input features and identified the variables that had the greatest impact on phosphorus prediction.
Visualizing the attention weights shows that soil humidity, electrical conductivity, and temperature-related variables play an increased role during prediction. This is consistent with known soil nutrient dynamics, as environmental factors affect nutrient availability.
The attention mechanism thus enhanced both prediction and model interpretation (Figure 5).
Figure 5: Impact of the Attention Mechanism.
Agricultural interpretation of results
Moisture, temperature, pH, and electrical conductivity are key factors that influence phosphorus availability in the soil [6,19]. The solubility and adsorption of phosphorus depend on chemical reactions involving these environmental parameters.
Correlation analysis in this study confirmed a considerable relationship between soil phosphorus, humidity, and electrical conductivity. Soil moisture plays an essential role in transporting and dissolving nutrients through the soil profile. While adequate moisture levels support nutrient uptake by enabling phosphorus to dissolve and reach plant roots, excessive moisture can reduce phosphorus mobility and limit its availability to plants.
Soil pH is a decisive factor in determining phosphorus availability [20-22]. In both highly acidic and strongly alkaline soils, phosphorus undergoes chemical fixation, restricting its uptake by plants. Regular monitoring of soil pH alongside other environmental parameters is therefore essential for maintaining soil fertility and ensuring efficient nutrient management.
The PSO-CNN-BiLSTM-Attention model demonstrated strong suitability for predicting soil phosphorus content under varying environmental conditions. With this predictive capability, farmers can make more informed fertilization decisions, applying the right amount of fertilizer at the right time rather than relying on generalised estimates.
Accurate forecasting of soil phosphorus levels is directly linked to improved crop productivity and reduced fertilizer overuse. By supporting targeted nutrient management, the proposed framework contributes to both higher soil fertility and more environmentally responsible agricultural practices.
Experimental evidence indicates that the proposed model, which uses the PSO-CNN-BiLSTM-Attention architecture, is more effective than conventional neural network architectures at predicting soil phosphorus content. Several factors contribute to the improved performance.
First, the CNN layer efficiently extracted spatial relationships among soil variables. Second, the BiLSTM network captured temporal variations in the Hulan soil monitoring dataset. Third, the attention mechanism allows the model to concentrate on the crucial factors that affect phosphorus dynamics. Finally, particle swarm optimization selects the hyperparameters of the model.
The PSO-LSTM model has a lower absolute RMSE of 0.115, but this should be seen in light of its simpler unidirectional architecture and narrower output prediction range. The proposed PSO-CNN-BiLSTM-Attention model learns a more detailed picture of how phosphorus moves across a wider range of concentrations by using CNN-based spatial feature extraction and bidirectional temporal modelling. This naturally leads to a slightly higher absolute RMSE.
The coefficient of determination R² is a better measure of the overall quality of the model. The proposed model gets 0.96, while PSO-LSTM gets 0.945, showing that it explains more variance and generalizes better. Consequently, RMSE alone is inadequate to ascertain that PSO-LSTM surpasses the proposed model, especially when the two models function at varying levels of representational complexity.
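The relationship between the two metrics can be made explicit. Assuming both are computed on the same $n$ samples, with $y_i$ the observed values, $\hat{y}_i$ the predictions, and $\bar{y}$ the mean of the observed values, substituting the standard definition of RMSE into that of R² gives:

$$R^2 = 1 - \frac{\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}{\sum_{i=1}^{n}(y_i-\bar{y})^2} = 1 - \frac{n\,\mathrm{RMSE}^2}{\sum_{i=1}^{n}(y_i-\bar{y})^2} = 1 - \frac{\mathrm{RMSE}^2}{\sigma_y^2}$$

where $\sigma_y^2$ is the population variance of the observed phosphorus values. An RMSE value is therefore only interpretable relative to the spread of the targets on which it was computed, which is why comparisons across different prediction ranges should report both metrics.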
Feature representation and temporal modelling capability are achieved more effectively with the proposed hybrid architecture than with PSO-LSTM models. Prediction accuracy and model robustness are enhanced by this approach. The proposed model offers a favourable solution for forecasting soil nutrients in precision-agriculture systems [1,23].
Transferability and cross-region generalization
Although the proposed PSO-CNN-BiLSTM-Attention model was trained and tested only on the Hulan soil monitoring dataset, its architecture is not specific to any particular region. The model relies on widely available soil and environmental sensor variables, such as soil temperature, moisture content, and electrical conductivity. It could therefore use data from agricultural monitoring stations in other geographic areas to learn the patterns of those regions. Given this generality of inputs, the proposed framework has significant potential to be applied to other soil monitoring datasets with minimal architectural change [24] (Table 3).
| Table 3: Comparison between base model and proposed model. | ||
| Feature | Improved PSO-LSTM (Base Paper) | Proposed PSO-CNN-BiLSTM-Attention |
| Optimization | PSO used for LSTM parameter tuning | PSO used for CNN filters, BiLSTM units & learning rate |
| Deep Architecture | Single LSTM layer | CNN + BiLSTM + Attention |
| Feature Extraction | Direct input to LSTM | CNN extracts local temporal patterns |
| Temporal Modeling | Unidirectional LSTM | Bidirectional LSTM (past + future context) |
| Attention Mechanism | Not used | Attention layer for feature weighting |
| Hyperparameter Strategy | PSO optimization | Runtime-optimized PSO |
| Input Type | Soil parameters (Hulan dataset) | Same dataset (enhanced modeling) |
| Evaluation Metric (R²) | 0.945 | ≈ 0.96 |
| RMSE | 0.115 | 0.20 (higher absolute value due to broader phosphorus prediction range; superior R² confirms stronger overall performance) |
| Interpretability | Limited | Attention provides feature importance |
| Model Robustness | Moderate | Higher due to CNN + Attention |
| Computational Cost | Medium | Slightly higher but optimized for GPU |
Direct deployment to a new region without retraining can lead to performance degradation because of differences in soil composition, climate conditions, and local agricultural practices. Two adaptation approaches are recommended to enable accurate cross-regional generalization. First, fine-tuning the pre-trained model on a small labelled sample from the target region (transfer learning) can adapt the feature representations to new soil environments with minimal data collection effort. Second, when no labelled target-domain data are available, domain adaptation techniques such as feature alignment or adversarial training can mitigate the distributional shift between the source and target datasets. In future work, the model's generalizability will be formally assessed through cross-region validation on publicly available soil monitoring datasets from different agricultural zones.
Computational complexity comparison
A comparison of computational cost across all baseline models was conducted under the same experimental hardware environment (Intel Core i7 CPU, 16 GB RAM, NVIDIA GPU) to evaluate the practical feasibility of the proposed model. The training time per epoch, total trainable parameters, and overall training duration for each model on the Hulan dataset (2000 observations, 80% training split) are presented in the table below.
| Model | Trainable Parameters | Training Time per Epoch | Total Training Time (50 epochs) |
| LSTM | ~18,500 | | |
| CNN-LSTM | ~35,200 | | |
| PSO-LSTM | ~22,000 | | |
| Proposed PSO-CNN-BiLSTM-Attention | | | |
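Trainable-parameter counts of this kind follow directly from the layer dimensions. The sketch below uses the standard per-layer formulas, with a single bias vector per LSTM gate as in Keras; some frameworks keep two bias vectors, which adds slightly more parameters. The layer sizes are hypothetical and are not the configurations behind the table.

```python
def conv1d_params(in_channels, filters, kernel_size):
    """Weights plus one bias per filter."""
    return (kernel_size * in_channels + 1) * filters

def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * (units * (input_dim + units) + units)

def bilstm_params(input_dim, units):
    """A bidirectional layer holds two independent LSTMs."""
    return 2 * lstm_params(input_dim, units)

def dense_params(in_dim, out_dim):
    """Fully connected layer with bias."""
    return (in_dim + 1) * out_dim

# Hypothetical architecture: 6 input features, Conv1D(16 filters, kernel 3),
# BiLSTM(32 units per direction), and a single regression output.
n_features, n_filters, kernel, n_units = 6, 16, 3, 32

total = (
    conv1d_params(n_features, n_filters, kernel)
    + bilstm_params(n_filters, n_units)
    + dense_params(2 * n_units, 1)
)
print("plain LSTM on raw features:", lstm_params(n_features, n_units))
print("Conv1D + BiLSTM + Dense:", total)
```

The recurrent term grows quadratically with the number of units, which is why a bidirectional layer dominates the parameter budget in this kind of hybrid model.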
This study presented an attention-enhanced PSO-CNN-BiLSTM model for predicting soil phosphorus content in precision agriculture [1,25]. Convolutional neural networks capture local features, the bidirectional long short-term memory network models temporal dependencies, and the attention mechanism emphasizes the factors that matter most in the complex nonlinear, time-dependent relationships. Tuning the key hyperparameters with particle swarm optimization reduced manual adjustment and improved the robustness of the model.
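The particle swarm search summarized above can be sketched in a few lines. The example below is a generic PSO with the standard velocity and position updates, applied to a toy objective that stands in for a validation loss over two hyperparameters (log learning rate and number of recurrent units). The objective, bounds, swarm size, and coefficients are assumptions for illustration and are not the settings used in this study.

```python
import numpy as np

def pso_minimize(objective, bounds, n_particles=20, n_iters=60,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` inside box `bounds` with a basic particle swarm."""
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)
    low, high = bounds[:, 0], bounds[:, 1]
    pos = rng.uniform(low, high, size=(n_particles, len(low)))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([objective(p) for p in pos])
    best = int(np.argmin(pbest_val))
    gbest, gbest_val = pbest[best].copy(), float(pbest_val[best])
    for _ in range(n_iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Inertia + pull toward each particle's best + pull toward the swarm's best.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, low, high)
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        best = int(np.argmin(pbest_val))
        if pbest_val[best] < gbest_val:
            gbest, gbest_val = pbest[best].copy(), float(pbest_val[best])
    return gbest, gbest_val

def toy_validation_loss(params):
    """Hypothetical stand-in for training a model and returning its validation loss."""
    log_lr, units = params
    return (log_lr + 3.0) ** 2 + ((units - 64.0) / 32.0) ** 2

search_bounds = [(-5.0, -1.0), (16.0, 128.0)]  # assumed search ranges
best_params, best_loss = pso_minimize(toy_validation_loss, search_bounds)
print(best_params, best_loss)
```

In practice each objective evaluation would train and validate a network, so the swarm size and iteration count determine most of the tuning cost.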
The proposed model was compared with the improved PSO-LSTM model under the same experimental setup and dataset. It achieved a coefficient of determination (R²) of approximately 0.96, higher than that of the PSO-LSTM model (0.945) and of other traditional neural network models. Attention and multistage feature learning are well suited to temporally changing soil nutrient data and can improve predictive accuracy and stability. By accurately predicting soil phosphorus levels, the model can support smart agricultural systems in making more precise fertilization decisions [1,25].
Despite this promising performance, several directions remain for future research. First, the proposed method could be extended to other soil nutrients, such as nitrogen, to support more comprehensive assessments of soil fertility. Second, multimodal sensing data, such as hyperspectral imagery and satellite observations, could be incorporated to improve spatial scalability [10,26]. Third, lightweight model variants and edge AI deployment on embedded devices could enable real-time monitoring of soil nutrients in the field.
The authors would like to express their thanks to the Department of Computer Science and Engineering, University College of Engineering, Thirukkuvalai, Anna University, for providing support to carry out this research work.
- Zhang Y, Wang L, Liu X, Chen H. Machine learning applications for precision agriculture: a comprehensive review. IEEE Access. 2020;8:48499–48517. Available from: https://scispace.com/pdf/machine-learning-applications-for-precision-agriculture-a-595xsgydlw.pdf
- Rao V, Kumar P, Singh R. Soil phosphorus content prediction based on improved PSO-LSTM algorithm. Soil Use Manage. 2023;39(2):734–746.
- Moussaid A, Benali M, Idrissi H. Hybrid CNN-LSTM model for predicting nitrogen, phosphorus and potassium fertilization requirements. Smart Agric Technol. 2025.
- Farhangmehr V, Costa M, Pereira J. Spatiotemporal CNN-LSTM deep learning model for predicting soil environmental variables. Sci Total Environ. 2025. Available from: https://doi.org/10.1016/j.scitotenv.2025.178901
- Tang Y, Zhao L, Wang J, Liu H. A novel self-supervised learning method for accurate multigranularity prediction of soil organic carbon. ISPRS J Photogramm Remote Sens. 2023;201:40–54. Available from: https://doi.org/10.1109/TGRS.2024.3511118
- Kaya F, Kılıç K. Assessing machine learning-based prediction of soil phosphorus using environmental variables. Agriculture. 2022;12(7):1062. Available from: https://www.mdpi.com/2077-0472/12/7/1062
- Folorunso O, Adeyemi A, Ogunleye O. Exploring machine learning models for soil nutrient prediction and agricultural decision support. Algorithms. 2023;7(2):113.
- Shahare YR, Patil S, Deshmukh R. Agriculture soil fertility assessment using random forest algorithms. Procedia Comput Sci. 2024. Available from: https://doi.org/10.1016/j.procs.2024.04.164
- Liu Q, Zhang T, Wang J, Li Y. Application of hyperspectral technology combined with bat algorithm–Adaboost model in field soil nutrient prediction. Comput Electron Agric. 2021;179:105825.
- Zhou M, Li X, Chen Y, Wang H. SSL-SoilNet: a hybrid transformer-based framework with self-supervised learning for large-scale soil organic carbon prediction. IEEE Trans Geosci Remote Sens. 2024;62. Available from: https://arxiv.org/abs/2308.03586
- Huang Y, Li F, Zhang X, Wang L. Prediction of soil organic carbon content using UAV multispectral and LiDAR data. Ecol Indic. 2022;136:108660. Available from: https://doi.org/10.1109/JSTARS.2025.3534238
- Trontelj J, Chambers O, Seme S. Machine learning strategy for soil nutrients prediction using spectroscopic methods. Sensors. 2021;21(12):4208. Available from: https://www.mdpi.com/1424-8220/21/12/4208
- Sharma P, Gupta R, Mena S. Deep-learning-based approach for estimation of fractional abundance of nitrogen in soil from hyperspectral data. Soil Sci Soc Am J. 2022;86(3):812–826. Available from: https://doi.org/10.1109/JSTARS.2020.3039844
- Huang Y, Li F, Zhang X, Wang L. Prediction of soil organic carbon content using UAV multispectral and LiDAR data. Ecol Indic. 2022;136:108660. Available from: https://doi.org/10.1109/JSTARS.2025.3534238
- Park J, Kim S, Lee H, Choi Y. An edge transfer learning approach for calibrating soil electrical conductivity sensors. IEEE Internet Things J. 2024;11(3):4562–4574.
- Mehta R, Patel S, Shah D. Integrating electrical features for simultaneous prediction of soil moisture and potassium levels based on neural network prediction model. Precis Agric. 2022;23:567–584. Available from: https://doi.org/10.1109/TIM.2025.3544382
- Hall RL, Johnson M, Baker T. Machine learning approach to predicting plant-available phosphorus using soil chemical properties. J Soils Sediments. 2024. Available from: https://www.researchgate.net/publication/374058256_A_machine_learning_approach_to_predicting_plant_available_phosphorus_that_accounts_for_soil_heterogeneity_and_regional_variability
- Bulan R, Prasetyo B, Santoso H. Vis-NIR spectra combined with machine learning for prediction of soil nutrient properties. Smart Agric Technol. 2022. Available from: https://doi.org/10.1016/j.cscee.2022.100268
- Gao J, Liu Y, Wang X. Rapid detection of soil available phosphorus using advanced analytical techniques. Sensors. 2024. Available from: https://doi.org/10.2174/0115701794295930240902050855
- Yu X, Zhang Y, Liu H. Prediction model of nitrogen, phosphorus and potassium fertilizer requirements using neural networks. Agronomy. 2024;14(6):1165.
- Guo J, Wang L, Li Y. Machine learning and genetic algorithm for mapping soil phosphorus variability. Ecol Inform. 2024. Available from: https://doi.org/10.1016/j.ecolind.2024.112294
- Lavanya V, Kumar P, Reddy S. Digital soil mapping of available phosphorus using smartphone imaging and machine learning. Smart Agric Technol. 2024.
- Abekoon T, Perera A, Silva R, Rathnayake N, Meddage DPP, Rathnayake U. Justifying the prediction of major soil nutrients levels using deep neural networks. Results Eng. 2024.
- Sajindra H, Wijesinghe D, Fernando K. Deep learning model for predicting soil nitrogen, phosphorus and potassium content. Environ Technol Innov. 2024. Available from: https://doi.org/10.1016/j.mex.2024.102793
- Ennaji O, Elhaddad M, Rahmani A. Machine learning-based optimization of site-specific NPK fertilizer recommendations. Smart Agric Technol. 2026. Available from: https://www.researchgate.net/publication/400073554_Machine_Learning-Based_Optimization_of_Site-Specific_NPK_Fertilizer_Recommendation
- Chen X, Huang Z, Li Y, Zhang Q. Random forest-based soil moisture estimation using Sentinel-2, Landsat-8/9, and UAV-based hyperspectral data. IEEE J Sel Top Appl Earth Obs Remote Sens. 2023;16:2341–2354. Available from: https://www.mdpi.com/2072-4292/16/11/1962