Overview of attacks against civilians throughout the Syrian civil war

Tents for homeless people in stony terrain in Idlib, Syria. © Ahmed Akacha on Pexels

This report presents a research paper on attacks against civilian infrastructure during the Syrian civil war from 2012 to 2018, published in BMJ Global Health.

During the civil war in Syria, hundreds of thousands of people have been killed and millions displaced. Civilian infrastructure was also heavily damaged during that period. The paper examines the damage done to civilian infrastructure by various actors between 2012 and 2018 through a data-driven analysis.

Data on infrastructure class, perpetrator, weapon, governorate, and date were collected and analysed using several statistical methods: Multiple Correspondence Analysis (MCA), k-means clustering, binomial lasso classification, and Cramér’s V coefficient. Overall, the paper reported that violence against healthcare facilities is strongly correlated with specific perpetrators.

The authors obtained data from three sources: Airwars, Physicians for Human Rights (PHR), and Safeguarding Health in Conflict Coalition / Insecurity Insight (SHCC / II). These sources were selected based on the availability of data, the level of detail recorded for each incident, and the verification process used in data collection. The dataset consists of four categorical variables (year, governorate, perpetrator, and weapon) along with five classes of infrastructure: health, private, public, school, and unknown. The infrastructure classes are binary coded as 0 and 1. A total of 2689 attacks from the period between 2012 and 2018 were studied in this paper.
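The layout described above can be sketched as one record per attack. This is only an illustration: the function name, the field names, the example values, and the assumption that one incident may flag more than one infrastructure class are hypothetical and not taken from the paper.

```python
# Hypothetical record layout; names and values are illustrative, not the paper's data.
INFRA_CLASSES = ["health", "private", "public", "school", "unknown"]

def encode_incident(year, governorate, perpetrator, weapon, infrastructure_hit):
    """Return one row: four categorical fields plus one 0/1 flag per infrastructure class."""
    row = {
        "year": year,
        "governorate": governorate,
        "perpetrator": perpetrator,
        "weapon": weapon,
    }
    # Binary coding: 1 if the class was hit in this incident, 0 otherwise.
    for cls in INFRA_CLASSES:
        row[cls] = 1 if cls in infrastructure_hit else 0
    return row

example_row = encode_incident(
    2017, "Raqqa", "US Coalition", "airstrike", {"private", "public"}
)
```

Keeping one 0/1 column per class, rather than a single label column, is what allows each class to be treated as a "present" or "absent" category in the later analysis.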

The data are visualised with a bar graph in which the proportion of attacks is compared across year of attack, region, perpetrator, and weapon. The counts and proportions of attacks on infrastructure by year, governorate, perpetrator, and weapon are also provided in a table. To detect the underlying structure in the dataset, the authors employed MCA, an unsupervised dimension-reduction technique. The strikes are labelled 1 and 0, representing the ‘present’ and ‘absent’ categories for this analysis. The variation between the categorical groups can be interpreted as point coordinates along the axes of variation. The analysis focused on the first and second dimensions of variation, as the remaining variation was evenly dispersed across the other dimensions. K-means clustering was then applied to the row coordinates from the MCA to measure the accuracy of the classification of the variables. The number of clusters was set to eight, based on the binary distance dissimilarity method and k-means analysis.
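The clustering step can be illustrated with a minimal k-means (Lloyd's algorithm) run on two-dimensional coordinates. The coordinates below are made-up stand-ins for MCA row coordinates, the starting centroids are chosen by hand, and two clusters are used instead of the paper's eight to keep the sketch small. It is not the authors' implementation.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd's algorithm on 2-D points, starting from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids

# Hypothetical MCA row coordinates (dimension 1, dimension 2); not values from the paper.
coords = [(0.1, 0.2), (0.0, 0.3), (0.2, 0.1), (2.0, 2.1), (2.2, 1.9), (1.9, 2.0)]
labels, centroids = kmeans(coords, centroids=[coords[0], coords[3]])
```

In practice the starting centroids are usually chosen at random and the algorithm is restarted several times, since k-means can settle in different local optima.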

Furthermore, the paper used binomial lasso classification to explore which variables help anticipate attacks on healthcare facilities. To make the model more interpretable, the lasso shrinks to zero the beta coefficients of variables unrelated to the outcome of interest, which is binomial in this case. The data were split into training and test sets. A model capturing the relationship between the outcome variable and the independent variables was fitted on the training set and then evaluated on the test set using a confusion matrix. The split into training and test data was repeated 20 times to ensure that all the data were in the test set at least once.
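How a lasso penalty drives coefficients exactly to zero can be shown with a small L1-penalised logistic regression fitted by proximal gradient descent. This is a sketch under stated assumptions: the summary does not say which solver or software the authors used, the model has no intercept, and the four-row dataset, the penalty `lam`, and the step size are invented for illustration.

```python
import math

def soft_threshold(value, threshold):
    """Shrink value towards zero; anything within the threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_lasso_logistic(X, y, lam=0.1, step=1.0, n_iter=500):
    """L1-penalised logistic regression (no intercept) via proximal gradient descent."""
    n, d = len(X), len(X[0])
    beta = [0.0] * d
    for _ in range(n_iter):
        # Predicted probabilities under the current coefficients.
        probs = [
            1.0 / (1.0 + math.exp(-sum(b * x for b, x in zip(beta, row))))
            for row in X
        ]
        # Gradient of the mean log-loss with respect to each coefficient.
        grad = [
            sum((p - t) * row[j] for p, t, row in zip(probs, y, X)) / n
            for j in range(d)
        ]
        # Gradient step followed by soft-thresholding (the L1 proximal step).
        beta = [
            soft_threshold(b - step * g, step * lam) for b, g in zip(beta, grad)
        ]
    return beta

# Toy data: the first feature determines the outcome, the second is unrelated to it.
X = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
y = [1, 1, 0, 0]
beta = fit_lasso_logistic(X, y)
```

Here the coefficient of the unrelated second feature ends at exactly zero while the first stays positive, which is the behaviour that makes a lasso model easier to interpret.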

Because attacks on healthcare infrastructure were far fewer than attacks on non-healthcare infrastructure, the area under the receiver operating characteristic curve (AUC-ROC) was calculated, and the healthcare attacks were oversampled to approximately match the number of non-healthcare attacks. To further understand the relationships revealed by the MCA, the strength of the association between the variables was measured using Cramér’s V coefficient. The coefficient ranges from 0 to 1; a value between 0.3 and 0.5 indicates a moderate correlation between variables.
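For reference, Cramér’s V for two categorical variables is the square root of the chi-squared statistic divided by n(k - 1), where n is the number of observations and k is the smaller of the two category counts. A minimal sketch follows; it uses the uncorrected form (whether the authors applied a bias correction is not stated here), and the sample values are invented.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences.

    Assumes each variable takes at least two distinct values.
    """
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    # Pearson chi-squared statistic over every cell of the contingency table.
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals))
    return math.sqrt(chi2 / (n * (k - 1)))

# Invented examples: perfectly associated and completely unassociated variables.
v_perfect = cramers_v(["a", "a", "b", "b"], [1, 1, 0, 0])
v_none = cramers_v(["a", "a", "b", "b"], [1, 0, 1, 0])
```

The two examples mark the ends of the scale: the first pair of variables is perfectly associated and the second shows no association.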

The study found that 91% of the attacks analysed were perpetrated by the US Coalition, the Russian military, and the Syrian Government. The most frequently attacked infrastructure was of unknown type (37%). The highest proportion of attacks occurred in 2017 (37%), and the hardest-hit governorate was Raqqa (27%), closely followed by Aleppo (24%). The most common method of destruction was airstrike (83%). MCA showed that the year, perpetrator, governorate, and health infrastructure variables vary the most along the X and Y axes. This brought out four important topics of interest: (1) US Coalition-led attacks in Raqqa in 2017, (2) Russian strikes against Aleppo in 2016, (3) Syrian government targeting of health facilities, and (4) airstrikes on non-health infrastructure.

These topics highlight that the Syrian government, along with the Russian government, is responsible for the destruction of the majority of health infrastructure. They also bolster the records documenting US Coalition-led attacks in Raqqa in 2017 and Russian attacks on Aleppo in 2016 that resulted in the deaths of thousands of civilians. The paper voiced support for the investigation of violence against civilians in Syria as war crimes and crimes against humanity as defined by the Rome Statute of the International Criminal Court.

The purity scores of the k-means classification of the school, health, weapon, private, and perpetrator variables are above 0.8 and are similar to the scores when eight clusters are used. Binomial lasso classification resulted in six non-zero coefficients: three for infrastructure type (unknown, private, and public) and three for perpetrator (Syria, Russia and Syria, and US Coalition), indicating that these infrastructure types and perpetrators characterise attacks on healthcare facilities. Furthermore, the attacks on healthcare are only moderately correlated with the year and governorate variables, with Cramér’s V coefficients of 0.56 and 0.46 respectively.
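A purity score of this kind can be computed as the share of observations that belong to the majority category of their cluster. The sketch below uses that common definition, which is an assumption, since the exact formula used by the authors is not given here; the cluster assignments and categories are invented.

```python
from collections import Counter, defaultdict

def purity(cluster_labels, class_labels):
    """Fraction of observations that fall in the majority class of their cluster."""
    by_cluster = defaultdict(Counter)
    for cluster, cls in zip(cluster_labels, class_labels):
        by_cluster[cluster][cls] += 1
    # Sum the size of the largest class within each cluster.
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(cluster_labels)

# Invented example: two clusters, one of which mixes two categories.
clusters = [0, 0, 0, 1, 1, 1]
categories = ["health", "health", "school", "school", "school", "school"]
score = purity(clusters, categories)
```

A score of 1 means every cluster contains a single category, so values above 0.8 indicate that most clusters are dominated by one category of the variable in question.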

This paper has several limitations. First and foremost is the difficulty of obtaining high-quality data during the war. Had conflict data such as attacks by local perpetrators and ground strikes against non-healthcare facilities been included, the analysis would have been more informative. In addition, because the data were collected from different sources, inaccuracies inherent in collecting and cleaning them introduced errors. The smaller number of attacks on healthcare facilities may be due to non-random reasons, which could also bias the results. Finally, the interpretation of the non-zero coefficients obtained through lasso classification could be misleading, as few variables were used for the analysis.

The study analysed the attacks that directly affected civilians in Syria from 2012 to 2018. The statistical analysis used in the paper can supplement documented records and eyewitness testimonies for the protection of human rights. The novel data-driven statistical methods implemented in this study also contribute to understanding complex events like wars and their consequences with greater clarity.


Author: Shrabya Ghimire, Editor: Tiago Cotogni
