Load the UCDP conflict termination data in `conflict termination.csv`. Drop any asterisks from observations in `Years`, and drop any text between parentheses in `SideB`. Convert `EpStartDate` to a date object, and then report the range of dates present in the data. Using `Years`, create a dataframe of conflicts that last more than one year, and a second dataframe of conflicts that only span one year.
library(lubridate)
# read in conflict termination data
term <- read.csv('conflict termination.csv')
# drop asterisks
term$Years <- gsub('\\*', '', term$Years)
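# drop anything between parentheses; [^)]* stops at the first closing
# parenthesis, so entries with several parenthetical groups keep the text between them
term$SideB <- trimws(gsub('\\s*\\([^)]*\\)', '', term$SideB))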
# convert episode start date to date
term$EpStartDate <- parse_date_time(term$EpStartDate, orders = c('mdy', 'dmy', 'ymd'))
# range of dates
range(term$EpStartDate)
## [1] "1946-01-01 UTC" "2009-12-17 UTC"
# create dataframe of multiyear conflicts
term[grepl('.*-.*', term$Years), ]
| | Count <int> | Count2 <int> | ConflictID <int> | Ep <int> | ConflEp <fctr> | Location <fctr> |
|---|---|---|---|---|---|---|
| 4 | 4 | 1 | 2 | 1 | 2_1 | Cambodia |
| 5 | 5 | 4 | 3 | 1 | 3_1 | China |
| 6 | 6 | 5 | 4 | 1 | 4_1 | Greece |
| 7 | 7 | 2 | 5 | 1 | 5_1 | Indonesia |
| 9 | 9 | 7 | 6 | 2 | 6_2 | Iran |
| 10 | 10 | 8 | 6 | 3 | 6_3 | Iran |
| 16 | 16 | 4 | 9 | 1 | 9_1 | Laos |
| 17 | 17 | 13 | 10 | 1 | 10_1 | Philippines |
| 18 | 18 | 14 | 10 | 2 | 10_2 | Philippines |
| 20 | 20 | 16 | 10 | 4 | 10_4 | Philippines |
# create dataframe of single year conflicts
term[!grepl('.*-.*', term$Years), ]
| | Count <int> | Count2 <int> | ConflictID <int> | Ep <int> | ConflEp <fctr> | Location <fctr> |
|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 1 | 1 | 1_1 | Bolivia |
| 2 | 2 | 2 | 1 | 2 | 1_2 | Bolivia |
| 3 | 3 | 3 | 1 | 3 | 1_3 | Bolivia |
| 8 | 8 | 6 | 6 | 1 | 6_1 | Iran |
| 11 | 11 | 9 | 6 | 4 | 6_4 | Iran |
| 12 | 12 | 10 | 6 | 5 | 6_5 | Iran |
| 13 | 13 | 11 | 6 | 6 | 6_6 | Iran |
| 14 | 14 | 12 | 7 | 1 | 7_1 | Iran |
| 15 | 15 | 3 | 8 | 1 | 8_1 | Israel |
| 19 | 19 | 15 | 10 | 3 | 10_3 | Philippines |
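
The exercise asks for two dataframes, but the subsets above are only printed, not stored. The following is a minimal sketch of one way to keep them, run on a small made-up data frame so it works without the CSV. The `toy` rows and the names `is_multi`, `multiyear`, and `singleyear` are all invented for illustration and are not part of the UCDP data. It also assumes a hyphen in `Years` always marks a span of more than one year, uses a `[^)]*` pattern so that each parenthetical group is removed separately, and uses base R `as.Date` only because the toy dates are already in ISO format.

```r
# Toy data: values are invented for illustration, not taken from the UCDP file
toy <- data.frame(
  Years       = c("1946*", "1946-1949", "1952", "1967-1968*"),
  SideB       = c("Group A (GA)", "Group B (GB), Group E (GE)",
                  "Group C (GC) ", "Group D"),
  EpStartDate = c("1946-06-01", "1946-03-30", "1952-04-09", "1967-01-15"),
  stringsAsFactors = FALSE
)

# Cleaning steps mirroring the ones in the exercise
toy$Years <- gsub("\\*", "", toy$Years)
toy$SideB <- trimws(gsub("\\s*\\([^)]*\\)", "", toy$SideB))
toy$EpStartDate <- as.Date(toy$EpStartDate)  # toy dates are already ISO formatted

# Assumption: a hyphen in Years marks a span of more than one year
is_multi <- grepl("-", toy$Years, fixed = TRUE)

# Store the two subsets instead of only printing them
multiyear  <- toy[is_multi, ]
singleyear <- toy[!is_multi, ]

# Sanity check: every row lands in exactly one of the two dataframes
stopifnot(nrow(multiyear) + nrow(singleyear) == nrow(toy))

# Range of start dates, ignoring any values that failed to parse
range(toy$EpStartDate, na.rm = TRUE)
```

Building one logical vector and negating it keeps the two subsets complementary by construction, which is harder to guarantee when the pattern is typed twice.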