Individual Exercise Solution

Load the UCDP conflict termination data in conflict termination.csv. Drop any asterisks from the observations in Years, and drop any text between parentheses in SideB. Convert EpStartDate to a date object, then report the range of dates present in the data. Using Years, create one dataframe of conflicts that last more than one year and a second dataframe of conflicts that span only one year.

library(lubridate)

# read in conflict termination data
term <- read.csv('conflict termination.csv')

# drop asterisks
term$Years <- gsub('\\*', '', term$Years)

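# drop anything between parentheses, plus the whitespace before it;
# [^)]* stops at the first closing parenthesis, so two parentheticals
# in one string are removed separately
term$SideB <- gsub('[[:space:]]*\\([^)]*\\)', '', term$SideB)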

# convert episode start date to date
term$EpStartDate <- parse_date_time(term$EpStartDate, orders = c('mdy', 'dmy', 'ymd'))

# range of dates
range(term$EpStartDate)
## [1] "1946-01-01 UTC" "2009-12-17 UTC"
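
The cleaning steps above can be checked on a few values before running them on the full file. The sketch below is a minimal illustration in base R: the `years` and `side_b` vectors are hypothetical and are not taken from the UCDP data. It compares a greedy parenthesis pattern with a negated character class, and shows how a date-time can be turned into a plain Date with `as.Date()`.

```r
# Hypothetical values for illustration; they are not taken from the UCDP file
years  <- c("1946", "1946-1949*", "1950*")
side_b <- c("Group A", "Group B (GB)", "Group C (first), Group D (second)")

# fixed = TRUE treats "*" as a literal character, so no escaping is needed
years_clean <- gsub("*", "", years, fixed = TRUE)

# greedy: .* runs from the first "(" to the last ")" in the string
greedy <- gsub("\\(.*\\)", "", side_b)

# negated class: each parenthetical is removed on its own,
# together with any whitespace directly before it
per_group <- gsub("[[:space:]]*\\([^)]*\\)", "", side_b)

print(years_clean)
print(greedy)
print(per_group)

# as.Date() turns a date-time (POSIXct) into a calendar Date
start <- as.POSIXct(c("1946-01-01", "2009-12-17"), tz = "UTC")
date_range <- range(as.Date(start))
print(date_range)
```

With the greedy pattern, the third value loses "Group D" as well, because the match spans both parentheticals. If SideB never contains more than one parenthetical per entry, the two patterns differ only in the trailing space.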

# create dataframe of multiyear conflicts (Years contains a hyphen)
multi_year <- term[grepl('-', term$Years, fixed = TRUE), ]
multi_year
## (excerpt: six columns and ten rows displayed)
##    Count Count2 ConflictID Ep ConflEp    Location
## 4      4      1          2  1     2_1    Cambodia
## 5      5      4          3  1     3_1       China
## 6      6      5          4  1     4_1      Greece
## 7      7      2          5  1     5_1   Indonesia
## 9      9      7          6  2     6_2        Iran
## 10    10      8          6  3     6_3        Iran
## 16    16      4          9  1     9_1        Laos
## 17    17     13         10  1    10_1 Philippines
## 18    18     14         10  2    10_2 Philippines
## 20    20     16         10  4    10_4 Philippines

# create dataframe of single year conflicts (Years has no hyphen)
single_year <- term[!grepl('-', term$Years, fixed = TRUE), ]
single_year
## (excerpt: six columns and ten rows displayed)
##    Count Count2 ConflictID Ep ConflEp    Location
## 1      1      1          1  1     1_1     Bolivia
## 2      2      2          1  2     1_2     Bolivia
## 3      3      3          1  3     1_3     Bolivia
## 8      8      6          6  1     6_1        Iran
## 11    11      9          6  4     6_4        Iran
## 12    12     10          6  5     6_5        Iran
## 13    13     11          6  6     6_6        Iran
## 14    14     12          7  1     7_1        Iran
## 15    15      3          8  1     8_1      Israel
## 19    19     15         10  3    10_3 Philippines
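
The hyphen test above is one way to split the data. As a hedged alternative, the sketch below derives a numeric duration from Years and splits on that instead. The `toy` data frame and the `Duration` column are hypothetical names introduced for illustration, and the approach assumes Years holds either a single year or a "start-end" range once the asterisks are removed.

```r
# Hypothetical Years values for illustration
toy <- data.frame(Years = c("1946", "1946-1949", "1952", "1967-1968"),
                  stringsAsFactors = FALSE)

# first and last year of each episode; a single year is its own end point
start_year <- as.integer(sub("-.*", "", toy$Years))
end_year   <- as.integer(sub(".*-", "", toy$Years))
toy$Duration <- end_year - start_year

multi_year  <- toy[toy$Duration > 0, ]
single_year <- toy[toy$Duration == 0, ]

# every row should land in exactly one of the two data frames
stopifnot(nrow(multi_year) + nrow(single_year) == nrow(toy))

# pitfall: negative indexing with grep() drops everything when nothing matches
no_hyphen <- c("1946", "1952")
n_negative_grep <- length(no_hyphen[-grep("-", no_hyphen)])   # 0
n_logical_grepl <- length(no_hyphen[!grepl("-", no_hyphen)])  # 2
```

The last lines show why the single-year subset is built with `!grepl()` rather than `-grep()`: when no element matches, `grep()` returns an empty index and negative indexing then selects nothing. The duration approach would also treat a range whose start and end year coincide as a single-year conflict, which the hyphen test would not.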