Working with Strings

Individual Exercise Solution

Load the UCDP conflict termination data in conflict termination.csv. Remove any asterisks from the values in Years, and remove any text between parentheses in SideB. Convert EpStartDate to a date object, then report the range of dates present in the data. Using Years, create one dataframe of conflicts that last more than one year and a second dataframe of conflicts that span only one year.
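Both cleaning steps come down to gsub() with a pattern in which regex metacharacters are matched literally. The sketch below is a minimal illustration on invented values (the vectors years and side_b are assumptions, not rows from the UCDP file). It also shows one thing worth checking in real data: a greedy '.*' between parentheses runs from the first '(' to the last ')', which matters if a value holds more than one parenthetical.

```r
# Invented values for illustration; they are not taken from the UCDP file
years <- c('1946', '1946-1949*', '1952*')
side_b <- c('Group A',
            'Group B (faction one) and Group C (faction two)',
            'Group D (splinter)')

# fixed = TRUE matches '*' literally, an alternative to escaping it as '\\*'
years_clean <- gsub('*', '', years, fixed = TRUE)

# greedy '.*' spans from the first '(' to the last ')'
greedy <- gsub('\\(.*\\)', '', side_b)

# '[^)]*' stops at the first ')'; the leading '\\s*' also removes
# the space that precedes '('
bounded <- gsub('\\s*\\([^)]*\\)', '', side_b)

print(years_clean)
print(greedy)
print(bounded)
```

Which pattern is appropriate depends on how SideB is actually written. If each value has at most one parenthetical, the two patterns differ only in the leftover trailing space, which trimws() would also remove.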

library(lubridate)

# read in conflict termination data
term <- read.csv('conflict termination.csv')

# drop asterisks
term$Years <- gsub('\\*', '', term$Years)

# drop anything between parentheses
term$SideB <- gsub('\\(.*\\)', '', term$SideB)

# convert episode start date to date
term$EpStartDate <- parse_date_time(term$EpStartDate, orders = c('mdy', 'dmy', 'ymd'))

# range of dates
range(term$EpStartDate)

## [1] "1946-01-01 UTC" "2009-12-17 UTC"
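parse_date_time() is given several candidate orders because the raw column may mix date formats. The base R sketch below imitates that idea by trying formats one after another, and shows why it is worth counting failed parses before calling range(). The date strings and the three formats are assumptions made up for this illustration; the formats actually present in EpStartDate are not shown in this document.

```r
# Invented date strings in mixed formats; the last one cannot be parsed
raw <- c('1946-01-01', '12/17/2009', '17.12.2009', 'unknown')

# candidate formats, tried in this order (assumed for the example)
formats <- c('%Y-%m-%d', '%m/%d/%Y', '%d.%m.%Y')

# start with all-missing dates, then fill in whatever each format can parse
parsed <- rep(as.Date(NA), length(raw))
for (f in formats) {
  todo <- is.na(parsed)
  parsed[todo] <- as.Date(raw[todo], format = f)
}

# how many values failed every format?
n_failed <- sum(is.na(parsed))
print(n_failed)

# a single NA makes range() return NA for both ends unless na.rm = TRUE
print(range(parsed))
date_range <- range(parsed, na.rm = TRUE)
print(date_range)
```

The same check applies to the lubridate result: sum(is.na(term$EpStartDate)) reports how many values could not be parsed under any of the supplied orders.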

# create dataframe of multiyear conflicts (Years contains a hyphen)
multiyear <- term[grepl('-', term$Years), ]
multiyear


    Count  Count2  ConflictID  Ep     ConflEp  Location
    <int>  <int>   <int>       <int>  <fctr>   <fctr>
4   4      1       2           1      2_1      Cambodia
5   5      4       3           1      3_1      China
6   6      5       4           1      4_1      Greece
7   7      2       5           1      5_1      Indonesia
9   9      7       6           2      6_2      Iran
10  10     8       6           3      6_3      Iran
16  16     4       9           1      9_1      Laos
17  17     13      10          1      10_1     Philippines
18  18     14      10          2      10_2     Philippines
20  20     16      10          4      10_4     Philippines

# create dataframe of single year conflicts (Years contains no hyphen)
singleyear <- term[!grepl('-', term$Years), ]
singleyear


    Count  Count2  ConflictID  Ep     ConflEp  Location
    <int>  <int>   <int>       <int>  <fctr>   <fctr>
1   1      1       1           1      1_1      Bolivia
2   2      2       1           2      1_2      Bolivia
3   3      3       1           3      1_3      Bolivia
8   8      6       6           1      6_1      Iran
11  11     9       6           4      6_4      Iran
12  12     10      6           5      6_5      Iran
13  13     11      6           6      6_6      Iran
14  14     12      7           1      7_1      Iran
15  15     3       8           1      8_1      Israel
19  19     15      10          3      10_3     Philippines
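The two subsets are complements of each other, so a single logical index can produce both. The sketch below uses an invented data frame (toy and its values are assumptions, not UCDP rows) and assumes, as the hyphen-based pattern in the solution suggests, that a multiyear conflict is recorded in Years as a range such as 1946-1949.

```r
# Invented rows for illustration only
toy <- data.frame(
  Location = c('A', 'B', 'C', 'D'),
  Years = c('1946', '1946-1949', '1952', '1960-1961'),
  stringsAsFactors = FALSE
)

# one logical index: TRUE where Years holds a hyphen, i.e. a range of years
is_multi <- grepl('-', toy$Years, fixed = TRUE)

# store each subset so it can be reused later
multi_year <- toy[is_multi, ]
single_year <- toy[!is_multi, ]

# every row lands in exactly one of the two data frames
stopifnot(nrow(multi_year) + nrow(single_year) == nrow(toy))

print(multi_year)
print(single_year)
```

Building the index once keeps the two subsets consistent with each other: if the rule for what counts as a multiyear conflict changes, only the grepl() call needs editing.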


Rob Williams

October 11, 2018
