I have a data frame with multiple records per patient. After the patient name, the first two columns (V1, V2) are their visit times, the next two (R1, R2) are the stage of their disease at each visit, and the last two (T1, T2) are the treatment they were on. I want to calculate the start and end time for each patient based on the first treatment they were given at visit 1. I was able to figure out the start time using a solution from another post, but I am struggling to find an end time.
Modifying function
I was thinking of using the same method as for the start time, an "ifelse" call, but there are a lot of conditions it needs to take into account. An end time should only be recorded if the patient has a start time. If they do, go to their second record and check whether R2 is "Response" or "Death". If it is, check whether T1 and T2 equal each other. If all of those requirements are met, the end time is that record's V2, and the same check should keep repeating on the following records for as long as the conditions are met.
Here is a reproducible example:
# One row per patient record: visit times (V1, V2), disease stage (R1, R2), treatment (T1, T2).
df <- data.frame(
  Patient = c("Dave", "Dave", "Dave", "Angel", "Angel", "Angel", "Joe", "Joe", "Joe", "Cara", "Cara"),
  V1 = c(1, 150, 375, 1, 150, 375, 1, 150, 375, 1, 150),
  V2 = c(150, 375, 568, 150, 375, 568, 150, 375, 568, 150, 375),
  R1 = c("Disease", "Response", "Response", "Disease", "Disease", "Response", "Disease", "Response", "Response", "Disease", "Response"),
  R2 = c("Response", "Response", "Response", "Disease", "Response", "Death", "Response", "Disease", "Response", "Response", "Death"),
  T1 = c("A", "A", "A", "A", "B", "B", "A", "A", "C", "A", "A"),
  T2 = c("A", "A", "B", "B", "B", "B", "A", "C", "C", "A", "A")
)
# Start time: V2 of the first record (V1 == 1) when the treatment is unchanged and R2 is "Response".
df$start <- ifelse(df$V1 == 1 & df$T1 == df$T2 & df$R2 == "Response", df$V2, NA)
The end time for Dave would be 568 because his treatment was A up to visit 568 and was only changed after that. Angel would have no start time, and therefore no end time, because they never saw a response on the first treatment given. Joe’s end time would be 150 because he stopped seeing a response at visit 375, so his end time would be the same as his start time. Lastly, Cara’s end time would be 375 because we are assuming she was responding up until her death.
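To make the expected results above concrete, here is a rough base R sketch of one way the rules might be written. It is not a definitive solution. The helper name `end_time` is hypothetical, and the sketch rests on two assumptions: records can be ordered by V1 within each patient, and a later record extends the end time while T1 is still the first treatment and R2 is "Response" or "Death". Checking only T1 on later records (rather than T1 equal to T2) is an assumption made so that Dave comes out as 568, as described above.

```r
# Sample data copied from the example above.
df <- data.frame(
  Patient = c("Dave", "Dave", "Dave", "Angel", "Angel", "Angel",
              "Joe", "Joe", "Joe", "Cara", "Cara"),
  V1 = c(1, 150, 375, 1, 150, 375, 1, 150, 375, 1, 150),
  V2 = c(150, 375, 568, 150, 375, 568, 150, 375, 568, 150, 375),
  R1 = c("Disease", "Response", "Response", "Disease", "Disease", "Response",
         "Disease", "Response", "Response", "Disease", "Response"),
  R2 = c("Response", "Response", "Response", "Disease", "Response", "Death",
         "Response", "Disease", "Response", "Response", "Death"),
  T1 = c("A", "A", "A", "A", "B", "B", "A", "A", "C", "A", "A"),
  T2 = c("A", "A", "B", "B", "B", "B", "A", "C", "C", "A", "A")
)

# Hypothetical helper: takes one patient's rows and returns a single end time.
end_time <- function(d) {
  d <- d[order(d$V1), ]   # assumption: V1 gives the visit order
  first <- d[1, ]
  # Same condition as the start time; no start time means no end time.
  if (!(first$V1 == 1 && first$T1 == first$T2 && first$R2 == "Response")) {
    return(NA_real_)
  }
  end <- first$V2
  # Walk the later records and stop at the first one that breaks the rule.
  for (i in seq_len(nrow(d))[-1]) {
    # Assumption: still on the first treatment at the start of this record
    # (T1) and the outcome R2 is "Response" or "Death".
    ok <- d$T1[i] == first$T1 && d$R2[i] %in% c("Response", "Death")
    if (!ok) break
    end <- d$V2[i]
  }
  end
}

# One end time per patient, as a named vector.
ends <- sapply(split(df, df$Patient), end_time)

# Store it on each patient's first record, next to where start is stored.
df$end <- ifelse(df$V1 == 1, unname(ends[as.character(df$Patient)]), NA)

print(ends)
# Angel: NA, Cara: 375, Dave: 568, Joe: 150
```

The loop stops at the first record that fails the check, so a later response on the same treatment would not reopen the interval. If that is not the intended behaviour, the break condition is the part to change.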
I feel this is complicated to understand so I can answer questions in the comments. Thanks in advance!