I have two files: 1) a leafdata file with read count values and 2) a metadata file with sample information. Both are tab-delimited and look like this:
Data file:
genus sample1 sample2 sample3 sample4 ........ sample206
Massilistercora 26 419 16 2974 159
Aminipila 104 59 183 2594 209
Mogibacterium 502 971 591 218 2974
Flintibacter 418 0 981 2397 264
.
.
Metadata file:
samplename group timepoint gender
sample1 case A M
sample2 control B F
sample3 control A F
.
.
.
sample206 case E M
I loaded the data into R as below:
testdata <- read.table("leafdata.txt", sep = "\t", header = TRUE, check.names = FALSE)
Then I checked the dimensions:
dim(testdata)
2874 207
However, when I loaded the metadata the same way:
leafmetadata <- read.table("metadata.txt", sep = "\t", header = TRUE, check.names = FALSE)
and checked its dimensions:
dim(leafmetadata)
206 4
My question is: why do I get 206 for the metadata but 207 for the leafdata, even though the number of samples is the same in both files? This mismatch is causing errors in my downstream analysis. Am I reading the files incorrectly in R?
I would really appreciate it if someone could help me solve this issue. Many thanks in advance!
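To make the comparison concrete, here is a minimal sketch of the check I have in mind. It uses small made-up stand-ins for the two files (three samples, read from inline text rather than from leafdata.txt and metadata.txt). It assumes the first column of the data file is the genus label, so the sample columns are everything except genus. The toy values and the sample_cols name are only for illustration.

```r
# Toy stand-ins for the two tab-delimited files (hypothetical values, 3 samples only)
data_txt <- "genus\tsample1\tsample2\tsample3\nAminipila\t104\t59\t183\nMogibacterium\t502\t971\t591\n"
meta_txt <- "samplename\tgroup\ttimepoint\tgender\nsample1\tcase\tA\tM\nsample2\tcontrol\tB\tF\nsample3\tcontrol\tA\tF\n"

testdata <- read.table(text = data_txt, sep = "\t", header = TRUE, check.names = FALSE)
leafmetadata <- read.table(text = meta_txt, sep = "\t", header = TRUE, check.names = FALSE)

# dim() reports the number of rows first, then the number of columns
dim(testdata)
dim(leafmetadata)

# Sample columns in the data file: every column except the genus label (assumption)
sample_cols <- setdiff(colnames(testdata), "genus")

# Compare the number of sample columns with the number of metadata rows
length(sample_cols) == nrow(leafmetadata)

# List any sample names present in the data file but missing from the metadata
setdiff(sample_cols, leafmetadata$samplename)
```

On the real files, the same two comparisons would be run on testdata and leafmetadata after loading them with read.table as above.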