Creating Histogram and Boxplot in R
This post explains how to create histograms and boxplots in R from a given data set and how to interpret the resulting plots.
The dataset has five variables: 'Frequency', 'BP', 'First', 'Second' and 'FinalDecision'. The ten records are:
1. "0.6","103","bad","low","low"
2. "0.3","87","bad","low","high"
3. "0.4","32","bad","high","low"
4. "0.4","42","bad","high","high"
5. "0.2","59","good","low","low"
6. "0.6","109","good","low","high"
7. "0.3","78","good","high","low"
8. "0.4","205","good","high","high"
9. "0.9","135","NA","high","high"
10. "0.2","176","bad","high","high"
Convert the data above into R vectors and build a data frame from them:
> Freq <- c(0.6,0.3,0.4,0.4,0.2,0.6,0.3,0.4,0.9,0.2) # frequency of hospital visits by each patient during a 12-month period
> BP <- c(103,87,32,42,59,109,78,205,135,176) # blood pressure of each individual patient
> First <- c(1,1,1,1,0,0,0,0,NA,1) # first doctor evaluation of BP, where 1=bad and 0=good
> Second <- c(0,0,1,1,0,0,1,1,1,1) # second evaluation of BP, where 0=low and 1=high
> Final <- c(0,1,0,1,0,1,0,1,1,1) # final decision, where 0=low and 1=high
> hospital.df <- data.frame(Frequency=Freq,BP,First,Second,Final) # combine all five vectors into the hospital data frame
   Frequency  BP First Second Final
1        0.6 103     1      0     0
2        0.3  87     1      0     1
3        0.4  32     1      1     0
4        0.4  42     1      1     1
5        0.2  59     0      0     0
6        0.6 109     0      0     1
7        0.3  78     0      1     0
8        0.4 205     0      1     1
9        0.9 135    NA      1     1
10       0.2 176     1      1     1
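The 0/1 codes do not have to be typed by hand; they can also be derived from the text labels in the raw records. The following is a minimal sketch of that recoding, assuming the mapping described in the comments (bad = 1, good = 0, high = 1, low = 0). The names first_label, second_label and final_label are hypothetical and do not appear in the original code.

```r
# Labels as they appear in the raw records; NA marks the missing first evaluation.
# first_label, second_label and final_label are illustrative names only.
first_label  <- c("bad", "bad", "bad", "bad", "good", "good", "good", "good", NA, "bad")
second_label <- c("low", "low", "high", "high", "low", "low", "high", "high", "high", "high")
final_label  <- c("low", "high", "low", "high", "low", "high", "low", "high", "high", "high")

# Recode to the 0/1 scheme used in the post.
# ifelse() propagates NA, so the missing evaluation stays missing.
First  <- ifelse(first_label == "bad", 1, 0)
Second <- ifelse(second_label == "high", 1, 0)
Final  <- ifelse(final_label == "high", 1, 0)

First
Second
Final
```

Deriving the codes this way keeps the mapping in one visible place, which makes it harder for a typing slip to go unnoticed.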
Find the mean of the final decision and of the BP readings with:
> mean(Final)
> mean(BP)
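As a quick check, the two means can be reproduced from the vectors alone. The sketch below also shows what happens with the First variable, which contains a missing value; the na.rm = TRUE argument is not used in the original and is added here only for illustration.

```r
BP    <- c(103, 87, 32, 42, 59, 109, 78, 205, 135, 176)
First <- c(1, 1, 1, 1, 0, 0, 0, 0, NA, 1)
Final <- c(0, 1, 0, 1, 0, 1, 0, 1, 1, 1)

mean(Final)                # 0.6
mean(BP)                   # 102.6
mean(First)                # NA, because one evaluation is missing
mean(First, na.rm = TRUE)  # mean of the nine recorded evaluations only
```

Without na.rm = TRUE, a single NA makes the whole mean NA, so it is worth checking each variable for missing values before summarizing it.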
Plot side-by-side histograms and boxplots of each variable with 'hist(x, ...)' and 'boxplot(x, ...)'. Calling 'par(mfrow=c(1,5))' first splits the plotting area into one row of five panels, so that all five plots appear in one plotting space.
> par(mfrow=c(1,5))
> hist(hospital.df$BP,col="green",main="Histogram of the BP variable")
> hist(hospital.df$Frequency,col="red",main="Histogram of the Frequency variable")
> hist(hospital.df$First,col="purple",main="Histogram of the First variable")
> hist(hospital.df$Second,col="yellow",main="Histogram of the Second variable")
> hist(hospital.df$Final,col="brown",main="Histogram of the Final variable")
> par(mfrow=c(1,5))
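The boxplot calls themselves are not listed here. A minimal sketch of what they might look like follows, assuming they mirror the histogram calls (same variables, same colors, titles changed to "Boxplot of ..."). The data frame is rebuilt inside the block so that it runs on its own, and saving the old par() settings is an addition for tidiness, not part of the original.

```r
# Rebuild the data frame so this block is self-contained.
hospital.df <- data.frame(
  Frequency = c(0.6, 0.3, 0.4, 0.4, 0.2, 0.6, 0.3, 0.4, 0.9, 0.2),
  BP        = c(103, 87, 32, 42, 59, 109, 78, 205, 135, 176),
  First     = c(1, 1, 1, 1, 0, 0, 0, 0, NA, 1),
  Second    = c(0, 0, 1, 1, 0, 0, 1, 1, 1, 1),
  Final     = c(0, 1, 0, 1, 0, 1, 0, 1, 1, 1)
)

# One row of five panels; par() returns the previous settings so they can be restored.
old_par <- par(mfrow = c(1, 5))
boxplot(hospital.df$BP,        col = "green",  main = "Boxplot of the BP variable")
boxplot(hospital.df$Frequency, col = "red",    main = "Boxplot of the Frequency variable")
boxplot(hospital.df$First,     col = "purple", main = "Boxplot of the First variable")
boxplot(hospital.df$Second,    col = "yellow", main = "Boxplot of the Second variable")
boxplot(hospital.df$Final,     col = "brown",  main = "Boxplot of the Final variable")
par(old_par)
```

For the three 0/1 variables a boxplot carries little information, since there are only two distinct values; the BP and Frequency panels are the ones worth reading closely.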
Results regarding the patients' blood pressure:
From the histogram and boxplot, it can be observed that most of the patients' blood pressure readings lie between 59 and 135, while the remaining readings fall below 50 or above 140. The mean blood pressure is 102.6. The lower quartile is 59 and the upper quartile is 135; these are the hinges that boxplot() draws as the edges of the box. The mean of the final decision ratings is 0.6.
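The quartile figures can be checked numerically rather than read off the plot. The sketch below uses fivenum(), which returns the five values a boxplot is built from, and quantile(), whose default method interpolates and can therefore give slightly different quartiles for a small sample. Neither call appears in the original; they are added here as a cross-check.

```r
BP <- c(103, 87, 32, 42, 59, 109, 78, 205, 135, 176)

# Minimum, lower hinge, median, upper hinge, maximum (the values boxplot() draws).
fivenum(BP)                   # 32 59 95 135 205

# Default quantile() interpolates between neighbouring observations.
quantile(BP, c(0.25, 0.75))   # 63.75 and 128.5

# Share of readings that fall inside the box (between the hinges).
mean(BP >= 59 & BP <= 135)    # 0.6
```

With only ten observations the two definitions of a quartile disagree noticeably, so it helps to state which one a reported value refers to.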
In this way, histograms and boxplots in R make it possible to see where most of the data lie and to compare data sets.



