Creating Histogram and Boxplot in R

This post explains how to create a histogram and a boxplot in R from a given data set, and how to interpret the resulting plots.

The names of the 5 variables in the dataset are as follows:

'Frequency', 'BP', 'First', 'Second', 'FinalDecision'

1.     "0.6","103","bad","low","low"
2.     "0.3","87","bad","low","high"
3.     "0.4","32","bad","high","low"
4.     "0.4","42","bad","high","high"
5.     "0.2","59","good","low","low"
6.     "0.6","109","good","low","high"
7.     "0.3","78","good","high","low"
8.     "0.4","205","good","high","high"
9.     "0.9","135","NA","high","high"
10.    "0.2","176","bad","high","high"

Convert the data above into R vectors and combine them into a data.frame:

> Freq <- c(0.6,0.3,0.4,0.4,0.2,0.6,0.3,0.4,0.9,0.2) # frequency of hospital visits by each patient during a 12-month period
> BP <- c(103,87,32,42,59,109,78,205,135,176) # blood pressure of each individual patient
> First <- c(1,1,1,1,0,0,0,0,NA,1) # first doctor's evaluation of BP, where 1=bad and 0=good
> Second <- c(0,0,1,1,0,0,1,1,1,1) # second doctor's evaluation of BP, where 0=low and 1=high
> Final <- c(0,1,0,1,0,1,0,1,1,1) # final decision, where 0=low and 1=high
> hospital.df <- data.frame(Frequency=Freq,BP,First,Second,Final) # collect everything in the hospital data.frame; the first column is named Frequency

   Frequency  BP First Second Final
1        0.6 103     1      0     0
2        0.3  87     1      0     1
3        0.4  32     1      1     0
4        0.4  42     1      1     1
5        0.2  59     0      0     0
6        0.6 109     0      0     1
7        0.3  78     0      1     0
8        0.4 205     0      1     1
9        0.9 135    NA      1     1
10       0.2 176     1      1     1
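
The text labels were turned into 0/1 values by hand above. As a minimal sketch, the same recoding could be done in R with ifelse(). The names raw and recoded are hypothetical and used only for this illustration; the mapping (bad=1, good=0, high=1, low=0) follows the comments in the code above.

```r
# Raw values as listed at the top of the post (NA marks the missing evaluation)
raw <- data.frame(
  Frequency     = c(0.6, 0.3, 0.4, 0.4, 0.2, 0.6, 0.3, 0.4, 0.9, 0.2),
  BP            = c(103, 87, 32, 42, 59, 109, 78, 205, 135, 176),
  First         = c("bad", "bad", "bad", "bad", "good", "good", "good", "good", NA, "bad"),
  Second        = c("low", "low", "high", "high", "low", "low", "high", "high", "high", "high"),
  FinalDecision = c("low", "high", "low", "high", "low", "high", "low", "high", "high", "high"),
  stringsAsFactors = FALSE
)

# Recode the labels: bad = 1, good = 0; high = 1, low = 0 (NA stays NA)
recoded <- data.frame(
  Frequency = raw$Frequency,
  BP        = raw$BP,
  First     = ifelse(raw$First == "bad", 1, 0),
  Second    = ifelse(raw$Second == "high", 1, 0),
  Final     = ifelse(raw$FinalDecision == "high", 1, 0)
)

str(recoded)  # five numeric columns, ten rows
```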

Find the mean of the final decision and BP rating using the code:

> mean(Final)
[1] 0.6
> mean(BP)
[1] 102.6
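
One thing to watch for: First contains an NA, so mean(First) would return NA rather than a number. A small sketch of how the missing value can be skipped with the na.rm argument of mean() (the vector is repeated here so the snippet runs on its own):

```r
First <- c(1, 1, 1, 1, 0, 0, 0, 0, NA, 1)

mean(First)                # NA, because one evaluation is missing
mean(First, na.rm = TRUE)  # 5/9, about 0.556: the mean of the 9 observed values
```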

Plot side-by-side histograms and boxplots of each variable using hist(x, ...) and boxplot(x, ...). Calling par(mfrow=c(1,5)) first splits the plotting space into a matrix of one row and five columns, so all five plots appear together.

> par(mfrow=c(1,5))
> hist(hospital.df$BP,col="green",main="Histogram of the BP variable")
> hist(hospital.df$Frequency,col="red",main="Histogram of the Frequency variable")
> hist(hospital.df$First,col="purple",main="Histogram of the First variable")
> hist(hospital.df$Second,col="yellow",main="Histogram of the Second variable")
> hist(hospital.df$Final,col="brown",main="Histogram of the Final variable")

> par(mfrow=c(1,5))
> boxplot(hospital.df$BP,col="green",main="Boxplot of the BP variable")
> boxplot(hospital.df$Frequency,col="red",main="Boxplot of the Frequency variable")
> boxplot(hospital.df$First,col="purple",main="Boxplot of the First variable")
> boxplot(hospital.df$Second,col="yellow",main="Boxplot of the Second variable")
> boxplot(hospital.df$Final,col="brown",main="Boxplot of the Final variable")
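
The ten calls above repeat the same pattern, so they could also be written as a loop. The following is only a sketch under a few assumptions: the data frame is rebuilt so the snippet is self-contained, cols and old_par are names introduced here, and a 2-by-5 layout is used to show the histograms and boxplots in one window.

```r
hospital.df <- data.frame(
  Frequency = c(0.6, 0.3, 0.4, 0.4, 0.2, 0.6, 0.3, 0.4, 0.9, 0.2),
  BP        = c(103, 87, 32, 42, 59, 109, 78, 205, 135, 176),
  First     = c(1, 1, 1, 1, 0, 0, 0, 0, NA, 1),
  Second    = c(0, 0, 1, 1, 0, 0, 1, 1, 1, 1),
  Final     = c(0, 1, 0, 1, 0, 1, 0, 1, 1, 1)
)

# One colour per variable, matching the colours used in the post
cols <- c(Frequency = "red", BP = "green", First = "purple",
          Second = "yellow", Final = "brown")

old_par <- par(mfrow = c(2, 5))   # 2 rows x 5 columns; remember the old layout
for (v in names(hospital.df)) {   # first row: histograms
  hist(hospital.df[[v]], col = cols[[v]],
       main = paste("Histogram of", v), xlab = v)
}
for (v in names(hospital.df)) {   # second row: boxplots
  boxplot(hospital.df[[v]], col = cols[[v]],
          main = paste("Boxplot of", v))
}
par(old_par)                      # restore the previous layout
```

Saving the value returned by par() and passing it back afterwards keeps later plots from inheriting the multi-panel layout.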

Results regarding patients' blood pressure:

From the histogram and boxplot, it can be observed that most patients' blood pressure falls between 59 and 135, while the remaining readings lie below 50 (32 and 42) or above 140 (176 and 205). The mean of the patients' blood pressure is 102.6. The lower quartile (the lower edge of the box) is 59 and the upper quartile (the upper edge of the box) is 135. The mean of the final decision ratings is 0.6.
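
The values read from the plots can also be checked numerically. A brief sketch using base R functions: fivenum() returns Tukey's five-number summary (minimum, lower hinge, median, upper hinge, maximum), and the hinges are what boxplot() uses for the edges of the box. quantile() uses a slightly different default definition of the quartiles, so the two need not agree exactly. The BP vector is repeated so the snippet is self-contained.

```r
BP <- c(103, 87, 32, 42, 59, 109, 78, 205, 135, 176)

fivenum(BP)             # 32 59 95 135 205
quantile(BP)            # 0%: 32, 25%: 63.75, 50%: 95, 75%: 128.5, 100%: 205
boxplot.stats(BP)$out   # numeric(0): no point is flagged as an outlier
mean(BP)                # 102.6
```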

In this way, histograms and boxplots in R make it possible to visualize data, see where most of the values are concentrated, and compare data sets.

URL to git repo: https://github.com/VedaVangala/vedas-r-repo/tree/main/R4
