A bar chart shows rectangular bars whose lengths are proportional to the quantities they represent. It makes it easy to compare quantities across categories.
The barplot() command is used for making bar plots, while hist() is used for histograms. You can also use the plot() command with type="h" to draw histogram-like vertical lines. The official R manual also suggests that dot plots, made with dotchart(), are a reasonable substitute for bar plots.
A very simple, easy-to-understand tutorial for basic bar plots is at http://msenux.redwoods.edu/math/R/barplot.php
The differences between the three main functions that can be used for these charts are shown below.
> VADeaths
      Rural Male Rural Female Urban Male Urban Female
50-54       11.7          8.7       15.4          8.4
55-59       18.1         11.7       24.3         13.6
60-64       26.9         20.3       37.0         19.3
65-69       41.0         30.9       54.6         35.1
70-74       66.0         54.3       71.1         50.0
> plot(VADeaths, type = "h")
Arguments
x | the coordinates of points in the plot. Alternatively, a single plotting structure, function or any R object with a plot method can be provided. |
y | the y coordinates of points in the plot, optional if x is an appropriate structure. |
… | Arguments to be passed to methods, such as graphical parameters (see par). Many methods will accept the argument type, which sets what type of plot should be drawn; for example, type = "h" draws histogram-like vertical lines. |
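As a rough illustration of how the three functions differ, the sketch below draws one column of VADeaths with each of them. The choice of the "Rural Male" column and the variable names rural_male, h and mp are assumptions made for this example only.

```r
# Sketch only: the same five numbers drawn by plot(), hist() and barplot().
rural_male <- VADeaths[, "Rural Male"]   # named numeric vector, one rate per age group

# plot() with type = "h": vertical lines at positions 1..5, no category labels
plot(rural_male, type = "h", lwd = 4,
     xlab = "Index of age group", ylab = "Deaths per 1000")

# hist(): bins the five rates themselves and counts how many fall in each bin
h <- hist(rural_male, main = "Distribution of the five rates",
          xlab = "Deaths per 1000")

# barplot(): one bar per age group, labelled from the names of the vector
mp <- barplot(rural_male, main = "Rural Male death rates",
              ylab = "Deaths per 1000")
```

Only barplot() keeps the age groups as categories on the axis; hist() summarises how the values are distributed and discards which group each value came from.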
From http://stat.ethz.ch/R-manual/R-patched/library/graphics/html/hist.html
hist(x, ...)
## Default S3 method:
hist(x, breaks = "Sturges",
     freq = NULL, probability = !freq,
     include.lowest = TRUE, right = TRUE,
     density = NULL, angle = 45, col = NULL, border = NULL,
     main = paste("Histogram of", xname),
     xlim = range(breaks), ylim = NULL,
     xlab = xname, ylab,
     axes = TRUE, plot = TRUE, labels = FALSE,
     nclass = NULL, warn.unused = TRUE, ...)
Details
The definition of histogram differs by source (with country-specific biases). R’s default with equi-spaced breaks (also the default) is to plot the counts in the cells defined by breaks. Thus the height of a rectangle is proportional to the number of points falling into the cell, as is the area provided the breaks are equally-spaced.
The default with non-equi-spaced breaks is to give a plot of area one, in which the area of the rectangles is the fraction of the data points falling in the cells.
If right = TRUE (default), the histogram cells are intervals of the form (a, b], i.e., they include their right-hand endpoint, but not their left one, with the exception of the first cell when include.lowest is TRUE.
For right = FALSE, the intervals are of the form [a, b), and include.lowest means ‘include highest’.
A numerical tolerance of 1e-7 times the median bin size is applied when counting entries on the edges of bins. This is not included in the reported breaks nor (as from R 2.11.0) in the calculation of density.
The default for breaks is "Sturges": see nclass.Sturges. Other names for which algorithms are supplied are "Scott" and "FD" / "Freedman-Diaconis" (with corresponding functions nclass.scott and nclass.FD). Case is ignored and partial matching is used. Alternatively, a function can be supplied which will compute the intended number of breaks as a function of x.
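The interval and area rules described above can be checked without drawing anything by calling hist() with plot = FALSE. The sketch below uses a small made-up sample; x, eq_breaks and the other variable names are assumptions for illustration and do not come from the manual.

```r
# Sketch with made-up data: x is a hypothetical sample, not taken from VADeaths.
x <- c(1, 2, 2, 3, 5, 8, 10)
eq_breaks <- c(0, 2, 4, 6, 8, 10)

# right = TRUE (default): cells are (a, b], so the two 2s fall in the first cell
right_closed <- hist(x, breaks = eq_breaks, plot = FALSE)

# right = FALSE: cells are [a, b), so the two 2s move to the second cell
left_closed <- hist(x, breaks = eq_breaks, right = FALSE, plot = FALSE)

# Compare the counts per cell under the two conventions
rbind(right_closed = right_closed$counts, left_closed = left_closed$counts)

# Unequal breaks: density is counts / (n * cell width), so the areas sum to one
unequal <- hist(x, breaks = c(0, 2, 5, 10), plot = FALSE)
total_area <- sum(unequal$density * diff(unequal$breaks))

# A break-choosing algorithm can be named instead of giving the breaks
scott <- hist(x, breaks = "Scott", plot = FALSE)
```

Values that sit exactly on a break (here 2, 8 and 10) are the ones that change cells when right is switched, which is why the two rows of counts differ.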
Arguments
x | a vector of values for which the histogram is desired. |
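breaks | one of: a vector giving the breakpoints between histogram cells; a function to compute the vector of breakpoints; a single number giving the number of cells for the histogram; a character string naming an algorithm to compute the number of cells (see Details); or a function to compute the number of cells. In the last three cases the number is a suggestion only. |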
freq | logical; if TRUE, the histogram graphic is a representation of frequencies, the counts component of the result; if FALSE, probability densities, component density, are plotted (so that the histogram has a total area of one). Defaults to TRUE if and only if breaks are equidistant (and probability is not specified). |
probability | an alias for !freq, for S compatibility. |
include.lowest | logical; if TRUE, an x[i] equal to the breaks value will be included in the first (or last, for right = FALSE) bar. This will be ignored (with a warning) unless breaks is a vector. |
right | logical; if TRUE, the histogram cells are right-closed (left open) intervals. |
density | the density of shading lines, in lines per inch. The default value of NULL means that no shading lines are drawn. Non-positive values of density also inhibit the drawing of shading lines. |
angle | the slope of shading lines, given as an angle in degrees (counter-clockwise). |
col | a colour to be used to fill the bars. The default of NULL yields unfilled bars. |
border | the color of the border around the bars. The default is to use the standard foreground color. |
main, xlab, ylab | these arguments to title have useful defaults here. |
xlim, ylim | the range of x and y values with sensible defaults. Note that xlim is not used to define the histogram (breaks), but only for plotting (when plot = TRUE). |
axes | logical. If TRUE (default), axes are drawn if the plot is drawn. |
plot | logical. If TRUE (default), a histogram is plotted, otherwise a list of breaks and counts is returned. In the latter case, a warning is used if (typically graphical) arguments are specified that only apply to the plot = TRUE case. |
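To show several of these arguments together, here is a minimal sketch that treats all 20 VADeaths rates as a single sample. Pooling the rates this way, the colours and the titles are assumptions chosen for the example, not recommendations from the manual.

```r
# Sketch only: pool all 20 death rates into one numeric vector.
rates <- as.vector(VADeaths)

# Filled bars with a custom title and axis label; breaks = 5 is only a suggestion
h <- hist(rates, breaks = 5, col = "lightblue", border = "white",
          main = "Histogram of VADeaths rates", xlab = "Deaths per 1000")

# freq = FALSE switches the y axis to densities; density and angle control
# the shading lines, which are drawn in the colour given by col
hist(rates, freq = FALSE, density = 20, angle = 45, col = "grey40",
     main = "Density scale with shading lines", xlab = "Deaths per 1000")

# plot = FALSE draws nothing and just returns the breaks and counts
h0 <- hist(rates, plot = FALSE)
str(h0[c("breaks", "counts")])
```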
barplot {graphics} | R Documentation |
http://stat.ethz.ch/R-manual/R-patched/library/graphics/html/barplot.html
Bar Plots
Description
Creates a bar plot with vertical or horizontal bars.
Usage
barplot(height, ...)
## Default S3 method:
barplot(height, width = 1, space = NULL,
        names.arg = NULL, legend.text = NULL, beside = FALSE,
        horiz = FALSE, density = NULL, angle = 45,
        col = NULL, border = par("fg"),
        main = NULL, sub = NULL, xlab = NULL, ylab = NULL,
        xlim = NULL, ylim = NULL, xpd = TRUE, log = "",
        axes = TRUE, axisnames = TRUE,
        cex.axis = par("cex.axis"), cex.names = par("cex.axis"),
        inside = TRUE, plot = TRUE, axis.lty = 0, offset = 0,
        add = FALSE, args.legend = NULL, ...)
Arguments
height | either a vector or matrix of values describing the bars which make up the plot. If height is a vector, the plot consists of a sequence of rectangular bars with heights given by the values in the vector. If height is a matrix and beside is FALSE then each bar of the plot corresponds to a column of height, with the values in the column giving the heights of stacked sub-bars making up the bar. If height is a matrix and beside is TRUE, then the values in each column are juxtaposed rather than stacked. |
width | optional vector of bar widths. Re-cycled to length the number of bars drawn. Specifying a single value will have no visible effect unless xlim is specified. |
space | the amount of space (as a fraction of the average bar width) left before each bar. May be given as a single number or one number per bar. If height is a matrix and beside is TRUE, space may be specified by two numbers, where the first is the space between bars in the same group, and the second the space between the groups. If not given explicitly, it defaults to c(0,1) if height is a matrix and beside is TRUE, and to 0.2 otherwise. |
names.arg | a vector of names to be plotted below each bar or group of bars. If this argument is omitted, then the names are taken from the names attribute of height if this is a vector, or the column names if it is a matrix. |
legend.text | a vector of text used to construct a legend for the plot, or a logical indicating whether a legend should be included. This is only useful when height is a matrix. In that case given legend labels should correspond to the rows of height; if legend.text is true, the row names of height will be used as labels if they are non-null. |
beside | a logical value. If FALSE, the columns of height are portrayed as stacked bars, and if TRUE the columns are portrayed as juxtaposed bars. |
horiz | a logical value. If FALSE, the bars are drawn vertically with the first bar to the left. If TRUE, the bars are drawn horizontally with the first at the bottom. |
density | a vector giving the density of shading lines, in lines per inch, for the bars or bar components. The default value of NULL means that no shading lines are drawn. Non-positive values of density also inhibit the drawing of shading lines. |
angle | the slope of shading lines, given as an angle in degrees (counter-clockwise), for the bars or bar components. |
col | a vector of colors for the bars or bar components. By default, grey is used if height is a vector, and a gamma-corrected grey palette if height is a matrix. |
border | the color to be used for the border of the bars. Use border = NA to omit borders. If there are shading lines, border = TRUE means use the same colour for the border as for the shading lines. |
main,sub | overall and sub title for the plot. |
xlab | a label for the x axis. |
ylab | a label for the y axis. |
xlim | limits for the x axis. |
ylim | limits for the y axis. |
xpd | logical. Should bars be allowed to go outside region? |
log | string specifying if axis scales should be logarithmic; see plot.default. |
axes | logical. If TRUE, a vertical (or horizontal, if horiz is true) axis is drawn. |
axisnames | logical. If TRUE, and if there are names.arg (see above), the other axis is drawn (with lty=0) and labeled. |
cex.axis | expansion factor for numeric axis labels. |
cex.names | expansion factor for axis names (bar labels). |
inside | logical. If TRUE, the lines which divide adjacent (non-stacked!) bars will be drawn. Only applies when space = 0 (which it partly is when beside = TRUE). |
plot | logical. If FALSE, nothing is plotted. |
axis.lty | the graphics parameter lty applied to the axis and tick marks of the categorical (default horizontal) axis. Note that by default the axis is suppressed. |
offset | a vector indicating how much the bars should be shifted relative to the x axis. |
add | logical specifying if bars should be added to an already existing plot; defaults to FALSE. |
args.legend | list of additional arguments to pass to legend(); names of the list are used as argument names. Only used if legend.text is supplied. |
… | arguments to be passed to/from other methods. For the default method these can include further arguments (such as axes, asp and main) and graphical parameters (see par) which are passed to plot.window(), title() and axis(). |
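The effect of beside, legend.text and horiz is easiest to see on a matrix such as VADeaths. The following is a minimal sketch; the colours, titles and legend position are assumptions made for the example.

```r
# Sketch only: VADeaths is a 5 x 4 matrix (age groups in rows, populations in columns).

# Default (beside = FALSE): one bar per column, with the five age groups stacked
mp_stacked <- barplot(VADeaths, legend.text = TRUE,
                      args.legend = list(x = "topleft", bty = "n"),
                      main = "Stacked bars", ylab = "Deaths per 1000")

# beside = TRUE: the five age groups are drawn side by side within each column
mp_beside <- barplot(VADeaths, beside = TRUE, col = grey.colors(5),
                     legend.text = TRUE,
                     args.legend = list(x = "topleft", bty = "n"),
                     main = "Grouped bars", ylab = "Deaths per 1000")

# horiz = TRUE: same grouped bars drawn horizontally, first bar at the bottom
barplot(VADeaths, beside = TRUE, horiz = TRUE, col = grey.colors(5),
        las = 1, cex.names = 0.7, xlab = "Deaths per 1000")
```

With legend.text = TRUE the legend labels are taken from the row names of the matrix, so here they are the age groups.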
Details
This is a generic function; it currently has only a default method. A formula interface may be added eventually.
Value
A numeric vector (or matrix, when beside = TRUE), say mp, giving the coordinates of all the bar midpoints drawn, useful for adding to the graph.
If beside is true, use colMeans(mp) for the midpoints of each group of bars.
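One way to use the returned midpoints is to add labels and extra points on top of the bars. The sketch below is only an illustration; mp, group_mid and the choice of overlaying each population's mean rate are assumptions made for this example.

```r
# Sketch only: use the midpoints returned by barplot() to annotate the bars.
mp <- barplot(VADeaths, beside = TRUE, col = grey.colors(5),
              ylim = c(0, 80), ylab = "Deaths per 1000")

# mp has the same shape as VADeaths, so each value can be written above its own bar
text(x = as.vector(mp), y = as.vector(VADeaths),
     labels = as.vector(VADeaths), pos = 3, cex = 0.6)

# colMeans(mp) gives the horizontal centre of each group of five bars
group_mid <- colMeans(mp)

# Overlay the mean rate of each population at the centre of its group
points(group_mid, colMeans(VADeaths), pch = 19, col = "red")
lines(group_mid, colMeans(VADeaths), col = "red")
```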