From R for Data Science
Exercises 10.4.1
A separate panel is drawn for each distinct value of the continuous variable.
ggplot(mpg) +
geom_point(aes(x = drv, y = cyl)) +
facet_wrap(~hwy)
#ggsave("r-10-4-1-q1.png")
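To see how many panels this produces, one can count the distinct values of hwy. This is only a minimal sketch, assuming ggplot2 is installed; the names p and n_panels are illustrative and not part of the exercise.

```r
library(ggplot2)

# Faceting on a continuous variable creates one panel per distinct value.
p <- ggplot(mpg) +
  geom_point(aes(x = drv, y = cyl)) +
  facet_wrap(~hwy)

# Number of distinct hwy values, which equals the number of panels.
n_panels <- length(unique(mpg$hwy))
n_panels
```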
The empty cells in the plot made with facet_grid(drv ~ cyl) mean there are no observations with that combination of drv and cyl.
The plot from the code below shows the same thing: points appear only at the drv and cyl combinations that occur in the data, and the locations without a point correspond to the empty cells in the faceted plot. For example, there are no cars with cyl = 5 and drv = 4 in the dataset.
ggplot(mpg) +
geom_point(aes(x = drv, y = cyl))
#ggsave("r-10-4-1-q2.png")
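The missing combinations can also be checked directly with a cross-tabulation. The following is a rough sketch using base R's table(); the names combos and empty are illustrative.

```r
library(ggplot2)  # provides the mpg dataset

# Cross-tabulate drive train (rows) against cylinder count (columns).
combos <- table(drv = mpg$drv, cyl = mpg$cyl)
combos

# Combinations with zero observations correspond to the empty facets.
empty <- as.data.frame(combos)
empty[empty$Freq == 0, c("drv", "cyl")]
```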
ggplot(mpg) +
geom_point(aes(x = displ, y = hwy)) +
facet_grid(drv ~ .)
ggsave("r-10-4-1-q3_1.png")
ggplot(mpg) +
geom_point(aes(x = displ, y = hwy)) +
facet_grid(. ~ cyl)
ggsave("r-10-4-1-q3_2.png")
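drv ~ . puts the facets in rows, one per value of drv, with no faceting across columns. The . is a placeholder meaning that no variable is used for the column dimension.
. ~ cyl puts the facets in columns, one per value of cyl, with no faceting across rows. Here the . means that no variable is used for the row dimension.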
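The role of the . can be checked by writing the same facets with the rows and cols arguments of facet_grid() and inspecting the resulting panel grid. This is a sketch only; panel_dims is a hypothetical helper written for this illustration, and it assumes the layout table returned by ggplot_build() carries ROW and COL columns.

```r
library(ggplot2)

base <- ggplot(mpg) +
  geom_point(aes(x = displ, y = hwy))

# Formula interface: the `.` is a placeholder for "no variable here".
p_formula <- base + facet_grid(drv ~ .)

# Equivalent call with the rows argument spelled out.
p_rows <- base + facet_grid(rows = vars(drv))

# Hypothetical helper: returns the panel grid size as c(rows, cols).
panel_dims <- function(p) {
  layout <- ggplot_build(p)$layout$layout
  c(rows = max(layout$ROW), cols = max(layout$COL))
}

panel_dims(p_formula)
panel_dims(p_rows)
panel_dims(base + facet_grid(cols = vars(cyl)))
```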
ggplot(mpg) +
geom_point(aes(x = displ, y = hwy)) +
facet_wrap(~ class, nrow = 2)
The color aesthetic keeps all the data in one panel, so overlapping points and small differences between groups can be compared directly; however, it can be difficult to pick out an individual group. Facets separate the groups very clearly, but make small differences between groups harder to see because the comparison is across panels. With larger datasets, faceting might be better: there are more points, and overlapping points distinguished only by color become harder to decipher than groups that are separated out.
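For comparison, the same data can be drawn with class mapped to color instead of facets. This is a minimal sketch; p_color and p_facet are illustrative names.

```r
library(ggplot2)

# Same data as the faceted plot, with class mapped to color instead.
p_color <- ggplot(mpg) +
  geom_point(aes(x = displ, y = hwy, color = class))

# The faceted version, for side-by-side comparison.
p_facet <- ggplot(mpg) +
  geom_point(aes(x = displ, y = hwy)) +
  facet_wrap(~ class, nrow = 2)

p_color
p_facet
```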
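nrow and ncol set the number of rows and columns of panels, respectively. Other facet_wrap() arguments that control the layout of the individual panels include dir, as.table and strip.position; scales controls whether the panels share axis scales.
facet_grid() does not have nrow and ncol arguments because the number of rows and columns is determined by the number of unique values of the faceting variables.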
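The difference can be sketched as follows: with facet_wrap() the grid shape is chosen by the user, while with facet_grid() it follows from the data. The object names here are illustrative.

```r
library(ggplot2)

base <- ggplot(mpg) +
  geom_point(aes(x = displ, y = hwy))

# facet_wrap(): the panel grid is chosen by the user.
p_two_rows <- base + facet_wrap(~ class, nrow = 2)
p_three_cols <- base + facet_wrap(~ class, ncol = 3)
# dir = "v" fills the panels column by column instead of row by row.
p_vertical <- base + facet_wrap(~ class, nrow = 2, dir = "v")

# facet_grid(): the grid size follows from the data.
n_rows <- length(unique(mpg$drv))  # one row per drv value
n_cols <- length(unique(mpg$cyl))  # one column per cyl value
c(rows = n_rows, cols = n_cols)
```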
ggplot(mpg, aes(x = displ)) +
geom_histogram() +
facet_grid(drv ~ .)
ggsave("r-10-4-1-q6_1.png")
ggplot(mpg, aes(x = displ)) +
geom_histogram() +
facet_grid(. ~ drv)
ggsave("r-10-4-1-q6_2.png")
The first plot, with the facets as rows, makes it easier to compare engine size (displ) across drive trains, because the histograms are stacked on a shared x-axis. When comparing distributions along the x-axis, as with histograms, it might be better to arrange the facets as rows; when the comparison is along the y-axis, as it often is with scatter plots, it might be better to arrange them as columns.
ggplot(mpg) +
geom_point(aes(x = displ, y = hwy)) +
facet_grid(drv ~ .)
ggsave("r-10-4-1-q7_1.png")
ggplot(mpg) +
geom_point(aes(x = displ, y = hwy)) +
facet_wrap(drv ~ .)
ggsave("r-10-4-1-q7_2.png")
facet_grid(drv ~ .) stacks the panels as rows, with the facet labels on the right. facet_wrap(drv ~ .) places the panels side by side as columns, with the facet labels on top.
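If the row layout of facet_grid() is wanted from facet_wrap(), it can be approximated with the ncol and strip.position arguments. This is a sketch rather than an exact equivalent; p_wrap_rows is an illustrative name.

```r
library(ggplot2)

# One column of panels, with the strip labels moved to the right-hand side.
p_wrap_rows <- ggplot(mpg) +
  geom_point(aes(x = displ, y = hwy)) +
  facet_wrap(~ drv, ncol = 1, strip.position = "right")

p_wrap_rows
```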