The Grammar of Data
bookOne :: chapter-1The Grammar of Data
Data becomes knowledge only when language makes it visible.
Prelude — Why Grammar Matters
Every language has grammar — rules that turn sound into sense.
R, too, has grammar. But instead of words and verbs, its syntax shapes data.
The tidyverse gives us a language that makes analysis readable and expressive.
Where base R often feels like an equation, the tidyverse feels like poetry:
each function a phrase, each pipe a continuation of thought.
Tidy Data — The Foundation
Before you can speak clearly, your sentences must be structured.
In R, that means your data must be tidy:
- Each variable forms a column
- Each observation forms a row
- Each value lives in a single cell
This idea, championed by Hadley Wickham, is deceptively simple — but it changes everything.
Kesulitan dengan riset atau tugas akademik Anda?
Tim ahli Notivra siap mendampingi Anda memberikan solusi bimbingan dan dukungan akademik yang komprehensif.
library(tidyr)
messy <- data.frame(
person = c("Alice", "Bob"),
treatmentA = c(3, 6),
treatmentB = c(5, 2)
)
tidy <- messy %>%
pivot_longer(cols = starts_with("treatment"),
names_to = "treatment",
values_to = "response")
tidy
Now your data thinks in rows, ready to be filtered, summarized, or visualized without confusion.
The Grammar of Manipulation — dplyr
Once your data is tidy, you can begin to speak.
dplyr gives you the verbs:
select()— what mattersfilter()— what qualifiesmutate()— what transformssummarise()— what you want to learnarrange()— how you want it ordered
library(dplyr)
tidy %>%
group_by(treatment) %>%
summarise(mean_response = mean(response))
Each verb is declarative — not “how to compute,” but “what I mean.”
The Grammar of Visualization — ggplot2
If dplyr is the syntax of logic, ggplot2 is the syntax of imagination.
It’s based on the Grammar of Graphics — the idea that every plot is built from consistent components.
library(ggplot2)
ggplot(tidy, aes(x = treatment, y = response, fill = person)) +
geom_col(position = "dodge") +
labs(title = "Response by Treatment", y = "Response Value")
This code reads like a sentence:
Take the data, map treatment to x, response to y, fill by person, draw columns, label clearly.
Your plot isn’t a decoration — it’s the final clause in a well-formed paragraph of analysis.
Reflection — Grammar Makes Thought Visible
R’s grammar isn’t just syntax — it’s philosophy.
It teaches clarity through structure, and insight through fluency.
When your code reads like your reasoning, your analysis stops being a tangle of operations and becomes a story.
🧩 Try This
Use pivot_longer() and summarise() to reshape and summarize your own dataset.
Then visualize it with ggplot2.
Ask: What changed when I saw my data tidy?
💭 Reflection
What does your data say, once it starts speaking a cleaner language?
🔗 Further Reading
Artikel Terkait
Data as Objects, Analysis as Transformation
Every R session is a living stream. Data flows in, and with each pipe, it changes shape — cleaner, clearer, closer to insight.
From Code To Insight
Numbers whisper. Models hum. But the real music begins only when you *listen*. R is not just a tool for data — it’s a language for seeing. Each transformation, each plot, each tidy frame is a translation of the world’s complexity into meaning.
Thinking in Vectors, Not Loops
R doesn’t iterate — it transforms. Understanding vectors is the key to writing expressive, efficient, and elegant R code.