← Kembali ke Blog
R dplyr data transformation
+1
17 Oktober 2025

The Grammar of Data

bookOne :: chapter-1

The Grammar of Data

Data becomes knowledge only when language makes it visible.

Prelude — Why Grammar Matters

Every language has grammar — rules that turn sound into sense.
R, too, has grammar. But instead of words and verbs, its syntax shapes data.
The tidyverse gives us a language that makes analysis readable and expressive.

Where base R often feels like an equation, the tidyverse feels like poetry:
each function a phrase, each pipe a continuation of thought.


Tidy Data — The Foundation

Before you can speak clearly, your sentences must be structured.
In R, that means your data must be tidy:

  • Each variable forms a column
  • Each observation forms a row
  • Each value lives in a single cell

This idea, championed by Hadley Wickham, is deceptively simple — but it changes everything.

Dukungan Sistem Akademik

Kesulitan dengan riset atau tugas akademik Anda?

Tim ahli Notivra siap mendampingi Anda memberikan solusi bimbingan dan dukungan akademik yang komprehensif.

library(tidyr)

messy <- data.frame(
  person = c("Alice", "Bob"),
  treatmentA = c(3, 6),
  treatmentB = c(5, 2)
)

tidy <- messy %>%
  pivot_longer(cols = starts_with("treatment"),
               names_to = "treatment",
               values_to = "response")
tidy

Now your data thinks in rows, ready to be filtered, summarized, or visualized without confusion.


The Grammar of Manipulation — dplyr

Once your data is tidy, you can begin to speak.
dplyr gives you the verbs:

  • select() — what matters
  • filter() — what qualifies
  • mutate() — what transforms
  • summarise() — what you want to learn
  • arrange() — how you want it ordered
library(dplyr)

tidy %>%
  group_by(treatment) %>%
  summarise(mean_response = mean(response))

Each verb is declarative — not “how to compute,” but “what I mean.”


The Grammar of Visualization — ggplot2

If dplyr is the syntax of logic, ggplot2 is the syntax of imagination.
It’s based on the Grammar of Graphics — the idea that every plot is built from consistent components.

library(ggplot2)

ggplot(tidy, aes(x = treatment, y = response, fill = person)) +
  geom_col(position = "dodge") +
  labs(title = "Response by Treatment", y = "Response Value")

This code reads like a sentence:
Take the data, map treatment to x, response to y, fill by person, draw columns, label clearly.

Your plot isn’t a decoration — it’s the final clause in a well-formed paragraph of analysis.


Reflection — Grammar Makes Thought Visible

R’s grammar isn’t just syntax — it’s philosophy.
It teaches clarity through structure, and insight through fluency.
When your code reads like your reasoning, your analysis stops being a tangle of operations and becomes a story.


🧩 Try This

Use pivot_longer() and summarise() to reshape and summarize your own dataset.
Then visualize it with ggplot2.
Ask: What changed when I saw my data tidy?

💭 Reflection

What does your data say, once it starts speaking a cleaner language?

🔗 Further Reading


Artikel Terkait

bookOne :: chapter-1

From Code To Insight

Numbers whisper. Models hum. But the real music begins only when you *listen*. R is not just a tool for data — it’s a language for seeing. Each transformation, each plot, each tidy frame is a translation of the world’s complexity into meaning.

20 Oktober 2025
bookOne :: chapter-1

Thinking in Vectors, Not Loops

R doesn’t iterate — it transforms. Understanding vectors is the key to writing expressive, efficient, and elegant R code.

16 Oktober 2025
Dukungan Sistem Akademik

Kesulitan dengan riset atau tugas akademik Anda?

Tim ahli Notivra siap mendampingi Anda memberikan solusi bimbingan dan dukungan akademik yang komprehensif.