Mastering LINQ for Powerful Data Analysis
In everyday C# development, you often need to summarise, categorise, or analyse collections of data – tasks that traditionally require multiple loops and temporary structures.
LINQ (Language Integrated Query) makes this elegant and expressive with built-in operators like group by, Count(), Sum(), Average(), and Aggregate().
These features let you process in-memory collections or databases with SQL-like power – directly inside C#.
🔹 What Are Grouping and Aggregation?
Grouping means partitioning a dataset into smaller buckets based on a shared key (for example, by department, category, or region).
Aggregation means computing summary values (like counts, sums, or averages) from those groups.
Together, they allow you to transform raw data into structured insights.
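To make the contrast with the loop-based approach concrete, here is a minimal sketch that computes the same per-category totals twice: once with a loop and a temporary dictionary, and once with GroupBy and Sum. The purchases data and all variable names are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: (category, amount) pairs.
var purchases = new[]
{
    (Category: "Books", Amount: 12.50m),
    (Category: "Games", Amount: 40.00m),
    (Category: "Books", Amount: 7.50m)
};

// Manual approach: a loop plus a temporary dictionary.
var totalsByLoop = new Dictionary<string, decimal>();
foreach (var p in purchases)
{
    // running is 0 when the category has not been seen yet
    totalsByLoop.TryGetValue(p.Category, out var running);
    totalsByLoop[p.Category] = running + p.Amount;
}

// LINQ approach: group by category, then aggregate each group.
var totalsByLinq = purchases
    .GroupBy(p => p.Category)
    .ToDictionary(g => g.Key, g => g.Sum(p => p.Amount));

foreach (var pair in totalsByLinq)
    Console.WriteLine($"{pair.Key}: {pair.Value}");
```

Both dictionaries end up with the same totals; the LINQ version simply states the intent (group, then sum) without the bookkeeping.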
🔹 Basic Grouping with LINQ
Example: Group Employees by Department
var employees = new[]
{
    new { Name = "Alice", Department = "IT", Salary = 60000 },
    new { Name = "Bob", Department = "HR", Salary = 45000 },
    new { Name = "Clara", Department = "IT", Salary = 65000 },
    new { Name = "David", Department = "Finance", Salary = 50000 },
    new { Name = "Ella", Department = "HR", Salary = 42000 }
};
var groups =
    from e in employees
    group e by e.Department into g
    select new
    {
        Department = g.Key,
        EmployeeCount = g.Count(),
        TotalSalary = g.Sum(x => x.Salary),
        AverageSalary = g.Average(x => x.Salary)
    };
foreach (var group in groups)
    Console.WriteLine($"{group.Department}: {group.EmployeeCount} employees, avg £{group.AverageSalary:F0}");
Output:
IT: 2 employees, avg £62500
HR: 2 employees, avg £43500
Finance: 1 employees, avg £50000
✅ What’s Happening
- group e by e.Department creates groups keyed by department name.
- The into g clause introduces a range variable (g) representing each group.
- g.Key holds the group key.
- Aggregates like Count(), Sum(), and Average() operate within each group.
🔹 The group by Clause Explained
The syntax looks similar to SQL but runs entirely in C#:
group <element> by <key> into <groupName>
| Part | Description | Example |
|---|---|---|
| <element> | Each item in the source sequence | e |
| <key> | Expression defining the group key | e.Department |
| into | Optional keyword to name the new group variable | into g |
| <groupName> | Identifier for the grouped sequence | g |
Each group is an IGrouping<TKey, TElement>: a sequence of the matching items that also exposes their shared key through its Key property.
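Because each group is itself a sequence, you can enumerate its items directly rather than only aggregating them. The sketch below illustrates this with a nested foreach; the words array and the choice of first letter as the key are hypothetical.

```csharp
using System;
using System.Linq;

// Hypothetical sample data.
var words = new[] { "apple", "avocado", "banana", "blueberry", "cherry" };

// Each group is an IGrouping<char, string>: a sequence of words plus a Key.
var byFirstLetter = words.GroupBy(w => w[0]);

foreach (IGrouping<char, string> g in byFirstLetter)
{
    Console.WriteLine($"Key: {g.Key}");
    foreach (var word in g)   // enumerate the items inside this group
        Console.WriteLine($"  {word}");
}
```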
🔹 Counting and Aggregating
LINQ provides built-in methods to perform fast aggregations on sequences and groups:
| Method | Description | Example |
|---|---|---|
| Count() | Counts the number of elements | employees.Count() |
| Sum() | Adds numeric values | salaries.Sum() |
| Average() | Calculates the mean value | scores.Average() |
| Min() / Max() | Finds smallest/largest values | numbers.Max() |
| Aggregate() | Custom accumulation logic | See below |
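As a quick illustration of the built-in methods on a plain sequence, the sketch below runs each one over a hypothetical scores array. It also shows one behaviour worth knowing: on an empty sequence of a non-nullable type such as int, Count() and Sum() return 0, whereas Average(), Min(), and Max() throw an InvalidOperationException. DefaultIfEmpty is one way to guard against that.

```csharp
using System;
using System.Linq;

// Hypothetical sample data.
var scores = new[] { 70, 85, 90 };

Console.WriteLine(scores.Count());   // 3
Console.WriteLine(scores.Sum());     // 245
Console.WriteLine(scores.Average()); // about 81.67 (Average over ints returns a double)
Console.WriteLine(scores.Min());     // 70
Console.WriteLine(scores.Max());     // 90

// Empty sequence: Sum() is safe, Average() would throw.
var empty = Array.Empty<int>();
Console.WriteLine(empty.Sum());      // 0

// DefaultIfEmpty supplies a fallback element so Average() has something to work on.
double safeAverage = empty.DefaultIfEmpty(0).Average();
Console.WriteLine(safeAverage);      // 0
```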
🔹 Custom Aggregation with Aggregate()
When you need more control than Sum() or Average() provide, use Aggregate() — it lets you define exactly how results are combined.
Example: Concatenate Names into a String
var names = new[] { "Alice", "Bob", "Clara" };
var combined = names.Aggregate((acc, next) => acc + ", " + next);
Console.WriteLine(combined);
Output:
Alice, Bob, Clara
✅ How It Works
- The first element initialises the accumulator.
- Each subsequent element is processed by the lambda (acc, next).
- Perfect for building strings, computing running totals, or custom logic.
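Aggregate() also has an overload that takes an explicit starting value (a seed). A minimal sketch follows: the seed lets the accumulator have a different type from the elements, and it avoids the InvalidOperationException that the unseeded overload throws on an empty sequence. The totalLength calculation is a hypothetical example, and string.Join is shown only as a simpler alternative for plain delimiter-separated strings.

```csharp
using System;
using System.Linq;

var names = new[] { "Alice", "Bob", "Clara" };

// Seeded overload: the accumulator starts at 0 and is an int, not a string.
int totalLength = names.Aggregate(0, (acc, next) => acc + next.Length);
Console.WriteLine(totalLength); // 13

// With a seed, an empty sequence simply returns the seed.
var none = Array.Empty<string>();
int emptyTotal = none.Aggregate(0, (acc, next) => acc + next.Length);
Console.WriteLine(emptyTotal); // 0

// For joining strings with a delimiter, string.Join does the same job directly.
string joined = string.Join(", ", names);
Console.WriteLine(joined); // Alice, Bob, Clara
```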
🔹 Grouping by Multiple Keys
You can also group by composite keys using anonymous types.
var sales = new[]
{
    new { Region = "North", Year = 2024, Revenue = 120000 },
    new { Region = "North", Year = 2025, Revenue = 150000 },
    new { Region = "South", Year = 2024, Revenue = 90000 },
    new { Region = "South", Year = 2025, Revenue = 110000 }
};
var grouped =
    from s in sales
    group s by new { s.Region, s.Year } into g
    select new
    {
        g.Key.Region,
        g.Key.Year,
        TotalRevenue = g.Sum(x => x.Revenue)
    };
foreach (var g in grouped)
    Console.WriteLine($"{g.Region} ({g.Year}): £{g.TotalRevenue}");
Output:
North (2024): £120000
North (2025): £150000
South (2024): £90000
South (2025): £110000
✅ Tip:
Using new { s.Region, s.Year } ensures each unique region/year pair becomes its own group.
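Composite keys work because anonymous types compare by value: two instances with equal property values are treated as the same key. The sketch below demonstrates that, and also shows value tuples as an alternative key for in-memory (LINQ to Objects) queries. The sales rows here are hypothetical and differ from the dataset above so that one group contains two rows.

```csharp
using System;
using System.Linq;

// Two separate anonymous-type instances with the same property values.
var a = new { Region = "North", Year = 2024 };
var b = new { Region = "North", Year = 2024 };

Console.WriteLine(a.Equals(b));                  // True: compared property by property
Console.WriteLine(object.ReferenceEquals(a, b)); // False: they are distinct objects

// Hypothetical data with a repeated region/year pair.
var sales = new[]
{
    new { Region = "North", Year = 2024, Revenue = 120000 },
    new { Region = "North", Year = 2024, Revenue = 30000 },
    new { Region = "South", Year = 2024, Revenue = 90000 }
};

// Value tuples also compare by value, so they can act as composite keys.
var byTuple = sales
    .GroupBy(s => (s.Region, s.Year))
    .Select(g => new { g.Key.Region, g.Key.Year, Total = g.Sum(x => x.Revenue) });

foreach (var row in byTuple)
    Console.WriteLine($"{row.Region} ({row.Year}): {row.Total}");
```

Whether a tuple key translates to SQL depends on the query provider, so for database queries the anonymous-type form shown earlier is the safer assumption.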
🔹 Method Syntax Equivalents
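The same grouping using method syntax instead of query syntax (this version projects the highest salary per department rather than the total and average):
var results = employees
    .GroupBy(e => e.Department)
    .Select(g => new
    {
        Dept = g.Key,
        Count = g.Count(),
        MaxSalary = g.Max(x => x.Salary)
    });
Both syntaxes are functionally identical, because the compiler translates query syntax into these method calls. Choose whichever reads more clearly for your scenario.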
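Method syntax also gives access to GroupBy overloads that query syntax does not expose as directly. The following sketch shows two of them: an element selector, which stores only a chosen value in each group, and a result selector, which groups and projects in one call. The employees array is a shortened, hypothetical copy of the earlier dataset.

```csharp
using System;
using System.Linq;

// Shortened copy of the employee data, for illustration only.
var employees = new[]
{
    new { Name = "Alice", Department = "IT", Salary = 60000 },
    new { Name = "Bob", Department = "HR", Salary = 45000 },
    new { Name = "Clara", Department = "IT", Salary = 65000 }
};

// Element selector: each group holds salaries only, not whole employee objects.
var salariesByDept = employees.GroupBy(e => e.Department, e => e.Salary);
foreach (var g in salariesByDept)
    Console.WriteLine($"{g.Key}: max {g.Max()}");

// Result selector: group and project in a single call, with no separate Select.
var headcount = employees.GroupBy(
    e => e.Department,
    (dept, members) => new { Dept = dept, Count = members.Count() });
foreach (var row in headcount)
    Console.WriteLine($"{row.Dept}: {row.Count}");
```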
🔹 Advanced Example – Sales Summary Report
var orders = new[]
{
    new { Customer = "Alice", Amount = 120 },
    new { Customer = "Bob", Amount = 80 },
    new { Customer = "Alice", Amount = 50 },
    new { Customer = "Clara", Amount = 200 },
    new { Customer = "Bob", Amount = 60 }
};
var summary =
    from o in orders
    group o by o.Customer into g
    select new
    {
        Customer = g.Key,
        Orders = g.Count(),
        TotalSpent = g.Sum(x => x.Amount),
        AverageOrder = g.Average(x => x.Amount)
    };
foreach (var s in summary)
    Console.WriteLine($"{s.Customer}: {s.Orders} orders, total £{s.TotalSpent}, avg £{s.AverageOrder}");
Output:
Alice: 2 orders, total £170, avg £85
Bob: 2 orders, total £140, avg £70
Clara: 1 orders, total £200, avg £200
✅ Real-World Use
- Generates instant summary reports.
- Perfect for analytics dashboards or audit logs.
- Easily extended to databases via LINQ to SQL or Entity Framework.
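Reports like this usually need filtering and ordering as well. One way to sketch that, assuming the same orders data, is to add let, where, and orderby clauses after into g. The 150 threshold and the topSpenders name are hypothetical.

```csharp
using System;
using System.Linq;

var orders = new[]
{
    new { Customer = "Alice", Amount = 120 },
    new { Customer = "Bob", Amount = 80 },
    new { Customer = "Alice", Amount = 50 },
    new { Customer = "Clara", Amount = 200 },
    new { Customer = "Bob", Amount = 60 }
};

// Same grouping as the report, with extra clauses after "into g".
var topSpenders =
    (from o in orders
     group o by o.Customer into g
     let total = g.Sum(x => x.Amount)   // compute the total once per group
     where total >= 150                 // hypothetical threshold
     orderby total descending
     select new { Customer = g.Key, TotalSpent = total })
    .ToList();

foreach (var s in topSpenders)
    Console.WriteLine($"{s.Customer}: £{s.TotalSpent}");
// Clara: £200
// Alice: £170
```

The let clause keeps the sum in one place, so the filter, the sort, and the projection all reuse the same value.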
📚 Summary
| Concept | Description | Example |
|---|---|---|
| Grouping | Partition data into subsets by key | group e by e.Department |
| Count() | Number of items in each group | g.Count() |
| Sum() / Average() | Aggregate numeric fields | g.Sum(x => x.Salary) |
| Aggregate() | Custom accumulation logic | Aggregate((a,b) => a+b) |
| Composite Grouping | Group by multiple fields | new { s.Region, s.Year } |
✅ Best Practices
- Use group by for clear, readable summaries – not just loops.
- Combine grouping with anonymous types for dynamic result shapes.
- Avoid redundant enumeration: call .ToList() only when needed.
- For database queries, prefer group by expressions that can translate to SQL efficiently.
- Keep aggregations simple and predictable; heavy logic belongs outside LINQ.
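The point about redundant enumeration follows from deferred execution: a LINQ query runs again every time it is enumerated. The sketch below makes that visible with a hypothetical evaluations counter inside the key selector, then shows how materialising once with ToList() stops the repeated work.

```csharp
using System;
using System.Linq;

int evaluations = 0;
var numbers = new[] { 1, 2, 3, 4 };

// Deferred: nothing runs until the query is enumerated.
var query = numbers
    .GroupBy(n =>
    {
        evaluations++;   // counts how often the key selector runs
        return n % 2 == 0 ? "even" : "odd";
    })
    .Select(g => new { g.Key, Total = g.Sum() });

Console.WriteLine(evaluations); // 0: the query has not executed yet

foreach (var row in query) Console.WriteLine($"{row.Key}: {row.Total}"); // first pass
foreach (var row in query) Console.WriteLine($"{row.Key}: {row.Total}"); // runs the grouping again
Console.WriteLine(evaluations); // 8: four elements, two passes

// Materialise once; later loops reuse the stored results.
var cached = query.ToList();    // third and final pass over the source
foreach (var row in cached) Console.WriteLine($"{row.Key}: {row.Total}");
foreach (var row in cached) Console.WriteLine($"{row.Key}: {row.Total}");
Console.WriteLine(evaluations); // 12
```

If a query is enumerated only once, calling ToList() just adds an allocation, which is why it is worth reserving for results you genuinely reuse.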
🧪 Challenge Task
Create a LINQ query that groups employees by department and prints:
- Department name
- Number of employees
- Highest and lowest salaries
Dataset:
var staff = new[]
{
    new { Name = "James", Dept = "IT", Salary = 70000 },
    new { Name = "Anna", Dept = "HR", Salary = 48000 },
    new { Name = "Liam", Dept = "IT", Salary = 55000 },
    new { Name = "Sarah", Dept = "Finance", Salary = 52000 }
};
Expected Output:
IT: 2 staff, max £70000, min £55000
HR: 1 staff, max £48000, min £48000
Finance: 1 staff, max £52000, min £52000
Master the art of writing expressive, data-driven C# code that turns collections into insights – all with LINQ.