Just a set of notes I bashed together for work a couple of years ago, under the snappy title of "Excel as a substitute for commonly used CaseWare IDEA functionality" – that's a data analysis and manipulation package we use both for scripted automated tasks and more ad-hoc ones.
IDEA particularly suits situations where you want to perform multiple tasks on a particular set of data, such as splitting down spend into categories. The costing of its licensing model is unfortunately prohibitive to have more than one or two concurrent users, plus for a lot of more ad-hoc queries it'd be overkill. I prefer SQL for building queries where that's practical, but Excel is a graphical interface most people are familiar with.
Whilst as a how-to document it's not perfect by any stretch and some of the terminology isn't as clear or regular as I might use now, such as talking about forming tokens for matching rather than keys, it may still be of passing interest – particularly if you're in either field mentioned. Contents are as follows:
1. Summarise / aggregate
2. Use pivot tables to categorise or timeline data
3. Extract records from a set
4. Find unique or duplicate values
5. Find gaps in a sequence
6. Stratify records
7. Join two sets of data on a field
8. Creating identifiers to data match with
I should mention that I did this outside of work, that all of the data in the screenshots is fake, randomised or a matter of public record, and as far as licence goes let's say https://creativecommons.org/publicdomain/zero/1.0/
You might also be interested in a previous post, Some SQL Server functions for string matching
Note from the future: you should also check out Power Query / Get & Transform in more recent versions of Excel, as it makes one-to-many and many-to-many joins between data sets straightforward in ways you'd previously have wanted MS Query or Access to achieve using Office applications. Simple but very powerful.