The data mentality
Overview. Thinking about data, ideas for projects. Things to remember: (1) Ideas aren't discovered, they're developed. (2) Ideas have friends: when you find one, there are others nearby.
Buzzwords. Questions, data, idea machines.
Code. Related examples
We study data because we want to learn something. But what? In our world, we might want to know:
- How is the US economy doing?
- Which emerging market countries offer the best business opportunities?
- How do returns on US and European stocks compare?
- Which college majors pay the most?
You'll notice that the starting point is a question, something we'd like to know more about. We provide a toolkit for working effectively with data to find answers. Most of our examples are about economics and finance -- that's what we know -- but the same tools can be applied to any data we like.
Thinking about data
It's not that we have no lives or anything, but we think about data all the time. If we see an interesting graphic in The Economist -- or the Wall Street Journal, or the New York Times -- it triggers a series of questions.
- What did we learn from the graph?
- What else would we like to know?
- Where does the data come from?
Following up on these questions often leads to interesting insights. And it's fun.
Let's give it a try:
Exercise. The 538 blog has a nice summary of salaries of recent college graduates. Skip to the bottom to sort by major and play around. Answer these questions:
- What did you learn from their table?
- What else would you like to know?
- Where did they get their data?
Exercise. What kinds of things would you like to know more about? Think of this as improv: there are no bad answers.
Generating project ideas
One of our goals is for you to produce a piece of work -- data and graphics -- that you can show potential employers. There's nothing like a concrete example to show off your skill set. We still have lots of time, but it can't hurt to start thinking about it now.
Idea machines. How would we find a good project idea? That's not something you run across a lot in modern education, where our job is typically to absorb what's taught rather than come up with our own ideas. So how would we get started?
We're looking for a topic that satisfies two conditions: (1) we find it interesting and (2) we have access to data related to it. We can start with either one, or with an existing example we would like to reproduce and extend:
- Start with what interests you. Economics, finance, marketing, emerging markets, movies, sports. You be the judge. Be specific: You want a topic, not a category.
- Start with data. Take a dataset you find interesting, ask what you might do with it. If you're not sure where to look, try our list of data sources.
- Start with an example. Find an analyst report, blog post, or graphic you like. Ask where the data comes from and think about whether you can replicate and/or extend it. The blogs listed on our data sources page are a good place to start.
If you're not sure how this works, watch Steve Levitt's video about working with company data. It's an entertaining and informative 50 minutes. Note specifically how he comes up with ideas for using the data he's given.
Keep in mind: we're not looking for a perfect idea. Perfection takes time, and we may never get there. Long experience has shown us:
- Start small. Small ideas often grow into bigger ones.
- Ideas have friends. If you have an idea, even a not-very-good one, it often triggers other ideas, sometimes even better ones.
- Ideas aren't discovered, they're developed. Allow your ideas to mature, to evolve and improve. Like kimchi and red wine, they get better with time.
Common mistakes -- and how to fix them. We mean this in a good way, but in our experience there are a number of things students do that make this harder than it should be. Here's a list, with suggestions for overcoming them:
- Reject an idea too soon, before you've given it enough thought. Solution: Don't be critical too early; you don't want to inhibit your creativity. Collect ideas first, whittle them down later.
- Choose a project that's too large. Solution: Keep it simple. Think it over for a while, and choose a small part of a larger project that is interesting on its own. You can always do more later.
- Your dataset doesn't have everything you want. To be honest, that's pretty much every dataset we've ever seen. Solution: Make do with what you have.
- Pick a dataset that's not available. Solution: Start with what you have, ask what you can do with it. We call this the Jeopardy approach: start with the answer, come up with the question. If that fails, find another dataset.
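To make the Jeopardy approach concrete, here is a minimal Python sketch of starting from a dataset and working back to questions. It is one possible way to do this, not a prescribed method. The tiny table and its column names (`major`, `category`, `median_salary`, `sample_size`) are made-up placeholders, and the helper functions `classify_columns` and `candidate_questions` are hypothetical names invented for this illustration.

```python
import csv
import io

# Made-up placeholder rows for illustration only; these are NOT real salary data.
SAMPLE_CSV = """major,category,median_salary,sample_size
Major A,Group 1,100,10
Major B,Group 1,120,20
Major C,Group 2,90,15
"""


def is_number(text):
    """Return True if the string can be read as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def classify_columns(rows):
    """Label each column 'numeric' or 'categorical' based on its values."""
    kinds = {}
    for name in rows[0]:
        values = [row[name] for row in rows]
        kinds[name] = "numeric" if all(is_number(v) for v in values) else "categorical"
    return kinds


def candidate_questions(kinds):
    """Pair every numeric column with every categorical column as a question."""
    numeric = [name for name, kind in kinds.items() if kind == "numeric"]
    categorical = [name for name, kind in kinds.items() if kind == "categorical"]
    return [f"How does {num} vary across {cat}?" for num in numeric for cat in categorical]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
kinds = classify_columns(rows)
questions = candidate_questions(kinds)
for question in questions:
    print(question)
```

Most of the questions it prints will be dull, and that's fine: the point is to collect candidates first and whittle them down later.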
Projects are less structured than most things you'll run across in school. It's challenging, at first, to work with so little structure, but most students find that developing their own projects is one of the most rewarding things they do.
Exercise. Write down three project ideas. Don't overthink this: one or two lines each will do.