How to Become a Data Journalist

First, the bad news: If you want to be a data journalist, odds are you’ll need to teach yourself. There are courses and organizations that can help, but journalism schools are only slowly adding data to their degree offerings. What’s more, if you’re reading this, you’re probably a journalist who wants to add data to your skill set. Going back to grad school may not be practical. One does not enroll in a design undergrad because one wants to learn how to use Photoshop, for instance.

Now, the good news: It’s possible to transform yourself into a data journalist. Here’s how:

1)    Pick a platform.

2)    Learn the basics.

3)    Tackle one example of each of the Big Three.

This is how those steps work in more detail:

Pick a platform

The first step is picking your platform. You can take the traditional approach and learn barely enough of a dozen different platforms to churn out a serviceable pie chart after an hour of bashing around and Googling everything, but I don’t recommend it. At a certain point, one must commit. This is one of those points. Figure out which software feels right to you, whether it be Tableau, Google Data Studio, Power BI, or some other platform, and go with it.

This choice may not be entirely up to you. If your newsroom has already done some data projects, you should probably get with the program and use whatever they’re already using. There are also plenty of ways to visualize and analyze data beyond the three I listed above. However, I’m operating under the assumption that you’re a reporter, and therefore probably do not have coding skills, which eliminates certain options like D3.

Learn the basics

The second step is learning enough of your chosen platform to get started. Most platforms come with their own tutorials, and you can also look to services like Lynda for training. Alternatively, you can just dive into step 3 and try to learn through trial and error. Learning software this way is a bit like learning to swim by jumping in a lake and hoping things work out for the best, but experiential learners may find it’s the only approach that really works.

If your newsroom has any experience with data, your very first step should probably involve finding whichever reporters have done data work in the past, and wheedling them for help getting started. You can also hunt down a full-time data journalist outside your newsroom and ask them to mentor you. There’s a good chance they’ll go for it.

Data journalism can be a lonely, isolated life bereft of human sources to interview. In other words, many data journalists could use a break and some human contact. Reach out to one and see what they say. I’ve spent years trying to wheedle the journalists I work with into learning Tableau, with little success. Every single journalist who has come to me asking for help with training has received some variation of, “Absolutely! Of course I can help!”

Tackle one example of each of the Big Three

The third step is tackling the Big Three: elections, census, and economy. If you’re a journalist, you need to be able to cover elections, you need to be able to cover demographics, and you need to be able to cover economic issues. From these three basic types of data, most other types of analysis can be extrapolated.

Further, all three types are topics for which data should be fairly easy to find. Most countries have plenty of election data available, whether generated internally or externally. Same goes for census information and economic metrics.

In each case, you will follow the same basic production outline:

1) Hit Google and figure out which data is available on the subject. This is the most difficult step. Don’t be discouraged if you hit a wall early on–data is often surprisingly elusive. Keep trying. Be prepared to jump through some hoops even once you find the data. Many sources have a learning curve simply to figure out how to select or download their data.

2) Pull up the data you find in something simple like Excel, so you can see how the file is structured and what it contains. Figure out how the headers work. What do those column names mean? Are there any cryptic titles that need to be replaced with titles that make more sense? Is your dataset manageable in size? If you have more than 20 columns, it’s probably a poor choice for your first data project.

3) Clean your data, if needed. For your first training datasets, this shouldn’t be an issue. Try to steer clear of dirty data for now. When dealing with data, anything that’s wrong is “dirty.” For instance, if you have two alternate spellings of the same person’s name in your dataset, that’s “dirty” data because your software will treat those two spellings as different people. When you’re first starting out, you should seek out simpler datasets with few or no mistakes. You want to have a firm footing in your platform of choice before you try to tackle really messy data sources, which can easily lead to errors in your analysis and conclusions.

4) Figure out the question you want to answer. Think about which columns (which data) you’ll need to answer that question. What sorts of views make sense? Is there a geographic component? Time component? Both?

5) Open your data in the data platform of your choice. If you decided to go with something Excel-based, like Pivot Tables, you can pat yourself on the back at this point for having saved yourself a step.

6) Start to answer your question from step 4. Remember that most visualizations begin life looking homely and unintelligible. It takes time to shape them. Be prepared to make multiple visualizations looking at the data in different ways, and throw away those that don’t do anything useful. Often your finished product will be a dashboard with multiple views on the same page, and it’s normal to build more views than you need as you’re exploring your data. There’s nothing wrong with throwing away a third of the views you construct by the time you publish.

7) Decide whether you’ve answered your question. If you have, see if that answer leads to any more questions–or if you’ve stumbled across any other interesting questions as you’ve worked through what you have. The side questions are frequently the most interesting.

8) Go show your results to someone. Their response can be anything from shocked astonishment at your feat to bored indifference. Bear in mind that just because you blow someone’s mind, that doesn’t mean you’re necessarily a genius. (But it certainly does encourage you to go out and keep analyzing.)
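
None of this requires code, and the platforms mentioned earlier will handle all of it for you. Still, if you are curious what steps 2, 3 and 6 look like under the hood, here is a minimal Python sketch that uses only the standard library. Everything in it is invented for illustration: the precinct data, the column names, and the function names are assumptions, not part of any real dataset or tool. The name cleanup is deliberately crude and would mangle a name like “McDonald,” so treat it as a picture of the idea rather than a recipe.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical sample data standing in for a CSV you downloaded.
# Note the three different spellings of the same candidate.
SAMPLE_CSV = """precinct,candidate,votes
North,Jane Smith,120
North,Jon Doe,95
South,Jane  Smith,80
South,jane smith,40
South,Jon Doe,110
"""


def load_rows(text):
    """Step 2: read the file and return its column names and rows."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    return reader.fieldnames, rows


def normalize_name(name):
    """Step 3: collapse extra spaces and case differences (crude on purpose)."""
    return " ".join(name.split()).title()


def find_spelling_variants(rows, column):
    """Step 3: list raw spellings that collapse to the same cleaned value."""
    variants = defaultdict(set)
    for row in rows:
        variants[normalize_name(row[column])].add(row[column])
    # Keep only the names that appear under more than one spelling.
    return {clean: raw for clean, raw in variants.items() if len(raw) > 1}


def total_votes(rows):
    """Step 6: answer 'who got the most votes?' using the cleaned names."""
    totals = Counter()
    for row in rows:
        totals[normalize_name(row["candidate"])] += int(row["votes"])
    return totals


headers, rows = load_rows(SAMPLE_CSV)
print("Columns:", headers)
print("Row count:", len(rows))
print("Spelling variants:", find_spelling_variants(rows, "candidate"))
print("Totals:", total_votes(rows).most_common())
```

Without the cleanup in normalize_name, the made-up candidate Jane Smith would be counted as three different people, which is exactly the trap described in step 3. With it, her three spellings roll up into a single total.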

Strength in Numbers

Once you’ve tackled the Big Three, you should have enough familiarity with your chosen data platform to see whether data is for you. Many journalists are math-phobes and find numbers intimidating. It’s okay if you feel out of your depth. Bear in mind that as well as being scary, numbers can be transformative. Presented properly, they lend credence to your stories in a way that no amount of good prose can match.

One of the standby rules of writing states, “Show, don’t tell.” Data visualizations are the ultimate expression of this rule. They sit alongside our words and show the audience the numeric truth in what we report. This is of tremendous value to us, our work and our audience. In a world where credibility is a continuing challenge, we could all use some authoritative weight to throw around. Data can be that weight.