I’m doing an independent study on Computer-Assisted Reporting with Professor Cory Armstrong in the spring. I was told at a couple of job interviews that I need CAR experience, but the University of Florida takes data no further than the Fact Finding class.
So I’m going to find a dataset, explore it, and hopefully be able to produce a story package.
Right now I’m doing some research on different datasets currently available, but I’m having trouble narrowing down my subject.
I’ve been looking at some Pew studies for ideas on what sort of data to look at, as well as the IRE Database Library.
Some ideas so far:
- Campus Crime: compare Florida colleges or SEC colleges or just look at UF crime
- Walter Reed: I’m not sure how to find this data, or whether it is readily available, but it was one of the seriously undercovered stories listed by Pew. This could be taken more broadly: reduced funding for VA hospitals; funding vs. number of troops vs. number of living vets; money issues of all kinds from 2001 to the present; number of wounded, currently enlisted, and vets no longer enlisted; maybe also insurance
- Fluctuating Gas Prices
- Tasering Cases in Florida
Edit: I’m also trolling the Sunlight Foundation’s “Insanely Useful Web Sites.”
That’s it so far. (Thanks to Mindy for the help.)
Picking a subject has always been the hardest thing for me. I just want to look at everything!
Suggestions, as always, are welcome.
14 comments ↓
The Pew studies are a great resource for data that you can repurpose. I used one of their internet studies in my Ph.D. statistics class. Good luck. It’ll be interesting to see what you come up with.
Thanks Bryan!
DIY is my favorite approach. In fact, I have a few cool hand-built and totally exclusive databases which you’re welcome to have a go at.
My favorite is the one I built for this story. It’s an anonymous database of every single person awarded a degree between 1990 and 2000 by Florida’s public universities, plus the University of Miami and a sample of smaller private colleges. For every degree recipient, it includes hometown city, state and ZIP code; current city, state and ZIP code; and degree year, major and level. It includes a standard federal coding system for majors, and weighs in at more than 500,000 records.
This story on weight in the NFL and this one on poor hiring practices in Florida’s juvenile justice system also relied on exclusive, DIY databases.
Well, I don’t know how to build databases yet (it’s on the to do list for the spring). So I’ll probably be working in Excel. Love the degree database though.
What is your statistical toolset? Looking at tons of data isn’t very useful (imo) without some concept of how you plan to compare your variables and understand associations. Bug me once you have a more solid idea of your dataset.
Dave, I got an A in Stats 1, and I did well in stats in high school, but I’ve probably forgotten most of it by now. Professor Armstrong is in charge of helping me remember. :)
Megan: Looking for data to find a story in it is a mistake. The best editor I’ve ever had told me that data before story usually leads to a bunch of no-s***-sherlock journalism. Ideally, you need to find a story, and then find the data to tell the story, not the other way around. Getting some experience importing, sorting, aggregating and generally handling datasets is fine. It’s a good idea. But for your final project story, start by looking for a story and then look for the data. It may be found in one of the datasets you’re playing with, or one available online. It may be in a DIY one like Mr. Hartnett describes (and don’t fear those — some of the best stories are in DIY databases, and they aren’t black box magic to make). You live in the best of all places for this kind of work, so at first, divorce yourself from the notion that you have to have data to find a story. Find the story, then find the data.
If you get stuck, holler at me. This independent study idea is a great one. Let me know if I can help.
Thanks Matt. The professor and I haven’t really worked everything out yet, so I’m kinda poking around on my own. I think we’ll probably play with some random data first and then I’ll look for my story. Thanks for the advice.
Gosh…I love all of the helpful feedback Megan is getting. I think everyone has given her great advice. I’m starting to think this will be a bit of a group effort.
My two cents is this: Matt’s right about the fishing expedition being a bad idea. My first task for Megan is to think about what she wants to study and come up with some research questions: What are we looking for? Crime patterns? Tuition changes? Salary fluctuations? Air quality?
Once you come up with the idea to study–then think about what kinds of datasets would provide that kind of information.
The worst thing to do–and I think that all of the great comments here have addressed this, at least implicitly–is to have a preconceived notion of what you expect the data to tell you. Using tunnel vision in this case will often result in a pretty biased story.
Gosh, isn’t this cool? :) I think Megan was thinking about finding a big old dataset so she could first learn how to “interrogate the data,” as they say in IRE. (Correct me if I’m wrong.) But I see how that is putting the cart before the horse if she is also going to produce a story package.
I’m concerned that she has only one semester (and other classes and stuff too) — I would hate to see her spend too much time collecting data, and she doesn’t have time for a FOIA request. What I hope she will learn is how to do the stuff Mark and Matt and Derek do — after they’ve got the data.
I’m out of my comfort zone on this, however.
@Mindy: I think the IRE folks say “interview the data,” though I actually like your version better. Maybe even “waterboard the data.”
TORTURE those data, YEAH!!
[...] semester I am taking an independent study on Computer-Assisted Reporting. I blogged about this last week, but to recap briefly: I will be learning how to find, clean and analyze data. At the end of the [...]
[...] this works out. Already, we’ve gotten lots of feedback from those in the field about some ideas. But here’s what we’ve decided on: she’ll develop a topic and specific research [...]