Tom Soderstrom, CTO of NASA's Jet Propulsion Laboratory: How to choose a data scientist

At AWS re:Invent, ITProPortal spoke to Tom Soderstrom, the chief technology officer for NASA's Jet Propulsion Laboratory (JPL), the space agency's research and development centre in Pasadena, CA. The laboratory's primary function is the construction and operation of robotic planetary spacecraft, though it also conducts Earth-orbit and astronomy missions, as well as operating NASA's Deep Space Network. The JPL recently hired Rob Witoff, its first ever IT data scientist.

So what made NASA's JPL take the decision to hire a data scientist?

I look at the future. Every three years, I look at what the next IT decade will look like - because an IT decade is three years. We're taste-testing the future now, to see what makes sense in our environment. So three years ago there were nine trends, and what we found was that big data was one of the trends. So when looking at big data, we thought "how do you take advantage of it?" - "how do you get your arms around big data?" And we realised we needed an IT data scientist.

We wrote a job ad, and tried to find someone both inside and outside. We eventually found Rob Witoff, who is as far as I know the first data scientist at NASA.

So what should you look for in a data scientist?

A data scientist is someone who understands the domains of programming, machine learning, data mining, statistics, and hacking - in a good way: knowing how to get in and grab the data - and needs to understand his domain, whether it's science, engineering or business. But most of all, a data scientist needs to be able to tell a story. They need to be able to teach the data to tell a story we didn't know from data we already had. Rob is a marvel at that.

This summer, we created a startup at the JPL, just like Skunk Works at Lockheed Martin, which built the SR-71 Blackbird. Sometimes it's impossible to do something that drastic within the mother company. What we wanted to do was do the same thing with big data and analytics. But we created a startup inside JPL.

How did you model this new startup?

I've worked in lots of startups and Rob has too. We took a cubicle farm, ripped out the cubicles, put in tables that faced each other, so people faced each other instead of the walls. We used a startup mentality, where I blocked and tackled, allowing them to get their job done, and Rob led them. We had some tremendous results - just amazing results. In one summer, we were able to answer questions that we've been trying to answer for years and years. Not just one, but half a dozen of them.

We took a low-hanging fruit mentality for data. So what's a low-hanging fruit? It's low-hanging because it's possible to go in and get it. It's fruit because it has value to someone.

So you see, we're tackling data analytics from two sides - the small and the very large. We have data coming from the Mars Curiosity Rover, and lots of sensors looking at ocean data over the very long term, for example. If we could predict hurricanes, say, that's a very big job. But we also have the small.

So we took the short-term, pragmatic quick results - the low-hanging fruit - and the long-term data also, and the idea is that they'll meet in the middle.

How did you choose Rob Witoff out of all the applicants?

When I looked for a data scientist, I got 24 resumes. They came in three categories: the scientist, the data analytics guru, and the IT professional. But I also wanted someone who had done startups, someone who could tap into all of that information coming from the outside - because often we become so insular. When we're successful, we think we've done big data, and we know it all.

There's all this wonderful buzz going on around big data, and buzz generates venture capital, and venture capital generates new technologies and tools. And in the end Rob was top of the heap. He's done startups, and he knows how to visualise things on mobile devices. He can code. He knows how to hack - in a good way - getting into data quickly and extracting what we need.

How do you even approach such massive amounts of data?

We're not digging for gold. Instead of digging, you do these quick prototypes to get a feel for where the gold might be. Once you sense it, then you can go back and get investment to dig for the gold there. That's what we did with these prototypes. Now, by the end of this summer, we know where it's worth digging deeper. Once you have your arms on that data - the low-hanging fruit - you climb higher into the tree. That prototype will lead to more - or not. If not, it doesn't matter, because you haven't spent much money.

Will you employ more data scientists in the future?

Right now, we employ only one, but we're soon hiring another one. They'll be a business analyst. I envision the future as involving many data scientists, each for their own domain. MGM just hired an HR data scientist, for example. A single data scientist is absolutely not enough.

A data scientist is a consultant role - no one person ever becomes an expert. He ties everything together, maybe, but he's an expert in none of it. They have to be someone who likes numbers and likes people.

We also interviewed the JPL's first ever data scientist, Rob Witoff. Tom and Rob will be giving a talk at 11:00 tomorrow (19:00 GMT), called "Small steps in visual analytics for NASA's big data". Make sure to follow our live coverage of the AWS re:Invent conference for minute-by-minute updates.

Images: NASA