
Actual Intelligence in the Age of AI

In the Author Spotlight series, TDS Editors chat with members of our community about their career path in data science and AI, their writing, and their sources of inspiration. Today, we’re thrilled to share our conversation with Jarom Hulet.

Jarom is a data science leader at Toyota Financial Services. He believes in using practical data science solutions to add value. He is passionate about developing a deep knowledge of basic and advanced data science topics.  


You’ve argued that a well-designed experiment can teach you more than knowing the counterfactual. In practice, where experimentation is still underused, what’s your minimum viable experiment when data is scarce or stakeholders are impatient?

I do think that experimentation is still underused, and may be more underused now than it has been historically. Observational data is cheaper, easier to access, and more abundant with every passing day – and that is a great thing. But because of this, I don’t think many data scientists have what Paul Rosenbaum called the “experimental state of mind” in his book Causal Inference. In other words, I think that observational data has crowded out experimental data in a lot of places. While observational data can legitimately be used for causal analysis, experimental data will always be the gold standard.

One of my mentors frequently says, “Some testing is better than no testing.” This is an effective, pragmatic philosophy in industry. In business, learning doesn’t have intrinsic value – we don’t run experiments just to learn, we do it to add value. Because experimental learnings must be converted into economic value, they can be balanced with the cost of experimentation, which is also measured in economic value. We only want to do things that have a net benefit to the organization. Because of this, statistically ideal experiments are often not economically ideal. I think data scientists’ focus should be on understanding different levels of business constraints on the experimental design and articulating how those constraints will impact the value of the learnings. With those key ingredients, the right compromises can be made that result in experiments that have a positive value impact on the organization overall. In my mind, a minimum viable experiment is one that stakeholders are willing to sign off on and that is expected to have a positive economic impact on the firm.
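One rough way to make this trade-off concrete (using entirely made-up numbers, not figures from the interview) is a back-of-the-envelope comparison of the expected economic value of an experiment's learnings against the cost of running it:

```python
# Hypothetical back-of-the-envelope check of whether an experiment is
# worth running: compare the expected economic value of the learnings
# against the cost of the experiment. All numbers here are illustrative.

def expected_net_benefit(p_actionable, value_if_actionable, cost):
    """Expected value of the learnings minus the cost of the experiment.

    p_actionable        -- probability the result actually changes a decision
    value_if_actionable -- economic value captured if it does
    cost                -- cost of running the experiment (engineering time,
                           revenue held back from the control group, etc.)
    """
    return p_actionable * value_if_actionable - cost

# A "statistically ideal" design: large sample, long runtime, high cost
ideal = expected_net_benefit(p_actionable=0.6,
                             value_if_actionable=500_000,
                             cost=350_000)

# A constrained design: noisier result, but cheap enough to be net-positive
minimal = expected_net_benefit(p_actionable=0.4,
                               value_if_actionable=500_000,
                               cost=80_000)

print(ideal, minimal)  # the ideal design is net-negative, the minimal one net-positive
```

Under these (assumed) numbers, the statistically better experiment destroys value while the compromised one creates it, which is the sense in which a "minimum viable experiment" can beat an ideal one.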

Where has AI improved your day-to-day workflow, as a practicing/leading data scientist, and where has it made things worse?

Generative AI has made me a more productive data scientist overall. I do, however, think there are drawbacks if we “abuse” it.

Improvements to productivity

Coding

I leverage GenAI to make my coding faster – right now I use it to help (1) write and (2) debug code.

Most of the productivity I see from GenAI is related to writing basic Python code. GenAI can write basic snippets of code faster than I can. I often find myself telling ChatGPT to write a somewhat simple function, and I respond to a message or read an email while it writes the code. When ChatGPT first came out, I found that the code was often pretty bad and required a lot of debugging. But now, the code is generally pretty good – of course I’m always going to review and test the generated code, but the higher quality of the generated code increases my productivity even more.

Generally, Python error notifications are pretty helpful, but sometimes they are cryptic. It is really nice to just copy/paste an error and instantly get clues as to what is causing it. Before, I would have to spend a lot of time parsing through Stack Overflow and other similar sites, hoping to find a post close enough to my problem to help. Now I can debug much faster.

I haven’t used GenAI to write code documentation or answer questions about codebases yet, but I hope to experiment with these capabilities in the future. I’ve heard really good things about these tools.

Research

The second way that I use GenAI to increase my productivity is in research. I have found GenAI to be a good study companion as I’m researching and studying data science topics. I’m always careful not to believe everything it generates, but I have found that the material is generally quite accurate. When I want to learn something, I usually find a paper or published book to read through. Often, I’ll have questions about parts that aren’t clear in the texts and ChatGPT does a pretty good job of clarifying things I find confusing.

I have also found ChatGPT to be a great resource for finding resources. I can tell it that I’m trying to solve a specific type of problem at work and I want it to refer me to papers and books that cover the topic. I’ve found its recommendations to generally be pretty helpful.

Drawback — Substituting actual intelligence for artificial intelligence

Socrates was skeptical of storing knowledge in writing (that’s why we primarily know about him through Plato’s books – Socrates didn’t write). One of his concerns was that writing makes our memory worse: we rely on external text instead of internal memorization and a deep understanding of topics. I have this same concern, for myself and for humanity, with GenAI. Because it is always available, it is easy to just ask the same things over and over again and never remember, or even understand, the things it generates. I know that I’ve asked it to write similar code multiple times. Instead, I should ask it once, take notes, and memorize the techniques and approaches it generates. While that is the ideal, it can definitely be a challenge to stick to that standard when I have deadlines, emails, chats, etc. vying for my time. Basically, I’m concerned that we will use artificial intelligence as a substitute for actual intelligence rather than as a supplement and multiplier.

I’m also concerned that access to quick answers leads to a shallow understanding of topics. We can generate an answer to anything and get the ‘gist’ of the information. This can often lead to knowing just enough to ‘be dangerous.’ That is why I use GenAI as a supplement to my studies, not as a primary source.

You’ve written about breaking into data science, and you’ve hired interns. If you were advising a career-switcher today, which “break-in” tactics still work, which aged poorly, and what early signals really predict success on a team?

I think that all of the tactics I’ve shared in previous articles still apply today. If I were to write the article again, though, I would probably add two points.

One is that not everyone is looking for GenAI experience in data science. It is a very important and trendy skill, but there are still a lot of what I would call “traditional” data science positions that require traditional data science skills. Make sure you know which type of position you are applying for. Don’t send a GenAI-saturated resume to a traditional position, or vice versa.

The second is to pursue an intellectual mastery of the basics of data science. Actual intelligence is a differentiator in the age of artificial intelligence. The educational field has become pretty crowded with short data science master’s programs that often seem to teach people just enough to have a superficial conversation about data science topics, train a cookie-cutter model in Python, and rattle off a few buzzwords. Our interview process elicits deeper conversations on topics, and this is where candidates with shallow knowledge go off the rails. For example, I’ve had many interns tell me in interviews that accuracy is a good performance measurement for regression models. Accuracy is typically not even a good performance metric for classification problems, and it doesn’t make any sense at all for regression. Candidates who say this know that accuracy is a performance metric and not much more. You need to develop a deep understanding of the basics so you can have in-depth conversations in interviews first, and later effectively solve analytics problems.
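To see why the accuracy answer goes off the rails, here is a minimal sketch (with made-up numbers) of how exact-match accuracy collapses on continuous targets, while distance-based regression metrics like MAE and RMSE stay informative:

```python
# Why "accuracy" is meaningless for regression: continuous predictions
# almost never match the target exactly, so exact-match accuracy is ~0
# even for a very good model. Distance-based metrics tell the real story.

y_true = [10.0, 12.5, 14.0, 15.5, 20.0]
y_pred = [10.2, 12.4, 13.7, 15.9, 19.5]  # close, but never exactly equal

# "Accuracy" = fraction of exact matches (a classification idea)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Mean Absolute Error: average distance between prediction and truth
mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Root Mean Squared Error: penalizes large misses more heavily
rmse = (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5

print(accuracy)       # 0.0 -- by "accuracy", this good model looks hopeless
print(round(mae, 3))  # 0.3 -- but its average miss is small
print(round(rmse, 3))
```

A candidate with a deep grasp of the basics would reach for MAE or RMSE here without prompting.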

You have written about a wide range of topics on TDS. How do you decide what to write about next? 

Generally, the inspiration for my topics comes from a combination of necessity and curiosity.

Necessity

Often I want to get a deeper understanding of a topic because of a problem I’m trying to solve at work. This leads me to research and study to gain more in-depth knowledge. After learning more, I’m usually pretty excited to share my knowledge. My series on linear programming is a good example of this. I had taken a linear programming course in college (which I really enjoyed), but I didn’t feel like I had a deep mastery of the topic. At work, I had a project that was using linear programming for a prescriptive analytics optimization engine. I decided I wanted to become an expert in linear programming. I bought a textbook, read it, replicated a lot of the processes from scratch in Python, and wrote some articles to share the knowledge that I had recently mastered.
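As a flavor of that "from scratch" approach (a hypothetical toy problem, not one taken from Jarom's series), a tiny two-variable linear program can be solved by enumerating vertices of the feasible region, since the fundamental theorem of linear programming guarantees that an optimum, when one exists, lies at a vertex:

```python
from itertools import combinations

# From-scratch solver for a tiny 2-variable LP, by vertex enumeration.
# Hypothetical problem: maximize 3x + 2y
#   subject to  x + y <= 4,  x <= 2,  x >= 0,  y >= 0

# Each constraint as (a, b, c), meaning a*x + b*y <= c
constraints = [
    (1, 1, 4),    # x + y <= 4
    (1, 0, 2),    # x <= 2
    (-1, 0, 0),   # x >= 0
    (0, -1, 0),   # y >= 0
]

def intersect(c1, c2):
    """Solve the 2x2 system where both constraints hold with equality."""
    a1, b1, d1 = c1
    a2, b2, d2 = c2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:          # parallel lines: no unique intersection
        return None
    x = (d1 * b2 - d2 * b1) / det
    y = (a1 * d2 - a2 * d1) / det
    return (x, y)

def feasible(pt):
    """Check the point against every constraint (with a float tolerance)."""
    return all(a * pt[0] + b * pt[1] <= c + 1e-9 for a, b, c in constraints)

# Candidate vertices: feasible intersections of constraint-line pairs
vertices = [p for c1, c2 in combinations(constraints, 2)
            if (p := intersect(c1, c2)) is not None and feasible(p)]

# Evaluate the objective 3x + 2y at each vertex and keep the best
best = max(vertices, key=lambda p: 3 * p[0] + 2 * p[1])
print(best, 3 * best[0] + 2 * best[1])  # (2.0, 2.0) 10.0
```

In practice one would use a library like `scipy.optimize.linprog` or PuLP, but replicating the mechanics like this is exactly the kind of exercise that turns a college course into deep mastery.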

Curiosity

I’ve always been an intensely curious person, and learning has always been fun for me. Because of these personality traits, I’m often reading books and thinking about topics that seem interesting. This naturally generates a never-ending backlog of things to write about. My curiosity-driven approach has two elements: (1) reading/researching and (2) taking intentional time away from the books to digest what I read and make connections, which is how Kethledge and Erwin define solitude in their book Lead Yourself First: Inspiring Leadership Through Solitude. This combined approach is much greater than the sum of its parts. If I just read all of the time and didn’t take time to think about what I was reading, I wouldn’t internalize the information or come up with my own unique insights on the material. If I just thought about things, I’d be ignoring lifetimes of research by other people. By combining both elements, I learn a lot, and I also develop insights and opinions about what I learn.

The data science and philosophy series I wrote is a good example of curiosity-driven articles. I got really curious about philosophy a few years ago. I read multiple books and watched some lectures on it. I also took a lot of time to set the books down and just think about the ideas in them. That is when I realized that many of the concepts I studied in philosophy had strong implications on and connections to my work as a data scientist. I wrote down my thoughts and had the outline for my first article series!

What does your drafting workflow for an article look like? How do you decide when to include code or visuals, and who (if anyone) do you ask to review your draft before you publish it?

Typically I’ll have mulled over an idea for an article for a few months before I start writing. At any given point in time, I have 2-4 article ideas in my head. Because of the length of time I spend thinking about articles, I usually have a pretty good structure before I start writing. When I start writing, I put the headers in the article first, then I write down good sentences that I previously came up with. At that point, I start filling in the gaps until I feel that the article gives a clear picture of the thoughts I’ve generated through my studies and contemplations. This process works really well for my goal of writing one article every month. If I wanted to write more, I’d probably have to be a little more intentional and less organic in my process.

Any time I find myself writing a paragraph that is painful to write and read, I try to come up with a graphic or visual to replace it. Graphics with concise commentary can be really powerful, and way better at creating understanding than a lengthy and cumbersome paragraph.

I often insert code for the same reason I use visuals. It is annoying to read a verbal description of what code is doing; it is way better to just read well-commented code. I also like putting code in articles to demonstrate “baby” solutions to problems that any practitioner would use pre-built packages to actually solve. It helps me (and hopefully others) get an intuitive understanding of what is going on under the hood.

To learn more about Jarom’s work and stay up-to-date with his latest articles, you can follow him on TDS or LinkedIn.




TDS Editors
