ML camp teaching 2023 edition

I’ll be teaching next week at the University of Minnesota’s Machine Learning Camp for high school students, which I founded back in 2018 with Dr. Melissa Lynn, Dr. Kaitlin Hill, and the MCFAM team. It’s a bit of a homecoming and a bookend to my three years at C.H. Robinson. I looooooooove talking about the edge cases and problems and complications of data science, machine learning, and artificial intelligence, and with the explosion of ChatGPT and other Large Language Models (LLMs) into the common consciousness, as well as Midjourney, Stable Diffusion, and DALL-E for images, we’ve got a whole new set of examples to talk about!

Examples I want to hit on: the lawyer who used ChatGPT with disastrous results, the “who is pregnant” problem in Large Language Models like ChatGPT, the marked downgrade in diversity in images from the image generators, the pr0n-y problem with training on, um, well, certain kinds of fiction (not to mention similar problems on the image side). I don’t think I’ll manage to get to intellectual property problems per se (the “stolen data” problem).

Time is getting away from me so this will be a multi-part set of posts. In this post I’ll tackle the misconceptions that ChatGPT is a search engine, ChatGPT is evaluated on truth, and ChatGPT represents the real world.

Misconception 1: ChatGPT is a search engine. Related misconception: ChatGPT and other LLMs are trained to be true and evaluated on the truth of their answers.

Fact 1: Nope, ChatGPT is not a search engine and neither is Bard. The older version of ChatGPT was only trained on info from the internet up to about September 2021. ChatGPT now has plugins that allow “internet access” in some cases, and Bard can access the internet as well, but LLMs in general are trained and evaluated on plausibility.

A large language model looks for the set of words that would most probably follow another set of words, given the training data. Of course this is a bit oversimplified, as one can adjust weights, add stochasticity, etc. But the basic core is true: what LLMs are built to deliver is plausibility/probability, not truth. The lawyer who used ChatGPT to assist in his research learned this the hard way. ChatGPT entirely made up a number of cases that don’t exist. They don’t even make sense: there’s a wrongful death suit about a guy who missed his flight, and whose name changes within the first few pages. He’s a fictional character made up by ChatGPT, but unlike a fiction made up by a human writer, he’s not even got the consistency to have the grace to die! (ChatGPT says it is a wrongful death suit, remember.)
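To make the “plausibility, not truth” point concrete, here is a toy sketch in Python. It is a bigram model, which is vastly simpler than ChatGPT, and the tiny corpus, the function names, and the seed are all made up for illustration. The only thing it has in common with a real LLM is the core idea: pick the next word according to how often it followed the previous word in the training text.

```python
import random
from collections import Counter, defaultdict

# Tiny made-up "training corpus" (hypothetical text, not real case law).
corpus = (
    "the court ruled for the airline . "
    "the court ruled for the passenger . "
    "the court ruled for the airline . "
    "the passenger missed the flight ."
).split()

# Count which word follows which. This is a bigram model, a deliberately
# oversimplified stand-in for a large language model.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1


def next_word_probs(prev):
    """Return {word: probability} for the words seen after `prev`."""
    counts = follows[prev]
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def generate(start, n_words, seed=0):
    """Sample up to n_words more words, each weighted by its bigram probability."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(n_words):
        probs = next_word_probs(words[-1])
        if not probs:  # no observed continuation for this word
            break
        choices, weights = zip(*probs.items())
        words.append(rng.choices(choices, weights=weights)[0])
    return " ".join(words)


print(next_word_probs("the"))
print(generate("the", 8))
```

Nothing in this code checks whether a generated sentence is true. It can happily produce something like “the court ruled for the flight”, which never appears in the corpus, simply because each word-to-word step is plausible on its own.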

Misconception 2: ChatGPT is trained on real-life writing so it reflects the world we live in, right? And since Stable Diffusion and other image-generation models are trained on images from the internet they too represent the world, right?

Fact 2: Nope. When you skim “highest probability” paths off of life, you simply lose a lot of reality. Yes, ChatGPT overrepresents American experiences and writing — but *even given that* it loses a lot. Check out these examples from Kathryn Tewson (she’s got her own interesting AI story):

ChatGPT has specific pathways trained in regarding gender associated with professions. Stable Diffusion, an image-generation model, shows the same phenomenon, *amplifying* disparities beyond reality: “Stable Diffusion depicts a different scenario, where hardly any women have lucrative jobs or occupy positions of power. Women made up a tiny fraction of the images generated for the keyword ‘judge’ — about 3% — when in reality 34% of US judges are women, according to the National Association of Women Judges and the Federal Judicial Center. In the Stable Diffusion results, women were not only underrepresented in high-paying occupations, they were also overrepresented in low-paying ones.” There’s a similar phenomenon with skin tone: Stable Diffusion “specifically overrepresented people with darker skin tones in low-paying fields. For example, the model generated images of people with darker skin tones 70% of the time for the keyword ‘fast-food worker,’ even though 70% of fast-food workers in the US are White. Similarly, 68% of the images generated of social workers had darker skin tones, while 65% of US social workers are White.”
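One rough way to see the amplification is to divide the share in the generated images by the real-world share, using the figures quoted above. This is just back-of-the-envelope arithmetic in Python, not anyone’s official methodology. The function name is made up, and for the two skin-tone rows I’m assuming the real-world “darker skin tones” share is roughly 100% minus the quoted White share, which is a crude proxy and not a number from the quoted source.

```python
# (label, % of generated images, approximate real-world %)
# ASSUMPTION: for the skin-tone rows, the real-world share is approximated as
# 100 minus the quoted White share. That is a crude proxy, not a sourced figure.
rows = [
    ("judge (women)", 3, 34),
    ("fast-food worker (darker skin tones)", 70, 100 - 70),
    ("social worker (darker skin tones)", 68, 100 - 65),
]


def representation_ratio(generated_pct, real_pct):
    """Ratio above 1 means overrepresented in the images; below 1, underrepresented."""
    return generated_pct / real_pct


for label, generated_pct, real_pct in rows:
    ratio = representation_ratio(generated_pct, real_pct)
    print(f"{label}: generated {generated_pct}%, real ~{real_pct}%, ratio {ratio:.2f}")
```

Under those assumptions, women show up as judges at less than a tenth of their real-world rate, while people with darker skin tones show up in the two low-paying jobs at around twice the proxy rate.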

It’s been a minute

An entire three-year tour at C.H. Robinson began, developed, and ended since I last wrote here. I started there as a data scientist in September 2019, grew through senior and principal data scientist positions, and ended with a year and a half as director of data science, leading a team that grew to be like 14 data scientists! It was a deep education in freight (North American surface transit), math in industry, how to work with engineering partners to balance innovation/scalability/maintainability, how to work with product partners (product means very different things to different people), and how to keep effort directed at solving high-impact business problems. That last one is toughest and probably yields the most disagreement. Some battles on data integrity probably don’t seem high-impact to business partners, but you’re never going to implement reinforcement learning or game theory or even automation of the most basic kinds if you can’t compare the algorithmic suggestion with the human action.

Anyhow, wonderful people there and a lot of learning — and again I’m gearing up to teach at the machine learning summer camp for high school students that I started some years ago. So wonderful to see that it survived COVID disruptions and is occurring in person at the University of Minnesota this year!

Why come back to this antiquated WordPress site? Well, ChatGPT and other LLMs are changing everything. They’re flooding the internet (and sci fi magazines and the arXiv) with generic-ish “content”. Seems like “content” coming from a real human will be in fashion again soon.

What is happening, then, “at the moment”? Since leaving C.H. Robinson I’ve dug up a lot of stumps in the back yard. Some of them I couldn’t dig up, so I had to hack them down with an axe. I’ve used that axe more this summer than I have in years. I planted three more black currant bushes, propagated from the black currant bush mentioned in the last post (from 2019). The original bush is doing well. The red currant bush died, after a wonderful run. Our cherry tree died and split in half. Now the backyard is full sun instead of full shade. It seems like a good time to literally tend my garden, and ideally set up for another 3 years of garden autopilot.

I have some ideas on education on time series — after working with time series models in industry for several years, I have some perhaps heterodox ideas on what’s useful. ARIMA is not it. Updates to come.

I’m also in the throes of summer camp stuff. This country is ridiculous when it comes to taking care of children and working full time. Competition for many camps is cutthroat: Circus Juventas sold out in less than two hours. The hours are different for each one, transportation is different for each one, after care, lunch, etc etc etc. There’s got to be an app for that. I see clearly the app I want for that. However, creating it myself might be kind of a pain. I am not, after all, a web dev or app developer. So, how far do I want to go to solve my own pain? Many moms I’ve talked to have a spreadsheet. The number of (mainly) women out here managing children’s summer schedules using Excel macros does speak to a market…

Clearly the other thing to consider is home improvement. I went through a deep dive on the wisdom of using muriatic acid to remove efflorescence on cinder blocks in a poorly-ventilated basement and finally concluded that I don’t think I can ventilate well enough to not rust our appliances since the vapors are denser than air. Pretty annoying. On to mechanical solutions.

I am (slowly) looking for a next step in corporate data science/quantitative finance/useful math. It is fun to explore the universe of possibilities & look for the right fit. I want to use math and insight to address business problems that fundamentally sit in the world of complex systems, and I want to understand more deeply the levers of the systems that run the world. Logistics was great for that, like being inside a Neal Stephenson novel. So… what’s next? Fortunately I don’t have to address that in a blog titled “At the moment”.