A Tale of Three Artificial Intelligence (AI) Experiences in Healthcare Interactions

AI tools are being increasingly used in healthcare, particularly for tasks like clinical notetaking during virtual visits. As a patient, I’ve had three recent experiences with AI-powered notetaking tools during appointments with the same clinician. Each time, I consented to its use, but the results were very different across the three encounters. The first two involved similar tools with mostly good results but surprising issues around pronouns and transparency of the consent process. The third was a different tool with a noticeable drop in quality. But what really stands out, when I compare these to a visit without AI, is that human errors happen too — and the healthcare system lacks effective processes for identifying and correcting errors, no matter the source.

Encounter One: Good Notes, Incorrect Pronouns

At the start of my first virtual appointment, my clinician asked for my permission to use an AI-powered tool for notetaking. I consented. After the visit, I reviewed the clinical note, and the summary at the top described me using “he/him” pronouns. I’m female, so they should have been “she/her”.

The rest of the note was detailed and clinically accurate and useful. But the pronoun error stood out. It seemed like the AI defaulted to male pronouns when gender information wasn’t explicitly mentioned, which made me wonder whether the model was trained with gender bias or if this was a design flaw in this tool.

Encounter Two: Clarifying Pronouns, Learning About Chart Access

At the next appointment, my clinician again asked for consent to use an AI-powered notetaker. I agreed and pointed out the pronoun error from the previous visit, clarifying that I am female and use she/her pronouns. My clinician looked at the prior note and was equally puzzled, commenting that this issue had come up with other patients — both directions, sometimes assigning female pronouns to male patients and vice versa. The clinician mentioned that the AI system supposedly had access to patient charts and should be able to pull gender information from existing records. That really surprised me: the consent statement had described the tool as a notetaking aid, but nothing had been said about access to my full chart. I would have given permission either way, but the fact that this hadn’t been disclosed upfront was disappointing. I had understood this to be a passive notetaking tool summarizing the visit in real time, not something actively pulling and using other parts of my health record.

This time, the pronouns in the note were correct (which could be because we talked about it and I declared the pronouns), and the overall summary was again accurate and detailed. But the fact that this was a recurring issue, with my provider seeing it in both directions across multiple patients, made it clear that pronoun errors weren’t a one-off glitch.

Encounter Three: A Different AI with Worse Results

By the third appointment, I knew what to expect. The clinician again asked for consent to use an AI notetaker, and I agreed. But after reviewing the note from this visit, two things stood out.

First, the quality of the notetaking was noticeably worse. Several errors were obvious, including situations where the note reflected the exact opposite of what had been discussed. For example, I had said that something did not happen, yet the note recorded that it did.

Second, this time the note disclosed the specific software used for notetaking at the bottom of the document. It was a different tool than the one used in the first two visits. I hadn’t been told that a different AI tool was being used, but based on the change in quality and the naming disclosure, it was clear this was a switch.

This experience reinforced that even when performing the same task — in this case, AI notetaking — the software can vary widely in accuracy and quality. I much preferred the output from the first two visits, even with the initial pronoun error, over the third experience where clinically significant details were recorded incorrectly.

Notably, there doesn’t seem to be a process or method (or if there is one, it is not communicated to patients or easily findable when searching) to give the health system feedback on the quality and accuracy of these tools. Which seems like a major flaw in most health systems’ implementations of AI-related tools, assessing and evaluating only from the healthcare provider perspective and overlooking or outright ignoring the direct impact on patients (which influences patient care, the clinician-patient relationship, trust with the health system….).

A Human-Only Encounter: Still Not Error-Free

To give further context, I want to compare these AI experiences with a separate virtual visit where no AI was involved. This was with a different clinician who took notes manually. The pronouns were correct in this note, but there were still factual inaccuracies.

A small but clear example: I mentioned using Device A, but the note stated I was using Device B. This was not a critical error at the time, but it was still incorrect.

The point here is that human documentation errors are not rare. They happen frequently, even without AI involved. Yet the narrative around AI in healthcare often frames mistakes as uniquely concerning when, in reality, this problem already exists across healthcare.

A Bigger Issue is Lack of Processes for Fixing Errors

Across all four encounters — both AI-assisted and human-driven — the most concerning pattern was not the errors themselves but the failure to correct them, even after they were pointed out.

In the first AI note where the pronouns were wrong, the note was never corrected, even after I brought it up at the next appointment. The error remains in my chart.

In the human-driven note, where the wrong device was recorded, I pointed out the error multiple times over the years. Despite that, the error persisted in my chart across multiple visits.

Eventually, it did affect my care. During a prescription renewal, the provider questioned whether I was using the device at all because they referenced the erroneous notes rather than the prescription history. I had to go back, cite old messages where I had originally pointed out the error, and clarify that the device listed in the notes was wrong.

I had stopped trying to correct this error after multiple failed attempts because it hadn’t impacted my care at the time. But years later, it suddenly mattered — and the persistence of that error caused confusion and required extra time, adding friction into what should have been a seamless prescription renewal process.

My point: the lack of effective remediation processes is not unique to either AI or human documentation. Errors get introduced and then they stay. There are no good systems for correcting clinical notes, whether written by a human or AI.

So, What Do We Do About AI in Healthcare?

Critics of AI in healthcare often argue that its potential for errors is a reason to avoid the technology altogether. But as these experiences show, human-driven documentation isn’t error-free either.

The problem isn’t AI.

It’s that healthcare systems as a whole have poor processes for identifying and correcting errors once they occur.

When we evaluate AI tools, we need to ask:

  • What types of errors are we willing to tolerate?
  • How do we ensure transparency about how the tools work and what data they access?
  • Most importantly, what mechanisms exist to correct errors after they’re identified?

This conversation needs to go beyond whether errors happen and instead focus on how we respond when they do.  It’s worth thinking about this in the same way I’ve written about errors of commission and omission in diabetes care with automated insulin delivery (AID) systems (DOI: 10.1111/dme.14687; author copy here). Errors of commission happen when something incorrect is recorded. Errors of omission occur when important details are left out. Both types of errors can affect care, and both need to be considered when evaluating the use of AI or human documentation.

In my case, despite the pronoun error in the first AI note, the notetaking quality was generally higher than the third encounter with a different AI tool. And even in the human-only note, factual errors persisted over years with no correction.

Three encounters with AI in healthcare - reflecting on errors of omission and commission that happen both with humans and AI , a blog post by Dana M. Lewis from DIYPS.orgAI can be useful for reducing clinician workload and improving documentation efficiency. But like any tool, its impact depends on how it’s implemented, how transparent the process is, and whether there are safeguards to address errors when they occur.

The reality is both AI and human clinicians make mistakes.

What matters, and what we should work on addressing, is how to fix errors in healthcare documentation and records when they occur.

Right now, this is a weakness of the healthcare system, and not unique to AI.

Infection is not inevitable: how to stop the spread of infections like COVID-19, flu, RSV, colds, and more in your house

I observe a number of people who seem to think it’s inevitable that once someone gets sick, the rest of the house is going to get sick with 100% certainty.

Nope.

First of all, household transmission rates are less than 100% for all of these conditions, even if you didn’t take any precautions or make any behavior changes.

Secondly, with knowledge about how these things spread and some mitigation measures, you can reduce this a lot – and in some cases to nearly zero.

I will caveat: that of course depending on the situation some of these precautions may not make sense or be possible. For example, if you have kids, your exposures may be different. We don’t have kids in our house, so we are dealing with adult to adult possible transmission. That being said, some of these things may still be worth doing to some degree, to cut down the risk of exposure and/or to limit the viral dose you are exposed to, even in a situation that is less straightforward like a parent taking care of sick kids.

PS – if you’re reading this in January 2025 and don’t read the rest, make sure you’ve gotten your flu shot (yes, it helps) for the 2024-2025 flu season. No, it’s not too late. If you’re >65, you should also check about the RSV vaccine (which like the flu shot is a seasonal vaccine). It’s not too late and given the current high rates of RSV and flu (and soon to be uptick in COVID), they can help prevent getting or limit the severity if you do get exposed.

Our experience preventing the spread of RSV and the common cold

I can speak with recent, practical experience on this.

Twice.

First, let’s talk about RSV.

Before Thanksgiving, Scott and I were exposed to a nibling (aka a niece or nephew – of which we have 10 plus several honorary ones!) who had what we thought was a lingering cough from a cold from a few weeks ago. Because I am avoiding infection, I wore a mask inside and did not get up close to the nibling, so as a result of all of this likely had minimal exposure. Scott did not mask and had spent a lot more time with this nibling hanging all over him and coughing near or on him. Within 48 hours, he started to get symptoms of something.

We activated our plan for household transmission avoidance. Well, with a rolling start: Scott recognized by Thanksgiving evening that he was starting to feel unwell and had a tiny bit of coughing. I thought I could hear something in his chest differently, in addition to the occasional cough, so I went into full precaution mode while Scott did a partial precaution mode. This meant we set up air purifiers by each of our beds, and a fan pointed in my direction so all air was blowing away from me. I also wore a mask to go to sleep in. (This was super annoying and I don’t like doing this, especially because I usually take a shower and go to bed with wet hair. Wearing an n95 with head straps on wet hair plus having a fan and purifier blowing on me is chilly and unfun.) I would’ve preferred Scott to mask, too, or go to the guest bedroom to sleep, but it was late in the evening; he wasn’t convinced he was really sick; and I was too exhausted to argue about it on top of the fact that we were leaving on a trip the next morning. So he did not mask that night.

The next morning, though, he was definitely sick. He tested negative for COVID, and the nibling and everyone else from that house had been testing for COVID and negative, so we were fairly confident based on serial testing that this was not COVID. At the time, the thought was this was a common cold.

Since we were planning to mask in indoor spaces, anyway, including in the airport and on the plane, we felt comfortable going on our trip as planned, because we would be unlikely to infect anyone else. (This includes no indoor dining: we don’t take off our masks and eat inside.)

Because Scott realized he was sick, he masked from that point forward (with a non-valved N95). We both masked in the car, in the airport, on the plane, and again when we arrived while driving in the rental car. Then a challenge: we needed to eat dinner (we got takeout), and we were sharing a hotel room overnight. We switched from a hotel room with a king bed to a room with two queen beds, which would give us some more space overnight. But we took turns eating dinner unmasked in the hotel room (it was too cold to be outside) with the far-UVC lights on and the purifiers around each of us when we ate. While we ate, the other person was masked. (And I went first, so there was no unmasked air from Scott while I was eating and he went second). We also took turns showering, again with me going first and him not having been in the bathroom unmasked until after I had gone in. Other than that, we stayed masked in the hotel room including overnight, again with purifiers between us and the far-UVC lights running.

(This hotel did not have windows that opened to outside, but if there had been windows I would’ve eaten in front of the open window and we would’ve likely kept it cracked open and the heat turned up, to improve the room’s ventilation).

The next day, we had more of a drive, and again we masked. We also slightly rolled down the windows in the backseat to improve ventilation. Scott sometimes took his mask off for comfort stretches, because he was driving, but put it back on fully and sealed it before coughing. I kept my mask on without ceasing. We did a 4.5 hour drive this way.

Luckily, once we arrived at our destination, there was a spare bedroom, so that became Scott’s headquarters. He stayed masked in the living room/shared areas. He sat downwind outside and masked up when coughing if anyone was outside. We left the sliding door to the outside cracked open, in order to keep the air in the common areas well-ventilated. This worked, because we were able to keep CO2 levels (a proxy for ventilation) down below 700 ppm most of the time.

Because we had separate bedrooms, we did not mask while sleeping the rest of the week, because we each had our own rooms (and own airflow). I did keep a purifier running in my room all week, but that’s my habit regardless because I’m so allergic to dust.

And guess what? It all worked. We masked again on the drive back to the airport and in the airport and on the plane and again once we got home.

I never got RSV. The four other adults we spent time with and shared a house with….also did not get RSV. So we are pretty confident that the transmission chain stopped completely at Scott.

In summary, what worked:

  • Masking in shared spaces, and two-way masking when it wasn’t possible to ventilate
  • When we had to sleep in the same room, two-way masking even for sleeping overnight
  • Scott masking in shared spaces that were well ventilated, and often left the room to go cough even when masked (or coughing outside). This often meant he masked, but the rest of us did not mask inside the whole time.
  • Generally keeping distance. Droplets were managed by the N95 mask, and we were ventilating to reduce aerosol transmission risk, but still keeping physical distance to further reduce the risk.

RSV is *very* transmissible especially with aerosols, and Scott was coughing a lot all day and night. (At one point, his Sleep Cycle app was estimating 18 coughs per hour). It took a long time for that to get down to normal, so he continued to sleep in our guest room when we got back and we continued to ventilate well even when we gradually reduced masking once he stopped coughing. It took about 10 or so days for all of his biometrics to normalize, and about 14 days for his cough to completely go away. It probably was closer to three weeks before he finally felt all recovered.

So with that timing in mind, you know what happens 4 weeks after Thanksgiving? Christmas/other end of year holiday gatherings.

We had plans to see 8 kids and 8 adults (plus us) for Christmas. And at Christmas, it seemed like everyone had a cold already. So again, I went in and mostly masked except for when I was in front of an open window and the room was well ventilated, without anyone coughing actively in the room. (If anyone was in the room with me and coughing, especially the kids, I would mask even with the window open).

I did not get the cold that 8-10 (out of 16) people eventually got.

But…Scott did. And this time, he was mostly masked, but he still spent more time up close with kids who were coughing quite a bit. And this is where some of the dynamics of knowing WHAT people have is helpful. You can’t always know, but you can sometimes use the symptoms to figure out what people have.

For example, based on symptoms of the nibling who passed on germs to Scott around Thanksgiving, and Scott’s symptoms (instant, incredible chest cough but no runny nose, sore throat, fever, or aches) we had ultimately guessed that Scott had RSV. We then knew that the biggest risk was either droplets from coughing (especially because the volume of coughing), which could be reduced drastically by masking, or aerosols, which again would be helped by his masking and also ventilation, and in closed spaces, two-way masking (me masking).

For the Christmas germs, everyone seemed to have mild symptoms with congestion, runny noses, some coughs. But no fevers or aches and it seemed less severe. Given our recent experience with RSV, we narrowed it down to likely being a cold (rhinovirus), given again everyone testing repeatedly negative for COVID.

Given that, we knew the risk was going to be highest for us from droplets and fomites. So we again masked in shared spaces; Scott went to sleep in the guest bedroom as soon as he started getting symptoms; and we both did a lot of hand washing. Scott washed his hands before touching any of my things and regularly wiped down the kitchen. I tried not to go in the kitchen much (our main overlapping shared space), but also wash my hands after any time that I did. He didn’t have much of a cough and it was more controlled, so he would hold his cough until he could cover it with a mask or be in the room by himself. We also did our usual running of purifiers and opened windows and ran fans to increase ventilation to keep CO2 low.

And again? It worked. I did not get the cold, either from any of the ~8+ folks who did across the holiday period, or from Scott. Scott’s vitals all returned to normal at the five day mark, although we continued to mask in the car through day 7, to be more cautious (due to my personal situation).

So, infection is not inevitable, even in small houses and apartments.

Here’s what we’ve taken away from these experiences with more aerosol-based (RSV) transmission diseases and more droplet and fomite-based (cold) experiences:

  1. Two-way N95 masking works. Mask in the car, run the fan, keep the windows cracked, run purifiers at home, and ventilate spaces, but you still want two-way masking when something is aerosolized and you’re in the same spaces. This can prevent transmission.
  2. Keep distance when someone is coughing and sneezing (and if they have a cough or sneeze type illness, you want 6 foot distance even when they’re not actively coughing or sneezing, because they make droplets just from breathing and talking). The person who’s coughing and sneezing should mask, even inside, unless they are in their own room in private (and it’s not a shared room).Keep your air ventilated (if you haven’t, read my post about ventilation and using a Co2 monitor)Depending on the illness, to fully protect yourself you’ll need to commit to wearing a mask at all times indoors to protect yourself if the person who is sick is not masking. (Eg, Scott got a cold while mostly masked around heavily coughing niblings, but not throughout the whole house the whole day). With adults, the adults who are sick should definitely mask if they’re in shared spaces with other adults. (It’s harder with kids, and it should be a conversation depending on the age of the kids about them masking in shared spaces, such as if they want to play with Uncle Scott, or help them understand that someone may not want to play up close if they’re sick and coughing and not willing to mask. That’s fine, but that’s a choice they can make when kids are old enough to understand.)
  3. Have the infected person sleep in a different space (on the couch or in another room if you have a spare bedroom). If you have to share a room, both should mask.
  4. Use cleaning wipes to wipe down shared surfaces (e.g. fridge handles, microwave, counters, bathroom surfaces like the flush on the toilet or sink faucet, etc) and wash your hands after using these shared spaces every time. Fomites can last longer than you’d expect.
  5. Use metrics from your wearable devices (eg Apple Watch or Oura ring or similar) to track when your temperature, respiratory rate, heart rate, cough rate, etc. return to normal. That tied with symptom elimination can help you determine how long you’re likely most infectious for. The general estimates of contagiousness for each condition generally seem to be right (e.g., two weeks for an adult with RSV and 5-7 days for a cold) in our recent experiences. I would continue precautions for at least those minimum time frames, if you can.
  6. Yes, there’s a cost to these precautions, in terms of human contact. There was no hugging or hand holding or kissing or any touch contact during these time periods. I felt pretty lonely, especially because it was me we were trying to protect (because I am at high risk for bad outcomes due to immunosuppression right now), and I’m sure Scott also felt lonely and isolated. That part sucked, but we at least knew it was a fixed period of time, which helped.

What we’d do differently next time

Infection is not inevitable -how to reduce transmission of illness in your household (including COVID-19, RSV, flu, and the common cold), written by Dana M. Lewis from DIYPS.orgThis basically has been our plan for if either of us were to get COVID-19 (or the flu), and it’s good to know this plan works for a variety of conditions including RSV and the common cold. The main thing we would do differently in the future is that Scott should have masked the very first night he had symptoms of RSV, and he has decided that he’ll be masking any time he’s in the same room as someone who’s been coughing, as that’s considerably less annoying than being sick. (He really did not like the experience of having RSV.) I obviously did not get it from that first night when he first had the most minor symptoms of RSV, but that was probably the period of highest risk of transmission of either week, given the subsequent precautions we took after that.

Combined, everything we did worked, and we’ll do it again when we need to in the future, which should not be very often. We went five full years without either of us getting any type of infection (yay), and hopefully that continues from here on out. We’ll also continue to get regular COVID-19 boosters; annual flu shots; and other annual shots if/when they become available (e.g. when we reach the age, getting the RSV vaccine).

Remember, if you’re reading this in January 2025, RSV and flu levels are very high in the US right now, with COVID-19 expected to pick up again soon. It’s not too late to get your boosters and given the rates of respiratory illness, consider situational masking even if you don’t typically mask.

Power outage, winter heat, battery, and other tools and solutions

We recently had a multi-day power outage. Ugh. It luckily wasn’t so cold we needed to go find a hotel to stay in (or stay overnight with family elsewhere), but it was chilly, inconvenient, and annoying. We have done a lot of preparation for power outages, though, because we often get a handful a year during really windy storms around here. We found ourselves pleasantly surprised with the utility of a couple of things and wanted to share with other folks. Most of these serve two purposes, but work really well in a power outage situation. I’ll describe what we got them for originally and how we use them in a power outage. (PS – we bought all of these on Amazon, and I’m sharing these links as Amazon affiliate links).

Lights
Sure, your phone has a flashlight, but if you’re spending hours without power, it makes a surprising psychological difference to have at least a little light in the room(s) you’re spending time in.

Our first go-to is the fact that our bigger batteries (more details on them below) have decent lights that can be turned on. Point it toward a wall or set it upright and point to the ceiling, and you have enough light to see the room by. And your brain is going to be less stressed. Win.

The other lights we have had for years are high powered, rechargeable LED waist or head lamps. I *love* these, which I got for ultrarunning. The ones we got come in a pack of two, and what I do is deconstruct them and link the two bands together, essentially making one band that has both lights and backlights on them. I point one toward the front, and one to the back. The battery pack and red light part that usually goes on the back is facing the sides. These are really bright, and good enough to wear around my waist when running or walking in the dark, and I don’t use a headlamp with them. So that’s why we had them, and they’re usually the second light source we go to in a power outage. Scott will wear one as a waist lamp or pendant lamp as he moves around. The other thing we realized this time is that you can hang it over a standing lamp and the light reflects off the wall really well (and at height), which is a nice complement to any other lower light sources you have.

Some of our small battery power banks we use for charging phones or little electronic devices have lights, too, that are more similar to your phone’s flashlight. I wouldn’t buy these for the sake of these lights, but it’s handy to know if any of your power banks have lights so you can use them in a power outage, too.

The last set of lights I’ll mention are new in our collection this year. We got them for our pantry, which doesn’t have lights. These are motion sensor lights that trigger when we open our pantry door and stand there looking in the pantry. They flip off after a few seconds in auto mode. They are rechargeable, but we found that they lasted for months on auto without needing recharging. Once we hit the power outage, Scott pulled one out of the pantry and realized it magnetically attaches to the light fixture in our bathroom. There’s four brightness settings, and you can also switch it to stay on. So if you are in the bathroom and using that room, you can switch it to “on” and it’s a great light source, then switch it back off (to motion sensor mode) when you’re done in there. We thought they were so useful that we went ahead and ordered a second set to use in the house and leverage for a future power outage.

Heat

We were lucky this time that it was only 40-50F for most of our power outage. That being said, we did get chilly and it got chillier every day. I tend to run cold anyway and use a heating pad at the foot of my bed normally, so without that…brrrr. We’ve gotten all of these things over the years to make hiking and walking and skiing in the cold and wet more comfy, so like the lights these all have primary uses beyond power outages, but they doubled up nicely to serve us during the outage, too.

(By the way, heating pads have changed since we were kids. I resisted getting another or a new heating pad for about a year. Because the heating pad I had was ~10-15 years old and one of those old style plasticky ones with a blue scratchy fabric cover over it. I hated the plastic. Then I finally looked online and was shocked that they come in different colors – I could get purple! – and sizes and they aren’t plastic anymore. Woohoo! So, I now own multiple heating pads in different colors and sizes. Mentioning that in case anyone else has a historical heating pad and doesn’t know there are better options now! The one I got isn’t listed anymore, but this one looks similar, other than color choice (which might be under a different listing).)

A heated vest is one of the things Scott used the most. We originally chose this one (there are a ton of options from different brands and different styles) because we wanted one that could heat the back as well as the front. This one you can turn on each of those independently, and there are also three different levels of heat. If you’re chilly but don’t want to be bundled up inside in a coat with sleeves, this is a great choice. (I wear it a lot on wet days for outside activities, putting it on under a rain jacket).

The other thing we use a lot and used in the outage to stay warm is fingerless heated gloves. We have big, full finger heated gloves, too, but those are more like bulky ski gloves and don’t work well for me (because I have especially small hands). In fact, we have two sets of full finger heated gloves, because last year we bought a small pair for me to use for cross country skiing…and they were way too big for me. Scott adopted that pair, and we found another pair that was a smidgen smaller. They’re still bulky and make it hard to do anything, but when hiking or skiing in the rain, they’re perfect. Otherwise, when on walks where I want to use my phone or at home in power outages, I like the new fingerless heated gloves we got. Again, if you have small hands like I do, measure your hand and check the measurements on heated gloves (fingerless or full finger) when you order, most of the “small” to “small/medium” or “one size fits all” are huge on me. The ones I got are a little big through the palm and wide on the fingers, but because they’re fingerless, they’re still functional. I’ve actually surprised myself with how often I’ve been wearing them on slightly wet days when heading out for a walk and are on the list of “I can’t believe it took me so long to discover/figure out I should buy these”, just like the modern era heating pads. If you get cold at work and your hands get really cold, these are the kind I would look for. But I also highly recommend them for outdoor activities, too.

Food prep and heating

Speaking of heat, let’s talk food. After multiple days, even if you have available ready to eat food, it gets old. We bought a 12 volt DC hot plate in a zipper bag (check the picture if that description doesn’t make sense) to use on long car trips and for after big hikes and ultramarathons, when I wanted to be able to eat and refuel on a long drive home. It works awesome in the car for things like a frozen dinner or a shallow tupperware container of stew or chili (check the size of your tupperware to make sure it fits; shallow square ones work best in the one I got); but also warming up savory biscuits or sandwiches or scones or leftover pizza. Pretty versatile, and easy to clean even when using between a gluten eater (Scott) and someone with celiac. This is also one of our go-to’s for if we have an extended power outage and want hot food if there’s nothing in the area and we don’t want to or can’t drive to an area with power. Ready to eat canned soup or stew into a tupperware into the hot plate container, and like light in the room…it’s just going to make everything else feel a little less stressful than it would be otherwise.

Battery and batteries and batteries galore

We have two large (250Wh+) power stations, and lots of smaller (5-30Ah / 20-100Wh) power banks. One of the power stations has a working AC inverter, allowing us to power a heating pad for a few hours, recharge the more portable power banks, or keep all of our electronics powered/charging all night. The other used to have an AC inverter, but its cooling fan sounded like a helicopter, so we weren’t too sad when its AC outlet broke. Now we just use its built-in light, its 12V DC auto outlet (e.g. for the hot plate), its USB ports for charging electronics, lights, or smaller power banks, and/or its 12V DC barrel plug outlet to charge the power station whose inverter still works.

The smaller “phone charger” power banks are pretty much interchangeable with the ones you probably already have. Some of them (like the Anker one) support wireless phone charging, and serve nicely as a plugged-in wireless charger until you want to unplug them and take them with you. A couple of the larger ones (like this one, which we recently got) support USB-C PD at high enough wattage to keep my laptop’s battery from draining while I use it, or even (slowly) charge it.

Keeping batteries topped off is a bit of a chore during a long outage. What we found works best with two power stations is:

  • Keep a 12VDC to AC inverter in the car, and use it to recharge one of the large power stations at a time (via its AC to 15VDC adapter), and any smaller power banks that you’ve used up (via USB). (You probably don’t want to try to plug in multiple 15VDC adapters at once, or you risk blowing your car’s 12VDC fuse/breaker.)
  • Keep the other power station in the house for AC needs, to keep other electronics, lights, and small power banks topped off. When its battery gets low, swap the two power stations.

If you live in a sunnier climate, these kinds of power stations can be recharged via solar panels, but for a Western Washington November windstorm, we didn’t even bother getting our solar panel out of our camping bag. If you plan to use PV, you might want to buy an integrated kit to ensure optimal compatibility between the PV panel and the power station’s charge controller.

Outside of a power outage situation, Scott mostly uses these power stations as range extenders for his e-bike. He has two e-bike batteries, so can take both of those and a power station in the saddle-bags and charge one battery off the power station while using the other one. On a long ride without too much pedal assist, he can fully drain a power station to get an extra ~15mi of range.

Ice and keeping things cold (and safe)

While we are talking about power outages, we’ll share some of what we found works for us in terms of managing our fridge and freezer, too. This is less on the “here’s things you can buy” list and more “here’s information your brain can consider in the future” if you have an extended power outage.

Before the power goes out, we make sure we have lots of frozen ice packs (and/or ice) scattered around your freezer.

When the power goes out:

  • Eat ice cream until you’re full. The ice cream will be a lot tastier before it melts.
  • Take a hot shower: the hot water won’t stay hot for more than a few hours.
  • Put a few of the ice packs in the fridge to keep it cold.
  • Leave the fridge door mostly shut. You can quickly grab things out as needed: just don’t stand there with it open deciding what you want to eat. The objective is to keep the temperature from rising too much at once, as it takes a lot longer than normal to cool back down just from ice packs.
  • When you’re tired of ice cream and get hungry, eat leftovers, etc. out of the fridge (cold, or heated up on the 12VDC hot plate).

If the power is still going to be out by the time your ice packs melt:

  • Drive to a nearby area that still has power, and buy a few bags of ice from a convenience store, Walmart, etc.
  • Pour the bagged ice out into gallon ziploc bags: the bags it comes in will leak when it starts to melt.
  • Put at least two gallons of ice in the fridge and two gallons in the freezer.

When the power comes back on:

  • Check that you still have un-melted ice in both the fridge and freezer. If not, your remaining food may not be safe. (40F is when bacteria can start to grow, which is why if you’re in doubt, you should toss it out. But if you keep frozen packs frozen… you may not have to toss everything out.)

Electric vehicles as a power assist

We have an EV (a used 2018 Tesla Model 3), and it worked quite well. It can’t power the whole house like an F150 Lightning or a Cybertruck, but with a 12V to AC inverter, can recharge power banks of all sizes, without having to worry about running the battery down or idling the engine to avoid it. The battery lasts for weeks if we don’t drive it, or longer if we leave Sentry off. If we go anywhere, we can charge at our destination or Supercharge on the way. And even with 500k people without power, the nearest powered Supercharger was only a 10m drive from here.

What we wished we had (and didn’t, yet)

The main thing we were missing was having fast enough Internet. We had cell service, so we could hotspot with our phones, but during the day when everyone’s doing that it’s unusably slow for working from the laptop. Zoom still worked via phone, but in order to get any “real work” done, we wanted “real” Internet. This time around, we got that (and hot showers) by driving up to Scott’s parents’ house, where they never lost power. But to be ready for next time, and for back country hikes and long road trips, we ordered a Starlink Mini. It’s a bit pricier than a regular Starlink dish, but is only about the size of a laptop, works off a power bank, and can be easily used in the car. We got the $50/mo 50GB plan, which we can pause and reactivate at any time, paying $50 only for the months we want to use it. We’ll likely post an update later once we use it a bit. (If you want a Starlink referral code for either a regular Starlink or a Starlink mini, this referral code gives you one month of free service.)

—-

TLDR: it’s nice when you have things that can pull double duty in regular life and in power outages. Our list includes a lot of lights; a lot of heating options; batteries galore; and strategies for keeping our food cold or hot as desired.

If you have any favorite double duty tools or solutions we should get, that you love for power outages and other use cases, please comment and describe them so we – and everyone else – can check them out!

Multi-day power outage: batteries, power sources, heating, and other things that worked well

Pain and translation and using AI to improve healthcare at an individual level

I think differently from most people. Sometimes, this is a strength; and sometimes this is a challenge. This is noticeable when I approach healthcare encounters in particular: the way I perceive signals from my body is different from a typical person. I didn’t know this for the longest time, but it’s something I have been becoming more aware of over the years.

The most noticeable incident that brought me to this realization involved when I pitched head first off a mountain trail in New Zealand over five years ago. I remember yelling – in flight – help, I broke my ankle, help. When I had arrested my fall, clung on, and then the human daisy chain was pulling me back up onto the trail, I yelped and stopped because I could not use my right ankle to help me climb up the trail. I had to reposition my knee to help move me up. When we got up to the trail and had me sitting on a rock, resting, I felt a wave of nausea crest over me. People suggested that it was dehydration and I should drink. I didn’t feel dehydrated, but ok. Then because I was able to gently rest my foot on the ground at a normal perpendicular angle, the trail guides hypothesized that it was not broken, just sprained. It wasn’t swollen enough to look like a fracture, either. I felt like it hurt really bad, worse than I’d ever hurt an ankle before and it didn’t feel like a sprain, but I had never broken a bone before so maybe it was the trauma of the incident contributing to how I was feeling. We taped it and I tried walking. Nope. Too-strong pain. We made a new goal of having me use poles as crutches to get me to a nearby stream a half mile a way, to try to ice my ankle. Nope, could not use poles as crutches, even partial weight bearing was undoable. I ended up doing a mix of hopping, holding on to Scott and one of the guides. That got exhausting on my other leg pretty quickly, so I also got down on all fours (with my right knee on the ground but lifting my foot and ankle in the air behind me) to crawl some. Eventually, we realized I wasn’t going to be able to make it to the stream and the trail guides decided to call for a helicopter evacuation. The medics, too, when they arrived via helicopter thought it likely wasn’t broken. I got flown to an ER and taken to X-Ray. When the technician came out, I asked her if she saw anything obvious and whether it looked broken or not. She laughed and said oh yes, there’s a break. When the ER doc came in to talk to me he said “you must have a really high pain tolerance” and I said “oh really? So it’s definitely broken?” and he looked at me like I was crazy, saying “it’s broken in 3 different places”. (And then he gave me extra pain meds before setting my ankle and putting the cast on to compensate for the fact that I have high pain tolerance and/or don’t communicate pain levels in quite the typical way.)

A week later, when I was trying not to fall on my broken ankle and broke my toe, I knew instantly that I had broken my toe, both by the pain and the nausea that followed. Years later when I smashed another toe on another chair, I again knew that my toe was broken because of the pain + following wave of nausea. Nausea, for me, is apparently a response to very high level pain. And this is something I’ve carried forward to help me identify and communicate when my pain levels are significant, because otherwise my pain tolerance is such that I don’t feel like I’m taken seriously because my pain scale is so different from other people’s pain scales.

Flash forward to the last few weeks. I have an autoimmune disease causing issues with multiple areas of my body. I have some progressive slight muscle weakness that began to concern me, especially as it spread to multiple limbs and areas of my body. This was followed with pain in different parts of my spine which has escalated. Last weekend, riding in the car, I started to get nauseous from the pain and had to take anti-nausea medicine (which thankfully helped) as well as pain medicine (OTC, and thankfully it also helped lower it down to manageable levels). This has happened several other times.

Some of the symptoms are concerning to my healthcare provider and she agreed I should probably have a MRI and a consult from neurology. Sadly, the first available new patient appointment with the neurologist I was assigned to was in late September. Gulp. I was admittedly nervous about my symptom progression, my pain levels (intermittent as they are), and how bad things might get if we are not able to take any action between now and September. I also, admittedly, was not quite sure how I would cope with the level of pain I have been experiencing at those peak moments that cause nausea.

I had last spoken to my provider a week prior, before the spine pain started. I reached out to give her an update, confirm that my specialist appointment was not until September, and express my concern about the progression and timeline. She too was concerned and I ended up going in for imaging sooner.

Over the last week, because I’ve been having these progressive symptoms, I used Katie McCurdy’s free templates from Pictal Health to help visualize and show the progression of symptoms over time. I wasn’t planning on sending my visuals to my doctor, but it helped me concretely articulate my symptoms and confirm that I was including everything that I thought was meaningful for my healthcare providers to know. I also shared them with Scott to confirm he didn’t think I had missed anything. The icons in some cases were helpful but in other cases didn’t quite match how I was experiencing pain and I modified them somewhat to better match how I saw the pain I was experiencing.

(PS – check out Katie’s templates here, you can make a copy in Google Drive and use them yourself!)

As I spoke with the nurse who was recording my information at intake for imaging, she asked me to characterize the pain. I did and explained that it was probably usually a 7/10 then but periodically gets stronger to the point of causing nausea, which for me is a broken bone pain-level response. She asked me to characterize the pain – was it burning, tingling…? None of the words she said matched how it feels. It’s strong pain; it sometimes gets worse. But it’s not any of the words she mentioned.

When the nurse asked if it was “sharp”, Scott spoke up and explained the icon that I had used on my visual, saying maybe it was “sharp” pain. I thought about it and agreed that it was probably the closest word (at least, it wasn’t a hard no like the words burning, tingling, etc. were), and the nurse wrote it down. That became the word I was able to use as the closest approximation to how the pain felt, but again with the emphasis of it periodically reaching nausea-inducing levels equivalent to broken bone pain, because I felt saying “sharp” pain alone did not characterize it fully.

This, then, is one of the areas where I feel that artificial intelligence (AI) gives me a huge helping hand. I often will start working with an LLM (a large language model) and describing symptoms. Sometimes I give it a persona to respond as (different healthcare provider roles); sometimes I clarify my role as a patient or sometimes as a similar provider expert role. I use different words and phrases in different questions and follow ups; I then study the language it uses in response.

If you’re not familiar with LLMs, you should know it is not human intelligence; there is no brain that “knows things”. It’s not an encyclopedia. It’s a tool that’s been trained on a bajillion words, and it learns patterns of words as a result, and records “weights” that are basically cues about how those patterns of words relate to each other. When you ask it a question, it’s basically autocompleting the next word based on the likelihood of it being the next word in a similar pattern. It can therefore be wildly wrong; it can also still be wildly useful in a lot of ways, including this context.

What I often do in these situations is not looking for factual information. Again, it’s not an encyclopedia. But I myself am observing the LLM in using a pattern of words so that I am in turn building my own set of “weights” – meaning, building an understanding of the patterns of words it uses – to figure out a general outline of what is commonly known by doctors and medical literature; the common terminology that is being used likely by doctors to intake and output recommendations; and basically build a list of things that do and do not match my scenario or symptoms or words, or whatever it is I am seeking to learn about.

I can then learn (from the LLM as well as in person clinical encounters) that doctors and other providers typically ask about burning, tingling, etc and can make it clear that none of those words match at all. I can then accept from them (or Scott, or use a word I learned from an LLM) an alternative suggestion where I’m not quite sure if it’s a perfect match, but it’s not absolutely wrong and therefore is ok to use to describe somewhat of the sensation I am experiencing.

The LLM and AI, basically, have become a translator for me. Again, notice that I’m not asking it to describe my pain for me; it would make up words based on patterns that have nothing to do with me. But when I observe the words it uses I can then use my own experience to rule things in/out and decide what best fits and whether and when to use any of those, if they are appropriate.

Often, I can do this in advance of a live healthcare encounter. And that’s really helpful because it makes me a better historian (to use clinical terms, meaning I’m able to report the symptoms and chronology and characterization more succinctly without them having to play 20 questions to draw it out of me); and it saves me and the clinicians time for being able to move on to other things.

At this imaging appointment, this was incredibly helpful. I had the necessary imaging and had the results at my fingertips and was able to begin exploring and discussing the raw data with my LLM. When I then spoke with the clinician, I was able to better characterize my symptoms in context of the imaging results and ask questions that I felt were more aligned with what I was experiencing, and it was useful for a more efficient but effective conversation with the clinician about what our working hypothesis was; what next short-term and long-term pathways looked like; etc.

This is often how I use LLMs overall. If you ask an LLM if it knows who Dana Lewis is, it “does” know. It’ll tell you things about me that are mostly correct. If you ask it to write a bio about me, it will solidly make up ⅓ of it that is fully inaccurate. Again, remember it is not an encyclopedia and does not “know things”. When you remember that the LLM is autocompleting words based on the likelihood that they match the previous words – and think about how much information is on the internet and how many weights (patterns of words) it’s been able to build about a topic – you can then get a better spidey-sense about when things are slightly more or less accurate at a general level. I have actually used part of a LLM-written bio, but not by asking it to write a bio. That doesn’t work because of made up facts. I have instead asked it to describe my work, and it does a pretty decent job. This is due to the number of articles I have written and authored; the number of articles describing my work; and the number of bios I’ve actually written and posted online for conferences and such. So it has a lot of “weights” probably tied to the types of things I work on, and having it describe the type of work I do or am known for gets pretty accurate results, because it’s writing in a general high level without enough detail to get anything “wrong” like a fact about an award, etc.

This is how I recommend others use LLMs, too, especially those of us as patients or working in healthcare. LLMs pattern match on words in their training; and they output likely patterns of words. We in turn as humans can observe and learn from the patterns, while recognizing these are PATTERNS of connected words that can in fact be wrong. Systemic bias is baked into human behavior and medical literature, and this then has been pattern-matched by the LLM. (Note I didn’t say “learned”; but they’ve created weights based on the patterns they observe over and over again). You can’t necessarily course-correct the LLM (it’ll pretend to apologize and maybe for a short while adjust it’s word patterns but in a new chat it’s prone to make the same mistakes because the training has not been updated based on your feedback, so it reverts to using the ‘weights’ (patterns) it was trained on); instead, we need to create more of the correct/right information and have it voluminously available for LLMs to train on in the future. At an individual level then, we can let go of the obvious not-right things it’s saying and focus on what we can benefit from in the patterns of words it gives us.

And for people like me, with a high (or different type of) pain tolerance and a different vocabulary for what my body is feeling like, this has become a critical tool in my toolbox for optimizing my healthcare encounters. Do I have to do this to get adequate care? No. But I’m an optimizer, and I want to give the best inputs to the healthcare system (providers and my medical records) in order to increase my chances of getting the best possible outputs from the healthcare system to help me maintain and improve and save my health when these things are needed.

TLDR: LLMs can be powerful tools in the hands of patients, including for real-time or ahead-of-time translation and creating shared, understandable language for improving communication between patients and providers. Just as you shouldn’t tell a patient not to use Dr. Google, you should similarly avoid falling into the trap of telling a patient not to use LLMs because they’re “wrong”. Being wrong in some cases and some ways does not mean LLMs are useless or should not be used by patients. Each of these tools has limitations but a lot of upside and benefits; restricting patients or trying to limit use of tools is like limiting the use of other accessibility tools. I spotted a quote from Dr. Wes Ely that is relevant: “Maleficence can be created with beneficent intent”. In simple words, he is pointing out that harm can happen even with good intent.

Don’t do harm by restricting or recommending avoiding tools like LLMs.

Meet me in the gray area: beyond prevention, before progression

Two things can simultaneously be true:

  • Doctors may wish they had more opportunities to help patients prevent having worse/later stage outcomes of a disease.
  • Doctors may struggle when a patient seeks health care at an early stage, asking for strategies and intervention support against developing worse/later stage outcomes of a disease.

The struggle may be for a few reasons. There’s often a lack of systemic infrastructure to support patients who show up earlier rather than later in a disease progression, especially when the frequency/timeline of care is much quicker than the system is currently resourced for. There’s often a lack of research for these earlier stages and what effective strategies are for preventing progression and treating earlier stage outcomes.

When a clinician struggles with this, it’s not a moral failing of the clinician if they don’t feel equipped to tackle those challenges. However, I do wish clinicians would more often clearly communicate to patients about these struggles. The patient might have a choice: do they pursue another clinician who might have different resources (including time/energy) or expertise in navigating the unknown? Or do they work with the existing clinician to navigate the murky waters together, figuring it out as they go? But patients only have a choice if they realize it themselves and are equipped to pursue alternative paths – or are told that this is a fork in the path.

The challenge is this is a gray area for all of us – patients and clinicians alike. But the reality is, the gray area (for a patient) betwixt and between prevention and progression is our life. The black and white that may emerge after the gray space can be as significant, literally, as life and death. We as patients are highly motivated to navigate the gray area and reduce suffering and possibly try different or new strategies that have shown early promise (although maybe haven’t yet been tested to RCT or the ideal standard, or in the specific disease or stage of the disease in question). We as patients may not have time to wait for the evidence to evolve further.

Clinicians may be aware of the gray space that the patient has landed in. The reality that many clinicians may not know or forget – or have slipped out of their mind – is that the gray space is even more daunting to face alone. If the instinct is to simply shoot down every patient idea with “that’s not approved for use in this disease” without forthrightly contextualizing against a recognition “there’s nothing tested or proven for this part/stage of the disease”, it can begin to put cracks in a relationship. What clinicians might not realize is that a patient may not have time to be in the gray space with a clinician who simply says no to trying anything, because no one has ever studied it before and when little study is being done at all about the gray area the patient is within. Or maybe clinicians do realize it, and sometimes rely on the power of the broken systemic infrastructure that keeps a patient from finding a clinician who does feel equipped to walk through the gray area with them.

What I wish is for clinicians to be equipped to identify this situation, standing on the edge of the gray area with a patient. And to say up front, then and there, if they don’t feel comfortable pursuing off-label strategies when there are zero documented on-label strategies beyond waiting for the worse outcomes to progress. I don’t like that (because why wait for permanent damage to do something, when permanent damage is not inevitable if action is taken sooner), but I very much highly respect and appreciate a clinician who is forthright and willing to say they don’t have time/energy/feel equipped to do so.

Why? Because if I’m already in the gray space, past prevention and before serious progression, it gives me a better opportunity to find someone else who can partner there. It might take a try or two, but it keeps me from wasting time and energy and trying to invest in developing a relationship with a clinician who has already decided they can’t help me until I cross over into black and white worse outcomes.

When we talk about prevention, it’s often about preventing a disease. In the world I live in, and the body I live in – now inhibited by five autoimmune diseases, I don’t have a choice about disease prevention for the most part. My body is clearly equipped with a superstar hyperactive immune system. While I’ve seen some research working on addressing autoimmune stuff, it’s likely decades away from any cure of any one condition that I have (let alone all of them) or fixing the hyperactivity of my immune system and preventing additional autoimmune diseases. Sure, I can work to prevent other diseases that aren’t autoimmune (exercising, staying in as best health overall that I can, etc.), but my focus right now is keeping each of my five autoimmune conditions from being bigger headaches than they already are.

(As a side note, I recently read this paper looking at rates of autoimmune conditions after T1D, based on a registry analysis in Sweden of people with T1D. It’s interesting that the risk of “one more” condition following T1D is 17%, two more is essentially the square of that, etc. etc. all the way down…so the risk is typically about 17% and it’s not additive; having two does not mean you’re more likely to get three, it means you have about the same 17% chance of something else. That’s a useful mental model to me, understanding that I got unlucky 4 more times…and that combination of luck is rare among people with T1D. They went all the way to the category of “three or more” autoimmune conditions after T1D, calculating that 0.3% of people with T1D have 3 or more autoimmune conditions after T1D. They stopped there, but you can extrapolate by multiplying by 17% again and estimate it’s 0.08% for four or more…which is where I’m at. This shows me that I’m not alone in dealing with so many things, but it puts me at about 1 in 1,250 of people with T1D or around one in a million – heh – in the general population if you extrapolate based on global population estimates and assume similar rates/risks of autoimmune conditions in the general public.)

Four of the five are easy enough (although, the fourth took about a year and a half to get to ‘easy enough’, overlapping with the third taking two to three years). The fifth, though, is the gray area that I currently inhabit. Possibly because I am in tune with my body because of the experience with these other autoimmune conditions, I have been presenting to the healthcare system to address this fifth autoimmune condition earlier than most people. Like many autoimmune conditions, it takes years to decades for some people to get diagnosed. Many are diagnosed after systemic manifestations have fully kicked in, e.g. these later stage worse outcomes I referred to above. I’m in a gray area, at the edge of seeing systemic activity, and able to identify it as a red flag, but before – I hope – permanent irreversible damage has been done. The question remains, however, for me to figure out how to navigate this gray area and with which clinicians, in order to achieve care that will possibly prevent or delay or reduce the severity of the outcomes that I will end up with.

I speak from personal experience with this gray area. It’s not fun to navigate, even if you do have a really great clinician partner. But it’s infinitely more challenging to stand there in the gray, unsure of the ability or willingness of a clinician to partner with you.

Meet me in the gray: beyond prevention, before progression - a blog written by Dana M. Lewis on DIYPS.org

A Slackbot for using Slack to access and use a chat-based LLM in public

I’ve been thinking a lot about how to help my family, friends, and colleagues use LLMs to power their work. (As I’ve written about here, and more recently here with lots of tips on prompting and effectively using LLMs for different kinds of projects). 

Scott has been on the same page, especially thinking about how to help colleagues use LLMs effectively, but taking a slightly different approach: he built a Slackbot (a bot for Slack) which uses GPT-3.5 and GPT-4 to answer questions. This uses the API of GPT but presents it to the user in Slack instead of having to use ChatGPT as the chat interface. So, it’s a LLM chatbot, different than ChatGPT (because it’s a different chat interface), but uses the same AI (GPT-3.5 and GPT-4 from OpenAI). You could implement the same idea (a chat-based bot in Slack) using different AIs/LLMs, of course.

Using a slack-based bot for an LLM achieves a couple of things:

  1. More people can try GPT-4 and compare it to GPT-3.5 to get a taste for prompting and responses, without having to pay $20/month for a ChatGPT Pro account to get access to GPT-4.
  2. If you spend a lot of time in Slack for work, you don’t have to switch windows to access GPT.
  3. If your employer doesn’t want you to use the ChatGPT consumer product for work due to security concerns, but is more comfortable with OpenAI’s confidentiality guarantees for their API, a Slack bot interface provides a more secure alternative.
  4. You can see “in public” how other people in your Slack workspace are using it, and learn from their prompts and interactions new tricks and ideas (in addition to reading my long post here with lots of tips and suggestions).
  5. You can privately (in DM by yourself) talk to the bot and ask it your “dumb” questions (which aren’t dumb), and you can reason through things at your own speed or learn how to do things that you haven’t done before, and do so in private just like if you had your own ChatGPT account. It doesn’t burn social capital or take other people’s time, which you can save for more nuanced questions.

Scott has implemented it for work, but also came up with another fun use for it: we created a family-focused Slack channel for kids in our lives (we have ten (!) nieces and nephews – you can see how I used GPT to help with special custom books for some of them recently) so their parents and they can access GPT in a safe environment, too. ChatGPT (and OpenAI’s GPT API) have some reasonable safeguards in it already to protect kids, but more usefully we want to be able to provide custom instructions to keep it always at a kid-focused level. Plus this way we can share the experience with them, too.

I asked the bot to summarize itself for a lay audience not familiar with the project, and this was the output:

### Summary of the SlackAskBot

The SlackAskBot is a smart assistant designed to operate within Slack, a popular communication platform used by teams for messaging and collaboration. Imagine having a helpful robot in your group chats or direct messages that can understand your questions and provide informative responses. That’s what the SlackAskBot does. It’s like adding a knowledgeable team member to your Slack workspace who is always ready to assist.

#### How It Works:

  1. **Integration with Slack:**

– The bot is integrated into Slack, meaning it can read messages in channels it’s added to or when directly messaged. It’s like inviting a friend into your group chat who can listen and respond to conversations.

  1. **Understanding Messages:**

– When you type a message that includes the bot (either by mentioning it with an @ symbol in a channel or by sending it a direct message), the bot pays attention. It’s designed to understand that you’re talking to it and that you’re expecting some form of response.

  1. **Processing Your Requests:**

– Once the bot receives a message, it doesn’t just look at the latest message you’ve sent. It considers the context by looking at the conversation history. This way, it can provide responses that make sense within the ongoing discussion. It’s akin to a human joining a conversation, where understanding the previous discussion points is crucial for making relevant comments.

  1. **Generating Responses:**

– The bot uses a powerful language model (think of it as a very advanced AI that understands and generates human-like text) to come up with responses to your messages. It’s as if the bot thinks about what you’ve asked and then crafts a reply that aims to be helpful and informative.

  1. **Interacting Back:**

– After thinking through your message and generating a response, the bot then sends this response back into the Slack channel or direct message. You’ll see its reply just like you would from any other team member, providing the information or assistance you requested.

  1. **Special Features:**

– The bot is designed with specific settings for different channels or direct messages. This means it can tailor its responses based on where it’s talking. For example, it might respond differently in a casual chat compared to a more formal project discussion channel.

  1. **Respecting Privacy and Permissions:**

– The bot is programmed to respect Slack’s rules and the privacy of your conversations. It only listens and responds to messages where it’s explicitly mentioned or directly messaged, ensuring that it doesn’t intrude on private discussions.

#### In Summary:

The SlackAskBot acts as a virtual assistant within your Slack workspace, ready to provide information and answers to your questions. It’s designed to understand the context of conversations, generate helpful responses, and interact with users in a way that enhances team communication and productivity. Whether you’re asking for quick facts, seeking advice on a project, or just interacting for fun, the SlackAskBot is there to make your Slack experience more informative and engaging.


The SlackAskBot is open source – you can check out the code and instructions for how to implement it in your own Slack environment in the repository here.

We are using Slack as a chat interface for LLMs like GPT, making the user experience similar to ChatGPT

Effective Pair Programming and Coding and Prompt Engineering and Writing with LLMs like ChatGPT and other AI tools

I’ve been puzzled when I see people online say that LLM’s “don’t write good code”. In my experience, they do. But given that most of these LLMs are used in chatbot mode – meaning you chat and give it instructions to generate the code – that might be where the disconnect lies. To get good code, you need effective prompting and to do so, you need clear thinking and ideas on what you are trying to achieve and how.

My recipe and understanding is:

Clear thinking + clear communication of ideas/request = effective prompting => effective code and other outputs

It also involves understanding what these systems can and can’t do. For example, as I’ve written about before, they can’t “know” things (although they can increasingly look things up) and they can’t do “mental” math. But, they can generally repeat patterns of words to help you see what is known about a topic and they can write code that you can execute (or it can execute, depending on settings) to solve a math problem.

What the system does well is help code small chunks, walk you through processes to link these sections of code up, and help you implement them (if you ask for it). The smaller the task (ask), the more effective it is. Or also – the easier it is for you to see when it completes the task and when it hasn’t been able to finish due to limitations like response length limits, information falling out of the context window (what it knows that you’ve told it); unclear prompting; and/or because you’re asking it to do things for which it doesn’t have expertise. Some of the last part – lack of expertise – can be improved with specific prompting techniques –  and that’s also true for right-sizing the task it’s focusing on.

Right-size the task by giving a clear ask

If I were to ask an LLM to write me code for an iOS app to do XYZ, it could write me some code, but it certainly wouldn’t (at this point in history, written in February 2024), write all code and give me a downloadable file that includes it all and the ability to simply run it. What it can do is start writing chunks and snippets of code for bits and pieces of files that I can take and place and build upon.

How do I know this? Because I made that mistake when trying to build my first iOS apps in April and May 2023 (last year). It can’t do that (and still can’t today; I repeated the experiment). I had zero ideas how to build an iOS app; I had a sense that it involved XCode and pushing to the Apple iOS App Store, and that I needed “Swift” as the programming language. Luckily, though, I had a much stronger sense of how I wanted to structure the app user experience and what the app needed to do.

I followed the following steps:

  1. First, I initiated chat as a complete novice app builder. I told it I was new to building iOS apps and wanted to use XCode. I had XCode downloaded, but that was it. I told it to give me step by step instructions for opening XCode and setting up a project. Success! That was effective.
  2. I opened a different chat window after that, to start a new chat. I told it that it was an expert in iOS programming using Swift and XCode. Then I described the app that I wanted to build, said where I was in the process (e.g. had opened and started a project in XCode but had no code yet), and asked it for code to put on the home screen so I could build and open the app and it would have content on the home screen. Success!
  3. From there, I was able to stay in the same chat window and ask it for pieces at a time. I wanted to have a new user complete an onboarding flow the very first time they opened the app. I explained the number of screens and content I wanted on those screens; the chat was able to generate code, tell me how to create that in a file, and how to write code that would trigger this only for new users. Success!
  4. I was able to then add buttons to the home screen; have those buttons open new screens of the app; add navigation back to the home; etc. Success!
  5. (Rinse and repeat, continuing until all of the functionality was built out a step at a time).

To someone with familiarity building and programming things, this probably follows a logical process of how you might build apps. If you’ve built iOS apps before and are an expert in Swift programming, you’re either not reading this blog post or are thinking I (the human) am dumb and inexperienced.

Inexperienced, yes, I was (in April 2023). But what I am trying to show here is for someone new to a process and language, this is how we need to break down steps and work with LLMs to give it small tasks to help us understand and implement the code it produces before moving forward with a new task (ask). It takes these small building block tasks in order to build up to a complete app with all the functionality that we want. Nowadays, even though I can now whip up a prototype project and iOS app and deploy it to my phone within an hour (by working with an LLM as described above, but skipping some of the introductory set-up steps now that I have experience in those), I still follow the same general process to give the LLM the big picture and efficiently ask it to code pieces of the puzzle I want to create.

As the human, you need to be able to keep the big picture – full app purpose and functionality – in mind while subcontracting with the LLM to generate code for specific chunks of code to help achieve new functionality in our project.

In my experience, this is very much like pair programming with a human. In fact, this is exactly what we did when we built DIYPS over ten years ago (wow) and then OpenAPS within the following year. I’ve talked endlessly about how Scott and I would discuss an idea and agree on the big picture task; then I would direct sub-tasks and asks that he, then also Ben and others would be coding on (at first, because I didn’t have as much experience coding and this was 10 years ago without LLMs; I gradually took on more of those coding steps and roles as well). I was in charge of the big picture project and process and end goal; it didn’t matter who wrote which code or how; we worked together to achieve the intended end result. (And it worked amazingly well; here I am 10 years later still using DIYPS and OpenAPS; and tens of thousands of people globally are all using open source AID systems spun off of the algorithm we built through this process!)

Two purple boxes. The one on the left says "big picture project idea" and has a bunch of smaller size boxes within labeled LLM, attempting to show how an LLM can do small-size tasks within the scope of a bigger project that you direct it to do. On the right, the box simply says "finished project". Today, I would say the same is true. It doesn’t matter – for my types of projects – if a human or an LLM “wrote” the code. What matters is: does it work as intended? Does it achieve the goal? Does it contribute to the goal of the project?

Coding can be done – often by anyone (human with relevant coding expertise) or anything (LLM with effective prompting) – for any purpose. The critical key is knowing what the purpose is of the project and keeping the coding heading in the direction of serving that purpose.

Tips for right-sizing the ask

  1. Consider using different chat windows for different purposes, rather than trying to do it all in one. Yes, context windows are getting bigger, but you’ll still likely benefit from giving different prompts in different windows (more on effective prompting below).Start with one window for getting started with setting up a project (e.g. how to get XCode on a Mac and start a project; what file structure to use for an app/project that will do XYZ; how to start a Jupyter notebook for doing data science with python; etc); brainstorming ideas to scope your project; then separately for starting a series of coding sub-tasks (e.g. write code for the home page screen for your app; add a button that allows voice entry functionality; add in HealthKit permission functionality; etc.) that serves the big picture goal.
  2. Make a list for yourself of the steps needed to build a new piece of functionality for your project. If you know what the steps are, you can specifically ask the LLM for that.Again, use a separate window if you need to. For example, if you want to add in the ability to save data to HealthKit from your app, you may start a new chat window that asks the LLM generally how does one add HealthKit functionality for an app? It’ll describe the process of certain settings that need to be done in XCode for the project; adding code that prompts the user with correct permissions; and then code that actually does the saving/revising to HealthKit.

    Make your list (by yourself or with help), then you can go ask the LLM to do those things in your coding/task window for your specific project. You can go set the settings in XCode yourself, and skip to asking it for the task you need it to do, e.g. “write code to prompt the user with HealthKit permissions when button X is clicked”.

    (Sure, you can do the ask for help in outlining steps in the same window that you’ve been prompting for coding sub-tasks, just be aware that the more you do this, the more quickly you’ll burn through your context window. Sometimes that’s ok, and you’ll get a feel for when to do a separate window with the more experience you get.)

  • Pay attention as you go and see how much code it can generate and when it falls short of an ask. This will help you improve the rate at which you successfully ask and it fully completes a task for future asks. I observe that when I don’t know – due to my lack of expertise – the right size of a task, it’s more prone to give me ½-⅔ of the code and solution but need additional prompting after that. Sometimes I ask it to continue where it cut off; other times I start implementing/working with the bits of code (the first ⅔) it gave me, and have a mental or written note that this did not completely generate all steps/code for the functionality and to come back.Part of why sometimes it is effective to get started with ⅔ of the code is because you’ll likely need to debug/test the first bit of code, anyway. Sometimes when you paste in code it’s using methods that don’t match the version you’re targeting (e.g. functionality that is outdated as of iOS 15, for example, when you’re targeting iOS 17 and newer) and it’ll flag a warning or block it from working until you fix it.

    Once you’ve debugged/tested as much as you can of the original ⅔ of code it gave you, you can prompt it to say “Ok, I’ve done X and Y. We were trying to (repeat initial instructions/prompt) – what are the remaining next steps? Please code that.” to go back and finish the remaining pieces of that functionality.

    (Note that saying “please code that” isn’t necessarily good prompt technique, see below).

    Again, much of this is paying attention to how the sub-task is getting done in service of the overall big picture goal of your project; or the chunk that you’ve been working on if you’re building new functionality. Keeping track with whatever method you prefer – in your head, a physical written list, a checklist digitally, or notes showing what you’ve done/not done – is helpful.

Most of the above I used for coding examples, but I follow the same general process when writing research papers, blog posts, research protocols, etc. My point is that this works for all types of projects that you’d work on with an LLM, whether the output generation intended is code or human-focused language that you’d write or speak.

But, coding or writing language, the other thing that makes a difference in addition to right-sizing the task is effective prompting. I’ve intuitively noticed that has made the biggest difference in my projects for getting the output matching my expertise. Conversely, I have actually peer reviewed papers for medical journals that do a horrifying job with prompting. You’ll hear people talk about “prompt engineering” and this is what it is referring to: how do you engineer (write) a prompt to get the ideal response from the LLM?

Tips for effective prompting with an LLM

    1. Personas and roles can make a difference, both for you and for the LLM. What do I mean by this? Start your prompt by telling the LLM what perspective you want it to take. Without it, you’re going to make it guess what information and style of response you’re looking for. Here’s an example: if you asked it what caused cancer, it’s going to default to safety and give you a general public answer about causes of cancer in very plain, lay language. Which may be fine. But if you’re looking to generate a better understanding of the causal mechanism of cancer; what is known; and what is not known, you will get better results if you prompt it with “You are an experienced medical oncologist” so it speaks from the generated perspective of that role. Similarly, you can tell it your role. Follow it with “Please describe the causal mechanisms of cancer and what is known and not known” and/or “I am also an experienced medical researcher, although not an oncologist” to help contextualize that you want a deeper, technical approach to the answer and not high level plain language in the response.

      Compare and contrast when you prompt the following:

      A. “What causes cancer?”

      B. “You are an experienced medical oncologist. What causes cancer? How would you explain this differently in lay language to a patient, and how would you explain this to another doctor who is not an oncologist?”

      C. “You are an experienced medical oncologist. Please describe the causal mechanisms of cancer and what is known and not known. I am also an experienced medical researcher, although not an oncologist.”

      You’ll likely get different types of answers, with some overlap between A and the first part of answer B. Ditto for a tiny bit of overlap between the latter half of answer B and for C.

      I do the same kind of prompting with technical projects where I want code. Often, I will say “You are an expert data scientist with experience writing code in Python for a Jupyter Notebook” or “You are an AI programming assistant with expertise in building iOS apps using XCode and SwiftUI”. Those will then be followed with a brief description of my project (more on why this is brief below) and the first task I’m giving it.

      The same also goes for writing-related tasks; the persona I give it and/or the role I reference for myself makes a sizable difference in getting the quality of the output to match the style and quality I was seeking in a response.

  • Be specific. Saying “please code that” or “please write that” might work, sometimes, but more often or not will get a less effective output than if you provide a more specific prompt.I am a literal person, so this is something I think about a lot because I’m always parsing and mentally reviewing what people say to me because my instinct is to take their words literally and I have to think through the likelihood that those words were intended literally or if there is context that should be used to filter those words to be less literal. Sometimes, you’ll be thinking about something and start talking to someone about something, and they have no idea what on earth you’re talking about because the last part of your out-loud conversation with them was about a completely different topic!

    LLMs are the same as the confused conversational partner who doesn’t know what you’re thinking about. LLMs only know what you’ve last/recently told it (and more quickly than humans will ‘forget’ what you told it about a project). Remember the above tips about brainstorming and making a list of tasks for a project? Providing a description of the task along with the ask (e.g. we are doing X related to the purpose of achieving Y, please code X) will get you better output more closely matching what you wanted than saying “please code that” where the LLM might code something else to achieve Y if you didn’t tell it you wanted to focus on X.

    I find this even more necessary with writing related projects. I often find I need to give it the persona “You are an expert medical researcher”, the project “we are writing a research paper for a medical journal”, the task “we need to write the methods section of the paper”, and a clear ask “please review the code and analyses and make an outline of the steps that we have completed in this process, with sufficient detail that we could later write a methods section of a research paper”. A follow up ask is then “please take this list and draft it into the methods section”. That process with all of that specific context gives better results than “write a methods section” or “write the methods” etc.

  • Be willing to start over with a new window/chat. Sometimes the LLM can get itself lost in solving a sub-task and lose sight (via lost context window) of the big picture of a project, and you’ll find yourself having to repeat over and over again what you’re asking it to do. Don’t be afraid to cut your losses and start a new chat for a sub-task that you’ve been stuck on. You may be able to eventually come back to the same window as before, or the new window might become your new ‘home’ for the project…or sometimes a third, fourth, or fifth window will.
  • Try, try again.
    I may hold the record for the longest running bug that I (and the LLM) could. Not. solve. This was so, so annoying. No users apparently noticed it but I knew about it and it bugged me for months and months. Every few weeks I would go to an old window and also start a new window, describe the problem, paste the code in, and ask for help to solve it. I asked it to identify problems with the code; I asked it to explain the code and unexpected/unintended functionality from it; I asked it what types of general things would be likely to cause that type of bug. It couldn’t find the problem. I couldn’t find the problem. Finally, one day, I did all of the above, but then also started pasting every single file from my project and asking if it was likely to include code that could be related to the problem. By forcing myself to review all my code files with this problem in mind, even though the files weren’t related at all to the file/bug….I finally spotted the problem myself. I pasted the code in, asked if it was a possibility that it was related to the problem, the LLM said yes, I tried a change and…voila! Bug solved on January 16 after plaguing me since November 8. (And probably existed before then but I didn’t have functionality built until November 8 where I realized it was a problem). I was beating myself up about it and posted to Twitter about finally solving the bug (but very much with the mindset of feeling very stupid about it). Someone replied and said “congrats! sounds like it was a tough one!”. Which I realized was a very kind framing and one that I liked, because it was a tough one; and also I am doing a tough thing that no one else is doing and I would not have been willing to try to do without an LLM to support.

    Similarly, just this last week on Tuesday I spent about 3 hours working on a sub-task for a new project. It took 3 hours to do something that on a previous project took me about 40 minutes, so I was hyper aware of the time mismatch and perceiving that 3 hours was a long time to spend on the task. I vented to Scott quite a bit on Tuesday night, and he reminded me that sure it took “3 hours” but I did something in 3 hours that would take 3 years otherwise because no one else would do (or is doing) the project that I’m working on. Then on Wednesday, I spent an hour doing another part of the project and Thursday whipped through another hour and a half of doing huge chunks of work that ended up being highly efficient and much faster than they would have been, in part because the “three hours” it took on Tuesday wasn’t just about the code but about organizing my thinking, scoping the project and research protocol, etc. and doing a huge portion of other work to organize my thinking to be able to effectively prompt the LLM to do the sub-task (that probably did actually take closer to the ~40 minutes, similar to the prior project).

    All this to say: LLMs have become pair programmers and collaborators and writers that are helping me achieve tasks and projects that no one else in the world is working on yet. (It reminds me very much of my early work with DIYPS and OpenAPS where we did the work, quietly, and people eventually took notice and paid attention, albeit slower than we wished but years faster than had we not done that work. I’m doing the same thing in a new field/project space now.) Sometimes, the first attempt to delegate a sub-task doesn’t work. It may be because I haven’t organized my thinking enough, and the lack of ideal output shows that I have not prompted effectively yet. Sometimes I can quickly fix the prompt to be effective; but sometimes it highlights that my thinking is not yet clear; my ability to communicate the project/task/big picture is not yet sufficient; and the process of achieving the clarity of thinking and translating to the LLM takes time (e.g. “that took 3 hours when it should have taken 40 minutes”) but ultimately still moves me forward to solving the problem or achieving the tasks and sub-tasks that I wanted to do. Remember what I said at the beginning:

    Clear thinking + clear communication of ideas/request = effective prompting => effective code and other outputs

 

  • Try it anyway.
    I am trying to get out of the habit of saying “I can’t do X”, like “I can’t code/program an iOS app”…because now I can. I’ve in fact built and shipped/launched/made available multiple iOS apps (check out Carb Pilot if you’re interested in macronutrient estimates for any reason; you can customize so you only see the one(s) you care about; or if you have EPI, check out PERT Pilot, which is the world’s first and only app for tracking pancreatic enzyme replacement therapy and has the same AI feature for generating macronutrient estimates to aid in adjusting enzyme dosing for EPI.) I’ve also made really cool, 100% custom-to-me niche apps to serve a personal purpose that save me tons of time and energy. I can do those things, because I tried. I flopped a bunch along the way – it took me several hours to solve a simple iOS programming error related to home screen navigation in my first few apps – but in the process I learned how to do those things and now I can build apps. I’ve coded and developed for OpenAPS and other open source projects, including a tool for data conversion that no one else in the world had built. Yet, my brain still tries to tell me I can’t code/program/etc (and to be fair, humans try to tell me that sometimes, too).

    I bring that up to contextualize that I’m working on – and I wish others would work on to – trying to address the reflexive thoughts of what we can and can’t do, based on prior knowledge. The world is different now and tools like LLMs make it possible to learn new things and build new projects that maybe we didn’t have time/energy to do before (not that we couldn’t). The bar to entry and the bar to starting and trying is so much lower than it was even a year ago. It really comes down to willingness to try and see, which I recognize is hard: I have those thought patterns too of “I can’t do X”, but I’m trying to notice when I have those patterns; shift my thinking to “I used to not be able to do X; I wonder if it is possible to work with an LLM to do part of X or learn how to do Y so that I could try to do X”.

    A recent real example for me is power calculations and sample size estimates for future clinical trials. That’s something I can’t do; it requires a statistician and specialized software and expertise.

    Or…does it?

    I asked my LLM how power calculations are done. It explained. I asked if it was possible to do it using Python code in a Jupyter notebook. I asked what information would be needed to do so. It walked me through the decisions I needed to make about power and significance, and highlighted variables I needed to define/collect to put into the calculation. I had generated the data from a previous study so I had all the pieces (variables) I needed. I asked it to write code for me to run in a Jupyter notebook, and it did. I tweaked the code, input my variables, ran it..and got the result. I had run a power calculation! (Shocked face here). But then I got imposter syndrome again, reached out to a statistician who I had previously worked with on a research project. I shared my code and asked if that was the correct or an acceptable approach and if I was interpreting it correctly. His response? It was correct, and “I couldn’t have done it better myself”.

    (I’m still shocked about this).

    He also kindly took my variables and put it in the specialized software he uses and confirmed that the results output matched what my code did, then pointed out something that taught me something for future projects that might be different (where the data is/isn’t normally distributed) although it didn’t influence the output of my calculation for this project.

    What I learned from this was a) this statistician is amazing (which I already knew from working with him in the past) and kind to support my learning like this; b) I can do pieces of projects that I previously thought were far beyond my expertise; c) the blocker is truly in my head, and the more we break out of or identify the patterns stopping us from trying, the farther we will get.

    “Try it anyway” also refers to trying things over time. The LLMs are improving every few months and often have new capabilities that didn’t before. Much of my work is done with GPT-4 and the more nuanced, advanced technical tasks are way more efficient than when using GPT-3.5. That being said, some tasks can absolutely be done with GPT-3.5-level AI. Doing something now and not quite figuring it out could be something that you sort out in a few weeks/months (see above about my 3 month bug); it could be something that is easier to do once you advance your thinking ; or it could be more efficiently done with the next model of the LLM you’re working with.

  • Test whether custom instructions help. Be aware though that sometimes too many instructions can conflict and also take up some of your context window. Plus if you forget what instructions you gave it, you might get seemingly unexpected responses in future chats. (You can always change the custom instructions and/or turn it on and off.)

I’m hoping this helps give people confidence or context to try things with LLMs that they were not willing to try before; or to help get in the habit of remembering to try things with LLMs; and to get the best possible output for the project that they’re working on.

Remember:

  • Right-size the task by making a clear ask.
  • You can use different chat windows for different levels of the same project.
  • Use a list to help you, the human, keep track of all the pieces that contribute to the bigger picture of the project.
  • Try giving the LLM a persona for an ask; and test whether you also need to assign yourself a persona or not for a particular type of request.
  • Be specific, think of the LLM as a conversational partner that can’t read your mind.
  • Don’t be afraid to start over with a new context window/chat.
  • Things that were hard a year ago might be easier with an LLM; you should try again.
  • You can do more, partnering with an LLM, than you can on your own, and likely can do things you didn’t realize were possible for you to do!

Clear thinking + clear communication of ideas/request = effective prompting => effective code and other outputs

Have any tips to help others get more effective output from LLMs? I’d love to hear them, please comment below and share your tips as well!

Tips for prompting LLMs like ChatGPT, written by Dana M. Lewis and available from DIYPS.org

Personalized Story Prompts for Kids Books and Early Reader Books

For the holidays this year, I decided to try my hand at creating another set of custom, illustrated stories for my nieces and nephews (and bonus nieces and nephews). I have a few that are very advanced readers and/or too old for this, but I ended up with a list of 8 kids in my life from not-yet-reading to beginning reading to early 2nd grade reading level. I wanted to write stories that would appeal to each kid, include them as the main character, be appropriate for their reading (or read-to) level, and also include some of their interests.

Their interests were varied which made it quite a challenge! Here’s the list I worked from:

  • 2nd grade reading level, Minecraft
  • early 2nd grade reading level: soccer, stunt biking, parkour, ninja, Minecraft
  • beginning reading level: soccer, stunt biking, ninja, Spiderman
  • beginning reading level: Peppa Pig, moko jumbies
  • (read to younger child): Minnie Mouse, Peppa Pig, Bluey, and tea parties
  • (read to younger child): Bluey, Olaf, Elsa, & Anna
  • (read to younger child): cars/vehicles

I enlisted ChatGPT, an LLM, and ended up creating stories for each kid, matching their grade levels and interests, then illustrating them.

But illustrating them was actually a challenge (still), trying to create images with similar characters that would be on every page of the story and similar enough throughout that they were the “same” character.

Illustration challenges and how I got successful prompts:

My first pass on images wasn’t very good. I could get basic details to repeat, but often had images that looked like this – slightly different style and character throughout:

8 different illustrations in slightly different styles and almost different characters of a girl with blonde, shoulder length hair and a purple dress in an enchanted forest

Different styles throughout and that makes it look like a different character, even though it’s the same character in the whole story. This was a book to read to a <3 year old, though, and I thought she wouldn’t mind the different styles and left it as is. I also battled with adding, for personal use, the characters that most interested her: Peppa Pig and Minnie Mouse.

Interestingly, if I described with a prompt to illustrate a scene including a character “inspired by, but distinct from, Peppa Pig”…it essentially drew Peppa Pig or a character from it. No problems.

But if you gave the same prompt “inspired by, but distinct from, Minnie Mouse”? No go. No image at all: ChatGPT would block it for copyright reasons and wouldn’t draw any of the image. I riffed a bunch of times and finally was able to prompt a good enough mouse with round ears and a red dress with white polka dots. I had to ultimately illustrate the mouse character alone with the human character, because if I tried to get a Peppa-inspired character and then separately a mouse character, it wanted to draw the mouse with a pig-style face in the correct dress! I could never work around that effectively for the time I had available (and all the other books I was trying to illustrate!) so I stopped with what I had.

This was true for other characters, too, with copyright issues. It won’t draw anything from or like Bluey – or Frozen, when prompted. But I could get it to draw “an ethereal but warm, tall female adult with icy blonde hair, blue eyes, in an icy blue dress”, which you can see in the fourth image on the top row here:

Another series of illustrations with slightly different characters but closer in style throughout. there's one image showing a Frozen-inspired female character that I got by not prompting with Frozen.

I also managed to get slightly closer matching characters throughout this, but still quite a bit of variability. Again, for a young being-read-to-child, it was good enough for my purposes. (I never could get it to draw a Bluey-like character, even when I stopped referencing Bluey by name and described the shape and character, so I gave up on that.)

I tried a variety of prompts and series of prompts for each book. Sometimes, I would give it the story and prompt it with each page’s text, asking for an illustration and to keep it in the same style and the same character as the previous image. That didn’t work well, even when I told it in every prompt to use the same style and character plus the actual image prompt. I then tried to create a “custom” GPT, with the GPT’s instructions to use the same style throughout. That started to give me slightly better results, but I still had to remind it constantly to use the same style.

I also played around with taking an image that I liked, starting a new chat, and asking it to describe that image. Then I’d use that prompt to create a new prompt, describing the character in the same way. That started to get me slightly better results, especially when I did so using the custom GPT I had designed (you can try using this GPT here). I started to get better, more consistent characters:

A series of images of a young cartoon-drawn boy with wavy blonde hair riding a bike through an enchanted forest.

 

A series of drawings of a cartoon-like character with spiky blonde hair, blue eyes, and various outfits including a ninja costume

Those two had some variability, but a lot improved beyond the first several books. They are for the beginning and second-grade reading levels, too, so they are older kids with more attention to detail so it was worth the extra effort to try to get theirs to be more consistent.

The last one with the ninja and ninja outfits is another one that ran into copyright issues. I tried to have it illustrate a character inspired by, but distinct from, Spiderman – nope, no illustration at all. I asked it to illustrate the first picture in the soccer park with a spider strand looping in the corner of the image, like Spiderman had swung by but was out of sight and not picture – NOPE. You can’t even get an image that has Spiderman in the prompt at all, even if Spiderman isn’t in the picture! (I gave up and moved on without illustrating spiderwebs, even though Spiderman is described in the story).

My other favorite and pretty consistent one was two more of the early reader ones:

A series of images showing a young cartoon boy with wavy brown hair at a car fair

The hard part from that book was actually trying to do the cars consistently, rather than the human character. The human character was fairly consistent (although in different outfits, despite clear outfit prompts – argh) throughout, because I had learned from the previous images and prompt processes and used the Custom GPT, but the cars varied more. But, for a younger reader, hopefully that doesn’t matter.

The other, more-consistent character one for an early reader had some variations in style but did a better job matching the character throughout even when the style changed.

Another example with a mostly consistent young cartoon drawn girl with whispy blonde pigtails and big blue eyes, plus moko jumbies and peppa pig

How I wrote each story:

I also found some processes for building better stories. Again, see the above list of very, varied interests for each kid. Some prompts were straight forward (Minecraft) and other were about really different characters or activities (moko jumbies and Peppa Pig? Minnie Mouse and Peppa Pig? soccer ninja and Minecraft?).

What I ended up doing for each:

  1. In a new ChatGPT window (not the custom GPT for illustrating): Describe the reading level; the name of the character(s); and the interests. Ask it to brainstorm story ideas based on these interests.
  2. It usually gave 3 story ideas in a few sentences each, including a title. Sometimes, I would pick one and move on. Other times, I would take one of the ideas and tweak it a bit and ask for more ideas based on that. Or, I’d have it try again generally, asking for 3 more ideas.
  3. Once I had an idea that I liked, I would ask it to outline the story, based on the chosen story idea and the grade level we were targeting. Sometimes I would tweak the title and other times I would take the title as-is.
  4. Once it had the outline, I could have it then write the entire story (especially for the younger, beginner reader or read-to levels that are so short), but for the “chapter” books of early 2nd and 2nd grade reading level, I had it give me a chapter at a time, based on the outline. As each chapter was generated, I edited and tweaked it and took the text to where I would build the book. Sometimes, I would re-write the whole chapter myself, then give it back the chapter text and ask it to write the next one. If you didn’t give it back, it wouldn’t know what the chapter ended up as, so this is an important step to do when you’re making more than minor sentence construction changes.
  5. Because I know my audience(s) well, I tweaked it heavily as I went, incorporating their interests. For example, in the second images I showed above, there’s a dancing dog. It’s their actual dog, with the dog named in the story along with them as characters. Or in the chapter book for the character with the bike, it described running up a big mountain on a quest and being tired. I tossed in an Aunt-Dana reference including reminding the character about run-walking as a way to keep moving forward without stopping and cover the distance that needs to be covered. I also tweaked the stories to include character traits (like kindness) that each child has, and/or behaviors that their family prioritizes.

I described the images processes first, then the story writing, in this blog post, but I actually did the opposite for each book. I would write (brainstorm, outline, write, edit, write) the entire book, then I would go start a new chat window (eventually solely using my custom GPT) and ask for illustrations. Sometimes, I would give it the page of the story’s text and ask it to illustrate it. That’s helpful when you don’t know what to illustrate, and it did fairly well for some of the images (especially the Minecraft-inspired ones!). Ultimately, though, I would often get an image, ask what the prompt was for the image, tweak the prompt, and give it back to better match the story or what I wanted to illustrate. Once I was regularly asking for the image prompts, I had realized that giving the character details repeatedly for every image helped with consistency. Then I would use the ad-nauseam details myself for a longer prompt, which resulted in better images throughout, so I spent more energy deciding myself what to illustrate to best match the story.

All in all, I made 7 custom books (and 8 copies, one of the Minecraft books I copied and converted to a different named character for a friend’s child!). Between writing and editing, and illustrating, I probably spent an average of one hour per book! That’s a lot of time, but it did get more efficient as I went, and in some cases the hour included completely starting over and re-working the images in the book for consistency compared to the version I had before. The next books I create will probably take less time, both because I figured out the above processes but also because hopefully DALL*E and other illustration tools will get better about being able to illustrate the same character consistently across multiple prompts to illustrate a story.

How other people can use this to create stories – and why:

I have been so excited about this project. I love, love, love to read and I love reading with my nieces and nephews (and bonus kids in my life) and finding books that match their interest and help spark or maintain their love of reading. That’s why I did this project, and I have been bursting for WEEKS waiting to be able to give everyone their books! I wanted it to be a surprise for their parents, too, which meant that I couldn’t tell 2/3 of my closest circles about my cool project.

One of my friends without young kids that I finally told about my project loved the idea: she works as staff at an elementary school, supporting some students who are working on their reading skills who are nonverbal. She thought it would be cool to make a book for one student in particular, and described some of her interests: violins, drums, raspberries, and unicorns. I was in the car when she told me this, and I was able to follow the same process as above in the mobile ChatGPT app and list the interests, ask for a brainstorm of story ideas for a beginning reading level style book that had some repetitive text using the interests to aid in reading. It created a story about a unicorn who gathers other animals in the forest to play in an orchestra (with drums and violins) and eat raspberries. I had it illustrate the story, and it did so (with slightly different unicorns throughout). I only had to have it re-draw one image, because it put text in one of the last images that didn’t need to be there.

Illsutrations from a quick story about a unicorn, drums, violin, and an orchestra, plus raspberries

It was quick and easy, and my friend and her student LOVED it, and the other teachers and staff at the school are now working on personalized books for a lot of other students to help them with reading skills!

It really is an efficient and relatively easy way to generate personalized content; it can do so at different reading levels (especially when a teacher or someone who knows the student can tweak it to better match the reading level or sounds and words they are working on next); and you can generate pretty good matching illustrations too.

The hardest part is consistent characters; but when you don’t need consistency throughout a whole book, the time it takes drops to ~5 or so minutes to write, tweak, and illustrate an entire story.

Illustrations require a paid ChatGPT account, but if you have one and want to try out the custom GPT I built for (slightly more consistent) illustrations of stories, you can check it out here.

Custom stories: prompting and effective illustrating with ChatGPT, a blog post by Dana M. Lewis from DIYPS.org

How I Use LLMs like ChatGPT And Tips For Getting Started

You’ve probably heard about new AI (artificial intelligence) tools like ChatGPT, Bard, Midjourney, DALL-E and others. But, what are they good for?

Last fall I started experimenting with them. I looked at AI art tools and found them to be challenging, at the time, for one of my purposes, which was creating characters and illustrating a storyline with consistent characters for some of my children’s books. I also tested GPT-3 (meaning version 3.0 of GPT). It wasn’t that great, to be honest. But later, GPT-3.5 was released, along with the ChatGPT chat interface to it, which WAS a big improvement for a lot of my use cases. (And now, GPT-4 is out and is an even bigger improvement, although it costs more to use. More on the cost differences below)

So what am I using these AI tools for? And how might YOU use some of these AI tools? And what are the limitations? This is what I’ve learned:

  1. The most frequent way I use these AI tools is for getting started on a project, especially those related to writing.

You know the feeling of staring at a blank page and not knowing where to start? Maybe it’s the blank page of a cold email; the blank page of an essay or paper you need to write; the blank page of the outline for a presentation. Starting is hard!

Even for this blog post, I had a list of bulleted notes of things I wanted to remember to include. But I wasn’t sure how I wanted to start the blog post or incorporate them. I stuck the notes in ChatGPT and asked it to expand the notes.

What did it do? It wrote a few paragraph summary. Which isn’t what I wanted, so I asked it again to use the notes and this time “expand each bullet into a few sentences, rather than summarizing”. With these clear directions, it did, and I was able to look at this content and decide what I wanted to edit, include, or remove.

Sometimes I’m stuck on a particular writing task, and I use ChatGPT to break it down. In addition to kick-starting any type of writing overall, I’ve asked it to:

  • Take an outline of notes and summarize them into an introduction; limitations section; discussion section; conclusion; one paragraph summary; etc.
  • Take a bullet point list of notes and write full, complete sentences.
  • Take a long list of notes I’ve written about data I’ve extracted from a systematic review I was working on, and ask it about recurring themes or outlier concepts. Especially when I had 20 pages (!) of hand-written notes in bullets with some loose organization by section, I could feed in chunks of content and get help getting the big picture from that 20 pages of content I had created. It can highlight themes in the data based on the written narratives around the data.

A lot of times, the best thing it does is it prompts my brain to say “that’s not correct! It should be talking about…” and I’m able to more easily write the content that was in the back of my brain all along. I probably use 5% of what it’s written, and more frequently use it as a springboard for my writing. That might be unique to how I’m using it, though, and other simple use cases such as writing an email to someone or other simplistic content tasks may mean you can keep 90% or more of the content to use.

2. It can also help analyze data (caution alert!) if you understand how the tools work.

Huge learning moment here: these tools are called LLMs (large language models). They are trained on large amounts of language. They’re essentially designed so that, based on all of those words (language) it’s taken in previously, to predict content that “sounds” like what would come after a given prompt. So if you ask it to write a song or a haiku, it “knows” what a song or a haiku “looks” like, and can generate words to match those patterns.

It’s essentially a PATTERN MATCHER on WORDS. Yeah, I’m yelling in all caps here because this is the biggest confusion I see. ChatGPT or most of these LLMs don’t have access to the internet; they’re not looking up in a search engine for an answer. If you ask it a question about a person, it’s going to give you an answer (because it knows what this type of answer “sounds” like), but depending on the amount of information it “remembers”, some may be accurate and some may be 100% made up.

Why am I explaining this? Remember the above section where I highlighted how it can start to sense themes in the data? It’s not answering solely based on the raw data; it’s not doing analysis of the data, but mostly of the words surrounding the data. For example, you can paste in data (from a spreadsheet) and ask it questions. I did that once, pasting in some data from a pivot table and asking it the same question I had asked myself in analyzing the data. It gave me the same sense of the data that I had based on my own analysis, then pointed out it was only qualitative analysis and that I should also do quantitative statistical analysis. So I asked it if it could do quantitative statistical analysis. It said yes, it could, and spit out some numbers and described the methods of quantitative statistical analysis.

But here’s the thing: those numbers were completely made up!

It can’t actually use (in its current design) the methods it was describing verbally, and instead made up numbers that ‘sounded’ right.

So I asked it to describe how to do that statistical method in Google Sheets. It provided the formula and instructions; I did that analysis myself; and confirmed that the numbers it had given me were 100% made up.

The takeaway here is: it outright said it could do a thing (quantitative statistical analysis) that it can’t do. It’s like a human in some regards: some humans will lie or fudge and make stuff up when you talk to them. It’s helpful to be aware and query whether someone has relevant expertise, what their motivations are, etc. in determining whether or not to use their advice/input on something. The same should go for these AI tools! Knowing this is an LLM and it’s going to pattern match on language helps you pinpoint when it’s going to be prone to making stuff up. Humans are especially likely to make something up that sounds plausible in situations where they’re “expected” to know the answer. LLMs are in that situation all the time: sometimes they actually do know an answer, sometimes they have a good guess, and sometimes they’re just pattern matching and coming up with something that sounds plausible.

In short:

  • LLM’s can expand general concepts and write language about what is generally well known based on its training data.
  • Try to ask it a particular fact, though, and it’s probably going to make stuff up, whether that’s about a person or a concept – you need to fact check it elsewhere.
  • It can’t do math!

But what it can do is teach you or show you how to do the math, the coding, or whatever thing you wish it would do for you. And this gets into one of my favorite use cases for it.

3. You can get an LLM to teach you how to use new tools, solve problems, and lower the barrier to entry (and friction) on using new tools, languages, and software.

One of the first things I did was ask ChatGPT to help me write a script. In fact, that’s what I did to expedite the process of finding tweets where I had used an image in order to get a screenshot to embed on my blog, rather than embedding the tweet.

It’s now so easy to generate code for scripts, regardless of which language you have previous experience with. I used to write all of my code as bash scripts, because that’s the format I was most familiar with. But ChatGPT likes to do things as Python scripts, so I asked it simple questions like “how do I call a python script from the command line” after I asked it to write a script and it generated a python script. Sure, you could search in a search engine or Stack Overflow for similar questions and get the same information. But one nice thing is that if you have it generate a script and then ask it step by step how to run a script, it gives you step by step instructions in context of what you were doing. So instead of saying “to run a script, type `python script.py’”, using placeholder names, it’ll say “to run the script, use ‘python actual-name-of-the-script-it-built-you.py’ “ and you can click the button to copy that, paste it in, and hit enter. It saves a lot of time for figuring out how to take placeholder information (which you would get from a traditional search engine result or Stack Overflow, where people are fond of things like saying FOOBAR and you have no idea if that means something or is meant to be a placeholder). Careful observers will notice that the latest scripts I’ve added to my Open Humans Data Tools repository (which is packed with a bunch of scripts to help work with big datasets!) are now in Python rather than bash; such as when I was adding new scripts for fellow researchers looking to check for updates in big datasets (such as the OpenAPS Data Commons). This is because I used GPT to help with those scripts!

It’s really easy now to go from an idea to a script. If you’re able to describe it logically, you can ask it to write a script, tell you how to run it, and help you debug it. Sometimes you can start by asking it a question, such as “Is it possible to do Y?” and it describes a method. You need to test the method or check for it elsewhere, but things like uploading a list of DOIs to Mendeley to save me hundreds of clicks? I didn’t realize Mendeley had an API or that I could write a script that would do that! ChatGPT helped me write the script, figure out how to create a developer account and app access information for Mendeley, and debug along the way so I ended up within an hour and a half of having a tool that easily saved me 3 hours on the very first project that I used it with.

I’m gushing about this because there’s probably a lot of ideas you have that you immediately throw out as being too hard, or you don’t know how to do it. It takes time, but I’m learning to remember to think “I should ask the LLM this” and ask it questions such as:

  • Is it possible to do X?
  • Write a script to do X.
  • I have X data. Pretend I am someone who doesn’t know how to use Y software and explain how I should do Z.

Another thing I’ve done frequently is ask it to help me quickly write a complex formula to use in a spreadsheet. Such as “write a formula that can be used in Google Sheets to take an average of the values in M3:M84 if they are greater than zero”.

It gives me the formula, and also describes it, and in some cases, gives alternative options.

Other things I’ve done with spreadsheets include:

  • Ask it to write a conditional formatting custom formula, then give me instructions for expanding the conditional formatting to apply to a certain cell range.
  • Asking it to check if a cell is filled with a particular value and then repeating the value in the new cell, in order to create new data series to use in particular charts and graphs I wanted to create from my data.
  • Help me transform my data so I could generate a box and whisker plot.
  • Ask it for other visuals that might be effective ways to illustrate and visualize the same dataset.
  • Explain the difference between two similar formulas (e.g. COUNT and COUNTA or when to use IF and IFS).

This has been incredibly helpful especially with some of my self-tracked datasets (particularly around thyroid-related symptom data) where I’m still trying to figure out the relationship between thyroid levels, thyroid antibody levels, and symptom data (and things like menstrual cycle timing). I’ve used it for creating the formulas and solutions I’ve talked about in projects such as the one where I created a “today” line that dynamically updates in a chart.

It’s also helped me get past the friction of setting up new tools. Case in point, Jupyter notebooks. I’ve used them in the web browser version before, but often had issues running the notebooks people gave me. I debugged and did all kinds of troubleshooting, but have not for years been able to get it successfully installed locally on (multiple of) my computers. I had finally given up on effectively using notebooks and definitely given up on running it locally on my machine.

However, I decided to see if I could get ChatGPT to coax me through the install process.

I told it:

“I have this table with data. Pretend I am someone who has never used R before. Tell me, step by step, how to use a Jupyter notebook to generate a box and whisker plot using this data”

(and I pasted my data that I had copied from a spreadsheet, then hit enter).

It outlined exactly what I needed to do, saying to install Jupyter Notebook locally if I hadn’t, gave me code to do that, installing the R kernel, told me how to do that, then how to start a notebook all the way down to what code to put in the notebook, the data transformed that I could copy/paste, and all the code that generated the plot.

However, remember I have never been able to successfully get Jupyter Notebooks running! For years! I was stuck on step 2, installing R. I said:

“Step 2, explain to me how I enter those commands in R? Do I do this in Terminal?”

It said “Oh apologies, no, you run those commands elsewhere, preferably in Rstudio. Here is how to download RStudio and run the commands”.

So, like humans often do, it glossed over a crucial step. But it went back and explained it to me and kept giving more detailed instructions and helping me debug various errors. After 5-6 more troubleshooting steps, it worked! And I was able to open Jupyter Notebooks locally and get it working!

All along, most of the tutorials I had been reading had skipped or glossed over that I needed to do something with R, and where that was. Probably because most people writing the tutorials are already data scientists who have worked with R and RStudio etc, so they didn’t know those dependencies were baked in! Using ChatGPT helped me be able to put in every error message or every place I got stuck, and it coached me through each spot (with no judgment or impatience). It was great!

I was then able to continue with the other steps of getting my data transformed, into the notebook, running the code, and generating my first ever box and whisker plot with R!

A box and whisker plot, illustrated simply to show that I used R and Jupyter finally successfully!

This is where I really saw the power of these tools, reducing the friction of trying something new (a tool, a piece of software, a new method, a new language, etc.) and helping you troubleshoot patiently step by step.

Does it sometimes skip steps or give you solutions that don’t work? Yes. But it’s still a LOT faster than manually debugging, trying to find someone to help, or spending hours in a search engine or Stack Overflow trying to translate generic code/advice/solutions into something that works on your setup. The beauty of these tools is you can simply paste in the error message and it goes “oh, sorry, try this to solve that error”.

Because the barrier to entry is so low (compared to before), I’ve also asked it to help me with other project ideas where I previously didn’t want to spend the time needed to learn new software and languages and all the nuances of getting from start to end of a project.

Such as, building an iOS app by myself.

I have a ton of projects where I want to temporarily track certain types of data for a short period of time. My fall back is usually a spreadsheet on my phone, but it’s not always easy to quickly enter data on a spreadsheet on your phone, even if you set up a template with a drop down menu like I’ve done in the past (for my DIY macronutrient tool, for example). For example, I want to see if there’s a correlation in my blood pressure at different times and patterns of inflammation in my eyelid and heart rate symptoms (which are symptoms, for me, of thyroid antibodies being out of range, due to Graves’ disease). That means I need to track my symptom data, but also now some blood pressure data. I want to be able to put these datasets together easily, which I can, but the hardest part (so to speak) is finding a way that I am willing to record my blood pressure data. I don’t want to use an existing BP tracking app, and I don’t want a connected BP monitor, and I don’t want to use Apple Health. (Yes, I’m picky!)

I decided to ask ChatGPT to help me accomplish this. I told it:

“You’re an AI programming assistant. Help me write a basic iOS app using Swift UI. The goal is a simple blood pressure tracking app. I want the user interface to default to the data entry screen where there should be three boxes to take the systolic, diastolic blood pressure numbers and also the pulse. There should also be selection boxes to indicate whether the BP was taken sitting up or laying down. Also, enable the selection of a section of symptom check boxes that include “HR feeling” and “Eyes”. Once entered on this screen, the data should save to a google spreadsheet.” 

This is a completely custom, DIY, n of 1 app. I don’t care about it working for anyone else, I simply want to be able to enter my blood pressure, pulse, whether I’m sitting or laying down, and the two specific, unique to me symptoms I’m trying to analyze alongside the BP data.

And it helped me build this! It taught me how to set up a new SwiftUI project in XCode, gave me code for the user interface, how to set up an API with Google Sheets, write code to save the data to Sheets, and get the app to run.

(I am still debugging the connection to Google Sheets, so in the interim I changed my mind and had it create another screen to display the stored data then enable it to email me a CSV file, because it’s so easy to write scripts or formulas to take data from two sources and append it together!)

Is it fancy? No. Am I going to try to distribute it? No. It’s meeting a custom need to enable me to collect specific data super easily over a short period of time in a way that my previous tools did not enable.

Here’s a preview of my custom app running in a simulator phone:

Simulator iphone with a basic iOS app that intakes BP, pulse, buttons for indicating whether BP was taken sitting or laying down; and toggles for key symptoms (in my case HR feeling or eyes), and a purple save button.

I did this in a few hours, rather than taking days or weeks. And now, the barrier to entry to creating more custom iOS is reduced, because now I’m more comfortable working with XCode and the file structures and what it takes to build and deploy an app! Sure, again, I could have learned to do this in other ways, but the learning curve is drastically shortened and it takes away most of the ‘getting started’ friction.

That’s the theme across all of these projects:

  • Barriers to entry are lower and it’s easier to get started
  • It’s easier to try things, even if they flop
  • There’s a quicker learning curve on new tools, technologies and languages
  • You get customized support and troubleshooting without having to translate through as many generic placeholders

PS – speaking of iOS apps, based on building this one simple app I had the confidence to try building a really complex, novel app that has never existed in the world before! It’s for people with exocrine pancreatic insufficiency like me who want to log pancreatic enzyme replacement therapy (PERT) dosing and improve their outcomes – check out PERT Pilot and how I built it here.

4. Notes about what these tools cost

I found ChatGPT useful for writing projects in terms of getting started, even though the content wasn’t that great (on GPT-3.5, too). Then they came out with GPT-4 and made a ChatGPT Pro option for $20/month. I didn’t think it was worth it and resisted it. Then I finally decided to try it, because some of the more sophisticated use cases I wanted to use it for required a longer context window, and in addition to a better model it also gave you a longer context window. I paid the first $20 assuming I’d want to cancel it by the end of the month.

Nope.

The $20 has been worth it on every single project that I’ve used it for. I’ve easily saved 5x that on most projects in terms of reducing the energy needed to start a project, whether it was writing or developing code. It has saved 10x that in time cost recouped from debugging new code and tools.

GPT-4 does have caps, though, so even with the $20/month, you can only do 25 messages every 3 hours. I try to be cognizant of which projects I default to using GPT-3.5 on (unlimited) versus saving the more sophisticated projects for my GPT-4 quota.

For example, I saw a new tool someone had built called “AutoResearcher”, downloaded it, and tried to use it. I ran into a bug and pasted the error into GPT-3.5 and got help figuring out where the problem was. Then I decided I wanted to add a feature to output to a text file, and it helped me quickly edit the code to do that, and I PR’ed it back in and it was accepted (woohoo) and now everyone using that tool can use that feature. That was pretty simple and I was able to use GPT-3.5 for that. But sometimes, when I need a larger context window for a more sophisticated or content-heavy project, I start with GPT-4. When I run into the cap, it tells me when my next window opens up (3 hours after I started using it), and I usually have an hour or two until then. I can open a new chat on GPT-3.5 (without the same context) and try to do things there; switch to another project; or come back at the time it says to continue using GPT-4 on that context/setup.

Why the limit? Because it’s a more expensive model. So you have a tradeoff between paying more and having a limit on how much you can use it, because of the cost to the company.

—–

TLDR:

Most important note: LLMs don’t “think” or “know” things the way humans do. They output language they predict you want to see, based on its training and the inputs you give it. It’s like the autocomplete of a sentence in your email, but more words on a wider range of topics!

Also, the LLM can’t do math. But they can write code. Including code to do math.

(Some, but not all, LLMs have access to the internet to look up or incorporate facts; make sure you know which LLM you are using and whether it has this feature or not.)

Ways to get started:

    1. The most frequent way I use these AI tools is for getting started on a project, especially those related to writing.
      • Ask it to help you expand on notes; write summaries of existing content; or write sections of content based on instructions you give it
    2.  It can also help analyze data (caution alert!) if you understand the limitations of the LLM.
      • The most effective way to work with data is to have it tell you how to run things in analytical software, whether that’s how to use R or a spreadsheet or other software for data analysis. Remember the LLM can’t do math, but it can write code so you can then do the math!
    3.  You can get an LLM to teach you how to use new tools, solve problems, and lower the barrier to entry (and friction) on using new tools, languages, and software.
      • Build a new habit of asking it “Can I do X” or “Is it possible to do Y” and when it says it’s possible, give it a try! Tell it to give you step-by-step instructions. Tell it where you get stuck. Give it your error messages or where you get lost and have it coach you through the process. 

What’s been your favorite way to use an LLM? I’d love to know other ways I should be using them, so please drop a comment with your favorite projects/ways of using them!

Personally, the latest project that I built with an LLM has been PERT Pilot!

How I use LLMs (like ChatGPT) and tips for getting started

How to Pick Food (Fuel) For Ultramarathon Running

I’ve previously written about ultrarunning preparation and a little bit about how I approach fueling. But it occurred to me there might be others out there wondering exactly HOW to find fuel that works for them, because it’s an iterative process.

The way I approach fueling is based on a couple of variables.

First and foremost, everything has to be gluten free (because I have celiac). So that limits a lot of the common ultrarunning fuel options. Things like bars (some are GF, most are not), Uncrustables, PopTarts, and many other common recommendations in the ultra community just aren’t an option for me. Some, I can find or make alternatives to, but it’s worth noting that being gluten free for celiac (where cross-contamination is also an issue, not just the ingredients) or having a food allergy and being an ultrarunner can make things more challenging.

Then, I also have exocrine pancreatic insufficiency. This doesn’t limit what I eat, but it factors in to how I approach ideal fueling options, because I have to match the enzyme amounts to the amount of food I’m eating. So naturally, the pill size options I have of OTC enzymes (one is lipase only and covers ~6g of fat for me, the other is a multi-enzyme option that includes protease to cover protein, and only enough lipase to cover ~4g of fat for me; I also have one much larger that covers ~15g of fat but I don’t typically use this one while running) influence the portion sizes of what I choose.

That being said, I probably – despite EPI – still tend toward higher fat options than most people. This is in part because I have had type 1 diabetes for 20+ years. While I by no means consume a low c-a-r-b diet, I typically consume less than the people with insulin-producing pancreases in my life, and lean slightly toward higher fat options because a) my taste buds like them and b) they’ve historically had less impact on my glucose levels. Reason A is probably the main reason now, thanks to automated insulin delivery, but regardless of reason, 20+ years of a higher level than most people’s fat consumption means I’m also probably better fat-adapted for exercise than most people.

Plus, ultrarunning tends to be slower than shorter runs (like marathons and shorter for most people), so that’s also more amenable to fat and other nutrient digestion. So, ultrarunners in general tend to have more options in terms of not just needing “gu” and “gel” and “blocks” and calorie-sugar drinks as fuel options (although if that is what you prefer and works well for you, great!).

All of these reasons lead me toward generally preferring fuel portions that are:

  1. Gluten free with no cross-contamination risk
  2. ~20g of carbs
  3. ~10g of fat or less
  4. ~5-10g of protein or less

Overall, I shoot for consuming ~250 calories per hour. Some people like to measure hourly fuel consumption by calories. Others prefer carb consumption. But given that I have a higher tolerance for fat and protein consumption – thanks to the enzymes I need for EPI plus decades of practice – calories as a metric for hourly consumption makes sense for me. If I went for the level of carb intake many recommend for ultrarunners, I’d find it harder to consistently manage glucose levels while running for a zillion hours. I by no means think any of my above numbers are necessarily what’s best for anyone else, but that’s what I use based on my experiences to date as a rough outline of what to shoot for.

After I’ve thought through my requirements: gluten free, 250 calories per hour, and preferably no single serving portion size that is greater than 20ish grams of carbs or 10g of fat or 5-10g or protein, I can move on to making a list of foods I like and that I think would “work” for ultrarunning.

“Work” by my definition is not too messy to carry or eat (won’t melt easily, won’t require holding in my hands to eat and get them messy).

My initial list has included (everything here gluten free):

  • Oreos or similar sandwich type cookies
  • Yogurt/chocolate covered pretzels
  • PB or other filled pretzel nuggets
  • Chili cheese Fritos
  • Beef sticks
  • PB M&M’s
  • Reese’s Pieces
  • Snickers
  • Mini PayDays
  • Macaroons
  • Muffins
  • Fruit snacks
  • Fruit/date bars
  • GF (only specific flavors are GF which is why I’m noting this) of Honey Stinger Stroopwaffles

I wish I could include more chip/savory options on my lists, and that’s something I’ve been working on. Fritos are easy enough to eat from a snack size baggie without having to touch them with my hands or pull individual chips out to eat; I can just pour portions into my mouth. Most other chips, though, are too big and too ‘sharp’ feeling for my mouth to eat this way, so chili cheese Fritos are my primary savory option, other than beef sticks (that are surprisingly moist and easy to swallow on the run!).

Some of the foods I’ve tried from the above list and have eventually taken OFF my list include:

  • PB pretzel nuggets, because they get stale in baggies pretty fast and then they feel dry and obnoxious to chew and swallow.
  • Muffins – I tried both banana muffin halves and chocolate chip muffin halves. While they’re moist and delicious straight out of the oven, I found they are challenging to swallow while running (probably because they’re more dry).
  • Gluten free Oreos – actual Oreo brand GF Oreos, which I got burnt out on about the time I realized I had EPI, but also they too have a pretty dry mouthfeel. I’ve tried other brand chocolate sandwich cookies and also for some reason find them challenging to swallow. I did try a vanilla sandwich cookie (Glutino brand) recently and that is working better – the cookie is harder but doesn’t taste as dry – so that’s tentatively on my list as a replacement.

Other than “do I like this food” and “does it work for carrying on runs”, I then move on to “optimizing” my intake in terms of macronutrients.  Ideally, each portion size and item has SOME fat, protein, and carbs, but not TOO MUCH fat, protein and carbs.

Most of my snacks are some fat, a little more carb, and a tiny bit of protein. The outlier is my beef sticks, which are the highest protein option out of my shelf-stable running fuel options (7g of fat, 8g of protein). Most of the others are typically 1-3g of protein, 5-10g of fat (perfect, because that is 1-2 enzyme OTC pills), and 10-20g of carb (ideal, because it’s a manageable amount for glucose levels at any one time).

Sometimes, I add things to my list based on the above criteria (gluten free with no cross-contamination list; I like to eat it; not messy to carry) and work out a possible serving size. For example, the other day I was brainstorming more fuel options and it occurred to me that I like brownies and a piece of brownie in a baggie would probably be moist and nice tasting and would be fine in a baggie. I planned to make a batch of brownies and calculated how I would cut them to get consistent portion sizes (so I would know the macronutrients for enzymes).

However, once I made my brownies, and started to cut them, I immediately went “nope” and scratched them off my list for using on runs. Mainly because, I hate cutting them and they crumbled. The idea of having to perfect how to cook them to be able to cut them without them crumbling just seems like too much work. So I scratched them off my list, and am just enjoying eating the brownies as brownies at home, not during runs!

I first started taking these snacks on runs and testing each one, making sure that they tasted good and also worked well for me (digestion-wise) during exercise, not just when I was sitting around. All of them, other than the ones listed above for ‘dry’ reasons or things like brownies (crossed off because of the hassle to prepare), have stayed on the list.

I also started looking at the total amount of calories I was consuming during training runs, to see how close I was to my goal of ~250 calories per hour. It’s not an exact number and a hard and fast “must have”, but given that I’m a slower runner (who run/walks, so I have lower calorie burn than most ultrarunners), I typically burn in the ballpark of ~300-400 calories per hour. I generally assume ~350 calories for a reasonable average. (Note, again, this is much lower than most people’s burn, but it’s roughly my burn rate and I’m trying to show the process itself of how I make decisions about fuel).

Aiming for ~250 calories per hour means that I only have a deficit of 100 calories per hour. Over the course of a ~100 mile race that might take 30 hours, this means I’ll “only” have an estimated deficit of 3,000 calories. Which is a lot less than most people’s estimated deficit, both because I have a lower burn rate (I’m slower) and because, as described above and below, I am trying to be very strategic about fueling for a number of reasons, including not ending up under fueling for energy purposes. For shorter runs, like a 6 hour run, that means I only end up ~600 calories in deficit – which is relatively easy to make up with consumption before and after the run, to make sure that I’m staying on top of my energy needs.

It turns out, some of my preferred snacks are a lot lower and higher calories than each other! And this can add up.

For example, fruit snacks – super easy to chew (or swallow without much chewing). 20g of carb, 0g of fat or protein, and only 80 calories. Another easy to quickly chew and swallow option: a mini date (fruit) bar. 13g carb, 5g fat, 2 protein. And…90 calories. My beef stick? 7g of fat, 8g of protein, and only 100 calories!

My approach that works for me has been to eat every 30 minutes, which means twice per hour. Those are three of my favorite (because they’re easy to consume) fuel options. If I eat two of those in the same hour, say fruit snacks and the date bar, that’s only 170 calories. Well below the goal of 250 for the hour! Combining either with my beef stick (so 180 or 190 calories, depending), is still well below goal.

This is why I have my macronutrient fuel library with carbs, fat, protein, *and* calories (and sodium, more on that below) filled out, so I can keep an eye on patterns of what I tend to prefer by default – which is often more of these smaller, fewer calorie options as I get tired at the end of the runs, when it’s even more important to make sure I’m at (or near) my calorie goals.

Tracking this for each training run has been really helpful, so I can see my default tendency to choose “smaller” and “easier to swallow” – but that also means likely fewer calories – options. This is also teaching me that I need to pair larger calorie options with them or follow on with a larger calorie option. For example, I have certain items on my list like Snickers. I get the “share size” bars that are actually 2 individual bars, and open them up and put one in each baggie. ½ of the share size package (aka 1 bar) is 220 calories! That’s a lot (relative to other options), so if I eat a <100 calorie option like fruit snacks or a date bar, I try to make it in the same hour as the above average option, like the ½ snickers. 220+80 is 300 calories, which means it’s above goal for the hour.

And that works well for me. Sometimes I do have hours where I am slightly below goal – say 240 calories. That’s fine! It’s not precise. But 250 calories per hour as a goal seems to work well as a general baseline, and I know that if I have several hours of at or greater than 250 calories, one smaller hour (200-250) is not a big deal. But this tracking and reviewing my data during the run via my tracking spreadsheet helps make sure I don’t get on a slippery slope to not consuming enough fuel to match the demands I’m putting on my body.

And the same goes for sodium. I have read a lot of literature on sodium consumption and/or supplementation in ultrarunning. Most of the science suggests it may not matter in terms of sodium concentration in the blood and/or muscle cramps, which is why a lot of people choose sodium supplementation. But for me, I have a very clear, distinct feeling when I get not enough sodium. It is almost like a chemical feeling in my chest, and is a cousin (but distinct) feeling to feeling ketones. I’ve had it happen before on long hikes where I drank tons to stay hydrated and kept my glucose levels in range but didn’t eat snacks with sodium nor supplement my water. I’ve also had it happen on runs. So for me, I do typically need sodium supplementation because that chemical-like feeling builds up and starts to make me feel like I’m wheezing in my chest (although my lungs are fine and have no issues during this). And what I found works for me is targeting around 500mg/hour of sodium consumption, through a combination of electrolyte pills and food.

(Side note, most ultrarunning blogs I’ve read suggest you’ll be just fine based on food you graze at the aid station. Well, I do most of my ultras as solo endeavors – no grazing, everything is pre-planned – and even if I did do an organized race, because of celiac I can’t eat 95% of the food (due to ingredients, lack of labeling, and/or cross contamination)…so that just doesn’t work for me to rely on aid station food to supplement me sodium-wise. But maybe it would work for other people, it just doesn’t for me given the celiac situation.)

I used to just target 500mg/hour of sodium through electrolyte pills. However, as I switched to actually fueling my runs and tracking carbs, fat, protein, and calories (as described above), I realized it’d be just as easy to track sodium intake in the food, and maybe that would enable me to have a different strategy on electrolyte pill consumption – and it did!

I went back to my spreadsheet and re-added information for sodium to all of my food items in my fuel library, and added it to the template that I duplicate for every run. Some of my food items, just like they can be outliers on calories or protein or fat or carbs, are also outliers on sodium. Biggest example? My beef stick, the protein outlier, is also a sodium outlier: 370mg of sodium! Yay! Same for my chili cheese Fritos – 210mg of sodium – which is actually the same amount of sodium that’s in the type of electrolyte pills I’m currently using.

I originally had a timer set and every 45 minutes, I’d take an electrolyte pill. However, in the last year I gradually realized that sometimes that made me over by quite a bit on certain hours and in some cases, I ended up WAY under my 500mg sodium goal. I actually noticed this in the latter portion of my 82 mile run – I started to feel the low-sodium chest feeling that I get, glanced at my sheet (that I hadn’t been paying close attention to because of So. Much. Rain) and realized – oops – that I had an hour of 323mg of sodium followed by a 495mg hour. I took another electrolyte pill to catch up and chose some higher sodium snacks for my next few fuels. There were a couple hours earlier in the run (hours 4 and 7) where I had happened to – based on some of my fresh fuel options like mashed potatoes – to end up with over 1000mg of sodium. I probably didn’t need that much, and so in subsequent hours I learned I could skip the electrolyte pill when I had had mashed potatoes in the last hour. Eventually, after my 82-mile run when I started training long runs again, I realized that keeping an eye on my rolling sodium tallies and tracking it like I tracked calories, taking an electrolyte pill when my hourly average dropped <500mg and not based on a pre-set time when it was >500mg, began to work well for me.

And that’s what I’ve been experimenting with for my last half dozen runs, which has worked – all of those runs have ended up with a total average slightly above 500mg of sodium and slightly above 250 calories for all hours of the run!

An example chart that automatically updates (as a pivot table) summarizing each hour's intake of sodium and calories during a run. At the bottom, an average is calculated, showing this 6 hour run example achieved 569 mg/hr of sodium and 262 calories per hour, reaching both goals.

Now, you may be wondering – she tracks calories and sodium, what about fat and protein and carbs?

I don’t actually care about or use these in real-time for an hourly average; I use these solely as real-time decision in points as 1) for carbs, to know how much insulin I might need dependent on my glucose levels at the time (because I have Type 1 diabetes); and 2) the fat and protein is to make sure I take the right amount of enzymes so I can actually digest the fuel (because I have exocrine pancreatic insufficiency and can’t digest fuel without enzyme pills). I do occasionally look back at these numbers cumulatively, but for the most part, they’re solely there for real-time decision making at the moment I decide what to eat. Which is 95% of the time based on my taste buds after I’ve decided whether I need to factor in a higher calorie or sodium option!

For me, my higher sodium options are chili cheese Fritos, beef stick, yogurt covered pretzels.

For me, my higher calorie options are the ½ share size Snickers; chili cheese Fritos; Reese’s pieces; yogurt covered pretzels; GF honey stinger stroopwaffle; and 2 mini PayDay bars.

Those are all shelf-stable options that I keep in snack size baggies and ready to throw into my running vest.

Most of my ‘fresh’ food options, that I’d have my husband bring out to the ‘aid station’/turnaround point of my runs for refueling, tend to be higher calorie options. This includes ¼ of a GF PB&J sandwich (which I keep frozen so it lasts longer in my vest without getting squishy); ¼ of a GF ham and cheese quesadilla; a mashed potato cup prepared in the microwave and stuck in another baggie (a jillion, I mean, 690mg of sodium if you consume the whole thing but it’s occasionally hard to eat allll those mashed potatoes out of a baggie in one go when you’re not actually very hungry); sweet potato tots; etc.

So again, my recommendation is to find foods you like in general and then figure out your guiding principles. For example:

  • Do you have any dietary restrictions, food allergies or intolerances, or have already learned foods that your body Does Not Like while running?
  • Are you aiming to do carbs/hr, calories/hr, or something else? What amounts are those?
  • Do you need to track your fuel consumption to help you figure out how you’re not hitting your fuel goals? If so, how? Is it by wrappers? Do you want to start with a list of fuel and cross it off or tear it off as you go? Or like me, use a note on your phone or a drop down list in your spreadsheet to log it (my blog post here has a template if you’d like to use it)?

My guiding principles are:

  • Gluten free with no cross contamination risk (because celiac)
  • ~250 calories per hour, eating twice per hour to achieve this
  • Each fuel (every 30 min) should be less than ~20g of carb, ~10g of fat, and ~5-10g of protein
  • I also want ~500mg of sodium each hour through the 2x fuel and when needed, electrolyte pills that have 210mg of sodium each
  • Dry food is harder to swallow; mouthfeel (ability to chew and swallow it) is something to factor in.
  • I prefer to eat my food on the go while I’m run/walking, so it should be all foods that can go in a snack or sandwich size baggie in my vest. Other options (like chicken broth, soup, and messy food items) can be on my backup list to be consumed at the aid station but unless I have a craving for them, they are secondary options.
  • Not a hassle to make/prepare/measure out into individual serving sizes.

Find foods that you like, figure out your guiding principles, and keep revising your list as you find what options work well for you in different situations and based on your running needs!

Food (fuel) for ultramarathon running by Dana Lewis at DIYPS.org