Try, Try Again with AI

If you’ve scoffed at AI, dismissed it, or tried it and felt disappointed, you’re not alone. Maybe the result wasn’t quite right, or it missed the mark entirely. It’s easy to walk away thinking, “AI just doesn’t work.” But like learning any new tool, getting good results from AI takes a little persistence, a bit of creativity, and the willingness to try again. It also takes an understanding that “AI” is not a single thing.

AI is not magic or a mind reader. AI is a tool. A powerful one, but it depends entirely on how you use it. I find it helpful to think of it as a coworker or intern that’s new to your field. It’s generally smart and able to do some things, but it needs clear requests and directions on what to do. When it misses the mark, it needs feedback, or for you to circle around and try again with fresh instructions.

If your first attempt doesn’t go perfectly, it doesn’t mean the technology is useless, just like your brand new coworker isn’t completely useless.

Imperfect Doesn’t Mean Impossible

One way to think of AI is as a new kitchen gadget. Imagine that you get a new mini blender or food processor. You’ve never made a smoothie before, but you want to. You toss in a bunch of ingredients and out comes…yuck.

Are you going to immediately throw away the blender? Probably not. You’re likely to try again, with some tweaks. You’ll try different ingredients, more or less liquid, and modify and try again.

I had that experience when I broke my ankle and needed to incorporate more protein into my diet. I got a protein powder and tried stirring it into chocolate milk. Gross. I figured out that putting it in a tupperware container and shaking it thoroughly, then leaving it overnight, turned out ok. Eventually, when I got a blender, I found it did even better. But the perfect recipe for me ended up being chocolate milk, protein powder, and frozen bananas. Yum: it gave it a chocolate milkshake texture and I couldn’t tell there was powder in it. But I still had to tweak things: shoving in large pieces of frozen bananas didn’t work well with my mini blender. I figured out slices worked ok, and eventually Scott and I zeroed in on slicing the banana and putting it into the freezer ahead of time, so that I had ready-to-go, right-sized frozen banana chunks to mix in.

I had some other flops, too. I had found a few other recipes I liked that didn’t use protein powder: frozen raspberry or frozen pineapple + a Crystal Light lemonade packet + water are two of my hot weather favorites. But one time it occurred to me to try the pineapple recipe with protein powder in it… ew. The protein powder did not go well with citrus, so I didn’t make that one again.

AI is like that blender. If the result isn’t what you wanted, you should try:

  • Rewrite your prompt. Try different words, or give it more context (instructions).
  • Give it more detail or clearer instructions. “Make a smoothie” is a little vague; “blend chocolate milk, protein powder, and frozen banana” gives it more direction about what you want.
  • Try a different tool. Different tools use different underlying models and setups, so how you’d do something with ChatGPT may differ from how you’d do it with Gemini or Midjourney.

Sometimes, small tweaks make a big difference.
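As a concrete (hypothetical) example of what “rewording with more context” can look like in practice, here is a minimal sketch using the OpenAI Python client to send the same task twice, once vaguely and once with more detail. The model name and prompt wording are placeholders; the same idea applies to whatever chat tool you’re using.

```python
# A minimal sketch of iterating on a prompt instead of giving up after the
# first try. The model name and prompt wording are placeholders.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

vague_prompt = "Write an email about the schedule change."
detailed_prompt = (
    "Write a short, friendly email to my volleyball team explaining that "
    "practice moves from Tuesday to Wednesday at 7pm starting next week, "
    "and ask them to reply if they can't make the new time."
)

for prompt in (vague_prompt, detailed_prompt):
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; use whichever model/tool you have access to
        messages=[{"role": "user", "content": prompt}],
    )
    print(prompt, "\n---\n", response.choices[0].message.content, "\n")
```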

If It Doesn’t Work Today, Try Again Tomorrow (or sometime in the future)

Some tasks are still at the edge of what AI in general, or a particular model at a given time, can do. That doesn’t mean AI will always be unable to do that task. AI is improving constantly, and quickly. What didn’t work a few months ago might work today, either in the same model or in a new model or tool.

A flowchart titled “Try a task with AI” illustrates approaching AI with persistence and iteration: if the result is not quite right, reword your prompt, give it more instructions, or try the prompt with a different model/tool. If it still doesn’t work, park the project on a “try again later” list and try a different task or project. If the result is pretty good, keep going and use AI for other tasks and projects. Either way, the arrows loop back to trying again.

I’ve started making a list of projects or tasks I want to work on where the AI isn’t quite there yet and/or I haven’t figured out a good setup, the right tool, etc. A good example of this was when I wanted to make an Android version of PERT Pilot. It took me *four tries* over the course of an entire year before I got to a workable prototype. Ugh. I knew it wasn’t impossible, so I kept coming back to the project periodically and starting fresh with a new chat and new instructions to try to get going. In the course of that year, the models changed several times, and the latest models were even better at coding. Plus, through practice, I had gotten better at both prompting and troubleshooting when the output of the LLM wasn’t quite what I wanted. All of that added up over time, and I finally have an Android version of PERT Pilot (out on the Play Store now, too!) to match the iOS version. (AI also helped me quickly take the AI meal estimation feature from PERT Pilot, which is an app for people with EPI, and turn it into a general purpose iOS app called Carb Pilot. If you want macronutrient (fat, protein, carb, and/or calorie) counts for meals, you might be interested in Carb Pilot.)

Try different tasks and projects

You don’t have to start with complex projects. In fact, it’s better if you don’t. Start with tasks you already know how to do, but where you want to see how the AI does. This could be summarizing text, writing or rewriting an email, or changing the format of information (e.g., JSON to CSV, or raw text into a table formatted so you can easily copy/paste it elsewhere).
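For example, a format conversion like JSON to CSV is a small, easy-to-check task: you can see right away whether the output opens correctly in a spreadsheet. Here is a minimal sketch of what such a script might look like; the filenames and field names are made up for illustration, and you could just as easily ask an LLM to write this for you.

```python
# A minimal sketch of a JSON-to-CSV conversion. Filenames and fields are
# hypothetical; adapt them to your own data.
import csv
import json

with open("meals.json") as f:   # e.g. [{"name": "Lunch", "carbs": 45, "protein": 20}, ...]
    records = json.load(f)

with open("meals.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=records[0].keys())
    writer.writeheader()
    writer.writerows(records)
```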

Then branch out. Try something new you don’t know how to do, or tackle a challenge you’ve been avoiding.

There are two good categories of tasks you can try with AI:

  • Tasks you already do, but want to do more efficiently
  • Tasks you want to do, but aren’t sure how to begin

AI is a Skill, and Skills Take Practice

Using AI well is a skill. And like any skill, it improves with practice. It’s a bit like managing an intern or a coworker who’s new to your organization or field. The first time you managed someone, you probably weren’t as good at it as after you had 5 years of practice managing people or helping interns get up to speed quickly. Over time, you figure out how to right-size tasks; repeat instructions or give them differently to match people’s learning or communication styles; and circle back when it’s clear your instructions may have been misunderstood or they’re heading off in a slightly unexpected direction.

Don’t let one bad experience with AI close the door. The people who are getting the most out of AI right now are the ones who keep trying. We experimented, failed, re-tried, and learned. That can be you, too.

If AI didn’t wow you the first time for the first task you tried, don’t quit. Rephrase your prompt. Try another model/tool. (Some people like ChatGPT; some people like Claude; some people like Gemini; etc.) You can also ask for help. (You can ask the LLM itself for help! Or ask a friendly human. I’m a friendly human you can try asking, for example, if you’re reading this post: DM or email me and tell me what you’re stuck on. If I can make suggestions, I will!)

Come back in a week. Try a new type of task. Try the same task again, with a fresh prompt.

But most importantly: keep trying. The more you do, the better it gets.

iOS and Android development experience for newbies

Vibe coding apps is one thing, but what about deploying and distributing them? That still requires some elbow grease, and I’ve described my experiences with both Apple and Google below for my first apps on each platform.

(I’m writing this from the perspective of someone familiar with coding primarily through bash scripts, JavaScript, Python, and various other languages, but with no prior IDE or mobile app development experience on either iOS or Android when I got started, as I typically work in vim through the terminal. For context, I have an iOS personal device.)

Being new to iOS app development

First, some notes on iOS development. If you only want to test your app on your own phone, it’s free. You can build the app in Xcode and, with a cable, deploy it directly to your phone. However, if you wish to distribute apps via TestFlight (digitally) to yourself or other users, Apple requires a paid developer account at $99 per year. (This cost can be annoying for people working on free apps who are doing this as not-a-business.) Initially, figuring out the process to move an app from Xcode to TestFlight or the App Store is somewhat challenging. However, once you understand that archiving the app opens a popup to distribute it, the process becomes seamless. Sometimes there are errors if Apple has new developer agreements for you to sign in the web interface, but the errors from the archive process just say your account is wrong. (So check the developer page in your account for things to sign, then go try again once you’ve done that.) TestFlight itself is intuitive even for newcomers, whether that is yourself or a friend or colleague you ask to test your app.

Submitting an app to the App Store through the web interface is relatively straightforward. Once you’ve got your app into TestFlight, you can go to app distribution, and create a version and listing for your app and add the build you put into TestFlight. Note that Apple is particular about promotional app screenshots and requires specific image sizes. Although there are free web-based tools to generate these images from your screenshots, if you use a tool without an account login, it becomes difficult to replicate the exact layout later. To simplify updates, I eventually switched to creating visuals manually using PowerPoint. This method made updating images easier when I had design changes to showcase, making me more likely to keep visuals current. Remember, you must generate screenshots for both iPhone and iPad, so don’t neglect testing your app on iPad, even if usage might seem minimal.

When submitting an app for the first time, the review process can take several days before beginning. My initial submission encountered bugs discovered by the reviewer and was rejected. After fixing the issues and resubmitting, the process was straightforward and quicker than expected. Subsequent submissions for new versions have been faster than the very first review (usually 1-3 days max, sometimes same-day), and evaluation by App Store reviewers seems more minimal for revisions versus new apps.

The main challenge I have faced with App Store reviews involved my second app, Carb Pilot. I had integrated an AI meal estimation feature into PERT Pilot and created Carb Pilot specifically for AI-based meal estimation and custom macronutrient tracking. Same feature, but plucked out to its own app. While this feature was approved swiftly in PERT Pilot as an app revision, Carb Pilot repeatedly faced rejections due to the reviewer testing it with non-food items. Same code as PERT Pilot, but obviously a different reviewer and this was the first version submitted. Eventually, I implemented enough additional error handling to ensure the user (or reviewer, in this case) entered valid meal information, including a meal name and a relevant description. If incorrect data was entered (identified by the API returning zero macronutrient counts), the app would alert users. After addressing these edge cases through several rounds of revisions, the app was finally approved. It might have been faster with a different reviewer, but it did ultimately make the app more resilient to unintended or unexpected user inputs.

Other than this instance, submitting to the App Store was straightforward: it was always clear what stage of the process my app was in, and the reviewer feedback was reasonable.

(Note that some features like HealthKit or audio have to be tested on physical devices, because these features aren’t available in the simulator. Depending on your app’s functionality, you’ll want to test with both the simulator and a physical iOS device. Otherwise, you don’t need access to a physical device for testing.)

Being new to Android app development

In contrast, developing for Android was more challenging. I decided to create an Android version of PERT Pilot after receiving several requests. However, this effort took nearly two years and four separate attempts to even get a test version built. I flopped at the same stage three times in a row, even with LLM (AI) assistance in trying to debug the problem.

Despite assistance from language models (LLMs), I initially struggled to create a functional Android app from scratch. Android Studio uses multiple nested folder structures with Kotlin (.kt) files and separate XML files. The XML files handle layout design, while Kotlin files manage functionality and logic, unlike iOS development, which primarily consolidates both into fewer files or at least consistently uses a single language. Determining when and where to code specific features was confusing. (This is probably easier in 2025 with the advent of agent and IDE-integrated LLM tools! My attempts were with chat-based LLMs that could not access my code directly or see my IDE, circa 2023 and 2024.)

Additionally, Android development involves a project-wide “gradle” file that handles various settings. Changes made to this file require manually triggering a synchronization process. Experienced Android developers might find this trivial, but it is unintuitive for newcomers to locate both the synchronization warnings and the sync button. If synchronization isn’t performed, changes cannot be tested, causing blocks in development.

Dependency management also posed difficulties, and that plus the gradle confusion is what caused my issues on three different attempts. Initially, dependencies provided by the LLM were formatted incorrectly, breaking the build. Eventually (fourth time was the charm!), I discovered there are two separate gradle files, and pasting dependencies correctly and synchronizing appropriately resolved these issues. This was partly user error: I kept thrashing around with the LLM trying to solve the dependency formatting, and finally on the fourth attempt realized it was giving me dependencies in a different language/format than the default Android Studio gradle file expected, even though I had set up Android Studio’s project to match the LLM’s approach. It was like giving Android Studio Chinese characters to work with when it was expecting French. Still, this issue significantly impacted my development experience, and it was not intuitive to resolve within Android Studio even with LLM help. But I finally got past that to a basic working prototype that could build in the simulator!

I know Android has different conventions than iOS, so I then had to do some research to figure out which gestures were different (since I’m not an Android user), as well as some user research. We switched from swiping to long-pressing on things to show menu options for repeating/editing/deleting meals, etc. That was pretty easy to swap out, as were most of the other cosmetic aspects of building PERT Pilot for Android.

Most of the heartache came down to the setup of the project and then the exporting and deploying to get it to the Play Store for testing and distribution.

Setting up a Google Play developer account was quick and straightforward, despite needing to upload identification documents for approval, which took a day to get verified. There’s a one-time cost ($25) for creating the developer account, which is a lot cheaper than the yearly fee for Apple ($99/year). But remember, as described above and below, that you’re paying with your time instead of money, in terms of a less intuitive IDE and web interface for moving forward with testing and deploying to production.

Also, you have to have hands-on access to a physical Android device. I have an old phone that I was able to use for this purpose. You only have to do this once during the account creation/approval process, so you may be able to use a friend’s device (involves scanning QR code and being logged in), but this is a little bit of a pain if you don’t have a modern physical Android device.

I found navigating the Play Store developer console more complicated than Apple’s, specifically when determining the correct processes for uploading test versions and managing testers. Google requires at least 12 testers over a two-week testing period before allowing production access. Interestingly, it’s apparently pretty common to get denied production access even after you have the stated minimum of 12 users; the reason given for rejection was uninformative, and once denied, you have a mandatory 14-day wait period before you can apply again. I did some research and found that it’s probably because they want to see a lot of active use in that time frame, although they don’t say so. Instead of chasing other testers (people who would test for the sake of testing but not be people with EPI), I waited the 14 days and applied again, made it clear that people wouldn’t be using the app every day, otherwise left my answers the same… and this time lucked into approval. This meant I was allowed to submit for review for production access to the Play Store. I submitted… and was rejected, because there are rules that medical and medical education apps can only be distributed by developers tied to organizations that have a business number and have been approved. What?!

Apparently Google has a policy that medical “education” apps must be distributed by organizations with approved business credentials. The screenshots sent back to me seemed to flag the button I had on the home screen that described PERT, dosing PERT, and information about the app. I am an individual (not an organization or a nonprofit or a company) and I’m making this app available for free to help people, so I didn’t want to have to go chase down a nonprofit that might have Android developer credentials to tie my app to.

What I tried next was removing the button with the ‘education’ info, changing the tags on my app to fall under health & fitness rather than ‘medical’, and resubmitting. No other changes.

This time…it was accepted!

Phew.

iOS or Android: which was easier? A newbie's perspective on iOS and Android development and app deployment, a blog by Dana M. Lewis from DIYPS.org

TL;DR: as more and more people vibe code their way to having Android and/or iOS apps, it’s very feasible for people with less experience to build for both and to distribute apps on both platforms (the iOS App Store and the Google Play Store for Android). However, there’s a higher up-front cost for iOS ($99/year) but a slightly easier, more intuitive experience for deploying your apps and getting them reviewed and approved. Conversely, Android development, despite its lower entry cost ($25 once), involves navigating a more complicated development environment, less intuitive deployment processes, and opaque requirements for app approval. You pay with your time, but if you plan to eventually build multiple apps, once you figure it out you can repeat the process more easily. Both are viable paths for app distribution if you’re building iOS and Android apps in the LLM era of assisted coding, but don’t be surprised if you hit bumps in the road when deploying for testing or production.

Which should you choose for your first app, iOS or Android? It depends on whether you have a fondness for the iOS or Android ecosystem; whether one is closer to development languages you already know; or whether one is easier to integrate with your LLM of choice. (I now have both working with Cursor, and both can also be pulled into the ChatGPT app.) Cost may be an issue if $99/year is out of reach as a recurring cost, but keep in mind that you’ll pay with your time for Android development even though it’s only a one-time $25 developer account setup fee. You also may want to think about whether your first app is a one-off or whether you might build more apps in the future, which may change the context for paying the Apple developer fee yearly. Given the requirements to test with a certain number of users for Play Store access, it’s easier to go from testing to production/store publication with Apple than it is with Google, which might factor into subsequent app and platform decisions, too.

| | iOS | Android |
|---|---|---|
| Creating a developer account | Better | Takes more time (ID verification), one-time $25 fee, requires physical device access |
| Fees/costs | $99/year | Better: one-time $25 fee for account creation |
| IDE | Better | More challenging, with different languages/files and required gradle syncing |
| Physical device access required | No (unless you need to test integrations like HealthKit, audio input, exporting files, or sending emails) | Yes, as part of the account setup, but you could borrow someone’s phone to accomplish this |
| Getting your app to the web for testing | Pretty clear once you realize you have to “archive” your app from Xcode, which pops up a window that then guides you through sending to TestFlight. (Whether or not you actually test in TestFlight, you can then add that build to submit for review.) Hiccups occasionally if Apple requires you to sign new agreements in the web interface (watch for email notifications, and if you get errors about your account not being correct but haven’t changed which account you’re logged into in Xcode, check the Apple developer account page on the web, accept agreements, and try again to archive in Xcode; it should clear that error and proceed). | A little more complicated: generating signed bundles, finding where that file was saved on your computer, then dragging and dropping or attaching it and submitting for testing. Also more challenging to manage adding testers and facilitate access to test. |
| Submitting for approval/production access | Better; easy to see what stage of review your app is in. | Challenging to navigate where/how to do this in the web interface the first time, and Google has obtuse, unstated requirements about app usage during testing. Expect to be rejected the first time (or more) and have to wait 14 days to resubmit. |
| Distribution once live on the store | Same | Same |

 

Piecing together your priorities when your pieces keep changing

When dealing with chronic illnesses, it sometimes feels like you have less energy or time in the day to work with than someone without chronic diseases. The “spoon theory” is a helpful analogy to illustrate this. In spoon theory, each person has a certain number of “spoons” representing their daily energy available for tasks, including activities of daily living, recreation or physical activity, work, etc. For example, an average person might have 10 spoons per day, using just one spoon for daily tasks. However, someone with chronic illness may start with only 8 spoons and require 2-3 spoons for the same daily tasks, leaving them with fewer spoons for other activities.

I’ve been thinking about this differently lately. My priorities on a daily basis are mixed between activities of daily living (which includes things like eating, managing diabetes stuff like changing my pump site or CGM, etc.); exercise or physical activity like walking, cross-country skiing (in winter), or hiking (at other times of the year); and “work”. (“Work” for me is a mix of funded projects and my ongoing history of unfunded projects that move the needle, such as developing the world’s first app for exocrine pancreatic insufficiency, developing a symptom score and validating it through research, or OpenAPS, to name a few.)

A raccoon juggles three spoons.

As things change in my body (I have several autoimmune diseases and have gained more over the years), my ‘budget’ on any given day has changed, and so have my priorities. During times when I feel like I’m struggling to get everything done that I want to prioritize, it sometimes feels like I don’t have enough energy to do it all, compared to other times when I’ve had sufficient energy to do the same amount of daily activities, with extra energy left over. (Sometimes I feel like a raccoon juggling three spoons of different weights.)

In my head, I can think about how the relative amounts of energy or time (these are not always identical variables) are shaped differently or take up different amounts of space in a given day, which only has 24 hours. It’s a fixed budget.

I visualize activities of daily living as the smallest amount of time, but it’s not insignificant. It’s less than the amount of time I want to spend on work/projects, and my physical activity/recreation also takes up quite a bit of space. (Note: this isn’t going to be true for everyone, but remember for me I like ultrarunning for context!)

ADLs are green, work/projects are purple, and physical activity is blue:

Example of two blocks stacked on each other (green), four blocks in an l shape (purple), three blocks in a corner shape (blue)

They almost look like Tetris pieces, don’t they? Imagine all the ways they can fit together. But we have a fixed budget, remember – only 24 hours in the day – so to me they become Tangram puzzle pieces and it’s a question every day of how I’m going to construct my day to fit everything in as best as possible.

Preferably, I want to fit EVERYTHING in. I want to use up all available time and perfectly match my energy to it. Luckily, there are a number of ways these pieces fit together. For example, check out these different variations:

8 squares with different color combinations with a double block, an l shaped block, and a corner (three pieces) block. All squares are completely full, but in different combinations/layouts of the blocks

But sometimes even this feels impossible, and I’m left feeling like I can’t quite perfectly line everything up and things are getting dropped.

Example of a square where the blocks don't all fit inside the square.

It’s important to remember that even if the total amount of time is “a lot”, it doesn’t have to be done all at once. Historically, a lot of us might work 8-hour days (or longer). For those of us with desk jobs, we sometimes have options to split this up: for example, working a few hours and then taking a lunch break, or going for a walk / hitting the gym, then returning to work. Instead of a static 9-5, it may look like 8-11:30, 1:30-4:30, 8-9:30.

The same is true for other blocks of time, too, such as activities of daily living: they’re usually not all in one block of time, but often at least two (waking up and going to bed) plus sprinkled throughout the day.

In other words, it’s helpful to recognize that these big “blocks” can be broken down into smaller subunits:

Tangram puzzle pieces of different shapes, shown close up.

And from there… we have a lot more possibilities for how we might fit “everything” (or our biggest priorities) into a day:

Showing full blocks filled with individual blocks, sometimes linked but in different shapes than the L and corner shapes from before.

For me, these new blocks are more common. Sometimes I have my most typical day, with a solid block of exercise and work just how I’d prefer them (top left). Other times, I have less exercise and several work blocks in a day (top right). Other days, I don’t have energy for physical activity, activities of daily living take more energy or involve more tasks, and I don’t have quite as much time for longer work sections (bottom left). There are also non-work days where I prioritize getting as much activity as I can (bottom right!). But in general, the point of this is that instead of thinking about the way we USED to do things or thinking we SHOULD do things a certain way, we should think about what needs to be done; the minimum of how it needs to be done; and think creatively about how we CAN accomplish these tasks, goals, and priorities.

A useful trigger phrase to watch for is “I should ______”. Stop and ask yourself: should, according to what or who? Is it actually a requirement? Is the requirement about exactly how you do it, or is it about the end state?

“I should work 8 hours a day” doesn’t mean (in all cases) that you have to do it 8 straight hours in a row, other than a lunch break.

If you find yourself should-ing, try changing the wording of your sentence, from “I should do X” to “I want to do X because Y”. It helps you figure out what you’re trying to do and why (Y), which may help you realize that there are more ways (X or Z or A) to achieve it, so “X” isn’t the requirement you thought it was.

If you find yourself overwhelmed because it feels like you have one big block of a task that you need to do, it also helps to break it down into steps. Start small, as small as opening a document and writing down what you need to do.

A recent trick that is working well for me is putting the item “start writing prompt for (project X)” on my to-do list. I don’t have to run the prompt; I don’t have to read the output then; I don’t have to do the next steps after that… I only have to start writing the prompt. It turns out that writing the prompt for an LLM helps me organize my thoughts in a way that makes the subsequent steps easier and clearer, and I often then bridge into completing several of those follow-up tasks! (More tips about starting that one small step here.)

The TL;DR, perhaps, is that while we might yearn to fit everything in perfectly and optimize it all, it’s not always going to turn out like that. Our priorities change, our energy availability changes (due to health or kids’ schedules or other life priorities), and if we strive to be more flexible we will find more options for trying to fit it all in.

Sometimes we can’t, but sometimes breaking things down can help us get closer.

Showing how the blocks on the left have fixed shapes and have certain combinations, then an arrow to the right with example blocks using the individual unit blocks rather than the fixed shapes, so the blocks look very different but are all filled, also.

Scale yourself

One of the things I wish people would consider more often when thinking about AI is how they can use it to scale themselves. What are some time-consuming things that they currently have to do themselves that AI could do for them, to streamline their output and increase their productivity? Productivity here means more time to do the things only they can do, the things they want to do, or the things they love to do. (And to help them stop procrastinating on things they have to do.)

I have a habit of trying to scale myself. These days, it’s often related to EPI (exocrine pancreatic insufficiency, which some areas of the world know by the acronym PEI). I developed a strong knowledge base first from personal experience, then by doing research – including a systematic review where I read hundreds, plural, of research papers on key topics related to design protocols and guidelines. As a result of both personal and research experience, I have a lot of knowledge. It gets tapped almost daily in the EPI support groups that I’m a part of.

Whenever I notice myself answering the same question repeatedly, I make a mental note of it. Eventually, if a topic comes up often enough, I turn my response into a blog post. This way, I can provide a well-structured, comprehensive answer with more time and context than a quick comment on social media allows – and with the ability to give the same, high quality answer to multiple people (and in some cases, hundreds or thousands of people rather than the few who might see the comment buried in a response thread).

A few examples of this include:

One of my favorite things with this approach is then seeing other people begin to share the links to my longer-form content to help answer common questions. Writing things down in a shareable way also enables and supports other people to scale your work by sharing it easily. This has started to happen more and more with the elastase blog post, in part because there are so few resources that cover this information all in one place.

For me, I lean toward writing, but for other people that could be videos, podcast/audio recording, or other formats that can capture things you know and make them shareable, thus scaling yourself.

For me, this approach of “scaling myself” and thinking about longer form content to post online instead of re-typing similar answers over and over again isn’t unique to EPI.

I have been doing this for over a decade. I developed this pattern early after we developed and shared OpenAPS (the first open source automated insulin delivery algorithm) with the world. Early on, I found myself answering the same technical questions repeatedly in online discussions, with the same answers. Typing out explanations on my phone was inefficient, and if one person had a question, others likely had the same one. Instead of repeating myself, I took the time to document answers. I would often pause, write up the information in the documentation, and share that instead. Finding and sharing a link was easier and quicker than retyping a long answer to an already-answered question, so I was more willing to do it instead of delaying what I was doing in real life. Over time, I had to do less one-off typing on my phone (and could save that time and energy for truly unique, one-off questions) and could share links with a lot more information more easily.

How do I use AI to scale this type of work?

A lot of the above tasks are related to writing. There are different ways you can use AI for writing, without having it write something completely. You can give it notes – whether you type or voice dictate them – and have it clean up your notes, so you can focus on thinking and not about typing or fixing typos that break your flow. You can have it convert the notes into full sentences. You can ask it to write a paragraph or an article based on the notes. You can ask it to suggest wording for a particular sentence you want to clarify for your audience.

If you think about the AI as an intern and/or a partner/collaborator who you would ask to review or edit for you, you’ll likely find even more ways to integrate AI into different parts of your writing process, even if it’s not doing the full writing for you.

I have also tried to task the AI with writing for me, with mixed results. This doesn’t mean I don’t use it, but I’ve been practicing and learning where it generates usable content and where it doesn’t.

A lot of it depends on the prompt and the topic, as much as it does on the desired output in terms of style, length, intended audience, etc.

If it’s a topic that’s “known”, it can write more content that I can take and edit and transform, as opposed to when I am trying to write about a concept that is far from the current knowledge base. (I mean far for both humans and AI – a lot of my work is bleeding edge, pushing fields toward new developments and leading humans there.) Sometimes I ask it to write something and end up using none of the content, but by saying “ugh, no” my brain has jumped to “it should really say…”, and I am able to more quickly springboard into manually writing the content I was previously slow on. In other words, it can be a brainstorming tool in the opposite sense, showing me what I do not want to say on a topic! And on some of my frontier/bleeding edge topics, it reflects what is commonly ‘known’, and when what is ‘known’ is now wrong (for example, as always, the common but incorrect claim that chronic pancreatitis is the most common cause of EPI), it helps me more clearly distinguish the new content from the old, wrong, or misinformed.

(Also, it’s worth reminding you of what I have to remind myself: AI is changing constantly, and new tools override what is known about which tasks they do and don’t do well! For example, in between writing this and posting it, OpenAI released GPT-4.5, which is reportedly better at writing-related tasks than GPT-4o and other older models. I’ll have to test it and see if that’s true and for what kinds of writing tasks!)

This isn’t the only way you can scale yourself with AI, though. Scaling yourself doesn’t have to be limited to writing and documentation style tasks. AI and other tools can help with many tasks (more examples here and here), such as:

  • Cleaning and transforming data into different formats
  • Converting a CSV file into a more readable table
  • Writing code to automate tedious data processing
  • Drafting plain-language instructions for engineers or programmers
  • Checking whether instructions or explanations are clear and understandable, and identifying any gaps in logic that you missed on your first pass

By leveraging AI and other automation tools, you can free up time and energy for higher-value work: the things you are uniquely suited to do in the world, and the things that you want or love to do. And do them more easily!
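As one concrete (hypothetical) illustration of the “CSV file into a more readable table” item above, here is a minimal sketch of a script an LLM could write for you; the filename and columns are made up, and the details would depend on your data.

```python
# A minimal sketch: convert a CSV file into a readable, markdown-style table
# you can paste elsewhere. The filename and columns are hypothetical.
import csv

with open("results.csv") as f:
    rows = list(csv.reader(f))

header, data = rows[0], rows[1:]
widths = [max(len(row[i]) for row in rows) for i in range(len(header))]

def fmt(row):
    return "| " + " | ".join(cell.ljust(w) for cell, w in zip(row, widths)) + " |"

print(fmt(header))
print("|" + "|".join("-" * (w + 2) for w in widths) + "|")
for row in data:
    print(fmt(row))
```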

Pro tip: if you find yourself procrastinating a task, this may be a good sign that you could use AI for some of it. 

I’m trying to use noticing procrastination as a trigger for considering AI for a task.

An example of this is an upcoming post with a bunch of math and a meaty cost analysis that I originally did by hand. I needed (wanted) to re-do these estimates with different numbers, but having to carefully re-do all the estimates and replace them throughout the blog post seemed tedious, so my brain wanted to procrastinate. So, I took the blog post and dumped it in with a prompt asking it to write Jupyter Notebook code to replicate the analyses explained in the plain-language post, with the ability to adjust all input variables and see the results in a table so I could compare the original and updated numbers. It took less than 1 minute to generate this code and about 5 minutes for me to copy/paste it, update the numbers, run it, evaluate the output, and decide what to update in the post. Doing this by hand would’ve taken 30-60 minutes, due to needing to manually check my work and trace it throughout the post. Instead, this automated the tedious bit and will result in the new post coming out next week rather than weeks from now. (Read about it here – it’s an analysis of how cost-effective Life for a Child is, a charity supporting people living with diabetes in low- and middle-income countries that can use your help to save lives.)
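The pattern is simple even though my actual numbers and formulas live in that post: put every input into variables, compute the estimates in one place, and print a comparison table so you can swap numbers and re-run. Here is a minimal sketch of that pattern with entirely made-up numbers and formulas (not the actual analysis).

```python
# A sketch of the "re-run the estimates with new inputs" pattern described
# above. The formula and every number here are made up for illustration.
def estimate(cost_per_person_per_year, people_supported, years):
    total_cost = cost_per_person_per_year * people_supported * years
    cost_per_person = cost_per_person_per_year * years
    return {"total cost": total_cost, "cost per person": cost_per_person}

scenarios = {
    "original inputs": estimate(cost_per_person_per_year=100, people_supported=1000, years=10),
    "updated inputs": estimate(cost_per_person_per_year=150, people_supported=1500, years=10),
}

# Print a simple side-by-side comparison table.
metrics = list(next(iter(scenarios.values())))
print("metric".ljust(18) + "".join(name.ljust(18) for name in scenarios))
for metric in metrics:
    row = "".join(f"${scenarios[name][metric]:,.0f}".ljust(18) for name in scenarios)
    print(metric.ljust(18) + row)
```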

Scale yourself: automate more, so you can handle what matters, a blog by Dana M. Lewis from DIYPS.org

I encourage you to think about scaling yourself and identifying a task or series of tasks where you can get in the habit of leveraging these tools to do so. Like most things, the first time or two might take a little more time. But once you figure out what tasks or projects are suited for this, the time savings escalate. Just like learning how to use any new software, tool, or approach. A little bit of invested time up front will likely save you a lot of time in the future.

You Can Create Your Own Icons (and animated gifs)

Over the years, I’ve experimented with different tools for making visuals. Some of them are just images but in the last several years I’ve made more animations, too.

But not with any fancy design program or purpose built tool. Instead, I use PowerPoint.

Making animated gifs

I first started using PowerPoint to create gifs around 2018 or 2019. At the time, PowerPoint didn’t have a built-in option to export directly to GIF format, so I had to export animations as a movie file first and then use an online converter to turn them into a GIF. Fortunately, in recent years, PowerPoint has added a direct “Export as GIF” feature.

The process of making an animated GIF in PowerPoint is similar to adding animations or transitions in a slide deck for a presentation. I’ve used this for various projects, including:

Am I especially trained? No. Do I feel like I have design skills? No.

Elbow grease and determination to try is what I have, with the goal of trying to use visuals to convey information as a summary or to illustrate a key point to accompany written text. (I also have a tendency to want to be a perfectionist, and I have to consciously let that go and let “anything is better than nothing” guide my attempts.)

Making icons is possible, too

Beyond animations, I’ve also used PowerPoint to create icons and simple logo designs.

I ended up making the logos for Carb Pilot (a free iOS app that enables you to track the macronutrients of your choice) and PERT Pilot (a free iOS app that enables people with exocrine pancreatic insufficiency, known as EPI or PEI, to track their enzyme intake) using PowerPoint.

This, and ongoing use of LLMs to help me with coding projects like these apps, is what led me to the realization that I can now make icons, too.

I was working to add a widget to Carb Pilot, so that users can enter meals more quickly from the home screen without having to open the app and then tap; this saves a click every time. I went from having it be a single button to having 4 buttons to mirror the Carb Pilot home screen. For the “saved meals” button, I wanted a list icon, to indicate the list of previous meals. I went to SF Symbols, Apple’s icon library, picked out the list icon I wanted to use, and referenced it in Xcode. It worked, but it lacked something.

A light purple iOS widget with four buttons: top left is blue and says AI; top right is purple with a white microphone icon; bottom left is periwinkle blue with a white plus sign icon; bottom right is bright green with a custom list icon, where instead of bullets the three items are apple, cupcake, and banana mini-icons.

It occurred to me that maybe I could tweak it somehow and make the bullets of the list represent food items. I wasn’t sure how, so I asked the LLM if it was possible. Because I’ve done my other ‘design’ work in PowerPoint, I went there and quickly dropped some shapes and lines to simulate the icon, then tested exporting – yes, you can export as SVG! I spent a few more minutes tweaking versions of it and exporting them. It turns out that although you can export as SVG, the way I designed it wasn’t really suited for SVG use: when I dropped the SVG into Xcode, it didn’t show up. I asked the LLM again and it suggested trying PNG format. I exported the icon from PowerPoint as PNG, dropped it into Xcode, and it worked!

(That was a good reminder that even when you use the “right” format, you may need to experiment to see what actually works in practice with whatever tools you’re using, and not let the first failure be a sign that it can’t work.)

Use What Works

There’s a theme you’ll be hearing from me: try and see what works. Just try. You don’t know if you don’t try. With LLMs and other types of AI, we have more opportunities to try new and different things that we may not have known how to do before. From coding your own apps to doing data science to designing custom icons, these are all things I didn’t know how to do before but now I can. A good approach is to experiment, try different things (and different prompts), and not be afraid to use “nontraditional” tools for projects, creative or otherwise. If it works, it works!

The prompt matters when using Large Language Models (LLMs) and AI in healthcare

I see more and more research papers coming out these days about different uses of large language models (LLMs, a type of AI) in healthcare. There are papers evaluating them for supporting clinicians in decision-making, aiding in note-taking and improving clinical documentation, and enhancing patient education. But I see a widespread trend in the titles and conclusions of these papers, exacerbated by media headlines, of making sweeping claims about the performance of one model versus another. I challenge everyone to pause and consider a critical fact that is less obvious: the prompt matters just as much as the model.

As an example of this, I will link to a recent research article I worked on with Liz Salmi (published article here; pre-print here).

Liz nerd-sniped me with an idea for a study to have a patient and a neuro-oncologist evaluate LLM responses related to patient-generated queries about a chart note (or visit note or open note or clinical note, whatever you want to call it). I say nerd-sniped because I got very interested in designing the methods of the study, including making sure we used the APIs to model these ‘chat’ sessions so that the prompts were not influenced by custom instructions, ‘memory’ features within the account or chat sessions, etc. I also wanted to test something I’ve observed anecdotally from personal LLM use across other topics, which is that with 2024-era models the prompt matters a lot for what type of output you get. So that’s the study we designed and wrote with Jennifer Clarke, Zhiyong Dong, Rudy Fischmann, Emily McIntosh, Chethan Sarabu, and Catherine (Cait) DesRoches. I encourage you to check out the article (pre-print here) and enjoy the methods section, which is critical for understanding the point I’m trying to make here.

In this study, the data showed that when LLM outputs were evaluated for a healthcare task, the results varied significantly depending not just on the model but also on how the task was presented (the prompt). Specifically, persona-based prompts—designed to reflect the perspectives of different end users like clinicians and patients—yielded better results, as independently graded by both an oncologist and a patient.

The Myth of the “Best Model for the Job”

Many research papers conclude with simplified takeaways: Model A is better than Model B for healthcare tasks. While performance benchmarking is important, this approach often oversimplifies reality. Healthcare tasks are rarely monolithic. There’s a difference between summarizing patient education materials, drafting clinical notes, and assisting with complex differential diagnosis tasks.

But even within a single task, the way you frame the prompt makes a profound difference.

Consider these three prompts for the same task:

  • “Explain the treatment options for early-stage breast cancer.”
  • “You’re an oncologist. Explain the treatment options for early-stage breast cancer.”
  • “You’re an oncologist. Explain the treatment options for early-stage breast cancer as you would to a newly diagnosed patient with no medical background.”

The second and third prompts likely result in more accessible and tailored responses. If a study only tests general prompts (e.g. prompt one), it may fail to capture how much more effective an LLM can be with task-specific guidance.
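One way to see this for yourself is to send the same task with each prompt variant through an API call, so that account-level custom instructions and chat ‘memory’ don’t influence the output (the same general reason we used the APIs in the study). Below is a minimal sketch of that comparison; it is illustrative only, not the study’s actual code, and the model name is a placeholder.

```python
# A minimal sketch (not the study's code) of comparing prompt variants via the
# API, so custom instructions and chat memory don't influence the responses.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

prompts = [
    "Explain the treatment options for early-stage breast cancer.",
    "You're an oncologist. Explain the treatment options for early-stage breast cancer.",
    ("You're an oncologist. Explain the treatment options for early-stage breast cancer "
     "as you would to a newly diagnosed patient with no medical background."),
]

for i, prompt in enumerate(prompts, start=1):
    response = client.chat.completions.create(
        model="gpt-4",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # reduce run-to-run variation when comparing prompts
    )
    print(f"--- Prompt {i} ---\n{response.choices[0].message.content}\n")
```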

Why Prompting Matters in Healthcare Tasks

Prompting shapes how the model interprets the task and generates its output. Here’s why it matters:

  • Precision and Clarity: A vague prompt may yield vague results. A precise prompt clarifies the goal and the speaker (e.g. in prompt 2), and also often the audience (e.g. in prompt 3).
  • Task Alignment: Complex medical topics often require different approaches depending on the user—whether it’s a clinician, a patient, or a researcher.
  • Bias and Quality Control: Poorly constructed prompts can inadvertently introduce biases.

Selecting a Model for a Task? Test Multiple Prompts

When evaluating LLMs for healthcare tasks—or applying insights from a research paper—consider these principles:

  1. Prompt Variation Matters: If an LLM fails on a task, it may not be the model’s fault. Try adjusting your prompts before concluding the model is ineffective, and avoid broad sweeping claims about a field or topic that aren’t supported by the test you are running.
  2. Multiple Dimensions of Performance: Look beyond binary “good” vs. “bad” evaluations. Consider dimensions like readability, clinical accuracy, and alignment with user needs, as an example when thinking about performance in healthcare. In our paper, we saw some cases where a patient and provider overlapped in ratings, and other places where the ratings were different.
  3. Reproducibility and Transparency: If a study doesn’t disclose how prompts were designed or varied, its conclusions may lack context. Reproducibility in AI studies depends not just on the model, but on the interaction between the task, model, and prompt design. You should be looking for these kinds of details when reading or peer reviewing papers. Take results and conclusions with a grain of salt if these methods are not detailed in the paper.
  4. Involve Stakeholders in Evaluation: As shown in the preprint mentioned earlier, involving both clinical experts and patients in evaluating LLM outputs adds critical perspectives often missing in standard evaluations, especially as we evolve to focus research on supporting patient needs and not simply focusing on clinician and healthcare system usage of AI.

What This Means for Healthcare Providers, Researchers, and Patients

  • For healthcare providers, understand that the way you frame a question can improve the usefulness of AI tools in practice. A carefully constructed prompt, adding a persona or requesting information for a specific audience, can change the output.
  • For researchers, especially those developing or evaluating AI models, it’s essential to test prompts across different task types and end-user needs. Transparent reporting on prompt strategies strengthens the reliability of your findings.
  • For patients, recognize that AI-generated health information is shaped by both the model and the prompt. This can support critical thinking when interpreting AI-driven health advice. Remember that LLMs can be biased, but so can humans in healthcare. The same approach for assessing bias and evaluating experiences in healthcare should be used for LLM output as well as human output. Everyone (humans) and everything (LLMs) are capable of bias or errors in healthcare.

Prompts matter, so consider model type as well as the prompt as a factor in assessing LLMs in healthcare. Blog by Dana M. Lewis.

TL;DR: Instead of asking “Which model is best?”, a better question might be:

“How do we design and evaluate prompts that lead to the most reliable, useful results for this specific task and audience?”

I’ve observed, and this study adds evidence, that prompt interaction with the model matters.

A Slackbot for using Slack to access and use a chat-based LLM in public

I’ve been thinking a lot about how to help my family, friends, and colleagues use LLMs to power their work. (As I’ve written about here, and more recently here with lots of tips on prompting and effectively using LLMs for different kinds of projects). 

Scott has been on the same page, especially thinking about how to help colleagues use LLMs effectively, but taking a slightly different approach: he built a Slackbot (a bot for Slack) which uses GPT-3.5 and GPT-4 to answer questions. This uses the GPT API but presents it to the user in Slack, instead of requiring ChatGPT as the chat interface. So, it’s an LLM chatbot, different from ChatGPT (because it’s a different chat interface), but it uses the same AI (GPT-3.5 and GPT-4 from OpenAI). You could implement the same idea (a chat-based bot in Slack) using different AIs/LLMs, of course.
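The core pattern is small: listen for messages that mention the bot, forward the text (plus conversation history, in the full version) to the LLM API, and post the reply back into Slack. Here is a minimal sketch of that pattern in Python using slack_bolt and the OpenAI client; it is not the actual SlackAskBot code (that repository is linked at the end of this post), and the tokens and model name are placeholders.

```python
# A minimal sketch of a Slack bot that forwards mentions to an LLM API and
# posts the reply back in a thread. Not the actual SlackAskBot implementation;
# tokens and the model name are placeholders.
import os

from openai import OpenAI
from slack_bolt import App
from slack_bolt.adapter.socket_mode import SocketModeHandler

llm = OpenAI()  # assumes OPENAI_API_KEY is set
app = App(token=os.environ["SLACK_BOT_TOKEN"])

@app.event("app_mention")
def answer_mention(event, say):
    # Send the user's message text to the LLM and reply in the same thread.
    completion = llm.chat.completions.create(
        model="gpt-4",  # placeholder; the original used GPT-3.5 and GPT-4
        messages=[{"role": "user", "content": event["text"]}],
    )
    say(text=completion.choices[0].message.content, thread_ts=event["ts"])

if __name__ == "__main__":
    SocketModeHandler(app, os.environ["SLACK_APP_TOKEN"]).start()
```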

Using a Slack-based bot for an LLM achieves a few things:

  1. More people can try GPT-4 and compare it to GPT-3.5 to get a taste for prompting and responses, without having to pay $20/month for a ChatGPT Pro account to get access to GPT-4.
  2. If you spend a lot of time in Slack for work, you don’t have to switch windows to access GPT.
  3. If your employer doesn’t want you to use the ChatGPT consumer product for work due to security concerns, but is more comfortable with OpenAI’s confidentiality guarantees for their API, a Slack bot interface provides a more secure alternative.
  4. You can see “in public” how other people in your Slack workspace are using it, and learn from their prompts and interactions new tricks and ideas (in addition to reading my long post here with lots of tips and suggestions).
  5. You can privately (in DM by yourself) talk to the bot and ask it your “dumb” questions (which aren’t dumb), and you can reason through things at your own speed or learn how to do things that you haven’t done before, and do so in private just like if you had your own ChatGPT account. It doesn’t burn social capital or take other people’s time, which you can save for more nuanced questions.

Scott has implemented it for work, but also came up with another fun use for it: we created a family-focused Slack channel for kids in our lives (we have ten (!) nieces and nephews – you can see how I used GPT to help with special custom books for some of them recently) so they and their parents can access GPT in a safe environment, too. ChatGPT (and OpenAI’s GPT API) already has some reasonable safeguards built in to protect kids, but more usefully we want to be able to provide custom instructions to keep it always at a kid-focused level. Plus, this way we can share the experience with them, too.

I asked the bot to summarize itself for a lay audience not familiar with the project, and this was the output:

### Summary of the SlackAskBot

The SlackAskBot is a smart assistant designed to operate within Slack, a popular communication platform used by teams for messaging and collaboration. Imagine having a helpful robot in your group chats or direct messages that can understand your questions and provide informative responses. That’s what the SlackAskBot does. It’s like adding a knowledgeable team member to your Slack workspace who is always ready to assist.

#### How It Works:

  1. **Integration with Slack:**

– The bot is integrated into Slack, meaning it can read messages in channels it’s added to or when directly messaged. It’s like inviting a friend into your group chat who can listen and respond to conversations.

  2. **Understanding Messages:**

– When you type a message that includes the bot (either by mentioning it with an @ symbol in a channel or by sending it a direct message), the bot pays attention. It’s designed to understand that you’re talking to it and that you’re expecting some form of response.

  3. **Processing Your Requests:**

– Once the bot receives a message, it doesn’t just look at the latest message you’ve sent. It considers the context by looking at the conversation history. This way, it can provide responses that make sense within the ongoing discussion. It’s akin to a human joining a conversation, where understanding the previous discussion points is crucial for making relevant comments.

  4. **Generating Responses:**

– The bot uses a powerful language model (think of it as a very advanced AI that understands and generates human-like text) to come up with responses to your messages. It’s as if the bot thinks about what you’ve asked and then crafts a reply that aims to be helpful and informative.

  5. **Interacting Back:**

– After thinking through your message and generating a response, the bot then sends this response back into the Slack channel or direct message. You’ll see its reply just like you would from any other team member, providing the information or assistance you requested.

  6. **Special Features:**

– The bot is designed with specific settings for different channels or direct messages. This means it can tailor its responses based on where it’s talking. For example, it might respond differently in a casual chat compared to a more formal project discussion channel.

  7. **Respecting Privacy and Permissions:**

– The bot is programmed to respect Slack’s rules and the privacy of your conversations. It only listens and responds to messages where it’s explicitly mentioned or directly messaged, ensuring that it doesn’t intrude on private discussions.

#### In Summary:

The SlackAskBot acts as a virtual assistant within your Slack workspace, ready to provide information and answers to your questions. It’s designed to understand the context of conversations, generate helpful responses, and interact with users in a way that enhances team communication and productivity. Whether you’re asking for quick facts, seeking advice on a project, or just interacting for fun, the SlackAskBot is there to make your Slack experience more informative and engaging.


The SlackAskBot is open source – you can check out the code and instructions for how to implement it in your own Slack environment in the repository here.
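For readers curious what this general pattern looks like in code, here is a minimal sketch – not the actual SlackAskBot implementation – assuming the slack_bolt and openai Python packages, Socket Mode, and the usual Slack bot/app tokens; the model name, system prompt, and function name are placeholders:

```python
# Minimal sketch of the "LLM in Slack" pattern (not the actual SlackAskBot code).
# Assumes slack_bolt + openai are installed and SLACK_BOT_TOKEN, SLACK_APP_TOKEN,
# and OPENAI_API_KEY are set in the environment.
import os

from openai import OpenAI
from slack_bolt import App
from slack_bolt.adapter.socket_mode import SocketModeHandler

app = App(token=os.environ["SLACK_BOT_TOKEN"])
llm = OpenAI()  # reads OPENAI_API_KEY from the environment

@app.event("app_mention")
def answer_mention(event, say):
    # Send the user's message to the model, with a system prompt that could be
    # customized per channel (e.g. a kid-focused prompt for a family channel).
    response = llm.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are a helpful assistant for this Slack channel."},
            {"role": "user", "content": event["text"]},
        ],
    )
    # Reply in the thread where the bot was mentioned.
    say(text=response.choices[0].message.content, thread_ts=event.get("ts"))

if __name__ == "__main__":
    SocketModeHandler(app, os.environ["SLACK_APP_TOKEN"]).start()
```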

We are using Slack as a chat interface for LLMs like GPT, making the user experience similar to ChatGPT

Effective Pair Programming and Coding and Prompt Engineering and Writing with LLMs like ChatGPT and other AI tools

I’ve been puzzled when I see people online say that LLMs “don’t write good code”. In my experience, they do. But given that most of these LLMs are used in chatbot mode – meaning you chat and give the model instructions to generate the code – that might be where the disconnect lies. To get good code, you need effective prompting, and to prompt effectively, you need clear thinking and clear ideas about what you are trying to achieve and how.

My recipe and understanding is:

Clear thinking + clear communication of ideas/request = effective prompting => effective code and other outputs

It also involves understanding what these systems can and can’t do. For example, as I’ve written about before, they can’t “know” things (although they can increasingly look things up) and they can’t do “mental” math. But, they can generally repeat patterns of words to help you see what is known about a topic and they can write code that you can execute (or it can execute, depending on settings) to solve a math problem.

What the system does well is help code small chunks, walk you through processes to link these sections of code up, and help you implement them (if you ask for it). The smaller the task (ask), the more effective it is – and the easier it is for you to see when it has completed the task and when it hasn’t been able to finish, whether due to limitations like response length limits; information falling out of the context window (what it knows from what you’ve told it); unclear prompting; and/or because you’re asking it to do things for which it doesn’t have expertise. Some of that last part – lack of expertise – can be improved with specific prompting techniques, and the same is true for right-sizing the task it’s focusing on.

Right-size the task by giving a clear ask

If I were to ask an LLM to write me code for an iOS app to do XYZ, it could write me some code, but it certainly wouldn’t (at this point in history, written in February 2024), write all code and give me a downloadable file that includes it all and the ability to simply run it. What it can do is start writing chunks and snippets of code for bits and pieces of files that I can take and place and build upon.

How do I know this? Because I made that mistake when trying to build my first iOS apps in April and May 2023 (last year). It can’t do that (and still can’t today; I repeated the experiment). I had zero ideas how to build an iOS app; I had a sense that it involved XCode and pushing to the Apple iOS App Store, and that I needed “Swift” as the programming language. Luckily, though, I had a much stronger sense of how I wanted to structure the app user experience and what the app needed to do.

I followed the following steps:

  1. First, I initiated chat as a complete novice app builder. I told it I was new to building iOS apps and wanted to use XCode. I had XCode downloaded, but that was it. I told it to give me step by step instructions for opening XCode and setting up a project. Success! That was effective.
  2. I opened a different chat window after that, to start a new chat. I told it that it was an expert in iOS programming using Swift and XCode. Then I described the app that I wanted to build, said where I was in the process (e.g. had opened and started a project in XCode but had no code yet), and asked it for code to put on the home screen so I could build and open the app and it would have content on the home screen. Success!
  3. From there, I was able to stay in the same chat window and ask it for pieces at a time. I wanted to have a new user complete an onboarding flow the very first time they opened the app. I explained the number of screens and content I wanted on those screens; the chat was able to generate code, tell me how to create that in a file, and how to write code that would trigger this only for new users. Success!
  4. I was able to then add buttons to the home screen; have those buttons open new screens of the app; add navigation back to the home; etc. Success!
  5. (Rinse and repeat, continuing until all of the functionality was built out a step at a time).

To someone with familiarity building and programming things, this probably follows a logical process of how you might build apps. If you’ve built iOS apps before and are an expert in Swift programming, you’re either not reading this blog post or are thinking I (the human) am dumb and inexperienced.

Inexperienced, yes, I was (in April 2023). But what I am trying to show here is that for someone new to a process and language, this is how we need to break down steps and work with LLMs: give them small tasks, and make sure we understand and implement the code produced before moving forward with a new task (ask). It takes these small building-block tasks to build up to a complete app with all the functionality that we want. Nowadays, even though I can whip up a prototype project and iOS app and deploy it to my phone within an hour (by working with an LLM as described above, but skipping some of the introductory set-up steps now that I have experience with those), I still follow the same general process: give the LLM the big picture, then efficiently ask it to code the pieces of the puzzle I want to create.

As the human, you need to be able to keep the big picture – full app purpose and functionality – in mind while subcontracting with the LLM to generate code for specific chunks of code to help achieve new functionality in our project.

In my experience, this is very much like pair programming with a human. In fact, this is exactly what we did when we built DIYPS over ten years ago (wow) and then OpenAPS within the following year. I’ve talked endlessly about how Scott and I would discuss an idea and agree on the big picture task; then I would direct the sub-tasks and asks that he – and later Ben and others – would code (at first because I didn’t have as much experience coding, and this was 10 years ago, without LLMs; I gradually took on more of those coding steps and roles as well). I was in charge of the big picture project and process and end goal; it didn’t matter who wrote which code or how; we worked together to achieve the intended end result. (And it worked amazingly well; here I am 10 years later still using DIYPS and OpenAPS, and tens of thousands of people globally are all using open source AID systems spun off of the algorithm we built through this process!)

Two purple boxes. The one on the left says "big picture project idea" and has a bunch of smaller size boxes within labeled LLM, attempting to show how an LLM can do small-size tasks within the scope of a bigger project that you direct it to do. On the right, the box simply says "finished project".

Today, I would say the same is true. It doesn’t matter – for my types of projects – if a human or an LLM “wrote” the code. What matters is: does it work as intended? Does it achieve the goal? Does it contribute to the goal of the project?

Coding can be done – often by anyone (human with relevant coding expertise) or anything (LLM with effective prompting) – for any purpose. The critical key is knowing what the purpose is of the project and keeping the coding heading in the direction of serving that purpose.

Tips for right-sizing the ask

  1. Consider using different chat windows for different purposes, rather than trying to do it all in one. Yes, context windows are getting bigger, but you’ll still likely benefit from giving different prompts in different windows (more on effective prompting below). Start with one window for getting set up with a project (e.g. how to get XCode on a Mac and start a project; what file structure to use for an app/project that will do XYZ; how to start a Jupyter notebook for doing data science with Python; etc.) and for brainstorming ideas to scope your project; then use a separate window for the series of coding sub-tasks (e.g. write code for the home page screen for your app; add a button that allows voice entry functionality; add in HealthKit permission functionality; etc.) that serves the big picture goal.
  2. Make a list for yourself of the steps needed to build a new piece of functionality for your project. If you know what the steps are, you can specifically ask the LLM for them. Again, use a separate window if you need to. For example, if you want to add the ability to save data to HealthKit from your app, you may start a new chat window that asks the LLM, generally, how one adds HealthKit functionality to an app. It’ll describe the process: certain settings that need to be configured in XCode for the project; code that prompts the user with the correct permissions; and then code that actually does the saving/updating to HealthKit.

    Make your list (by yourself or with help), then you can go ask the LLM to do those things in your coding/task window for your specific project. You can go set the settings in XCode yourself, and skip to asking it for the task you need it to do, e.g. “write code to prompt the user with HealthKit permissions when button X is clicked”.

    (Sure, you can do the ask for help in outlining steps in the same window that you’ve been prompting for coding sub-tasks, just be aware that the more you do this, the more quickly you’ll burn through your context window. Sometimes that’s ok, and you’ll get a feel for when to do a separate window with the more experience you get.)

  3. Pay attention as you go and see how much code it can generate and when it falls short of an ask. This will help you improve your success rate at asking in a way that it can fully complete the task in future asks. I observe that when I don’t know – due to my lack of expertise – the right size of a task, it’s more prone to give me ½-⅔ of the code and solution and to need additional prompting after that. Sometimes I ask it to continue where it cut off; other times I start implementing/working with the bits of code (the first ⅔) it gave me, and keep a mental or written note that this did not completely generate all steps/code for the functionality and that I need to come back. Part of why it is sometimes effective to get started with ⅔ of the code is that you’ll likely need to debug/test the first bit of code anyway. Sometimes when you paste in code it’s using methods that don’t match the version you’re targeting (e.g. functionality that is outdated as of iOS 15 when you’re targeting iOS 17 and newer) and it’ll flag a warning or block it from working until you fix it.

    Once you’ve debugged/tested as much as you can of the original ⅔ of code it gave you, you can prompt it to say “Ok, I’ve done X and Y. We were trying to (repeat initial instructions/prompt) – what are the remaining next steps? Please code that.” to go back and finish the remaining pieces of that functionality.

    (Note that saying “please code that” isn’t necessarily a good prompting technique; see below.)

    Again, much of this is paying attention to how the sub-task is getting done in service of the overall big picture goal of your project; or the chunk that you’ve been working on if you’re building new functionality. Keeping track with whatever method you prefer – in your head, a physical written list, a checklist digitally, or notes showing what you’ve done/not done – is helpful.

Most of the examples above are about coding, but I follow the same general process when writing research papers, blog posts, research protocols, etc. My point is that this works for all types of projects that you’d work on with an LLM, whether the intended output is code or human-focused language that you’d write or speak.

But whether you’re generating code or language, the other thing that makes a difference, in addition to right-sizing the task, is effective prompting. I’ve noticed that effective prompting has made the biggest difference in getting output that matches my expectations and my level of expertise. Conversely, I have actually peer reviewed papers for medical journals that do a horrifying job with prompting. You’ll hear people talk about “prompt engineering”, and this is what it is referring to: how do you engineer (write) a prompt to get the ideal response from the LLM?

Tips for effective prompting with an LLM

  1. Personas and roles can make a difference, both for you and for the LLM. What do I mean by this? Start your prompt by telling the LLM what perspective you want it to take. Without it, you’re going to make it guess what information and style of response you’re looking for. Here’s an example: if you ask it what causes cancer, it’s going to default to safety and give you a general-public answer about causes of cancer in very plain, lay language. Which may be fine. But if you’re looking to generate a better understanding of the causal mechanism of cancer, what is known, and what is not known, you will get better results if you prompt it with “You are an experienced medical oncologist” so it speaks from the generated perspective of that role. Similarly, you can tell it your role. Follow it with “Please describe the causal mechanisms of cancer and what is known and not known” and/or “I am also an experienced medical researcher, although not an oncologist” to help contextualize that you want a deeper, technical approach to the answer and not high-level plain language in the response. (If you work with a model through the API rather than the chat interface, the persona maps to a “system” message – see the sketch after this list.)

      Compare and contrast when you prompt the following:

      A. “What causes cancer?”

      B. “You are an experienced medical oncologist. What causes cancer? How would you explain this differently in lay language to a patient, and how would you explain this to another doctor who is not an oncologist?”

      C. “You are an experienced medical oncologist. Please describe the causal mechanisms of cancer and what is known and not known. I am also an experienced medical researcher, although not an oncologist.”

      You’ll likely get different types of answers, with some overlap between A and the first part of answer B, and a tiny bit of overlap between the latter half of answer B and answer C.

      I do the same kind of prompting with technical projects where I want code. Often, I will say “You are an expert data scientist with experience writing code in Python for a Jupyter Notebook” or “You are an AI programming assistant with expertise in building iOS apps using XCode and SwiftUI”. Those will then be followed with a brief description of my project (more on why this is brief below) and the first task I’m giving it.

      The same also goes for writing-related tasks; the persona I give it and/or the role I reference for myself makes a sizable difference in getting the quality of the output to match the style and quality I was seeking in a response.

  2. Be specific. Saying “please code that” or “please write that” might work sometimes, but more often than not it will get a less effective output than a more specific prompt. I am a literal person, so this is something I think about a lot: I’m always parsing what people say to me, because my instinct is to take their words literally, and I have to think through whether those words were intended literally or whether there is context that should make them less literal. Sometimes you’ll be thinking about something, start talking to someone about it, and they have no idea what on earth you’re talking about because the last part of your out-loud conversation with them was about a completely different topic!

    LLMs are the same as that confused conversational partner who doesn’t know what you’re thinking about. LLMs only know what you’ve recently told them (and they will ‘forget’ what you told them about a project more quickly than humans do). Remember the above tips about brainstorming and making a list of tasks for a project? Providing a description of the task along with the ask (e.g. “we are doing X related to the purpose of achieving Y, please code X”) will get you better output, more closely matching what you wanted, than saying “please code that”, where the LLM might code something else to achieve Y if you didn’t tell it you wanted to focus on X.

    I find this even more necessary with writing related projects. I often find I need to give it the persona “You are an expert medical researcher”, the project “we are writing a research paper for a medical journal”, the task “we need to write the methods section of the paper”, and a clear ask “please review the code and analyses and make an outline of the steps that we have completed in this process, with sufficient detail that we could later write a methods section of a research paper”. A follow up ask is then “please take this list and draft it into the methods section”. That process with all of that specific context gives better results than “write a methods section” or “write the methods” etc.

  3. Be willing to start over with a new window/chat. Sometimes the LLM can get itself lost in solving a sub-task and lose sight (as context falls out of the window) of the big picture of a project, and you’ll find yourself having to repeat over and over again what you’re asking it to do. Don’t be afraid to cut your losses and start a new chat for a sub-task that you’ve been stuck on. You may be able to eventually come back to the same window as before, or the new window might become your new ‘home’ for the project…or sometimes a third, fourth, or fifth window will.
  4. Try, try again.
    I may hold the record for the longest-running bug that I (and the LLM) could. Not. Solve. This was so, so annoying. No users apparently noticed it, but I knew about it and it bugged me for months and months. Every few weeks I would go to an old window and also start a new window, describe the problem, paste the code in, and ask for help to solve it. I asked it to identify problems with the code; I asked it to explain the code and any unexpected/unintended functionality from it; I asked it what types of general things would be likely to cause that type of bug. It couldn’t find the problem. I couldn’t find the problem. Finally, one day, I did all of the above, but then also started pasting every single file from my project and asking if it was likely to include code that could be related to the problem. By forcing myself to review all my code files with this problem in mind, even though the files weren’t related at all to the file/bug…I finally spotted the problem myself. I pasted the code in, asked if it was a possibility that it was related to the problem, the LLM said yes, I tried a change and…voila! Bug solved on January 16 after plaguing me since November 8. (It probably existed before then, but I didn’t have the functionality built until November 8, when I realized it was a problem.) I was beating myself up about it and posted to Twitter about finally solving the bug (but very much with the mindset of feeling very stupid about it). Someone replied and said “congrats! sounds like it was a tough one!” – which I realized was a very kind framing, and one that I liked, because it was a tough one; and also, I am doing a tough thing that no one else is doing, and I would not have been willing to try it without an LLM to support me.

    Similarly, just this last week on Tuesday I spent about 3 hours working on a sub-task for a new project. It took 3 hours to do something that on a previous project took me about 40 minutes, so I was hyper aware of the time mismatch and perceiving that 3 hours was a long time to spend on the task. I vented to Scott quite a bit on Tuesday night, and he reminded me that sure it took “3 hours” but I did something in 3 hours that would take 3 years otherwise because no one else would do (or is doing) the project that I’m working on. Then on Wednesday, I spent an hour doing another part of the project and Thursday whipped through another hour and a half of doing huge chunks of work that ended up being highly efficient and much faster than they would have been, in part because the “three hours” it took on Tuesday wasn’t just about the code but about organizing my thinking, scoping the project and research protocol, etc. and doing a huge portion of other work to organize my thinking to be able to effectively prompt the LLM to do the sub-task (that probably did actually take closer to the ~40 minutes, similar to the prior project).

    All this to say: LLMs have become pair programmers and collaborators and writers that are helping me achieve tasks and projects that no one else in the world is working on yet. (It reminds me very much of my early work with DIYPS and OpenAPS where we did the work, quietly, and people eventually took notice and paid attention, albeit slower than we wished but years faster than had we not done that work. I’m doing the same thing in a new field/project space now.) Sometimes, the first attempt to delegate a sub-task doesn’t work. It may be because I haven’t organized my thinking enough, and the lack of ideal output shows that I have not prompted effectively yet. Sometimes I can quickly fix the prompt to be effective; but sometimes it highlights that my thinking is not yet clear; my ability to communicate the project/task/big picture is not yet sufficient; and the process of achieving the clarity of thinking and translating to the LLM takes time (e.g. “that took 3 hours when it should have taken 40 minutes”) but ultimately still moves me forward to solving the problem or achieving the tasks and sub-tasks that I wanted to do. Remember what I said at the beginning:

    Clear thinking + clear communication of ideas/request = effective prompting => effective code and other outputs

 

  5. Try it anyway.
    I am trying to get out of the habit of saying “I can’t do X”, like “I can’t code/program an iOS app”…because now I can. I’ve in fact built and shipped/launched/made available multiple iOS apps (check out Carb Pilot if you’re interested in macronutrient estimates for any reason; you can customize so you only see the one(s) you care about; or if you have EPI, check out PERT Pilot, which is the world’s first and only app for tracking pancreatic enzyme replacement therapy and has the same AI feature for generating macronutrient estimates to aid in adjusting enzyme dosing for EPI.) I’ve also made really cool, 100% custom-to-me niche apps to serve a personal purpose that save me tons of time and energy. I can do those things, because I tried. I flopped a bunch along the way – it took me several hours to solve a simple iOS programming error related to home screen navigation in my first few apps – but in the process I learned how to do those things and now I can build apps. I’ve coded and developed for OpenAPS and other open source projects, including a tool for data conversion that no one else in the world had built. Yet, my brain still tries to tell me I can’t code/program/etc (and to be fair, humans try to tell me that sometimes, too).

    I bring that up to contextualize that I’m working on – and I wish others would work on, too – trying to address the reflexive thoughts of what we can and can’t do, based on prior knowledge. The world is different now, and tools like LLMs make it possible to learn new things and build new projects that maybe we didn’t have time/energy to do before (not that we couldn’t). The bar to entry, and the bar to starting and trying, is so much lower than it was even a year ago. It really comes down to a willingness to try and see, which I recognize is hard: I have those thought patterns too of “I can’t do X”, but I’m trying to notice when I have those patterns and shift my thinking to “I used to not be able to do X; I wonder if it is possible to work with an LLM to do part of X, or learn how to do Y, so that I could try to do X”.

    A recent real example for me is power calculations and sample size estimates for future clinical trials. That’s something I can’t do; it requires a statistician and specialized software and expertise.

    Or…does it?

    I asked my LLM how power calculations are done. It explained. I asked if it was possible to do one using Python code in a Jupyter notebook. I asked what information would be needed to do so. It walked me through the decisions I needed to make about power and significance, and highlighted variables I needed to define/collect to put into the calculation. I had generated the data from a previous study, so I had all the pieces (variables) I needed. I asked it to write code for me to run in a Jupyter notebook, and it did. I tweaked the code, input my variables, ran it…and got the result. I had run a power calculation! (Shocked face here.) But then I got imposter syndrome again and reached out to a statistician I had previously worked with on a research project. I shared my code and asked if that was a correct or acceptable approach and if I was interpreting it correctly. His response? It was correct, and “I couldn’t have done it better myself”.

    (I’m still shocked about this).

    He also kindly took my variables, put them into the specialized software he uses, and confirmed that the output matched what my code produced. He then pointed out something that taught me a consideration for future projects that might be different (whether the data is or isn’t normally distributed), although it didn’t influence the output of my calculation for this project.
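    For anyone curious what this kind of calculation looks like in code, here is a minimal sketch – not my actual notebook, and the effect size, alpha, and power values below are placeholders – assuming a two-sample comparison and the statsmodels package:

```python
# Minimal sketch of a sample-size / power calculation (illustrative values only,
# not the numbers from my study). Assumes a two-sample comparison and statsmodels.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(
    effect_size=0.5,  # standardized effect size (Cohen's d) estimated from prior data
    alpha=0.05,       # significance level
    power=0.8,        # desired statistical power
)
print(f"Participants needed per group: {n_per_group:.0f}")
```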

    What I learned from this was a) this statistician is amazing (which I already knew from working with him in the past) and kind to support my learning like this; b) I can do pieces of projects that I previously thought were far beyond my expertise; c) the blocker is truly in my head, and the more we break out of or identify the patterns stopping us from trying, the farther we will get.

    “Try it anyway” also refers to trying things over time. The LLMs are improving every few months and often have new capabilities that they didn’t have before. Much of my work is done with GPT-4, and the more nuanced, advanced technical tasks are way more efficient than when using GPT-3.5. That being said, some tasks can absolutely be done with GPT-3.5-level AI. Something you can’t quite figure out now could be something you sort out in a few weeks/months (see above about my 3-month bug); it could be something that is easier to do once you advance your thinking; or it could be more efficiently done with the next model of the LLM you’re working with.

  6. Test whether custom instructions help. Be aware, though, that sometimes too many instructions can conflict and also take up some of your context window. Plus, if you forget what instructions you gave it, you might get seemingly unexpected responses in future chats. (You can always change the custom instructions and/or turn them on and off.)
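One more note on the persona tip above: if you work with a model through the API rather than the chat interface, the persona becomes a “system” message. Here is a minimal sketch using the openai Python package (the model name is a placeholder), comparing prompt A and prompt C from earlier:

```python
# Minimal sketch of persona-based prompting via the API (model name is a placeholder).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(messages):
    response = client.chat.completions.create(model="gpt-4", messages=messages)
    return response.choices[0].message.content

# Prompt A: no persona -- expect a general, lay-language answer.
plain = ask([{"role": "user", "content": "What causes cancer?"}])

# Prompt C: persona for the model plus my own role -- expect a deeper, technical answer.
technical = ask([
    {"role": "system", "content": "You are an experienced medical oncologist."},
    {"role": "user", "content": (
        "Please describe the causal mechanisms of cancer and what is known and not known. "
        "I am also an experienced medical researcher, although not an oncologist."
    )},
])

print(plain)
print(technical)
```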

I’m hoping this helps give people confidence or context to try things with LLMs that they were not willing to try before; or to help get in the habit of remembering to try things with LLMs; and to get the best possible output for the project that they’re working on.

Remember:

  • Right-size the task by making a clear ask.
  • You can use different chat windows for different levels of the same project.
  • Use a list to help you, the human, keep track of all the pieces that contribute to the bigger picture of the project.
  • Try giving the LLM a persona for an ask; and test whether you also need to assign yourself a persona or not for a particular type of request.
  • Be specific, think of the LLM as a conversational partner that can’t read your mind.
  • Don’t be afraid to start over with a new context window/chat.
  • Things that were hard a year ago might be easier with an LLM; you should try again.
  • You can do more, partnering with an LLM, than you can on your own, and likely can do things you didn’t realize were possible for you to do!

Clear thinking + clear communication of ideas/request = effective prompting => effective code and other outputs

Have any tips to help others get more effective output from LLMs? I’d love to hear them, please comment below and share your tips as well!

Tips for prompting LLMs like ChatGPT, written by Dana M. Lewis and available from DIYPS.org

How I Use LLMs like ChatGPT And Tips For Getting Started

You’ve probably heard about new AI (artificial intelligence) tools like ChatGPT, Bard, Midjourney, DALL-E and others. But, what are they good for?

Last fall I started experimenting with them. I looked at AI art tools and found them to be challenging, at the time, for one of my purposes, which was creating characters and illustrating a storyline with consistent characters for some of my children’s books. I also tested GPT-3 (meaning version 3.0 of GPT). It wasn’t that great, to be honest. But later, GPT-3.5 was released, along with the ChatGPT chat interface to it, which WAS a big improvement for a lot of my use cases. (And now, GPT-4 is out and is an even bigger improvement, although it costs more to use. More on the cost differences below)

So what am I using these AI tools for? And how might YOU use some of these AI tools? And what are the limitations? This is what I’ve learned:

  1. The most frequent way I use these AI tools is for getting started on a project, especially those related to writing.

You know the feeling of staring at a blank page and not knowing where to start? Maybe it’s the blank page of a cold email; the blank page of an essay or paper you need to write; the blank page of the outline for a presentation. Starting is hard!

Even for this blog post, I had a list of bulleted notes of things I wanted to remember to include. But I wasn’t sure how I wanted to start the blog post or incorporate them. I stuck the notes in ChatGPT and asked it to expand the notes.

What did it do? It wrote a few paragraph summary. Which isn’t what I wanted, so I asked it again to use the notes and this time “expand each bullet into a few sentences, rather than summarizing”. With these clear directions, it did, and I was able to look at this content and decide what I wanted to edit, include, or remove.

Sometimes I’m stuck on a particular writing task, and I use ChatGPT to break it down. In addition to kick-starting any type of writing overall, I’ve asked it to:

  • Take an outline of notes and summarize them into an introduction; limitations section; discussion section; conclusion; one paragraph summary; etc.
  • Take a bullet point list of notes and write full, complete sentences.
  • Take a long list of notes I’ve written about data I’ve extracted from a systematic review I was working on, and ask it about recurring themes or outlier concepts. Especially when I had 20 pages (!) of hand-written notes in bullets with some loose organization by section, I could feed in chunks of content and get help getting the big picture from that 20 pages of content I had created. It can highlight themes in the data based on the written narratives around the data.

A lot of times, the best thing it does is prompt my brain to say “that’s not correct! It should be talking about…” and I’m able to more easily write the content that was in the back of my brain all along. I probably use 5% of what it’s written, and more frequently use it as a springboard for my writing. That might be unique to how I’m using it, though; for simpler use cases, such as writing an email to someone or other straightforward content tasks, you may be able to keep 90% or more of the content.

2. It can also help analyze data (caution alert!) if you understand how the tools work.

Huge learning moment here: these tools are called LLMs (large language models). They are trained on large amounts of language. They’re essentially designed to predict, based on all of the words (language) they’ve taken in previously, content that “sounds” like what would come after a given prompt. So if you ask it to write a song or a haiku, it “knows” what a song or a haiku “looks” like, and can generate words to match those patterns.

It’s essentially a PATTERN MATCHER on WORDS. Yeah, I’m yelling in all caps here because this is the biggest confusion I see. ChatGPT or most of these LLMs don’t have access to the internet; they’re not looking up in a search engine for an answer. If you ask it a question about a person, it’s going to give you an answer (because it knows what this type of answer “sounds” like), but depending on the amount of information it “remembers”, some may be accurate and some may be 100% made up.

Why am I explaining this? Remember the above section where I highlighted how it can start to sense themes in the data? It’s not answering solely based on the raw data; it’s not doing analysis of the data, but mostly of the words surrounding the data. For example, you can paste in data (from a spreadsheet) and ask it questions. I did that once, pasting in some data from a pivot table and asking it the same question I had asked myself in analyzing the data. It gave me the same sense of the data that I had based on my own analysis, then pointed out it was only qualitative analysis and that I should also do quantitative statistical analysis. So I asked it if it could do quantitative statistical analysis. It said yes, it could, and spit out some numbers and described the methods of quantitative statistical analysis.

But here’s the thing: those numbers were completely made up!

It can’t actually use (in its current design) the methods it was describing verbally, and instead made up numbers that ‘sounded’ right.

So I asked it to describe how to do that statistical method in Google Sheets. It provided the formula and instructions; I did that analysis myself; and confirmed that the numbers it had given me were 100% made up.
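The safer pattern is to ask it to write code that you run yourself. Here is a minimal sketch of what that can look like, using only Python’s standard library and made-up example values:

```python
# Sketch of code an LLM can write so that YOU run the math, rather than trusting
# numbers it generates in-chat. Example values only; requires Python 3.10+ for
# statistics.correlation.
from statistics import mean, stdev, correlation

values_a = [118, 124, 131, 119, 127]  # example measurements, not real data
values_b = [1, 2, 3, 1, 2]            # example scores, not real data

print("Mean of A:", mean(values_a))
print("Std dev of A:", round(stdev(values_a), 1))
print("Correlation of A and B:", round(correlation(values_a, values_b), 2))
```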

The takeaway here is: it outright said it could do a thing (quantitative statistical analysis) that it can’t do. It’s like a human in some regards: some humans will lie or fudge and make stuff up when you talk to them. It’s helpful to be aware and query whether someone has relevant expertise, what their motivations are, etc. in determining whether or not to use their advice/input on something. The same should go for these AI tools! Knowing this is an LLM and it’s going to pattern match on language helps you pinpoint when it’s going to be prone to making stuff up. Humans are especially likely to make something up that sounds plausible in situations where they’re “expected” to know the answer. LLMs are in that situation all the time: sometimes they actually do know an answer, sometimes they have a good guess, and sometimes they’re just pattern matching and coming up with something that sounds plausible.

In short:

  • LLMs can expand general concepts and write language about what is generally well known, based on their training data.
  • Ask it for a particular fact, though, and it’s probably going to make stuff up, whether that’s about a person or a concept – you need to fact check it elsewhere.
  • It can’t do math!

But what it can do is teach you or show you how to do the math, the coding, or whatever thing you wish it would do for you. And this gets into one of my favorite use cases for it.

3. You can get an LLM to teach you how to use new tools, solve problems, and lower the barrier to entry (and friction) on using new tools, languages, and software.

One of the first things I did was ask ChatGPT to help me write a script. In fact, that’s what I did to expedite the process of finding tweets where I had used an image in order to get a screenshot to embed on my blog, rather than embedding the tweet.

It’s now so easy to generate code for scripts, regardless of which language you have previous experience with. I used to write all of my code as bash scripts, because that’s the format I was most familiar with. But ChatGPT likes to do things as Python scripts, so I asked it simple questions like “how do I call a Python script from the command line” after I asked it to write a script and it generated a Python script. Sure, you could search in a search engine or Stack Overflow for similar questions and get the same information. But one nice thing is that if you have it generate a script and then ask it step by step how to run the script, it gives you step-by-step instructions in the context of what you were doing. So instead of saying “to run a script, type `python script.py`” using placeholder names, it’ll say “to run the script, use `python actual-name-of-the-script-it-built-you.py`”, and you can click the button to copy that, paste it in, and hit enter. It saves a lot of time otherwise spent translating placeholder information into your actual setup (placeholders are what you’d get from a traditional search engine result or Stack Overflow, where people are fond of things like saying FOOBAR and you have no idea if that means something or is meant to be a placeholder).

Careful observers will notice that the latest scripts I’ve added to my Open Humans Data Tools repository (which is packed with a bunch of scripts to help work with big datasets!) are now in Python rather than bash, such as when I was adding new scripts for fellow researchers looking to check for updates in big datasets (such as the OpenAPS Data Commons). This is because I used GPT to help with those scripts!

It’s really easy now to go from an idea to a script. If you’re able to describe it logically, you can ask it to write a script, tell you how to run it, and help you debug it. Sometimes you can start by asking it a question, such as “Is it possible to do Y?” and it describes a method. You need to test the method or check it elsewhere, but things like uploading a list of DOIs to Mendeley to save me hundreds of clicks? I didn’t realize Mendeley had an API or that I could write a script that would do that! ChatGPT helped me write the script, figure out how to create a developer account and app access information for Mendeley, and debug along the way, so that within an hour and a half I had a tool that easily saved me 3 hours on the very first project I used it with.

I’m gushing about this because there’s probably a lot of ideas you have that you immediately throw out as being too hard, or you don’t know how to do it. It takes time, but I’m learning to remember to think “I should ask the LLM this” and ask it questions such as:

  • Is it possible to do X?
  • Write a script to do X.
  • I have X data. Pretend I am someone who doesn’t know how to use Y software and explain how I should do Z.

Another thing I’ve done frequently is ask it to help me quickly write a complex formula to use in a spreadsheet. Such as “write a formula that can be used in Google Sheets to take an average of the values in M3:M84 if they are greater than zero”.

It gives me the formula, and also describes it, and in some cases, gives alternative options.

Other things I’ve done with spreadsheets include:

  • Ask it to write a conditional formatting custom formula, then give me instructions for expanding the conditional formatting to apply to a certain cell range.
  • Asking it to check if a cell is filled with a particular value and then repeating the value in the new cell, in order to create new data series to use in particular charts and graphs I wanted to create from my data.
  • Help me transform my data so I could generate a box and whisker plot.
  • Ask it for other visuals that might be effective ways to illustrate and visualize the same dataset.
  • Explain the difference between two similar formulas (e.g. COUNT and COUNTA or when to use IF and IFS).

This has been incredibly helpful especially with some of my self-tracked datasets (particularly around thyroid-related symptom data) where I’m still trying to figure out the relationship between thyroid levels, thyroid antibody levels, and symptom data (and things like menstrual cycle timing). I’ve used it for creating the formulas and solutions I’ve talked about in projects such as the one where I created a “today” line that dynamically updates in a chart.

It’s also helped me get past the friction of setting up new tools. Case in point: Jupyter notebooks. I’ve used them in the web browser version before, but often had issues running the notebooks people gave me. I debugged and did all kinds of troubleshooting, but for years I had not been able to get Jupyter successfully installed locally on (multiple of) my computers. I had finally given up on effectively using notebooks, and definitely given up on running them locally on my machine.

However, I decided to see if I could get ChatGPT to coax me through the install process.

I told it:

“I have this table with data. Pretend I am someone who has never used R before. Tell me, step by step, how to use a Jupyter notebook to generate a box and whisker plot using this data”

(and I pasted my data that I had copied from a spreadsheet, then hit enter).

It outlined exactly what I needed to do: install Jupyter Notebook locally if I hadn’t already (and gave me the code to do that); install the R kernel (and told me how to do that); then how to start a notebook, all the way down to what code to put in the notebook, the transformed data I could copy/paste, and all the code that generated the plot.

However, remember I have never been able to successfully get Jupyter Notebooks running! For years! I was stuck on step 2, installing R. I said:

“Step 2, explain to me how I enter those commands in R? Do I do this in Terminal?”

It said “Oh apologies, no, you run those commands elsewhere, preferably in RStudio. Here is how to download RStudio and run the commands”.

So, like humans often do, it glossed over a crucial step. But it went back and explained it to me and kept giving more detailed instructions and helping me debug various errors. After 5-6 more troubleshooting steps, it worked! And I was able to open Jupyter Notebooks locally and get it working!

All along, most of the tutorials I had been reading had skipped or glossed over the fact that I needed to do something with R, and where that was. Probably because most people writing the tutorials are already data scientists who have worked with R and RStudio etc., so they didn’t know those dependencies were baked in! Using ChatGPT, I could paste in every error message or describe every place I got stuck, and it coached me through each spot (with no judgment or impatience). It was great!

I was then able to continue with the other steps of getting my data transformed, into the notebook, running the code, and generating my first ever box and whisker plot with R!

A box and whisker plot, illustrated simply to show that I used R and Jupyter finally successfully!
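(My notebook used R, but to give a sense of how little code is actually involved, here is a comparable sketch in Python with pandas and matplotlib, using made-up column names and values:)

```python
# Comparable sketch in Python (my notebook used R). Made-up example data,
# just to show how short the notebook code for a box and whisker plot is.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "B"],
    "value": [3.1, 4.2, 2.8, 5.0, 6.3, 5.7],
})

df.boxplot(column="value", by="group")
plt.title("Example box and whisker plot")
plt.suptitle("")  # drop pandas' automatic "Boxplot grouped by group" super-title
plt.show()
```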

This is where I really saw the power of these tools, reducing the friction of trying something new (a tool, a piece of software, a new method, a new language, etc.) and helping you troubleshoot patiently step by step.

Does it sometimes skip steps or give you solutions that don’t work? Yes. But it’s still a LOT faster than manually debugging, trying to find someone to help, or spending hours in a search engine or Stack Overflow trying to translate generic code/advice/solutions into something that works on your setup. The beauty of these tools is you can simply paste in the error message and it goes “oh, sorry, try this to solve that error”.

Because the barrier to entry is so low (compared to before), I’ve also asked it to help me with other project ideas where I previously didn’t want to spend the time needed to learn new software and languages and all the nuances of getting from start to end of a project.

Such as, building an iOS app by myself.

I have a ton of projects where I want to temporarily track certain types of data for a short period of time. My fallback is usually a spreadsheet on my phone, but it’s not always easy to quickly enter data in a spreadsheet on your phone, even if you set up a template with a drop-down menu like I’ve done in the past (for my DIY macronutrient tool, for example). In this case, I want to see if there’s a correlation between my blood pressure at different times and patterns of eyelid inflammation and heart rate symptoms (which, for me, are symptoms of thyroid antibodies being out of range due to Graves’ disease). That means I need to track my symptom data, but also now some blood pressure data. I want to be able to put these datasets together easily, which I can, but the hardest part (so to speak) is finding a way that I am willing to record my blood pressure data. I don’t want to use an existing BP tracking app, I don’t want a connected BP monitor, and I don’t want to use Apple Health. (Yes, I’m picky!)

I decided to ask ChatGPT to help me accomplish this. I told it:

“You’re an AI programming assistant. Help me write a basic iOS app using Swift UI. The goal is a simple blood pressure tracking app. I want the user interface to default to the data entry screen where there should be three boxes to take the systolic, diastolic blood pressure numbers and also the pulse. There should also be selection boxes to indicate whether the BP was taken sitting up or laying down. Also, enable the selection of a section of symptom check boxes that include “HR feeling” and “Eyes”. Once entered on this screen, the data should save to a google spreadsheet.” 

This is a completely custom, DIY, n of 1 app. I don’t care about it working for anyone else, I simply want to be able to enter my blood pressure, pulse, whether I’m sitting or laying down, and the two specific, unique to me symptoms I’m trying to analyze alongside the BP data.

And it helped me build this! It taught me how to set up a new SwiftUI project in XCode, gave me code for the user interface, showed me how to set up API access to Google Sheets, helped me write code to save the data to Sheets, and got the app to run.

(I am still debugging the connection to Google Sheets, so in the interim I changed my mind and had it create another screen to display the stored data then enable it to email me a CSV file, because it’s so easy to write scripts or formulas to take data from two sources and append it together!)

Is it fancy? No. Am I going to try to distribute it? No. It’s meeting a custom need to enable me to collect specific data super easily over a short period of time in a way that my previous tools did not enable.

Here’s a preview of my custom app running in a simulator phone:

Simulator iphone with a basic iOS app that intakes BP, pulse, buttons for indicating whether BP was taken sitting or laying down; and toggles for key symptoms (in my case HR feeling or eyes), and a purple save button.

I did this in a few hours, rather than taking days or weeks. And now, the barrier to entry to creating more custom iOS apps is reduced, because now I’m more comfortable working with XCode and the file structures and what it takes to build and deploy an app! Sure, again, I could have learned to do this in other ways, but the learning curve is drastically shortened, and it takes away most of the ‘getting started’ friction.

That’s the theme across all of these projects:

  • Barriers to entry are lower and it’s easier to get started
  • It’s easier to try things, even if they flop
  • There’s a quicker learning curve on new tools, technologies and languages
  • You get customized support and troubleshooting without having to translate through as many generic placeholders

PS – speaking of iOS apps, based on building this one simple app I had the confidence to try building a really complex, novel app that has never existed in the world before! It’s for people with exocrine pancreatic insufficiency like me who want to log pancreatic enzyme replacement therapy (PERT) dosing and improve their outcomes – check out PERT Pilot and how I built it here.

4. Notes about what these tools cost

I found ChatGPT useful for writing projects in terms of getting started, even though the content wasn’t that great (on GPT-3.5, too). Then they came out with GPT-4 and made a ChatGPT Pro option for $20/month. I didn’t think it was worth it and resisted it. Then I finally decided to try it, because some of the more sophisticated use cases I wanted it for required a longer context window, and GPT-4 offered both a better model and a longer context window. I paid the first $20 assuming I’d want to cancel it by the end of the month.

Nope.

The $20 has been worth it on every single project that I’ve used it for. I’ve easily saved 5x that on most projects in terms of reducing the energy needed to start a project, whether it was writing or developing code. It has saved 10x that in time cost recouped from debugging new code and tools.

GPT-4 does have caps, though, so even with the $20/month, you can only do 25 messages every 3 hours. I try to be cognizant of which projects I default to using GPT-3.5 on (unlimited) versus saving the more sophisticated projects for my GPT-4 quota.

For example, I saw a new tool someone had built called “AutoResearcher”, downloaded it, and tried to use it. I ran into a bug and pasted the error into GPT-3.5 and got help figuring out where the problem was. Then I decided I wanted to add a feature to output to a text file, and it helped me quickly edit the code to do that, and I PR’ed it back in and it was accepted (woohoo) and now everyone using that tool can use that feature. That was pretty simple and I was able to use GPT-3.5 for that. But sometimes, when I need a larger context window for a more sophisticated or content-heavy project, I start with GPT-4. When I run into the cap, it tells me when my next window opens up (3 hours after I started using it), and I usually have an hour or two until then. I can open a new chat on GPT-3.5 (without the same context) and try to do things there; switch to another project; or come back at the time it says to continue using GPT-4 on that context/setup.

Why the limit? Because it’s a more expensive model. So you have a tradeoff between paying more and having a limit on how much you can use it, because of the cost to the company.

—–

TLDR:

Most important note: LLMs don’t “think” or “know” things the way humans do. They output language they predict you want to see, based on their training and the inputs you give them. It’s like the autocomplete of a sentence in your email, but with more words on a wider range of topics!

Also, LLMs can’t do math. But they can write code. Including code to do math.

(Some, but not all, LLMs have access to the internet to look up or incorporate facts; make sure you know which LLM you are using and whether it has this feature or not.)

Ways to get started:

    1. The most frequent way I use these AI tools is for getting started on a project, especially those related to writing.
      • Ask it to help you expand on notes; write summaries of existing content; or write sections of content based on instructions you give it
    2.  It can also help analyze data (caution alert!) if you understand the limitations of the LLM.
      • The most effective way to work with data is to have it tell you how to run things in analytical software, whether that’s how to use R or a spreadsheet or other software for data analysis. Remember the LLM can’t do math, but it can write code so you can then do the math!
    3.  You can get an LLM to teach you how to use new tools, solve problems, and lower the barrier to entry (and friction) on using new tools, languages, and software.
      • Build a new habit of asking it “Can I do X” or “Is it possible to do Y” and when it says it’s possible, give it a try! Tell it to give you step-by-step instructions. Tell it where you get stuck. Give it your error messages or where you get lost and have it coach you through the process. 

What’s been your favorite way to use an LLM? I’d love to know other ways I should be using them, so please drop a comment with your favorite projects/ways of using them!

Personally, the latest project that I built with an LLM has been PERT Pilot!

How I use LLMs (like ChatGPT) and tips for getting started