Opening pathways for discovery, research, and innovation in health and healthcare

How can we get more patients and other communities to leverage the benefits of the #WeAreNotWaiting mindset for research, development, and innovation in health (and healthcare)?

That’s a question I’ve been asking myself for two years, after seeing the diverse efforts and valuable outpourings from the DIY diabetes community (ranging from amazing remote monitoring solutions for CGM to algorithms, hardware, and other software for automated insulin delivery systems).

But, how to scale? In diabetes, we’re perhaps uniquely positioned given our data-driven disease. However, I believe that the data and innovation approach we’ve taken in diabetes can help many other types of patient communities as well. I just didn’t know how to help scale it… until recently.

When a group of us from the OpenAPS community participated in the Quantified Self Public Health Symposium in 2016, it prompted some follow-up conversations with various academic researchers, including Eric Hekler from Arizona State University (ASU).

Eric started a conversation, and kept asking me: What could you do if you partnered with academic researchers? How can traditional researchers help the DIY community, OpenAPS or otherwise?

That also sparked a conversation with Paul Tarini, a senior program officer at the Robert Wood Johnson Foundation (RWJF), about potential funding for a project.

(Important to state here: OpenAPS itself is not a funded project. It has not been, and will not be. It is 100% DIY, non-commercial, and it has been built by a community of volunteers.)

What I wanted to talk to RWJF about was funding a collaboration with academic researchers for studying data and innovation coming out of the community; and to ultimately identify needs and build resources to help scale this type of community effort and empower other patient communities as well.

It took over a year, but we were able to work through initial project proposals and were then invited to submit a full proposal. And on Wednesday (September 6, 2017), I found out that we have been awarded the grant, and this project work will be funded by the Robert Wood Johnson Foundation. The project officially begins on September 15 and will run for 18 months.

So what exactly is this project?

Our project is titled “Learning to not wait: Opening pathways for discovery, research, and innovation in health and healthcare.”

It entails a number of things.

    1. We are creating an on-call data science team to support research in the DIY community. More details will be forthcoming, but essentially this team is there to help do research on the myriad of questions bubbling out of the community. For example – how does sensitivity change during growth spurts, during periods of inactivity, or when changing insulin types? What are some of the most successful mealtime insulin dosing strategies? Etc. People will be able to submit ideas, and get help formulating the idea into a researchable question, and get the research done.
    2. Studying the process of research when done by patients, and the barriers they and their research run into when spreading this scientific knowledge. I personally know there are a lot of barriers, but we need to document them and find solutions. (There is a lot of prejudice and perceived stigma toward patient researchers doing this type of scientific work, around things like quality of research, methods of distributing knowledge, etc.)
    3. Convening a meeting with patients, traditional researchers, legal experts, and others in this innovative research space to discuss and address some of the known and being-found barriers for this type of research. I envision a white paper type publication to come out of this meeting to document the lay of the land as it is.
    4. Creating toolkit-type resources based on what we’ve learned and are learning in this project for helping patients new to DIY and this type of research take on various levels of research or innovation activity. Part of our project’s scope of work, in #WeAreNotWaiting spirit, includes beta testing with 2-3 other patient communities, so we can get feedback and iterate and roll these out as quickly as possible.

Our project has a couple of principles that I feel strongly about, and am also very proud of in approaching this body of work.

  • I am the scientific Principal Investigator of this project. This is unique in the world of grant-funded research, where a patient is driving the scientific discovery process. (I’m proud and very appreciative to have two amazing co-PIs who are helping with some of the administrative work since the grant is being administered through Arizona State University Foundation, which has been an awesome partner given the uniqueness of this situation*.) My co-PIs are Eric Hekler and Erik Johnston. The other members of the team include John Harlow, who’s a MacArthur Foundation Postdoctoral Fellow; Sayali Phatak, a PhD student at ASU; and Keren Hirsch from the ASU Decision Theater.
  • #WeAreNotWaiting is the mantra for this project and our entire team. We plan to be as efficient as possible in doing the project work, which includes being as timely as possible with sharing findings back with the community as soon as they’re ready (a given; there’s no reason to wait) as well as finding ways to publish that are faster than the very traditional academic publishing process, and being thoughtful about the right audiences outside the patient community for communicating about this project’s work.
  • Always asking why. As a brand new PI, I have a lot to learn. But as a non-traditional PI, I also am running into a lot of things that are done the way they’d be done if I were traditionally inside an organization. I plan to explore and challenge as many of these as I can, and try to document the decisions I make in this project as I come to those forks in the road. In some cases, I choose the easier paths because for my project/work/focus, it does not matter. In other cases, based on principle, I choose the harder path-blazing approach.

* About the uniqueness of this project and the administrative details

Since I’m an individual patient researcher, not affiliated with an organization, we decided we would make the official grantee financial organization Arizona State University Foundation, since that’s where my co-PIs are. But true to the nature of this project, I want to document the challenges and opportunities that come with that, so more to come about all the interesting lessons learned about the process of putting together the proposal and the grant approval process once we heard the grant would be awarded. That way, future patient researchers have a leg up on what is coming when taking on this type of project and are aware of what this approach entailed. The short version is I am a subcontractor to ASU for the purposes of the grant, but am not employed by or otherwise affiliated with ASU. Props to the many people at ASU who learned about me and this project in the approval process and rolled with it / helped make it happen.

So, what’s next? When do you start? What are you waiting on?!

Coming super soon – a project website with more details about this project.

For my fellow PWDs:

  • Stay tuned for the project website going live, which will also include more details about how individuals in the diabetes community can pitch ideas/get started working with the on-call data science team.

For patients reading this who are members of other patient disease communities:

  • Ping me if you’re SUPER excited and can’t wait to tell me :), or stay tuned for more info about the process for proposing that your patient community be one of the communities with whom we beta test some of the tools/resources developed toward the latter phases of this project.

If you’re someone else who’s interested in this work (such as a legal expert, other researcher, etc.):

  • Also ping me if you’re interested in hearing more about the meeting we plan to convene with a small multidisciplinary group to discuss and address barriers of patient-driven research. Even if we can’t get everyone interested to attend the in-person meeting, I would still love your input and collaboration for the white paper and/or other publications and intersections with this project.

For everyone else:

  • Please do let me know if there’s a particular aspect of this project that you’re curious to learn more about – whether it’s some of what I’m facing and documenting as a patient PI researcher, or otherwise. That’ll help me prioritize some of the blog posts and articles I’m writing about this process!

Thanks to everyone who managed to read this ginormous blog post.

I am incredibly excited about the project, and having resources to focus on how patients and non-traditional actors in healthcare can drive research, development, innovation, and knowledge sharing in non-traditional methods and from the ground up, plus prioritize and change the healthcare research agenda. Like my work in OpenAPS that stands on the shoulders of so many, I’m hoping this project is the first of many and gets to a place for others to leverage this work and take it beyond the scope of what we’ve all imagined is currently possible.

A huge thanks to the team partnering with me on this work; to ASU for being a great partner as an organization; to the Robert Wood Johnson Foundation for supporting this project (and in particular to our program officer, Paul Tarini, for his ongoing support throughout this entire process); and many extra thanks to Scott and all my family and friends for supporting me throughout the proposal process and being the recipients of some VERY excited and !!! filled texts when I found out we had officially been awarded the grant for this project.

Making it possible for researchers to work with #OpenAPS or general Nightscout data – and creating a complex json to csv command line tool that works with unknown schema

This is less of an OpenAPS/DIYPS/diabetes-related post, although that is normally what I blog about. However, since we created the #OpenAPS Data Commons on Open Humans, to allow those of us who want to donate our diabetes data to research to do so, I have been spending a lot of time figuring out the whole process, from uploading your data to how data is managed and shared securely with researchers. The hardest part is helping researchers figure out how to handle the data – because we PWDs produce a lot of data :) . So this post explains some of the challenges of managing the data to get it into a researcher-friendly format. I have been greatly helped over the years by general purpose open-source work from other people, and one of the things that helps ME the most as a non-traditional programmer is plain language posts explaining the thought process behind the tools and the attempted solution paths. Especially because sometimes the web pages and blog posts pop higher in search than nitty gritty tool documentation without context. (Plus, I’ve been taking my own advice about not letting myself hold me back from trying, even when I don’t know how to do things yet.) So that’s what this post is!

Background/inspiration for the project and the tools I had to build:

We’re using Nightscout, which is a remote data-viewing platform for diabetes data, made with love and open source and freely available for anyone with diabetes to use. It’s one of the best ways to display not only continuous glucose monitor (CGM) data, but also data from our DIY closed loop artificial pancreases (#OpenAPS). It can store data from a number of different kinds and brands of diabetes devices (pumps, CGMs, manual data entries, etc.), which means it’s a rich source of data. As the number of DIY OpenAPS users is growing, we estimate that our real-world use is overtaking the total number of hours of data from clinical trials of closed loop artificial pancreas systems. In the #WeAreNotWaiting spirit of moving quickly (rather than waiting years for research teams to collect and analyze their own data) we want to see what we can learn from OpenAPS usage, not only by donating data to help traditional researchers speed up their work, but also by co-designing research studies of the things of most value to the diabetes community.

Step 1: Data from users to Open Humans

I thought Step 1 would be the hardest. However, thanks to Madeleine Ball, John Costik, and others in the Nightscout community, a simple Nightscout Data Transfer App was created that enables people with Nightscout data to pop it into their Open Humans accounts. It’s then very easy to join different projects (like the OpenAPS Data Commons) and share your data with those projects. And as the volunteer administrator of the OpenAPS Data Commons, it’s also easy for me to provide data to researchers.

The biggest challenge at this stage was figuring out how much data to pull from the API. I have almost 3 years’ worth of DIY diabetes data, and I have had numerous devices over time uploading all at once…which makes for large chunks of data. Not everyone has this much data (or 6-7 rigs uploading constantly ;)). Props to Madeleine for the patience in working with me to make sure the super users with large data sets will be able to use all of these tools!

Step 2: Sharing the data with researchers

This was easy. Yay for data-sharing tools like Dropbox.

Step 3: Researchers being able to use the data

Here’s where things started to get interesting. We have large data files that come in json format from Nightscout. I know some researchers we will be working with are probably very comfortable working with tools that can take large, complex json files. However…not all will be, especially because we also want to encourage independent researchers to engage with the data for projects. So I had the belated realization that we need to do something other than hand over json files. We need to convert, at the least, to csv so it can be easily viewed in Excel.

Sounds easy, right?

According to basic searches, there are roughly a gazillion ways to convert json to csv. There are even websites that will do it for you, without making you run it on the command line. However, most of them require you to know the types of data and the number of types in advance, in order to construct headers in the csv file that make it readable and useful to a human.

This is where the DIY and infinite possibility nature of all the kinds of diabetes tools anyone could be using with Nightscout, plus the infinite ways they can self-describe profiles and alarms and methods of entering data, makes it tricky. Just based on an eyeball search between two individuals, I was unable to find and count the hundred+ types of data entry possibilities. This is definitely a job for the computer, but I had to figure out how to train the computer to deal with this.
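To make the "unknown schema" problem concrete, here is a minimal sketch of letting the computer discover the headers instead of listing them by hand. It is not the code from any of the tools mentioned in this post; the function names (`flatten`, `discoverHeaders`) and the sample records are made up for illustration. The idea is to flatten each nested record into dot-separated column names, then take the union of every name seen across all records.

```javascript
// Hypothetical helper names; the sample records below are invented.

// Flatten a nested value into dot-separated column names,
// e.g. { a: { b: 1 } } becomes { "a.b": 1 }.
// Arrays are flattened by index ("list.0", "list.1", ...).
// Empty objects and empty arrays produce no column at all.
function flatten(value, prefix = '', out = {}) {
  if (value !== null && typeof value === 'object') {
    for (const [key, child] of Object.entries(value)) {
      flatten(child, prefix ? `${prefix}.${key}` : key, out);
    }
  } else {
    out[prefix] = value;
  }
  return out;
}

// Collect every column name seen in any record, in first-seen order.
function discoverHeaders(records) {
  const headers = new Set();
  for (const record of records) {
    for (const key of Object.keys(flatten(record))) {
      headers.add(key);
    }
  }
  return [...headers];
}

// Two records with completely different shapes.
const sample = [
  { type: 'sgv', sgv: 112, device: 'cgm-a' },
  { eventType: 'Temp Basal', rate: 0.8, raw: { duration: 30 } },
];
console.log(discoverHeaders(sample));
```

Because the headers come from the data itself, a new kind of entry from a new device simply shows up as additional columns.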

Again, json to csv tools are so common I figured there HAD to be someone who had done this. After a dozen varying searches and trying a variety of command line tools, I finally found one web-based tool that would take json, create the schema without knowing the data types in advance, and convert it to csv. It was (is) super slick. I got very excited when I saw it linked to a Github repository, because that meant it was probably open source and I could use it. I didn’t see any instructions for how to use it on the command line, though, so I messaged the author on Twitter and found out that a command line version didn’t yet exist and was a not-yet-done TODO for him.

Sigh. Given this whole #WeAreNotWaiting thing (and given I’ve promised to help some of the researchers in figuring this out so we can initiate some of the research projects), I needed to figure out how to convert this tool into a command line version.

So, I did.

  • I taught myself how to unzip json files (ended up picking `gzip -cd`, because it works on both Mac and Linux)
  • I planned to then convert the web tool to be able to work on the command line, and use it to translate the json files to csv.
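On the command line, `gzip -cd` does the unzipping. For anyone who would rather do that step inside a Node script, a rough equivalent might look like the sketch below. It uses Node's built-in `zlib` module; the function name `readGzippedJson` and the demo file name are hypothetical, and it reads the whole file into memory, so it only suits files that fit comfortably in RAM.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const zlib = require('zlib');

// Read a gzip-compressed JSON file and return the parsed contents.
// Assumption: the file is small enough to decompress in memory.
function readGzippedJson(filePath) {
  const compressed = fs.readFileSync(filePath);
  const text = zlib.gunzipSync(compressed).toString('utf8');
  return JSON.parse(text);
}

// Demo: write a tiny gzipped file to a temp directory, then read it back.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'gunzip-demo-'));
const demoFile = path.join(demoDir, 'entries.json.gz');
fs.writeFileSync(demoFile, zlib.gzipSync(JSON.stringify([{ sgv: 112 }])));

const roundTrip = readGzippedJson(demoFile);
console.log(roundTrip);
```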

But…remember the big file issue? It struck again. So I first had to figure out the best way to estimate the size and split the json into a series of files, without splitting it in a weird place and messing up the data. That became jsonsplit.sh, a tool to split a json file based on the size you give it (and if you don’t specify, it defaults to something like 100,000 records).

FWIW: 100,000 records was too much for the more complex schema of the data I was working with, so I often did it in smaller chunks, but you can set it to whatever size you prefer.
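The key idea in splitting is to cut only on record boundaries, never in the middle of a record. The sketch below illustrates that idea in JavaScript; it is not how jsonsplit.sh is implemented, and it assumes the json is a top-level array that has already been parsed. The function name `splitRecords` is hypothetical.

```javascript
// Split an array of records into chunks of at most `size` records.
// Splitting after parsing means no record is ever cut in half.
// Assumption: the input is a top-level JSON array, already parsed.
function splitRecords(records, size = 100000) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const chunks = [];
  for (let start = 0; start < records.length; start += size) {
    chunks.push(records.slice(start, start + size));
  }
  return chunks;
}

// Demo: 7 records in chunks of 3 gives chunks of 3, 3, and 1.
const records = Array.from({ length: 7 }, (_, i) => ({ id: i }));
const chunks = splitRecords(records, 3);
console.log(chunks.map((chunk) => chunk.length));
```

Each chunk can then be written out as its own json file and converted separately, which is what keeps the later conversion step from choking on one giant file.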

So now “all” I had to do was:

  • Unzip the json
  • Break it down if it was too large, using jsonsplit.sh
  • Convert each of these files from json to csv

Phew. Each of these looks really simple now, but took a good chunk of time to figure out. Luckily, the author of the web tool had done much of the hard json-to-csv work, and Scott helped me figure out how to take the html-based version of the conversion and make it usable from the command line using JavaScript. That became complex-json2csv.js.
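For a sense of what the last step of a conversion involves, here is a small sketch of writing records with differing keys out as csv text. It is an illustration only, not the code in complex-json2csv.js: it assumes the records are already flat (one level of keys), and the names `csvField` and `toCsv` and the sample data are made up.

```javascript
// Quote a single CSV field: wrap it in double quotes when it contains
// a comma, a quote, or a line break, and double any embedded quotes.
function csvField(value) {
  if (value === undefined || value === null) return '';
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Convert flat records to CSV text. Records may have different keys;
// a key that is missing from a record becomes an empty cell.
function toCsv(flatRecords) {
  const headers = [...new Set(flatRecords.flatMap((r) => Object.keys(r)))];
  const lines = [headers.map(csvField).join(',')];
  for (const record of flatRecords) {
    lines.push(headers.map((h) => csvField(record[h])).join(','));
  }
  return lines.join('\n') + '\n';
}

// Demo with invented records of two different shapes.
const csv = toCsv([
  { type: 'sgv', sgv: 112 },
  { type: 'note', notes: 'pre-bolus, "early"' },
]);
process.stdout.write(csv);
```

The quoting matters because free-text fields (notes, profile names, and so on) can contain commas, and an unquoted comma would shift every later column when the file is opened in Excel.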

Because I knew how hard this all was, and wanted other people to be able to easily use this tool if they had large, complex json with unknown schema to deal with, I created a package.json so I could publish it to npm so you can download and run it anywhere.

I also had to create a script that would go through all of the Open Humans data: unzip each file, run jsonsplit.sh, run complex-json2csv.js, and organize the output in a useful way, given the existing file structure of the data. Therefore I also created an “OpenHumansDataTools” repository on Github, so that other researchers who will be using Nightscout-based Open Humans data can use this if they want to work with the data. (And, there may be something useful to others using Open Humans even if they’re not using Nightscout data as their data source – again, see the “large, complex, challenging json since you don’t know the data type and count of data types” issue. So this repo can link them to complex-json2csv.js and jsonsplit.sh for discovery purposes, as they’re general purpose tools.) That script is here.

My next TODO will be to write a script to take only slices of data based on information shared as part of the surveys that go with the Nightscout data; i.e. if you started your DIY closed loop on X date, take data from 2 weeks prior and 6 weeks after, etc.
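A minimal sketch of that kind of slicing might look like the following. This is not an existing script; the function name `sliceAroundDate`, the `created_at` field name, and the sample entries are all assumptions for illustration. It keeps only records whose timestamp falls between 2 weeks before and 6 weeks after a given start date.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Keep records whose timestamp falls within
// [startDate - daysBefore, startDate + daysAfter], inclusive.
// `dateField` is an assumed name for a field holding an ISO 8601 timestamp.
// Records with a missing or unparseable timestamp are dropped.
function sliceAroundDate(records, startDate, daysBefore = 14, daysAfter = 42, dateField = 'created_at') {
  const start = new Date(startDate).getTime();
  if (Number.isNaN(start)) throw new RangeError('invalid start date');
  const from = start - daysBefore * DAY_MS;
  const to = start + daysAfter * DAY_MS;
  return records.filter((record) => {
    const t = new Date(record[dateField]).getTime();
    return !Number.isNaN(t) && t >= from && t <= to;
  });
}

// Demo with invented entries around a loop start date of 2017-03-01.
const entries = [
  { created_at: '2017-01-01T00:00:00Z', sgv: 150 }, // well before the window
  { created_at: '2017-02-20T00:00:00Z', sgv: 120 }, // 9 days before start
  { created_at: '2017-03-20T00:00:00Z', sgv: 105 }, // 19 days after start
  { created_at: '2017-06-01T00:00:00Z', sgv: 110 }, // after the window
];
const selected = sliceAroundDate(entries, '2017-03-01T00:00:00Z');
console.log(selected.map((entry) => entry.sgv));
```

The start date would come from the survey answers, and the window sizes are parameters so a researcher could ask for a different before/after span.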

I also created a pull request (PR) back to the original tool that inspired my work, in case he wants to add it to his repository for others who also want to run his great stuff from the command line. I know my stuff isn’t perfect, but it works :) and I’m proud of being able to contribute to general-purpose open source in addition to diabetes-specific open source work. (Big thanks as always to everyone who devotes their work to open source for others to use!)

So now, I can pass researchers json or csv files for use in their research. We have a number of research teams planning to request access to the OpenAPS Data Commons for their studies, and I’m excited about how work like this, which makes diabetes data more broadly available for research, will help improve our lives in the short and long term!