Science AMA Series: We are Drs. Eric Stern and Mark Michalski, radiologists and data scientists. Ask us about our support of lung cancer machine learning algorithms with the National Cancer Institute (NCI) via the Data Science Bowl, together with Dr. Anna Fernandez and Booz Allen Hamilton. AMA!



Pretty simple one: what are the machine learning algorithms/techniques you use in your kind of work?


Stefano from the CCDS team here! Medical images reflect the complexity of the human body. The effort to extract information from medical imaging data with machine learning requires the full spectrum of machine learning and computer vision techniques. The field is extremely active; the techniques have a lot in common with other fields of computer vision, such as autonomous driving. Acquiring information from physicians requires an additional set of computational tools, including natural language processing and, soon, conversational user interfaces.

Could you talk a bit about the scope of machine learning in cancer diagnosis? What can it currently do, what are some research avenues in expanding its capabilities and what are its fundamental limitations?


Hey, this is Brendan from the CCDS. The scope of machine learning in cancer diagnosis is quite large. In general, we think about a few main applications: (1) cancer screening, (2) computer-aided diagnosis (CAD), of which screening is a type, and (3) personalized medicine. In the context of screening, ML is being used to process high-volume studies and triage the most relevant ones to clinicians. In CAD, machine learning algorithms can also be used more generally to help automate laborious tasks, such as the quantification and segmentation of tumors. Going a step beyond, machine learning can also be used to make better predictions about the type of cancer and response to therapy based on these quantitative features, which may come from multiple imaging modalities.

Certain applications of cancer identification in screening have already shown very promising results, and the challenge will be to translate those results into actual clinical use. As for personalized medicine, there are many avenues of research. One avenue that we are particularly excited about is the ability to use quantitative tumor measures in combination with genetic information to better subtype and treat cancer.
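To make the quantification/segmentation task concrete: segmentation quality is commonly scored with an overlap metric such as the Dice coefficient. This is a minimal illustrative sketch, not the CCDS pipeline, and the voxel sets below are toy data:

```python
def dice_coefficient(pred_voxels, truth_voxels):
    """Dice overlap between two segmentations, given as sets of voxel
    indices; 1.0 means perfect agreement, 0.0 means no overlap."""
    pred, truth = set(pred_voxels), set(truth_voxels)
    if not pred and not truth:
        return 1.0  # both empty: trivially perfect agreement
    overlap = len(pred & truth)
    return 2.0 * overlap / (len(pred) + len(truth))

# A prediction covering 3 of 4 reference voxels plus 1 spurious voxel:
score = dice_coefficient({(0, 0), (0, 1), (1, 0), (2, 2)},
                         {(0, 0), (0, 1), (1, 0), (1, 1)})  # 0.75
```

Automating this kind of measure across a whole tumor volume is exactly the sort of laborious task ML can take off a clinician's plate.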

Hi Eric and Mark!

Thanks for doing this AMA!

I'm sure you're tired of answering this question by now, but I'll ask the obvious one:

When do you predict that AI and machine learning will replace diagnostic radiologists? It seems it will only be a matter of time. Your thoughts?


For some insights into how and what radiologists do and think, and the role radiologists play in the care for patients, here is a very timely and eloquent article that I feel is worth a read:

Hi Eric and Mark!

Thanks for doing this AMA!

I'm sure you're tired of answering this question by now, but I'll ask the obvious one:

When do you predict that AI and machine learning will replace diagnostic radiologists? It seems it will only be a matter of time. Your thoughts?


This is Eric. I don't see radiologists being replaced. I see AI as augmenting what radiologists do, improving the care we provide for our patients. It is a fallacy to think that diagnosis is all radiologists do.

Hi Eric and Mark!

Thanks for doing this AMA!

I'm sure you're tired of answering this question by now, but I'll ask the obvious one:

When do you predict that AI and machine learning will replace diagnostic radiologists? It seems it will only be a matter of time. Your thoughts?


This is Mark from the CCDS. This is a big hairy question, but briefly...radiology is a field that was built on technology, and, as such, as technology evolves so will the field. There are parts of radiology that will change - as they have with the adoption of previous technologies, like PACS and digital imaging. My own thinking on what happens next in the diagnostic specialties is that we bring the insights of data science to the patient and the population. It's an evolution for the field, but a very exciting one!

Hey Eric, it's your (don't know the best way to describe this) old nephew (?) Dave! Awesome to see you on reddit, and didn't know you were on the AI path. We'll have to talk more on that some time, I'm involved in another project that may cross some paths on a higher level.

My question: what are your thoughts on crowdsourced teaching mechanisms vs. straight deep learning? Things like MTurk seem to give "a lot of eyes on a single thing" a tough time, but seeing the results from reCAPTCHA and the recent news of a high schooler catching a NASA error that had gone undetected for ages gets you thinking about what new eyes of any type can bring to a challenging problem.


Hi Dave! Crowdsourcing and deep learning would seem to be two very different things, with very different roles and utilities.

I'm a big fan of computer-aided diagnosis; I don't think computer algorithms will replace doctors any time soon, but they can still be enormously helpful. My concern, though, is about usability. After the questionable success electronic health records have had, I worry about the implementation of any other computational solutions. Obviously you guys aren't really focused on user experience; you're still working on the back-end stuff. But I was wondering if you had any thoughts about how to make complicated computational tools like machine learning systems usable and accessible for doctors and techs. Do you envision the birth of a new profession of "algorithm technologists," kind of like radiology techs? Or can these systems be made accessible enough that docs can use them unassisted?


Hey, it's Sean and Brendan! A very good question. There are a few different approaches here, and we at the center have thought a lot about user interface. For example, one approach is to display the subset of the image used to make an inference (see figure 4 in Ribeiro et al.'s paper "Why Should I Trust You?"). How this data should be presented to aid clinicians is a hard problem. If the data used for an inference comes from a combination of images, or of images and textual data, it's a case-by-case thing at the moment. Another issue is when to show results to a physician, and in what context. Radiologists often use multiple monitors, and you want to show the results in a way that does not interfere with other clinical tasks or induce too many mouse clicks from the users. Tools that augment what physicians are already doing by focusing attention on specific features have to be cleanly designed. The hope is that the proper choice of clinical scenario, clean UI design, a model that fits the data, and a process that filters out inappropriate inputs and outputs will enable doctors to use these tools without assistance. As for the profession of "algorithm technologists," you may want to take a look at a recent editorial proposing a new field of clinical information specialists. It's an interesting proposition!
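The paper referenced above (LIME) perturbs superpixels to find the image regions behind a prediction; a simpler cousin of the same idea, occlusion sensitivity, can be sketched in a few lines. The "classifier" here is a toy stand-in, not a real model:

```python
import numpy as np

def occlusion_map(image, score_fn, patch=4):
    """Crude saliency map: zero out each patch in turn and record how
    much the classifier's score drops. Big drops mark regions the
    model relied on for its decision."""
    base = score_fn(image)
    heat = np.zeros(image.shape, dtype=float)
    for i in range(0, image.shape[0], patch):
        for j in range(0, image.shape[1], patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = 0.0
            heat[i:i + patch, j:j + patch] = base - score_fn(occluded)
    return heat

# Toy stand-in classifier: its "score" is just the mean brightness of
# the top-left corner, so only that corner lights up in the map.
score = lambda img: float(img[:4, :4].mean())
image = np.zeros((8, 8))
image[:4, :4] = 1.0
heat = occlusion_map(image, score)
```

A heat map like this could be overlaid on the study so the radiologist sees what drove the inference, which is the UI problem discussed above.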

With Simulation modeling on the rise with predicting health outcomes, how can we apply this to predict lung cancer outcomes due to exposure, treatment, etc.?


This is Eric. Your question is spot on. This is exactly the way I see AI/deep learning as being most applicable and useful in medicine. Doing things humans cannot do...sort through aggregated population data/electronic health records, and finding the signal in the noise. One can apply this model to almost any medical problem, including your specific question.

Hi Dr. Stern and Dr. Michalski, thanks for your time! What do you think will be the largest shift in radiology in terms of machine learning? What do you think machines will be doing specifically that radiologists do now?


This is Eric. I sure hope that ML will give us more precision and decrease variation in care. What I don't see ML doing is knowing what questions to ask or understanding the why behind the way events occur.

Australia has recently seen a resurgence in the diagnosis of 'black lung' disease, which was previously claimed to have been eliminated. It transpires that although surveillance CXRs were being taken, these were either being misread or allegedly not examined at all. Are the algorithms you are collectively developing likely to be able to tackle screening of large groups of individuals in remote areas, with images often taken by non-professional staff? How robust are the algorithms when faced with sub-standard imagery? If the image files are compressed to be sent electronically does this compromise integrity?


Brendan from CCDS here. We indeed feel that screening of large groups of individuals is one of the most promising applications of deep learning in radiology. In locations where specialists are limited, computer algorithms that can triage the most suspect cases to a radiologist could have an impact on conditions such as black lung in Australia, and also in tuberculosis or lung cancer screening. Some of the excellent results of the Kaggle Data Science Bowl show the promise of deep learning in screening different medical conditions, especially given that the world is more and more connected by wireless IT infrastructure.
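The triage idea can be sketched very simply: a model assigns each study a suspicion score, and the worklist is split and ordered so likely positives are read first. The scores and threshold below are invented for illustration:

```python
def triage_worklist(studies, suspicion, threshold=0.5):
    """Split a reading worklist by model suspicion score and order each
    group most-suspicious-first, so likely positives are read sooner."""
    ranked = sorted(studies, key=suspicion, reverse=True)
    urgent = [s for s in ranked if suspicion(s) >= threshold]
    routine = [s for s in ranked if suspicion(s) < threshold]
    return urgent, routine

# Toy scores standing in for a trained model's outputs:
scores = {"study_a": 0.92, "study_b": 0.10, "study_c": 0.61}
urgent, routine = triage_worklist(scores, scores.get)
```

In a setting with few specialists, even this crude prioritization means the radiologist's limited reading time goes to the highest-risk studies first.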

The issue of image quality is something we deal with on a regular basis. The short answer is: as long as we incorporate enough of that "lower quality" imaging data into training, our algorithms should be able to robustly identify findings when these types of images appear in actual use. The longer answer is that we take great pains to make our algorithms as generalizable as possible. There are a few types of degradation that are known to cause artificial neural networks to fail. The classic example is that adding pixel noise to an image of a panda causes it to be misclassified as a gibbon. Additionally, decreases in SNR, reduced contrast, and compression artifacts can diminish neural network performance. In the medical imaging world, there are also other causes of poor image quality, including poor x-ray exposure, patient positioning, and poor digital conversion from film. There are multiple strategies to address the robustness issue, including data augmentation, regularization, the use of adversarial learning, and the incorporation of poor quality images into the original training data sets. These are all areas of active investigation, and though there is certainly work to be done, we predict that we will overcome these limitations.
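The data-augmentation strategy mentioned above can be sketched minimally: each training image is randomly degraded so the network learns to tolerate the noise and contrast shifts it will meet in deployment. The noise level and contrast range here are illustrative, not what any production pipeline uses:

```python
import numpy as np

def degrade(image, rng):
    """Randomly corrupt a [0, 1] grayscale image during training:
    additive pixel noise plus an occasional contrast reduction."""
    out = image.astype(float) + rng.normal(0.0, 0.05, image.shape)
    if rng.random() < 0.5:                     # random contrast loss
        out = 0.5 + (out - 0.5) * rng.uniform(0.6, 1.0)
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(0)
clean = np.full((32, 32), 0.5)   # toy uniform-gray "scan"
noisy = degrade(clean, rng)
```

Applying a fresh random degradation on every epoch effectively multiplies the training set with low-quality variants, which is one way to harden a model against the panda-to-gibbon failure mode.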

Hi Eric and Mark, thanks for doing this AMA! I am an MD/JD student who is interested in the regulatory law aspects of machine learning algorithms in medicine. I have two questions: 1) Are you concerned at some point the patient record trawling will become not HIPAA compliant, or that these algorithms will become so advanced/integrated into diagnosing or recommending treatments that they would warrant FDA approval as a medical device? With regard to the latter, what would you envision the clinical trial process looking like? 2) Do you envision machine learning algorithms as the Black Box Medicine described in this article? If not- how practicable do you think black box medicine is?

Thanks again for doing this AMA!


Mark from the CCDS. It's an important point that you bring up. With regard to HIPAA specifically, HIPAA makes provisions for the use of such data but requires that it be handled properly. I think the bigger question you're asking is whether de-identified data can become identifiable when combined with other data. The answer is that it can... so the law itself may evolve, in a similar way that new regulations have arisen over data in non-healthcare spaces. While regulatory clearance is probably beyond the scope here, the FDA is an important part of this discussion!

Last year, I got diagnosed with planocellular carcinoma in the praeputium with no invasion to the glans but with non-metastatic spread to 1 sentinel lymph node (and have since gone through surgery to remove both tumor and lymph node, as well as getting radiation and concomitant cisplatin chemotherapy - and will be receiving regular CT scans for at least the next four years), but as I've understood it, this kind of cell is the same as a type that can be found in the lungs, so here's my question:

Could there be a way to detect this otherwise very rare type of cancer (about 1/100k in Denmark), or is the risk of exposing people to CT scans for early detection too great, or would low field-intensity MR scannings be diagnostically relevant and humane enough that it could be used, since it doesn't involve ionizing radiation?

Also, as a follow-up: when machine-learning detection becomes good enough to match or exceed the detection ability of multi-disciplinary teams of radiologists and oncologists, do you still think doctors will be involved in diagnosing?


Mark from the CCDS - so sorry to hear about your illness. Fighting this kind of stuff is what gets us up in the morning.

While we can't speak to your specific cancer, we hope that ML will build our capacity to read screening studies. At some point the instrumentation may become sufficiently low cost (and low dose) that broader screening is possible - two sides of the same coin. I think for the foreseeable future multi-disciplinary teams will be the gold standard, but they'll increasingly use ML to inform their decisions.

What type of imaging are you using? CT, XR? Also what type of XR works best for your algorithms and why? (scanned film, CR or DR)


Hey, it's Sean at the CCDS. At the moment: CT, MR, CR, DX, MG, some pathology images as well. But the general answer is: everything. No scanned film. I don’t know that we can answer the ‘best’ or ‘why’ yet.

Good evening and thanks for the AMA. Given our current technological situation, do you think it would be useful for doctors to learn how to code? Having teams of physicians writing algorithms... would it give a new and different insight?


Hey, this is Bernardo from the CCDS. The clinical expertise of physicians is essential for the algorithms to be useful in daily practice, since they are on the front line of healthcare and aware of the challenges faced. If they could also write the algorithms, it would definitely make a difference. Writing these algorithms requires expertise that is genuinely complex and takes considerable time to learn - and completing an MD and residency program already requires 10+ years of study - so we can understand why it is uncommon to see doctors who can code machine learning algorithms, and why it is such a differentiator when they can. But this could change in the near future: newer generations are learning these technologies early in life, and this kind of coding knowledge could become almost second nature. Maybe one day it could even become part of the medical school curriculum.

Do most models trained to classify lung cancer nodules take other patient information into account (family history, lifestyle, genes, etc) or is it purely based on images? Wouldn't incorporating other types of data features such as the ones listed above potentially increase detection rate?

If you only give a machine an image of a medical scan, it could potentially detect a nodule but be quite uncertain about it based on the image alone. If you incorporate other patient information and risk factors into the model, this could potentially give a more accurate diagnosis that a specific nodule is in fact cancerous.

Is this something that is already done in existing systems? If not, why? Is this unnecessary, or does it negatively affect results somehow?

Thanks for doing the AMA.


This is Anna Fernandez: Great questions. Yes, many research groups have investigated methods for lung cancer nodule detection/prediction of cancer that take additional patient information into account. For this year's Data Science Bowl, the competition comes with the radiology images and a label for whether the patient/subject was diagnosed with cancer (1 or 0). A desired outcome would be additional features - not just those associated with the nodule - that are good predictors of cancer, so we could more accurately know what could happen with the patient. In the future, obtaining more and larger data sets with additional phenotypic information as you suggest (patient demographics including family history, genetic variations, etc.) will be necessary to develop even more robust algorithms. One could see the initial machine learning approaches defined this year being applied to and enhanced with future comprehensive data sets in lung cancer, and the same could apply to other diseases that use medical imaging. "Is this done today in systems?" I am not personally aware of any in use in hospitals today, but there could be some in prototype or one-off settings that incorporate phenotypic elements - these will usually need to be powered by a large number of examples to become robust.

Do most models trained to classify lung cancer nodules take other patient information into account (family history, lifestyle, genes, etc) or is it purely based on images? Wouldn't incorporating other types of data features such as the ones listed above potentially increase detection rate?

If you only give a machine an image of a medical scan, it could potentially detect a nodule but be quite uncertain about it based on the image alone. If you incorporate other patient information and risk factors into the model, this could potentially give a more accurate diagnosis that a specific nodule is in fact cancerous.

Is this something that is already done in existing systems? If not, why? Is this unnecessary, or does it negatively affect results somehow?

Thanks for doing the AMA.


This is Brendan over at the CCDS. Currently, most models trained to identify lung cancer nodules rely purely on imaging information. They are typically generated using large annotated data sets that have information about the presence or absence of a nodule, and sometimes about the histology of the nodule. These algorithms can already do quite well based on the imaging information alone. We agree, though, that there are rich sources of information that can help to improve these models, including clinical information, family history, and genetic information. The incorporation of this type of information is what clinicians typically do when they review scans. There is a potential downside. By incorporating demographic information, for example, we may bias the algorithms towards detecting nodules preferentially in high-risk populations and missing them in lower-risk populations. It's the same cognitive bias that sometimes leads clinicians to miss diagnoses in patients who do not fit the typical profile of a person with a disease.
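The simplest way to incorporate such information is "late fusion": concatenate image-derived features with clinical covariates before a downstream classifier. The covariate names and scalings below are invented for illustration:

```python
import numpy as np

def fuse(image_features, age, pack_years, family_history):
    """Late fusion: append crudely normalized clinical covariates to a
    CNN-derived image feature vector for a downstream classifier."""
    clinical = np.array([age / 100.0,          # rough rescaling to ~[0, 1]
                         pack_years / 50.0,
                         1.0 if family_history else 0.0])
    return np.concatenate([np.asarray(image_features, float), clinical])

image_features = np.zeros(64)                  # stand-in for CNN features
x = fuse(image_features, age=62, pack_years=30, family_history=True)
```

Note that once demographic covariates enter the vector, the bias risk described above applies: the classifier can learn to downweight nodule evidence in low-risk demographic groups.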

Hi guys, here's my question:

Because of the complexity of some of the CNN models (particularly Inception V2/V3/V4 and Inception-ResNet V2+) and the realities of training, transfer learning from ImageNet is typically used to save time/frustration. However, the ImageNet source images (dogs, planes, trucks) are very different from medical imaging (CT/X-ray, retinal fundoscopy, or pathology slides).

Are you using similar transfer learning techniques, or are you 'rolling your own' and training de novo on the lung CA database? Thoughts on the suitability of transfer learning for medical/radiology imaging from the standpoint of accuracy - a concern, or not one at this point?


This is the CCDS folks - great question. We agree with your observation that neural networks trained on ImageNet data would not necessarily be expected to pull out the features most relevant to medical imaging. That having been said, there have been some notable successes in transfer learning, including the recent Google Brain effort to diagnose diabetic retinopathy. Surprisingly, even though they had 120,000 images, they still chose to use transfer learning and seem to have gotten great results. As such, we believe there is still value in transfer learning, even across very different modalities.
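The mechanic can be sketched in miniature: freeze a feature extractor trained elsewhere and fit only a small head on the new data. Here the "pretrained backbone" is just a fixed random projection standing in for a real ImageNet network, and the labels are synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained backbone: frozen weights that are never
# updated during the "medical" training below.
W_frozen = rng.normal(size=(64, 16)) / 8.0
backbone = lambda x: np.maximum(x @ W_frozen, 0.0)   # frozen ReLU features

# A small labeled set; only the logistic head (w, b) gets trained.
X = rng.normal(size=(200, 64))
F = backbone(X)                           # features from the frozen backbone
y = (F[:, 0] > 0).astype(float)           # toy label for illustration

w, b = np.zeros(16), 0.0
for _ in range(500):                      # plain gradient descent on the head
    p = 1.0 / (1.0 + np.exp(-(F @ w + b)))
    g = p - y
    w -= 0.5 * F.T @ g / len(y)
    b -= 0.5 * g.mean()
accuracy = float(((p > 0.5) == (y > 0.5)).mean())
```

Because only the small head is fit, far fewer labeled examples are needed than training the whole network de novo - which is the practical appeal of transfer learning when annotated medical images are scarce.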

Thanks for doing this AMA! I am currently a student working towards a BS in statistics and I am wondering how much you guys use statistical analysis in your work? As somebody who wants to go into applied work (via medicine or maybe even public policy) what are possible graduate programs to pursue which would be useful in projects such as the one you are working on?


Mark at the CCDS. Machine learning in medicine is a highly interdisciplinary field. While we would advise that you speak to a career counselor about your own path specifically, we can talk about our own team. For our data scientists, we prefer a generally quantitative skill set (coming from STEM fields) with experience in scientific computing (such as Python), Bayesian statistics, machine and deep learning, data processing, image processing, and some software development. You can see our website for more specifics; Facebook also recently published its own advice for aspiring data scientists. That having been said about data scientists, the adoption of ML in healthcare in general will require the efforts of many different people, physicians and public policy advocates alike. The more you understand about the core technology underlying the algorithms, the better.

What are some of the biggest obstacles to using radiomics applications in clinical practice?


Hey, Bernardo and Stefano from CCDS here. There are many challenges: ethical and legislative, privacy and confidentiality, and the design of the interfaces between patients, caretakers, the radiomic system, and other machine learning systems. One current technical challenge is acquiring large amounts of data. The quality of a radiomic classifier is limited by the size of the data sets used to create it. Continuous improvements in medical image acquisition introduce technical differences that must be accounted for when mining quantitative image data, which further increases the amount of data required for model building. A possible solution could lie in large-scale data sharing.

Hello Doctors, Are the Algorithms capable of reading images from all the modalities?


Hey, this is Brendan! Algorithms today are commonly trained on a data set of a single imaging modality. There is a great deal of emerging literature, however, on how to use algorithms that can take inputs from different types of modalities, for example an ultrasound image or a CT scan slice. We believe that one powerful application will be the combination of data from multiple modalities. That is, the algorithms will merge data from different sources in an intelligent way, similar to what humans do now.

Hi Eric & Mark, thanks for this AMA!

  1. What do you consider to be the greatest challenges (technical and otherwise) facing the field of ML-enhanced radiology?

  2. What do you consider to be the greatest opportunities in the field?


Bernardo here - Among the greatest technical challenges in machine-learning-enhanced radiology, we can highlight the dependence on large amounts of labeled data, which is one of the current bottlenecks. Generating these data sets is a time-consuming task that requires a radiologist to manually label the images, identifying lesions so the algorithms can learn from them. Besides that, there is a huge variety of technical imaging parameters within a single modality: MRI, for example, can be acquired on several different scanner types with a large variety of imaging parameters, which can make images quite different from one another. It is a great challenge to create machine learning algorithms that are somehow "universal" in such a heterogeneous field.
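One common (if only partial) mitigation for scanner heterogeneity is to standardize intensities per scan before training, so scanner-dependent brightness and scale differences are damped. A minimal sketch, with toy intensities:

```python
import numpy as np

def zscore(volume):
    """Per-scan z-score intensity normalization: map each scan to zero
    mean and unit variance, reducing scanner-to-scanner intensity
    shifts before the data reaches a learning algorithm."""
    v = np.asarray(volume, dtype=float)
    return (v - v.mean()) / (v.std() + 1e-8)   # epsilon guards flat scans

scan = np.array([[100.0, 120.0], [140.0, 160.0]])   # toy 2x2 "scan"
norm = zscore(scan)
```

Normalization of this kind does not remove all acquisition differences (resolution, contrast mechanism, artifacts), which is why the labeled-data bottleneck above remains.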

Dr. Stern, Dr. Michalski and Dr. Fernandez, thank you for doing this AMA.

I believe increase in data collection methods (wearables, cheaper and more frequent tests, shared data banks) will help in future breakthroughs in wellness and treatment and I'm happy that ML is being put to such a noble use.

What has been your experience with data collection and sharing across institutions, providers and researchers? That seems to be one of the major obstacles today.

Thank you.


Hi – Anna Fernandez here: Thanks for the question – I agree that as more data types come in, there are more opportunities to use the data to improve clinical care! Even as we prepared for the Data Science Bowl this year, it took time and effort to combine data from government-funded, international, and other data sets (see the Data Support Providers listed for the competition). Challenges include: Will the institutes share the data? Will the sponsor (government, industry) allow users to share the data? And one of the biggest – how do you sufficiently de-identify/anonymize the data while keeping it useful? Data science/machine learning algorithms need some key information for training sets, and we need to balance that with protecting health information. It's a balance that needs to be considered as people share data – for what purpose? What data elements are needed to describe the data?

Hey, thank you for this opportunity to ask questions. What interests me is how imaging and machine learning will expand in the future and most importantly which particular algorithms and methods are you using?


This is Anna Fernandez – thanks for the question – For possible algorithms and machine learning methods investigated by the data science community for these radiology lung CTs, I would look at the tutorials and forum discussions for the competition on Kaggle. Several people are also sharing their kernels there, and members of the community have written up their experiences as well.

Hello Dr. Eric and Dr. Mark, hope you're both doing well. I was wondering how machine learning can help with fighting and even curing interstitial lung disease. Most if not all forms are fatal, and the disease does not get the sort of exposure cancer does. I lost my mother to it recently, so I've witnessed how horrible it is.


This is Mark and the CCDS folks. Sentat, we're sorry to hear about your mother. Helping improve care in these types of diseases is what drives us. There are definitely early efforts to use machine learning to better detect and classify interstitial lung disease (ILD) on CT scans. Whether that will translate into helping cure ILD is a larger question that I don't think I am qualified to answer--but I hope so.

What new sources of scan data are coming online to improve machine learning results? ML needs so much data right now in order to approach/surpass human abilities (10x LIDC!).


Great question. Who owns patient data? Who profits from patient data? Unanswered questions. Search IBM/Merge.

Can you recommend a review paper about this area from the past few years?




Stefano at CCDS says - "Assassin's Khreed"



This article and its reviews are distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and redistribution in any medium, provided that the original author and source are credited.