American Chemical Society AMA: We are Keira Havens & Rafa Gomez Bombarelli here to talk about the Launch Smarter Chemistry Challenge. Ask us anything about building an ecosystem for better innovation in chemistry.

Abstract

Hi Reddit!

I am Keira Havens - you’ve seen me here on Reddit before when I shared my color changing flower project a few years ago. I’m a molecular biologist by training and focused on synthetic biology while in academia. I went on to start a company around the color changing flower concept and learned a lot about the way a new application makes it into the marketplace - or doesn’t. That experience got me thinking closely about the systems we use to identify beneficial technologies and eventually brought me to LAUNCH, to build networks that connect technology more closely with society.

And I am Rafael “Rafa” Gomez Bombarelli: Postdoctoral Fellow, Harvard University. I currently work in the Aspuru-Guzik group on the computer-driven design of molecular materials. I combine machine learning and first-principles simulation to rapidly discover practical materials: organic light-emitting diodes for displays, electrolytes for flow batteries, and organic photovoltaics for solar cells. I have a Ph.D. in Physical Chemistry from the Universidad de Salamanca in Spain.

We’re here to answer your questions. In particular, we’re excited to talk about the LAUNCH Smarter Chemistry Challenge, developed in partnership with the ACS Green Chemistry Institute and other organizations. The challenge is a global call for innovators, entrepreneurs, companies, and organizations to enable predictive chemical design through innovative applications of data. Why data? Predictive design can’t exist without good information. That requires three things: the right data has to exist, it has to be publicly accessible, and it has to be in a consistent format that scientists, companies and institutions can easily use. By any of these measures, chemistry faces enormous challenges. Check out the challenge here, and ask us anything about the challenge, data in chemistry, computer-driven design, and the process of technological innovation, from discovery to adoption!

Back to answer a few more questions!

So I am a little confused by what you are trying to do here. I am worried that I may be missing the point here as I feel like you might be using a few too many buzzwords (likely the case as this is the first time I have come across your concept).

to enable predictive chemical design through innovative applications of data

What data do you mean exactly? Do you mean data that is already out there in the literature? Do you mean future data that will be collected? What do you even mean by data? do you mean raw spectra and results? or do you mean higher level data like structure-activity relationships?

Because if you are talking about changing the way raw data is collected to make it consistent then I am all for this movement (would make my life MUCH easier). But if you are talking about higher level data that makes much more sense. What I imagine is pooling data and looking for long range trends that are not obvious to the lone chemist as they sit there interpreting 50 different NMR spectra.

But the next obvious question is how do you deal with data integrity? Every chemist has been lazy and not dried their sample and got a solvent peak, or forgotten to take the baseline of their IR and ended up with a giant OH peak from air moisture. These mistakes are fine if interpreted and accounted for, but I could see them muddying up computations.

I guess what my main question is how does your project fit into my research as a chemist? At the moment my research goes:

  1. Observation
  2. Hypothesis
  3. Experiment to test hypothesis
  4. Back to 1.

Where would the "better innovation ecosystem" fit in? Would it be used to extract interesting initial observations or would it fit in more in the experimental area?

TL;DR What does LAUNCH Smarter Chemistry Challenge do for me as a chemist?

samyall

Thanks for this question - there’s a lot to cover here.

Let’s start with the data based specifics: What data do you mean exactly? Do you mean data that is already out there in the literature? Do you mean future data that will be collected? What do you even mean by data? do you mean raw spectra and results? or do you mean higher level data like structure-activity relationships?

Usually, when we talk about data, we’re referring to big data, data that is continuously generated, in a standard format, traceable, accessible, and able to be sliced and diced in a number of different ways by anyone with the right question. Chemistry doesn’t have that. The big data repositories are often under lock and key due to intellectual property concerns, and these silos exist not only between industries (a solvent used in paints may also be used in plastics but it’s unlikely those two groups share hazard information or processes because they have no reason to), but even within industries or within companies.

This challenge is the first step in building that data infrastructure to allow the analytic tools we’ve developed in other areas to be brought to bear on problems in chemistry. There is a lot of groundwork to be laid before we can get to the type of data-driven chemistry we’re envisioning.

What does LAUNCH Smarter Chemistry Challenge do for me as a chemist?

We’re bringing concepts that substantively improve your ability to do chemistry to the companies and investment communities that can make them a reality.

In particular, for this challenge, we’re addressing all four concepts you mention - any company, organization, or inventor with a way to improve data usage in chemistry is welcome to enter the challenge.

You mentioned generating data in a universal format would be really helpful. What does that look like in reality though? Is there a non-profit group out there working with the big instrument manufacturers with a value proposition around universal formats? Is there a small start-up that’s made something like smallpdf.com for chemistry that allows you to interconvert all data formats with one another? There are a number of ways to solve this problem, and we want to give the people working on it the best opportunity to succeed. We do that by asking what we hope is the right question and then we connect the people struggling to answer that question with the right partners.


So what did ever happen with your color changing flowers? Did they hit the marketplace? Can I buy them?

DoShitGardener

Hey, thanks for asking. So yes, the color changing flowers were made by a lab in the Netherlands. You can't buy them because the project has been shelved for now.

Like I alluded to in my post, that project taught me a lot about the way products get to market. There were a couple of flaws in our approach. One was technical: the color change happened over five days and only in new flowers. So, you could have a plant that bloomed pink and then white, depending on whether you fed it beer or not, but the existing flowers wouldn't change color. This wasn't the impact we were going for, and rather than put out a substandard product that didn't meet expectations, we decided to shelve it until we could do it right. This was a personal decision - it would have been possible to sell the flower as it was with the right marketing, but I wasn't interested in selling the flower as much as I was interested in making a statement about the potential of the technology.

Another part of the problem was the path to market. The floriculture market is tightly knit and a little incestuous - everyone knows everyone else, and they've been working in this space for 100 years. There were two options in developing this flower, one was a consumer based approach, and one was a floriculture based approach. We chose the consumer based approach, but a smarter path forward would have been one like LAUNCH applies in its challenges where we ask existing businesses what they need and then seek to provide it. This would have suited the type of business we wanted to run much, much better than a consumer facing platform (although I did end up learning a lot about marketing and perception from that experiment).

If you are curious about genetically engineered flowers, the Moonlite carnations from Florigene are your best bet. These are pansy genes inserted into carnations to make really beautiful purple flowers from dark eggplant to really pale lavender. I've seen them in person, they're great.


"a global call for innovators and entrepreneurs, companies, and organizations, to enable predictive chemical design through innovative applications of data"

What does this mean, exactly? What kinds of chemical designs are predictable from data- and what kind of data to these approaches use?

Also what is an innovation ecosystem? Is this just a space where no idea is too crazy? How does this challenge promote high risk high reward science?

p1percub

Another great question! The short answer is that it can be very difficult to design a molecule with the appropriate properties from first principles. This is due in part to the vast number of possible reactions and outcomes available throughout chemistry - there's no one-size-fits-all sort of rule for developing a new molecule, and when those molecules are combined into mixtures or materials there is still a lot of trial and error involved. We're looking for ways to make this design process more effective in chemistry and we think it starts with better information.

What I mean by innovation ecosystem is the support structure that surrounds a new concept or business. And I'm just going to say, we're going to find a better way to describe that because hoo boy, this phrase is not doing the job!

So, a lot of new companies die right off the bat. They have a promising idea, get some traction with some risk takers and then when they try to bring that idea to the broader market (known as scaling) they fall into the valley of death. Sometimes they die from lack of funding, sometimes they die from ineffective management, sometimes they die from lack of market penetration. LAUNCH is trying to shift the process of innovation from one person coming up with a cool idea and forcing it on everyone else, to a collaborative process that defines the problem well and looks for solutions that meet the need. - Keira

This encourages high risk high reward science because it allows you to focus on solving the right problem. If we've built an infrastructure of companies and organizations that say "We need this", it's a lot easier to tackle big systemic challenges in an effective way.


Can you discuss other examples of how computational approaches to chemistry or biology have resulted in interesting outcomes, and what sort of problems this methodology is currently most excitingly being applied to?

Izawwlgood

Hi there, this is Rafa.

Astronomy and particle physics are two areas where scientists make discoveries sifting through huge amounts of data.

In biology, all the omics are great examples of using large datasets to answer basic scientific questions, and also to design new solutions such as drugs or diagnostics.

In chemistry, we are catching up quickly. Many experimental and theoretical research groups now use machine learning to figure out hidden patterns in actual or simulated experiments.

A beautiful paper by Joshua Schrier a few months ago, for instance, was able to learn from failed experiments that had been forgotten in old lab notebooks.


Most chemists are not entrepreneurs and vice versa. And while "generating data" is necessary for chemists, it might be very difficult for an entrepreneur to look at the data and find something meaningful. How do you plan on bridging the gap between chemists and entrepreneurs?

rseasmith

We bridge this gap in large part by working with chemists like Rafa to identify what the underlying problem is, and highlighting it in such a way that entrepreneurs can see an application for their work.

I don't mean to say that this is easy - as a scientist who became an entrepreneur I understand how different the modes of thinking are and I do my best to translate technical discussions into effective questions and action. And when we talk about something like data, that has very specific meanings for a scientist that vary depending on field, and meanings for entrepreneurs, there's a lot of miscommunication that can occur.

That being said, interdisciplinary groups do a great job of breaking down these walls. In the network we have environmental groups talking with process chemists and folks working on analytics and business people. We're able to see an understanding of the scope of the problem emerge from these conversations and an appreciation for the way different disciplines can inform each other.


Hi Keira, Hi Rafa. Thanks for answering our questions today!

Rafa, my question is slightly off-topic, so I apologise in advance.

I'm interested in the detection of toxic chemicals in the ocean, and I came across some work that researchers at the Universitat Politècnica de València developed. It's a sensor that detects small amounts of pollutants in the sea. However, I couldn't find any more information as the emphasis was mostly on the detection of oil spills.

Do you think that there is a way to use machine learning (recognition of molecular patterns, though I have no idea) to accurately detect toxic waste that has been dumped in the ocean?

Sir_Boldrat

Hi, Rafa here.

I am not familiar with that work, but it sounds really useful. As for the broader question, machine learning and fingerprinting techniques are excellent ways to identify chemicals and mixtures of chemicals. Analytical chemists often use them [to discover] the particular origin of a sample or to figure out the components of complex mixtures.
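As a rough illustration of the fingerprinting idea, here is a minimal sketch that matches an unknown sample against a small reference library by cosine similarity. The reference names and intensity vectors are invented for illustration and do not come from any real instrument or database; a real workflow would use measured spectra and more robust models.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical reference fingerprints: binned signal intensities (invented values).
REFERENCE_LIBRARY = {
    "crude_oil": [0.9, 0.7, 0.1, 0.0, 0.2],
    "pesticide_x": [0.1, 0.0, 0.8, 0.9, 0.1],
    "solvent_y": [0.0, 0.3, 0.2, 0.1, 0.9],
}

def best_match(sample, library=REFERENCE_LIBRARY):
    """Return the name and score of the reference most similar to the sample."""
    scores = {name: cosine_similarity(sample, ref) for name, ref in library.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]

# An invented "unknown" sample that resembles the second reference.
unknown_sample = [0.2, 0.1, 0.7, 0.8, 0.0]
match_name, match_score = best_match(unknown_sample)
print(match_name, round(match_score, 3))
```

For mixtures, the same comparison could be run against combinations of references, but that goes beyond this sketch.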


Hi and thanks for doing this AMA. For those of us who aren't chemists (like myself!) can you explain what the current barriers are to innovation in chemistry? What is the standard and why is your solution needed? Thanks!

firedrops

Hi - this is Rafa. Very roughly, innovation in many areas of chemistry is similar to the way it was before we had computers: clever people read scientific papers, follow their intuitions and learn from the results of their experiments.

Very much trial-and-error. This way it takes many years to bring useful new materials to market. We are hoping to borrow from other areas of science where big data has had a big impact and accelerate discovery of new, better chemicals.


ENABLING SMART CHEMISTRY: How can advances in data generation, access, integration, analysis and application accelerate a shift towards more sustainable molecules, mixtures, and materials?

You lay out the first half of your challenge statement nicely on the LAUNCH website, but I'm not sure what is meant when you say "accelerate a shift towards sustainable molecules, mixtures, and materials." Could you please elaborate?

Dizzy_Science

The dream is to be able to design a molecule that has all of the properties we want. For a long time, the list of desired properties centered around performance and effectiveness - How well does it repel water? What temperature does it melt at?

Today, we have a longer list of properties to consider, especially when we talk about sustainability: What is the environmental impact, both in the development process and in its eventual use? How does it affect human health? How does it interact in combination with other molecules? How long does it persist in the environment?

These properties are much harder to design for, and we don't have a good way to predict them. The more information we have, and the more accessible it is, the easier it will be to design molecules and materials which are both effective and sustainable. - Keira


I believe there is a trend in the comments. This project is full of flowery concepts and no meat and potatoes. I went to the ACS website and had to go four clickbait links into the descriptions to see anything less flowery and get some details. They, the ACS writers, even had to emphasize that this is not a political stunt or ploy. REALLY, you had to state that on the ACS page? I am sorry. This makes me sad, frustrated and then a little angry to read all this fluff. All the goals are great. I would love to see everything go green for us from now on.

What is it I and so many redditors are missing from this project?

Do you really think we can make anything we want synthetically and do it with little to no toxicity to our environment? I sure hope so...but I will not hold my breath.

I must admit I did not spend more than 15 minutes clicking clickbait on the ACS website. Yet, I don't think it should take over 20 minutes of reading to explain "Green chemistry". Ugh...not happy.

Althekemist

I think part of the confusion is that LAUNCH isn't actually building a solution to this problem themselves, we're asking the question. We think chemistry would benefit from focusing on improving the information available and the ability to use that information effectively.

We're not prescribing a specific technical approach because there are many ways to solve this problem and we don't want to limit the creativity of the entrepreneurs that are building the solutions. We're here to support them by connecting them to the resources they need to bring their concept to a wide audience.

We're not holding our breath either, we're pushing people to solve the problem :) -Keira


What's a typical problem in chemical data? Standardization of properties? Translating from one measurement paradigm to another?

helm

Hi, Rafa here.

You are hitting the nail on the head. There are a bunch of challenges in dealing with data in chemistry:

  • Openness. Proprietary data can be very valuable, and owners are often very protective. We need to figure out ways to leverage each other's data without necessarily losing ownership.

  • Standardization. In my field (simulation), this is not a huge issue because we can share code and input parameters, but experimental data can be very diverse (preparation of materials, particular experimental setup that was used, ...). We need standard formats to encode and exchange all that metadata.

  • Accuracy. How confident can we be that the data we are learning from is accurate? Combining datasets can help us address this. By doing machine learning over larger sets we can see which inaccurate experiments stand out from the others.

[Edited to break out the bullet points]
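To make the accuracy point concrete, here is a minimal sketch of how pooling measurements from several sources can expose a suspicious value. The lab names and melting points are invented, and the median-based rule with a threshold of 3.5 is just one common robust heuristic, not the method used in any particular project.

```python
from statistics import median

def flag_outliers(measurements, threshold=3.5):
    """Return sources whose value has a modified z-score above the threshold.

    The modified z-score uses the median and the median absolute deviation
    (MAD), so a single bad value does not distort the reference point.
    """
    values = list(measurements.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # all values (nearly) identical; nothing to flag
    flagged = []
    for source, value in measurements.items():
        modified_z = 0.6745 * abs(value - med) / mad
        if modified_z > threshold:
            flagged.append(source)
    return flagged

# Hypothetical melting points (deg C) for one compound, pooled from five labs.
pooled_melting_points_c = {
    "lab_a": 121.8,
    "lab_b": 122.4,
    "lab_c": 122.1,
    "lab_d": 121.9,
    "lab_e": 109.5,  # stands out from the rest
}

suspicious = flag_outliers(pooled_melting_points_c)
print(suspicious)
```

With only one lab's data, the odd value would have nothing to be compared against; the check only becomes possible once the datasets are combined.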


Hi Keira, I always found your work on the color changing flowers fascinating. One part about flowers changing color at the detection of the surrounding chemicals was interesting. Although I am assuming it's hard to find a practical application or a market without involving a florist when it comes to flowers. Anyways, wishing you all the best in your future endeavors and happy belated birthday.

Hi Rafa, so I am really interested in your work in electrolytes used in flow batteries and organic photovoltaic solar cells. I am assuming you are in research at postdoctoral level. I have only touched this subject briefly in undergrad and read about them on my own. What are the most recently researched electrolytes for energy storage? I have read about quinones. I am assuming this depends on the size, the condition, output voltage and duration, etc. But all things considered (cost and efficiency most importantly), which electrolyte seems most promising? And lastly, I've heard that organic photovoltaic solar cells are not as efficient or stable as synthetic ones. What are the advantages of the organic ones? I am neither a chemist nor a physicist, ELI chemical engineer. Thank you

jinwlee

Hey Jin! Thanks for the birthday note :) I talk about the floral market up above, but yes, it's difficult to break into and that's part of the reason these flowers aren't commercially available. I'll get Rafa to answer your question in a bit, thanks for asking!

Hi Jin, Rafa here

Sorry for taking so long - your comment went under my radar in the first pass and I couldn't pick it up until now. I am a postdoc, yup.

Regarding electrolytes for energy storage, it is very much an open question. Organic electrolytes for lithium batteries continue to be an active area of research after many years, in addition to up-and-coming solutions like flow batteries, where one can engineer power and energy independently and get long discharge times at rated power.

Most of my work for energy storage has been in using simulation to discover organic electrolytes for redox flow batteries. R&D in flow batteries is moving quickly, and there are a few contenders, both for the solvent and the electrolyte:

  • Aqueous solvent is cheaper and safer, since it cannot catch fire, but solubility of the electrolytes is not great (so energy density is small). In addition, the voltage of the cell has to be under 1.5 V; at higher voltages water itself decomposes into hydrogen and oxygen. This also means less energy stored, since energy is the product of charge (i.e. concentration of electrolyte) and voltage.

  • Non-aqueous solvents increase the price, are often flammable and are less efficient at conducting charge.
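The relationship in the aqueous bullet (stored energy as charge times voltage) can be put into numbers with a back-of-the-envelope sketch. The concentrations, electron count and cell voltage below are illustrative assumptions, not measured values, and the estimate covers one liter of a single electrolyte, ignoring the second tank and all losses.

```python
FARADAY_C_PER_MOL = 96485.0  # charge carried by one mole of electrons, in coulombs

def ideal_energy_density_wh_per_l(conc_mol_per_l, electrons_per_molecule, cell_voltage_v):
    """Ideal energy stored per liter of electrolyte, in watt-hours.

    charge (C/L) = electrons per molecule * Faraday constant * concentration
    energy (J/L) = charge * cell voltage; 1 Wh = 3600 J
    """
    charge_c_per_l = electrons_per_molecule * FARADAY_C_PER_MOL * conc_mol_per_l
    energy_j_per_l = charge_c_per_l * cell_voltage_v
    return energy_j_per_l / 3600.0

# Assumed example: a two-electron molecule in a 1.2 V cell (below the ~1.5 V water limit).
low_solubility = ideal_energy_density_wh_per_l(0.5, 2, 1.2)
high_solubility = ideal_energy_density_wh_per_l(1.0, 2, 1.2)
print(round(low_solubility, 1), round(high_solubility, 1))
```

Doubling either the solubility or the voltage doubles the ideal energy density, which is why both limits matter for aqueous systems.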

Nowadays vanadium flow batteries are the closest to a successful commercial product (here is the president talking about them) but there isn't enough vanadium in the world to suit our storage needs. There are some development efforts trying to find cheaper metals that are also efficient electrolytes.

My work in flow batteries is essentially finding organic electrolytes that can beat vanadium cost- and performance-wise, so we need to predict voltages, solubilities, makeability... and point our experimental collaborators towards the best candidate molecules. One of the biggest issues we have to deal with is making sure that the molecules are stable for the many years the batteries need to last.

As you mention, OPVs also have issues with stability (and low efficiency, although perovskites are showing good promise lately). Figuring out what molecules are durable before we make them is one of the open challenges in molecular design. Hopefully somebody will jump on it for the Challenge.


So I know there's a lot of computational techniques out there to characterize a composition and structure, find the electron energy levels and pretty much everything follows from there. It can be computationally intensive, but reasonably feasible at least for things smaller than big ol' proteins.

So I can see an idea of desiring certain properties looking at a database, finding molecules in the neighborhood, conjecturing structures, modeling the properties, iterating and producing some candidates.

That might not actually be the right idea, but that's the impression I get. I'm wondering about the thing I don't have any real knowledge in. Synthesis methods. Is developing novel synthesis methods a factor? Is there any space in chemistry for things like reinventing the wheel with more environmentally friendly precursors?

I would guess, from my admittedly naive point of view, that predicting the time evolution at finite temperatures of an ensemble of randomly oriented molecules to predict the reaction pathways available and the equilibria would be computationally inaccessible. Does the sort of real ability to look at novelty in designing synthesis, passing through intermediate structures, and generally planning a path through a complex phase space exist robustly already in ways that I just don't know much about? Or are synthesis methods largely confined to branching out from a finite set of well-worn empirical paths? Is any of this a factor in the concept of smarter chemistry?

crossedstaves

Hi,

That is one of the directions computational chemistry is going. Can we build algorithms that understand and predict how to make molecules?

You are right to point out that trying to discover all the chemical reactions from first principles is daunting, but we can use machine learning on two centuries of chemical experiments. It's a bit like DeepMind with AlphaGo. In that project, a computer beat the human champion at a game much more difficult than chess. The software used machine learning to intuitively follow only the promising positions.

In the same way, machine learning can direct us to the areas that are implicitly interesting so we don't need to explore all the possibilities in the tree of possible chemistries. Various research groups are making good progress in this area.
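A toy sketch of that idea: a beam search that expands only the highest-scoring branches of a tree instead of enumerating all of them. The building blocks and the scoring function are hypothetical stand-ins; in a real system the score would come from a model trained on reaction data.

```python
BUILDING_BLOCKS = ["A", "B", "C", "D"]  # hypothetical fragments, not real chemistry

def score(candidate):
    """Stand-in for a learned model: rewards each 'AB' pair, plus a small bonus per 'A'."""
    pairs = sum(1 for i in range(len(candidate) - 1) if candidate[i:i + 2] == "AB")
    return pairs + 0.1 * candidate.count("A")

def beam_search(depth, beam_width):
    """Grow candidates one block at a time, keeping only the top beam_width each step."""
    beam = [""]
    evaluated = 0
    for _ in range(depth):
        expansions = [partial + block for partial in beam for block in BUILDING_BLOCKS]
        evaluated += len(expansions)
        expansions.sort(key=score, reverse=True)
        beam = expansions[:beam_width]  # prune everything the "model" finds unpromising
    return beam[0], evaluated

best, n_scored = beam_search(depth=4, beam_width=2)
n_exhaustive = len(BUILDING_BLOCKS) ** 4  # every possible length-4 sequence
print(best, n_scored, n_exhaustive)
```

The pruned search scores far fewer candidates than full enumeration. The trade-off is that a poor scoring function can discard the branch that leads to the best answer, which is why the quality of the learned model matters so much.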

At the same time, experimentalists are also automating synthesis.

My hope: we'll have self-driving synthetic robots right after self-driving cars.

This is Keira, Rafa may jump in here as well. Coming to this field from biology, I found this Scientific American article on the outstanding philosophical questions of chemistry useful.

It seems that synthesis isn't the biggest problem at hand anymore, although I know that it's a bit of a chicken-and-egg problem. If you develop a new tool for synthesis, suddenly it will become indispensable, even if no one realized they needed it beforehand.


I came here just to say that I thought this was a band doing an AMA. That said, are you often or at all confused for a band such as My Chemical Romance?

wite_rabit

This is maybe my favorite question.
I will say, no. That does not happen often.


Was a research chemist for over ten years. I am very excited and enthusiastic still over promoting innovation and outreach. With the price of tuition and university overhead it became cheaper to hire a post doc than a grad student. Eventually I had to spend more time writing proposals to cover my own salary than doing research, much less being able to support additional personnel, as federal funding becomes more competitive with each government shutdown and anti-science policies that ebb and wane with each election cycle. The push to encourage innovation in chemistry, especially via NSF outlets, to me is a very smart thing to do economically, but it also comes at the expense of respect for basic research that is not application based. My question is, to what extent do you feel that the push for "innovation" in the lab is jeopardizing fundamental research for the sake of applied, marketable, monetizable research? Is it in some respects "privatizing" research by letting big corporations donate to otherwise publicly funded research, or on the flip side, pressuring researchers to patent rather than discover so universities can profit from it beyond what they already took from overhead?

I don't mean to sound bitter. I did receive a substantial pay raise by leaving academia. I do still believe the goals, of fueling the American economy through innovation, and teaching researchers to be open minded about application, to be important. But at some level it seems that there may be a fundamental conflict that's being buried here that benefits the universities over the researchers.

ggrieves

Keira here - I feel your pain. The field of biology is suffering the same fate, where if there's not a direct application it is desperately difficult to get funding. Personally, I think basic and exploratory research is critical to furthering any efforts in innovation. If you haven't read Science in the age of Selfies it highlights the link between the two nicely.

I think there is a devaluing of basic research across the board, and it's paired with putting the concept of innovation on a pedestal. I think to bring these concepts back into balance we first have to realize that 'innovation' is not the same as 'solution', and if we're honest, solutions are what we're truly after in the field of applied science.

LAUNCH is working on identifying solutions, which I value as a scientist. The process starts by asking a good question, and then seeking the best, most effective answers - discovering companies rather than natural phenomena. And when I say best, I mean efforts that take into account social and environmental impact from the start, in addition to the more traditional aspects of innovation. I hope this mindset will go a long way towards resetting our obsession with innovation.


  • What are the different dimensions where one could provide an "Ecosystem for better innovation in Chemistry"?

  • What specific dimensions are you focusing on?

  • What are the key challenges across each of these dimensions?

  • How are you addressing these challenges?

denzil_correa

Hi denzil, thanks for the question. It's a long answer but I hope it's helpful.

Chemistry, like every field, has pressures that shape it. A healthy ecosystem allows for the development of new processes and methods that displace the old. Markets, in general, are not healthy ecosystems. Companies resist change unless there is an indisputable benefit, and often, that benefit must be financial.

LAUNCH takes on this resistance to change by involving the market from the very beginning. We go beyond asking how to make a better widget, we're asking how we improve making as a whole. And those answers are the same no matter who you ask - environmentally sound practices, doing more with fewer materials, and reducing waste while making better, more effective products for consumers.

In chemistry, we're limited in our ability to do these things by our lack of information. We need more information, better curated and more accessible, to be able to build this future of making.

LAUNCH prepares the market by involving them from the beginning to identify the problem and form the question (this is the challenge). We then connect the entrepreneurs that answer with a market that has specifically asked for what they do.


License

This article and its reviews are distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and redistribution in any medium, provided that the original author and source are credited.