Science AMA Series: We’re Sarah, Hansi and Aurora, postdocs at National Labs in the US. We each study a different flavor of computational science: computational genomics (Sarah), climate science (Hansi) and mathematical chemistry (Aurora). AUA!

Abstract

Hi reddit!

I’m Sarah Richardson. I specialize in the design of genomes and the creation of all the technological tools necessary to be able to write this sentence with a straight face. I work on massive scale synthetic biology projects (Sc2.0), the construction of genetic toolkits for non-model organisms (CRISPR for GMOs), and the reconciliation of computational genomics with experimental genomics (bioinformatics is not IT). All of which is to say, I am a germ wrangler who uses DNA to train microbes to do tricks.

I’m Hansi Singh. I will soon be joining the Atmospheric Sciences and Global Change Division at Pacific Northwest National Lab (PNNL) as a Linus Pauling Distinguished Postdoctoral Fellow. I research climate variability and change, with a focus on the polar regions. One intriguing open question is why Arctic and Antarctic climates are responding so differently to anthropogenic forcing by greenhouse gases. The tools I use run the gamut from global climate models run on supercomputers, to small heuristic models that can be analyzed with pencil and paper. I am also interested in developing novel mathematical analysis methods for improved understanding of coupling in the climate system and global teleconnections. For more information on my research interests and publications, please visit www.atmos.washington.edu/~hansi

I’m Aurora Pribram-Jones, and I tinker with electronic structure theory. I build mathematical tools to investigate how well we describe electrons in metals and molecules. My interests lie in analyzing and developing density functional theory (DFT), one of the most popular computational methods in the world, and how it’s used for thermal ensembles. Day to day, this means I get to interact with shock physicists, planetary modelers, and fusion scientists while imagining pseudo-molecules and drawing pictures. My newest projects look at applications of DFT methods in other complicated systems: materials for hydrogen storage, high entropy alloys, and materials responding to lasers.

We're signing off now, but will continue to answer questions where we can. Keep an eye out here and at /u/TheGermWrangler. Thanks for having us!

Aurora, Hansi, and Sarah

Hi Dr. Richardson, Dr. Singh and Dr. Pribram-Jones. Thanks for taking the time to do this AMA.

Dr. Richardson: What is a genetic toolkit and how do you construct them? What do they allow you to do?

Dr. Singh: I was surprised when you mentioned that you work with models that are simple enough to be analyzed by hand. Do you struggle to balance robustness, complexity and accuracy? Could you give an example of one of your simplest and one of your most complex models? Do you have a favorite?

Dr. Pribram-Jones: My quick googling reveals that DFT is a quantum mechanical model. Are there special considerations you have to make in order to model quantum phenomena using traditional (binary) computers? Also, what is a pseudo-molecule?

PapaNachos

Hansi Singh here: I once analyzed a 5-equation model of atmosphere-ice-ocean interactions for understanding the Dansgaard-Oeschger oscillations in the climate system. I love tiny models! Then, I also work a lot with global climate models, which involve thousands of equations and millions of lines of code. Both have their place. If you can show that something is true with a hierarchy of models, I think you've said something about the robustness of your hypothesis, and you get a better understanding of the real climate system.
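To make the "tiny model" end of that hierarchy concrete, here is a minimal sketch of a textbook zero-dimensional energy balance model. It is not the 5-equation model mentioned above; every parameter value (solar constant, albedo, effective emissivity, mixed-layer heat capacity) is an assumption chosen for illustration. The model balances absorbed sunlight against emitted infrared and can be checked with pencil and paper against its own closed-form equilibrium.

```python
SIGMA = 5.670374419e-8   # Stefan-Boltzmann constant, W m^-2 K^-4
SOLAR = 1361.0           # solar constant, W m^-2 (assumed)
ALBEDO = 0.30            # planetary albedo, held fixed (assumed)
EMISSIVITY = 0.612       # effective emissivity, a crude stand-in for the greenhouse effect (assumed)
HEAT_CAPACITY = 4.0e8    # J m^-2 K^-1, roughly a 100 m ocean mixed layer (assumed)


def net_heating(temp_k):
    """Absorbed solar minus emitted infrared, in W m^-2."""
    absorbed = SOLAR / 4.0 * (1.0 - ALBEDO)
    emitted = EMISSIVITY * SIGMA * temp_k ** 4
    return absorbed - emitted


def integrate(temp0_k, years, dt_s=86400.0):
    """Step C dT/dt = net_heating(T) forward in time with explicit Euler."""
    steps = int(years * 365 * 86400 / dt_s)
    temp = temp0_k
    for _ in range(steps):
        temp += dt_s * net_heating(temp) / HEAT_CAPACITY
    return temp


def equilibrium_temperature():
    """Pencil-and-paper solution: set net_heating(T) = 0 and solve for T."""
    return (SOLAR * (1.0 - ALBEDO) / (4.0 * EMISSIVITY * SIGMA)) ** 0.25


if __name__ == "__main__":
    print("integrated:", integrate(273.0, 100))
    print("analytic:  ", equilibrium_temperature())
```

The numerical integration and the hand calculation should agree, which is the same kind of cross-check across a model hierarchy described above, only at a much smaller scale.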


Richardson here:

Broadly, a genetic toolkit is a set of DNA that allows researchers to change the behavior of a cell. When we say this about bacteria, we usually mean "vectors" -- circular DNA that we can edit and which can then enter the cell and be maintained by the cell. For the most common example, E. coli, there are three types of vectors, and together they make up a basic genetic toolkit that allows us to manipulate that bacterium.

We can also include protocols in that genetic toolkit - how do you get a vector into a cell? How do you kill cells that don't take up the vector?

I construct toolkits by making vectors and protocols for bacterial species where they have never been established before.


Pribram-Jones here. Thanks for the question! I'll start with answering your question about my weird word, pseudo-molecule. I used that word to describe a particular weirdness in DFT. DFT is used to figure out what electrons are doing in molecules and metals, for instance. Answering that question for real-life, messy electrons is tough, one of the hardest questions out there. DFT uses a trick where we pretend a bunch of aloof, non-interacting electrons are hanging out in a slightly different molecule that by design gives the same likelihood of finding electrons floating around as you'd find in the real-life molecule. This non-interacting problem is way easier to solve than the messiness of real life. We know from the fancy foundational theorems about DFT that all we need is this probability of finding electrons (called the density) in order to get a bunch of important information about the molecule, like the energy of the lowest state, for instance. So, as long as we get the same probabilities for the aloof calculation and the interacting calculation, we can avoid the hard problem and still get the answer we care about. Now, in practice, we have to use some approximations to actually carry this out. But we can think about what would happen in the exact case and figure out some constraining properties that must be true in the exact mapping.

I sit around thinking about this exact mapping a lot, so I'm usually talking about these pseudo-molecules, the non-interacting aloof ones, and how they relate to the exact messy one.
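For readers who want to see the mapping written down, here is the standard textbook form of it, usually called the Kohn-Sham construction. This is a sketch in Hartree atomic units for N non-interacting electrons; the symbols (orbitals \phi_i, effective potential v_s, density n) are the conventional ones and are not taken from the answer above.

```latex
% The "aloof" non-interacting electrons each obey a one-particle equation
% in an effective potential v_s (the pseudo-molecule's potential):
\left[-\tfrac{1}{2}\nabla^{2} + v_{s}(\mathbf{r})\right]\phi_{i}(\mathbf{r})
  = \varepsilon_{i}\,\phi_{i}(\mathbf{r})

% v_s is defined so that their density matches the density of the real,
% interacting molecule:
n(\mathbf{r}) = \sum_{i=1}^{N} \lvert \phi_{i}(\mathbf{r}) \rvert^{2}
```

In practice the exact v_s is not known, which is where the approximations mentioned above come in.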

ETA: Oh, I got so excited that I forgot about your first question. Modeling quantum behavior is hard for lots of reasons. But when people talk about how exciting quantum computers will be, it's usually more about their computational prowess over a wide range of topics. One thing that's very hard about doing quantum problems is that all of these particles are correlated, they care about each other's behavior. This makes splitting up the problem into parallelizable pieces way more difficult. Density functional theory, what I work on, lets us skirt this tricky problem via that tricky mapping to a non-interacting system. And that trick works no matter what kind of computer we use because it's more about rewriting the problem than hitting it with a different, bigger, better computer.


Dr. Pribram-Jones: It sounds like you work with people who work at a wide range of scales (from particles to planets). What's your favorite thing about dealing with that kind of variety in your work?

myersjustinc

I love this question! Being focused on one scale can give you horrendous tunnel vision. Working with people who actually use DFT for a wide range of applications keeps me from spiraling on my own scientific neuroses. For instance, something that might matter at the atomic scale might be totally washed out by the time someone is modeling a planetary core. This happens with temperature and time scales too, so working with so many different people keeps me on my toes.


How do you describe what you do to the general public? What makes your work worth funding? I applaud you all for engaging us with this AMA but I'm curious how much "advocacy" do scientists do, do you have time for it and how do we help convince the general population something so granular matters a lot in terms of our energy/environment/health future.

mia7812

Richardson here:

I am a black female scientist. I advocate (sometimes by my very presence) for three things: science, female participation, and underrepresented minority representation.

I go to the public schools around Oakland, Hayward, and Richmond at least twice a month for what are essentially hour-long AMAs with science students from grades 2-12. First question: you don't look like a scientist -- are you really a scientist? My lesson from these interactions is that a lot of people don't have a stake in science. They've never met a scientist, even. The least I can do is give them a stake in it. Science is for the people.

More generally I explain to everyone that they should fund science so they can make the scientists come back to them and explain themselves. Science is not just about reaching a frontier, it's about paving the path you forged so that it isn't a frontier for anyone else. If I carve a path to a new discovery how terrible am I to force those that follow me to struggle as much as I did? In that sense a scientist is obligated by their profession to be a communicator. That is not to say that I am often rewarded for taking time to work with students, or members of the public.


Singh here: no advocacy, just education. I try to show people how interesting the science of the Earth system is.


Singh again: Totally agree with Sarah. As a woman and person of color who's also a scientist, outreach is a part of my advocacy. We need to expand the purview of science, math, and geekdom beyond the white male population.


Pribram-Jones here: How I describe my work to the general public often depends on who the people are (kindergarteners vs. college students) and what they care about (science fiction vs. oceanography). There are organizations that advocate for certain funding decisions, but I present facts as best I can to as many people as I can. A lot of my "advocacy" comes in day-to-day interactions, so people see who I am, how I am, where I come from, and what I care about. This goes for my science as well as my identity. I pass as a white cisfemale, though I have mixed heritage and am female nonbinary in gender, so I am often advocating by my presence and my participation in conversation and actions.

In case you really wanted to see how I might describe my work, here's one explanation. What I do builds tools we use for calculating how electrons behave in molecules, metals, and materials. Electrons are important because they are the movers and shakers in chemical bonds, whose forming and breaking determine whether your food gives you energy, your car bottom rusts out, and whether your computer battery explodes. They are tricky, and we have to use computers to figure out some of their important behavior. Making good tools means we can make better stuff and keep our stuff (tools, environments, bodies, the whole gamut) longer.


A few questions for Hansi Singh:

  1. Which came first, your interest in math and computer science or climatology?
  2. What progress were you able to make in understanding Dansgaard-Oeschger events during your PhD?
  3. What is your take on the differing responses of the Arctic and Antarctic to anthropogenic GHG forcing? Do you see ozone loss over Antarctica as playing an important role?

Thanks!

IceBean

Singh here:

(1) Definitely math and physics first! I was in high school in the 90s, so there were no computer classes in my high school. I only started coding in college, and pretty intermittently. When I returned to grad school in 2009, that was when I began coding in earnest.

(2) That you might not require any large changes in the ocean's overturning circulation for large shifts in North Atlantic climatology to occur. Most previous studies have assumed that unexplained changes in the overturning circulation instigated changes in North Atlantic climate over the last glacial period. I think that particular hypothesis is a bit unsound, in that you might not need such ad hoc external factors. I think our work showed that interactions involving the local ocean hydrography, sea ice, and atmosphere could give you millennial time scale oscillations of the current magnitude. This is still an underdog hypothesis, by the way. But maybe, as more proxy evidence accumulates, it and other existing hypotheses can be further vetted.

(3) Yes, that ozone loss! I think it definitely has something to do with it, particularly because ozone loss induces wind changes that can increase poleward energy transport by Southern Ocean eddies. The increase in sea ice around Antarctica is definitely perplexing, considering the strong sea ice decline in the Arctic. The other hypotheses involve the ocean circulation (the ocean meridional overturning cell structure around Antarctica is very different from that in the Arctic), circulation changes due to changes in buoyancy (perhaps driven by ice shelf melt or changes in precipitation, which stabilize the water column), the effect of Antarctic orography (3000+ m in some spots, which affects poleward energy transport by the atmosphere) compared to the Arctic, and land/ocean arrangements (ocean surrounded by land in the Arctic vs. land surrounded by ocean in the Antarctic).


A question for all 3 of the guests:

How much of your work is done in the cloud? How do you expect this to change? Are there prohibitive factors?

Thanks so much!

discofreak

Singh here: I'd love more of our meetings to happen over the cloud. I hate jet lag. Most of my computing, however, does happen remotely. I don't have my own supercomputer (though sometimes I wish I did; I hate having my jobs waiting in the queue!).


Richardson here: If you mean how many jobs can I send off to our robotic BLAST overlords and all the other servers that specialize in doing my computational work -- NONE. None of my work is modular or common or established enough that I can just spawn off jobs.

I use Dropbox to keep my research library and laboratory notebook, and I am sometimes disappointed in my bacteria. That is as cloudy as I get.


Pribram-Jones here: Yep, Dropbox and GitHub here, but mostly for sharing and storing documents. I probably do that every day. I have, on occasion, worked on projects that had real, honest-to-goodness computation going on. There, I queue my jobs up for supercomputing like Hansi.

Truth be told, my version of "computational science" is mostly under the hood of under the hood, so I'm not even at the level of designing mindblowing algorithms. I am looking at and creating the equations that we need to implement in computational schemes to solve these electronic structure problems.


I have a question for Sarah. How has Deep Learning impacted research in Genomics? Are there any success stories?

denzil_correa

Sarah here with an IMHO disclaimer:

Functional genomics needs to come before deep learning. I don't think we have enough data about genomes to apply Deep Learning, Machine Learning, Data Science, or >Buzzword< yet. That is obviously an overbroad statement, and I welcome rebuttal because who doesn't love to argue, but -- it is so much easier to sequence a genome than it is to see a genome in action, or to perturb it. That means genome sequence stacks up and genome analytics lags by decades. That means genome sequence gets misinterpreted as genome data. It's not enough information. By functional genomics I mean more data -- properly formatted for the deep learning etc. -- that links sequence to biochemical and regulatory function and timing. In bacterial genome annotations, gene function is guessed at by sequence similarity -- and function is never, ever proven. That is a huge hurdle to jump and that's just bacteria.

Sorry to be a buzzkill.

ETA: Also if you want to know the human genetic markers for longevity I think you're going to need long term study - lifespans of sequencing and waiting to see who dies of what. How can you know which alleles are associated with longevity in an 18 year old? Not to sound morbid but let's see what takes that kid out. This necessarily means the people paying for such research now will not benefit. So... root for chemical longevity, I guess?
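The remark above about bacterial annotation ("gene function is guessed at by sequence similarity") can be illustrated with a toy sketch. This is not a real annotation pipeline: the sequences, gene names, and function labels below are all made up, the sequences are assumed to be pre-aligned and equal in length, and real tools use proper alignment and statistics rather than a bare percent identity.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions that match between two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def guess_function(query, annotated):
    """Borrow the label of the most similar annotated sequence.

    `annotated` maps a gene name to (sequence, function label).
    Returns (borrowed label, percent identity of the best hit).
    """
    best_name = max(annotated, key=lambda name: percent_identity(query, annotated[name][0]))
    best_seq, best_label = annotated[best_name]
    return best_label, percent_identity(query, best_seq)


# Hypothetical reference set: names, sequences, and labels are invented.
REFERENCE = {
    "geneA": ("ATGGCTAAAGGT", "hypothetical kinase"),
    "geneB": ("ATGCCCTTTGGA", "hypothetical transporter"),
}

if __name__ == "__main__":
    label, identity = guess_function("ATGGCTAAAGGA", REFERENCE)
    print(label, round(identity, 1))
```

Note what the sketch does not do: nothing in it tests whether the query gene actually performs the borrowed function. The label is only as good as the original annotation and the assumption that similar sequence implies similar function.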


Thanks for doing this AMA! I have a couple of questions about modeling.

How does one ensure rigor in computational and mathematical models? Are there general standards for ensuring rigor that work across different disciplines, or does it have to be subject-specific?

More generally, could you give a layperson-friendly description of how one goes about developing a novel computational or mathematical model? Where do you start? Is a computational model primarily a tool to transition into a mathematical one, or vice-versa, or do they each have their own uses?

rslake

Pribram-Jones here: Rigor is a slippery term for scientific and social reasons. Some spaces use it as a way to judge others for not being "hardcore" enough, which often gets skewed by people in a dominant group as a weapon of exclusion. But I am also into rigor and predictability of errors in their friendly, science-y, inclusive form, so I love this question.

I approach this question by thinking about how nearly everything we do in science is us dealing in and interpreting models. So, you can be a real nerd buckling down on super accurate quantum mechanical calculations, which can make DFT or classical forcefield models look "less rigorous." But most folks leave out other things beyond "everyday" quantum mechanics even when working at that level. Why? Because it doesn't matter to their problem! So the rigor of that method doesn't matter there either because they think it's the best method for their question of interest. Working across topics means that I am confronted with my own assumptions quite a bit, whether in how I think about problems or how whatever version of DFT makes assumptions about what does or doesn't matter. For instance, the vast majority of DFT calculations couldn't care less about temperature dependence because room temperature is effectively zero temperature to an electron (for them to go "up" one step in energy, they'd need to be really hot). But for people working here at LLNL, temperature often is that high, so zero-temperature, ground-state DFT is a model that doesn't always work for us.
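The claim that room temperature is effectively zero temperature to an electron can be checked with a few lines of arithmetic. This is a rough sketch under stated assumptions: the 1 eV energy step and the 100,000 K "hot" temperature are illustrative round numbers, not values from the answer above, and a simple Boltzmann factor stands in for the full electronic statistics.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV per kelvin


def thermal_energy_ev(temp_k):
    """Characteristic thermal energy k_B * T in electron volts."""
    return K_B_EV_PER_K * temp_k


def boltzmann_factor(gap_ev, temp_k):
    """Relative weight exp(-gap / k_B T) of a state lying gap_ev above the lowest one."""
    return math.exp(-gap_ev / thermal_energy_ev(temp_k))


if __name__ == "__main__":
    gap_ev = 1.0  # assumed, illustrative electronic energy step
    for temp_k in (300.0, 1.0e5):  # room temperature vs. an assumed "hot" case
        print(temp_k, thermal_energy_ev(temp_k), boltzmann_factor(gap_ev, temp_k))
```

At 300 K the thermal energy is a few hundredths of an eV, so the weight of a state 1 eV up is vanishingly small; at the hotter temperature the two are comparable, and a ground-state picture stops being a safe assumption.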

Some things you can do to consider robustness across fields include checking how sensitive your answers are to different assumptions. Does my answer change as I assume opposite things? If it changes a lot, the model probably needs to assume that matters in some way. We often get used to thinking that the piece of the physical world we care about matters out to some sort of boundary, as that's where we live mentally and it's where we're comfortable, where we think the action is. It's why we care about it! But without stepping outside of that boundary periodically, it's easy to lose sight of when that boundary makes sense and when it doesn't. For instance, DFT is a very powerful tool, but its different flavors just don't do certain things that are necessary for some kinds of systems. The rigor or lack thereof of DFT isn't really the issue there, as the question you are asking matters way more than the intrinsic qualities of the theories and models.
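The sensitivity check described above ("does my answer change as I assume opposite things?") can be sketched as a one-at-a-time perturbation test. This is a minimal illustration, not a method from the answer: the helper `relative_sensitivities` and the pendulum stand-in model are both hypothetical, chosen because the right answer is known from the formula.

```python
import math


def relative_sensitivities(model, params, rel_step=0.01):
    """Estimate d(ln output)/d(ln input) for each input, one at a time.

    Each parameter is nudged up and down by rel_step (a fraction of its value)
    while the others are held fixed; a central difference gives the slope.
    """
    base = model(**params)
    result = {}
    for name, value in params.items():
        up = dict(params, **{name: value * (1.0 + rel_step)})
        down = dict(params, **{name: value * (1.0 - rel_step)})
        result[name] = (model(**up) - model(**down)) / (2.0 * rel_step * base)
    return result


def pendulum_period(length_m, gravity_m_s2):
    """Stand-in model: small-angle pendulum period, 2*pi*sqrt(L/g)."""
    return 2.0 * math.pi * math.sqrt(length_m / gravity_m_s2)


if __name__ == "__main__":
    print(relative_sensitivities(pendulum_period, {"length_m": 1.0, "gravity_m_s2": 9.81}))
```

A sensitivity near zero suggests the assumption about that input barely matters for the question being asked; a large one suggests the model needs to treat that input carefully. One-at-a-time tests miss interactions between inputs, so this is a first look rather than a full analysis.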

Another issue is implementation, particularly in moving from theory (writing down the equations and ideas) to computation (actually calculating things based on a particular theory). To implement a theory, you usually need to solve a bunch of mathematical equations, and you need to be certain that the way that you are solving those equations on a computer gives you a consistent answer. This is a whole other question of rigor and careful choices.

Developing models contains a ton of choices, many of them very personal. In my field, there are different philosophies about how to construct approximations, which is similar and gets at some types of those questions you have to ask about models in general. Which you adhere to is sometimes ascribed to your identity as a chemist or a physicist, but I think it's more about what your objective is. Some of us really dig using mathematical and physical constraints on the exact formalism of DFT as our favorite tools to help build approximations. Often, this produces approximations that are more systematic in their errors, but that give larger errors. So, for instance, maybe they are less accurate, but you always know the value you get is too small for a calculated property. On the other hand, you can take very good answers from another source (experiment or extremely accurate calculations), and use those to help you fit a functional form. This often gives you higher accuracy, especially for systems that "look like" your fitting set. But the downside is that you don't always know where systems start looking "too different." So, you're faced with a decision: do you need to be right or be predictable? (This is a huge oversimplification. For more about this in a lighthearted narrative, you can look at this review my co-authors and I wrote - see Cultural Wars and the included fable).
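The downside of fitted approximations described above (good on systems that "look like" the fitting set, unpredictable elsewhere) shows up even in a toy curve fit. This sketch is only an analogy and has nothing to do with actual density functionals: the "reference" data is an exponential, the "approximation" is a least-squares straight line, and the fitting range [0, 1] is an arbitrary assumption.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def reference(x):
    """Stand-in for 'very good answers from another source'."""
    return math.exp(x)


# Fitting set: eleven points on [0, 1] (assumed range).
TRAIN_X = [i / 10 for i in range(11)]
SLOPE, INTERCEPT = fit_line(TRAIN_X, [reference(x) for x in TRAIN_X])


def fitted(x):
    """The fitted approximation."""
    return SLOPE * x + INTERCEPT


def abs_error(x):
    """Absolute error of the fitted form against the reference."""
    return abs(fitted(x) - reference(x))


if __name__ == "__main__":
    print("worst error inside fitting range:", max(abs_error(x) for x in TRAIN_X))
    print("error well outside it (x = 3):   ", abs_error(3.0))
```

Inside the fitting range the line looks respectable; far outside it the error grows large, and nothing in the fit itself warns you where that transition happens.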

Edited to fix confusing typos.


@Hansi and Aurora: I'm an electrical engineer interested in optimizing computer architectures for various applications. What tools do you use for the computations you perform? NVIDIA's CUDA seems like a good start, but I have a feeling that we could eventually develop even more application-specific computer architectures to allow for even faster execution of certain algorithms.

Have you studied how your algorithms are implemented in hardware? Are your problems structured in a data-parallel way that makes it easy to break them up for multiple processors? Have you found that there are bottlenecks to the computation you're currently using? Are there any features you'd like to see in future computing architectures (or languages / compilers)?

jesterbuzzo

Pribram-Jones here: Sorry for the delayed response to this one. A lot of these questions are beyond the scope of my research, since I'm not doing much of the real work that goes into implementing DFT. Actually, most of what I know about these questions in the context of DFT and quantum chemistry comes from talking with other CSGF fellows (the graduate fellowship from the DOE that three of us had during our PhDs). For a perspective on parallelization within a quantum chemical context, you might check out work by Edgar Solomonik and Devin Matthews. For instance, their work on Cyclops Tensor Framework addresses load imbalance and communication concerns in massively parallel coupled cluster calculations. Another alum of the program (and another national lab postdoc), Jarrod McClean, has done some work combining quantum and traditional computers, which might be a way to reduce coherency period requirements.

The computational bottleneck for the way most DFT calculations are performed is solving the Kohn-Sham equations, a set of eigenvalue problems that gives us the energies and eigenstates of electrons in our systems. This is the big drawback to mapping the complicated interacting system to the non-interacting system, which is easier to solve, but still computationally expensive. This becomes a real problem in my world, where temperatures get very high. This means a huge number of states are accessible to the electronic systems of interest, so these systems of equations get massive and even more expensive.
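To see why high temperatures make these calculations more expensive, here is a small sketch that counts how many one-electron states carry non-negligible occupation as temperature rises. Everything in it is an assumption made for illustration: the evenly spaced levels (a 1D harmonic oscillator in units where the spacing is 1), the fixed chemical potential (in a real calculation it shifts with temperature), and the occupation cutoff of 1e-3.

```python
import math


def fermi_occupation(energy, mu, kt):
    """Fermi-Dirac occupation 1 / (exp((E - mu)/kT) + 1), guarded against overflow."""
    x = (energy - mu) / kt
    if x > 700.0:  # exp(x) would overflow a double; the occupation is effectively zero
        return 0.0
    return 1.0 / (math.exp(x) + 1.0)


def states_needed(kt, mu=5.0, n_levels=200, cutoff=1e-3):
    """Count levels E_n = n + 1/2 whose occupation exceeds the cutoff."""
    count = 0
    for n in range(n_levels):
        if fermi_occupation(n + 0.5, mu, kt) > cutoff:
            count += 1
    return count


if __name__ == "__main__":
    for kt in (0.01, 1.0, 10.0):  # thermal energy in units of the level spacing
        print(kt, states_needed(kt))
```

The count grows steadily with temperature, and since each extra state is another eigenpair the solver has to produce, the eigenvalue problem grows with it.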

One way around this problem is orbital-free methods, which means you avoid solving this eigenvalue problem, relying instead on using only the electronic probability density (i.e., what's the chance of finding an electron at this position). Density functional theory tells us that we can get exactly the right energies out of a calculation from only this density, but we don't know the formula that gives us this exact answer. This means a lot of orbital-free theories are computationally inexpensive, but less accurate than Kohn-Sham DFT. There's a variety of people working on how to write down better and better formulas for the energies, in order to take advantage of this computational efficiency, both at zero temperature and in temperature-dependent situations. If you'd like to see the version that I developed with my collaborator, called finite-temperature potential functional theory (FT PFT), this paper gives the formalism and a numerical demonstration of how it works. This is just in its early stages, so is only demonstrated in an extremely simple system, but we are working with other collaborators on developing three-dimensional approximations that would let us use it in real systems.
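As a concrete example of a formula that uses only the density, here is the oldest and simplest orbital-free approximation for the kinetic energy, the Thomas-Fermi functional, written in Hartree atomic units. It is shown only to illustrate what "density in, energy out" looks like; it is not the FT PFT approach mentioned above.

```latex
% Thomas-Fermi kinetic energy: a function of the density n(r) alone,
% with no orbitals and no eigenvalue problem to solve.
T_{\mathrm{TF}}[n] = C_{F} \int n(\mathbf{r})^{5/3}\, d^{3}r,
\qquad
C_{F} = \tfrac{3}{10}\left(3\pi^{2}\right)^{2/3} \approx 2.871
```

Evaluating this is a single integral over space, which is why orbital-free methods are cheap; the price is that simple formulas like this one are much less accurate than solving the Kohn-Sham equations.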


Hansi, PNNL is known for innovation in high-performance computing. I worked with them in the past and heard about their recent designs for reducing the impact of supercomputers on the environment. How environmentally sustainable are supercomputing and cloud computing?

iamisg

Singh here: I think the biggest issue is energy use. Unfortunately, our processors are very power hungry, especially the fast ones. Luckily, there are computer engineers out there working on this problem. Hopefully, the next generation of supercomputers will be more responsible power consumers. It's certainly ironic that the supercomputing that I (and many other climate scientists) do is, undoubtedly, contributing to greater carbon loading of the atmosphere.


Dear Dr. Pribram-Jones: I am a grad student in computational chemistry. I am doing my second attempt for a doctorate. In my previous school I tried to develop a hybrid QM-MM method, but got negative results, which annoyed my previous advisor. How often do you see situations like this happen?

Now I am doing more practical and applied research (still in computational chemistry) in Chicago. Now it seems that I can get a reasonable amount of publications. I just started in December. What are job prospects for PhDs in computational chemistry?

Finally, I have a disability. How do people with disabilities succeed in science? What are their employment prospects (i.e. would employers be interested in hiring people with disabilities)?

vanrossum1

Richardson here: People with physical disabilities face an uphill battle to participate in experimental science. I have been in experimental laboratories for seventeen years and I have never worked with someone who needed permanent assistance to walk. I am not going to say no one is out there, just that it is incredibly hard and that they may be encouraged to pursue other paths. Labs are often old places that are not set up for universal access. This is atrocious but true. We should all work to change it.

People with mental disabilities face all of the same hurdles in science that they do in other areas, potentially worse because of science's obsession with "brainpower".

What we hope is that employers hire based on record and potential. What we know is that they don't. I don't have a good answer but I wish you luck.


Singh here: Well, employers better be interested in hiring people with disabilities. Both because it's the law that an employer can't discriminate, and because it increases the diversity of the workplace. The latter is a good thing for everyone involved, including the employer. This should definitely not be the thing that stops you from pursuing your career in computational chemistry. I wish you the best.


Pribram-Jones here: There are so many good questions in here. S and H hit good points about disability and science. I had a pretty severe limitation for about a year of labwork before grad school. It was brutal. I'd be surprised if things were not especially bad in grad school as far as access because grad school is set up to be exclusionary. There are lots of us working to make changes in the way people perceive inclusion, but frankly, it's an uphill battle with the sciences. IMHO, the myth of scientific objectivity and ignorance about implicit bias mean that scientists can be especially resistant to inclusion measures and hesitant to acknowledge privilege because of how it devalues the struggle we've been taught to value historically. But there are advocates out there, and hopefully Hansi's point means that more people beyond grad school will get it together about access and valuing the better teams it will provide employers and groups.

As far as your question about negative results, those happen all the time! All of us hit snags and failures, but it's key to use that to refocus in some other way. Method development is challenging in many ways, particularly if you're trying to combine quantum mechanics and classical mechanics. With QM-MM, it's my understanding that you have to decide how to tie those two approaches together, which invites a whole new set of questions beyond developing just a quantum mechanical method and a molecular mechanics method.

Some advice I've been given by mentors is to have a range of projects, some that are long-term, reach goals, and some that are shorter-term goals. If you're trying to develop a whole new method, that's a long-term goal with lots of potential for trouble, no matter how promising it seems. So, if you're working on something like that, it's a good idea to have something complementary that helps you in some way with that but doesn't rely on the full success of your new method. For instance, if in that situation again, you might see if you can use an established method on new types of systems, or if you can improve some small piece of the overall method. That way, even if you don't have a fully functional, novel method at the end, you have something tangible to show for all your hard work. I feel like this sort of strategizing is important for all of us because it's so hard to predict success, let alone when it will happen, especially as a grad student just getting their chops in the research game.


Dear Dr. Pribram-Jones: I am a grad student in computational chemistry. I am doing my second attempt for a doctorate. In my previous school I tried to develop a hybrid QM-MM method, but got negative results, which annoyed my previous advisor. How often do you see situations like this happen?

Now I am doing more practical and applied research (still in computational chemistry) in Chicago. Now it seems that I can get a reasonable amount of publications. I just started in December. What are job prospects for PhD's in computational chemistry?

Finally, I have a disability. How do people with disabilities succeed in science? What are their employment prospects (i.e. would employers be interested in hiring people with disabilities)?

vanrossum1

APJ again: Sorry, I didn't see your whole question on my browser before. I'm glad things are going better for you now. Grad school is a long haul. As far as job prospects, it probably depends a bit on what skills within computational chemistry you develop over the course of your graduate work. I work in a group at LLNL called the Quantum Simulations Group, where there are lots of computational chemists and physicists working on problems from across the lab's experimental areas, so becoming a postdoc or staff scientist at a national lab is one place outside of academia to look for work. Other places that use computational chemistry might be in pharmaceuticals or materials engineering, but I would bet most of those places would want a PhD with some specific experience in that area.

In my experience as someone who had a few different careers before returning to college, no matter what you're interested in, having more skills gives you more options. For instance, running codes will not serve you as well as learning some coding yourself, as that's a skill almost everyone needs at some point in and out of the sciences these days. This is particularly true if you're hoping to branch out from computational chemistry into other technical areas. Another way to do this within chemistry is to keep up some of your experimental skills related to your area of computational expertise.

By far the most important skills to develop for future job prospects are communication skills, being able to write and speak well about technical work across communities. A complaint I hear from friends about working with PhDs is that we get far, far too focused on one narrow area for too long, so we can easily forget how to talk to anyone outside of our field (or even within it sometimes). This doesn't fly outside of academia and shows up almost immediately in interviews, if not before. So, get out of the lab, do outreach, write things other than papers, and make connections with non-scientists and non-chemists where you can. Good luck!


Sarah, what do you think about synthetic microbiomes and synthetic phages for modulating microbiota affecting human health?

iamisg

Sarah here: We don't know enough about non-perturbed systems to start perturbing them right now. I vote no no no no no. Not now.

A timeline argument: right now it's hard for those who study microbiomes to know what's going on in there. We only have the coarsest snapshots. To charge in early is to miss a trick.

A phage argument: the scope of phage is really really hard to appreciate. Some virologists will tell you that most viruses and phage are not pathogenic, they seem to be more like global scale genetic reservoirs. That makes me careful about contributing to their scope.

A philosophical argument: To propose something synthetic (or human-designed) to replace something so poorly understood, so subtle, intricate, and time-tested is inherently arrogant. I am pessimistic about the forecast for success.

A capitulation: I am not that hot on a synthetic microbiome, but I do like the idea of altering the microbiome in place. This could mean adding or removing genetic material, but not adding or removing species. Bonus -- we're good at that; it's called pollution.

Further capitulation: I would totally build you a synthetic micro-ecosystem in a glass, and you can call it a synthetic microbiome if you want.

(you pushed one of my buttons)


For all: What was your experience like in the DOE CSGF program and now as postdocs in national labs? Any advice to those considering a similar path?

obriennolan

Richardson here: CSGF was AWESOME. Any student should leap at the chance to be in any established fellowship. To have ready access and support from a program but also from a cohort of students with similar interests and similar problems is a serious asset -- it's about way more than the money or the prestige. Graduate school is a terrible ordeal. It is a confidence-sucking time hole that no one survives unscathed. CSGF made it slightly less awful, which is actually quite a feat! I have several other fellowships now, and while I enjoy the company of all of my fellows, CSGF still stands out (DON'T TELL MY OTHER FELLOWS).


Thanks, everyone, for the AMA!

Dr. Richardson: You mentioned "the reconciliation of computational genomics with experimental genomics". I don't know much about the field, so I'm curious what kinds of things need to be reconciled. Are you refining models to align more closely with experimental observations, or are there some fundamental discrepancies that need to be resolved?

myersjustinc

Richardson here: I am straight up talking about the fact that computational biologists and experimental biologists live on different planets. They have non overlapping training and skills and they frequently have no real understanding of (or respect for) the requirements of life on the other planet. They are prone to skip collaborating, to collaborate only shallowly, and worse, to force students to pick sides. And yeah, I'm about to trash talk both kinds so please take the next paragraph as IMHO:

An experimental biologist often will not be able to explain to you the difference between computational biology, bioinformatics, software development, or IT -- any Ph.D. who can use "sudo" is a computational biologist as far as they are concerned. A computational biologist often has never grown bacteria and does not understand why it takes so long to get a result, or why an experimentalist is only going to do three repetitions of an experiment -- they frequently call the bench "icky".

As someone who can do both, I am frequently asked to do only one because it's easier for each side to typecast me than to find someone from the other side who is willing to learn both languages. So the experimentalists just want me to compute and the computationalists wish I would just experiment. Being a natural contrarian, I am bent on teaching experimentalists to compute and teaching computationalists to experiment - not to convert, but to start building the bonds of respect and understanding on both sides so collaboration gets better.


Hello all,

As someone who has a basic grasp of how to code, how would you suggest that someone interested in applying or learning more about computational science go about it? I'm familiar with the methods to solve "simple" PDEs, but it always seems like most computational scientists ended up doing a Ph.D. in that area, leaving the rest of us in the dark about how to model systems. Is there any way you'd suggest to "ease" my way into the world of computational science beyond pursuing another Ph.D.?

rseasmith

Singh here: I see that you are an environmental engineer. If one is modeling the climate, what you get, under the hood, are the Navier-Stokes equations, with some assumptions made to remove sound waves, add rotation, deal with compressibility, account for shallowness, etc. These are then linked to equations about thermodynamics and radiative transfer. So, modeling systems is often, at its crux, about using mathematical physics (often just Newtonian). If you're writing a low-dimensional mathematical model, on the other hand, you have to have some physical intuition about how these very unwieldy equations can be simplified; in other words, breaking things down to fundamentals for a particular limited problem. The issue with these giant models, of course, is that they're expected to be applicable to any problem. When you have a smaller problem of interest, paring down the problem to a simpler model is easier.
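For readers who have not seen them written down, the equations mentioned here can be sketched in one common textbook form: momentum balance for a fluid on a rotating planet, plus mass conservation. The symbols are assumptions chosen for illustration (velocity u, planetary rotation vector Omega, density rho, pressure p, geopotential Phi, friction F), and this is not necessarily the exact set that any particular climate model solves.

```latex
% Momentum equation in a rotating frame (Navier-Stokes with a Coriolis term)
\frac{D\mathbf{u}}{Dt} + 2\,\boldsymbol{\Omega} \times \mathbf{u}
  = -\frac{1}{\rho}\nabla p - \nabla \Phi + \mathbf{F}

% Conservation of mass
\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho\,\mathbf{u}) = 0
```

The simplifying assumptions listed in the answer (removing sound waves, accounting for shallowness, and so on) amount to dropping or approximating terms in equations like these.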


I hope these questions aren't too off-topic. How did you get started in working in this field, what program would you recommend to get started in programming and which do you use?

Are there some specific models that are beneficial in your work (some kind of multiple time series regressions)?

alessandrux

Singh here: If you are interested in coding, I definitely recommend Python (after you've learned the basics on something like code.org). I got started in this field because I love math, and a lot of complex math can't be solved by hand. That's where computers come in. For me, I was also very interested in using the tools of math and computers to understand the natural world, so I veered off into the physics of the climate system.


I hope these questions aren't too off-topic. How did you get started in working in this field, what program would you recommend to get started in programming and which do you use?

Are there some specific models that are beneficial in your work (some kind of multiple time series regressions)?

alessandrux

Richardson here: Old school biologists use Perl. I insist my students learn Python.

If you're anywhere from high school to undergrad, my advice is to take computer science classes in college. At least data structures and algorithms. You don't have to major in CS. But the easiest time to learn to program is when you still have dedicated class time.

My program for teaching people who have no class time, or the recalcitrant scientist, to program is the first round of Codecademy Python for syntax, and then either Project Euler or Rosalind depending on what their domain science is. The idea is to learn programming FOR their domain, so they feel confident about something while taking a risk on something else. That means that I ask a biologist to learn to code by starting to count DNA bases. They don't have to be told what DNA is, so they feel like they know something, and that what they're learning has an application.
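To make the base-counting exercise concrete, here is a minimal Python sketch of what such a first program might look like. The function name count_bases and the example sequence are made up for illustration; they are not taken from any of the courses or problem sites named above.

```python
from collections import Counter


def count_bases(sequence):
    """Return how many times each of A, C, G, T occurs in a DNA string."""
    # Upper-case first so "acgt" and "ACGT" are counted the same way.
    counts = Counter(sequence.upper())
    # Report all four bases, including any that never appear.
    return {base: counts.get(base, 0) for base in "ACGT"}


if __name__ == "__main__":
    # A made-up sequence, purely for illustration.
    print(count_bases("AGCTTTTCATTCTGACTGCA"))
```

The biologist already knows what the answer should look like, so the only new thing to learn is the code.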

Unless you're the kind of person that it works for, don't just buy a book and follow along. Pick some problem in your domain (or in your life) and vow to use programming and only programming to solve it. It could be deleting duplicate photos from your laptop; it could be parsing Excel data. Just don't cheat and do it brute force. Give yourself permission to GOOGLE EVERY ERROR. That's what the professionals do.
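As one possible take on the duplicate-photo problem, the sketch below groups files by a hash of their contents and reports any group with more than one member. It is only a starting point under stated assumptions: the folder name "Pictures" is hypothetical, comparing files by SHA-256 is one choice among several, and the script deliberately lists duplicates without deleting anything.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Hash a file's bytes in chunks so large photos don't fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(folder):
    """Group files under `folder` by content; return groups of 2 or more."""
    groups = defaultdict(list)
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]


if __name__ == "__main__":
    # "Pictures" is a hypothetical folder name; point it at your own.
    for group in find_duplicates("Pictures"):
        print("Same content:", ", ".join(str(p) for p in group))
```

Deciding which copy to keep is left to the human, which is a reasonable place to stop for a first project.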


I hope these questions aren't too off-topic. How did you get started in working in this field, what program would you recommend to get started in programming and which do you use?

Are there some specific models that are beneficial in your work (some kind of multiple time series regressions)?

alessandrux

Richardson here: I got started in biology because I wanted to meet aliens. I got started in computational biology because my boss was doing DNA editing in Microsoft Word and I COULD NOT LET THAT STAND. That was when I was an undergrad; I wrote a Perl program that did it better and cleaner and it's been downhill ever since.


Sarah: What kind of tricks is it useful to teach microbes to do?

Hansi: Can you tell us more about how the Arctic and Antarctic are responding differently? What do we know about what is happening there and why?

Aurora: Please tell us more about DFT- as one of the most popular computational models in the world, what are some things it's used for that we might not realize?

p1percub

Aurora here: DFT gets used in around 30,000 scientific papers a year, so there are tons of different ways to use it. The majority of those use DFT to calculate properties of metals and molecules, which means there are lots of neat ways to find DFT answering scientific questions. It's been used to design new catalysts and battery materials.

A surprising example from my world is that DFT, even without temperature-dependence that we know it's missing, and even with a bunch of serious assumptions about energy transfer and how electrons and ions are tied together, even with all of that, DFT still does a really good job at calculating properties of certain very hot and very dense systems. These hot and dense systems are forms of warm dense matter, the kind of stuff you find in the center of Jupiter or on the way to ignition of fusion capsules. Is it perfect? No. But that it does so well with all those caveats is very striking and an example of what a good tool DFT can be for systems it was certainly not designed to tackle.


Sarah: What kind of tricks is it useful to teach microbes to do?

Hansi: Can you tell us more about how the Arctic and Antarctic are responding differently? What do we know about what is happening there and why?

Aurora: Please tell us more about DFT- as one of the most popular computational models in the world, what are some things it's used for that we might not realize?

p1percub

Sarah here:

The most important trick we can teach a microbe is how to live with us without killing us. Most bacteria on the planet have got that down pat -- the ones that don't need some training (Staphylococcus aureus I am looking at you). Most of our history with microbiology has been concerned with those bacteria that hurt humans, livestock, and edible plants (pathogenic bacteria).

Aside from that sort of biomedical stuff, there is domestication of bacteria. Most bacteria already have a pretty neat skill, like making a chemical that is super useful to humans (antibiotics are a big one), but they only do it for themselves. The trick I would teach them is to do it in the lab, for food (they don't take money I tried that first).

Then there's the bacteria that are potentially capable of making new chemicals but have never in their natural history had to. Biofuel falls into this category. That would be a great trick - bacteria to grow on discarded plant mass like compost, to make drop in fuel for your vehicle. Get your modern biofuel instead of all the ancient biofuel stuff they call fossil fuel.


Sarah: What kind of tricks is it useful to teach microbes to do?

Hansi: Can you tell us more about how the Arctic and Antarctic are responding differently? What do we know about what is happening there and why?

Aurora: Please tell us more about DFT- as one of the most popular computational models in the world, what are some things it's used for that we might not realize?

p1percub

Hansi here: That's the million dollar question, isn't it? Unfortunately, I don't think that the answer is a simple, elegant one. There are a lot of reasons why the Arctic and Antarctic may be responding differently to anthropogenic climate forcing: (1) ozone loss over the Antarctic (which has resulted in the poleward contraction and acceleration of the westerly winds over the Southern Ocean, thereby spinning up the ocean circulation in the region); (2) the presence of the Antarctic ice sheet, which is high and cold (I wrote a paper on the climatic impact of this ice sheet: Singh, Bitz, & Frierson 2016; you can find this on my website); (3) the different land and ocean configurations of each pole (ocean surrounded by land, or land surrounded by ocean); (4) very different ocean circulations and sensitivities (the Southern Ocean, for one, is the most important site of ocean heat uptake in the world); (5) differences in ice melt and the hydrologic cycle (more on this below); or (6) some complex combination of the above factors.

The other thing that I haven't said anything about is natural variability. The sea ice in the Arctic is retreating quite drastically, particularly in summer. Over the Antarctic, it's ever-so-slightly advancing in extent; this is despite the fact that ocean waters surrounding Antarctica are warming (see, for example, Purkey & Johnson 2014). Given the small size of this advance in Antarctic sea ice, it's possible that what we are seeing is simply a small-magnitude natural oscillation due to stochastic factors. Others have noted, however, that sea ice will advance if you freshen the surface waters, which you are doing by increasing precipitation in the region and melting the ice sheet (surface freshening increases the stability of the water column, which prevents deep warm water from getting to the surface).

So, there are a lot of hypotheses on why the Arctic and Antarctic are responding differently. It's an exciting question that I think we are getting closer and closer to answering.


Dr. Richardson, Dr. Singh, and Dr. Pribram-Jones - thank you for your time today. How do you hope that your success in your respective fields will inspire the next generation of women and minorities to achieve in the sciences?

Chinatown_Kid

Richardson here:

Not gonna lie, the current generation of women and minorities needs some help! All I want is for the next one to need less help and for everyone to see themselves represented in the hallowed annals.

We have always been here! We have always been succeeding! Look for the stories of all the ones who fought for ME to even be here.

My job is to be visible, so that the next little girl with unruly hair has heard of at least one woman scientist. My job is to be loud and honest so that the next black kid who is the only one in the computer science class knows of at least one computational scientist. My job is to give them an option that is not otherwise explicitly offered to them. Represent.


Dr. Richardson, Dr. Singh, and Dr. Pribram-Jones - thank you for your time today. How do you hope that your success in your respective fields will inspire the next generation of women and minorities to achieve in the sciences?

Chinatown_Kid

Singh here: My goal, following my time at the national labs, is to become a faculty member at an R1 university. One reason for this is that I want a more direct role in expanding access to careers in science, mathematics, and computing for those of us who are under-represented in the field. I think that being a visible and outspoken advocate is one way to bring about this better future. I also want to act as a mentor (i.e. advisor) who is committed to helping those under-represented in the field succeed, in academia or elsewhere.

Finally, I'm a big fan of outreach. More of us scientists should be doing it. It's complicated, because many of us that are underrepresented feel overwhelmed by the extent to which we feel responsible for bringing about a more equitable world by participating in as much outreach as we can. This means less time for doing research, writing our papers, etc. In the future, I would like to work to make diversity-promoting activities like this a mandatory part of a science education.


Dr. Richardson, Dr. Singh, and Dr. Pribram-Jones - thank you for your time today. How do you hope that your success in your respective fields will inspire the next generation of women and minorities to achieve in the sciences?

Chinatown_Kid

APJ here: I'm going to echo a lot of what my colleagues have said here.

Representation really does matter, and it's important to me to get out and about, stay loud and authentic to my self, and do good work in research and teaching. Turning the volume up about identity in science is crucial because people really don't think we have always existed in these worlds, despite evidence to the contrary. And yes, feeling the responsibility to do it, to not give up, to give extra, really is crushing sometimes, as Sarah put it. It's enough of a problem that you get warned about being oversubscribed in service areas as part of career mentoring.

I'd say that also, I'd love to see the focus move toward the idea of inclusion instead of diversity, as I worry that "diversity" still centers the status quo as normal, with "others" added for spice or whatever. That's a long way off, I realize, but I think it's important to think about how our problem solving and performance as scientific communities will be strengthened by inclusive excellence, and not just representation, which has unfortunately led to tokenism in some of my experiences. Some of the work I've done with narrative and writing for science students has shown me that our exclusion of identity as a living, breathing part of scientific creativity and the scientific process has left people with the idea that identities only exist for those of us existing in the margins. Your identity, no matter who you are or what your history is, is crucial to your experience of science. It is this observation across many different mentoring experiences, that who we are individually and together directly affects the science that we do, that has me focused so intently on inclusion across the sciences. ETA: Identity and understanding how it influences science is not only the responsibility of protected groups.


Hansi, word on the street is that you're into knitting and that you consider it coding in yarn. Does this craft/hobby/art impact your computational work, and vice versa?

obriennolan

Ha! I think of knitting as mathematics in fiber. There are lots of mathematicians that are into knitting and crocheting (for example, Daina Taimina, who crochets hyperbolic surfaces). Computing, for me, is what I do to solve tricky math problems. In order to do this, of course, you have to go from the continuous world to the discrete world, in which case you get back to 'stitches'. So yes, I think they're all connected in some odd way, at least in my brain.

These days, I mostly knit to relax. I've been finding that it's hard to subdivide my creativity into too many little sub-fields, so I've been sticking to the science for my creative outlet these days.

Which is an important point!!!! Science is creative!
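The step from the continuous world to the discrete one can be illustrated with a tiny Python sketch: a derivative, which is defined as a limit, is replaced by a difference between two sampled points one small "stitch" apart. The function name forward_difference and the step size are choices made for this illustration only; production numerical codes use more careful schemes.

```python
import math


def forward_difference(f, x, h=1e-5):
    """Approximate the derivative f'(x) from two samples a step h apart."""
    return (f(x + h) - f(x)) / h


if __name__ == "__main__":
    # The exact derivative of sin at 0 is cos(0) = 1.
    approx = forward_difference(math.sin, 0.0)
    print("approximate:", approx, "exact:", math.cos(0.0))
```

Shrinking h makes the stitches finer, up to the point where floating-point rounding starts to dominate.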


Aurora, what are the current limitations of TD-DFT other than system size? Thanks for doing this AMA!

RieszRepresent

Thanks for having us!

TDDFT's limitations come from a few different places. Some of the most serious ones come from how we approximate the exchange-correlation energy (XC). XC is a tiny piece of the energy that is very important, so important that it's sometimes referred to as "nature's glue" for its role in forming chemical bonds. We know that it's important, but unfortunately, we don't know exactly what it looks like mathematically, i.e., how its formula depends on the electronic probability density. This is true for all flavors of Kohn-Sham DFT, the kind where we map the real electronic system to a non-interacting system that's easier to solve. It doesn't matter if it's DFT or TDDFT in that respect. Because of this mysterious form, we are forced to approximate XC when we actually use DFT. When using TDDFT, there is almost always a further approximation, in that we assume that we can just use ground-state (time-independent) approximations for XC in TDDFT calculations. This means our XC in TDDFT doesn't have any frequency dependence. We know this is wrong, but making well-behaved frequency-dependent XC approximations has turned out to be really hard. The result of this assumption, called the adiabatic approximation, is that TDDFT can't reproduce certain kinds of excitations in electronic systems. For instance, we know that double excitations are not captured by TDDFT with a frequency-independent XC approximation.
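In symbols, the adiabatic approximation described above is often written as follows. The notation is an assumption chosen for illustration (n is the time-dependent density, n_0 a ground-state density, v_xc the XC potential, f_xc the XC kernel used in linear response); other references may use different conventions.

```latex
% Adiabatic approximation: evaluate a ground-state XC potential
% on the instantaneous density, ignoring the density's history
v_{xc}^{\mathrm{adia}}[n](\mathbf{r}, t)
  = v_{xc}^{\mathrm{gs}}[n_0](\mathbf{r})\Big|_{n_0(\mathbf{r}) = n(\mathbf{r}, t)}

% The corresponding XC kernel carries no frequency dependence
f_{xc}^{\mathrm{adia}}(\mathbf{r}, \mathbf{r}')
  = \frac{\delta v_{xc}^{\mathrm{gs}}[n_0](\mathbf{r})}{\delta n_0(\mathbf{r}')}
```

The missing dependence on frequency in the second line is the reason given above for why double excitations are not captured.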

Another limitation in the way most TDDFT calculations are done is that they generally assume that the Born-Oppenheimer approximation holds. One way to think about this assumption is that we can get away with assuming that the ionic centers in our system are so heavy and slow that they look stationary from an electron's perspective. This is true in some situations, but not others, where it prevents TDDFT from capturing nonlinear effects and prevents effective energy transfer between the electrons and the ions they are interacting with. There are methods for doing TDDFT beyond Born-Oppenheimer, such as Ehrenfest dynamics and real-time TDDFT, but use of these methods is relatively rare compared to the larger number of TDDFT calculations done assuming the atomic centers are pinned down with the electrons whizzing around them.

A limitation of TDDFT that I am particularly interested in is shared by DFT: lack of temperature-dependent XC approximations. This isn't a big deal for all systems, but systems at higher temperatures and densities or systems with low-lying excited states should be simulated with finite-temperature XC. It is still an unanswered question where this is a big problem and where it is less important because some of the experiments needed to evaluate that are hugely expensive and/or difficult.

My favorite reference on TDDFT is Carsten Ullrich's book, if you'd like more information.


License

This article and its reviews are distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and redistribution in any medium, provided that the original author and source are credited.