Science AMA Series: We're the OpenAQ Team, building the world's first open data, open-source real-time and historical air pollution platform. We are building it because open data helps people fight air inequality and no one else was building it. Ask Us Anything!

Abstract

Hi Reddit!

Air inequality - the unequal access to clean air to breathe - is responsible for one out of every eight deaths in the world (WHO, 2014). According to the World Bank, this equates to a loss of an estimated 5 trillion USD to the global economy each year. The impact of air pollution on human health and the economy is a massive injustice to our civilization. Meanwhile, we’ve seen from Bangkok to Los Angeles how meaningful access to air quality data can effectively arm communities to combat poor air quality. Yet, the injustice of air pollution is often compounded by the fact that such access to basic air quality data can be most difficult in the most polluted places. At the same time, many governments around the world, including in severely polluted places, publicly share air quality data - to the tune of 5-8 million data points per day - but in disparate and sometimes temporary forms.

The OpenAQ community (openaq.org) noticed this a little over a year ago, and we decided to capture these data before they disappear and put them in a universal format for anyone to access in a highly available manner. We developed an open-source project (github.com/openaq), and so far have aggregated more than 30 million air quality data points from 42 countries. To date, journalists, public health researchers, policy analysts, low-cost sensor developers, users of satellite data, students, teachers, and others from 1461 cities in 119 countries have accessed our platform, and we receive roughly 500,000 requests each month to our API (docs.openaq.org). An open-source community has developed around the dataset, which has led to the creation of apps, data-driven media articles, research, and open-source packages in R and Python.

We are always seeking software developers, scientists, journalists and lovers of open data to jump in and join us in opening up the world's air quality data for everyone.

Sites:

You can vote for us and other awesome open science projects in Phase II of the Open Science Prize Competition (Vote ends Friday!): http://event.capconcorp.com/wp/osp/vote-now/

About us: Christa Hasenkopf, CEO/Co-Founder of OpenAQ: I'm a PhD atmospheric scientist who got distracted by the worlds of science policy & international development for a few years at USAID and the US Department of State. Before that and along with Mongolian colleagues and an American software developer (who is also my husband, Joe Flasher :)), I launched the first air quality instrument to automatically share data via social media in Mongolia. This is where I first realized the power that even a little open air quality data can have in fighting air inequality.

Joe Flasher, Co-Founder of OpenAQ: I’m Joe Flasher, co-founder and lead architect of the OpenAQ platform. I was trained as an astrophysicist but have been working in software development and open data in some capacity for around a decade. I have also shaken hands with someone who shook Carl Sagan’s hand.

Olaf Veerman, Development Seed: I’m the project lead of the OpenAQ project for Phase I of the Open Science Prize at Development Seed. Besides doing open data work, I’ve lived throughout Latin America and worked with civil society organizations to create social impact through the use of technology.

EDIT 1: Thanks to everyone for joining us and for the thoughtful questions and conversation! ALSO: a BIG thanks to the moderators for their awesome work. A few quick notes:

EDIT 2: Even after this closes, we'd love to hear what other air quality related AMAs you'd be interested in having in the future (e.g. low cost sensors, personal monitors, global public health impact of pollution). We'd love to help convene other experts to answer your questions.

EDIT 3: Just linked a few more references above.

EDIT 4: Here is our wrap up post on this AMA. Thanks again!

Is the correlation between class and access to clean air fairly straightforward? Or are there wealthy people also living in poor air zones (downtown major cities, for instance)?

sutree1

It really depends where you’re talking about in the world and at what scale. The motivation of our platform was to address a data gap in highly polluted places, so that tends to be where our focus is. In places like Beijing, Delhi, Ulaanbaatar, or Dhaka, for instance, pollution is so severe and widespread that the rich and poor are both severely affected. To get a sense of that, check out this comparison of recent PM2.5 levels among Beijing, Ulaanbaatar, and Washington DC (where we are):

https://openaq.org/#/compare/Beijing%20US%20Embassy/MNB/FRANCONIA?parameter=pm25&_k=2vfkyy

Given that the dose-response curve of PM2.5 and mortality is expected to flatten out at higher concentrations, once you’re at ‘bad’ levels or ‘really bad’ levels, it’s all just ‘bad’ for lack of a better word, in terms of health impacts. (See figure 2 for that dose-response curve I’m talking about here: http://www.cleanairjournal.org.za/download/caj_vol26_no2_2016_p08.pdf)

For places where the air quality is less severe (e.g. the US) and you’re on the steeper part of the dose-response, the health impacts will depend much more strongly on your location and subsequent exposure. A good person to chat with about this on twitter is: @MarshallJulian


How can the general public help in this mission? What kind of data can we contribute and how is it quality controlled?

heywhatsthatcalled

Thanks for this question! There are so many ways to contribute - data can be one of them, but frankly, it’s often the case that existing data are under-utilized for their maximum impact. For instance, we estimate there are 5-8 million air quality data points measured each day and publicly shared by government sources, but these data aren’t always easily available to the public.

Our premise has been that if these data are aggregated and made available in more friendly ways (See #14 in our FAQ for data access mechanisms: https://github.com/openaq/openaq-info/blob/master/FAQ.md), people will write data-driven articles about them for the general media, use them in the classroom, make open-source packages for different languages to access the data, etc - and these are all examples we’ve seen so far.

If you’re interested, here’s a community wish list that people around the world are hoping to see others build on top of the dataset: https://medium.com/@openaq/whats-on-the-openaq-community-wish-list-846ef2a78dc0#.j5k4xup0u

One of our biggest road blocks at the moment is getting adapters written for countries to get their data aggregated into our system and finding volunteers to work on that. (examples of data waiting to be ingested: https://github.com/openaq/openaq-fetch/issues?q=is%3Aissue+is%3Aopen+label%3A%22new+data%22 )

We do want to eventually add in low-cost sensor data from the public, though this is tricky for a host of reasons (see: http://www.nature.com/news/validate-personal-air-pollution-sensors-1.20195 ). That’s one of our goals if we win the Open Science Prize.

To your question about quality control: The OpenAQ platform aggregates data from sources and does a technical validation only (e.g. it makes sure that the data ingested are in the proper form: a string is a string, a number is a number, etc, so that nothing breaks anyone's code built on top of the platform). However, it is important to note that it does not do any cleaning or QA/QC'ing of data. It mirrors data as-is from a variety of disparate sources. The reason is three-fold: (1) Adjusting these values, even for some form of QA/QC, makes a judgement on a given source's (e.g. government's) data, which we do not want to do. We are neutral. (2) There are many ways one can do QA/QC'ing of the data, depending on one's need and field, and (3) there are uses for the 'uncleaned' data.
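To make the "technical validation only" idea concrete, here is a minimal Python sketch. The field names (`location`, `parameter`, `value`, `unit`) and the `validate_measurement` helper are assumptions made up for illustration; they are not the actual OpenAQ schema or code. The check looks only at the form of a record and never alters or judges a reported value.

```python
# Hypothetical field names and types; the real OpenAQ format may differ.
EXPECTED_TYPES = {
    "location": str,
    "parameter": str,
    "value": (int, float),
    "unit": str,
}


def validate_measurement(record):
    """Return a list of type problems; an empty list means the record passes.

    Only the form of the data is checked. Values are never cleaned,
    clipped, or otherwise judged.
    """
    problems = []
    for field, expected in EXPECTED_TYPES.items():
        if field not in record:
            problems.append(f"missing field: {field}")
            continue
        value = record[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"wrong type for {field}: {type(value).__name__}")
    return problems


# Made-up example records.
good = {"location": "Ulaanbaatar", "parameter": "pm25", "value": 250.0, "unit": "ug/m3"}
bad = {"location": "Ulaanbaatar", "parameter": "pm25", "value": "250", "unit": "ug/m3"}
```

Note that the very high reading in `good` still passes: a check like this says nothing about whether a value is plausible, only whether it has the expected type.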

When we get into aggregating low-cost sensor data, we’re going to have to think carefully how we approach that, when it comes to the QA/QC’ing issue, as they’re kind of a different beast than the government-level or research-grade data.


Do you think these data could be used to enhance other EHR (electronic health records) for epidemiologic research or even better, more personalized healthcare approaches? If so, how do you see these data being leveraged in Obama's Precision Medicine Initiative?

p1percub

Absolutely. Air quality information available at higher and higher spatial and temporal resolutions can help shape decisions people make throughout the course of their day. To be honest, the data we primarily aggregate right now - government-level sources across the world - are not at a high enough spatial or temporal resolution to cause this sort of ‘personal’ level revolution. That’s more in the domain of personal sensors that are starting to emerge. But for the dataset we aggregate, there’s potential for impactful large-cohort epidemiological (or exposure assessment) work in places that are highly polluted yet where few such studies have been conducted. In fact, surprisingly, we have less of an understanding about, say, the relationship between PM2.5 and mortality in more polluted places than less polluted ones. See figure 2 in this commentary our community wrote in a journal from December:

http://www.cleanairjournal.org.za/download/caj_vol26_no2_2016_p08.pdf

The majority of large-scale epidemiological air pollution studies have taken place in the US and EU, and many of these studies relied on open data from government sources, themselves. Here’s a post we’ve done on that: https://medium.com/@openaq/open-air-quality-data-packs-a-powerful-punch-38ca9462910b#.b5tbmb75z

As a sidenote, we’re looking to add in low-cost sensor sources to our platform in the future (which will increase the spatial and temporal resolution of the dataset). That’s one of the goals we’re hoping to achieve if we win the Open Science Prize.


Air pollution has been a political issue for a while now, with China being the obvious example. But there is also the Tory lobby against EU air pollution restriction laws. And in the US there is the "Smoggy Skies Act" (H.R.4775), which aims to reduce the efficacy of the Clean Air Act; it passed the House and is currently in the Senate.

How can you (or we) encourage this data to not just be collected but also analyzed, interpreted, and published in a way that empowers the public with knowledge about their communities? And how can that, in turn, encourage people to mobilize so they can effectively call upon representatives to address it?

firedrops

This is spot on about not just collecting data but also doing something with the data. One of the ways we do that is through workshops in communities facing what we term air inequality. We convene scientists, developers, policy folks, journalists, students, etc, and brainstorm ways to use existing open air quality data to further the fight to improve local air quality. We also find that a lot of times, it’s not just about the data, but also about getting different sectors a space to be talking with each other in the first place. We then take general lessons learned back to our global community (and get everyone connected, at least virtually through our slack channel).

Here’s some stuff on our recent workshop in Delhi: https://medium.com/@openaq/delhi-openaq-workshop-info-materials-and-results-2bd74b88bee6#.yv3ywgd9a

And about our next one in Sarajevo: https://medium.com/@openaq/the-next-openaq-workshop-is-coming-to-sarajevo-bosnia-in-february-apply-to-come-4b4e8e4b9265#.oe5eank25


Hi everyone, thanks for the AMA!

I wonder if you can elaborate on why data for air quality is difficult to get access to? What factors limit sharing of this data through traditional channels?

superhelical

There are a couple of things. One major thing is that every government that measures and publicly shares air quality data has its own method of doing it. Even sub-nationally, in some cases, data are collected in different ways. So there’s just no natural or easy way at the moment for governments or some other body to have an international database in a universal format (as far as government-collected data go).

When data are shared at national or sub-national level, it’s often the case it’s a few clicks in on a country’s government agency website, and often not available programmatically or for historical download. This seems to be true especially in places experiencing higher pollution levels. I think, historically in many places, the original intent of this data was to give the public a snapshot of current air quality. The interest in and ability of the public to use data over time has changed, and governments’ systems have yet to fully catch up to that interest and demand. This is probably true in many fields besides air quality.


This AMA is being permanently archived by The Winnower, a publishing platform that offers traditional scholarly publishing tools to traditional and non-traditional scholarly outputs—because scholarly communication doesn’t just happen in journals.

To cite this AMA please use: https://doi.org/10.15200/winn.148362.20666

You can learn more and start contributing at authorea.com

redditWinnower

Thank you so much to the team at The Winnower for all the hard work on opening up scholarly outputs!


How are your data sources chosen? I'm very interested in the other side of the OpenAQ platform. Specifically, what do you think about massively-scaleable AQ data collection sourced by the community itself?

I'm thinking about cheap, wifi-connected home devices. One time setup air scanners. Do you think it's feasible?

Keep up the good work!

DasKaz

We're currently aggregating data from publicly available government sources and research groups. These sources are reported by people in the community and tracked in one of our Github repos: https://github.com/openaq/openaq-fetch/issues. If we don't know a source exists, there is no way it'll get added to the platform, so having people help us identify sources is a huge help.

Low cost sensors have great potential to solve the current data gap and provide insight into the air quality of some of the most polluted places on Earth. We have thought about a number of ways to expand the existing platform to be able to include data from these sources. An open question will be how to present that data back out. Do we show all data points from an individual sensor that a person might be wearing, or do we only show aggregated, gridded results or similar (do we even want to store all those measurements?)? Fun things to think about!
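As an illustration of the "aggregated, gridded results" option, here is a minimal Python sketch. It is not something the platform does today; the `grid_average` helper, the `(lat, lon, value)` reading shape, and the 0.1 degree cell size are all assumptions chosen for the example.

```python
import math
from collections import defaultdict


def grid_average(readings, cell_deg=0.1):
    """Average (lat, lon, value) readings per grid cell.

    cell_deg is a hypothetical cell size in degrees. Returns a dict that
    maps the (row, col) index of each cell to the mean value inside it.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for lat, lon, value in readings:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        sums[cell][0] += value
        sums[cell][1] += 1
    return {cell: total / count for cell, (total, count) in sums.items()}


# Made-up readings from imaginary personal sensors.
readings = [
    (47.92, 106.91, 80.0),   # two readings that fall in the same cell
    (47.93, 106.94, 120.0),
    (47.55, 106.25, 30.0),   # one reading in a different cell
]
cells = grid_average(readings)
```

Publishing only per-cell averages like this would hide an individual wearer's exact track, at the cost of discarding the raw points, which is the trade-off raised above.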

This is work that we've proposed to do as part of the Open Science Prize, so if you think it's valuable, please head over and vote!

And thanks for the link issue, fixed above now.


Technical question here rather than focused on the content: what does the underlying architecture and design for the platform look like? How is data ingested into the platform?

I'm guessing, based on the number of samples (~34m), that you're probably using a relational database for storage and query, and not something like HBase, which would be overkill? Something else? Is data streamed into the platform or uploaded manually?

kstrike155

Latest platform architecture can be seen here. The two main components are openaq-fetch and openaq-api, which meet at the database, which is PostgreSQL. We call out to all the sources (websites, FTP servers, APIs, etc.) every 10 minutes and save all new data. We don't currently support streaming of the data as no one has offered to stream to us yet, but there is always hope. We also have a manual import mechanism, but that'd be more for one-off research data use cases.

Went with PostgreSQL due to some familiarity, but the dataset is getting large (by our standards) and there is something like 5 million potential measurements a day floating around out there (we're currently storing ~130,000 per day). We're definitely looking for help optimizing (or altering) the storage mechanisms, so if anyone is interested in helping there, let us know!
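The "poll every 10 minutes and save all new data" step can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the openaq-fetch code: a plain dict stands in for the PostgreSQL table, and the uniqueness key (location, parameter, timestamp) plus the `save_new` helper are hypothetical.

```python
def save_new(store, fetched):
    """Add unseen measurements to `store` and return how many were new.

    `store` is a dict standing in for the database table. The key
    (location, parameter, timestamp) is an assumed notion of uniqueness.
    """
    added = 0
    for m in fetched:
        key = (m["location"], m["parameter"], m["utc"])
        if key not in store:
            store[key] = m["value"]
            added += 1
    return added


store = {}
# Made-up values for illustration only.
first_poll = [
    {"location": "MNB", "parameter": "pm25", "utc": "2017-01-04T01:00:00Z", "value": 310.0},
]
# Ten minutes later the source still lists the old hour plus a new one.
second_poll = first_poll + [
    {"location": "MNB", "parameter": "pm25", "utc": "2017-01-04T02:00:00Z", "value": 295.0},
]
added_first = save_new(store, first_poll)
added_second = save_new(store, second_poll)
```

Because most sources republish the same recent values between polls, some form of de-duplication like this is needed so that repeated calls do not inflate the stored count.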


What are some of the most interesting (and perhaps even unexpected) uses of the OpenAQ datasets that you have seen?

edwinksl

Here are a few that come to mind:


To do the same thing with water is decidedly more difficult but also on the horizon. What do you know about projects similar to yours (like water)?

speisenkarte

Water quality has been one of the most raised topics whenever we talk about OpenAQ, so there is definitely a lot of interest in it and it's obviously a very important issue.

We don't have any specific projects to point you towards, but if you email us at info@openaq.org, we can try and connect you with some of the other individuals who have reached out to us.


Where are the areas most affected by air pollution and lack of access to data? Are there any particular locations that are a particular focus or concern?

OnlyQueries

The entire continent of Africa has a dearth of ground monitoring data available, but we know from existing short-term measurements of both outdoor and indoor levels, as well as from long-term satellite measurements, that air pollution there is an issue that goes under the radar.

There's also the issue that even if data are available at some level, they aren't being fully utilized for research or policy purposes. A graph that illustrates this point (see figure 1, article is open access): http://www.sciencedirect.com/science/article/pii/S1352231016303843


As a future CS student (starting next year), what can I contribute to this?

StealthDrone

That'd be great and thanks for being interested and offering!

To be honest, we are a really tiny entity, so we haven't had the bandwidth to make a lot of great on-boarding resources to the community...yet.

That said, check out these things:

And most of all, I'd suggest chatting with us on Slack. What we lack in size, we make up with friendliness! :)

Thanks again for being interested!


Thank you for this important work. As an environmental engineer living in a non-attainment area, I never cease to be amazed by how unaware the average person is of what's going on in their own back yard. Hopefully this will educate people and motivate them to hold their industrial neighbors accountable. Questions:

  • what criteria pollutants are you archiving, or is it a complete data dump of whatever is provided by the primary source?
  • are third party independent sources used for countries with no air quality standards, or are these areas excluded from your archive?
  • follow-up to Q2: if third party sources are used, is there some method of verifying the reliability of their data?

fyukhyu

We currently aggregate PM2.5, PM10, CO, SO2, NO2, O3, and Black Carbon. We made a decision at the outset to only save data that is most commonly found in a majority of sources. The most common thing we're asked for alongside these is meteorological data.

The majority of our data currently is from official government sources (or parties acting on behalf of the government). We also have research-grade data and this is denoted as such in the metadata. We're not specifically excluding any third-party independent sources, they just haven't come up yet.

As for reliability, we don't make any guarantees for data quality; that is not the goal of this platform. The goal of this platform is to create a base layer aggregating existing reported government-level, research-grade and other air quality data sources in the same universal format.

We do have a Community Wishlist with ideas of what people would like to see built around the platform and a QA/QC'd layer is on that. If you're interested in seeing this exist, we'd love to help you build it!


Data + machine learning = predictions

Have you considered aggregating additional data, e.g. Wind speeds, weather etc and implementing forecasts on air quality?

In Beijing the government seems to be fairly accurate in its predictions (with the limited alerts they issue), and we all know that when the wind starts to blow the air quality will improve. But the alerts are limited, and a few days' heads up would allow better planning for people.

What are your thoughts on machine learning for air quality in general and forecasts in particular?

ScandInBei

I've been waiting for someone to ask this! I'm biased, but I think this dataset would be an awesome target for ML. There are hourly, daily, seasonal patterns layered with natural and man-made causes; that sounds so cool.

But it's not something we're looking into as part of the core platform. Our goal is to collect the data, make it easily available and help others understand how to use it. We want to provide the base layer of data for others to do forecasting or other ML projects on top of, but it's not something we'd likely try and fold into the core mission.


I may be wrong about this, but from what I have previously read, most pollution comes from two sources: transportation of goods (such as cargo ships and semi-trucks) and the meat industry (specifically raising animals for steaks and milk). My question is: what route do you believe would be best to take to fight air pollution? Do you believe a solution would come from reduction and innovation in the transportation of goods and lab-grown meats, or do you see the better solution in huge oceanic plankton farms, giant city air filters, or an "Amazon-sized forest" in Africa that would clean up more pollution than is being produced? My reason for this question is that I really do not know the numbers, but I have read that even if we stop 100% of transportation pollution, it will still be too little, too late to combat the already huge amounts of pollution in our atmosphere.

pryzless1

Honestly, air pollution can come from a variety of sources - depending on what pollution specifically and where you are. In most places, it's a mix of different sources, like industrial, agricultural, home heating/cooking, transportation, etc, no one single issue. While air pollution is a global issue, solving the sources of pollution from one location to the next is often pretty localized.


Have you seen any interesting patterns or correlations in the data that you did not expect?

At_least_im_Bacon

Here are some interesting ones that either we've seen or community members have picked out:

https://medium.com/@openaq/personalizing-the-data-points-following-the-open-data-trail-to-coyhaique-40604278bf71#.bctpiryrg


Have we gotten better in terms of air pollution in any place in the world due to environmental restrictions or anything else?

tiffas1121

Yes, for sure. I'm going to give a few US-centric examples, but examples do abound.

Here's an example of how PM2.5 has decreased over the past 15 years in the US: https://www.epa.gov/air-trends/particulate-matter-pm25-trends

And you can see how other ambient pollutants have changed over time: https://www.epa.gov/air-trends

I personally have a difficult time believing that these systematic changes would have happened on their own, with no enforced regulations.

(May come back and shoot some articles your way, depending on time remaining, but feel free to email us at info@openaq.org, and can follow up there, as well).


Are you hoping that this database will start questions at the policy level, or help guide policymaking? Also, are you seeking volunteer money or work contributions to this effort?

mothslice

Yes, definitely! We think that OpenAQ can be a powerful tool to inform policy making and inform civil society. A lot of the places that we get our data from provide snapshots of the current air quality but don't provide access to historic measurements.

There are a lot of different ways you can help us achieve this, from suggesting new data sources to building applications on top of OpenAQ or using the data in analysis and articles. Some more pointers on how you can contribute are up on our website. To know more about the funding, see this comment.


What would someone need to do to try to add their country to your platform?

Petaye

The first step is to let the community know it exists. You can make a GitHub issue here, submit something via this Google Form, or email us at info@openaq.org. The data format and requirements are listed here.

Then we'd need to create an adapter to turn the data from the source into something storable in our system. That code is written in JavaScript and if you're able to help there, that'd be awesome! But if not, no worries, just helping us to know it exists is extremely helpful.
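To give a feel for what an adapter does, here is a small sketch. The real adapters are written in JavaScript; this one is in Python purely for illustration, and everything in it is hypothetical: the `adapt` helper, the imaginary source's keys (`station`, `pm2_5`, `time`), and the output keys, which are not the actual OpenAQ data format.

```python
from datetime import datetime, timezone


def adapt(raw):
    """Convert one record from an imaginary source into a common shape.

    Both the input keys ("station", "pm2_5", "time") and the output keys
    are invented for this example.
    """
    # The imaginary source reports time as a Unix timestamp string.
    when = datetime.fromtimestamp(int(raw["time"]), tz=timezone.utc)
    return {
        "location": raw["station"].strip(),
        "parameter": "pm25",
        "value": float(raw["pm2_5"]),
        "unit": "ug/m3",
        "utc": when.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


# Made-up source record.
raw = {"station": " Central Square ", "pm2_5": "41.5", "time": "1483488000"}
measurement = adapt(raw)
```

The point is only that each source needs its own small translation step (renaming fields, parsing numbers, normalizing timestamps) before its data can sit alongside everyone else's.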


What are the most effective masks and other protective methods for those who live in polluted areas? Are scarves effective?

KillerButterfly

If you're trying to remove PM2.5 (e.g. smoke and fine dust pollution), scarves will not work at all, unfortunately.

For the rest of my answer, I do feel the need to preface all this by saying that we are not mask experts. It depends on what pollutant you’re trying to filter out, but if you’re talking PM2.5 (e.g. smoke and dust), you’ll want a mask that can filter out those particles. In the US, NIOSH has a classification for masks that are able to do this to varying degrees (e.g. N95 masks).

Link: https://www.cdc.gov/niosh/npptl/topics/respirators/disp_part/

One difficulty in many highly polluted places is that it is hard to obtain such masks or there are lots of masks on the market that have not been tested to meet a given country’s standards (if that country has them for masks). An added difficulty is that, to our knowledge, there is no certification for the efficacy of child-sized PM masks by a given government (and definitely not the US). For the US, this is because these masks were designed to meet conditions found in the work place. It can also be tricky getting a proper fit on a child due to their face shape etc.


I live in Beijing, one of the most heavily air-polluted cities in the world. In your opinion, how bad is this whole smog situation? Also, do you work with any governments to fight the air inequality issue?

Tsi_Ruin

We think air pollution is one of the most pressing issues of our time - it’s responsible for an estimated 1 out of 8 deaths each year. Here’s a really great graphic from the Global Burden of Disease: http://www.healthdata.org/infographic/global-burden-air-pollution

And the thing about it is that it is solvable; it’s not rocket science!

In terms of government engagement, we’ve had great conversations with a few governments to help give the data they are already collecting a larger voice (which is entirely our goal). We’ve had a few folks at various agencies across the world help us out in very concrete ways (e.g. they notice an incorrect geo coordinate or ask that their data actually be added - or they adjust their API when we notice issues). We also invite government agency staff to our workshops that convene different sectors locally to figure out how to use open data to advance the collective fight against air inequality.


For those who are interested in building their own air quality measurement devices, Plotly has a great tutorial using an Arduino.

At_least_im_Bacon

Love Arduino and one of our community members was working on using Plotly to take data directly from the API and plot it. His results are here


How are you guys funded?

ancapnerd

You can see our current list of partners and sponsors here (bottom of the page). We generally rely on a combination of government, private-sector, and non-profit/foundation grants in the open data, air quality and general environmental and public health spaces. At this time, we do not accept personal donations of funds.

We are always on the lookout for partnerships to help us sustain and expand what our community is doing.


Do I have a significant impact on the quality of air in my own home? If I use perfume diffusers, candles, air fresheners, chemical cleaning sprays (e.g. Lysol), do these significantly influence the quality/toxicity of the air I breathe? Do plants, UV light, etc. actually keep the air clean or is that marketing hype?

8solutions

Those items you mention (perfume diffusers, candles, etc) can affect your indoor air quality, but it's a little outside of our expertise. An excellent indoor air quality expert is Brandon Boor at Purdue. He's on twitter: @BrandonBoor (and here's his website).

(edited for grammar)


Is there a place we can download the data in a single csv file? I don't want to specify anything, I'd just like to download the air quality data in a single file.

MDA1912

Right now, we have these options:

To get the entire dataset in one go, we should prob talk more to figure out what will work - shoot us an email at info@openaq.org.


Is there any way I can help out with the coding part, besides contributing to the Github repo?

chochomp

Do you mean what might you build off of the system? If so, here's a Community Wish List, and most of these items are ones that would be built on top of the system, not something directly within the purview of our org, if that makes sense.

And thanks for being interested in contributing in some way!


Hoping to see more data from China. I've just started with data science, these data can be so helpful to understand the problems.

3xlax

Yeah, we hear you. We'd love more data from China. There's a ton of awesome data measuring AQ across China, but we haven't been able to add it into our system yet.


1/8th of deaths on this planet are from polluted air? So of the average of 55 million dead each year, 6.88 million died from polluted air... I just want to be sure these are figures we're standing by. And killed how? Respiratory failure? Asphyxiation? Disease?
And 5 trillion annually, with this death toll, factors to around $700,000 per dead body. I'm just having a difficult time with the numbers.

TheAlmightyGawd

No problem, and thanks for asking!

  • The 1 out of 8 number comes from the WHO, though the number estimated typically varies from 5.5 million to 7 million or so deaths, depending on the source. These are outdoor and indoor air pollution deaths combined. A really nice graphic on this comes from the Global Burden of Disease.

  • Surprisingly to a lot of folks, most deaths are actually due to subsequent cardiovascular issues, but also a significant portion of respiratory issues (e.g. lung cancer). This info you'll find at the WHO link above.

  • The 5 trillion USD estimate is from the World Bank. I know the headline says 225 billion USD, but that's only lost labor costs.

"When looking at fatalities across all age groups through the lens of “welfare losses”, an approach commonly used to evaluate the costs and benefits of environmental regulations in a given country context, the aggregate cost of premature deaths was more than US$5 trillion worldwide in 2013."

EDIT: And going to link these to our intro above. Thanks again for asking! (also edited for grammar)
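The figures in this exchange can be checked with a few lines of arithmetic. The Python sketch below only divides the numbers quoted in the thread (55 million deaths per year from the question, the WHO's 1-in-8 share, and the World Bank's 5 trillion USD welfare-loss estimate); it is a back-of-the-envelope check, not how either organization derives its estimates.

```python
deaths_per_year = 55_000_000          # figure used in the question
share_from_air_pollution = 1 / 8      # WHO estimate quoted above
welfare_loss_usd = 5_000_000_000_000  # World Bank estimate for 2013

air_pollution_deaths = deaths_per_year * share_from_air_pollution
loss_per_death_usd = welfare_loss_usd / air_pollution_deaths

print(f"{air_pollution_deaths:,.0f} deaths")
print(f"{loss_per_death_usd:,.0f} USD per death")
```

With these inputs the division works out to about 6.9 million deaths and roughly 727,000 USD of welfare loss per death.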


Are the Chinese authority cooperating with you in any way? I was wondering that since we've been hearing a lot lately about the constant smog in Beijing and the Northern Provinces.

Doumtabarnack

Currently, all data we aggregate from China is from the US embassy and consulates that measure PM2.5 and share the data on stateair.net and airnow.gov. We'd greatly welcome collaborating with anyone in accessing additional public data from China.

(edited for clarification)


How does one use this data to determine which cities are asthma friendly? Which measurements are best to minimize?

ocawa

(Just a little disclaimer that we are not medical professionals)

Two pollutants that are associated with exacerbating asthma are particulate pollution and ozone. But of course, what triggers issues, and at what levels, can vary from person to person. The US EPA's Air Quality Index (AQI) system categorizes air quality with a simple color-coded scale according to how healthy it is for the general public, as well as for sensitive groups. Many countries (and actually some independent air quality apps and orgs) have similar systems in place that transform physical data to an AQI.
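
To make the "physical data to an AQI" step concrete, here is a minimal sketch of the piecewise-linear interpolation that AQI systems of this kind use. The breakpoint table is an assumption: it follows the 24-hour PM2.5 breakpoints the US EPA published in 2012, which have since been revised, so check the current EPA table before relying on it. The function name `pm25_to_aqi` is hypothetical, and the concentration is rounded to one decimal here for simplicity.

```python
# Assumed 2012-era US EPA breakpoints for 24-hour PM2.5 (ug/m3).
# Each row: (conc_low, conc_high, aqi_low, aqi_high, category, color).
PM25_BREAKPOINTS = [
    (0.0, 12.0, 0, 50, "Good", "green"),
    (12.1, 35.4, 51, 100, "Moderate", "yellow"),
    (35.5, 55.4, 101, 150, "Unhealthy for Sensitive Groups", "orange"),
    (55.5, 150.4, 151, 200, "Unhealthy", "red"),
    (150.5, 250.4, 201, 300, "Very Unhealthy", "purple"),
    (250.5, 350.4, 301, 400, "Hazardous", "maroon"),
    (350.5, 500.4, 401, 500, "Hazardous", "maroon"),
]


def pm25_to_aqi(concentration):
    """Return (aqi, category, color) for a 24-hour PM2.5 mean in ug/m3."""
    c = round(concentration, 1)
    for c_low, c_high, i_low, i_high, category, color in PM25_BREAKPOINTS:
        if c_low <= c <= c_high:
            # Linear interpolation within the matching breakpoint band.
            aqi = (i_high - i_low) / (c_high - c_low) * (c - c_low) + i_low
            return round(aqi), category, color
    raise ValueError(f"concentration {concentration} is outside the table")


print(pm25_to_aqi(35.5))
```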

To determine whether a city is 'asthma-friendly', one would have to construct a framework for defining that (e.g. choose a time span to look at, then observe how often levels exceed some level x, or do so for longer than some duration y). To my knowledge, there isn't a generally accepted framework where this has been done.
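
One piece of such a framework could be sketched as follows: pick a time span and a threshold, then compute the share of days above it. This is only an illustration, not an accepted method. The function name `exceedance_fraction`, the threshold of 35 µg/m³, and the sample values are all made up for the example.

```python
from datetime import date


def exceedance_fraction(daily_means, threshold, start, end):
    """Fraction of days in [start, end] whose daily mean exceeds threshold."""
    in_span = [value for day, value in daily_means.items() if start <= day <= end]
    if not in_span:
        raise ValueError("no measurements in the requested time span")
    return sum(value > threshold for value in in_span) / len(in_span)


# Made-up daily PM2.5 means (ug/m3), for illustration only.
sample = {
    date(2016, 1, 1): 10.0,
    date(2016, 1, 2): 40.0,
    date(2016, 1, 3): 20.0,
    date(2016, 1, 4): 60.0,
}

fraction = exceedance_fraction(sample, 35.0, date(2016, 1, 1), date(2016, 1, 4))
print(fraction)
```

Comparing cities this way would still require agreeing on the pollutant, the threshold, and the time span, which is exactly the part that has no generally accepted answer.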


In an age when the future president of the United States has stated he will defund NASA weather satellites due to the global warming they are reporting, how do you foresee dealing with unfavorable data at levels where real decisions can be made?

ruat_caelum

I wish I had the perfect answer to this, but I do think data are most at risk of disappearing when the public isn't aware they are there in the first place or how they already positively impact their everyday lives. The more any of us can do to either highlight just how integral such data are to making our lives better or to find a way to give the data a 'louder voice' through embedding them in various uses, the better.

What I don't know is how to make sure those data aren't then ignored at the level where real decisions are made. Public pressure and highly-competent, expert staff around decision-makers seem to be key.


Do the positive effects of plant fertilisers outweigh the negative effects in pollution?

bhaanginkush

Great Q, but honestly, I don't have the knowledge base in that area to say.


What software do you guys use?

pandamaster2

Thanks for the Q! The front end is written in CSS, HTML and JavaScript using React. The API and updater are written in JavaScript using Node.js. There are a number of other small pieces (CSV exporter, health checker), mostly written in JS/Node.js. If you're interested in system architecture, everything runs on AWS.


I'm originally from Phoenix, AZ, and I have asthma. It's a commonly held belief that the drier climates in much of the Southwestern United States are better for people with breathing problems.

Is there really any scientific basis to this claim, or is it a belief that simply began when people moved to an area with less pollution and noticed breathing was easier?

ElliotFriend

Not sure how the combination of a drier climate and lower air pollution plays out with regard to asthma. Asthma seems idiosyncratic in what triggers it for one person versus another. For example, someone with asthma who moves to Beijing may not notice a marked increase in attacks, while another person does. Please do keep in mind we are not medical professionals, however.

Also, because you are in Phoenix, had to share this with you: https://medium.com/@openaq/the-phoenix-haboob-a-picture-is-worth-a-1000-words-or-1000ug-m%C2%B3-of-pm10-1ba0cce8050c#.yxys1jpjn


Hi OpenAQ team, collection of data is the first step. I am sure your team must have come across multiple DIY projects with a crowdsourced data model from across the world. Can you list some noteworthy projects here?

satyaakam

Thanks for the question! So far we've focused much more on government and research-grade sources, so we don't have a great or definitive list of crowdsourced, DIY projects. We're finding out more as we go through this AMA too!


Do you have works published and reviewed in scientific journals?

AnSeTe

We haven't published an OpenAQ-specific paper yet (though our system is always up for informal but completely public and large-scale review on github).

We've recently published a commentary from the OpenAQ Community (12 scientists, 10 countries) here: http://www.cleanairjournal.org.za/download/caj_vol26_no2_2016_p08.pdf

There are a few orgs/individuals that are working on research utilizing OpenAQ-aggregated data, but to our knowledge they haven't published yet.


What kind of concrete conclusions from your data do you hope to publicize the most, in order to spark more immediate action to improve environmental policy? Is it educating the public on the direct link between certain illnesses and certain thresholds of smog? Is it showing governments the earnings that a nation will lose every day due to air pollution-related morbidity?

On a more speculative note, what kind of economic overhaul or restrictions on sources of heavy emissions do you predict would be necessary for these policies to best succeed?

cloudmallo

So which concrete conclusions do we hope for the most? All of them! :) The precise mix of policy versus science versus activism that will truly improve air inequality in a given community will vary from one place to another. While the answer to air pollution is not rocket science, what moves the needle in one place does seem to be highly specific to local circumstances.

We see our role as fostering an ecosystem of uses rather than focusing on any one in particular, and making sure that we connect solutions that worked in one place with our broader community, so no one has to reinvent the wheel.


Are you aware of this project? http://luftdaten.info It started as a citizen science project to map air quality in Stuttgart, but has expanded since then. I just ordered all the parts for another sensor for Hamburg. Maybe you can get together and figure something out...

Cyp12die4

That sounds awesome! Thanks for sharing, and we'll check it out. If you'd like to talk offline about it at all, shoot us an email at: info@openaq.org


I see that there is no data available for Switzerland. However, it seems that there are a few measurement stations:

  • http://www.ostluft.ch/

  • http://www.bafu.admin.ch/luft/luftbelastung/aktuell/tabelle/index.html?lang=de

What would be required to incorporate these data in OpenAQ? Are there any resources available, such as how-tos or guides regarding legal issues?

brumgabrasch

This is great, thanks for pointing us to those sources! We are generally made aware of sources by people from the community. Each source is tracked in this Github repo: openaq-fetch

After doing some quick checks of the source (for example if it fits the OpenAQ data format), somebody from the community will write an adapter to pull in the data. All that work is tracked in the openaq-fetch repo.
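
To give a feel for what an adapter does, here is a rough sketch of the core step: mapping one source-specific record into a common shape. It is written in Python purely for illustration (the real adapters live in the openaq-fetch repo and are JavaScript/Node.js), and every field name, the `PARAMETER_NAMES` mapping, and the `adapt_record` function are hypothetical rather than the actual OpenAQ data format.

```python
from datetime import datetime, timezone

# Hypothetical mapping from one source's pollutant labels to common names.
PARAMETER_NAMES = {"PM2.5": "pm25", "PM10": "pm10", "O3": "o3"}


def adapt_record(raw, source_name):
    """Convert one raw source record into a common, source-attributed dict."""
    parameter = PARAMETER_NAMES.get(raw["pollutant"])
    if parameter is None:
        return None  # pollutant is not one of the aggregated ones; skip it

    # Normalize the timestamp to UTC so records from all sources line up.
    measured_at = datetime.fromisoformat(raw["time"]).astimezone(timezone.utc)
    return {
        "location": raw["station"],
        "parameter": parameter,
        "value": float(raw["reading"]),
        "unit": raw["unit"],
        "date_utc": measured_at.isoformat(),
        "source_name": source_name,  # keeps the link back to the original source
    }


# Made-up input record, for illustration only.
raw_example = {
    "station": "Example Station",
    "pollutant": "PM2.5",
    "reading": "17.5",
    "unit": "ug/m3",
    "time": "2016-12-01T09:00:00+01:00",
}
record = adapt_record(raw_example, "Example Agency")
print(record)
```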


It's great that you are doing this! Where do you get your funds from?

kerloom

You can see our current list of partners and sponsors here (bottom of the page). We generally rely on a combination of government, private-sector, and non-profit/foundation grants in the open data, air quality and general environmental and public health spaces. At this time, we do not accept personal donations of funds.

We are always on the lookout for partnerships to help us sustain and expand what our community is doing.


Thank you for doing this AMA! I hope your mission receives heightened awareness and the information you gather is used to tremendously benefit our planet!

Opinion based question: Based on the information in this article, what do you feel the long term, positive/negative impacts of converting CO2 to ethanol in this manner could be?

http://www.sciencealert.com/scientists-just-accidentally-discovered-a-process-that-turns-co2-directly-into-ethanol

Personal question, sort of: Based on the information you've gathered, what are some small things that we can do as individuals to improve air quality indoors/outdoors?

Thanks again for doing what you're doing. It's pretty amazing.

P3rs3v3ranc3

Honestly, I don't have enough knowledge to give a thoughtful response to your CO2 to ethanol question.

To your personal question, one of the best resources (in my opinion) for figuring out what you can personally do during a high air pollution (specifically PM2.5) episode comes from the EPA:

https://www.airnow.gov/index.cfm?action=aqibasics.pmhilevels


Is there a way for me to look up the air quality in my local area?

T4blespoon

Yes! If you go to openaq.org and allow the browser to know your location, the homepage should show you the 3 locations nearest to you. If for some reason that doesn't work, you can have a look at the map to see if there are measurements for your local area.
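
For the curious, "the 3 locations nearest to you" can be sketched as a great-circle distance sort. This is a minimal illustration, not how the openaq.org site is actually implemented. The station names and coordinates are made up, and `haversine_km` and `nearest_locations` are hypothetical helper names.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_locations(user_lat, user_lon, locations, n=3):
    """Return the n locations closest to the user's position."""
    return sorted(
        locations,
        key=lambda loc: haversine_km(user_lat, user_lon, loc["lat"], loc["lon"]),
    )[:n]


# Made-up monitoring locations, for illustration only.
stations = [
    {"name": "Station A", "lat": 0.0, "lon": 0.1},
    {"name": "Station B", "lat": 0.0, "lon": 1.0},
    {"name": "Station C", "lat": 0.5, "lon": 0.0},
    {"name": "Station D", "lat": 10.0, "lon": 10.0},
]

closest = nearest_locations(0.0, 0.0, stations)
print([loc["name"] for loc in closest])
```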


There is another important effort being made right now to acquire air quality information in real time that involves purpose-built sensor packages deployed world-wide.

Go to uRadMonitor network

The designer/founder/deep thinker behind it is currently in a competition with four other projects to win significant funding for the project. To see the video and cast a vote for the project (or for a different one as you please) visit here:

uRadMonitor - Chivas the Venture funding competition

uRadMonitor has been actively gathering some data for several years using an array of detectors deployed world-wide by those who purchased either during a Kickstarter effort or after commercialization of the product.

A recent article about their efforts -

Clean-Tech Article

Anyway, just thought I'd mention another effort to accomplish a similar goal. Good luck to you.

doodlebugger

Thanks for sharing their work! There could definitely be a few potential lines of collaboration!


Hey! Thanks for taking the time to do an AMA. I have wondered, do you factor in benzene as an air pollutant? And if so, how do you measure it? Thanks again!

MrLewayne

Our pleasure! We currently don't aggregate benzene. The main reason we aggregate the pollutants that we do is that they are the most commonly measured across the globe by government bodies. And just to clarify - we actually don't measure anything ourselves; we aggregate public data from other entities, put it in a universal format, attribute it back to the originating source, and then serve it out.


I was thinking the other day whilst going through a tunnel how many concentrated pollutants those tunnels must see every day. Is there any useful data you could divine from sensors in tunnels across the world?

tehrob

You know, we don't have any sensors from tunnels in our system. Data are aggregated primarily from gov't sensors measuring ambient air. But interesting question!


Air pollution is one of those things where, by the time you see it, it is already too late. I live in South Africa and we use too much coal. How can we use this data to make people aware that their air is not clean just because they can see the sky?

Ansteph09

You hit the nail on the head - how can you use data to make the invisible visible? One way is by trying to understand and then communicate the health, economic or other social impacts of air quality at the current levels. That can help make air quality as an issue more concrete.

For the OpenAQ platform, we don't aggregate any real-time air quality data from South Africa, but we know there are a few sources (and there's a GH issue on them).


Did you guys make up the term "air inequality"? Do some people walk around with gas masks and fresh air looking down on plebs breathing normal air?

Hubris_by_Nature

Yep, as far as we know, we made up the term 'air inequality'. We needed a term to describe the distinct and disparate levels of air quality across nations. See these two graphs: https://medium.com/@openaq/global-air-inequality-summed-up-in-2-graphs-ad3d5a845033#.1i6qtpfpa


Would an Android app help you in your cause? If yes, PM me. I would be glad to chip in.

_init1

Thanks! Connecting with you on Slack (and thanks for joining us on there!)


Can someone explain how providing data about air quality actually helps improve the situation? What would regular people even do with such data? We already know the places where air pollution is a very big issue.

rasen58

This is a great question. We're big believers that if you want to implement some means of reducing pollution (or really applying any policy to a given problem), you're going to need data to help assess your progress towards that goal. For instance, in a highly polluted place that experiences swings of PM2.5 in the milligram per cubic meter range, a policy that produces a 25% reduction is truly significant, but may be hard to see. By not capturing that 'win', the public could lose faith in an effort that is actually working. Conversely, not having data can also convince people through their anecdotal experiences that something is working when it truly isn't. We also have noticed that when highly polluted places do have more open data accessible to the public, that tends to spur a lot more interest and awareness on the issue that pushes for change.

So we think data alone are not sufficient for change. But we do firmly believe that they are a necessary condition for change to happen.

The more open data are to the public, the more eyes you have evaluating policies and creating ways to share the data that are friendly to the general public (e.g. apps, visualizations, journal articles), and the more air pollution is in the general consciousness of the public.


Just want to say this is a very righteous cause! I lived in an area that was near an airport, garbage incinerator facility, and a major port. I knew the air quality was off some days but had no way to prove it. Glad you are working on this.

Cantholditdown

Thanks a bunch for your support. :)


What is the base for judging how dirty the air is? Where is the cleanest air?

flunderbuster

It really depends on how you define the question - By pollutant? Or impact of a pollutant on mortality? And over what time period? And by whose standard? Standards vary significantly across the world. The other thing to note is that there is no known level of PM2.5 pollution, for instance, that doesn't have some impact on health. So the 'safe' limit is tricky to define.

But, not to completely dodge your question, here are guidelines (not standards) set by the WHO: http://www.who.int/mediacentre/factsheets/fs313/en/

Also, here’s a graphic from the Global Burden of Disease on the global burden of air pollution on mortality across the world:

http://www.healthdata.org/sites/default/files/files/infographics/Infographic_AAAS_Air-pollution_2016.pdf


Thank you, I need this. However, how do I feed in my local air quality data? The state/federal air monitor is not in the neighborhoods being pumped with wood smoke, so their data is completely crap.

breathemore

I think you're asking how to input your own sensor data into the system? For non-research-grade low-cost sensors, we currently do not have a method to insert their data, nor to visualize them responsibly. We are proceeding carefully and slowly with adding in low-cost sensor data. It is tricky business.

This is work that we've proposed to do as part of the Open Science Prize, so if it's of interest to anyone, please go ahead and vote!


The title of this AMA answered my only question :(

On a serious note, this is really awesome and very motivating as an aspiring Data Science student.

Infamous_Noone

Glad to hear that, and thanks. :)


What about pollution caused by dihydrogen monoxide? It's really hard to breathe when levels are high... How can we eliminate this pollutant as well?

CanaryInTheMine

<sigh> Perhaps only when we remedy the scourge of dioxygen gas ravaging the atmosphere.


License

This article and its reviews are distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and redistribution in any medium, provided that the original author and source are credited.