SDS011 datasets with- & without dryer

I am a new user, so first of all I want to say hi to everyone!

A few weeks ago I launched two smog sensors: one with a dryer for high humidity conditions, the second one without a dryer. I want to collect measurements to try to replace the heater with a software solution (mainly I want to test machine learning).

I am looking for datasets with the same solution (two sensors, with- and without dryer) to compare results and extend my own dataset, which won’t have enough data until next year. Maybe someone has such a set and can share it or knows where to find them? I have seen a few publications on Researchgate, but the data are nowhere to be found and contact with the authors is difficult.

Thanks for your help! :slightly_smiling_face:

Hi Sitar,
We have just set up a heated smog sensor and have two of the Nettigo unheated kits that are produced specifically for the Sensor Community. All three are co-located. We are noticing a variation between all three SDS011s. I have yet to analyse the data fully to determine if it is more than the 5% that is expected. Are you running the unheated smog sensor with the Nettigo board? Our data from the SDS011 on the Nettigo heated system reads lower than anticipated, even at low humidity. The data is interesting and I shall post more in this forum later. There are quite a few scientific papers out there; I particularly like Laquai et al.
They discuss the need for a calibration with a reference sensor and to check the humidity of the particle chamber. I don’t know if this fits with your machine learning but it seems that heating isn’t the full solution to the inaccuracies at high humidity, but it could be a start :slightly_smiling_face:.
I hope this is helpful.

Thanks for your reply. My smog sensors are Nettigo Air Monitors with slight changes and my own software. The SDS011 sensor readings are indeed different even at low humidity, but the difference is probably small enough.

I know the publication by Mr Laquai and Ms Kroseberg. I was also looking for a contact and the dataset for the twin-SDS011 box, but I have no information on this :confused:

From the paper; I am not sure if the twin system was run for a very long time. You could contact the lead for the group in Stuttgart (
I recently posted our preliminary results from our three sensors here and, unfortunately for us, the variation between the sensors is quite large (it looks like about 40%). This should be sorted out with a calibration factor. However, we do not have a reference system, so I am assuming that an average of the three (below 70% humidity) gives the most accurate value of actual PM2.5.
Clearly, like you, we are just starting out with the heated system, but we would be happy to share our data with you if it would be helpful. Perhaps check out my last post first, as our information might not be suitable for your purposes.

Link to the post Preliminary comparison of heated and unheated systems.

Thank you for the tip to write to Dr.-Ing. Vogt. He gave me the contact to the author of the publication, but to this day I have not received an answer. So I will wait until next year and just use my own dataset.

I also noticed differences in humidity measurement between the two SHT35 sensors. Above 80% relative humidity the difference is as much as about 5 percentage points. I am considering whether to add another module with the HYT221 and SHT35-F.

I’m really interested in your results. Comparing my unheated SDS011 & SPS30 with the heated Nettigo SDS011 and Tera NPM (internal heating) shows no significant improvement from the heated devices at high humidity.

It is a shame you have not heard back yet from the author of the paper. We have been running our sensors for 2 months now. Of course it has been very dry in the U.K. but there seems to be a difference. I will post more about this in a few days.

I am not sure of your location but in the U.K. there seems to be a difference…even in the recent dry conditions.

Is the heater always on in your experiment?

No, the heater only comes on when the RH is around 70%. It then cycles on and off to maintain the RH at 70%.

It is a bit complicated to explain, but we have 3 sensors: 73072, 53261 and 53245. 73072 is heated as I described above (it is the Nettigo heated system). They are all set up in the same location. Their readings all differ from each other, but the heated sensor, for some reason, reads much lower. For each sensor I took the average of the last 2 months of readings in three groups: at or below 70% RH, across all RH values, and above 70% RH. I then calculated a calibration factor for each sensor so that the averages at or below 70% RH are the same, and applied that factor to the sensor's other average values, including the standard error. This I have plotted on the bar chart. This might not be a good method, but from this chart it appears that only the data from the heated system (73072) agree across all 3 RH averages. I therefore conclude that the difference above 70% RH is significant for unheated systems.
I have the charts for the uncalibrated data if you are interested. I do intend to write these results up more fully but I hope this information helps with the conversation.
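
For anyone who wants to try the same comparison on their own co-located sensors, here is a minimal Python sketch of the calibration steps described above. It is only an illustration under assumptions: the function names, the data layout (a list of `(rh_percent, pm25)` pairs per sensor) and the sample numbers are made up and are not the data from sensors 73072, 53261 or 53245. It also assumes the common target is the mean of the per-sensor dry averages.

```python
from statistics import mean

def calibration_factors(readings, rh_limit=70.0):
    """Return one scale factor per sensor so that all dry averages match.

    readings: dict mapping a sensor name to a list of (rh_percent, pm25) pairs.
    """
    # Average PM2.5 per sensor, using only records at or below the RH limit.
    dry_means = {
        sensor: mean(pm for rh, pm in rows if rh <= rh_limit)
        for sensor, rows in readings.items()
    }
    # Assumption: with no reference instrument, the mean of the dry averages
    # is used as the common target value.
    target = mean(list(dry_means.values()))
    return {sensor: target / m for sensor, m in dry_means.items()}

def band_means(rows, factor, rh_limit=70.0):
    """Calibrated averages for the dry band, all records, and the humid band."""
    dry = [pm * factor for rh, pm in rows if rh <= rh_limit]
    humid = [pm * factor for rh, pm in rows if rh > rh_limit]
    everything = [pm * factor for _, pm in rows]
    return {"dry": mean(dry), "all": mean(everything), "humid": mean(humid)}

# Synthetic example values, for illustration only.
readings = {
    "heated": [(50, 4.0), (60, 6.0), (80, 5.0), (90, 5.0)],
    "unheated_a": [(50, 8.0), (60, 12.0), (80, 15.0), (90, 25.0)],
    "unheated_b": [(50, 9.0), (60, 11.0), (80, 16.0), (90, 24.0)],
}

factors = calibration_factors(readings)
summary = {s: band_means(rows, factors[s]) for s, rows in readings.items()}
for sensor, bands in summary.items():
    print(sensor, {k: round(v, 2) for k, v in bands.items()})
```

After calibration the "dry" averages are identical by construction, so any remaining gap between a sensor's "dry" and "humid" averages is what the bar chart comparison is looking at.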

Yes, it helps a lot, thank you. It would be interesting to see what happens when the pollution goes much higher than this.
I’ll be looking forward to your write-up. I also intend at some point, when I have more time, to present a case for why citizen science is important, and why the more official initiatives cannot be blindly trusted.

Check out this article, it deals with sensor corrections at high relative humidity:

Note that the SDS011 does not perform very well (high signal noise, low accuracy at low and high humidity); the SPS30 performs much better, especially for PM2.5. I think a correction for very high relative humidity values becomes nearly impossible. In the Dutch national PM monitoring network the air samples are always heated to 70°C to avoid moisture issues.
You would need to construct (3d print) a sensor housing for the SPS30 to separate sample in/out ports in case you want to add a heater.

Hi everyone, it will soon be a year since I started collecting measurements, so I wanted to refresh the topic and share my thoughts.

First of all, if I were building the station again, I would change the sensors to the SPS30. The SDS011s pose minor problems, and I also had to replace one of them with a new one. Other than that, the station operates in the same configuration as at the beginning. Due to an SDS011 failure, the November - January data will likely have to be discarded.

From my measurements, a difference between the readings starts to appear at 60-70% relative humidity. I extracted data for the period April - November 2022 and checked several approaches: linear and binomial regression, normal equations, a decision tree, support vector machines and regression implemented in Keras. The best result was, of course, from Keras: a simple model achieved an R2 score of about 91.5%. Right behind it was the decision tree at 90%.

I only use measurements where the humidity was at least 60%, which is about 40000 records. Perhaps better results can be obtained by increasing the number of records, but the large noise of the SDS011 measurement may make this a bit difficult. @sensorsalnorth12 can you share your data as well? I’m curious how they will affect the results when combined with mine.
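
To make the setup concrete for others who want to try it, here is a minimal sketch of the simplest of these approaches: a linear fit via the normal equations, with an R2 score, in plain Python. It is an assumption-laden illustration and not my actual pipeline: the feature choice (unheated PM reading plus relative humidity, predicting the heated reading), the function names and the synthetic records are all made up for the example.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_normal_equations(features, targets):
    """Least-squares coefficients [intercept, w1, w2, ...] from (X^T X) w = X^T y."""
    X = [[1.0] + list(f) for f in features]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(X, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict(coeffs, feature):
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], feature))

def r_squared(y_true, y_pred):
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Synthetic records (rh_percent, unheated_pm), for illustration only.
raw = [(55, 9.0), (62, 10.0), (70, 14.0), (78, 12.0),
       (85, 20.0), (92, 18.0), (96, 30.0)]
# Synthetic "heated" target generated from an arbitrary linear rule.
records = [(rh, u, 4.0 + 0.6 * u - 0.05 * rh) for rh, u in raw]

# Keep only records with at least 60% relative humidity, as in the post.
humid = [r for r in records if r[0] >= 60]
features = [(u, rh) for rh, u, _ in humid]
targets = [h for _, _, h in humid]

coeffs = fit_normal_equations(features, targets)
r2 = r_squared(targets, [predict(coeffs, f) for f in features])
print("coefficients:", [round(c, 3) for c in coeffs], "R2:", round(r2, 3))
```

Because the synthetic target is exactly linear, the fit here is perfect; real SDS011 data is noisy, and the score should be measured on held-out records rather than on the training set.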

I’m also wondering how to improve the data in post-processing. The SDS011 sometimes has value spikes, and windy weather in general is problematic, so a Stevenson screen would be useful. I am currently averaging several measurements and removing some measurements based on the standard deviation of the difference between adjacent records, but if anyone has experience with improving the quality of such data, I’d love to hear about it.
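
Roughly, the current filtering looks like the sketch below. It is simplified and the details are assumptions for illustration: the threshold multiplier `k`, the block size and the sample values are made up, and each record is compared with the last kept record rather than the last raw one so that the value right after a spike is not dropped as well.

```python
from statistics import mean, pstdev

def remove_spikes(values, k=1.5):
    """Drop records that jump more than k standard deviations of the
    adjacent-record differences away from the last kept record."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    limit = k * pstdev(diffs)
    kept = [values[0]]  # assumption: the first record is not a spike
    for v in values[1:]:
        if abs(v - kept[-1]) <= limit:
            kept.append(v)
    return kept

def block_average(values, size=3):
    """Average consecutive, non-overlapping blocks of `size` records."""
    return [mean(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

# Synthetic PM2.5 series with one obvious spike, for illustration only.
series = [10, 11, 10, 12, 60, 11, 10, 12, 11, 10]

cleaned = remove_spikes(series)
averaged = block_average(cleaned)
print(cleaned)
print(averaged)
```

One known weakness: a genuine step change larger than the limit would be rejected too, and every record after it would keep being rejected, so this only suits series where real changes are gradual compared with the spikes.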

Hi Sitar,
Sorry for the delay in responding. We can certainly look into sending you some data. Unfortunately, as with you, we have had issues with the SDS011s, and our heated sensor has been offline for a little while now because it flooded when we had heavy rain. So, I don’t know how useful you will find it.
I have also found that any calibration is unreliable, and I have posted about the difference in readings across a number of different platforms, including so-called reference systems, so currently I am not certain what counts as a valid reading.