News & Views

Happy New Year to all. Last year had its ups and downs, so here's hoping that 2012 sees a much more settled year :)

2011 ended on a rather sour note, with our previous hosting company, Hetzner, causing us no small amount of stress and infuriation. I could recount all the problems, but suffice it to say I would never recommend them to anyone ever again.

Thankfully 2012 has started on a much brighter note, in part thanks to our new hosting company, Bytemark, who have been absolute stars with their fantastic support and just generally being absolutely cool guys. They also gave us a very healthy discount for our new server, which was very much appreciated. They are our first corporate sponsor, and the CPAN Testers site will be updated soon to acknowledge our sponsors more officially. We have had other sponsors approach us, as well as individuals, who wish to donate to CPAN Testers. As a consequence a new fund is being set up so everyone can contribute to help CPAN Testers cover costs. We'll have more for you as soon as we've finalised the details.

The current server has recently suffered a slight problem due to the backups we are doing. We no longer back up the complete database, but the sheer volume of data we do back up has increased so much that we've had to re-adjust the way we back up the data. As a consequence of this, as well as the problems we've had with the SQLite database, I'm going to look at a better way of backing up and distributing the data. This will hopefully form part of the work I intend to cover during the 2012 QA Hackathon in Paris.

We've passed the 18 million reports mark, and we are well on our way to 19 million. For the past two months we have been submitting nearly 1 million reports each month, and I fully expect us to see this increasing through the year. We're currently testing across 14 different OSes (95 platforms) with 38 different varieties of Perl and with around 110 testers submitting reports every month. All of these statistics are slightly down from their peaks but, undaunted, many of the regular and high-profile testers have managed to increase their testing coverage. If we do encourage some new testers this year, I expect we'll be seeing over 1 million reports submitted on a regular monthly basis.

Over the last few months I've noticed a few questions popping up on the mailing list, or being asked of me personally, and I suspect other testers have occasionally got the same. Some of the questions are already covered on the wiki, particularly on the CPAN Authors FAQ, but not all, at least not currently. As such, I thought it might be useful to have an occasional feature to help authors and testers point to a more elaborate answer to their questions. If you have any questions, or better yet any complete posts you'd like to contribute, please let me know.

With some already planning their conference season now, if you are planning to promote CPAN Testers, or testing in general, at a forthcoming conference, technical event or even a local user group, please let us know and we'll be only too happy to help promote your presentation and the event.

More news is waiting in the wings for next month, so stay tuned :)

PS: Thanks to Joel Berger for his I love CPAN Testers! post.

November turned into a rather more traumatic month than I would have liked. It didn't start well, with the old hosting company issuing us an invoice for 1 year's hosting, which we had previously cancelled due to their appalling service regarding our loss of data. They now claim we still owe them a month, regardless of the month's outage we suffered due to their incompetence and failure to replace our mirrored HDD before the second one failed. I'm now seeking compensation, but I'm not hopeful it'll get very far.

The SQLite issues are still surfacing. Following a complete rebuild of the SQLite database, Andreas is still seeing errors for some searches, though not all, so those of you who use the SQLite download may see intermittent faults. I have no idea why this happens, and if there are any SQLite experts out there, feel free to get in touch.

However, these issues highlight a need to move away from SQLite as our means to port the data around the net. It compresses to just over 1GB, but the bzip and gzip tasks, as well as the requests via Apache, are all taking their toll on the server I/O. A more manageable solution is needed. As a result I hope to have more time to investigate this next year, and will possibly be concentrating on it for the 2012 QA Hackathon. If anyone has some good suggestions for replicating nearly 20 million records, I'd be interested to hear from you.

The I/O issues are still causing us a bit of a headache, as although the CPU processing is fast, the disk is still taking quite a hit. As a consequence the report feed, parser and builder are all vying with each other to keep on top of things. We're roughly 36 hours behind at most, although I've been trying to manually ease the pressure occasionally to get the best usage out of the disk. Again I shall be looking closer at this in the new year.

I've been approached by a few people with regards to sponsorship. At the moment we're not in a position to take monetary donations, but all being well in the new year a special fund will be available to anyone wishing to contribute. We've also had offers of server space, mirrors and backups, which I hope to take advantage of too. If you've emailed me regarding anything like this in the last few months and I haven't got back to you yet, please be assured I will do :)

We're back to nearly 1 million reports a month again, and this is one of the reasons why we're a little behind after the server rebuild. We are catching up, but slower than I would have liked. However, it has to be said that I'd rather have too many reports to process than not enough ;)

Posted by Barbie on 11th November 2011

Recently Andreas alerted me to a problem with the SQLite database used to store the basic metadata for the CPAN Testers statistical database, aka cpanstats. On reading the database, some queries now return an error message: "database disk image is malformed". It's unclear where the error has occurred, but it seems to have been something that has only surfaced recently.
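For anyone wanting to check whether their own downloaded copy shows the same kind of damage, SQLite's built-in `PRAGMA integrity_check` is one way to look for it. The following is only a minimal sketch in Python, not the tooling used on the CPAN Testers server. The function name is made up for illustration, and any file name you pass in (for example `cpanstats.db`) is a placeholder.

```python
import sqlite3


def check_sqlite_integrity(db_path):
    """Run SQLite's built-in integrity check and return the reported problems.

    An empty list means SQLite reported "ok". A badly damaged file can also
    raise sqlite3.DatabaseError, which is returned here as a single problem.
    """
    try:
        conn = sqlite3.connect(db_path)
        try:
            rows = conn.execute("PRAGMA integrity_check").fetchall()
        finally:
            conn.close()
    except sqlite3.DatabaseError as exc:
        return [str(exc)]
    messages = [row[0] for row in rows]
    return [] if messages == ["ok"] else messages
```

Calling `check_sqlite_integrity("cpanstats.db")` on a healthy file should give an empty list. Anything else is worth reporting along with the query that triggered the error.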

As a consequence I am now rebuilding the complete SQLite database. This means that the downloads available from the Development website will remain static until this is complete. Once complete and all is fine, then the backup mechanisms will be re-enabled.

However, there is the possibility that the database has now grown so large (with over 17 million records) that the data storage, and particularly the indexing, is not being written to disk correctly. With the database currently being around 5GB uncompressed, and just under 1GB compressed, it would be beneficial to reduce its size for efficiency, disk IO and bandwidth. So I have started to think of the alternative options.

Firstly the easiest option would be to create a database that only includes reports for distribution releases on CPAN. This would reduce the size slightly, but as we are now submitting reports at a far greater rate than for those releases that reside only on BACKPAN, the short-term gain would be quickly lost within a month or two.

Secondly we could only store data for a set period, such as per year. This would allow those that only need updates to just retrieve the latest instance (current + previous year), while those that need to start from scratch would need to download all the annual snapshots first. This would mean that all the data is easily accessible, but it does require anyone who wishes to maintain their own database to have a mechanism to update their local copy, rather than just download and have the database instantly ready.

A third option is to provide an API that can supply a list of the updates in CSV format, with the ability to specify a from date and a to date, with the full SQLite database being created on a monthly basis. This would reduce disk IO and bandwidth, but would still require a local copy to be maintained with regular updates.
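To make the third option a little more concrete, here is a rough sketch of how a local copy might apply a CSV journal of updates. Everything specific in it is an assumption for illustration only: the simplified column set (`id`, `state`, `dist`, `version`, `fulldate`), the presence of a header row, and the function name. No such API exists yet, and the real cpanstats schema has more fields than this.

```python
import csv
import io
import sqlite3


def apply_csv_updates(conn, csv_text):
    """Insert or replace rows from a CSV journal into a local table.

    The column layout is hypothetical. Returns the number of rows applied.
    Replaying the same journal twice is harmless, because rows are keyed
    on the primary key `id`.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cpanstats ("
        "id INTEGER PRIMARY KEY, state TEXT, dist TEXT, "
        "version TEXT, fulldate TEXT)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [
        (
            int(record["id"]),
            record["state"],
            record["dist"],
            record["version"],
            record["fulldate"],
        )
        for record in reader
    ]
    conn.executemany(
        "INSERT OR REPLACE INTO cpanstats "
        "(id, state, dist, version, fulldate) VALUES (?, ?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)
```

Keying on the report id is what would make overlapping from/to date ranges safe: fetching a slightly wider window than strictly needed would simply rewrite rows that are already present.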

A fourth option would be to switch to a MongoDB (or similar) database and enable replication. This would reduce bandwidth usage and reduce disk IO. However, it would mean that anyone wanting to maintain a local copy would have to put in a lot more work to set up and configure it, and would have to change any tools they currently have to use the new style of database storage. In the longer term, this is possibly an idea for a Google Summer of Code project.

Of all of these the journalling style of the third option would probably be the most practical in the short- to medium-term, while I think moving to a database that supports efficient replication would be better long-term. All would require work and/or tools for anyone maintaining their own copy, but if we can prepare all the necessary code before switching over, the change shouldn't be too painful.

These are all only ideas at the moment, and no plans have been made to change anything. If you have other suggestions that might be viable, I'd be pleased to hear from you. If you currently download the SQLite DB, I'd also like to know whether any of these ideas would work for you.

As mentioned, no changes are planned and, beyond the current rebuild, the current DB is not going away any time soon. However, it does make sense for us to look at more efficient ways of exposing the data. Scaling for the future should not mean continually copying the same 5-10GB of data around our eco-system every day.

October was a rather quiet month publicly, but behind the scenes there have been a number of fixes, improvements and discussions.

The most notable fix was getting the summary emails running again. It is still not perfect just yet, but I'm hoping that the ongoing fine-tuning will reduce the bounce-backs and faults that I'm currently seeing. The configurations are all from what was in the database at the end of August, so apologies if you've wanted to change these. The configurations are normally exposed via the Preferences website, but as the SSL certificate for the new server hadn't been approved it has taken a little longer than anticipated to get it up and running again. GoDaddy have now approved this and it's all ready to go, so I'm hoping this can all be sorted in the next few days.

There are still some problems building the web pages, with some reports getting missed. However, I now have tools in place that double-check what is being processed, and re-inject into the build queue any reports that may have got missed. The fact that the builder and the feed parser are now capable of processing so much so quickly means that, to a large degree, processing is now near real-time. Typically the majority of reports are now available on the Reports website within a few hours of being submitted.
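The double-checking idea boils down to a set difference: anything the feed parser has seen that is neither built nor already queued needs to go back on the queue. This is only a sketch of the principle, not the actual tool. The function name and the three lists of report ids are hypothetical stand-ins for whatever the real databases hold.

```python
def find_missed_reports(parsed_ids, built_ids, queued_ids):
    """Return ids that were parsed but are neither built nor waiting in the queue.

    All three arguments are iterables of report ids (hypothetical inputs).
    The result is sorted so that the oldest missed reports come first.
    """
    return sorted(set(parsed_ids) - set(built_ids) - set(queued_ids))
```

For example, `find_missed_reports([1, 2, 3, 4], [1, 2], [4])` picks out report 3 as the one to re-inject.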

Some of the discussions behind the scenes have been about sponsorship. We currently have two large international corporations who are interested in supporting CPAN Testers. Hopefully we can progress this over the next few months and early next year we can give you more details. The discussions of sponsorship have also led to discussions of potentially setting up a donation fund for CPAN Testers. Over the years we have had several people approach us asking if they can contribute to CPAN Testers. As we aren't a legal entity, it hasn't been something we have actively pursued previously. However, we are now in talks to set something up for the future, and again hopefully in the new year we'll be able to tell you more details.

I would like to thank everyone who has been very supportive of CPAN Testers over the last few months. It has been a frustrating few months, but so many have contacted me personally or posted on lists and IRC to add their thanks and support for what we do. It really is very much appreciated.

To finish with, I have a few stats for you. As mentioned previously, in August we managed to submit over 1 million reports in a single month. In the two months following we haven't quite matched that submission rate, but it's not been too far behind. So much so that we now have nearly 17 million reports in the database. We currently have reports covering 27 different Operating Systems, although only around 18 of them are being tested on a regular basis. The page builder is now capable of processing several hundred thousand reports a day, and so can easily cope with the growing number of testers and smokers we'd like to see contributing to CPAN Testers.

There is still a lot happening behind the scenes, but looking forward to 2012 we are in a much better position to grow CPAN Testers for the long-term future. We have an enviable infrastructure within the programming language and QA worlds, and we like to keep it that way :)

September has turned out to have been a very difficult and traumatic month for CPAN Testers. Our server problems have now been resolved, but it has meant more time than was intended has been devoted to rebuilding the website eco-system. It's still not complete, but we are getting there.

The first sites to be reinstated were the CPAN and BACKPAN mirrors. Soon after, the Statistics and Development sites were back online. The database rebuilds took considerably longer than expected, partly because the archive of the original NNTP reports had become corrupted at some point. The checks to ensure all interconnected parts of the databases were correctly referenced also took time. A full backup was then taken so we could start from a known point once all the moving parts were restarted. Just in case anything went wrong again!

The uploads and metabase feed parsers were turned back on, and although neither was the same code that was running on the old server when it died, they both highlighted problems with the server. At first I thought one of the disks was not performing as well as it should have done, but on investigation by the guys at Bytemark, we discovered the kernel that had been installed came with an IO blocking fault. Once upgraded we started to see a much better response time.

Following the reintroduction of the performance tuning code, the metabase feed parser is now processing 4 days' worth of reports in a single day. We're currently less than 15 days behind, so hopefully by next week we should be almost real-time.
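The "next week" estimate follows from simple arithmetic: while the parser clears 4 days' worth of reports per day, another day's worth arrives, so the backlog only shrinks by 3 days per day. A small sketch of that calculation, assuming the incoming rate stays steady at one day's worth per day (the function name is purely illustrative):

```python
def days_to_catch_up(backlog_days, processed_per_day, arriving_per_day=1.0):
    """Estimate how many days it takes to clear a backlog measured in days of reports.

    The backlog shrinks by (processed_per_day - arriving_per_day) each day,
    because new reports keep arriving while the old ones are processed.
    """
    net_gain = processed_per_day - arriving_per_day
    if net_gain <= 0:
        raise ValueError("processing must outpace arrivals to ever catch up")
    return backlog_days / net_gain
```

With a 15 day backlog and 4 days' worth processed per day, `days_to_catch_up(15, 4)` gives 5 days.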

The builder for the Reports site was switched on to get through the several million requests we had stacked up from the rebuild processes, but with only just over 30,000 unique requests to process it only took a short while to catch up. As of right now I'm turning the builder on only for short periods, to give the metabase feed parser as much processing resource and disk access as possible. There are currently fewer than 5,000 unique requests for the builder to process, which the builder should be able to clear in less than a day once we're fully operational.

As we are getting much closer to being back to full speed, you might have noticed that the Reports site is back online as of this morning. Both the dynamic and static sites are now available; however, the dynamic site no longer accepts requests from robots and spiders. As some don't follow the robots.txt file, this is being done within the Apache config files. For the time being the static site will accept requests from bots, but if there is any abuse, they will also be denied access.

Once upon a time I had hoped the requests from bots would help to build the site, but they have ended up being more of a hindrance than a help. Also, the idea that the search engines would help the ranking for Perl within some dubious lists hasn't really materialised. As such the loss of bots is likely to have more of a benefit than keeping them on side. Interestingly, I have noted that Microsoft have stayed true to their word and removed us from the bot lists. It was just sad that it took such high-profile posts to get them to take stock.

The reports emails and the release database are still to do, as are some other small corners of the eco-system. I hope to get to all of these as soon as possible over the next month, and will keep you posted as these come back online.

Many thanks to everyone for their support and understanding. It's been a traumatic exercise, but the encouragement and help we have received has been very much appreciated. If you spot any problems in the next month, it is probably something we are already aware of. If it isn't resolved by the end of the month, please let me know or post to the CPAN Testers Discussion list.

A final bit of news, which has vindicated my prophecy during YAPC::Europe, is that we did indeed break the 1 million reports barrier for reports posted in a single month. Looking at the figures so far, I'm not expecting this to be repeated for September, as I suspect some have not been so active due to the downtime for CPAN Testers last month. However, it now highlights that we can indeed cope with such a high volume of reports. It will be interesting to see whether October picks up the pace again :)