Kernl: Important BitBucket Changes

It recently came to my attention that the way BitBucket handles deployment keys has changed. Until recently, the same deployment key could be shared across multiple repositories. That is no longer allowed: each repository now requires a unique deployment key. So what does this mean for you? You’ll need to take a few steps to make sure that your “push to build” functionality continues to work as you expect it to.

  1. I’ve deployed changes that allow you to add unique deployment keys to all of your repositories. For those of you with a lot of repositories this is going to be pretty tedious, but in the end it will give you greater access control over your repositories. Documentation for adding deployment keys can be found at https://kernl.us/documentation#deploy-key, but you likely won’t need it. Just go to “Continuous Deployment” and then click “Manage Deployment Keys” (if you don’t see that button, do a hard refresh).
  2. Starting tomorrow (February 21, 2017) at 7pm EST, access with the old Kernl deployment key will be cut off. From this point forward only the new deployment keys will be able to access your repository.
  3. After February 21, 2017 @ 7pm EST you can delete the old Kernl deployment key from your repositories. If you do it before then, your builds will fail.

Sorry for the short notice and inconvenience of this change, but it’s necessary to make sure that all customers are able to deploy continuously with Kernl. If you have any questions or concerns about this change, please reach out. And once again, sorry for the inconvenience!

What’s New With Kernl – February 2017

It’s been a little while since my last update, so let’s dive right into what’s new.

  • Kernl now has an enhanced billing area. The mechanism for paying an expired invoice was pretty confusing, and it wasn’t possible to see your past invoice amounts. With these changes you can now see your last 10 invoices from Kernl, and if you ever need to pay an expired invoice, that process is much simpler.
  • Purchase codes can now be limited to a domain. This is handy if you don’t want your customers buying a single license and using it on many sites.
  • Work has started on feature flags! Feature flags are a software development best practice of gating functionality. Functionality can be deployed “off”, then turned on via the feature flag, separate from deployment. With feature flags, you can manage the entire lifecycle of a feature. This is _super_ useful for the WordPress community because it allows you to turn functionality on/off without creating and deploying a new version. You can roll out flags based on a boolean on/off, percentage of users, or just to specific users. If you’d like to be part of the beta, let us know.
  • Kernl’s 3 Node.js app servers were upgraded. They now have 1GB of RAM per server instead of 512MB.

If you have any questions, thoughts, or concerns, feel free to reach out on Twitter or comment here.  Cheers!

0 to 1 Million: Scaling my side project to 1 million requests a day

In the Beginning

In late 2014 I decided that I needed a side project.  There were some technologies that I wanted to learn, and in my experience building an actual project was the best way to do that.  As I sat on my couch trying to figure out what to build, I remembered an idea I had back when I was still a junior dev doing WordPress development.  The idea was that people building commercial plugins and themes should be able to use the automated update system that WordPress provides.  There were a few self-managed solutions out there for this, but I thought building a SaaS product would be a good way to learn some new tech.

Getting Started

My programming history in 2014 looked something like: LAMP (PHP, MySQL, Apache) -> Ruby on Rails -> Django.  In 2014 Node.js was becoming extremely popular and MongoDB had started to become mature.  Both of these technologies interested me, so I decided to use them on this new project.  So as not to get too overwhelmed with learning things, I decided to use Angular for the front end since I was already familiar with it.

A few months after getting started, I finally deployed https://kernl.us for the world to see.  To give you an idea of the expectations I had for this project, I deployed it to a $5/month Digital Ocean droplet.  That means everything (Mongo, Nginx, Node) was on a single $5 machine.  For the next month or two, this sufficed since my traffic was very low.

The First Wave

In December of 2014 things started to get interesting with Kernl.  I had moved Kernl out of a closed alpha and into beta, which led to a rise in sign ups.  Traffic steadily started to climb, but not so high that it couldn’t be handled by a single $5 droplet.

Around December 5th I had a customer with a large install base start to use Kernl, and the scale of the traffic graph completely changed.  Kernl went from ~2,500 requests per day to over 2,000 requests per hour.  That seems like a lot (or it did at the time), but it was still well within what a single $5 droplet could handle.  After all, that’s less than 1 request per second.

Scaling Up

Through the first 3 months of 2015 Kernl experienced steady growth.  I started charging for it in February, which helped fuel further growth as it made customers feel more comfortable trusting it with something as important as updates.  Starting in March, I noticed that resource consumption on my $5 droplet was getting a bit out of hand.  Wanting to keep costs low (both in my development time and actual money) I opted to scale Kernl vertically to a $20 per month droplet.  It had 2GB of RAM and 2 cores, which seemed like plenty.  I knew that this wasn’t a permanent solution, but it was the lowest friction one at the time.

During the ‘Scaling Up’ period that Kernl went through, I also ran into issues with Apache.  I started out using Apache as a reverse proxy because I was familiar with it, but it started to fall over on me when I would occasionally receive request rates of about 20/s.  Instead of tweaking Apache, I switched to Nginx and have yet to run into any issues with it.  I’m sure Apache can handle far more than 20 requests/s, but I simply don’t know enough about tuning its settings to make that happen.
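A reverse-proxy swap like this one is only a few lines of Nginx config.  This is a minimal sketch with assumed ports and server names, not Kernl’s actual configuration:

```nginx
# Proxy all traffic to a local Node.js app.
# Port 3000 and the server name are assumptions for illustration.
server {
    listen 80;
    server_name kernl.us;

    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```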

Scaling Out & Increasing Availability

For the rest of 2015 Kernl saw continued steady growth.  As Kernl grew and customers started to rely on it for more than just updates (Bitbucket / Github push-to-build), I knew that it was time to make things far more reliable and resilient than they currently were.  Over the course of 6 months, I made the following changes:

  • Moved file storage to AWS S3 – One thing that occasionally brought Kernl down or resulted in dropped connections was when a large customer would push an update out.  Lots of connections would stay open while the files were being downloaded, which made it hard for other requests to get through without timing out.  Moving uploaded files to S3 was a no-brainer, as it makes scaling file downloads stupid-simple.
  • Moved Mongo to Compose.io – One thing I learned about Mongo was that managing a cluster is a huge pain in the ass.  I tried to run my own Mongo cluster for a month, but it was just too much work to do correctly.  In the end, paying Compose.io $18/month was the best choice.  They’re also awesome at what they do and I highly recommend them.
  • Moved Nginx to its own server – In the very beginning, Nginx lived on the same box as the Node application.  For better scaling (and separation of concerns) I moved Nginx to its own $5 droplet.  Eventually I would end up with 2 Nginx servers when I implemented a floating IP address.
  • Added more Node servers – With Nginx living on its own server, Mongo living on Compose.io, and files being served off of S3, I was finally able to scale out the Node side of things.  Kernl currently has 3 Node app servers, which handle request rates of up to 170/second.
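On the Nginx side, balancing across multiple app servers boils down to an `upstream` block that round-robins requests between them.  A sketch, with invented private addresses:

```nginx
# Load balance across three Node app servers (addresses invented).
# Nginx round-robins requests between upstream members by default.
upstream kernl_app {
    server 10.0.0.1:3000;
    server 10.0.0.2:3000;
    server 10.0.0.3:3000;
}

server {
    listen 80;
    server_name kernl.us;

    location / {
        proxy_pass http://kernl_app;
        proxy_set_header Host $host;
    }
}
```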

Final Thoughts

Over the past year I’ve wondered whether taking the time to build things right the first time through would have been worth it.  I’ve come to the conclusion that optimizing for simplicity is probably what kept me interested in Kernl long enough to make it profitable.  I deal with enough complication in my day job; having to deal with it in a “fun” side project feels like a great way to kill passion.

What’s New With Kernl – November 2016

It’s been a long time since the last Kernl update blog, so let’s get right into it.

Big Features

  • GitLab CI Support – You can now build your plugins and themes automatically on Kernl using GitLab.com!  We’ve had support for GitHub and BitBucket for a long time, and finally figured out a good way to make things work for GitLab.  See the documentation on how to get started.
  • Slack Build Integration – If you are a Slack user, you can now tell Kernl where to publish build status messages.
  • Replay Last Webhook – Sometimes when you’re using continuous integration with Kernl, it’s useful to retry the last push that Kernl received.  You can now do that on the “Continuous Integration” page.

Minor Features

  • Repository Caching – We now do some light caching of your git repositories on the Kernl front end.  The first load will still reach out to the different git providers, but subsequent loads during your session will read from an in-memory cache instead.
  • Better Webhook Log Links – Instead of displaying a UUID, the webhook build log now displays the name of the plugin or theme.
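An in-memory cache like the repository cache described above can be as simple as a map of values with timestamps.  A minimal sketch (the class name and the TTL value are assumptions, not Kernl’s implementation):

```javascript
// Simple in-memory cache where entries expire after a fixed TTL.
class TtlCache {
  constructor(ttlMs) {
    this.ttlMs = ttlMs;
    this.entries = new Map();
  }

  // Returns the cached value, or undefined if missing or expired.
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (Date.now() - entry.storedAt > this.ttlMs) {
      this.entries.delete(key); // expired; evict lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    this.entries.set(key, { value, storedAt: Date.now() });
  }
}

// e.g. cache a user's repository list for one minute
const repoCache = new TtlCache(60 * 1000);
```

On a cache miss the caller falls through to the git provider’s API and stores the result, so only the first load per session pays the round-trip cost.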

Other

  • Miscellaneous Upgrades – Underlying OS packages and Node.js packages were upgraded.
  • Payment Bug Fixes – There were a few minor bugs that kept showing up if someone’s credit card expired.  This fix hopefully allows for a more self-service approach.
  • Minor copy changes – A few changes were made to the wording on the Kernl landing page.

What’s next?

  • It’s been a few months since Ubuntu 16.04 LTS came out, so I’ll be spending significant amounts of time upgrading our infrastructure to the latest LTS version.
  • If our load balancer goes down right now, everything goes down with it.  A floating IP address between two load balancers will solve that issue and provide high(er) availability.
  • Better insights into purchase code usage and activity.

What’s New With Kernl – July 2016

With summer in full-swing here in the United States, development on Kernl has been slowing down to accommodate much busier schedules than during the rest of the year.  This doesn’t mean we haven’t been busy though.

Infrastructure, Bugs, and Miscellaneous

  • When the server throws a 500 error, it now renders the correct template.  Prior to this fix Kernl would render a 404 page, which made it very hard to tell when you had encountered an actual problem.
  • We now have a robots.txt file!
  • Kernl’s Mongo infrastructure has been moved to Compose.io.  Having a professional DBA manage Kernl’s database helps me sleep easier at night and provides customers with a more performant and stable backend.
  • The landing page for Kernl was taking over 1 second to load for many people.  Caching was added, and we now have the number down to under 100ms on average.

What’s next?

July is a busy month outside of Kernl, so I don’t expect much to get done.  The current plan is to take it easy in July and then come back with renewed vigor in August.

What’s New With Kernl – June 2016

The past month of work on Kernl has seen a lot of great infrastructure improvements as well as a few customer facing features that I’m pretty excited about.

Customer Facing Features

  • Direct Uploads to AWS S3 – When Kernl was originally created all file uploads were stored directly on Kernl’s servers.  As we grew, this became an unsustainable solution, so the process changed to just use Kernl’s servers as temporary holding space before putting the file on S3.  This month we made this process even better by having files upload directly to S3. For you, this means faster uploads and less time waiting to get updates out to your customers.
  • Expiring Purchase Codes – You can now create purchase codes that expire on a specific date.  This allows you to sell your updates over time, instead of having to give them away for free for the life of the plugin or theme.
  • Max Download Purchase Code Flag – You can configure a purchase code to only allow a certain number of update downloads.  This will help resolve any issues with customers sharing purchase codes amongst themselves or across multiple installations.
  • JS Cache Busting – As customer-facing features get rolled out, Kernl automatically busts the client-side JavaScript cache for https://kernl.us.  This should help prevent confusion and remove the need for any sort of “hard refresh” when new features are released.
  • plugin_update_check.php Bug Fixes – There was an edge-case bug where some code in this file would collide with an old version of the WP-Updates plugin update check file.  This happened when a customer had your plugin installed alongside a really old version of somebody else’s plugin.  This update takes care of that collision permanently.
  • Client-side JS Errors – A few minor miscellaneous bug fixes were performed on the front-end of Kernl.

Infrastructure

  • MongoDB – The month started off with Kernl’s database moving to its own server.  This was a temporary step aimed at making the move to a highly available setup easier.
  • Mongo Replica Sets – After the first MongoDB move, the next step was to make the setup highly available.  Kernl now has 3 Mongo databases (1 master + 2 replicas).  In the event that the master database goes down, Kernl automatically fails over to one of the replicas with no downtime.
  • Memcache – Memcache was moved to its own server to make it easier to increase the number of items that Kernl caches over time.  This piece of the setup doesn’t need to be highly available.  If for some reason it goes down, Kernl will continue to operate fine.
  • Nginx – Nginx is used by Kernl both as the front door to the application and as a load balancer between the app servers.  It was moved to its own server, which allows it to scale up when we need additional capacity.  In the future (hopefully soon), we’ll use a floating IP address to give this portion of the infrastructure the ability to fail over to a backup Nginx server.
  • Multiple App Servers – Kernl’s app servers can now scale horizontally.  We’re currently running 3 app servers which Nginx load balances traffic to.  This setup allows us to add app servers easily as our traffic grows.
  • Automated Deployment – Kernl can now be deployed with a single command.
A rough drawing of how Kernl is architected now.
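For reference, standing up a 1-primary / 2-replica Mongo topology like the one described above is done with `rs.initiate` in the Mongo shell.  The hostnames here are invented, and Kernl’s actual cluster is managed by Compose.io:

```javascript
// Mongo shell: initiate a replica set with one entry per member.
// One member is elected primary; if it goes down, the remaining
// members automatically elect a new primary.
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "db1.example.com:27017" },
    { _id: 1, host: "db2.example.com:27017" },
    { _id: 2, host: "db3.example.com:27017" }
  ]
})
```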

What’s Next?

  • Caching the repository list that you see when you set up CI builds.
  • Get a rich text editor set up on the installation and description fields.
  • Theme change logs.
  • Wrap up infrastructure work.
  • Sign in / Sign up with BitBucket & GitHub.
  • Slack Integration.
  • HipChat Integration.

What’s New With Kernl – May 2016

Since last month I’ve been working hard on getting a few features out the door.  They are:

  • All new files are now hosted on S3 – Part of the work to make Kernl highly available is getting files hosted elsewhere.  When you upload a new version to Kernl, or push a change via webhooks, the deliverable now lives on S3.  Existing versions were a bit complicated to move, so that’s going to be a task for May.
  • SSL and domain renewed – The SSL cert and domain for Kernl were renewed this month.  This should have been a completely transparent change.
  • Editable version fields – For plugins, you can now edit a few fields on a version once it has been created.  This was a pre-requisite for getting changelogs implemented nicely.
  • Plugin changelog API – You can now programmatically add, get, and remove changelog entries from your plugins.  Documentation on this feature is available at https://kernl.us/documentation/api#changelog-api and full examples are available at https://github.com/vital101/Kernl-API-Examples
  • Plugin changelog tab – The changelog tab in the plugin detail update window is now populated automatically and looks like the wordpress.org version.

So what’s on the backlog for May?

  • Moving the legacy version files to S3.
  • Moving the database to its own server + adding a replica.
  • Moving Memcache to its own server.
  • Analytics

What’s New With Kernl – April 2016

Over the past 4 months we’ve been making a lot of progress on many different fronts with Kernl.   After 4 new features, 5 feature updates, 3 infrastructure changes, and numerous bug fixes, Kernl is better than ever.  Check out the detailed info below, and leave a comment or reach out if you have questions.

New Features

  • Purchase Code API – A long requested feature has been the ability to add and remove purchase codes from Kernl via an API.  This has always been supported, but there wasn’t any documentation or examples of how to do it.  We now have detailed documentation for the Purchase Code API available at https://kernl.us/documentation/api.
  • WebHook Build Log – For customers using BitBucket and GitHub integration, it could be frustrating to figure out why your build failed.  To help with that, we added a webhook build log on the continuous integration page.  It can be found at https://kernl.us/app/#/dashboard/continuous-integration
  • Envato Purchase Code Validation – Another often requested feature was the ability to validate against Envato purchase codes.  You can read about how to use and enable this functionality at https://kernl.us/documentation#envato-purchase-code-validation.
  • Caching – Since the beginning of the year, Kernl’s traffic has more than doubled and isn’t showing any signs of slowing down.  To keep response times and server load down, update check results are now cached for 10 seconds.  What this means for you is that after you upload a new version or make any changes in the ‘edit’ modal, Kernl will take a maximum of 10 seconds to reflect those changes on the update check API endpoints.
Feature Updates

  • PHP Update Check File Timeout – In the plugin_update_check.php and theme_update_check.php files that you include in your plugins and themes, the timeout value for fetching data from Kernl is set really high by default (10 seconds).  If you want the update check to fail fast in the event that Kernl is down, you can now configure this value using the remoteGetTimeout property.  Depending on how close your clients’ servers are to Kernl and how fast Kernl responds, you could likely lower this value significantly, though you should exercise caution when doing so.  The documentation has been updated to reflect the change.  You will also need to publish a new build with the updated PHP files.
  • Email Notification Settings – You can now enable/disable email notifications from Kernl.  There are two types: General and Build.  General email notifications are all emails Kernl sends to you that aren’t build emails.  Build notifications are the emails you receive when a webhook event from BitBucket or GitHub triggers a build.  You can modify these settings in your profile.
  • Failed Build Email Notifications – You will now receive an email notification when your BitBucket/GitHub webhook push fails to build in an unexpected way.  For instance, if the version number in your kernl.version file doesn’t follow semantic versioning, the build will fail and send you an email notification.
  • Indeterminate Spinner for Version Uploads – Depending on the size of your deliverable and the speed of your connection, the Kernl interface didn’t give a lot of great feedback when you were uploading a file.  An indeterminate spinner now shows while your file is being uploaded.  Copy has also been updated to reflect that this action can take a little while.
  • Filterable Select on Repository Select Drop Downs – When trying to select a repository for continuous integration, it could be a real pain if you had lots of repositories.  A filterable select field is now in place that allows you to search large lists easily.
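A build-failing version check like the one described for kernl.version files can be sketched with a small validator.  The exact rule Kernl applies isn’t documented here, so this regex (plain MAJOR.MINOR.PATCH) is an assumption:

```javascript
// Validate that a version string follows basic MAJOR.MINOR.PATCH
// semantic versioning, e.g. "1.4.2". Pre-release and build suffixes
// are deliberately not handled in this sketch.
function isValidSemver(version) {
  return /^\d+\.\d+\.\d+$/.test(version.trim());
}
```

A build pipeline would run a check like this on the kernl.version contents before packaging, and fail the build (and send the notification email) when it returns false.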
Infrastructure Changes

  • Capacity Increases – In mid March we had about 4 minutes of downtime in the wee hours of the morning while we upgraded our server capacity.  Current capacity should hold until we double or triple our traffic levels.
  • Mandrill to SendGrid Migration – Since the beginning of Kernl we used Mandrill as our transactional email provider.  As I’m sure some of you know, Mandrill sort of screwed its customers by making their free low-volume plan cost $30 per month.  Since this isn’t really something we wanted (or needed) to pay for, we migrated to SendGrid.
  • Apache to Nginx Migration – As our traffic numbers started to rise, Apache started to fall over on us.  A migration to Nginx as our reverse proxy was high in the backlog, so instead of tweaking Apache we just did a quick migration to Nginx.  With the default configuration, load levels dropped from 1–1.5 to 0.3–0.7, with no tweaking at all.  *high five nginx*
What’s Next?

  • Multi-tier Server Architecture – Kernl started out as a fun side project.  As a side project, keeping things simple as long as possible is almost always the right choice.  Now that Kernl has a growing number of paying customers, and those customers have lots of paying customers, it’s time for Kernl’s server architecture to grow as well.  Over the next month or two, we’ll be teasing apart Kernl’s current infrastructure to support horizontal scaling and automatic failover in case any node in the stack goes down.
  • Better Change Log Support – The current change log support on Kernl is… meh, at best.  A big goal for the next month or two is try and get better change log support out the door.
  • Analytics – Having some insights into your clients has always been a goal of Kernl.  Doing this efficiently and cost effectively is tough, but we’re 60% there already.  Infrastructure work has a higher priority than this right now, but getting this out the door in the next few months is a priority.
  • Bug Fixes – As always, bug fixes come first.
Other News

When you log in to Kernl, near the top you’ll see a few boxes with general stats in them.  The ‘update pings’ stat is going to be off for a while until the new analytics work is complete.  This is because the naive way we currently calculate update pings isn’t compatible with how we cache.  The ‘update downloads’ stat is still accurate since we do not cache the download endpoints.