Using Google App Engine to host your Sphinx Documentation

I was looking for a way to host my Sphinx-generated documentation and realized that Google App Engine is perfect for it! Not only is the hosting free, but it also has a great automatic deployment tool that uploads all the documentation. So, the first Demisauce documentation is now online. You can check out the code necessary to support this: it is just a few lines of app.yaml configuration and a very simple redirect.

tags: gae, sphinx

Blog updates for App Engine, Caching data and non-normalized database work.

I made a number of improvements to this blog engine over the last couple of weeks. Most notable were the performance improvements from pre-rendering and caching HTML. There has been a lot of discussion and movement in the web architecture community towards non-traditional database designs: not normalizing data, storing calculated info, and even moving to non-relational databases. All because Internet-scale databases built on transactional systems seem fraught with problems and force you to build around the DB instead of around the needs of the web app. The Google App Engine article on High Scalability gives a lot of great insight into App Engine, much of which I had been discovering separately.

To take advantage of the improvements possible with the Google App Engine Datastore, you have to start doing things differently. Here is what I did to build a better App Engine based app:

  • OnChange Handler: Since we are often pre-rendering and caching HTML, we need to hook into the events that signal when a change is needed. Digging into the Google Datastore API, I did some reverse engineering to figure out how to detect when something is changing. What I came up with is a base model to inherit from, which lets me write OnChange methods per attribute.
    from google.appengine.ext import db
    from google.appengine.ext.db import Model as DBModel
    
    class BaseModel(db.Model):
        def __init__(self, parent=None, key_name=None, _app=None, **kwds):
            self.__isdirty = False
            # pass the caller's arguments through instead of resetting them to None
            DBModel.__init__(self, parent=parent, key_name=key_name, _app=_app, **kwds)
        
        def __setattr__(self,attrname,value):
            """
            The Datastore API stores each property value in a private attribute
            (for example, "email" is stored in "_email"), so we intercept attribute
            assignment, check whether the value has changed, and then call that
            property's onchange method if one exists
            """
            if not attrname.startswith('_'):
                if hasattr(self,'_' + attrname):
                    curval = getattr(self,'_' + attrname)
                    if curval != value:
                        self.__isdirty = True
                        if hasattr(self,attrname + '_onchange'):
                            getattr(self,attrname + '_onchange')(curval,value)
            
            DBModel.__setattr__(self,attrname,value)
      
    
    Then, on my Entry class (a blog entry), I implemented a method that runs whenever an entry changes from draft to published or back, and updates the archive.
    class Entry(BaseModel):
        author = db.UserProperty()
        blog = db.ReferenceProperty(Blog)
        published = db.BooleanProperty(default=False)
        content = db.TextProperty(default='')
        # more cut out
        
        def published_onchange(self,curval,newval):
            """
            Gets called every time published status changes
            """
            if self.entrytype == 'post':
                my = self.date.strftime('%b-%Y') # May-2008
                archive = Archive.all().filter('monthyear',my).fetch(10)
                if curval == False and newval == True:
                    # add to archive
                    if archive == []: # new month
                        archive = Archive(blog=self.blog,monthyear=my)
                    else: 
                        archive = archive[0]
                    archive.entrycount += 1
                    archive.put()
                    self.blog.entrycount += 1
                else:
                    # remove from archive
                    if archive and archive[0]:
                        archive = archive[0]
                        archive.entrycount -= 1
                        if archive.entrycount == 0:
                            archive.delete()
                        else:
                            archive.put()
                    self.blog.entrycount -= 1
                
                self.blog.save()
      
    
  • Pre-Render HTML: In the previous example, we store some tables (Archive) purely for convenience. They are never queried at request time; in fact, the Archive table is itself rendered into HTML and stored again. Here is an admin entry "POST" update that then triggers a cache update. The cache is actually a Blog entity that is used on every single request. It sits halfway between the cache, configuration, and context of a normal web app.
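    import os
    from google.appengine.ext.webapp import template
    
    class AdminEntry(BaseController):
        def post(self,entrytype='post',key=None):
            # update entry from form post (loading and populating `entry` is cut out here)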
            entry.save()
            rebuild_cache(self.blog)
    
    def rebuild_cache(blog):
        """
        Pre-renders and caches HTML in the blog object.  Nothing in here
        changes very often, so we can update it at the point of change
        """
        pages = Entry.all().filter("entrytype =", "page").filter("published =", True).fetch(20)
        archives = Archive.all().order('-date').fetch(10)
        recententries = Entry.all().filter('entrytype =','post').filter("published =", True).order('-date').fetch(10)
        links = Link.all().filter('linktype =','blogroll')
        template_vals = {'recententries':recententries,'pages':pages,
                'links':links,'archives':archives}
    
        path = os.path.join(os.path.dirname(__file__), 'views/sidebar.html')
        blog.sidebar = template.render(path, template_vals)
        path = os.path.join(os.path.dirname(__file__), 'views/topmenu.html')
        blog.topmenu = template.render(path, template_vals) 
        blog.save()
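
The OnChange idea from the first bullet can be tried without the App Engine SDK. What follows is a minimal plain-Python sketch, assuming nothing about the Datastore: the ChangeTracked and Post classes and the _dirty and _log attributes are made up for illustration and are not part of Potlatch Blog or the Datastore API. It only shows the core trick of intercepting __setattr__ and dispatching to a per-attribute hook.

```python
class ChangeTracked(object):
    """Plain-Python stand-in for the BaseModel idea: call <attr>_onchange hooks."""

    def __init__(self, **values):
        # Initial values are written directly so no hook fires during construction.
        for name, value in values.items():
            object.__setattr__(self, name, value)
        object.__setattr__(self, '_dirty', False)

    def __setattr__(self, name, value):
        # Only public attributes that already hold a different value count as a change.
        if not name.startswith('_') and hasattr(self, name):
            old = getattr(self, name)
            if old != value:
                object.__setattr__(self, '_dirty', True)
                hook = getattr(self, name + '_onchange', None)
                if callable(hook):
                    hook(old, value)
        object.__setattr__(self, name, value)


class Post(ChangeTracked):
    def __init__(self, **values):
        ChangeTracked.__init__(self, **values)
        object.__setattr__(self, '_log', [])

    def published_onchange(self, old, new):
        # Record each transition; a real app would update an archive here.
        self._log.append((old, new))


post = Post(published=False, title='Hello')
post.published = True    # value changed, hook fires
post.published = True    # same value, hook does not fire
post.title = 'Hi'        # changed, but no title_onchange hook is defined
print(post._log)
```

Because the hook runs before the new value is stored, the hook sees the old value on the object and receives the new one as an argument, which mirrors the ordering in the BaseModel code above.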
        
    

These tricks made writing for App Engine even easier by moving toward an event-driven model that updates the cache. You can find all the code for this app at the Github code repository.
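
To make the "render at the point of change" idea concrete, here is a small sketch in plain Python. It assumes a toy in-memory Blog class and uses string.Template in place of the App Engine template module; the class, its methods, and the render_count counter are hypothetical and exist only to show that reads never trigger a render.

```python
from string import Template

SIDEBAR = Template('<ul>$items</ul>')


class Blog(object):
    """Holds published titles plus the pre-rendered sidebar HTML."""

    def __init__(self):
        self.titles = []
        self.sidebar = ''
        self.render_count = 0
        self.rebuild_cache()

    def rebuild_cache(self):
        # Render once, at the point of change, and keep the resulting string.
        items = ''.join('<li>%s</li>' % title for title in self.titles)
        self.sidebar = SIDEBAR.substitute(items=items)
        self.render_count += 1

    def publish(self, title):
        # A write is the event that invalidates the stored HTML.
        self.titles.append(title)
        self.rebuild_cache()

    def handle_request(self):
        # Reads never render; they return the stored string.
        return self.sidebar


blog = Blog()
blog.publish('First post')
pages = [blog.handle_request() for _ in range(3)]
print(pages[0])
print(blog.render_count)
```

The trade-off is that every write pays the rendering cost, which suits a blog where writes are rare and reads are constant. HTML escaping of the titles is left out to keep the sketch short.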

tags: gae, appengine, potlatchblog

A few more updates to Blog!

A few more updates today:
  • Added archive capability: a traditional blog archive by month
  • Added support for "migrations" in the blog. See it here. Not very scalable going forward (it can time out), but it seems to address the need for changes on App Engine for now.
  • Cleaned up slugs a bit
  • Added Draft mode/Publish capability

And started working on the Demisauce comment system, hopefully this week...

tags: updates, gae, potlatchblog

Enjoyable Deployments! Google App Engine's best feature

Wow, I have now deployed several versions of this blog to update a few things, and I am completely blown away by how enjoyable and easy deployments with App Engine are.

Here is the command to upload a new version, along with its output. Note how it recognizes and uploads only the changed files, which is nice.

$ appcfg.py update potlatchblog/
Scanning files on local disk.
Initiating update.
Cloning 23 static files.
Cloning 31 application files.
Uploading 2 files.
Closing update.
Uploading index definitions.

I am immediately jealous and wish deployments were this easy elsewhere.
Items I am NOT having to do in order to deploy:

  • Logging into a remote server via FTP and copying entire folders/files up.
  • Creating change scripts for the DB, then uploading and running them on the DB.
  • Figuring out which files are new and need to be deployed.
  • Restarting web servers.
  • Configuring and setting up the database.
  • Configuring and setting up load balancing, web servers, and different routes for static files.

That is a lot of work that I am NOT doing, wow. It will be interesting to see if the rest of the industry follows suit. Chris Anderson created "AppDrop", an Amazon EC2 image that hosts apps developed on the SDK. Rightscale.com has easy deployments for Amazon EC2 images, including a specialized one for Ruby on Rails. Capistrano for Ruby (which can also be used for non-Ruby apps) has certainly raised the bar on deployments, but any application that depends on traditional databases, load balancing, and web servers still requires a lot more configuration than Google App Engine right now. Jumpbox.com also released a virtual machine with the Google App Engine SDK installed for development purposes, but hosting that container somewhere else would not seem suitable for production, as the SDK uses flat files without indexing.

But the service that seemed closest to me is Mosso, the new Rackspace "in the cloud" service. It seems to provide auto-distribution across many machines, although it doesn't appear that their DBs do the same.

I think this will continue as a trend: all of the setup/configuration/deployment/scalability work removed from application development and hidden behind a service! Yes!! Keep it coming and keep the competition going!

tags: gae, appengine

Introducing Potlatch Blog: A google app engine personal blog

I was looking for a project to try out on Google App Engine, and also wanted something to showcase the comment system in Demisauce that I was working on. So, here is another personal blog engine. You can get the code at my Github repository.

I am still working on adding the Demisauce comment system, so you'll have to comment on my APotlatch blog on Wordpress.com or on the Github code.

tags: python, gae