Aggregated blog posts about Django, updated every hour

May 22, 2013

THANK YOU!

From DjangoCon Europe on May 22, 2013 12:56 PM

Wow, what an incredible week. We still can’t believe we actually pulled it off.

We spent the last year planning, organizing and working to deliver you the best possible experience. More importantly, we were having a lot of fun and we took a lot of risks that paid off. It’s been an amazing year and we will never forget this last week with you.

We owe a huge THANK YOU to everyone who helped us along the way and words can’t express how grateful we are. We couldn’t have made it without you. Thank you for trusting our ideas and backing us so generously from the very beginning. We were just crazy kids without any background who wanted to do fun things for the Django community and your trust gave us the power to do the impossible.

You are DjangoCon. You made the last week unforgettable for us and everyone else. The conference would not even be near this level of awesomeness if it wasn’t for you. Thanks for being so awesome. We think that since last week, the word ‘awesome’ belongs to DjangoCon forever.

If you still can’t fight your post-DjangoCon-depression, make sure to remember the Django Circus Story, watch the videos and see all the photos. We will keep you posted with new fun videos this week and publish all the talks in 3 weeks on our YouTube channel. If you posted a review or blog post about DjangoCon somewhere, do let us know about that!

Other than that, we still think we can do even better. If you want to see us raising the bar even higher, make sure to follow Makerland.

See you soon, somewhere!

South 0.8, Migrations and DjangoCon

From Andrew Godwin on May 22, 2013 12:16 PM

A new release of an old friend, and more news on django.db.migrations.

I've wanted to get a new release of South out for ages, so I'm delighted that I've finally done so. South 0.8 is now available on PyPI - there aren't a great many changes, the most notable (and the reason for the major version bump) being Python 3 support.

Aymeric Augustin was instrumental in getting that support implemented, so I'd like to thank him for his work on it. On a related note, support for Python 2.5 is being dropped - if you still need that, you'll need to stick with the 0.7.x series.

The other notable change is support for index_together, one of the new improvements in Django 1.5 and something that should have been released a while ago. There's still no first-party support for AUTH_USER_MODEL - it'll work fine as long as you're not distributing third-party apps with migrations. The overall solution to that is something that will have to be implemented in the rewrites that are underway.

db.migrations

Those rewrites are coming along well, however. Last week I was at DjangoCon EU, in Warsaw, Poland, and I had a fantastic time, as you can read in my blog post about it. In particular, I had some good discussions with fellow core developers and Django and South users, to clear up some more thoughts I was having.

At the sprints, I got quite a bit more code implemented for db.migrations - as always, you can see the progress on my GitHub branch.

Most progress was on the "state" code and field freezing, so I'd like to discuss that.

State

The "state" part of db.migrations is the part which is responsible for the in-memory running of migrations to build correct versions of models.

In essence, it runs each of the actions in your migrations on fake versions of models (represented by a class called ModelState) in memory, and at the end it can then render those states into full models, to use for a data migration or pass to the schema migration functions.
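
The idea can be pictured with a toy version in plain Python. This is only a sketch under stated assumptions: the ModelState class below is a hypothetical stand-in (not the one in the branch), fields are described by plain strings, and add_field and render are invented helper names.

```python
import copy


class ModelState:
    """Hypothetical stand-in for an in-memory model description."""

    def __init__(self, name, fields=None, options=None):
        self.name = name
        self.fields = dict(fields or {})    # field name -> field description
        self.options = dict(options or {})  # the things you would put in Meta


def add_field(state, field_name, description):
    """Return a new state with one extra field; the input is left untouched."""
    new_state = copy.deepcopy(state)
    new_state.fields[field_name] = description
    return new_state


def render(state):
    """Build a throwaway class from a state (a real renderer would build a model)."""
    return type(state.name, (object,), dict(state.fields))


initial = ModelState("Author", {"name": "CharField(max_length=100)"})
migrated = add_field(initial, "email", "EmailField()")
Author = render(migrated)
```

Each action produces a new state rather than mutating the old one, so every intermediate version of the model stays available for rendering.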

The basic format is reasonably simple - there's just a class that represents a model, with attributes for all the things models can have, like their options (the things you put in Meta) and their name.

Fields, however, are more tricky. The problem South has faced since its inception is how you take a set of fields and serialise them - something that has finally been fixed.

The Good, The Bad and the Source Code Parser

You see, there's no way, given an instance of a Field, to tell how you reconstruct it. Sure, you can tell what class it is, and some values are obvious (like field.max_length), but getting the value that you passed in to a ForeignKey for its relationship is trickier.

The first versions of South solved this in a very simple way - they opened up your models.py file, read the source code, and chopped out the field definition using string manipulation. Needless to say, this was very fragile, and didn't work with any kind of conditional around fields.

The next (and currently shipping) approach was to inspect the fields' attributes using something called modelinspector. This is a built-in set of rules that South has about how to work out a field's arguments just by inspecting its attributes.

While this works well for core Django fields, there's no way of knowing how third-party fields work without shipping rules for them with South (which a few apps have) or by declaring them yourself when you declare the field.

The way these custom rules were declared was difficult to understand and not immediately obvious, and so there have been a lot of complaints with the current method about custom fields and South not really playing well together.

In particular, South wouldn't just accept a custom field even if it was a simple subclass - you had to tell South it was safe to use using a list of regular expressions on field path names. While it's worked till now, it's clearly not the best solution.

Introducing deconstruct()

The new solution is now in my branch - passing this responsibility onto the fields themselves. The API a field is required to provide has grown an additional function: deconstruct().

This function takes no arguments, and returns the four values needed to recreate the field: its attribute name (what field name it was assigned to on the model), a path to import it from, positional arguments and keyword arguments.

The base implementation of this on Field is the most complex one and handles all the default arguments. New field classes will just need a simpler override, like the one for DecimalField, which adds on the new arguments.
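
As a rough illustration of the shape of such an override, here is a sketch that runs without Django. The Field and DecimalField classes are simplified, hypothetical stand-ins, and the set of options they handle is an assumption made for the example, not the code in the branch.

```python
class Field:
    """Hypothetical minimal field; real Django fields take many more options."""

    def __init__(self, null=False, default=None):
        self.name = None  # set when the field is attached to a model
        self.null = null
        self.default = default

    def deconstruct(self):
        # Only report arguments that differ from their defaults.
        kwargs = {}
        if self.null:
            kwargs["null"] = True
        if self.default is not None:
            kwargs["default"] = self.default
        path = "%s.%s" % (self.__class__.__module__, self.__class__.__name__)
        return self.name, path, [], kwargs


class DecimalField(Field):
    def __init__(self, max_digits=None, decimal_places=None, **kwargs):
        super().__init__(**kwargs)
        self.max_digits = max_digits
        self.decimal_places = decimal_places

    def deconstruct(self):
        # Reuse the base result and add only this class's own arguments.
        name, path, args, kwargs = super().deconstruct()
        if self.max_digits is not None:
            kwargs["max_digits"] = self.max_digits
        if self.decimal_places is not None:
            kwargs["decimal_places"] = self.decimal_places
        return name, path, args, kwargs


price = DecimalField(max_digits=8, decimal_places=2, null=True)
price.name = "price"
name, path, args, kwargs = price.deconstruct()
rebuilt = DecimalField(*args, **kwargs)  # same constructor arguments as the original
```

The point of the four-value return is that the import path plus the two argument collections are enough to rebuild an equivalent field later, without reading any source code.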

I'll be writing up full documentation on this into the Django docs as part of my branch, but just keep in mind that all custom fields will need to provide this method soon, or they will not be usable with migrations. I plan to submit pull requests to a decent number of third-party apps that use custom fields with this method implemented for them, to help kickstart adoption.

Back to State

This all means that the state tracking can now work - it has methods to take either a model or a whole AppCache and turn it into a ModelState or ProjectState object, which can then rebuild models or AppCaches respectively.

This is what will power the autodetection - South will render the most recent version of the models it has, and compare them to the ones you currently have in your project. If there are any material database differences - a new field, a model has gone, db_table is changed - it will generate the appropriate migration.

Some changes don't affect the database, of course. verbose_name never touches the database, and much to people's surprise, neither does default - Django implements all defaults in Python rather than in the database, as otherwise there's no way to allow arbitrary callables as a default value (something which is causing some pain when serialising, let me tell you).
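
A small sketch of why a Python-side default has to stay in Python. The Field class here is a hypothetical stand-in that only models default handling; it is not Django's implementation.

```python
class Field:
    """Hypothetical stand-in for a model field; only the default handling is shown."""

    def __init__(self, default=None):
        self.default = default

    def get_default(self):
        # Callables are invoked each time, so every new object gets a fresh value.
        if callable(self.default):
            return self.default()
        return self.default


tags = Field(default=list)
first, second = tags.get_default(), tags.get_default()
count = Field(default=0)
```

A database DEFAULT clause can express the constant 0, but it has no way to run an arbitrary Python callable such as list for each new row.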

Context Managers

The other change that might affect users is that I've changed SchemaEditor to be a context manager, as suggested by a few people last week. That means that you now use it like so:

with connection.schema_editor() as editor:
    editor.create_model(Foo)
    editor.delete_model(Bar)
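
One reason a context manager fits here is that it gives the editor a well-defined point to finish any outstanding work. The following is a toy sketch only: this SchemaEditor is a hypothetical class that records strings instead of running SQL, and the idea that constraints are deferred until the block exits is an assumption made for illustration.

```python
class SchemaEditor:
    """Hypothetical toy editor that records operations instead of running SQL."""

    def __init__(self):
        self.executed = []
        self.deferred = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is None:
            # Flush deferred work only when the block finished cleanly.
            self.executed.extend(self.deferred)
        self.deferred = []
        return False  # never swallow exceptions

    def create_model(self, name):
        self.executed.append("CREATE TABLE %s" % name)
        # Assumption: constraints are deferred until every table exists.
        self.deferred.append("ADD CONSTRAINTS %s" % name)

    def delete_model(self, name):
        self.executed.append("DROP TABLE %s" % name)


with SchemaEditor() as editor:
    editor.create_model("foo")
    editor.delete_model("bar")
```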

What's next?

Now that's all in place, the work of getting migrations to load from disk, create in-memory models and then run them through the schema editor is next - essentially, bringing together the past few weeks' work into a functioning whole.

Some of that code is already in place - a disk loader already reads classes from disk, and a recorder already has code to mark migrations as applied or not - but there's some more work in deciding the user interface for migrations in terms of commands.

Should the migrate command stay? Should it all be rolled into syncdb? Should they both go in favour of a third option? Some planning is needed. Any opinions are welcome, either via email or Twitter.

May 21, 2013

Transactions for web developers - Aymeric Augustin

From Reinout van Rees on May 21, 2013 06:43 AM

Initially he didn't know a lot about transactions, so he researched them in depth. A quote by Christophe Pettus: "transaction management tools are often made to seem like a black art".

He moves from the database (postgres and sqlite) to the interface (psycopg2 and sqlite3) to the framework (django).

Database

A definition: an SQL transaction is a sequence of SQL statements that is atomic with respect to recovery. In SQL 92, a transaction begins with a transaction-initiating statement (almost everything can start a transaction) and it ends with a commit, an explicit rollback (ROLLBACK) or an implicit rollback.

SQL 1999 changed this a bit. It has savepoints. After a savepoint, you can rollback to that savepoint, to a previous savepoint or you can set a new savepoint. Oh, and there is an explicit transaction start statement (START TRANSACTION).
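
Savepoints are easy to try out with the sqlite3 module from the standard library. A minimal sketch, assuming an in-memory database and a made-up table; isolation_level=None is used so that every transaction statement is written out explicitly.

```python
import sqlite3

# isolation_level=None stops the sqlite3 module from issuing BEGIN itself,
# so every transaction statement below is explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE talk (title TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO talk VALUES ('Transactions')")
conn.execute("SAVEPOINT before_typo")
conn.execute("INSERT INTO talk VALUES ('Tranzactions')")
conn.execute("ROLLBACK TO SAVEPOINT before_typo")  # undo only the second insert
conn.execute("COMMIT")

titles = [row[0] for row in conn.execute("SELECT title FROM talk")]
```

Only the work done after the savepoint is undone; the first insert survives and is committed with the rest of the transaction.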

Key findings:

  • Statements always run in transactions.
  • Transactions are opened automatically.
  • Transactions are advanced technology.

Remember the dreaded "current transaction is aborted, commands ignored until end of transaction block" postgresql fault? What it actually means is "a previous statement failed, the application must perform a rollback". You cannot let postgres do any auto-recovery, that would break transactional integrity. It is your application that needs to do it (and it should always do it).

(I didn't hear what the actual solution is). Update: Diederik says in his comment that the solution is to just switch on autocommit for postgres in the database settings.

There's also AUTOCOMMIT. Most databases default to it. It commits every single statement automatically. Normally, you are either in auto-commit mode or inside transactions.

Interface: the python client libraries

Psycopg2 and sqlite3 are wrappers around C libraries. They use the DB API 2.0, PEP 249. It defines connections and cursors. Connections implement transactions, cursors do fetching and setting.

Note: the PEP wants the auto-commit to be off, initially!

Psycopg2 handles it by inserting a BEGIN before every statement, unless there's already a transaction in progress. Even for SELECTs.

Sqlite3 also inserts BEGIN, but not for a SELECT. All other statements get a COMMIT. Even a statement like SAVEPOINT: this is broken by design ("documentation issue").
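
The implicit BEGIN can be observed from Python itself. A minimal sketch, assuming a current Python 3 sqlite3 module with default settings and a made-up table. Details of the implicit COMMIT described above have changed in later Python versions, so only the BEGIN side is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # default settings: the module manages BEGIN
conn.execute("CREATE TABLE talk (title TEXT)")
conn.commit()

conn.execute("SELECT title FROM talk").fetchall()
after_select = conn.in_transaction   # no BEGIN was issued for the SELECT

conn.execute("INSERT INTO talk VALUES ('Transactions')")
after_insert = conn.in_transaction   # an implicit BEGIN preceded the INSERT

conn.commit()
after_commit = conn.in_transaction
```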

Key findings:

  • The DB API requires the same transactional behaviour as the SQL standard.
  • Client libraries for databases that always autocommit have to emulate this behaviour.
  • But you can turn it off and use autocommit

Framework (django)

Django 1.5 and earlier runs with an open transaction. For updates/deletes/saves, django does a commit. More or less auto-commit.

There's transaction middleware. One http request = one transaction. Commit on success, roll back on exception. It only works for the default database, though. And depending on the order of your middleware, it may or may not apply.

Django provides a couple of high-level APIs. with transaction.autocommit():, with transaction.commit_on_success():, with transaction.commit_manually():. There is also a low-level API for doing stuff manually.

Key findings:

  • OK to forget it, it will change in 1.6.
  • The middleware is a reasonable idea.
  • The decorators/context managers don't work well, they often cannot be nested.

Django 1.6 uses database-level autocommit, which is what you'd normally expect. There are atomic transactions for requests: only for the view functions. Again, one transaction per HTTP request.

The high level API is now called atomic. Usable as a decorator and as a context manager. It can be safely nested.

Key learnings:

May 20, 2013

Why I left Heroku, and notes on my new AWS setup

From Adrian Holovaty on May 20, 2013 03:44 PM

On Friday, we migrated Soundslice from Heroku to direct use of Amazon Web Services (AWS). I'm very, very happy with this change and want to spread the word about how we did it and why you should consider it if you're in a similar position.

My Heroku experience

Soundslice had been on Heroku since the site launched in November 2012. I decided to use it for a few reasons:

  • Being a sysadmin is not my thing. I don't enjoy it, and I'm not particularly good at it.
  • Soundslice is a two-man operation (developer and designer), and my time is much better spent working on the product than doing sysadmin work.
  • Heroku had the promise of easy setup and easy scaling in cases of high traffic.

While I was getting Soundslice up and running on Heroku, I ran into problems immediately. For one, their automatic detection of Python/Django didn't work. I had to rejigger my code four or five times ("Should settings.py go in this directory? In a subdirectory? In a sub-subdirectory?") in order for it to pick up my app -- and this auto-detection stuff is the kind of thing that's very hard to debug.

Then I spent a full day and a half (!) trying to get Django error emails working. I verified that the server could send email, and all the necessary code worked from the Python shell, but errors just didn't get emailed out from the app for some reason. I never did figure out the problem -- I ended up punting and using Sentry/Raven (highly recommended).

These experiences, along with a few other oddities, made me wary of Heroku, but I kept with it.

To its credit, Heroku handled the Soundslice launch well, with no issues -- and using heroku ps:scale from the command line was super cool. In December, Soundslice made it to the Reddit homepage and 350,000 people visited the site in a period of a few hours. Heroku handled it nicely, after I scaled up the number of dynos.

But over the next few months, I got burned a few more times.

First, in January, they broke deployment. Whenever I tried to deploy, I got ugly error messages. I ended up routing around their bug by installing a different "buildpack" thanks to a tip from Jacob, but this left a sour taste in my mouth.

Then, one April evening, I deployed my app, and Heroku decided to upgrade the Python version during the deploy, from 2.7.3 to 2.7.4. (In itself, that's vaguely upsetting, as I didn't request an upgrade. But my app code worked just as well on the new version, so I was OK with it.) When the deployment was done, my site was completely down -- a HARD failure with a very ugly Heroku error message being shown to my users. I had no idea what happened. I raced through my recent commits, looking for problems. I looked at the Heroku log output, and it just said some stuff about my "soundslice" package not being found. I ran the site locally to make sure it was working. It was working fine. I had deployed successfully earlier in the day, and I had made no fundamental changes to package layout.

After several minutes of this futzing around, with the site being completely down, after I had just sent the link to some potential partners who, for all I know, were evaluating the site that very moment -- I deployed again and the site worked again. So it was nothing on my end. Clearly just something busted with the Heroku deployment process.

That's when Heroku lost my trust. From then on, whenever I deployed, I got a little nervous that something bad would happen, out of my control.

Around the same time, Soundslice began using some Python modules with compiled C extensions and other various non-Python code that was not deployable on Heroku with their standard requirements.txt process. Heroku offers a way to compile and package binaries, which I used successfully, but it was more work using that proprietary process than running a simple apt-get command on a server I had root access to.

With all of this, I decided it was time to leave Heroku. I'm still using Heroku for this blog, and I might use it in the future for small/throwaway projects, but I personally wouldn't recommend using it for anything more substantial. Especially now that I know how easy it is to get a powerful AWS stack running.

My AWS setup

I'm lucky to be friends with Scott VanDenPlas, who was director of dev ops for the Obama reelection tech team -- you know, the one that got a ton of attention for being awesome. Scott helped me set up a fantastic infrastructure for Soundslice on AWS. Despite having used Amazon S3 and EC2 a fair amount over the years, I had no idea how powerful Amazon's full suite of services really were until Scott showed me. Unsolicited advertisement: You should definitely hire Scott if you need any AWS work done. He's one of the very best.

The way we set up Soundslice is relatively simple. We made a custom AMI with our code/dependencies, then set up an Elastic Load Balancer with auto-scaling rules that instantiate app servers from that AMI based on load. I also converted the app to use MySQL. In detail:

Step 1: "Bake" an AMI. I grabbed an existing vanilla Ubuntu AMI (basically a frozen image of a Linux box) and installed the various packages Soundslice needs with apt-get and pip. I also compiled a few bits of code I needed that aren't in apt-get, and I got our app's code on there by cloning our Git repository. After that instance had all my code/dependencies on it, I created an AMI from it ("Create Image (EBS AMI)" in the EC2 dashboard).

Step 2: Set up auto-scaling rules. This is the real magic. We configured a load balancer (using Amazon ELB) to automatically spawn app servers based on load. This involves setting up things called "Launch configurations" and "scaling policies" and "metric alarms." Check out my Python code here to see the details. Basically, Amazon constantly monitors the app servers, and if any of them reaches a certain CPU usage, Amazon will automatically launch X new server(s) and associate them with the load balancer when they're up and running. Same thing applies if traffic levels go down and you need to terminate an instance or two. It's awesome.
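
The decision that the scaling policies encode boils down to something like the following. This is a plain-Python sketch, not the Amazon API; the function name, thresholds and limits are all invented for illustration.

```python
def desired_capacity(current, cpu_percent, scale_up_at=70.0, scale_down_at=20.0,
                     step=1, minimum=1, maximum=10):
    """Return the new instance count for one evaluation of the alarms.

    All thresholds and limits are made-up illustration values.
    """
    if cpu_percent >= scale_up_at:
        target = current + step
    elif cpu_percent <= scale_down_at:
        target = current - step
    else:
        target = current
    # Never go below the minimum or above the maximum group size.
    return max(minimum, min(maximum, target))
```

With these example values, two servers at 85% CPU become three, and two servers at 10% CPU shrink back to one, while the minimum and maximum keep the group within bounds.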

Step 3: Change app not to use shared cache. Up until the AWS migration, Soundslice used memcache for Django session data. This introduces a few wrinkles in an auto-scaled environment, because it means each server needs access to a common memcache instance. Rather than have to deal with that, I changed the app to use cookie-based sessions, so that session data is stored in signed cookies rather than in memcache. This way, the web app servers don't need to share any state (other than the database). Plus it's faster for end users because the app doesn't have to hit memcache for session data.
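
In Django, that switch is a settings change. A minimal fragment, assuming Django 1.4 or later, where the signed-cookie session backend is available:

```python
# settings.py (fragment)

# Store session data in a signed cookie instead of a shared cache.
SESSION_ENGINE = "django.contrib.sessions.backends.signed_cookies"

# Keep the cookie out of reach of JavaScript; signing relies on SECRET_KEY,
# so that value must stay private.
SESSION_COOKIE_HTTPONLY = True
```

Signed cookies are readable by the client (they are signed, not encrypted), so this approach only suits session data that is not secret.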

Step 4: Migrate to MySQL. Eeeek, I know. I have been a die-hard PostgreSQL fan since Frank Wiles showed me the light circa 2003. But the only way to use Postgres on AWS is to do the maintenance/scaling yourself...and my distaste for doing sysadmin work is greater than my distaste for MySQL. :-) Amazon offers RDS, which is basically hosted MySQL, with point-and-click replication. I fell in love with it the moment I scaled it from one to two availability zones with a couple of clicks on the AWS admin console. The simplicity is amazing.

Step 5: Add nice API with Fabric. Deployment was stupidly simple with Heroku, but it's easy to make it equally simple using a custom AWS environment -- I just had to do some upfront work by writing Fabric tasks. The key is, because you don't know how many servers you have at a given moment, or what their host names are, you query the Amazon API (using the excellent boto library) to get the hostnames dynamically. See here for the relevant parts of my fabfile.

Ongoing: Update AMI as needed. Whenever there's a new bit of code that my app needs -- say, a new apt-get package -- I make a one-off instance of the AMI, install the package, then freeze it as a new AMI. Then I associate the load balancer with the new AMI, and each new app server from then on will use the new AMI. I can force existing instances to use the new AMI by simply terminating them in the Amazon console; the load balancer will detect that they're terminated and, based on the scaling rules, will bring up a new instance with the new AMI.

Another approach would be to use Chef or Puppet to automatically install the necessary packages on each new server at instantiation time, instead of "baking" the packages into the AMI itself. We opted not to do this, because it was unnecessary complexity. The app is simple enough that the baked-AMI approach works nicely.

Put all this together, and you have a very powerful setup that I would argue is just as easy to use as Heroku (once it's set up!), with the full power of root access on your boxes, the ability to install whatever you want, set your scaling rules, etc. Try it!

A Rhapsody In Warsaw

From Andrew Godwin on May 20, 2013 02:17 PM

A field, a tent, and a large amount of Polish food - the makings of a great conference.

DjangoCon and I have a long history. The very first DjangoCon, back in 2008, was also my very first conference - and I've achieved the slightly dubious honour of having attended every single one.

They are not, of course, the only conferences that I go to; these days I try to speak at a variety of events. I've seen a lot of venues and they're all variations on a theme. That theme, of course, is large rooms full of chairs.

DjangoCon EU 2013, hosted last week in Warsaw, bucked that trend and was probably the best yet - and that's not something I say lightly. Ola Sitarska and the rest of her team went for an inspired gamble that really paid off.

The stage, and Craig Kerstiens. From flickr.com/photos/patrick91

When I first heard of the plans to host this year's DjangoCon EU in a circus tent, I was a little sceptical - after all, conference venues have evolved over many decades to serve the many needs of a large-scale event. Seating, airflow, power, networking, A/V, catering and toilets are all needs of a modern conference population.

The end result, however, was impressive. The circus tent had been outfitted with power, WiFi, lighting, a stage, projectors, audio and even chillers full of drinks, and beats many indoor venues I've spoken in. One entire side of the tent was open to the outside, providing easy airflow and access without making the inside too bright.

There were a couple of small niggles - the flight path of the nearby airport had changed the week before, meaning planes would occasionally interrupt talks, there wasn't quite enough toilet capacity at peak times, and WiFi was sluggish - which is normal for tech conferences. These were all outweighed by the positives, though - and what positives there were.

One of my tweets describes the conference as "like a music festival", which gives some idea of the wonderful attitude everyone had. Hammocks, deckchairs and bean bags were some of the seating options, there was a plentiful supply of free popcorn, and in between talks you could wander down to the fountain, relax on picnic blankets under the trees or dip into an entire freezer of ice cream.

Everyone was in a very good mood, and very relaxed. DjangoCon has always gone for a laid back approach, and here it worked incredibly well. I mostly come to DjangoCon to socialise and meet new people, rather than to learn from the talks, and in that environment it worked very well.

Danny's handstand lessons are almost a DjangoCon staple. From flickr.com/photos/patrick91

I'd like to highlight the catering in particular - it makes such a difference to have snacks and drinks available throughout the day rather than at set times. It was possible at any point in the afternoon to go and get some ice cream, fruit juice or even one of the sandwiches from breakfast.

Not only that, but the catering during the sprints wasn't the usual case of just ordering pizza or sandwiches for everyone - there were proper hot meals, with desserts. Portion size suffered a little since the sprints were so popular, but it was still very tasty, and I'm pleased to see healthier food at a sprint event.

As a speaker, the slick execution of this conference began before I even arrived. As DjangoCon is a community-run conference, staffed entirely by volunteers, speakers are generally expected to pay their own way, sometimes including some of the ticket price. This time, however, not only was admission free but the organisers picked us up from the airport and sorted out the hotel.

Of course, it's not that unusual for a conference to do this at all - I've experienced it many times before - but to do it while still keeping the prices low was impressive, and Ola and her team did it well, keeping us up to date and taking feedback into account very quickly.

One thing I wish more conferences did for their speakers is providing a local SIM card with data - this is especially useful in non-English-speaking countries, where getting one can be tricky. DjangoCon provided one right in the welcome basket, and I used it all week - a data connection is invaluable for navigating a foreign city.

Approaching the circus through the trees

I continue to be really impressed by the way the Django community evolves each year. DjangoCon EU itself gets more impressive every year - and previous years have set a high standard indeed.

Conferences are a very important part of bringing a community together and fostering the cohesion that really helps keep a project like Django going. I'm so glad that DjangoCon exists, and that each one helps push forward projects both old and new. It's somewhat unfortunate that the tech scene is mostly governed by who you know, rather than what, but events like these offer a way to improve both at once.

I'm also amazed how each year another group of volunteers tirelessly steps up to host, and this year was no different - a team of French volunteers stepped up to host it in the French Riviera next year. I can't wait - I wish them the best of luck.

I have many friends who I only ever see at conferences, and so the return of conference season each year is always a delight. It's a privilege to be able to attend and enjoy all these events each year, and to the organisers - not only of DjangoCon EU but all similar events - I'd like to say one thing: dziękuję!

DjangoCon EU 2013

From Horst Gutmann on May 20, 2013 12:00 AM

DjangoCon EU has always been something special. The very first European DjangoCon back in 2009 in Prague was a great start and over the years with Berlin, Amsterdam and Zurich in between it grew and grew and got better and better. This year, DjangoCon made a stop in Warsaw and, to be honest, it was the greatest conference I've ever been to.

Back when the first plans for this year's event started to leak I was all "what on Earth???", but the local team and all the helpers including Ola Sitarska, Ola Sendecka, Kuba Kucharski, Tomek Paczkowski and Jarek Piotrowski made it happen: A tech conference in a park, in a tent, that worked!!!

The conference

Over the years, conferences have evolved into those perfectly organized events where usually only one thing breaks: the WiFi. They usually take place in perfectly AC'd hotels or conference centers and depending on the price you pay you get served hot and cold beverages by people in black & white & bow-ties. Well, DjangoCon EU has always been different, being a conference by the community for the community, but this year was like something from a whole different planet, with the organizers taking a huge risk and it working out perfectly.

The DjangoCon Circus in all its glory!

The tent is probably the most risky venue you can have. If the weather changes and you end up getting hit by rain from all directions, your conference venue might become a nightmare (depending naturally also to some degree on your attendees). But the exact opposite happened last week: Summer!

All week was perfect summer weather. Naturally, the tent heated up slightly but was still quite bearable and there was ice cream at the buffet whenever you wanted some :D

Then there were the talks. My main objectives when going to conferences are to meet up with people I haven't seen for a while and getting new ideas. No point in talking about the fun-part since ... heck, we were in a tent in a park in Warsaw! And right on the first day I moved PyGraz.org from being managed by supervisord to circus after seeing Tarek Ziadé's talk about it. So much for new ideas ;-)

Probably the biggest theme here was databases, though, with at least 5 talks being about how to store your data (mostly in PostgreSQL) the right way:

Good stuff! :-)

So, the conference was awesome, but so was the after-show party: There was a large grill with exceptionally long queues for food and beer. Sadly the local fauna took that as an opportunity to assault us with all the mosquitos it had. Seems like for most this wasn't really a problem but Ulrich and I were sadly hit rather hard so we left early while the others still had a blast into the morning. But at least I still saw Rob Spectre making the crowd go wild ;-)

Rob Spectre doing a live performance ;-)

The sprints

After everyone had gotten out of bed on Saturday, the sprint was the next point on the agenda. I guess more than 200 people attending a Django sprint will get its own big entry in the Django history book. The venue was in the heart of Warsaw at the HardGamma offices, with enough power and WiFi for everyone, and despite there being nearly twice as many people as anticipated there was also enough food for everyone!

Big thanks to the organizers for such an awesome week! And see you all again next year in France!

Travel appendix

  • Get a Play prepaid data SIM card. 20 PLN for 1GB is a steal!
  • For 30 PLN per day for a room for two the Hostel Słuzewiec is quite nice.
  • OMG, the burger at Meat Love!
  • ... while the burger at the Hard Rock Café is overrated ;-)
  • Besides TripAdvisor, Foursquare is surprisingly useful for finding good food in a city you don't know :-)

May 18, 2013

Announcement

From DjangoCon Europe on May 18, 2013 03:44 PM

It has been brought to our attention that there may have been a violation of the code of conduct at the DjangoCon speakers dinner. We have started an investigation of the incident. If our investigation reveals that a violation did occur, we will make a further announcement regarding the action we will take.

May 17, 2013

Lightning talks day 3 - Djangocon.eu

From Reinout van Rees on May 17, 2013 03:32 PM

html5lib

Browsers are terribly forgiving. Python's parsers don't deal with everything, even valid html5 docs. The old html5lib was a problem: it lived on Google Code and was not Python 3 compatible.

The new html5lib supports python 3. Github, readthedocs, works fine!

See https://pypi.python.org/pypi/html5lib

Real time web - Aymeric Augustin

He looked at web sockets in django. He played with tulip, Guido's library for async python. He had 1000 processes calculating a 'game of life' screen and django connected with them just fine and pushed the result to the browser.

PyWaw

PyWaw is a python community in Warsaw. They have now had 24 meetings with about 55 attendees. At the last meeting they even had 100 people attending.

They are not alone in Poland, there are other user groups.

So... go back to your cities and start user groups!

Scrapy

Screen scraping is when you need to get structured information from the web, quickly and with no hassle.

Scrapy takes the hassle out of screen scraping. It takes away the pain of parsing horrible html.

It has perfect documentation and a helpful community.

You can even scrape from amazon, even including logging in.

What can you do? Convert SVG to VML. Stock checker for a market place. Testing your own website.

Motivating users - Aaron Bassett

How to motivate kids aged 7-17 to learn online. Don't give rewards. If you give rewards, that means that the task must be really shit. Rewards don't scale either. After initial success, do you increase the reward?

Everyone is addicted to dopamine, the stuff you get in your head when you like something. Don't give a reward always, because that turns play (which kids like) into work.

They did some tests: random rewards do seem to work. So that's something you can look at.

Better model inheritance - Craig de Stigter

There are three kinds of model inheritance now in Django:

  • Proxy models.
  • Abstract models.
  • Multi-table.

None fit exactly with his use case.

What he made was django-typed-models. A bit like proxy models, but it does store the type of the object in a type field, so you can figure out what you actually are.

They even use python magery for re-casting objects as a different type: self.__class__ = NewClass :-)
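
To make the re-casting trick concrete, here is a minimal plain-Python sketch. It does not use Django or django-typed-models; the Animal, Dog and Cat classes and the registry dict are hypothetical stand-ins for a model with a type field.

```python
class Animal(object):
    """Plain-Python stand-in for a model row that stores its type in a field."""

    # Hypothetical registry mapping the stored type string to a class.
    registry = {}

    def __init__(self, name, type_name):
        self.name = name
        self.type = type_name
        # Re-cast the instance to the class named in its type field.
        self.__class__ = self.registry.get(type_name, Animal)

    def speak(self):
        return "..."


class Dog(Animal):
    def speak(self):
        return "Woof"


class Cat(Animal):
    def speak(self):
        return "Meow"


Animal.registry.update({"dog": Dog, "cat": Cat})

# Rows as they might come out of a single table with a type column.
rows = [("Rex", "dog"), ("Tom", "cat"), ("Blob", "unknown")]
animals = [Animal(name, type_name) for name, type_name in rows]
sounds = [a.speak() for a in animals]
```

Every object is created through the base class, yet each one ends up behaving as the subclass named in its type field. Unknown types simply stay base instances.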

Django-fluent CMS - Diederik van der Boor

It is a CMS he built for his own use. Many CMSs are, in the end, monolithic.

He made a CMS that consists of separate parts. If you just want a tree of pages, use django-fluent-pages. If you just want an editable main part of a page, use another app. Etcetera.

See https://pypi.python.org/pypi/django-fluent-pages/, https://pypi.python.org/pypi/django-fluent-contents/

And you can also use django-fluent-dashboard, a more beautiful admin skin.

Update: he's got a website now: http://django-fluent.org/

Adventurer in the land of production environment - Maciej Pasternacki

Is your production environment up? Use a monitor like http://pingdom.com.

Django should not run as root. Run it with gunicorn and nginx, for instance.

Get immune to surprise upgrades: pip freeze.

Amulet of life saving: re-spawn after death with supervisord.

Stun immunity: a crontab with @reboot, for instance.

Acquire skill: chef, puppet.

The final battle: the slashdot effect. Gear up: autoscaling, self-healing.

New core committer

Marc Tamlyn is the new core committer!

Invisible and intentional management - Darin Swanson

Your project is not code, your project is your people. Make them happy. Make them do the best they can, no more, no less. Keep them leveling up.

Reward teamwork. It is not about the individual. Don't have individual goals, have team roles instead. Talk about "we" and "us". Lead by example. Help the team. Help everyone do better.

If you're a manager, try to be invisible. You're behind the scenes. Multiplying your impact behind the scenes. Don't take credit, the credit is for the team.

Move people to autonomy. Stay away from command and control. Set degrees of freedom and let people grow.

There's an implicit contract between you and your team member: I'll give you freedom, you'll share status and information with me.

Discard what doesn't work, double down on what does. Especially regarding teamwork.

Do try to become better yourself, too. Find a mentor, read books, talk with others.

Relationships - Daniele Procida

There are 7E9 people in the world. Is your relationship really the best choice? Mathematically not. Don't worry. Instead, commit to what you have already chosen and make it the best relation for you.

Same with web frameworks. There are so many... Stop worrying about making the wrong choice, stick with the one you have already chosen and make it the best for you.

Another subject: he wrote https://github.com/evildmp/django-inspector to report on all sorts of pages that his users have added to his system. Status codes and so on.

Arduino.loal

He hacked his landlord's garage door opener. They only had one and there were multiple people that needed to use it. So hack the opener, add an arduino and a webserver to control the garage door. They also added django-social-auth.

An enterprise level garage door opener!

SPDY - Emanuele Palazetti

How to deploy Django over SPDY. How to get that to work? Run django inside jython and thus inside java and SPDY push actually works.

3 simple ways to make your site load faster - Filip Wasilewski

  • Database connection pooling. Creating a connection can take quite some time. Connection pooling will come in django core 1.6.
  • Cache templates. Especially if you use something like django-crispy-forms that uses lots of small templates. You only need to enable the cached template loader in TEMPLATE_LOADERS in your settings.
  • pjax. Push state ajax. That helps a lot.
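
For the template caching tip, the settings fragment could look roughly like this. It is a sketch using the Django 1.5-era TEMPLATE_LOADERS setting; the two wrapped loaders are an assumption (they are Django's usual defaults), so use whichever loaders your project already has.

```python
# settings.py fragment (Django 1.5-era setting name).
# The cached loader wraps the regular loaders and keeps compiled
# templates in memory after the first load.
TEMPLATE_LOADERS = (
    ('django.template.loaders.cached.Loader', (
        # Assumed: the standard filesystem and app-directory loaders.
        'django.template.loaders.filesystem.Loader',
        'django.template.loaders.app_directories.Loader',
    )),
)
```

During development you probably want to leave this off, since cached templates are not reloaded when the files change.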

Salt stack - Chris Reeves

He used to use Puppet, but didn't like the DSL. It was quite slow and he wanted something better, stronger, faster.

They came across Salt. Written in Python. Very fast. It is explicit, you control everything from the master, the clients don't call home themselves.

In your configuration templates you can use jinja2 for loops and so.

See http://docs.saltstack.com/

His verdict: it is concise and clean.

Your webpages are too big

Why should you care?

  • Less developed countries.
  • Mobile users.
  • Overloaded wifi at django conferences.

What can you do?

  • gzip compression on the server.
  • django-htmlmin for html minification. It is still young and quite buggy at the moment.
  • css/js minification. Look at django-pipeline.
  • Do you need the full jquery? jquip has 90% of the functionality and 10% of the size. If you need the full version, use a CDN.
  • Bootstrap css: don't hit "download", go to "customize" and make yourself a smaller version.
  • https://github.com/samastur/image-diet to optimize your images. Works out of the box with easy-thumbnails.
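
To get a feel for what gzip compression buys you, here is a small standard-library sketch. The page body is made up for the example. In a real deployment the web server (or Django's GZipMiddleware) does the compressing, not your own code.

```python
import gzip
import io

# Hypothetical page body: repetitive markup, as HTML usually is.
html = ("<li class='item'><a href='/talk/'>Talk</a></li>\n" * 500).encode("utf-8")

# Compress into an in-memory buffer, the way a server would before sending.
buffer = io.BytesIO()
with gzip.GzipFile(fileobj=buffer, mode="wb") as gz:
    gz.write(html)
compressed = buffer.getvalue()

ratio = len(compressed) / float(len(html))
print("original: %d bytes, gzipped: %d bytes" % (len(html), len(compressed)))
```

Repetitive markup compresses very well, which is why enabling gzip is usually the cheapest win on the list.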

Being a community member - Mark Steadman

He sucks at people stuff. Small groups are OK, but bigger groups are a problem. So that's hard when trying to integrate in a community, also the django community.

He works now on bambu-tools, a huge collection of small useful tools and components. But it needs work and fixes to make it useful for everyone.

Which is, see the first paragraph, hard for him. He'll be at the sprints and he'll do his best!

Django and vagrant and PyCharm

Three kinds of magic:

  • Django is beautiful. There's magic inside, but it is beautiful magic.
  • Vagrant is non-understandable magic.
  • PyCharm was already magic in 2010 and it is even better now.

He now has something even better than magic. He has a miracle. He showed vagrant working inside PyCharm. Looks quite nice. The debugger even works when the code runs inside the virtual machine.

Jukebox - loci

What to do when different people have different music styles, for instance at a party? Time for democracy. A website running locally on your laptop allows you to log in and vote for songs. The highest-voted songs will be played first :-)

See https://github.com/lociii/jukebox

Classy class based views

Very handy when working with Django's class based views: http://ccbv.co.uk

(Note: I already used a link here in my summary of Russell's class based views talk. See http://ccbv.co.uk/projects/Django/1.5/django.views.generic.edit/UpdateView/ )

Ideas don't solve problems - Lukasz Balcerzak

His first computer came with logo, you could move the cursor with it to draw lines. Infinite possibilities, so no goal.

There are a lot of open source projects. Does a project reinvent the wheel? Does it solve a relatively simple problem? Those are two ways to rate projects.

Example one:

  • Just try reading a URL with Python. Which built-in library to use? Hard.
  • The "requests" library is a small library that solves one relatively simple problem.

Example two:

  • Django-guardian extends Django's auth and has shortcuts for basic stuff. Much simpler.
  • Django's auth itself is quite elaborate and hard.

Testing class based views - Benoît Chesneau

You can use django.test.client, but that is an integration test. All the middleware and so is used.

For unittests, you can use a request factory. You still test the system, the callable.

We can also do focused unittesting. We can mimic as_view():

view.request = request
view.args = args
view.kwargs = kwargs

With this, you can test your code much more focused. And you gain speed!
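
A minimal sketch of that idea, wrapped in a helper. To keep it runnable without Django, FakeRequest and GreetingView are hypothetical stand-ins; in a real test the request would come from Django's RequestFactory and the view would be your own class based view.

```python
def setup_view(view, request, *args, **kwargs):
    """Attach what as_view() would normally set, then return the view."""
    view.request = request
    view.args = args
    view.kwargs = kwargs
    return view


class FakeRequest(object):
    """Hypothetical stand-in for a request built by a request factory."""
    def __init__(self, user):
        self.user = user


class GreetingView(object):
    """Hypothetical stand-in for a class based view."""
    def get_context_data(self):
        return {"greeting": "Hello %s" % self.request.user,
                "page": self.kwargs.get("page", 1)}


# Call one method directly, without URL routing, middleware or dispatch.
view = setup_view(GreetingView(), FakeRequest("alice"), page=3)
context = view.get_context_data()
```

Because only one method runs, a failure points straight at that method instead of at the whole request cycle.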

Further reading: http://tech.novapost.fr/django-unit-test-your-views-en.html

Django client certificates - Deni Bertovic

Why would you use client SSL certificates? Isn't user/passwd enough?

The advantage: nginx takes care of authentication.

See https://github.com/denibertovic/django-client-certificates

Arduino - Swift

Normally you have to code in C. But now you can also do it in Python.

See https://github.com/theycallmeswift/BreakfastSerial

He demo'ed it. Very nice! Looks useful and usable and simple. Perfect.

Next conference - Remco Wendt

This is now the fifth year. We have a tradition now! High quality conferences organized for programmers by programmers. Not for profit. Great! And now the fun factor is there, too.

The fun will stay! Next year it'll be France, on the beach, in the south! (They don't know the exact city yet).

Prehistorical Python: patterns past their prime - Lennart Regebro

From Reinout van Rees on May 17, 2013 01:35 PM

Dicts

This works now:

>>> from collections import defaultdict
>>> data = defaultdict(list)
>>> data['key'].append(42)

It was added in python 2.5. Previously you'd do a manual check whether the key exists and create it if it is missing.
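
For comparison, here are the old manual check and the defaultdict version side by side. The sample pairs are invented for the example.

```python
from collections import defaultdict

pairs = [("fruit", "apple"), ("veg", "leek"), ("fruit", "pear")]

# Old pattern: check for the key by hand before appending.
manual = {}
for kind, name in pairs:
    if kind not in manual:
        manual[kind] = []
    manual[kind].append(name)

# With defaultdict the missing list is created on first access.
grouped = defaultdict(list)
for kind, name in pairs:
    grouped[kind].append(name)
```

Both end up with the same contents; the second version just has less code that can go wrong.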

Sets

Sets are very useful. Sets contain unique values. Lookups are fast. Before you'd use a dictionary:

>>> d = {}
>>> for each in list_of_things:
...     d[each] = None
>>> list_of_things = d.keys()

Now you'd use:

>>> list_of_things = set(list_of_things)

Sorting

You don't need to turn a set into a list before sorting it. This works:

>>> something = set(...)
>>> nicely_sorted = sorted(something)

Previously you'd turn the set into a list first and then call some_list.sort().

Sorting with cmp

This one is old:

>>> def compare(x, y):
...     return cmp(x.something, y.something)
>>> sorted(xxxx, cmp=compare)

New is to use a key. That gets you one call per item. The comparison function takes two items, so you get a whole lot of calls. Here's the new:

>>> def get_key(x):
...     return x.something
>>> sorted(xxxx, key=get_key)
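
The difference in number of calls can be counted with a small experiment. This is a sketch: the item list is arbitrary, and because the cmp argument no longer exists in Python 3, the comparison function is wrapped with functools.cmp_to_key so the example runs there too.

```python
from functools import cmp_to_key

calls = {"cmp": 0, "key": 0}

def compare(x, y):
    calls["cmp"] += 1
    return (x > y) - (x < y)   # same contract as the old cmp()

def get_key(x):
    calls["key"] += 1
    return x

items = [5, 3, 8, 1, 9, 2, 7, 4, 6, 0]

by_cmp = sorted(items, key=cmp_to_key(compare))
by_key = sorted(items, key=get_key)

print("comparison function calls:", calls["cmp"])
print("key function calls:", calls["key"])
```

The key function runs exactly once per item, while the comparison function runs once per comparison the sort needs, and that grows faster than the list does.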

Conditional expressions

This one is very common!

This old one is hard to debug if blank_choice evaluates to false (None or an empty list, for instance):

>>> first_choice = include_blank and blank_choice or []

There's a new syntax for conditional expressions:

>>> first_choice = blank_choice if include_blank else []
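
A small sketch of how the old idiom goes wrong. The "fallback" value is made up so that the difference is visible in the result.

```python
include_blank = True
blank_choice = []  # a falsy value: an empty list of choices

# Old idiom: "include_blank and blank_choice" yields the falsy blank_choice,
# so the expression falls through to the "or" branch.
old_style = include_blank and blank_choice or ["fallback"]

# Conditional expression: picks blank_choice whenever include_blank is true.
new_style = blank_choice if include_blank else ["fallback"]
```

Here old_style silently becomes the fallback although include_blank is true, while new_style is the empty blank_choice as intended.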

Constants and loops

Put constant calculations outside of the loop:

>>> const = 5 * a_var
>>> result = 0
>>> for each in some_iterable:
...     result += each * const

Someone suggested this as an outdated pattern. You can put it inside the loop, python will detect that and work just as fast. He tried it out and it turns out to depend a lot on the kind of calculation, so just stick with the above example.
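
If you want to check this for your own calculation, a timeit sketch like the following will do. The function names and the data are invented for the example, and the timings will differ per machine and Python version, so no numbers are given here.

```python
import timeit

def hoisted(some_iterable, a_var):
    const = 5 * a_var          # computed once, outside the loop
    result = 0
    for each in some_iterable:
        result += each * const
    return result

def inline(some_iterable, a_var):
    result = 0
    for each in some_iterable:
        result += each * (5 * a_var)   # recomputed on every pass
    return result

data = list(range(1000))
for func in (hoisted, inline):
    seconds = timeit.timeit(lambda: func(data, 7), number=200)
    print("%-8s %.4f s" % (func.__name__, seconds))
```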

String concatenation

Which of these is faster:

>>> ''.join(['some', 'string'])
>>> 'some' + 'string'

It turns out that the first one, that most of us use because it is apparently faster, is actually slower! So just use +.

Where does that join come from then? Here. This is slow:

>>> result = ''
>>> for text in make_lots_of_tests():
...     result += text

And this is fast:

>>> result = ''.join(make_lots_of_tests())

The reason is that in the first example, the result text is copied in memory over and over again.

So: use .join() only for joining lists. This also means that you effectively do what looks good. Nobody will concatenate lots of separate strings over several lines in their source code. You'd just use a list there. For just a few strings, just concatenate them.

That's the nice thing of Python: if you do what looks good, you're mostly ok.
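
You can measure the loop-versus-join difference yourself with a sketch like this one. The make_lots_of_texts generator is a hypothetical stand-in for any source of many strings. Note that CPython can sometimes optimize += on strings in place, so the gap may be smaller than you expect; measure instead of assuming.

```python
import timeit

def make_lots_of_texts(count=10000):
    # Hypothetical generator standing in for any source of many strings.
    return ("line %d\n" % i for i in range(count))

def with_plus():
    result = ''
    for text in make_lots_of_texts():
        result += text          # may copy the growing string each time
    return result

def with_join():
    return ''.join(make_lots_of_texts())   # builds the result in one go

for func in (with_plus, with_join):
    print("%-10s %.4f s" % (func.__name__, timeit.timeit(func, number=20)))
```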

Django 1.5 Cheat Sheet

From Mercurytide: Django articles on May 17, 2013 01:00 PM

At Mercurytide, we know all too well the difficulties of memorizing shortcuts when you work in different frameworks. You think you’ve mastered it… and then a new version comes along. Mercurytide’s developers have been working with Django 1.5 since its release in February 2013. Our skilled developers have created a solid, quick-start cheat sheet with an easy to reference layout.

Dynamic models in Django - Juergen Schackmann

From Reinout van Rees on May 17, 2013 12:53 PM

The classical approach in django is:

Code development
You create models.
Deployment
Tables and columns are created with syncdb.
Runtime
Models and db tables are populated.

This means that models are pretty much static. There is no way to modify them at runtime based on user interactions. You can get something working with for instance hstore in postgresql (see the postgresql talk).

His use case is for medical forms. The contents of those forms should be able to be defined inside the system. There are strict processes for installing medical software, so you cannot just release a new version with a new field. So you must get it to work at runtime.

The solution could be to use dynamic models, models created at runtime. Sometimes configuration by subject matter experts is better than code customization by developers. Also, dynamic models reduce the number of deployment cycles.

He has some criteria:

  • Performance.
  • Queryability, which means the standard django query stuff should work.
  • Django standard tool integration (admin, cache, and so).
  • Supported DB backends. If possible, support all django DB backends.
  • Complexity/maintainability.

There are a couple of possible solutions:

Entity attribute value (EAV)
Columns are stored in separate table rows. Instead of a table having attributes, a table has an attribute table with values. There are at least two apps that provide this. The performance is a problem here.
Serialized dictionary
For instance one of the Django JsonField apps. A lot of what is normally database work is now moved to the application. You'll really have to create custom logic in your app to take care of it.
Runtime schema updates
Update models at runtime with syncdb or some South functionality or Andrew's new schema migrations for Django work. There are a couple of apps that do this. He also created his own one. The best one seems to be django-mutant.
Database-specific solutions
Hstore, django-nonrel. Drawback: it doesn't work with all database backends, of course.

In the end, the runtime schema updates approach looks like the best one.
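
To illustrate the entity attribute value option from the list above (the one with the performance problem), here is a minimal sketch with sqlite3 from the standard library. It is not taken from any of the Django EAV apps; the table names, the save/load helpers and the form fields are all made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entity (id INTEGER PRIMARY KEY, kind TEXT)")
conn.execute(
    "CREATE TABLE attribute_value ("
    " entity_id INTEGER REFERENCES entity(id),"
    " attribute TEXT, value TEXT)"
)

def save(kind, **fields):
    """Store one 'form' as an entity row plus one row per field."""
    cursor = conn.execute("INSERT INTO entity (kind) VALUES (?)", (kind,))
    entity_id = cursor.lastrowid
    conn.executemany(
        "INSERT INTO attribute_value VALUES (?, ?, ?)",
        [(entity_id, name, str(value)) for name, value in fields.items()],
    )
    return entity_id

def load(entity_id):
    """Reassemble the fields of one entity into a dict."""
    rows = conn.execute(
        "SELECT attribute, value FROM attribute_value WHERE entity_id = ?",
        (entity_id,),
    )
    return dict(rows)

# A new "field" needs no schema change, only another row.
first = save("intake_form", name="Ann", weight_kg=61)
second = save("intake_form", name="Bob", weight_kg=80, allergy="pollen")
```

The flexibility is obvious, and so is the price: every value is stored as text, and loading one object means collecting several rows instead of reading one.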

For more reading: https://code.djangoproject.com/wiki/DynamicModels

Does your stuff scale? - Steven Holmes

From Reinout van Rees on May 17, 2013 11:54 AM

They grew from a two-person company to a 70-person one in two years. Central to that growth were Django and google app engine.

Scalability means both load scalability and functional scalability. You also have to deal with organizational scalability and geographical scalability if you want to grow your organization.

1: Running Django on app engine

It is easy to get confused. Is app engine real? Is it a joke? How to run your django stuff on it?

Their reasons to use it:

  • Auto-scaling. They build high-profile stuff and it needs to scale. They had a valentine day site that got a lot of attention on that day and it automatically scaled up without a change in the app. The day after it scaled down automatically, too.
  • Services and APIs.
  • No sysadmin needed.

Some caveats with app engine: it is a sandbox. You cannot do "pip install". The filesystem isn't there in the traditional sense; there is a blob storage instead. And it is lock-in, mostly; portability is an issue.

They could work around these issues and ended up with a better application as a result.

There are three ways (that they use) of running Django on app engine:

  • Django non-rel. A ported version of Django, modified for NoSQL. Github, open source. It has a familiar API to Django, so you'll feel at home. It works in production.

    A drawback is that the familiarity can be misleading. So you might do things that won't work like M2M relations. And it can feel heavy. Because of the fork/port, it might feel hacky.

  • Djappengine. A lightweight skeleton around app engine. You don't use django's models. It aims to be the best of both worlds. It also supports NDB, which is app engine's new fast data storage layer.

    Drawback: you need to learn a new database API, so you have a higher learning curve.

  • Django appengine + cloudSQL. You get a fully supported django.

    Drawback: there's more setup and it is probably not as scalable as a datastore.

Now to scalability. App engine will already do a lot for you. Some things you yourself must do:

  • Plan.
  • Cache the hell out of it.
  • Offline tasks out of the request loop.
  • Prepare load tests and do profiling.

Functional scaling provided by app engine (apart from what django provides):

  • Memcache
  • Taskqueue
  • Mapreduce
  • Search
  • Email
  • Images

And you get up to 10 testable versions per app. http://0.yourapp, http://1.yourapp (the previous version) and so on. You can do A/B testing and traffic splitting. It blew his mind when he first discovered it.

2: Scaling an organization + culture

Part of it is organizational culture:

  • Be a minimalist.

  • Remove bottlenecks and overhead. Don't get in the way.

  • Just make good things. You can try (new) things out. You have freedom.

  • Internal apps. From a pool score app to steering deployments. They also have a big wiki with lots of info in it. It works well for them.

    They also built a small Django app to handle all the incoming emailed job applications. One small app built in an afternoon on a beach in Thailand now helps them to hire better people more quickly :-)

You can work from everywhere. Plane, pub, train, at home, in an office, at a beach, whatever. The minimalism helps in scaling.

Important question: what if google shuts it down? Answer: for them, the advantages outweigh the risks. (Note: ouch, this shows what closing Reader did for google's perceived reliability... Everyone in the room was applauding the question...)

The path to continuous deployment - Òscar Vilaplana

From Reinout van Rees on May 17, 2013 11:17 AM

If you've got continuous deployment, you've got stable servers. You make big changes in small increments.

Continuous deployment forces you to do many good things:

  • Good tests.
  • Repeatable build.
  • Well-configured identical machines.
  • Automated deployment.
  • Migrations and rollbacks
  • Etc.

Lots of good things. But let's compare it with lion taming.

Originally, lions were beaten into submission, confused and kept in line with whips. Likewise you'll be beaten if you dare to touch the production machine as it might break.

Now lions are understood better. Conditioning, behavior/signal mapping, reward and trust are the methods now. We understand that deployment is hard. We have behaviour/signal mapping with code/test/green/deploy. Etc.

  • Continuous deployment: everyone is responsible. Everyone deploys. You automatically learn. Everybody uses the same environment locally for test deployments. The same as on the server.

  • Testing is core. Slow tests are killing. Fast tests. And all types of tests: unit, functional and acceptance tests. Also automatic code checkers. The light must stay green. Quality must stay high, also test quality.

  • You need a repeatable build. And it should include not just code, but also configuration and infrastructure. And... always follow the pipeline.

    Even in emergencies, follow the pipeline. Peer review, tests, and then the deployment. Don't do manual steps.

  • Rollback. You must be able to switch back to the previous version.

    You can take a canary approach. Canary in the coalmine. Show the new version to a few users. "User testing" in a sense.

    Rollbacks in databases; keep it backwards compatible. Never drop columns, for instance. (After a long time, you can remove them safely, of course).

  • Small changes. Frequent releases mean less risk: if something breaks, you know where to search.
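
The canary approach mentioned above can be sketched as a stable percentage split on the user id. This is an illustration, not something shown in the talk: the in_canary function and the choice of an md5-based bucket are assumptions.

```python
import hashlib

def in_canary(user_id, percent):
    """Return True if this user should see the new version."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100   # stable bucket 0..99 per user
    return bucket < percent

users = range(1000)
canary_users = [u for u in users if in_canary(u, 5)]
print("%d of %d users see the new version" % (len(canary_users), len(users)))
```

Because the bucket depends only on the user id, a user stays in or out of the canary group between requests, and raising the percentage only ever adds users.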

Some tips:

  • Split your stuff in components. A component is something that has a good API and that can be switched out for a different component. It can also be separately deployed.

    This helps with testing, too.

  • Rehearse releases. Get very good at them!

  • You need good infrastructure. You must manage it and test it well.

  • Keep all environments equal. Use vagrant.

  • Automate everything! And if it is not possible: document it. But know that that's something that's not quite correct yet!

Principle philosophy - Swift

From Reinout van Rees on May 17, 2013 09:23 AM

Principle philosophy: a way to discuss our rules and beliefs that govern our actions. He tells it from his personal experience.

His parents wanted to raise him as a good person. So they taught him good principles (like don't be a quitter, don't steal, etc). This is quite black/white though. We are all more gray/gray.

What about the question "how can I be a good programmer"? Programmers use logic, which sounds black/white again: write tests, don't repeat yourself. Sigh.

Talking about things like this is impossible without Immanuel Kant. He differentiates between reason and instinct. If "be happy" were our life goal, we'd just follow our instincts. So what is reason for, then, apart for doing good? Reason has to do with moral. There are three ways of looking at "doing good":

  • Duty. Good things can come from duty. Duty can also lead to non-good things, though. Hm, so this is not it.
  • Make a difference between the goal and the outcome. The outcome might be bad even though the goal could be worthy.
  • Universal lawfulness. Only do something if you know that everybody thinks it is a good idea.

Does this help with a question like "is testing good"?

Gandhi said that a man is the sum of his actions.

In a sense we are the sum of our experiences. So increase the amount of experience that you have. Either have the experiences yourself, or share them like at this conference. Everything looks different from the trenches: learn from each other.

Some lessons he learned from a little baseball league experience (where he sucked) as a kid:

  • Swing for the fences. Aim for a home run. It allows you to take great risks (because you have great goals). It motivates you.
  • Set reasonable goals, too. Incremental intermediate goals. Those intermediate goals help you progress.
  • You suck... and that's totally OK. You're not good at everything. It gives you a different perspective. And you can still give it your best. Also to that almost-unused old project that you get a bug report for.

Some take-aways:

  • Build a strong foundation of principles.
  • One size doesn't fit all
  • Learn from your experiences and share them.
  • Build a great network.
  • Ask all the right questions.

The advantages of diversity - Steve Holden

From Reinout van Rees on May 17, 2013 09:14 AM

Open source is great. It is absolutely amazing.

We live in a multi-dimensional world, though it is often presented otherwise.

Some present a simple line-based worldview. Bad-Good for instance. Where do you want to be on the line? Republican-democrat? Ruby-Python? Foreigner-native? Once you think along those lines (...) you tend to start thinking in opposites.

This is the basis for many invalid world views. Just draw a line, cluster according to your preference and you're ready. Linear concepts are not useful. The issue is polarization. In a one-dimensional world, there is no room for complexity.

What about a Venn-diagram based worldview? It allows for a bit more subtlety, but there's still a line on the outside...

The open source world has a lot to teach the rest of the world. It is focused, mostly, on outcomes and results. But it is not representative. It is not even representative of the tech industry generally. In tech, 20% are women, in open source it is more like 2%, for instance.

And... we need diversity! The biggest resource in open source is people. So you'd rather not exclude many people. The most common diversity areas, to give you an idea:

  • Ethnicity
  • Religion
  • Gender
  • Culture
  • Socio-economic background

Diversity is desirable because each individual is limited. We are all good at some things and bad at others. A group can solve a bigger range of problems. And you don't want the group to be too homogeneous.

Typical open source projects will tend to focus on the actual programming and it'll ignore technical writers, designers, training, etc. Django stands out with its documentation. But Python's documentation isn't that good. There's no real emphasis on it in the current Python team. If we don't broaden our community with different skill sets and roles, we'll fall behind. Python is poised to be the #1 language of choice, but we need to improve some things before that can happen.

We ought to involve the community more as open source projects. We should run our projects more professionally. Be more open to involve all of the community more.

We should not accept it anymore to have to read through half-finished documentation and having to fall back to reading source code. "But that takes time to rectify". Well, yes. So involve more people. Get more people with more diverse skill sets to help. Perhaps you can then focus on what you're good at.

It is up to us all. The python world does have an awesome community. But we might just be a bit too smug about how wonderful we are. We should not get complacent and we should keep aiming at increasing our diversity.

Class based views: untangling the mess - Russell Keith-Magee

From Reinout van Rees on May 17, 2013 07:57 AM

Russell is a Django core dev.

Class based views were introduced two years ago, but they weren't greeted with universal acclaim. So he's here to clear up the mess and hopefully make it all more clear for everyone.

History

  • In the beginning of Django, there were only views. Function-based views. No generic views.

  • Next, because of DRY, don't repeat yourself, several generic views were added. Listing objects, editing an object, for instance. Editing something happens so often that a generic view inside Django seemed like a good idea.

    There are some problems here, though. The configuration you can do is limited by the arguments you can give in your URL configuration. No control over the view logic. You can't pass in an alternative view. There's no re-use between views.

    You could "fix" this by adding more and more arguments and allow passing in callables and so, but in the end you're almost building what you'd already get with object oriented class inheritance... So...

  • Next: class based views. It landed in Django 1.3 after it didn't work out to get it in 1.1 or 1.2.

What went wrong?

Then the wheels fell off. What went wrong?

  • Fundamental confusion over purpose. There were two problems being solved at the same time. The two: class based views and class based generic views.

    Class based views are only a class-based variant on function-based views that handles get/post/put/delete. Class based views will give you a lot for free. Automatic OPTIONS requests handling. And naive HEAD handling. You wouldn't have that with a function based view. And you can modify it.

    Class based generic views use class based views as a base. They're re-writes of the existing function-based generic views. But a bit better and especially much more extensible.

  • Confusion over implementation choices. The reasons were good, but the reasons weren't clear.

    The whole discussion and the choices behind it can be found in the django wiki.

    The biggest question is about instantiation. What is being instantiated? How? When? At the start, once, or for every single request? How do you pass in configuration? What's the lifespan of an instance? Can you safely assign something onto self? What are the expectations?

    Note: the admin was already always class based. And it had state problems (assigning to self would leak state to other requests).

    In the end, all this was what resulted in the MyView.as_view(). as_view() results in a class factory. Otherwise they'd have to change the urls.py contract. A view is currently a callable. It would have to be changed to "a callable or a class". It was a value judgment in the end.

  • Ravioli code. It wasn't spaghetti code, but ravioli. A package with unknown contents.

    The generic class based views are made with a whole bunch of mixin classes. The edit view (UpdateView) consists of 9 (mixin) classes. See ccbv.co.uk.

    Why would you go through this 9-level madness? Yes, we have a complex class hierarchy. But the reason is that you can easily customize it.

    Ravioli tastes good! Maximum reuse of core logic. Extremely flexible. Easy to add your own functionality. But you need to learn it, that is the price you pay for the power you get. Learning means documentation, so...

  • Bad documentation. The initial documentation was bad. It is now better, but it needs to be made better still.
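
The instantiation question from the list above is easier to follow with a stripped-down sketch. This is not Django's actual implementation, only a plain-Python illustration of the idea: as_view() returns an ordinary callable that builds a fresh instance for every request, so urls.py still gets a callable and assigning to self cannot leak state. FakeRequest and HelloView are hypothetical.

```python
class View(object):
    """Simplified stand-in for a class based view; not Django's actual code."""

    http_method_names = ["get", "post", "put", "delete", "head", "options"]

    @classmethod
    def as_view(cls, **initkwargs):
        def view(request, *args, **kwargs):
            # A fresh instance per request, so state on self cannot leak.
            self = cls(**initkwargs)
            self.request = request
            self.args = args
            self.kwargs = kwargs
            return self.dispatch(request, *args, **kwargs)
        return view

    def __init__(self, **initkwargs):
        for key, value in initkwargs.items():
            setattr(self, key, value)

    def dispatch(self, request, *args, **kwargs):
        # Route to the method named after the HTTP verb, if there is one.
        name = request.method.lower()
        handler = getattr(self, name, None)
        if name not in self.http_method_names or handler is None:
            return "405 Method Not Allowed"
        return handler(request, *args, **kwargs)

    def options(self, request, *args, **kwargs):
        # OPTIONS handling for free: list the verbs this class implements.
        allowed = [m.upper() for m in self.http_method_names if hasattr(self, m)]
        return "Allow: " + ", ".join(allowed)


class FakeRequest(object):
    """Hypothetical request object with just a method attribute."""
    def __init__(self, method):
        self.method = method


class HelloView(View):
    greeting = "Hello"

    def get(self, request, *args, **kwargs):
        self.seen = getattr(self, "seen", 0) + 1   # per-instance state
        return "%s (%d)" % (self.greeting, self.seen)


hello = HelloView.as_view(greeting="Hi")   # this callable goes in urls.py
first = hello(FakeRequest("GET"))
second = hello(FakeRequest("GET"))
allowed = hello(FakeRequest("OPTIONS"))
```

Both GET calls see a counter of 1, because each request got its own instance. Configuration goes in once via as_view(), per-request state lives on self.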

The biggest thing that needs fixing after the documentation is how to handle decorators like @login_required.

But... did we solve the right problems with the generic views? Modern websites have different problems. Multiple forms. Conditional forms. Continuous scrolling instead of pagination. AJAX support. PJAX (see yesterday's ajax+django talk). Multiple "actions" per page.

Call to action

  • In discussions, always make clear whether you mean CBV or CBGV (class based views or class based generic views).

    Suggestion made later during the questions: call the latter just "generic views". The old function based generic views are gone, so...

  • Docs still can be improved.

  • #18830 FormCollection

  • Experiment with APIs. Django's admin is a useful case study. Why not do that with an API and make it easier to create your custom admin?

Get Django to play with old friends - Lynn Root

From Reinout van Rees on May 17, 2013 07:13 AM

She works for Red Hat on http://freeipa.org, on identity stuff for Linux.

Note: see her website for instructions and code examples.

Say that your pointy haired boss (or customer) asks you to make an internal web app with all the buzzwords.

So you can't use regular django auth, you'll need single sign on. Luckily since Django 1.5 you can have custom user models, so it'll fit with all your external requirements. One or two pieces of MIDDLEWARE_CLASSES and AUTHENTICATION_BACKENDS later and you play nice with the external single sign on. Django can be a team player.
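
As an idea of what those one or two settings could look like: the fragment below uses Django's built-in REMOTE_USER support. This is an assumption about the kind of setup meant here, not a copy of her instructions, and it presumes the web server has already authenticated the user.

```python
# settings.py fragment; assumes the web server has already authenticated
# the user and exposes the login name as REMOTE_USER.
MIDDLEWARE_CLASSES = (
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    # Must come after AuthenticationMiddleware.
    'django.contrib.auth.middleware.RemoteUserMiddleware',
)

AUTHENTICATION_BACKENDS = (
    'django.contrib.auth.backends.RemoteUserBackend',
)
```

With this in place Django trusts the user name handed over by the web server instead of asking for a password itself.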

Webserver? You'll probably have to use apache. So the environment can be kerberos+apache. Add mod_auth_kerb for kerberos support. Add a "keytab" (making sure it is chown'ed to apache).

There's a difference between authentication and authorization. Authentication is "just" logging in, authorization is what you're allowed to do. You'll have to connect to LDAP for that to ask which group(s) the user is a member of.

Setting up your own kerberos environment (for testing) is a pain. Unless you use a ready made vagrant box for it. Instructions are on her website.

Keynote - Daniel Greenfeld

From Reinout van Rees on May 17, 2013 07:07 AM

Django conferences have a tradition: there's an external luminary that gets to give a critical talk on Django. His talk won't be that. He's not external either: he wrote Two Scoops of Django together with Audrey (see also my review).

Being critical is sometimes easy. Just bash class based views, for instance. Bashing is easy. A rant like Zed Shaw's is fun, but he's not asked because of his rants, but because of his contributions (like books).

Similarly, Django delivers working stuff and that working stuff makes a lot of our work possible. So here are some good points about Django:

  • Django is everywhere. So many people and companies use it.

  • Django is powered by Python. Pep8, python is beautiful. And there's the import this zen of python that we use all the time to steer others and ourselves in the right direction.

  • Django's API wins. It is understandable. No weird names: templates, views, logging, sessions. Django projects also have understandable structures. If there's no views.py or models.py or templates/ directory, you know someone messed something up.

    Fat models are great. Just put your business logic all on your models. They do get big this way, however. You can make a separate module with helper functions you call in your model. Just call it with a model attribute. This way you get a reasonably small models.py and another file with very testable small functions. Win!

    The API is clear and logical. We're not fighting about architecture, we are getting things done.

  • Django views are simple callables. (Even class based views).

  • Django is awesome at deprecation. Code often just keeps working fine for multiple Django versions.

  • Django has lots of features. For instance, Django's admin is awesome. This is what you use to sell Django to others.

    Tip: don't try to use the admin with a nosql database. Just build something from scratch, that's easier than trying to get it working and especially keeping it working.

  • Django's full stack is awesome. Real projects are being done in unextended python+django. No third party packages. Not everyone goes to conferences and not everyone knows everything on pypi. You can get a lot done with just Django (though external packages help a lot!).

    Django is also python. There are over 30000 packages! Even if only 20%

  • Documentation. Django set the bar for others.

    Truth be told, some other projects have better documentation now. Django set the bar and others followed. They're playing by our rules :-)

  • Django is humble. We have a tradition of invited critical talks. They shape the community, they shape the core committers.

    Here are two critical talk summaries I have:

    Good criticism is good. They got a lot of criticism on their book. That was hard, but the book is much better for it.

  • The django community is generous. There's almost an unwritten rule: "the more you help people in the Django/Python community, the more the community helps you".

    They received a lot of help with their book. They also helped others with free books in case they couldn't pay for it. But that required an email to ask for it. Note that Daniel and Audrey did ask for something in return: either buy the book later, donate to charity or help someone.

    It worked! Someone gave a free guitar course to someone else. Someone bought a homeless person a dinner. People did projects for schools/churches/whatever. Contributions to open source software.

    It worked! People did good work! Lots of small local positive actions. All over the world.

Call to action: be awesome. Make the world a better place.

May 16, 2013

Djangocon lightning talks day 2

From Reinout van Rees on May 16, 2013 03:49 PM

Sorry if I mangled any of the names, that's the hardest part of blogging lightning talks. Many don't show their name long enough :-)

Single page web apps with django and extjs - Michał Karzyński

Single page apps: you're writing two apps. A front end one and a back end. The routing is done on the client side. The back end just spits out data (JSON api).

ExtJS has a store that handles communication with the backend. So that talks to your JSON API. Plan that API carefully, try to keep it nicely RESTful.

He showed a one minute demo. There is a longer one on his blog.

Don't trust, check - Marcin Mincer & Tomek Kopczuk

Check and question everything. Seek the best way. Not all good solutions are as good as they seem. They compared a standard view with a tastypie view and the regular view was much faster.

They also checked, for their example, whether using jinja2 would be faster than django templates. Yes, it is faster. Despite what the two scoops book says.

So: check everything for your usecase.

Lessons learned - Tom Christie

Tom maintains the Django rest framework project. He tells us a few lessons he learned.

Be negative
Everything someone submits to a project increases the maintenance burden for the maintainer. So suggest things that can be removed. Before submitting a bug, first fix an easy one.
It is your fault
You haven't yet stepped in and contributed what you want.
Forget about DRY
Simplicity is a design goal, DRY only follows from it
Link everything
Don't make me search, just provide a direct link
A deprecation policy makes change easier
Figuring out a formal deprecation policy actually makes making changes easier.
There's no such thing as a core dev
All of us have what we need to know to be a core developer. They only have the extra commit bit to actually commit the change, but almost everyone can do the work.

Community and learning - Karol Majta

Karol is a mechanical engineer that has some python knowledge. He's new to Django and provides some community feedback from that background.

If you're new to Django, you're new. You need experience. You need to learn. But because you don't know much, it is easy to learn more! And... the community is great at helping you learn. Positive feedback!

Two phase commit - Grzegorz Nosek

Two phase commit is a quite unknown database feature. Everyone knows database transactions.

In SQL it is something like PREPARE TRANSACTION 'foo' once a bigger set of changes is done (instead of a regular commit), and COMMIT PREPARED 'foo' afterwards.

This two phase commit is not only for databases, you could also use it in regular python code, for instance when creating files.
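For the file example, a minimal sketch of how that could look in plain Python. The class and method names are made up for illustration, they are not from the talk: phase one writes everything to temporary files, phase two makes the files visible under their real names.

```python
import os
import tempfile


class TwoPhaseFileWriter:
    """Hypothetical helper: stage files first, publish them only on commit."""

    def __init__(self):
        self._staged = []  # list of (temporary path, final path) pairs

    def prepare(self, final_path, content):
        # Phase 1: write to a temporary file next to the final location.
        directory = os.path.dirname(os.path.abspath(final_path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as handle:
            handle.write(content)
        self._staged.append((tmp_path, final_path))

    def commit(self):
        # Phase 2: all writes succeeded, so move every file into place.
        for tmp_path, final_path in self._staged:
            os.replace(tmp_path, final_path)
        self._staged = []

    def rollback(self):
        # Something went wrong: throw the staged files away.
        for tmp_path, _ in self._staged:
            os.remove(tmp_path)
        self._staged = []
```

You call prepare() for every file, and only call commit() once every other participant (the database, for instance) has said it is ready too. Otherwise you call rollback().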

Django-downloadview - Benoît Bryon

You manage files with django, for instance for authentication, permissions. FileFields or ImageFields, but it can be also local files, remote URLs or generated files.

django-downloadview provides class based views for almost every usecase. You can also extend and modify those views for your own use cases.

Django is not efficient for streaming files, so you need x-sendfile (apache, lighttpd) or x-accel (nginx). There's a middleware for that!

(Personal note: investigate that one; looks very useful).
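To make the x-sendfile/x-accel idea concrete: instead of streaming the file itself, the application answers with an (almost) empty response plus a header that tells the web server which file to send. This is only a sketch of the principle, not django-downloadview's code. The function name and the server switch are made up.

```python
def offload_headers(file_path, server):
    """Return the response header that hands the file transfer to the web server.

    `server` is a hypothetical switch; the header names are the ones the
    web servers themselves look for.
    """
    if server == "nginx":
        # nginx expects the URI of an internal location, not a filesystem path.
        return {"X-Accel-Redirect": file_path}
    if server in ("apache", "lighttpd"):
        # Apache needs mod_xsendfile for this; lighttpd has to allow it in its config.
        return {"X-Sendfile": file_path}
    raise ValueError("unknown server: %s" % server)
```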

PHP-like django - Markus Tomqvist

DHP is PHP in django :-) You've got a {% code %} template tag that you can write your django code in. Totally dirty and he's never going to finish the project.

It works by calling eval() on the extracted code and by copy/pasting some django wsgi stuff.

Everyone was laughing.

Django pony checkup - Erik Romijn

Last year he spoke about making secure websites.

Lots of those things are remote-checkable. So he wrote the django pony checkup website which you can pass the URL of a site.

He actually ran it on 3707 django websites. The score is not good.

  • 7% run in debug mode.
  • 97% have no clickjacking protection.
  • 83% have no HTTPOnly session cookie.

Run http://ponycheckup.com/ on your sites!
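The three items in the list map onto regular Django settings. A settings sketch that would address them (the notes don't list the exact checks the site performs, so treat this as a starting point):

```python
# settings.py (fragment) for a production site.
DEBUG = False  # never run a public site in debug mode

# Keep javascript away from the session cookie.
SESSION_COOKIE_HTTPONLY = True

MIDDLEWARE_CLASSES = (
    "django.contrib.sessions.middleware.SessionMiddleware",
    "django.middleware.common.CommonMiddleware",
    "django.middleware.csrf.CsrfViewMiddleware",
    # Adds the X-Frame-Options header as clickjacking protection.
    "django.middleware.clickjacking.XFrameOptionsMiddleware",
)

# Value used by XFrameOptionsMiddleware; "SAMEORIGIN" is the milder choice.
X_FRAME_OPTIONS = "DENY"
```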

What is new in Django CMS - Benjamin Wohlwend

The new release isn't out yet; he already shows what's new.

The 2011 version had front-end-editing, but it has some problems which they aim to fix in the new version.

The 3.0 goals:

  • Make it beautiful.
  • Keep the end user out of the /admin.
  • We don't want to interfere with your markup.
  • Front end editing suitable for experimenting and playing with it. It should be safe.
  • It should be fun.

He showed a demo. Sure looks polished and nice.

L20n, localization 2.0

There's content localization (which l20n does not do); l20n does UI localization.

Gettext is the one used now. It is English-centric. It has limited plural-handling, for instance. All those countries have different rules. English just has single/plural, many languages have 1, 2 3/4, 5+, for instance.
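To illustrate the plural problem, here is a small sketch that compares the English rule with the Polish one. Polish is my own pick as an example (the conference was in Warsaw) and the function names are made up. The Polish rule is the usual one from gettext plural forms.

```python
def english_plural_index(n):
    # English: one form for exactly 1, another form for everything else.
    return 0 if n == 1 else 1


def polish_plural_index(n):
    # Polish needs three forms: 1, "a few" (2-4, 22-24, ...) and "many".
    if n == 1:
        return 0
    if n % 10 in (2, 3, 4) and n % 100 not in (12, 13, 14):
        return 1
    return 2


FILE_FORMS_PL = ("plik", "pliki", "plików")  # "file" in its three plural forms


def count_files_pl(n):
    return "%d %s" % (n, FILE_FORMS_PL[polish_plural_index(n)])
```

So 22 gets the same form as 2, but 12 gets the same form as 5. A simple singular/plural switch cannot express that.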

There's a lot that needs fixing. L20N attempts to fix it.

He showed a couple of examples. Wow it sure is work for many languages. But it seemed to work quite well.

See http://l20n.org/

django-mail-factory - Rémy Hubscher

Mails with django: you need html and plain text emails, attachments, etc. Then you need to check whether the mails are coming out OK. So you mail it and someone has to look at it.

What we want? Preview html and plain text emails, possibly send one as a test. And good warnings about missing context variables.

You can register all your different emails at django-mail-factory and define the context variables that they need.

See https://pypi.python.org/pypi/django-mail-factory

Spreading Django - Markus Zapke-Gründemann

How can more people learn Django? You could give tutorials, give talks, hold workshops. For this you need someone to do it.

He prepared http://django-introduction.keimlink.de, which gives a nice introduction. This needs an English translation.

http://django-marcador.keimlink.de provides a tutorial. This one is already multilingual.

He wants to use transifex for the translations later on.

Translating models - Jef Geskens

Jef makes websites in Belgium. That is a problem. They need at least 3 versions: French, English and Dutch. So that also means translating text inside models. Some of those models are their own models, but a lot is in existing external apps and he doesn't want to modify them.

He started django-datatrans that can handle all that without changing anything in the underlying models.

Python deployer - Jonathan Slenders

They originally used Fabric, but they missed some features. So they started python-deployer.

He showed a demo.

Setting up your django project in 60 seconds

You want to get up and running as quick as possible. There are lots of things, though, that you need to do every time.

Take the time to put those initial steps in some sort of ready-made skeleton for new projects. Initial settings, especially if you split them up in separate files. Initial empty south migration. Fabfile, makefile, things like that.

The web of stuff - Zack Voase

From Reinout van Rees on May 16, 2013 03:47 PM

A plane flew over (noisily) at the start of his presentation. He put our work in perspective by saying that that was an 80 ton plane and that we're just building websites :-)

Possibilities

Computers used to take up whole rooms, now you have a smartphone. Big data is really big data now. Moore's law works both ways, though, so you have really small computers now. An arduino for instance.

He often makes comparison to the human body. All over our body, sensors give off signals that go into the central nervous system. The brain processes it and gives signals back to muscles if necessary. Sensing, feedback, understanding, reaction.

Stuff can talk to the cloud. Like a sensor in your body talks to your mind, stuff can treat the cloud as a brain. The cloud is what allows small tools to be smart.

Stuff does often need a human to interact with it. Like a smartphone. There's all sorts of people thinking about how to "liberate the computers from their human overlords". Why cannot computers sense and act on their own account?

So how do you bridge the gap between sensing and acting of stuff? How do you use Django for it? There's a lot available online about sensing and about acting, but not the communication in between.

The communication medium itself is a bit of a problem. You don't want to have a telephone data contract for every single small piece of stuff. A physical connection isn't always handy either.

His preferred communication medium is Twilio for sending SMSs. The stuff has low memory, so the message length limit is fine.

He showed a demo with a card reader that read his London transport card and sent an SMS to his Django site. The card reader was a combination of an arduino, a 'shield' sms sender and an RFID reader. The django app then submits it to foursquare. (The last part didn't work, probably due to a local foursquare problem, but the django app did have all the data he sent from his card reader). Nice.

SUCCESS: after the lightning talks he did it again and now it worked!

Personal development

He had never done any hardware work until four months ago. No compiling for arduino. It sounded a bit scary to him.

It is normal: if you start as a beginner, you slowly get better if you keep at something. Then you automatically learn more and thus learn that there's a lot you don't know. That's the dip in the middle. Those are the people we need to keep on board so that they push through to the expert stage.

When you're in the middle, you know how bad you are (or how good you aren't yet). That's the risky phase where people quit.

Likewise documentation. Tutorials are useful for beginners. Reference material is useful for experts. There's not a lot in the middle and you're bound to be a bit frustrated in that stage.

So if you're going to start experimenting with electronics, you're bound to hit a wall, for instance when calculating complex electronic schemas. Push on anyway: the first time you make a phone call with your own device is totally worth it.

Two books he recommends to get you started:

  • Getting started with Arduino.
  • The art of electronics.

Apps for advanced plans, pricings, billings and payments - Krzysztof Dorosz

From Reinout van Rees on May 16, 2013 02:21 PM

He runs multiple sites with a common business model: accounts with plan subscriptions. So there's an obvious need for a generic account billing application.

The app should not be too specific, as that limits your business flexibility. Also it should not be too generic: you'll end up with an architecture from hell that way. And there's the billing as such: you need to pay close attention to security and so. Hard problem!

What he's making is django-plans for keeping track of the billing data, the plans, etc. And django-getpaid as payment processing app.

django-getpaid

Some challenges for the actual payment integration:

  • He wants it to be generic and lightweight. He doesn't want to pull in half of pypi for a payment processing app.
  • He wants a single API so that he can switch payment brokers if needed.
  • He wants it to be asynchronous. Synchronous processing blocks too long.
  • Multiple currency support.

None of the existing apps were good enough, so he made django-getpaid. It is stable and supports a lot of (Polish) payment systems and is pluggable if you need to add another one.

Pluggability is achieved with special backends you can enable in your Django settings. This way you can easily add more. Each backend can read its configuration from the settings, too (it looks a bit like the database settings).

Django-getpaid works through signals and listeners. You configure the listeners to accept the models that represent an order and to extract the necessary information from them. Yes, that means that it is quite flexible. They are your models and you get to specify how to extract information. So getpaid doesn't make many assumptions.

There are template tags for rendering the forms that are needed. Easy to integrate. There are some assumptions django-getpaid makes of the backends. There should be a specially-named PaymentProcessor class, for instance.
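The pluggable-backends idea can be sketched like this. Note that this is not django-getpaid's actual code: the setting names, the demo_broker module and the describe() method are all invented for the example. Only the "every backend has a PaymentProcessor class" convention comes from the talk.

```python
import importlib
import sys
import types


# A stand-in backend, built in place so the sketch runs on its own.
# In a real project this would be a normal importable module.
class PaymentProcessor:
    def __init__(self, config):
        self.config = config

    def describe(self, amount, currency):
        return "%s %s via merchant %s" % (amount, currency, self.config["merchant_id"])


_module = types.ModuleType("demo_broker")
_module.PaymentProcessor = PaymentProcessor
sys.modules["demo_broker"] = _module

# Hypothetical settings, shaped a bit like Django's database settings.
PAYMENT_BACKENDS = ("demo_broker",)
PAYMENT_BACKENDS_SETTINGS = {"demo_broker": {"merchant_id": "12345"}}


def get_processor(backend_path):
    """Import an enabled backend and instantiate its PaymentProcessor class."""
    if backend_path not in PAYMENT_BACKENDS:
        raise ValueError("backend %r is not enabled" % backend_path)
    module = importlib.import_module(backend_path)
    return module.PaymentProcessor(PAYMENT_BACKENDS_SETTINGS.get(backend_path, {}))
```

Switching payment brokers then means changing the settings, not the code that calls get_processor().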

django-plans

Core concept is a pricing table. Items you can buy in the rows, kinds of customers in the columns, plans in the cells. Plans can be marked as unavailable; there's a quota system; you can price them; periods; etc.

A tricky thing: switching plans! There are so many things that can happen. Does the customer switch from cheap to more expensive? Or the other way around? Is his current period expired or is it halfway? And so on. So... it should be pluggable.

What also needs to be pluggable: taxation policy. There are lots of differences per country.

See https://github.com/cypreess/django-plans

Taming ajax and django - Marc Egli and Jérémie Blaser

From Reinout van Rees on May 16, 2013 01:43 PM

Jérémie is a frontend developer and Marc does the backend.

Address/state handling and content rendering are the two main challenges.

Address and state handling

Problems:

  • Browser history. If you don't watch out, the back button won't be working.
  • Deeplinking should stay possible.
  • Crawler visibility: you want them to grab your entire site. But they don't use javascript. So you need a special URL for them.

Some solutions:

  • A hash like http://yoursite.com/#/some/id. Javascript will need to handle everything behind the hash.

    Problem: without javascript it isn't visible. You're invisible to crawlers. It is easy to implement, though.

  • A hashbang like http://yoursite.com/#!/some/id. The difference? Google and others replace the URL with http://yoursite.com/?_escaped_fragment_=/some/id. You'll have to configure your website to support it. Deeplinks work this way and crawlers can access the site via links in a search engine sitemap.

    It works with almost all browsers. And it covers all three mentioned problems. You have multiple URLs, however. And you'll need to maintain legacy URLs.

    In django you could implement it with some middleware that detects the _escaped_fragment_ GET parameter.

  • Pushstate. The URL is a regular URL like http://yoursite.com/some/id. The best example is the github website.

    Pro: easier to implement on the backend, good URLs, everything crawlable and deeplinkable. It degrades gracefully.

    Drawbacks: no wide support. Even IE9 doesn't support it. 62% of the now-used-browser-clients support it. But... it does work, just slower, as you need to grab a whole new page. Another drawback: it is more work for the frontend developer.
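For the hashbang solution, the detection part of such a middleware boils down to looking at one GET parameter. A sketch in plain Python so that it runs without Django (the function name is made up; inside a middleware you would look at request.GET instead of parsing the URL yourself):

```python
from urllib.parse import parse_qs, urlsplit


def crawler_path(url):
    """Return the path a crawler asked for via _escaped_fragment_, or None."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if "_escaped_fragment_" not in query:
        return None  # regular visitor, nothing special to do
    return query["_escaped_fragment_"][0]
```

When the function returns a path, the middleware can render the matching page as plain html for the crawler.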

Their approach

They do it with pjax: Push state ajax. A pjax link fetches the whole new page source over ajax and extracts a specified div and the title and modifies the browser history. You improve the speed this way by not needing to re-render the entire page, only one part is updated.

There are some existing implementations, like django-jax, django-easy-pjax and django-ajax-blocks, but they all had problems. So they made their own solution:

  • Django: template inheritance and filters and middleware.
  • Backbone.js

They have two base templates: one for the regular layout and one for the pjax responses. They build a template filter "pjax" that returns whether it is a pjax request or not and modifies the name of the template that's extended. That way you get a mostly empty page for pjax and the full one for regular requests.
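The core of that filter is a small decision: which base template to extend. A sketch, assuming the pjax request can be recognized by an X-PJAX request header (that is what the jquery-pjax library sends; the notes don't say how their own implementation marks the request). The template names are made up.

```python
def base_template(request_meta, full="base.html", partial="base_pjax.html"):
    """Pick the template to extend, based on Django-style request.META.

    Assumption: a pjax request carries an X-PJAX header, which shows up
    as HTTP_X_PJAX in request.META.
    """
    is_pjax = request_meta.get("HTTP_X_PJAX") is not None
    return partial if is_pjax else full
```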

Backbone handles the pjax handling, requesting the new page and replacing divs and so. And it keeps track of the browser history.

Some pitfalls: caching and redirection.

  • You use the same URL for your regular and pjax response. So caching can trip you up. Setting a Vary header helps, but not in all browsers. So they're now using a special URL and modify it back to the original URL in middleware.
  • Redirections happen transparently for ajax requests. You don't have a chance to intercept them. To work around it, they return json for pjax requests with the redirect info in there.
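The redirect workaround could look something like this. It is a sketch with invented names, working on a plain (status, headers) pair instead of a real Django response, and the "redirect" key in the json is my own choice.

```python
import json


def pjax_safe_response(status, headers, is_pjax):
    """Turn a redirect into a JSON body for pjax requests; pass others through."""
    if is_pjax and status in (301, 302) and "Location" in headers:
        # The browser would follow a real redirect silently, so send the
        # target as data and let the javascript decide what to do with it.
        body = json.dumps({"redirect": headers["Location"]})
        return 200, {"Content-Type": "application/json"}, body
    return status, headers, None
```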

Content rendering

Client side templates can make your site faster. It would be nice to use the same template on the server and client side. They're using https://github.com/chrisdickinson/plate, which aims to be mostly compatible with Django's template language.

Growing open source seeds - Kenneth Reitz

From Reinout van Rees on May 16, 2013 12:02 PM

He shows us three kinds of (more or less) open source projects.

Type 1: public source

Once upon a time there was an "open source project" called the facebook SDK. Basically it just stopped working one day and nobody could help, despite offers for help on the issue tracker. Hacker news got wind of it and it was on the front page for a while. Facebook's reaction? Disabling the issue tracker... (Later on they fixed it).

That's not open source, that's public source. Often it is abandoned due to lack of interest, change of focus or so. The motivation for having it as open source simply is not clear.

Type 2: shared investment

A different example: gittip. They aim to be the world's first open company. There's a github issue for everything, even the company name. Major decisions are voted for on github. The code is open source, of course. All interviews with journalists are filmed and live-streamed. And all otherwise-often-backdoor-cooperation-agreements are fully open.

Projects like gittip are shared investment projects. Shared ownership, extreme transparency. There is very little questioning of motivations. The motivation is clear and public. There's a documented process for new contributors. The advantage? It is low risk. There's a high bus factor.

Type 3: dictatorship project

Kenneth is the author of requests. An open source project, very successful. But all the decisions are made by Kenneth.

That's really more of a dictatorship project. A totalitarian BDFL that owns everything. The dictator is responsible for all decisions. Requests' values lie in its extreme opinions. If he involved more people, the value would be diluted. There are drawbacks. A low bus factor. High risk of burnout: Kenneth is the single point of failure.

Lessons learned

  • Be cordial or be on your way. As a user, you need to keep all your interactions with the maintainer as respectful as possible. The maintainer put a lot of work in it and they don't owe you any of their time.

    As a maintainer, you also must be cordial. Be thankful for all contributions. Feedback is the lifeblood of your project, even the negative. You'll need to ignore non-constructive comments. Be careful with the words you choose, sometimes contributors take what you say VERY personally. You might have to educate your users. And: a bit of kindness goes a long way.

  • Sustainability is almost the biggest challenge. Don't burn out. Try to get others to help.

    He quotes Wes Beary: "open source provides a unique opportunity for the trifecta of purpose, mastery and autonomy". Pay equal attention to all of these three. Learn to do less, focus more on your purpose, for instance.

  • Learn to say no. People ask for crazy features. Or they submit quite sane pull requests that, if you allow them all in, make your project slow and unfocused. Kenneth wants as few lines of code as possible in his project. Negative diffs are the best diffs!

  • Open source makes the world a better place. Don't make it complicated!

Advanced Python through Django: metaclasses - Peter Inglesby

From Reinout van Rees on May 16, 2013 11:33 AM

Metaclasses are a handy feature of Python and Django makes good use of them.

When you create certain kinds of classes in Django, a metaclass will do something to the class before it is created. For forms, the various attributes of the class are converted into a base_fields dictionary on the class.

Similarly, a subclass of Model also fires up a metaclass that does some registering. A foreignkey to another model adds a relation back on that other model, for instance.

As a recap, a class is something that can be instantiated into an object. It can have an __init__() method that does something upon instantiation. type(your_instance) will return the class.

Did you know that you can create classes dynamically? See for yourself:

>>> name = 'ExampleClass'
>>> bases = (object,)
>>> attrs = {'__init__': lambda self: print('Hello from __init__')}
>>> ExampleClass = type(name, bases, attrs)
>>> example = ExampleClass()
Hello from __init__
>>> type(example)
<class '__console__.ExampleClass'>

So... we can actually control how classes are created! You could write a create_class() function that calls type but that modifies, for instance, the name. Or we could take all the attributes and add them to a base_fields dictionary on the class. Hey, that's what we saw in the first Django form example!

Now, what is type exactly? It is a class that creates classes.

This also means we can subclass it! The most useful thing to override in our subclass is the __new__() method. A regular class's __init__() sets up new instances of that class; a metaclass's __new__() creates the classes themselves. So again we can modify the name and/or the attributes.

How do you use it in practice? Normally you'd set a __metaclass__ attribute on a class. This tells python to use that metaclass for creating the class. The same for subclasses. This is how our Django form classes use the metaclass specified in Django's base Form class.

Django uses metaclasses in five places: admin media, models, forms, formfields, form widgets. Grep for metaclass in your local django source code once to get a better feel for how Django uses it.

Note on python 3: it uses a slightly different syntax for specifying metaclasses. So Django 1.5 uses six to support both ways in a single codebase.
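Putting the pieces together, here is a minimal sketch of a form-like metaclass in Python 3 syntax. It only imitates the idea behind Django's forms, it is not Django's implementation, and Field, FormMeta, Form and ContactForm are stand-ins written for this example.

```python
class Field:
    """Stand-in for a form field."""

    def __init__(self, label):
        self.label = label


class FormMeta(type):
    def __new__(mcs, name, bases, attrs):
        # Pull the Field attributes out of the class body...
        fields = {key: value for key, value in attrs.items()
                  if isinstance(value, Field)}
        for key in fields:
            del attrs[key]
        cls = super().__new__(mcs, name, bases, attrs)
        # ...and collect them, together with inherited ones, in base_fields.
        base_fields = {}
        for base in reversed(cls.__mro__[1:]):
            base_fields.update(getattr(base, "base_fields", {}))
        base_fields.update(fields)
        cls.base_fields = base_fields
        return cls


class Form(metaclass=FormMeta):
    pass


class ContactForm(Form):
    name = Field("Your name")
    email = Field("Your email address")
```

ContactForm never mentions the metaclass. It gets it via Form, and ends up with a base_fields dictionary instead of separate name and email attributes.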

Warning: don't overuse metaclasses. They can make code difficult to debug and follow. Use Django as a good example of how to use metaclasses. Django saves you a lot of work by using metaclasses in a few locations.

See https://github.com/inglesp/Metaclasses

Nice way of giving a presentation, btw. Some sort of semi-interactive python prompt. The software is online at https://github.com/inglesp/prescons

Bleed for speed - Rob Spectre

From Reinout van Rees on May 16, 2013 11:02 AM

He started with a little history lesson: the Battle of Mobile Bay. The admiral (Farragut) ordered the ships straight through the minefield (mines were called "torpedoes" at the time). "Damn the torpedoes, full speed ahead". And it worked.

What does this have to do with Django? Well, "damn the torpedoes, full speed ahead" feels a bit like how rapid prototyping feels afterwards. He's often involved with hackathons. Lot of quick coding in limited time with a lot of people. He learned a lot about his tools that way (and he often used Django).

There's a time to make a distinction between production and prototype. Sometimes it is better to just try something with a prototype. Throw-away code.

Aaargh! Throw-away code?!? We never throw code away. But it is something we must learn. It is good to let go once in a while. Let your code go. It isn't yourself, it is just some code.

The danger is that prototype code is put into action as production code. With some work, this danger can be prevented.

What about Django? Django is the best for prototyping. For rapid prototyping, Django is better than micro-frameworks like Flask that might seem better at first glance. Here are some reasons:

  • Django was built for rapid prototyping. It originated at a newspaper! 24 hours to build something.

  • It is flexible. It was built to bend. He can prepare something for the other people programming with him and get them going and still keep the code in reasonable shape.

  • Us. The strength of Django is the community that supports it. Stack overflow questions and answers. The django websites. Books like two scoops of Django (see my review). That's not something you have with many other frameworks!

    Tip: read especially chapter 2 and 3 of the two scoops book.

    One thing he'd add to the book is stuff like fabfiles and makefiles. Handy for rapid prototyping.

Use stuff that's available. For instance Django's staticfiles app for grabbing together all the css/js/png. Whether it is in one directory or split out over multiple apps. It also helps with production.

Also look at brunch for setting up your javascript app's structure. It works well with Django.

Deployment: you need to show your prototype. Heroku is very quick for prototypes. (He mentioned that they have a data center in Europe now in case you need it).

Deployment: use chef. Lots of recipes. You could also use Salt if you're more into Python. Also lots of stuff available. Both take a while to learn, but it is a very good investment.

Configuration management is an extremely useful skill. Do it well.

Tastypie is the quickest way to get a REST api out of your Django. It is the best for rapid prototyping. Another good one is django-rest-framework. It will take a little bit longer to set up, but once done you're working with actual Django views. And django-rest-framework's browseable API is very helpful when you're working with a couple of others.

Social auth connectors: everyone makes one and there are way too many half-working ones. He's got two that he can recommend. django-social-auth is very complete. The other is django-allauth for when speed is important for you.

If you don't want to play fair to others during a hackathon: use celery. It is very unfair to use celery, python and Django. The combination with Django is pretty OK to set up. You can do a lot that others cannot do easily. So use it for rapid prototyping. (Regarding setting it up: there are good chef recipes for it).

TEST. Yes, even during a hackathon. He doesn't advocate full test driven development. It is a balance. But the errors that kill you during a hackathon are the errors you make twice. So, for instance, test that all your views simply return a 200 Ok. This already helps prevent a lot of problems.

Look at AngularJS. Even if you don't use the framework itself. Why? It has a great javascript test runner. Good for testing while rapid prototyping.

Getting past Django ORM limitations with Postgres - Craig Kerstiens

From Reinout van Rees on May 16, 2013 08:43 AM

Tip: subscribe to the postgresql weekly newsletter that Craig makes.

Why postgres? A colleague described it as "it is the emacs of databases". There's just so much available inside postgres.

The problem is Django: it treats all databases the same. It doesn't prefer one over the other. It doesn't give special treatment. Look at all the types that Christophe mentioned yesterday: Django only supports a few of 'em. Likewise indexes.

For instance postgresql's Array type. Django doesn't support it, but it'd be perfect for for instance a list of tags on a model. For many of these types, also for the Array type, you have django apps that add support for them.

Great: hstore. NoSQL in your SQL. A key/value store in your SQL. They use it inside Heroku a lot: it scales fine and works fine. To use it in Django, use django-hstore. Add a data field as hstore to a model and suddenly you can do my_object.data = {'key': 'value', ...}!

Queuing: most people use celery. Postgres is a great queue. There's a celery backend called trunk for it.

Postgresql has great text search. You do need to do some setup in your models, but then it works fine. You'll have to read the docs, though.

Indexes. Many types. So it can be a bit of a mystery which one to use. Btree is the normal one. Generalized Inverted Index (GIN) is for multiple values in one column. Good for array/hstore types. Generalized Search Tree (GiST) is for full text search, shapes, postgis.

Geospatial: just use geodjango. It uses postgresql/postgis's great geospatial stuff.

Tip: look at django-db-tools, for instance for its read-only-middleware that makes your site read-only (for maintenance, for instance).

Django 1.6 has persistent connections, but the current 1.5 doesn't. It can shave a whole lot off the rendering time of your pages if you have some sort of connection pooler! If you want the 1.6 functionality now, you can use for instance django-postgrespool. This really saves a lot of time.

Summary: postgresql is great, Django's ORM is pretty good. And you can extend it.

(In response to a question: never put session data in your database, it is a good way to kill the database.)

Here's the link to his presentation.

Fractal architectures - Laurens van Houtven

From Reinout van Rees on May 16, 2013 07:53 AM

He works on twisted. And twisted people tend to talk about subjects that are almost antithetical to how Django does things. The thing that he does differently from Django is that he's not using a single data source...

Once a database gets really really too big, putting multiple database servers next to each other doesn't really work. You slowly start to get into expensive Oracle territory.

How he set it up now is what he calls a fractal architecture. The whole accepts requests. The parts of the whole accept requests. The parts of the parts accept requests. That's why he calls it fractal. You could also call it sharded, but that has a bad name: it is something you do when nothing else works.

The way he looks at the architecture is SMTP. Email. Simple.

He prefers SQLite. Simple and included in the python standard library. Sure, you can use postgres but you'll need a VM to re-create the same environment locally as on your production machine. SQLite is the same everywhere.

In fact, he uses Axiom: an object store on top of SQLite. (Note: he is trying to write documentation for it at https://github.com/lvh/axiombook).

Another advantage of sqlite: it is easy to scale down. There's not much lower you can go than import sqlite3! If you want to use postgres, remember you must install it on each and every part :-)
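To show how little is needed to "scale down", here is a minimal sketch of a local store using only the standard library. The table and the helper names (open_local_store, add_message) are made up for illustration; they are not from the talk or from Axiom.

```python
import sqlite3


def open_local_store(path=":memory:"):
    """Open (or create) a local SQLite store; ':memory:' keeps it in RAM."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages (id INTEGER PRIMARY KEY, body TEXT)"
    )
    return conn


def add_message(conn, body):
    """Insert one row and return its id."""
    with conn:  # commits on success, rolls back on an exception
        cursor = conn.execute("INSERT INTO messages (body) VALUES (?)", (body,))
    return cursor.lastrowid


conn = open_local_store()
first_id = add_message(conn, "hello")
rows = conn.execute("SELECT id, body FROM messages").fetchall()
```

Passing a file path instead of ":memory:" gives you the same store on disk, with no server to install.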

Important: almost nothing is as fast as a local sqlite store, especially when it is reasonably small and fits mostly in RAM. Just look at the regular comparisons of access time for L1 cache, L2 cache, RAM, SSD, LAN, spinning rust, internet and so. So if you have a local database on an SSD with quite some RAM, it'll blow a network connection to some remote database out of the water.

But... some things don't fit locally. You have to search everywhere, for instance. There are three basic solutions:

Duplication
You could duplicate the data over all the parts, but that doesn't work if the data is big.
Sharding
Sharding will only work reasonably if the data itself, by nature, is sharded. Sales data per region, for instance.
Separation
Separate data for separate calculations in separate (local) stores. This is what he uses.
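The sharding option with its "sales data per region" example could be sketched roughly like this. It is only an illustration under assumptions: the RegionShards class, the table layout and the in-memory stores are invented here and are not how the speaker's system works.

```python
import sqlite3


class RegionShards:
    """Keep one SQLite store per region and route each sale to its own shard."""

    def __init__(self, regions):
        # Assumption: in-memory stores stand in for one database file per part.
        self.stores = {}
        for region in regions:
            conn = sqlite3.connect(":memory:")
            conn.execute("CREATE TABLE sales (item TEXT, amount REAL)")
            self.stores[region] = conn

    def add_sale(self, region, item, amount):
        conn = self.stores[region]  # a KeyError means: no shard for this region
        with conn:
            conn.execute("INSERT INTO sales VALUES (?, ?)", (item, amount))

    def total_for(self, region):
        """A per-region question touches exactly one shard."""
        row = self.stores[region].execute(
            "SELECT COALESCE(SUM(amount), 0) FROM sales"
        ).fetchone()
        return row[0]

    def total_everywhere(self):
        """A global question has to visit every shard: the expensive case."""
        return sum(self.total_for(region) for region in self.stores)


shards = RegionShards(["eu", "us"])
shards.add_sale("eu", "book", 10.0)
shards.add_sale("eu", "pen", 2.5)
shards.add_sale("us", "book", 12.0)
```

The last method shows the "you have to search everywhere" problem: as soon as a question spans regions, every shard gets a query.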

He mentioned paxos and raft (pdf), but I don't remember what for.

Play nice with others - Honza Král

From Reinout van Rees on May 16, 2013 07:19 AM

Many people think that reusable apps don't work: there's always something you need to change or modify. Honza is going to talk about his experience with ella, a django CMS.

He advocates using model inheritance. from ella.core.models import Publishable and then subclass your specific model (YoutubeVideo, for instance) from it. That Publishable has most of the basic CMS functionality. That way you get most of what the CMS needs for free and you still can extend it.

Showing the new model? You can use different templates easily. render_to_string() and friends accept a list of templates and use the first one that exists. So you can give it ['youtubevideo.html', 'publishable.html'] and so on, using templates named somewhat after the model, most specific first. This way you can re-use basic templates, but modify them if you want, just by providing a specially-named template. No code changes necessary.
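A small sketch of how such a template list could be built. Django uses the first name in the list that exists, so the most specific name goes first. The helper template_candidates is hypothetical (it is not part of ella or Django) and plain classes stand in for the models:

```python
def template_candidates(obj, suffix=".html"):
    """Return template names for obj, most specific class first.

    Walks the class hierarchy (the MRO) and lowercases each class name.
    """
    names = []
    for cls in type(obj).__mro__:
        if cls is object:
            continue
        names.append(cls.__name__.lower() + suffix)
    return names


# Plain classes stand in for the Django models from the talk.
class Publishable:
    pass


class YoutubeVideo(Publishable):
    pass


candidates = template_candidates(YoutubeVideo())
```

With real models you would probably stop walking at Publishable, otherwise Django's own base classes end up in the list too. The resulting list is what you would hand to render_to_string().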

They're using Redis to collect information from the Django database on publishables. This way you don't have any problem with Django's database's behaviour of focusing on a single kind of model at a time.

They also use django-appdata for storing extra data on existing Django models. From the pypi page: extandable field and related tools that enable Django apps to extend your reusable app. Through a registry you can add for instance tags to an existing model. To actually see the field in the admin, you do have to make a new ModelAdmin for that. Django-appdata is a hack, but sometimes hacks are very useful.

Through ella.core.custom_urls they even managed to add URLs (like a URL for adding "+1" functionality) to arbitrary models that support it (through django-appdata).

Warning: with great power comes great responsibility. All this is powerful, so it is easy to make a complete and utter mess out of it. Perhaps it is better to convince the customer to forgo a feature?

Warning: keep the defaults sane. Nice to use Redis to make querying quicker and simpler, but you just forced every developer to have Redis installed locally.

Warning: premature optimization is the root of all evil. Likewise with extensibility. You normally don't need to make an app extensible. But don't close the door to extensibility. Add it when you need it.

Taming multiple databases with Django - Marek Stępniowski

From Reinout van Rees on May 16, 2013 06:51 AM

Marek works at SetJam: "We came to Django for the views, but stayed for the ORM". Django's ORM is pretty much in the sweet spot. SQLAlchemy in comparison is less nice, having to learn a non-sql, non-pythonic language.

At SetJam, they have what they call a backend and frontend. The backend collects data and stores it in the database, the frontend spits it out, mostly via feeds.

They started out with one single big database, but that was hard to optimize. Many backend servers would write to the same database and the frontend server would read from it. Hard to optimize.

Next they added a database slave for reading. That was before Django's multi-db support, so they had if/elses in their settings files based on environment variables.

After Django's multi-db support, they could really support two databases and refer to them in the code with 'DEFAULT' and 'SLAVE'.

Later on they split up the database even more. What goes where is handled by two custom database routers: a "MasterSlaveRouter" for the master/slave distinction and an "AppRouter" for shuffling some apps' data to certain databases.
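Roughly what two such routers could look like. This is a sketch, not SetJam's code: the database aliases ('default', 'slave', 'feeds_db') and the app label 'feeds' are assumptions. The method names db_for_read and db_for_write are Django's router API. The first_answer helper only imitates how Django asks the routers in order, so the sketch runs without Django.

```python
from types import SimpleNamespace


class MasterSlaveRouter:
    """Send reads to the read replica and writes to the master."""

    def db_for_read(self, model, **hints):
        return "slave"

    def db_for_write(self, model, **hints):
        return "default"


class AppRouter:
    """Pin all models of selected apps to a dedicated database."""

    # Hypothetical mapping of app label -> database alias.
    app_databases = {"feeds": "feeds_db"}

    def db_for_read(self, model, **hints):
        # Returning None means: no opinion, ask the next router.
        return self.app_databases.get(model._meta.app_label)

    def db_for_write(self, model, **hints):
        return self.app_databases.get(model._meta.app_label)


def first_answer(routers, method, model):
    """Imitate Django's behaviour: the first router with an opinion wins."""
    for router in routers:
        alias = getattr(router, method)(model)
        if alias is not None:
            return alias
    return "default"


# Stand-ins for model classes; real models carry _meta.app_label themselves.
feed_model = SimpleNamespace(_meta=SimpleNamespace(app_label="feeds"))
other_model = SimpleNamespace(_meta=SimpleNamespace(app_label="accounts"))
routers = [AppRouter(), MasterSlaveRouter()]
```

In a real project the routers go into the DATABASE_ROUTERS setting, with the more specific AppRouter listed first.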

Tip: look at https://github.com/jbalogh/django-multidb-router, especially for the handy decorators (@use_master, for instance) it provides.

At one point they had problems with Django's transaction decorators: they only work with the default database. They had to call the actual code and pass it the right database.

Similarly, South doesn't work very automatically with multiple databases. South's ticket #370 is still open after three years. He hopes he can get a fix into the new south-in-the-django-core code.

He showed a code example that looked pretty OK. Then he showed what needs fixing to get it to work reliably with multiple databases.

Multidb is awesome, but...

  • It needs more documentation.
  • Full support for multidb in schema migrations.
  • It needs better debugging tools (whiny transaction decorators).
  • Attributes like _for_write should be more clear. They're pretty important, but the underscore looks like it is unimportant. (Comment: a core dev discussed with him during the questions; he thought this wasn't necessary).

May 15, 2013

Djangocon lightning talks day 1

From Reinout van Rees on May 15, 2013 03:05 PM

Sorry if I mangled any of the names, I took a photo of the lightning talk submission form and tried to decipher them :-)

From carrots to Django - Kamila Stephiouska

She talks about the Geek Girls Carrots community. A community for women interested in new technology. 11 cities, 4 special meetings, 1 sprint, 5 kinds of workshops.

They like to promote women working in IT.

They held a "django carrot" recently: 14 hours, 10 mentors, 23 participants. They try to get special guests. Last week Daniel and Audrey came (the writers of two scoops of Django).

See http://django.carrots.pl

They chose Django because of the community.

Don't be afraid to commit - Daniele Procida

Lots of people work with Django. Lots of people program with it. There are barriers to getting them to work on Django. They might not be effective. They might be afraid. They might not communicate effectively.

You also need to manage your code and your environment. Virtualenv fixes the environment, but you need to learn that first. Version control helps with your code, but you first need to learn version control.

Similarly, you need to learn documentation and tests.

And you need to learn to have confidence when interacting with the community.

He organizes a workshop on the first day of the sprint to help people learn this. Virtualenv, pip, git/github, python tests, sphinx, readthedocs.

After the workshop you can start working on a couple of simple tickets that he reserved for workshop attendees.

See https://github.com/evildmp/afraid-to-commit

Elasticsearch - Honza Král

Elasticsearch is cool. Open source, distributed, schemaless, realtime. All the buzzwords. Originally it was for searching, which it still does.

It can also handle faceting (analytics). Aggregating data into facets.

Percolator is new. Trigger-like. A query you store in elasticsearch. When you submit something to the store, you can get an alert whether it matched a query.

Stop writing settings files - Bruno

We're django devs, so we like settings files. from local_settings import *, that sort of stuff. The problem is that you can't add to existing settings, you have to overwrite it.

You can also have multiple settings files, importing base.py and production.py and so. You end up with lots and lots of settings files this way.

http://12factor.net advocates strict separation of config from code. Which Django doesn't.

So: expose your configuration as environment variables and use that to get them into your settings.

Look at daemontools' envdir. It lets you put environment variables in files in a defined directory (one file per variable, named after the variable) and sets the variables from those files before running your program. You can use the same trick in your settings.py, it is only a few lines of code.
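Those "few lines of code" could look something like this. It is a simplified sketch of the envdir idea (file name is the variable name, first line of the file is the value), not a full reimplementation of daemontools' envdir. The function names are made up.

```python
import os


def read_envdir(directory):
    """Return {filename: first line of file} for every regular file in directory."""
    values = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if name.startswith(".") or not os.path.isfile(path):
            continue
        with open(path) as handle:
            # Only the first line counts; trailing whitespace is dropped.
            values[name] = handle.readline().rstrip()
    return values


def load_envdir(directory, environ=os.environ):
    """Copy the envdir values into environ without overriding existing keys."""
    for key, value in read_envdir(directory).items():
        environ.setdefault(key, value)
    return environ
```

In settings.py you would call load_envdir() once near the top and then read values with os.environ['SECRET_KEY'] and friends. Not overriding existing variables is a design choice here: what the process was started with wins over what is in the directory.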

The files can be in version control. Your sysadmin will thank you. Easy to set up with salt/puppet/chef.

Teaching 2.0 - Krysztof Dorosz

What teaching should look like. He teaches at a university, so he knows about teaching.

You don't need to know everything better. You don't need to make one fixed PDF with fixed text and a fixed exercise.

He makes his classes in github. Everything in .rst files. Students can propose fixes and improvements. And they do!

This way you treat your students as collaborators and partners instead!

Configuring python environments with Puppet - Dmitry Trofimov

If you want to test with various python versions, you need to build them all and fit them out with their virtualenv and so. And use various django versions.

He prepared all those combinations with puppet. See https://github.com/traff/python.pp

Migrating the future - Andrew Godwin

From Reinout van Rees on May 15, 2013 02:24 PM

Andrew Godwin attempted to raise 2500 pounds for inclusion of south in Django core with kickstarter. It worked. In fact, he raised 17952 pounds!

Why does South need to be replaced by a new version inside Django itself?

  • It started 5 years ago, so there's 5 years of learning done in that period. Some things that made sense at the time aren't the best decision now.
  • There's poor support for VCS branching.
  • The migration files are huge.
  • Migration sets get too large. There are projects with 1000 steps!

The inside-django solution has two parts. The actual migration code and a separate backend. So if you want a different migration engine, you can probably reuse the backend code with its support for multiple types of databases.

The new migration format is more declarative instead of imperative like it is now. This makes them smaller. It also allows you to compute the end result in memory and apply one single migration.

Migrations will have a parent. So you won't have a problem with 0003_aaaa and 0003_bbbb migrations that halfway bite each other. If a merge can be done automatically, fine, otherwise south/django will warn you.

Squashing will be added. You can squash a set of migrations together so that you can start from one new starting point instead of needing to go through the entire list of migrations.
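To make "declarative" and "squashing" a bit more concrete, here is a toy model. It is emphatically not the new Django migration format (that was still being designed at the time of the talk). The operation tuples and both function names are invented for illustration.

```python
def apply_operations(state, operations):
    """Apply declarative operations to an in-memory schema state; return a new state."""
    new_state = {model: set(fields) for model, fields in state.items()}
    for op, model, field in operations:
        if op == "create_model":
            new_state[model] = set()
        elif op == "add_field":
            new_state[model].add(field)
        elif op == "remove_field":
            new_state[model].discard(field)
        else:
            raise ValueError("unknown operation: %r" % (op,))
    return new_state


def squash(migrations):
    """Replay all migrations in memory and emit one equivalent migration."""
    final_state = {}
    for operations in migrations:
        final_state = apply_operations(final_state, operations)
    squashed = []
    for model in sorted(final_state):
        squashed.append(("create_model", model, None))
        for field in sorted(final_state[model]):
            squashed.append(("add_field", model, field))
    return squashed


migrations = [
    [("create_model", "Quote", None), ("add_field", "Quote", "text")],
    [("add_field", "Quote", "author"), ("add_field", "Quote", "draft")],
    [("remove_field", "Quote", "draft")],
]
squashed = squash(migrations)
```

Because the operations only describe what changes, the end result can be computed without touching a database. The "draft" field that was added and removed again simply disappears from the squashed version.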

One thing to watch out for: the Django field API will change a bit because the migration code needs to know how to re-create a field. Watch the django developer mailinglist if you're interested.

Read his blog at http://www.aeracode.org/ if you're interested in the details of everything he encounters.

Having your pony and committing it too - Jacob Burch

From Reinout van Rees on May 15, 2013 02:22 PM

Jacob Burch hopes you can learn from him if you're new at contributing to open source. He won't cover virtualenv, git, django's core code structure. And also not what to get involved in. What's this talk about? About you if you have something you want ("a pony") to get into Django core.

You are initially probably going to be a bit afraid. Jacob showed a couple of quotes about people that were initially not quite sure/certain when committing to Django. Then he showed the names of the people those quotes came from: they're now all core committers :-)

Two balances you have to keep in mind:

  • You should be both pro-active and patient. This is a tough balance to strike. If you manage it, it helps a lot.
  • You should be both confident and humble. Be humble, but be convinced of your idea. How to help here? The best thing is to run all the tests. It will give you confidence that your solution works (if it does). And it'll make you humble once you realize all the edge cases that Django (and thus your fix) needs to support.

There are three broad categories of contributions:

Bug fixes
Start with a test condition. Something works or it doesn't. A test that demonstrates an issue is worth 20 emails.
Major contributions
Do your homework. Search trac, search the django developer mailinglist, become familiar with the code you're proposing to change. You need a go-ahead beforehand, so discuss it on the mailinglist.
Minor additions
Treat it as a major contribution. Only a beforehand go-ahead isn't needed here.

(Jacob did some live coding, trying to get a push into Django. In the meantime, he continued with the presentation by showing himself on video :-) )

Some do's/dont's when mailing about something:

  • Don't communicate entitlement. Don't focus only on your own needs.
  • Communicate patience. Accept that this is the start of a conversation.
  • State the problem clearly.
  • Show confidence: propose a clear solution. This really helps the core devs, as they have a clear proposal to work from instead of having to come up with something themselves. Creative energy is expensive energy.
  • Show your homework. Ticket numbers, list potential downsides/drawbacks.
  • Show humility. If you're unsure of an aspect, just ask.

Code is important, but most of the effort will probably be spent in discussing it. That said, here are some code related suggestions:

  • PEP8, unless it is consistently ignored on a certain point. Stay consistent locally.
  • Respect existing style.
  • Comments are your friend. Don't comment too heavily, but make sure that anything unusual is explained.
  • Get some peer review before submitting.

Repeat to yourself: you are not your code. Your ego is not on the line. Separate yourself from your code. Humility is really important. Your patch might not get accepted. You might get negative feedback. Don't take it personally. Your code is not yourself, even though it might feel like your own baby.

If it is not getting reviewed: remember that core devs are busy and might not have had time to review it. A bit of persistence is important, but don't irritate people. Tip: get to know people that can commit at conferences or at sprints. That helps.

Once you do get feedback: iterate quickly and get back quickly on the feedback, otherwise the core dev has to load everything back into their head.

Django Sprint workshop

From DjangoCon Europe on May 15, 2013 01:25 PM

We are thrilled to announce that during a DjangoCon sprint Daniele Procida will lead a workshop “Don’t be afraid to commit”. The workshop is addressed to all of you who want to contribute to open source projects but are not sure how to do it.

The workshop will take participants through the complete cycle of identifying a simple issue in a Django or Python project, writing a patch with tests and documentation, and submitting it.

The workshop will take about 3h starting at 11:00 on Saturday in the same place as Sprints. Since the number of attendees is limited (12 people), please make sure to sign up here: http://djangocon-workshops.eventbrite.com as soon as possible!

Combining Javascript and Django in a smart way - Przemek Lewandowski

From Reinout van Rees on May 15, 2013 11:59 AM

Django is a javascript-agnostic web framework. Nothing is built-in so you can be up to date all the time. Javascript development moves very quickly.

The basic approach is to include some custom inline javascript in the html pages. It quickly leads to illegible code that's hard to work on and hard to distribute.

Javascript has frameworks, too. They give your application structure and take work off your hands. This is the advanced approach. It includes several parts:

  • Communication with the server (REST api, websockets).
  • Application building: combining and minimizing files.
  • Static files management.
  • Javascript improvements: coffeescript and so.

What Przemek Lewandowski needed was a powerful javascript framework, coffeescript, testable code, js code minimization and fingerprinting for avoiding caches. And also rapid REST API development.

Javascript framework
They started with backbone, but it wasn't enough. They added marionette to backbone, but it still wasn't good enough. There's a lack of a binding mechanism; there are no reusable views; models are poor. AngularJS and Ember are better.
Coffeescript
It is controversial, but it helps to write code faster and use less code for it. It performs as well as javascript as it compiles to javascript. They used RequireJS for painless coffeescript integration. RequireJS allows for modular code and gives you both a builder and an uglifier.
Building javascript apps
In the end they used django-require instead of django-compressor and django-pipeline.
REST api
Piston isn't really maintained anymore. Tastypie is reasonable, but django-rest-framework is the nicest one. It uses class based views, so it saves you a lot of work (even though still being very customizable).
Static files management.
Django's built-in static files management is good. And you can add extra "storages" to it to get django to store the static files in the cloud, for instance. django-require can be plugged in, too, to add a fingerprint to javascript files to ensure the latest version is always used.
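Fingerprinting in a nutshell: put a hash of the file's content in its name, so a changed file gets a new URL and stale caches are bypassed. The sketch below only illustrates the idea. It is not django-require's implementation, and the function name and the choice of sha256 are assumptions.

```python
import hashlib
import os.path


def fingerprinted_name(filename, content, length=12):
    """Return filename with a short content hash inserted before the extension."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    root, ext = os.path.splitext(filename)
    return "%s.%s%s" % (root, digest, ext)


old = fingerprinted_name("js/app.js", b"console.log('v1');")
new = fingerprinted_name("js/app.js", b"console.log('v2');")
```

The same content always gives the same name, so unchanged files stay cached. Any change to the content changes the name.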

There's more, like Bower, a javascript package manager. He didn't look at this yet. (Note by Reinout: look at http://blog.startifact.com/posts/overwhelmed-by-javascript-dependencies.html for a starting point)

Getting recommendations out of nothing - Ania Warzecha

From Reinout van Rees on May 15, 2013 11:25 AM

Ania Warzecha researched recommendation systems. Recommendations means estimating ratings or preferences for items a user hasn't seen yet. For example books or movies you might also like based on earlier purchases.

There are three kinds of recommendations.

  • Collaborative recommendations. Mostly created based on actions from other users. Which books are often bought together, for instance.

    Simple to implement, but can be slow for big datasets. And doesn't work well on new items and/or new users.

  • Content-based recommendations. Looks for similar items.

    Fast and accurate, but tends towards over-specifications regarding needed data.

  • Hybrid methods. Combining them.

A case study: a Polish car parts website. You normally don't log in there, you just want a part. So older purchases aren't available. They did have a lot of parts and data, so they started with content-based recommendations.

They mixed in some basic user actions. 0=didn't buy, 1=browsed, 2=bought. Later on more elaborate, like points for items found through searching or items placed on wishlists.

They used Redis for its quick addition of user actions, simply pushing an additional score to an item which then gets added in the database.

One thing they needed to do was to merge session keys after a user logs in, merging the before-login session with the logged-in user's session. They didn't want to lose data collected till that point.
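A sketch of the scoring and the session merge. A plain dictionary stands in for Redis here so the example runs anywhere. The point values follow the 0/1/2 scheme above; the class and method names are made up.

```python
from collections import defaultdict

# Points per action, following the 0/1/2 scheme from the talk.
ACTION_POINTS = {"ignored": 0, "browsed": 1, "bought": 2}


class ActionScores:
    """In-memory stand-in for the Redis counters: a score per (session, item)."""

    def __init__(self):
        self.scores = defaultdict(lambda: defaultdict(int))

    def record(self, session_key, item_id, action):
        """Add the points for one user action to the item's score."""
        self.scores[session_key][item_id] += ACTION_POINTS[action]

    def merge_sessions(self, anonymous_key, user_key):
        """Fold the before-login session's scores into the logged-in user's scores."""
        for item_id, points in self.scores.pop(anonymous_key, {}).items():
            self.scores[user_key][item_id] += points


scores = ActionScores()
scores.record("anon-session", "brake-pad", "browsed")
scores.record("anon-session", "brake-pad", "bought")
scores.record("user-42", "oil-filter", "browsed")
scores.merge_sessions("anon-session", "user-42")
```

After the merge nothing collected before the login is lost: the user's scores now include the brake pad from the anonymous session.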

Now on to figuring out similar users. Common techniques are Euclidean distance, Pearson correlation and cosine similarity. But the problem was that it was slow. So they made an intermediary cache table in Redis.
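For reference, the three measures in plain Python, taking two equally long lists of scores (one per user, same item order). This is a textbook sketch, not the code from the talk, and it skips the caching they needed for speed.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance: 0 means identical, larger means less similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between the vectors: 1 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centred vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centred_a = [x - mean_a for x in a]
    centred_b = [y - mean_b for y in b]
    return cosine_similarity(centred_a, centred_b)
```

Each call is cheap, but comparing every user with every other user grows quadratically. That is presumably where the slowness and the need for a cache table came from.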

Some conclusions:

  • Redis is good for fast storing and painless calculations.
  • Content-based recommendations are good for big datasets.
  • Keep all the data you can keep.

Advanced PostgreSQL in Django - Christophe Pettus

From Reinout van Rees on May 15, 2013 09:57 AM

(See also last year's talk)

Database agnosticism: write once, run on any database. A critical selling point for Django: it runs on many databases. But for others, it is bad. You pay a performance hit for not using database-specific features. So once you have made your choice, really use that database.

Here are some examples of good special things available in postgres.

Custom types

Custom types. If you like types, you'll love postgres. Many built-in types. And many are usable in Django by installing some small app.

  • Do you do .lower() in python code or in your SQL? For an email address for instance? Why not use citext, a case insensitive text field provided by postgres.
  • Often you want to add various key/value data to an object. Attributes. Extra table with a join? Add fields to the main table? Solution: hstore.
  • Postgres has a built-in json type! No need for mongodb :-) It is validated going in. Postgres 9.3 will make it much faster.
  • The UUID type is much more efficient than storing a long character string.
  • IPv4 and IPv6 addresses.

You can define your own! And it is easy to integrate into Python and Django:

  • You adapt it into psycopg2. This'll mean quite some regex'ing, but there are many examples.
  • You write a field class for Django.
  • You write a formfield and widget for use in forms and the admin.

Indexes

Django's models are great, but the index creation functionality is limited.

  • Very cool: partial indexes. You can create an index that only indexes a part of the table. Filter out inactive items, for instance. It might make your index much smaller and quicker.
  • Multicolumn indexes. Speeds up selection on multiple columns.
  • Expression indexes.

For these things you need to get custom SQL into the database. Using South is the only sane way.
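What the three kinds of index look like as SQL. So that the sketch runs without a database server, it uses SQLite (which also supports all three) as a stand-in. The statements have the same shape in postgres, though column types and details differ. Table and index names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (
        id INTEGER PRIMARY KEY,
        email TEXT,
        category TEXT,
        is_active INTEGER
    );
    -- Partial index: only rows with is_active = 1 are indexed.
    CREATE INDEX item_active_email ON item (email) WHERE is_active = 1;
    -- Multicolumn index: helps lookups on category plus email together.
    CREATE INDEX item_category_email ON item (category, email);
    -- Expression index: indexes the lowercased email.
    CREATE INDEX item_lower_email ON item (lower(email));
""")

# List the indexes that were created.
index_names = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
)
```

In a Django project, statements like these would go into a migration as custom SQL, as suggested above.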

Custom constraints

Django does foreign key constraints in the ORM, not in the database. The only other constraint is uniqueness.

Constraints should be pushed into the database whenever possible. The database is much more efficient at it. And you remove one major path that could lead to data inconsistency.

Actually getting the constraints into the database means custom, hairy SQL. Sadly. He's working on something better.

You can use exclusion constraints, like not allowing a room booking if it overlaps with another.

Raw SQL

Christophe's rule: if you are joining more than three tables, use raw SQL. Below three, just use the ORM.

Django has raw query sets that even give you back actual Django model instances. See the django documentation.

Sometimes you just have to dig in and write some 40-line monster SQL to get some operation down from 30 seconds to 10 milliseconds.

Where to put the SQL? In the manager of the model, not directly in the view. You can also wrap it in SQL stored procedures. Again: use south to add stuff to the database if you need to.

Closing comments

  • Don't limit yourself because of some hypothetical need to later switch databases.
  • Postgresql has lots of advanced features: use them!

Processing payments for the paranoid - Andy McKay

From Reinout van Rees on May 15, 2013 08:50 AM

Everyone should be paranoid when processing payments. The client, the programmer, everyone.

He works on Firefox OS and more especially the marketplace ("don't call it an app store"). The marketplace is powered by Django. And of course it accepts payments. And of course it is open source (even the presentation is on github).

Btw, they have a bug bounty in place. If you find a real bug, mail them and they'll pay you a bounty!

The firefox add-on website already allows donations for firefox add-ons, handled through paypal. 500-2000 dollars per day. But the marketplace will process much larger amounts of money, so they needed to increase their paranoia level.

For online payments, you need tokens and credentials. And they need to be stored somewhere. And suddenly you're a big fat juicy target just waiting to be hacked.

  • XSS (cross site scripting) is an oft-occurring problem. Django has built-in protection for common cases. There's also content security policy that further limits it.

    They also started navigator.mozpay.

  • Phishing. In-person tricks. For instance for getting your hands on a database for test usage. You do need something for debugging, so they now create an automatic anonymized debug database.

  • SQL injection and so. They now have a REST api (solitude) for payments. This isolation helps prevent injections. Inside the database, lots is encrypted. And several items are stored outside of the databases. At the moment, the transaction data is separated from the payments data which is separated from the payment provider credentials.

    This is defence in depth: hedging against your own stupidity, basically.

    Access can happen through requests and OAuth1. Andy uses curling and slumber.

    There is a list of common problem points: OWASP. After reading through it they started django-paranoia which for instance provides paranoid forms: if you submit more key/values than expected, it will be logged. Also something that watches if your user agent changes during a session... IP changes are also logged, but normally they'll be valid. But if the first IP is in Poland and 5 minutes later it is in China...
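The two checks mentioned for django-paranoia (extra form keys, a changing user agent) are easy to picture in plain Python. The sketch below only illustrates the idea. It is not django-paranoia's API, and the function names and the logger name are made up.

```python
import logging

logger = logging.getLogger("paranoia-sketch")


def unexpected_keys(submitted, expected):
    """Return (and log) submitted keys that the form does not define."""
    extra = sorted(set(submitted) - set(expected))
    if extra:
        logger.warning("Unexpected form keys submitted: %s", ", ".join(extra))
    return extra


def user_agent_changed(session, current_user_agent):
    """Remember the session's first user agent; report whether it differs now."""
    first = session.setdefault("user_agent", current_user_agent)
    changed = first != current_user_agent
    if changed:
        logger.warning("User agent changed during session")
    return changed
```

Note that nothing is blocked here, it is only logged. That matches the description above: suspicious things are recorded so someone can look at them later.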

About the phone: version 1 isn't done yet, but very very nearly. Which means they'll start to have scaling problems soon which need solving :-)

Circus: process and socket manager - Tarek Ziadé

From Reinout van Rees on May 15, 2013 07:39 AM

Tarek Ziadé can't believe he's giving a talk about his circus process manager in an actual circus tent :-)

A typical deployment is with nginx and gunicorn or uwsgi. But you add more and more items right next to your django process(es). Celery or haystack for instance. So you add a supervisor that starts 'em all. An often-used one is supervisord. You could use a system-level tool like upstart, but you need root access for that. You don't need it for supervisord.

Supervisord has some missing features like a powerful web console, clustering, realtime output, remote access and so. Supervisord has some of this, but not good enough. So they (mozilla) started with Circus.

They used several existing libraries, like psutil, zeroMQ, socket.io. psutil is the core of the system. Very handy for interacting with processes. It was a bit slow, but together with psutil's author they managed to make it fast.

ZeroMQ is an async library for message passing, so more or less a smart socket. They use message passing for making the various process data available to the circus tools, like 'circus-top' or 'circusd-stats'.

And because everything is nicely decoupled, it is possible to add your own plugins for custom interaction. There are already community-provided plugins available.

He showed the web interface: looks nice. Live graph per process with memory and CPU usage. A simple "+" to add an additional process.

One last thing: there are multiple levels of supervision. Supervisord or circus must be started in some way by the system. And gunicorn, launched by supervisord or circus, itself starts up django processes. They added chaussette as a wsgi runner that can also run new processes on already-opened sockets so that they can be managed by circus, too.

2013 EU Djangocon introduction

From Reinout van Rees on May 15, 2013 06:33 AM

I'm at the 2013 European djangocon in Warsaw! Ready for three days of conferencing and, for me also, live blogging :-)

Russell Keith-Magee started off the conference. He remembered Malcolm Tredinnick, mentioning his code contributions, but especially his community involvement. Lots of mailinglist messages. Lots of personal involvement, too, as he visited many people and local communities. Not only Django: also chess, for instance. And he built a community here, too: working on the Australian chess community.

He passed away unexpectedly a few months ago.

Make the most of the time you have. It can be over quickly. And especially: be part of communities. Make communities work. And especially make this Django community work. Make friends. Enjoy our friendly community!

May 14, 2013

Query a Random Row With Django

From Ed Menendez on May 14, 2013 09:38 PM

Here's a gist for a drop-in Django manager class that allows you to return a random row.

Model.objects.random()

It can be used in your models.py like this:

from django.db import models

# RandomManager comes from the gist mentioned above.
class QuoteManager(RandomManager):
    def random_filter(self):
        return self.filter(is_active=True)

class Quote(models.Model):
    quote = models.TextField()
    by = models.CharField(max_length=75)
    is_active = models.BooleanField(default=True)

    objects = QuoteManager()

    def __unicode__(self):
        return self.by

The advantage over using order_by('?') is performance. Random sorting at the database seems to be extremely slow on most databases, even if the table only has a few thousand rows. Note that the count of records is cached for 5 minutes, so if the table changes often you may want to change that. A limitation is that it only returns one row.
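One common way to avoid the random sort is to count the rows and then fetch a single row at a random offset. The sketch below shows that technique with sqlite3 so it runs on its own. It is an assumption about the general approach, not a copy of the gist. The 5-minute caching of the count is left out, and the table and function names are invented.

```python
import random
import sqlite3


def random_active_quote(conn, rng=random):
    """Pick one active quote via COUNT plus a random OFFSET (no random sort)."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM quote WHERE is_active = 1"
    ).fetchone()
    if count == 0:
        return None
    offset = rng.randrange(count)
    return conn.execute(
        "SELECT body, author FROM quote WHERE is_active = 1 "
        "ORDER BY id LIMIT 1 OFFSET ?",
        (offset,),
    ).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE quote "
    "(id INTEGER PRIMARY KEY, body TEXT, author TEXT, is_active INTEGER)"
)
conn.executemany(
    "INSERT INTO quote (body, author, is_active) VALUES (?, ?, ?)",
    [
        ("Simple is better than complex.", "Tim Peters", 1),
        ("Readability counts.", "Tim Peters", 1),
        ("An inactive quote.", "Nobody", 0),
    ],
)
picked = random_active_quote(conn)
```

Two cheap queries replace one expensive sort. The price is a small race: if rows are deleted between the count and the fetch, the offset can point past the end and the function returns None.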

Two scoops of Django book review

From Reinout van Rees on May 14, 2013 09:16 PM

I took the train from Utrecht (NL) to Warsaw today. I only had to change in Amersfoort (NL) and Berlin (DE), so it was a pretty direct connection. 12 hours of train time (which I enjoy). So that's enough time to read through two scoops of Django, the Django book by Daniel Greenfeld and Audrey Roy! Here's my review.

The summary: buy the book and learn a lot. For the longer version, I'll simply go through the notes I made for each chapter.

Coding style
The book starts off good, in my opinion, because it tells you to write good and neat code. PEP8. Good not-too-short variable names. And it got me thinking by advocating explicit relative imports ("from .models import SomeModel"). That's what's good about this book: Daniel and Audrey state preferences and tell you best practices and sometimes those best practices won't be your best practices. Or you didn't know something. Anyway, it gets you thinking; which is good and enjoyable.
Virtualenv
Hey, virtualenv in chapter two! Nice. Here, like in the rest of the book, I noticed they point a lot at existing documentation and don't provide much explanation. No virtualenv explanation here, for instance, just a pointer to the official docs. It is not necessarily bad, probably even good, but it is something to keep in mind. You'll have to do some work yourself (which will make you retain the knowledge you gain better anyway!).
Project layout
I got tickled here. Directories three levels deep? Especially having urls and settings in a subdirectory within a directory with the very same name (the name of the project)? Two chapters later, the settings seem to be moved to a settings/ directory, so mayhap I looked at an older beta version of the book :-)
Apps
The gospel that a Django application should do one thing only is repeated here, which is a very, very good thing. I completely agree. Advice like this is what makes it a good book. You get a good mindset out of the book.
Settings
The idea to have a directory settings/ with a base.py and then production.py, dev.py, reinout_dev.py and so looks OK. I use a different setup and this proposed one looks better. Half the chapter is about environment variables as a means of keeping things like SECRET_KEY and database passwords out of the settings. Yep, that can work. My opinion is that you can also keep it in the settings, provided you keep your code non-public. If you use environment settings, you still need to store the data somewhere. You won't type it in by hand, will you? There's no real suggestion in the chapter to solve this, though the solution of course depends on your chosen setup (and there are too many different kinds of setups to provide a single right answer). Nice touch is the clean suggested ImproperlyConfigured error when an environment variable is missing; this shows the care that went into the book.
Models
Hey, I learned something new! I didn't know about auto_now and auto_now_add on DateTimeFields! That's what I read books like this for: getting hints like this.
Views
Reminder for myself that I took out of these chapters: put less code in views.py. And I ought to look at django-braces for handy Class Based View mixins.
Forms
Hm. I probably ought to use forms much more. Especially for those spots in my code where I just take two or three variables directly from the GET or POST and stuff it in some query... Why not use a small form, just for the form validation? Much safer that way and I'd use more of what Django gives me! Again something in the book that educates me :-)
REST
It took me a while to spot the difference between the two views that are shown at the start. It turns out that one is a view on the collection of items and the other a view on one single item. The first has list+add, the second view+edit+delete. The names just don't make it clear. I think this chapter is a bit too short. On the other hand, perhaps one extra paragraph and two better class names would be enough.
Templates
Flat is better than nested. Solid advice not to go overboard with blocks and template inheritance. Oh, and TEMPLATE_STRING_IF_INVALID is handier than I thought for template debugging as you can add a %s to the string, which shows you the failed expression. This tip is going to help a lot.
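For illustration, a settings fragment along these lines should do it. The exact wording of the string is an assumption; only the %s marker matters, since Django replaces it with the name of the invalid variable.

```python
# settings fragment for a development setup (the message text is made up)
TEMPLATE_STRING_IF_INVALID = "INVALID EXPRESSION: %s"

# Roughly what ends up in the rendered page for a misspelled variable
# such as {{ user.nmae }} (plain string formatting, shown for clarity):
example_output = TEMPLATE_STRING_IF_INVALID % "user.nmae"
```

This is probably best kept out of production settings, as some templates rely on invalid variables rendering as an empty string.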
Admin
They say that the admin should only be used for site admins, not for end users. It is just as easy and probably better to make a couple of quick edit pages or dashboards for your customers.
Third-party apps
There are lots of apps you can use. Look at http://djangopackages.com. Did you know it was written by the same people that wrote this book? Now you know why you should read the book.
Testing
Most of my favourite/essential packages are mentioned: coverage.py, factory_boy, mock. And the tip to zap the tests.py file and replace it with a tests/ subdirectory full of test files is correct.
Documentation
Documentation is mandatory. Even when installation is done with a fabfile or with chef, describe it anyway in the installation docs. And describe what the goal of the app is. Etc. Documentation is mandatory.
Performance tuning
Debug toolbar: yes. Hey, but I didn't know yet about django-cache-panel to see what happens in the cache. Sounds handy. This chapter also whacks me on the fingers a bit as I have done almost nothing with sql/db level optimization/fixing/profiling.
Security
"Always use https". And lots of other good tips. And, for me, the reminder to use forms (or rather, form validation) more for better security.
Logging
Good that the book mentions logger.exception('Something went wrong'), as it logs the message at ERROR level and automatically includes the traceback. No more weird exc_info-like stuff, just logger.exception().
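A small standard-library sketch of that behaviour. The logger name "example" and the in-memory buffer are only there to make the output easy to inspect; they are not from the book.

```python
import io
import logging

# Send log records to an in-memory buffer so the output is easy to inspect.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

logger = logging.getLogger("example")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)
logger.propagate = False

try:
    1 / 0
except ZeroDivisionError:
    # Logs at ERROR level and appends the traceback of the active exception.
    logger.exception("Something went wrong")

output = buffer.getvalue()
```

Note that logger.exception() is meant to be called from inside an except block, because that is where the traceback comes from.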
Django utils
A handy list of utils that Django already provides such as slugify, strip_tags and so on.
Deployment
Gunicorn and mod_wsgi. Personally, I'm happy with gunicorn (when run behind supervisord). Nice isolation. Nice mostly-transparent restarts when things barf.
Getting help
Good thing: they tell you to do your homework before asking for help in the usual channels. Very good.

I've got one big gripe with the book. There's probably a good reason for the omission, but I'm missing setup.py. Telling people to use a certain requirements.txt in some README is a poor substitute for Python's automatic dependency handling. This is not only good for apps you want to put on PyPI, but also for your own packages.

All in all: valuable book, buy it!

Daniel and Audrey are at DjangoCon in Warsaw, so if you're there, say "hi" to them to thank them for the book.

May 12, 2013

Be Nicer at DjangoCon!

From DjangoCon Europe on May 12, 2013 07:41 PM

With a few days to DjangoCon, we thought it’d be nice to let you know how we are going to make your stay in Warsaw even NICER. 

The Nicer app is here! Available for both Android and iOS, smartphones and tablets. By downloading this app, you will always be connected to us, the organizers. We will update you with hot news and unexpected agenda changes, and post little tips so you can have an awesome time in Warsaw.

Get the app of your choice here: http://getnicer.com/apps
Then simply follow DjangoCon Europe 2013. Make sure to turn on push notifications so you won’t miss an important event!

Other than that, make sure to follow us on twitter: @djangocon, tweet using the #DjangoCon hashtag, check out our pictures on instagram and videos on Vine to get a full coverage :) 

See you in Warsaw really SOON!

Enabling CORS in Angular JS

From Torsten Engelbrecht on May 12, 2013 01:09 PM

I was recently experimenting with building an API with django-tastypie and making it accessible via CORS, so it can be used from a different host by an AngularJS app.

For the Django part it was relatively straightforward. I could have written my own middleware to deal with incoming CORS requests, but decided to use django-cors-headers in the end. Following the instructions in the GitHub repo and adding the host where my AngularJS app is served to the CORS_ORIGIN_WHITELIST setting enabled the Django server to handle CORS.
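As a rough sketch, the relevant part of settings.py might look like the following. It is based on the django-cors-headers instructions as they stood at the time of writing; the hostname is a placeholder, and newer releases of the package may use different setting names or value formats.

```python
# settings.py (fragment)
INSTALLED_APPS = (
    # ... your other apps ...
    "corsheaders",
)

MIDDLEWARE_CLASSES = (
    # Placed early so it can add the CORS headers to responses.
    "corsheaders.middleware.CorsMiddleware",
    "django.middleware.common.CommonMiddleware",
    # ... your other middleware ...
)

# Hosts that are allowed to make cross-origin requests.
CORS_ORIGIN_WHITELIST = (
    "angular.example.com",  # placeholder for the host serving the AngularJS app
)
```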

With AngularJS it was a little more tricky, mainly because information is spread all over the web. Beside the fact that I was trying to implement a service using ngResource to communicate with the API, the following did enable AngularJS to send its requests with the appropriate CORS headers globally for the whole app:

var myApp = angular.module('myApp', [
    'myAppApiService']);

myApp.config(['$httpProvider', function($httpProvider) {
        $httpProvider.defaults.useXDomain = true;
        delete $httpProvider.defaults.headers.common['X-Requested-With'];
    }
]);

Just setting useXDomain to true is not enough. AJAX requests are also sent with the X-Requested-With header, which marks them as AJAX. Removing the header is necessary so that the server does not reject the incoming request.

Meet our Platinum Sponsor: Mozilla

From DjangoCon Europe on May 12, 2013 12:53 PM

Mozilla hardly requires introduction. They make Firefox. And Thunderbird. And Persona. And other web stuff. They also fight for the users, by keeping the Web open and diverse. All this while being a non-profit organization. They also help others, like, you know, sponsoring conferences :-)

What are they up to now? Firefox OS seems to be the new, hot project. It is a mobile operating system running the Linux kernel, for which you can write applications using technologies you already know: HTML and JavaScript. Combine it with Firefox Marketplace and you have a complete mobile ecosystem. Check them out at Mozilla Developer Network.

We are super excited to have two speakers from Mozilla: Andy McKay will speak about processing payments in Marketplace and Tarek Ziadé will talk about… circus!

May 11, 2013

Django Facebook – 1.5 and custom user model support

From Thierry Schellenbach on May 11, 2013 12:14 PM

Django Facebook now officially supports Django 1.5 and custom user models! Go try it out and upgrade to version 5.1.1 via pip. It’s backwards compatible and you can choose if you want to keep on using profiles, or migrate to the new custom user model. Installation instructions can be found on GitHub.

Contributing

Thanks for all the contributions! My startup (Fashiolista) depends on a reliable Facebook integration and maintaining it would not be possible without all the pull requests from the community. Contributions are strongly appreciated. Seriously, give Github a try, fork and get started :)

About Django Facebook

Django Facebook enables your users to easily register using the Facebook API. It converts the Facebook user data and creates regular User and Profile objects. This makes it easy to integrate with your existing Django application.

I’ve built it for my startup Fashiolista.com and it’s currently used in production with thousands of signups per day. For a demo of the signup flow have a look at Fashiolista’s landing page (fashiolista.com)

After registration Django Facebook gives you access to the user’s graph, allowing for applications such as:

  • Open Graph / Timeline functionality
  • Seamless personalization
  • Inviting friends
  • Finding friends
  • Posting to a user’s profile

Django Facebook helps you quickly develop Facebook applications using Django.
Let me know what features or issues you are encountering!

May 10, 2013

The Easy Form Views Pattern Controversy

From Daniel Greenfeld on May 10, 2013 06:18 PM

In the summer of 2010 Frank Wiles of Revsys exposed me to what I later called the "Easy Form Views" pattern when creating Django form function views. I used this technique in a variety of places, including Django Packages and the documentation for django-uni-form (which has since been rebooted as django-crispy-forms). Miguel Araujo and I opened our Advanced Django Forms Usage talk at DjangoCon 2011 with this technique. It’s a pattern that reduces the complexity of using forms in Django function-based views by flattening the form handling code.

How the Easy Form Views pattern works

Normally, function-based views in Django that handle form processing look something like this:

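from django.shortcuts import redirect, render

# MyForm and do_x() are placeholders for your own form class and logic.
def my_view(request, template_name="my_app/my_form.html"):

    if request.method == 'POST':
        # Bound form: validate the submitted data.
        form = MyForm(request.POST)
        if form.is_valid():
            do_x() # custom logic here
            return redirect('home')
    else:
        # Unbound form: nothing has been submitted yet.
        form = MyForm()
    return render(request, template_name, {'form': form})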

In contrast, the Easy Form Views pattern works like this:

def my_view(request, template_name="my_app/my_form.html"):

    form = MyForm(request.POST or None)
    if form.is_valid():
        do_x() # custom logic here
        return redirect('home')
    return render(request, template_name, {'form': form})

The way this works is that the django.http.HttpRequest object has a POST attribute that defaults to an empty dictionary-like object, even if the request’s method is equal to "GET". Since we know that request.POST exists in every Django view, at least as an empty dictionary-like object, we can skip the request.method == 'POST' check by doing a simple boolean check on the request.POST dictionary.

In other words:

  • If the request.POST dictionary evaluates as True, then instantiate the form bound with request.POST.
  • If the request.POST dictionary evaluates as False, then instantiate an unbound form.
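The boolean check itself is plain Python and can be tried without Django. In this sketch ordinary dictionaries stand in for request.POST, and the helper name choose_form_data is made up for the illustration.

```python
def choose_form_data(post_data):
    """Mimic `request.POST or None`; a plain dict stands in for request.POST."""
    # An empty dict is falsy, so `or` falls through to None (unbound form).
    return post_data or None


data_for_get = choose_form_data({})                   # nothing submitted
data_for_post = choose_form_data({"name": "Audrey"})  # submitted form data
```

Here data_for_get is None, which gives an unbound form, while data_for_post is the submitted dictionary, which gives a bound form.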

Great! Faster to write and shallower code! What could possibly be wrong with that?

The Controversy

Before you jump to convert all your function based forms to this pattern, consider the following argument raised against it by a good friend:

This is one of those things where "empty dictionary and null both evaluate as false" can bite you.

There's a difference between "There is no POST data", and "This wasn't a POST".

—by Russell Keith-Magee (paraphrased)

The problem he is talking about is that data other than multipart/form-data or application/x-www-form-urlencoded would still end up in the request.POST dictionary-like attribute.

Where is the controversy? Well, I didn't write a retraction until now. Arguably I should have done it earlier. However, since I never ran into the edge case, I didn't see the need. Yet when it comes down to it, the "Easy Forms" approach has an implicit assumption about the incoming object, which in Python terms is not a good thing.
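To make the distinction between "no POST data" and "not a POST" concrete, here is a small sketch. FakeRequest is a made-up stand-in for django.http.HttpRequest with only the two attributes that matter here, and both helper functions are hypothetical names.

```python
class FakeRequest:
    """Hypothetical stand-in for django.http.HttpRequest (two attributes only)."""

    def __init__(self, method, post=None):
        self.method = method
        self.POST = post or {}


def bound_with_easy_pattern(request):
    # Easy Form Views: the form is bound only when request.POST is non-empty.
    return (request.POST or None) is not None


def bound_with_method_check(request):
    # Classic pattern: the form is bound whenever the request is a POST.
    return request.method == "POST"


empty_post = FakeRequest("POST")                  # a POST that carried no fields
plain_get = FakeRequest("GET")
filled_post = FakeRequest("POST", {"name": "x"})
```

The two approaches agree for plain_get and filled_post, but disagree for empty_post: the classic pattern binds the form (so validation errors can be shown), while the easy pattern silently treats the request like a GET.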

Getting bit by the Easy Form Views method

Here's how it happens:

Before Django 1.5 HTTP methods such as DELETE or PUT would see their data placed into Django's request.POST attribute. The form would fail, but it might not be clear to the developer or user why. HTTP GET and POST methods work as expected.

For Django 1.5 (and later) if a non-POST comes in then the form fails because request.POST is empty. HTTP GET and POST methods also work as expected.

Conclusion

Going forward, I prefer to use Django's class-based views or Django REST Framework, which make the issue of this pattern moot. When I do dip into function-based views handling classic HTML forms, I'm leery of using this pattern anymore. Yes, it is an edge case, but to inaccurately paraphrase Russell, "edge cases are where you get bit".

What I'm not going to do is rush to change existing views on existing projects. That's because personally I've yet to run into an actual problem with using this pattern. As they say, "If it ain't broke, don't fix it." While I'm not saying my code isn't broken, I'm also aware that 'fixing' things that aren't reporting errors is a dangerous path to tread.

Also, next time I get called on something by a person I respect, I'll respond more quickly. Nearly two years is too long a wait.

Update: Changed some of the text to be more succinct and took out the leading sentence.

Stonewall Jackson and documentation

From Reinout van Rees on May 10, 2013 08:26 AM

Stonewall Jackson died 150 years ago today. Not everyone will recognize the name: he was a general in the American Civil War. And a good one at that!

Bear with me, I'll have a programming-related comment to make on documentation :-)

If you know a bit about the Second World War, you might have heard about the German general Erwin Rommel. Jackson's fame was a bit like that. If you had to fight Jackson or Rommel, it didn't really matter that you had more men and equipment: he'd beat the crap out of you anyway. At one point Jackson's 15000 men ran circles around 60000 opponents and repeatedly beat them. That's 1:4. And they won.

Both Jackson and Rommel seemed to have Fingerspitzengefühl. They knew instinctively when to do or not do something. When to lie in wait and when to strike out despite the odds.

Both also seemed to be one-of-a-kind. I mean it in the sense that they could not teach others to do the same. It was all in their own head. It was all dependent upon them. And at least Jackson didn't tell anything to his subordinates; he was secretive. When he died, there was no one to take his place and no one who could emulate his expert handling of his army.

Here's the link with programming: document your stuff. If it isn't documented, it doesn't exist. I updated a small internal tool two days ago and had to figure out which commandline arguments to pass to it because I had not originally documented it! There was a README, but only with server installation instructions; no local test instructions. And it only described one of the two scripts. Needless to say, I've now corrected this situation.

I wrote the tool originally and I'm the only one working on it. But after I haven't touched it for a year I sure need a reasonable README to get myself back on track. So: document! Try to pass on knowledge.

Btw, Stonewall Jackson died because he was shot by friendly troops. I'm not suggesting programmers should be shot for not-documenting their stuff, but a forceful reminder here or there could be useful :-)

May 09, 2013

Party with Base!

From DjangoCon Europe on May 09, 2013 04:35 PM

You probably thought one party is good enough for a conference. We’re here to prove you wrong! Awesome guys from Base are helping us organize Base Party on Wednesday, 15th May at 9pm in Klub Balsam.

Base is the only CRM built for people. They believe that by 2020, business software will be radically different. Base is paving the way by building the next generation of CRM software. Their mission is to make you and your team 10x more productive.

Most importantly, they’re building the best tech team in Europe. With a big vision and small, highly talented team, Base is creating an amazing place to work for self-driven and dynamic people. Watch this short video to know them better:

They’re looking for great Python developers to join their team. Make sure to drop by Base Party at 9pm, Wednesday in Klub Balsam.

Starting Off

From Andrew Godwin on May 09, 2013 03:36 PM

Welcome to the first of my Django Diaries, where I'll be detailing the progress I'm making on my Schema Alteration project.

After a very successful Kickstarter, I had the unfortunate situation of a couple of successive trips abroad, and so initial work has been a bit more delayed than I would have liked. However, thanks to securing more time to work on the project every week, progress should be faster than planned from now on.

The plan is that these diaries will contain a rough summary of the work I've been doing; they're here both to help engage you (the slightly-too-interested public) in the work I'm doing, as well as providing some transparency.

If you want to hear more about a certain issue, feel free to get in touch with me - see the About page for my contact details. I'd love to explain as much as I can to those who are interested!

Laying the Groundwork

The first task I faced was to go back to my original Django branch and get it up-to-date with the changes in trunk. The only change that affected the schema work was Aymeric Augustin's transaction changes - he's gone in and fixed a lot of the transaction API and cross-database differences with things like autocommit.

As a result, I got to simplify my code somewhat: https://github.com/andrewgodwin/django/commit/6e21a594

After that, the next step was to go in and fix the issues other core developers had with AppCache in the previous release - in particular, the way I was abusing it to make new models at runtime. But first, let me explain a little bit about how AppCache works, for the uninitiated.

AppCache

Note

Other responses may include "templates", "the URL dispatcher" or possibly just "everything"

Ask a core developer what part of Django they dislike most, and chances are good that AppCache will appear somewhere in that list. It's a very old part of Django, and responsible for both knowing what apps are available to the project as well as which models are available.

Django depends far too heavily on it - anything app-related in Django generally touches it, even if it has nothing to do with the ORM. That's a problem being solved by the app-loading branch, which has been going for quite a while but is ever so close to landing.

However, my issues lie elsewhere. The main problem is that any schema migration design is going to have to be able to make historical versions of models - if you have a data migration to run before a schema migration, that data migration needs old model classes as the tables won't yet match the schema your project currently has.

Alas, every time you make a new models.Model subclass in Django, an entry gets placed into the AppCache for that model. This is very useful - it's how ForeignKeys know how to find the other end of their relation, for example - but it means that if we're making three or four old versions of an Author model it's going to trample all over the AppCache and mess everything up.

Resistance is... fine, actually

Note

For those completely unaware, the Borg are an alien race in Star Trek who all share a single hive mind.

Even more excitingly, the AppCache class uses what's known as the "Borg Pattern" - any instance of that class will share state. That means we can't just make a second AppCache to put temporary models in!
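For anyone who hasn't met it, here is a generic illustration of the Borg pattern in plain Python. This is the textbook form of the pattern, not Django's actual AppCache code; the class name and the models attribute are made up.

```python
class Borg:
    """Every instance shares one attribute dictionary."""

    _shared_state = {}

    def __init__(self):
        # All instances point their __dict__ at the same class-level dict.
        self.__dict__ = self._shared_state


first = Borg()
second = Borg()
first.models = ["Author"]  # set on one instance, visible on every other
```

The two objects are distinct instances, yet second.models is the same list as first.models, which is exactly why a second instance is no use as a sandbox.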

The work I did was in two parts: de-borgify AppCache, and allow a per-model app_cache option.

AppCache actually still uses the Borg pattern, I've just moved all the logic down into a BaseAppCache (along with a setting which means additional caches don't try and load models from every app). This means that my code can now just call:

new_app_cache = BaseAppCache()

I might tidy up the class name into something more suitable, we'll see.

The second change is an app_cache option for models:

new_app_cache = BaseAppCache()
class Author(models.Model):
    class Meta:
        app_cache = new_app_cache

This means you can now assign models to something other than the default AppCache when they're created. Obviously this isn't meant for end-users to develop against; it's so we can make models at runtime into a separate, sandboxed AppCache, with ForeignKey resolution between them still working, but no pollution of the global cache.

You can see most of the changes here: https://github.com/andrewgodwin/django/commit/104ad050 and https://github.com/andrewgodwin/django/commit/75bf394d

Graphs, Graphs Everywhere

Now the groundwork is laid and models are easily creatable at runtime, the next step is to move onto the migrator itself. This will eventually do three main jobs: parsing the available migrations into a big dependency graph, building up versioned models from those migration files, and running the migrations to change the database schema.

It's best to start at the base of all this, which is the dependency graph. This is what migration files get fed into as they're read off disk, and how we work out which migrations to apply to achieve our end goal.

Note

South just takes the filenames, ASCII-sorts them, and uses that as the dependency graph for an app.

I'm making a few changes compared to South's original model of this graph; in particular, there won't be implicit dependencies between adjacent numbers (the fact that 0004 depends on 0003 will be recorded in 0004's file) and it'll be possible to "rebase" an app's migrations (throw away historical ones and start afresh).

The numbering dependency decision is so VCS merges can be handled more gracefully - rather than just trying to see a "hole" in the dependency history, it'll be possible to detect that an app has two topmost migrations and prompt the user for action (either an automated rearrange to get a linear history or a manual merge).
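A rough sketch of the idea in plain Python, assuming each migration simply lists the names it depends on. The function names, the dictionary layout and the migration names are all invented for illustration; this is not the code in the branch, and it assumes the graph has no cycles.

```python
def leaf_migrations(dependencies):
    """Return the migrations that nothing else depends on, sorted by name.

    `dependencies` maps each migration name to the names it depends on.
    """
    depended_on = set()
    for parents in dependencies.values():
        depended_on.update(parents)
    return sorted(name for name in dependencies if name not in depended_on)


def linear_order(dependencies):
    """Return the migrations in an order that respects every dependency."""
    ordered = []
    seen = set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for parent in dependencies[name]:
            visit(parent)  # parents are always emitted before their children
        ordered.append(name)

    for name in sorted(dependencies):
        visit(name)
    return ordered


# Two developers each added a migration on top of 0002 in separate branches.
merged = {
    "0001_initial": [],
    "0002_add_email": ["0001_initial"],
    "0003_add_bio": ["0002_add_email"],
    "0003_add_avatar": ["0002_add_email"],
}

leaves = leaf_migrations(merged)
order = linear_order(merged)
```

More than one entry in leaves is the signal that a merge happened and the user needs to be prompted; with purely implicit numbering, both 0003 files would just look like neighbours in a sorted list.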

The "rebase" operation allows an app with a large number (say, 100) of historical migrations to get a new initial migration added at point 100 - in a way where old installations that are still below the new migration continue to run the old migrations, but any new installation just comes in straight away at migration 100 and runs the initial migration (and then perhaps continues up to 101, 102, etc.).
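One possible reading of that behaviour, sketched in plain Python. Everything here (the function name, its arguments, the migration names) is hypothetical and only meant to illustrate the old-installation versus new-installation split, not the design that will eventually land.

```python
def migrations_to_run(applied, old_chain, new_initial, newer):
    """Pick the migrations an installation still has to run.

    applied     -- set of migration names already applied on this installation
    old_chain   -- historical migrations, in order, that new_initial replaces
    new_initial -- name of the new initial migration
    newer       -- migrations, in order, that come after the new initial one
    """
    if applied & set(old_chain):
        # Existing installation: finish the old chain, skip the new initial.
        pending = [name for name in old_chain if name not in applied]
    elif new_initial in applied:
        pending = []
    else:
        # Fresh installation: come in straight away at the new initial migration.
        pending = [new_initial]
    return pending + [name for name in newer if name not in applied]


old = ["0001", "0002", "0003"]
fresh_install = migrations_to_run(set(), old, "0003_new_initial", ["0004"])
old_install = migrations_to_run({"0001"}, old, "0003_new_initial", ["0004"])
```

A fresh install runs only the new initial migration and whatever follows it, while an installation that is partway through the old history keeps going along the old path.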

Note

Since publication, and some suggestions, I've settled on "squash" for the name of this command.

Confusingly, the VCS-merge-automatic-inlining mechanism I outlined above is analogous to what git rebase does, while the rebase command does nothing like it. It's probably worth thinking of a better name for "adding a new initial migration to make tests and new installs faster" - suggestions welcome to @andrewgodwin!

Work on this is going on right now - I've taken a break from it to write this diary - and so next time we'll revisit it and see how it progressed, and if any problems appeared (I'm sure some will).

Also, I'll be giving a talk at DjangoCon EU next week titled "Migrating The Future", with all this kind of detail and more - I hope to see some of you there!

May 08, 2013

Meet DjangoCon Sprints sponsor!

From DjangoCon Europe on May 08, 2013 02:56 PM

Who doesn’t know yet that we’re doing DjangoCon Europe in a circus tent? But a circus is no place for DjangoCon Sprints! We were looking for a space that would accommodate 150 people and at the same time be located in the heart of Warsaw (it will be the weekend, so it’s better to be closer to the epicenter of parties, right?).

Thankfully, we found the perfect venue! It’s called GammaFactory - a startup hub/co-working center!! It is based in the.. cheese factory :) Ok, it was a cheese factory, but now it’s a place with a vibrant community around it. If you happen to be a rock climber - the biggest bouldering center in the centre of Warsaw is next door to GammaFactory! So don’t forget your climbing shoes!

HardGAMMA Ventures, which owns GammaFactory, is one of the leading early stage VC funds in Poland (

Visit their site for more info! http://www.hardgamma.com/  

They also run a startup accelerator program for technology entrepreneurs called GammaRebels. It focuses on accelerating and developing startups through mentoring, advising and sharing both business & technical knowledge.

See you soon! :)

May 07, 2013

Meet our Platinum Sponsor: New Relic!

From DjangoCon Europe on May 07, 2013 06:53 PM

Let me introduce you to New Relic, our Platinum Sponsor.

New Relic is an application performance management company. They offer web and mobile application monitoring with support for many languages, including Python, of course, as well as Ruby, PHP, .NET, Java, Android and iOS.

With their great product they make life easier for more than forty thousand clients monitoring a staggering 1.4 million application instances.

Using New Relic you can easily dive into your application performance breakdown and see what parts are slow. It’s very easy to use and you can install it in a matter of minutes, as we did for our conference website!

If that sounds interesting, be sure to check out Amjith’s talk during DjangoCon!

Be sure to check out their website at http://newrelic.com/

Making Django 1.5 compatible with django-bcrypt

From David Cramer on May 07, 2013 03:43 PM

Last night I took the opportunity to upgrade all of getsentry.com to Django 1.5. While most things were fairly trivial to sort out, we hit one less obvious (and pretty critical) bug during the migration surrounding django-bcrypt. This bug would only present itself if you’ve transitioned from …

May 06, 2013

In defense of <canvas>

From Adrian Holovaty on May 06, 2013 03:55 PM

My friend and fellow Chicagoan Evan Miller wrote an excellent blog post over the weekend: Why I Develop For The Mac. It's full of great reasons why his software (which is also excellent, by the way) was written for the desktop, despite the fact that he's a web developer, even the creator of an Erlang web framework.

But I'm compelled to respond to it, specifically his statements about <canvas>:

large <canvas> areas seem laggy on most browsers
So I'm left with <canvas>, and <canvas> is slow.

I have become intimately familiar with <canvas> while developing Soundslice. I'd even venture to say Soundslice is one of the most advanced uses of <canvas> on the web that's not a tech demo -- i.e., it's an application that normal people use. The site uses not one, but nine <canvas> elements stacked on top of each other to make a very rich UI, sort of like Photoshop for guitar tabs. (For a flashy demo of how those canvases interact, watch the tech talk I gave at 37signals, specifically the bit starting at 10:20.)

Here's what I've learned: <canvas> is not slow. In fact, I've been continually surprised by how fast it is -- as long as you take care to do things right. Evan's article mentions the "magical" sensation of instantaneous feedback; I invite you to play with the zoom slider on any Soundslice page (example) to experience this same magic, all drawn dynamically with <canvas>.

Of course, <canvas> is certainly not as fast as the lower-level drawing routines that you can use if you develop a desktop app. No question. But it's fast enough that, unless you're doing something relatively insane, you'll be totally fine.

On Soundslice, we're drawing guitar-chord charts completely on the fly (again, see an example), which is a relatively involved drawing routine -- and it's still near-instant performance. That's across all modern browsers (Chrome, Safari, Firefox and IE 10).

Here are some specific tips I've picked up to make <canvas> performance really shine.

Use requestAnimationFrame

Above all else, do this.

It's a JavaScript API designed to fix a very specific problem: your computer screen can only be redrawn a certain number of times per second (the "refresh rate"), so any calculations that redraw more often than your refresh rate are wasteful.

For example, say you have an event such as mousemove that results in a <canvas> redraw. A mousemove might happen hundreds of times per second, but your screen might only refresh, say, 75 times per second (75 Hz). That means, if your code is naively written, it will try to redraw several times within each actual opportunity to redraw (hundreds of times per second vs. 75 actual redraw opportunities per second).

The requestAnimationFrame API solves this by letting you say, "Execute this code the next time a redraw happens." Which saves your browser from having to do unnecessary work.

When I added this to Soundslice, the site became dramatically faster and more responsive. Here's more info about how to use the API.

Stack canvases

Above, I linked to a video of a tech talk I gave about Soundslice. In that talk, I demonstrated how Soundslice uses several different <canvas> elements, stacked on top of each other as layers, for maximum performance -- and for nice, clean code. Definitely watch the demo at around 10:20 in the video to get a sense of it.

I'm planning to write a separate blog post about this, but the Cliff's Notes version is that you can stack transparent <canvas> elements on top of each other so that you only have to redraw the ones that need to change.

For example, on Soundslice, there's a separate <canvas> for the playhead -- the vertical orange line that tracks the currently played moment of the video. That's a separate <canvas> with a z-index above the other ones, so that redrawing it doesn't require redrawing any of the other stuff. The less you have to redraw, the better.

Bunch calls to fillStyle

When you draw on <canvas>, you first have to tell it which color you're using. You can do that by setting the "fillStyle." It turns out that, each time you change the fillStyle, there's a slight performance penalty. Therefore, you can squeeze out some extra performance by bunching your calls to fillStyle -- that is, rather than drawing a gray thing, then an orange thing, then a gray thing again, you should draw all the gray things, then draw all the orange things.

For example, Soundslice, which is all about annotating YouTube videos, needs to draw dozens, sometimes hundreds, of annotations on the screen at a time. Each annotation might use several different colors -- the text color, the border color, the line color, etc.

My original implementation looped over each annotation and drew each one independently, which resulted in two to five fillStyle calls for each annotation. I changed this to bunch the fillStyles across all annotations -- so that all of the light grays were drawn at the same time, then all the dark grays, etc. -- and the drawing got a few dozen milliseconds faster.
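The effect is easy to see by counting state changes. The sketch below is a language-neutral simulation written in Python, not canvas code: draw calls are just (color, label) pairs, and the function name and sample data are made up.

```python
def count_style_changes(draw_calls):
    """Count how often the fill color changes across a sequence of draw calls."""
    changes = 0
    current = None
    for color, _label in draw_calls:
        if color != current:
            changes += 1
            current = color
    return changes


# Three hypothetical annotations, each drawn with the same three colors.
per_annotation = [
    (color, "annotation %d" % index)
    for index in range(3)
    for color in ("lightgray", "darkgray", "orange")
]

# Same work, reordered so that each color is drawn in one batch.
# sorted() is stable, so calls keep their relative order within a color.
bunched = sorted(per_annotation, key=lambda call: call[0])

changes_before = count_style_changes(per_annotation)
changes_after = count_style_changes(bunched)
```

In this toy case the number of color changes drops from nine to three. The catch is that reordering changes which shapes are painted on top, so it only works where overlapping order doesn't matter.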

For more background, see the "Avoid unnecessary canvas state changes" section in this great HTML5 Rocks article.

Cache text rendering

In profiling, I've found that rendering text on <canvas> is my next big rendering-related bottleneck on Soundslice. I haven't done this yet, but I'm planning to come up with a way of caching the results of fillText, possibly using this technique.

Final thoughts

A decent argument in Evan's favor is: "Well, if <canvas> is only fast if you use these various hacks, it's not really fast, then, is it?"

Two thoughts on that.

First, well, sure! I'd love it if <canvas> was super fast right out of the box, without needing to use these techniques. No doubt about it. But the reality is, it is fast enough, if you put in the work.

Second, there's the bigger question -- a defining question for the current generation of web developers -- which is: web or native app? I am squarely in the web camp, both for philosophical reasons (such as openness) and practical reasons (such as the fact that Soundslice has only one developer and one designer, and we can't justify building separate apps for separate platforms).

What I love about <canvas> is that it lets us make desktop-quality apps right in the browser, so we can get the benefits of being "of the web" along with the benefits of amazing, fast graphics. Fear not, my friends: <canvas> is great.

UPDATE, May 7, 2013: Evan has posted a thoughtful follow up, reacting to this.

SendGrid Party!

From DjangoCon Europe on May 06, 2013 12:06 PM

Guess who is throwing the best party of 2013?:) 

We’re super excited to announce that SendGrid is a part of our DjangoCircus family!! :) Moreover, they were one of the first companies to back us, making all this possible! Kudos to all Sendgriders, especially to Swift :)

We’re planning a BBQ with Polish KIEŁBASA ;), a lot of great beer and... the DjangoCon FlipCup tournament! The party will take place in the circus and, if the weather is good, hopefully outside as well. Join us on Friday night at 7pm.

SendGrid is the leader in email deliverability. SendGrid’s cloud-based platform increases email deliverability, provides actionable insight and scales to meet any volume of email, relieving businesses of the cost and complexity of maintaining custom email infrastructures. 

For more information, visit www.sendgrid.com.

See you at the SendGrid Party!

Invitation to the Django-UserGroup Hamburg on May 8

From Arne Brodowski on May 06, 2013 06:40 AM

The next meeting of the Django-UserGroup Hamburg takes place on Wednesday, 08.05.2013, at 19:30. This time we are meeting again at the offices of intosite GmbH, Poßmoorweg 1 (3rd floor), 22301 Hamburg.

From now on, the Django-UserGroup Hamburg is organized via Meetup. To be notified automatically about future meetings, please join our Meetup group: http://www.meetup.com/django-hh

Since a projector is available at the venue, every participant has the opportunity to give a short talk (format: lightning talk or a bit longer). In our experience, the actual talks come together on the spot.

As always, everyone who is interested in exchanging ideas with other Djangonauts is invited. Registration is not required, but it helps with planning.

Further information about the user group is available on our website, www.dughh.de.

May 05, 2013

Warsaw Survival Guide

From DjangoCon Europe on May 05, 2013 04:59 PM

DjangoCon is in less than two weeks. Are you already excited? We definitely are!

Since a lot of you will be in Poland for the first time in your life, we prepared some tips for you. We really want to make sure you will enjoy being in Warsaw this year!

We already covered some basics here: http://djangocircus.com/getaround/ (taxis, buses, trains, sim cards, internet).

Here’s some more.

Currency

Polish currency is PLN (złoty). 1 PLN is around €0.24 or $0.32. You will find a lot of ATMs (bankomat) in Warsaw, so you don’t need to carry a lot of cash. Most shops accept debit or credit cards (MasterCard, Visa, Visa Electron, Maestro, PolCard etc. are widely used), but some places accept only cash, or accept cards only when you spend more than 10 or 20 PLN.

You can also exchange money in a bank or in a kantor (you will find a lot of kantors in the city center). If possible, ask for banknotes in lower denominations (10, 20, 50).

Power plugs

In Poland we use Type E power plugs. If you use a different kind of power plug, remember to buy an adapter. We also have an extra tip for you: if you bring a power strip, you’ll be able to charge a lot of your devices using only one adapter :).

Tap water

If you ask Poles whether Polish tap water is good enough to drink, they will tell you that you shouldn’t drink it. However, that’s no longer true. According to the newest water tests, tap water in Warsaw is safe to drink (we cannot promise that is the case in other places in Poland). It is probably not the tastiest water in the world though, so in our opinion bottled still water, which you can buy in many shops around Warsaw, is a better option.

Food

Make sure to try a little bit of Polish cuisine while you are in Warsaw. Some Polish specialities are: bigos, pierogi, gołąbki, barszcz, żurek, zupa ogórkowa and many, many others.

Language

Young people in Warsaw speak English quite well. People 50+ may have some trouble understanding English, though that’s not a rule; in that case, try German or Russian if you know one of those languages. In most restaurants and pubs (and in many shops) people speak English, so we are sure you’ll be fine.

If you want to know how to pronounce Polish words (for example “Służewiec”) here are some useful pages you can read: Polish alphabet, Polish phonology :).

Safety

Be sure to watch your belongings while you are in a crowded place. There are pickpockets in many places. Losing your wallet and documents won’t be a perfect memory from Poland.

May 04, 2013

New Committers for Tastypie & Haystack

From Daniel Lindsley on May 04, 2013 08:18 AM

New Committers for Tastypie & Haystack

Dynamic Fixtures

From Inka Labs on May 04, 2013 01:54 AM

The Django unit test framework is great: it allows you to test everything you code in Python. The "TestCase" class also supports fixtures. Most people write fixtures as XML or JSON and load them in every test. This can make running the test suite a very, very slow process. Keep in mind that every time you run a test, the fixtures are loaded into the database, then deleted, and then loaded again by the next test.

We had a similar problem some weeks ago. We had all our fixtures in JSON files (some of them were huge), so we decided to drop those fixtures and create our objects (per test) in pure Python. Something like:

 

obj = SomeModel.objects.create(
    attr1=1,
    attr2=2,
    # ... remaining fields
)
related = OtherModel.objects.create(
    fk=obj,
    other_field=...,
)

# use related and obj

But, let's be realistic, this is very boring and tedious. We don't want to spend so much time creating database objects. Also consider what happens when you add a required field to a model: you then have to go through all the tests that use that model and add a value for that field, which makes your tests difficult to maintain.

This is where django-dynamic-fixture comes to the rescue. This great application lets you create database objects in Python very easily.

from django_dynamic_fixture import G

# create a database object with random data
obj = G(SomeModel)

# Give some field values
# and fill the remaining fields with random data
obj = G(SomeModel, attr1="field1", attr2="Field2")

# ignoring fields
obj = G(SomeModel, ignore_fields=['field1'])


This is just a very small example of all the features of this great application. If you want your tests to run fast and be maintainable, you should give it a try. Check the docs here.
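To illustrate why auto-filled objects keep tests maintainable, here is a minimal pure-Python sketch of the general idea. It is not how django-dynamic-fixture is implemented and it does not touch a database. The `Author` class and the `make` helper are hypothetical names, and only `int` and `str` fields are handled.

```python
import dataclasses
import random
import string
from dataclasses import dataclass


@dataclass
class Author:
    # Hypothetical stand-in for a model with three required fields.
    name: str
    age: int
    email: str


def _random_value(field_type):
    """Return a throwaway value for the given field type."""
    if field_type is int:
        return random.randint(0, 1000)
    if field_type is str:
        return "".join(random.choices(string.ascii_lowercase, k=8))
    raise TypeError(f"no random generator for {field_type!r}")


def make(cls, **overrides):
    """Build an instance, filling every field the caller did not specify."""
    values = {}
    for field in dataclasses.fields(cls):
        if field.name in overrides:
            values[field.name] = overrides[field.name]
        else:
            values[field.name] = _random_value(field.type)
    return cls(**values)


# The test only names the field it cares about.
author = make(Author, name="Ada")
print(author.name)
```

If a fourth required field were added to `Author` later, the `make(Author, name="Ada")` call would keep working unchanged, which is the maintenance benefit described above.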

May 03, 2013

Tools we used to write Two Scoops of Django

From Daniel Greenfeld on May 03, 2013 05:00 PM

Because of the ubiquitousness of reStructuredText in the lives of Python developers and the advocacy of it, it's not uncommon for people to assume we used it to write our book. However, that's not really the case.

The short answer is that we used:

  • reStructuredText (RST)
  • Google Documents
  • Apple Pages
  • LaTeX

The long answer is the rest of this posting. Since writing the book was broken up into three major stages ('alpha', 'beta', and 'final'), I have broken up this blog article the same way.

Alpha Days

Some of the original alpha material was written in rough draft form as RST since it was what we were used to using. Unfortunately, the PDF generation wasn't to our liking, so we immediately began looking at other options. Since she enjoyed using it at MIT and because it gave us greater individual control, Audrey wanted to switch to LaTeX. I was worried about the challenges of learning LaTeX, so we compromised and moved to Google Documents.

For the most part, Google Documents was great in the early stages. The real-time collaborative nature was handy, but the gem was the comment system. It gave us the ability to have line-by-line written dialogues with our technical reviewers. However, Google Documents made it nigh-impossible to use WYSIWYG editor styles or add better print fonts, forced us to cut-and-paste code examples, and had a PDF export system that was flaky on our massive document.

Our original thought was to convert the Google Document output to PDF and then modify it with Adobe InDesign. Upon trying it, we found it had a lackluster user interface that had a steep learning curve and was prohibitively expensive ($550-$700). Our friend and reviewer, Kenneth Love of Getting Started with Django fame, offered to do the conversion work, but we wanted to be able to update our work at will. Awesome as Kenneth might be, we couldn't expect him to drop what he was doing to update the final output of our work whenever we wanted.

Therefore, what we did in the week of January 10th-16th was convert the book to Apple Pages, which is the word processor in Apple iWork. This was as painful as it sounds. We also discovered the day before launch that Apple Pages doesn't create a sidebar PDF table of contents, which a lot of people enjoy (including ourselves). Tired and exhausted from weeks of 16-hour days, we launched anyway on January 17th with the book weighing in at 5.1 MB.

Beta Experiences

People were so positive it really gave us a boost. Hundreds of people sent us feedback and we were delighted beyond words, with a significant portion sending us commentary/corrections about our writing and code. I'll admit I did get tired of a certain 'moat' mistake, since I got corrected on it over 50 times. However, the number of code corrections we were getting was higher than expected. It was clear we needed to be able to import the code modules from testable chunks of real code. We had so many Kindle/ePub requests that we also needed the ability to render the text attractively across multiple formats.

After stumbling through three different tools (RST, Google Documents, and Apple Pages), I finally agreed with Audrey that the challenge of learning LaTeX was worth it. While we could have used RST, we would have had to use LaTeX anyway for our customizations, since when RST is converted to PDF it actually uses an interim step of LaTeX!

So while I handled the corrections and feedback from thousands, Audrey built the fundamentals of the LaTeX file structure. Audrey really got her hands dirty by teaching me LaTeX, since my brain is slow and thick. Here's a sample of what I've learned how to do, taken from Chapter 6, Section 1, Subsection 5 (6.1.5):

\subsection{Model Inheritance in Practice: The TimeStampedModel}
It's very common in Django projects to include a \inlinecode{created} and \inlinecode{modified} timestamp field on all your models. We could manually add those fields to each and every model, but that's a lot of work and adds the risk of human error. A better solution is to write a \inlinecode{TimeStampedModel} \index{TimeStampedModel} to do the work for us:

\goodcodefile{chapter_06/myapp/core/timestampedmodel.py}

Take careful note of the very last two lines in the example, which turn our example into an abstract base class: \index{abstract base classes}

\goodcodefile{chapter_06/myapp/core/class_meta.py}

By defining \inlinecode{TimeStampedModel} as an abstract base class \index{abstract base classes} when we define a new class that inherits from it, Django doesn't create a \inlinecode{model\_utils.time\_stamped\_model} table when syncdb is run.

Once I got the hang of LaTeX, then began the hard work of converting the book's current content from Apple Pages. That was a couple of weeks of grueling effort on my part. Daily I would request new LaTeX customizations, which Audrey would address. However, as she was working on literally rewriting the content of a dozen chapters, including templates, testing, admin, and logging, my interruptions became an issue. So we enlisted the help of Italian economist and LaTeX expert Laura Gelsomino. Thanks to her the desired text formatting was achieved.

During the conversion process we also rewrote every single code example, putting them into easily testable projects and pulling them into the book via custom LaTeX commands called \goodcodefile{} and \badcodefile{}.
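One way a setup like this could be sanity-checked is sketched below. This is only an assumption about how such a check might look, not the book's actual tooling: the function names and the `code_root` parameter are hypothetical, and `compile()` only catches syntax errors, so it is no substitute for running the example projects' tests.

```python
import re
from pathlib import Path

# Matches \goodcodefile{path} and \badcodefile{path} in LaTeX source.
CODEFILE_RE = re.compile(r"\\(goodcodefile|badcodefile)\{([^}]+)\}")


def referenced_code_files(tex_source):
    """Return (command, path) pairs for every code-file command in the text."""
    return [(m.group(1), m.group(2)) for m in CODEFILE_RE.finditer(tex_source)]


def check_code_files(tex_source, code_root):
    """Return a list of problems: missing files, or good examples that fail to compile."""
    problems = []
    for command, rel_path in referenced_code_files(tex_source):
        path = Path(code_root) / rel_path
        if not path.is_file():
            problems.append(f"{rel_path}: missing")
            continue
        # Bad examples may be intentionally broken, so only good ones are compiled.
        if command == "goodcodefile":
            try:
                compile(path.read_text(encoding="utf-8"), str(path), "exec")
            except SyntaxError as exc:
                problems.append(f"{rel_path}: syntax error on line {exc.lineno}")
    return problems


sample = r"\goodcodefile{chapter_06/myapp/core/timestampedmodel.py}"
print(referenced_code_files(sample))
# prints [('goodcodefile', 'chapter_06/myapp/core/timestampedmodel.py')]
```

Running `check_code_files` over each chapter's source before building the PDF would flag a renamed or deleted example file early, instead of at typesetting time.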

Eventually I joined Audrey on rewriting and reviewing chapters and on February 28th, the beta was launched. LaTeX generates lean PDFs so the book came in at just 1.6 MB while adding a whopping 50 pages (25% more) of content.

Final Efforts

The final effort was focused on cleanup, new formats, presentation, and art.

For cleanup, our amazing readers gave us so much feedback we could barely keep up. We fought to keep our dialogues with them personal yet brief. With reader oversight we corrected many of the 'quirks' of my writing style (Audrey is a stickler for Strunk and White, I am not). We also made numerous corrections based on feedback and our own observations.

With the guidance of fellow Python author Matt Harrison I wrote scripts that took the archaic HTML generated by the LaTeX module tex4ht and rendered it into something that Kindlegen could use to generate Kindle .mobi files. At first the results looked awesome on modern Kindles and other new ebook readers, but were terrible on older devices. So I toned back the fancy stuff to what you see today. Getting technical books to look nice on all readers is really, really hard - and unfortunately some publishers take shortcuts that hurt the efforts of the authors. If you have a problem with an e-book's format, please consider that before writing a negative review about the final output.

Speaking of mobile editions, we also wrote a second version of each Python example to deal with the smaller format. While libraries exist to do the work for you, since I did a lot of it from scratch (albeit coached by Matt) I had to dig into the lackluster .mobi/.epub documentation to figure out things like .ncx files.

note: If you want to be the self-published author of a technical book I strongly recommend you read Matt's Ebook Formatting: KF8, Mobi & EPUB. Also check out his rst2epub2 library for converting RST files to various formats.

While I worked on the mobile editions, Audrey focused on the print version and adding more art and a tiny bit of new content. She focused on clarity and flow, and the result is that the book feels even lighter to read and yet is dense with useful information. To test how the book launched, she would order a copy from the printer and wait several days for it to arrive. Then she would inspect the cover and interior with her incredibly exacting eye. It's a slow process, but Audrey wanted to make absolutely certain our readers would enjoy and use the print edition.

On April 10th we launched the final in PDF, Kindle, and ePub form. The PDF weighs in at 2.7 MB, and the Kindle file is a bit heavier. At some point we'll do the work to reduce file size, but for now we're working on other things.

A week later we announced the launch of the print version of the book. People seem to really like the design and feel of the physical book, and we've even had requests for t-shirts.

Thoughts

Writing a technical book was really hard. Crazy hard. Also very satisfying. We could have made more money doing just client work, but this was a dream come true. Sometimes money doesn't matter.

Whither Two Scoops of Django?

Two Scoops of Django: Best Practices for Django 1.5 will still receive periodic corrections, but won't see new content unless it's security related for Django 1.5. Don't worry though, for when Django 1.6 comes nigh, we'll commence work on Two Scoops of Django: Best Practices for Django 1.6 (TSD 1.6). The plan is to update practices as needed and hopefully add more content on testing, logging, continuous integration, and more. Like its predecessor, TSD 1.6 will be written using LaTeX.

That said, if I ever fulfill my dream of writing fiction I'll just use Matt Harrison's rst2epub2 library.

Concerns about Open Sourcing

We've considered open sourcing our current book generation system, but installation is rather challenging and requires serious Audrey/Laura-level LaTeX knowledge combined with my experience with Python. Unfortunately, from our experience on managing other open source projects, dealing with requests for documentation and assistance would take up a prohibitive amount of our time. Honestly, we would rather write another book or sling code.

Book Generation as a Service?

Another option is turning our system into a service, which would convert existing RST or even Markdown to LaTeX so it could generate books in the Two Scoops format. Doing this would require at least a month of full-time work on both of our parts, and we have no idea as to the interest level. We think it would be a low amount of interest, but then again, hasn't Leanpub done pretty well using this model of business?

In any case we're working on other projects. Maybe even a new technical book...