

Petone-Grenada Link Road

Posted in general by Olly Betts on 2014-09-18 18:29 | Permalink

Xapian GSoC 2014 Projects

Accepted GSoC students were announced on 21st April, but I was away on holiday last week, and have only just had a chance to write up a blog post about this.

We received 30 student proposals for Xapian this year, and Google allocated us six slots (the same as we had in 2012).

We had four particularly strong proposals for the "Learning to Rank" project idea, so we decided to create a second project adding more algorithms, to complement the project sketched out in our ideas list.

Congratulations to the chosen six:

Sorry to those we weren't able to select this year - we had to make some difficult decisions during the selection process, and we really appreciate the time you spent writing your proposal, working on patches, and on the rest of the application process. We'd encourage you to remain involved with Xapian, and to apply to us again next year if you're still eligible for GSoC.

If any applicants would like more specific feedback on their applications, please just come and ask us.

Posted in xapian by Olly Betts on 2014-04-30 15:10 | Permalink

Analysis of Xapian GSoC 2014 Applications

As I said in my earlier post, we received 31 proposals from students (ignoring 2 duplicates withdrawn by students). On closer inspection, we spotted another duplicate, so discounting that, here is how the remaining 30 proposals break down by project idea:

  • 10 - Clustering of search results
  • 5 - Learning to Rank
  • 5 - Weighting Schemes
  • 2 - Postlist encodings
  • 2 - Improve Java bindings (one with PHP bindings too)
  • 1 - Gmane search improvements
  • 1 - Testsuite Improvements
  • 1 - Performance/Relevance testing and optimization of DFR
  • 1 - Social Media Product Analyzer
  • 1 - Web application for fast image search
  • 1 - Improving Arabic Support + Python Binding Improvements

In the above list, italics indicate ideas or parts of ideas which were suggested by the student, rather than coming from our ideas list.
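
As a quick sanity check on the breakdown above, the counts can be totalled in a few lines of Python. This is only an illustrative sketch: the dictionary name and layout are my own, and the numbers are copied directly from the list.

```python
# Proposal counts per project idea, copied from the list in the post.
proposals_by_idea = {
    "Clustering of search results": 10,
    "Learning to Rank": 5,
    "Weighting Schemes": 5,
    "Postlist encodings": 2,
    "Improve Java bindings (one with PHP bindings too)": 2,
    "Gmane search improvements": 1,
    "Testsuite Improvements": 1,
    "Performance/Relevance testing and optimization of DFR": 1,
    "Social Media Product Analyzer": 1,
    "Web application for fast image search": 1,
    "Improving Arabic Support + Python Binding Improvements": 1,
}

# Total number of proposals after discounting duplicates.
total = sum(proposals_by_idea.values())

# Combined count for the three most popular ideas.
top_three = sum(sorted(proposals_by_idea.values(), reverse=True)[:3])

print(f"{total} proposals in total, {top_three} of them for the top three ideas")
```

The total agrees with the 30 remaining proposals mentioned above, and the three most popular ideas account for 20 of them.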

As in 2012, the most popular ideas from our suggested ideas list are those with the closest connections to Information Retrieval theory. I think the clustering idea also seems very accessible, which is why it's been so popular (it was only added to the list shortly before student applications opened, as we'd already seen signs that "Learning to Rank" and "Weighting Schemes" were likely to be very popular).

There's also a wider spread in quality for the clustering proposals (perhaps also due to the accessibility of that idea), so don't despair if you're a student who applied for a clustering project.

And generally, if we have more than one great proposal based on the same project idea, we may accept more than one of them - we don't want to duplicate effort, but it's often possible to adjust the scopes to produce projects which don't overlap.

Posted in xapian by Olly Betts on 2014-03-27 16:00 | Permalink

Xapian GSoC Applications for 2014

Student applications for GSoC closed a few hours ago, and here are some initial stats on the proposals we received for Xapian (for comparison, see my blog posts for 2011 and 2012).

We received a total of 31 applications this year - here's a graph of total applications received against time:

Graph of student applications to Xapian in GSoC 2014

If you're an admin or a mentor, you can produce a similar graph for your own org(s) - just download this OpenDocument spreadsheet and follow the instructions inside.

Of the 31, 18 were submitted in the last 12 hours, with the latest submission a rather brave 99 seconds before the deadline.

The total number is lower than the 42 and 41 we received in previous years, but in a quick skim through I didn't see anything we'd immediately discount as a spam proposal and mark as invalid. So that 31 is more comparable with the numbers after removing spam from previous years (which were 33 and 30).

I suspect the improved quality and the even more marked spike as the deadline nears may be due to the new requirement that students upload proof that they are enrolled before they can submit a proposal.

Posted in xapian by Olly Betts on 2014-03-22 12:35 | Permalink


Slide 5 from my nostalgia-fest "The Art of Writing Small Programs" from just under two years ago:

The Art of Writing Small Programs - Slide 5

XKCD 1275 from last week:

xkcd 1275 - INT(PI)

(Of course that should be INT SQR EXP PI/INT PI * PI * R ** INT PI).
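
For anyone who wants to check the arithmetic, here is a minimal Python sketch of how that expression evaluates. It rests on two assumptions of mine rather than anything stated in the post: that the expression is read with Sinclair BASIC-style precedence (functions bind tighter than operators, and `**` binds tighter than `*` and `/`), and that it is meant to be the volume of a sphere of radius R. The function names are made up for illustration.

```python
import math


def volume_pi_only(r):
    """Evaluate INT SQR EXP PI/INT PI * PI * R ** INT PI step by step."""
    # INT SQR EXP PI: e**pi is about 23.14, its square root about 4.81, INT gives 4.
    four = math.floor(math.sqrt(math.exp(math.pi)))
    # INT PI: pi is about 3.14, INT gives 3.
    three = math.floor(math.pi)
    # Assumed precedence: ** first, then / and * from left to right.
    return four / three * math.pi * r ** three


def volume_plain(r):
    """The same formula written with ordinary numeric literals."""
    return 4 / 3 * math.pi * r ** 3


# The two forms should agree for any radius.
print(volume_pi_only(2.0), volume_plain(2.0))
```

Under those assumptions the only numeric constant that appears anywhere in the expression is PI, with both the 4 and the 3 derived from it.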

Posted in general by Olly Betts on 2013-10-18 18:01 | Permalink

Debian GSoC Applications for 2013

I've produced a graph of the 61 student applications which Debian received for GSoC this year:

Graph of student applications to Debian in GSoC 2013

Ana blogged a similar graph last year if you want to compare. It looks like the total is down a little (though I'm not sure whether the figure of 81 from the text or the ~68 read from the graph is correct for last year). This is likely at least partly due to the number of proposals each student can send having been reduced from 20 last year to 5 this year, which should have reduced the number of low-quality proposals. The timeline this year is later, which may also have had an effect.

If you're an admin or a mentor, you can produce a similar graph for your own org(s) - just download this OpenDocument spreadsheet and follow the instructions inside.

Posted in debian by Olly Betts on 2013-05-06 13:36 | Permalink