Summer of Code Program

Student Project Ideas

[These are ideas for 2005. If you're interested in the Summer of Code, see project ideas for 2006 instead.]

Get paid by Google to write open-source code this summer! The deadline for submitting an application was June 14, 2005; decisions on applications already submitted should be reached by June 24, 2005. The following are ideas for student projects for Google's Summer of Code (other projects are possible, depending on individual interest). Internet2 is proud to be a mentoring organization for the initiative.

To ask a question or to clarify anything, contact us. If nothing on the list below is a good match for what you want to do, but you think that Internet2 could mentor you, get in touch as well. (Contact information is below.)

To apply:

  1. Select a project;
  2. Read the FAQ;
  3. Read the Internet2 intellectual property framework, which describes acceptable licensing;
  4. Write your application, which needs to include:
  5. Submit the application directly to Google Code, selecting Internet2 as your mentoring organization.
  6. Check this page every now and then before the deadline for any new information. Remember: you can always revise your proposal and resubmit, so keep a copy and keep improving it.

Try to be as specific in the application as you can. Describe what you want to do and why you can do it.


Q: Is project X taken?
A: This is a competitive proposal process. We don't even (yet) see the applications filed with Google, and we don't know how many applications are filed for a particular project. Duplication of effort is allowed; that is, if you make a strong case in your proposal, it is likely to be funded regardless of the number of competing proposals. What matters is the quality of your proposal, not the number of other people who want to do the same thing. That said, if you design your own project (the first "project" on the list), you will have little competition, so there's almost no chance that your proposal would be rejected only because of the number of other similar proposals. To repeat: just write a good proposal and put it in. If two projects interest you, submit a proposal for each.

Q: My university is not an Internet2 member (usually because I am not in the U.S.). Can I still apply?
A: Definitely. Google sets the rules of who is eligible, and there's nothing about Internet2 membership in the rules. You won't, in any way, be at a disadvantage.

Q: Do I need to check with you before I file the application with Google?
A: Feel free to check, but it is not required. If you have no questions, there is little to discuss at this stage. Unfortunately, we can't help you write the proposal; that part you'll need to do yourself. The list of things you could include in the proposal is above. If you do have any questions, don't be shy. You can write a better application if you're clear about the project.

Q: I can't think of anything for item X on the list of things to include in the proposal. What do I do?
A: Think again. Still nothing? Omit it and go on. Make the proposal as detailed as you can, but don't obsess about items or checklists. We need to know what you want to do and why you think you can do it.

Q: I am a high-school student. Can I apply?
A: Sorry, but high-school students are not eligible. If everything goes well this time, it's possible that Google might repeat the exercise in the future when you're in college. Not this time, though. You can still work on open-source code this summer; this will improve your karma, programming skills, chances of getting any future open-source stipends, and chances of getting a good job. (NB: this answer should not be used as any grounds for speculation that there might be a next time. We, as a mentoring organization, have no inside information pertaining to this. Only Google would know.)

Bulk Transport Projects

Contact: Stanislav Shalunov.

Design your own project

This is not an actual project idea, but it can be used to create one.

Read the document Design Space for a Bulk Transport Tool. Consider your abilities. Is there a portion of this you'd like to do? The Internet2 Bulk Transport working group would help any student working on the problem.

Timekeeping using TSC register

Many architectures, including i386 and amd64, provide a register, called TSC on these two architectures, that increments every CPU clock cycle. This register is usually used for profiling and related tasks that require only a relative measure of time. However, such a register also allows very precise timekeeping without context switching. Normally, obtaining the time from the kernel using, e.g., the gettimeofday() system call involves two context switches (to and from the kernel). With an appropriate library, one could have a replacement call that obtains absolute time entirely in user space, eliminating the roughly 10,000 cycles of overhead associated with obtaining a timestamp through the kernel. In addition, the lack of a context switch when getting the time means a decreased probability of losing the execution context, which is important for applications that, after obtaining a timestamp, send it through the network or use it to create a record on disk.
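
As a rough illustration of reading the counter without entering the kernel, the sketch below uses the __rdtsc() intrinsic that GCC and Clang provide in <x86intrin.h>. The helper name read_ticks() is made up for this example, and the fallback to clock_gettime() on other architectures is an assumption, not something taken from thrulay or from an existing library. The value returned is a raw tick count, not absolute time.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
#endif

/* Hypothetical helper: return a raw tick count.
 * On i386/amd64 this reads the TSC entirely in user space.
 * Elsewhere it falls back to the monotonic clock (in nanoseconds),
 * which does go through the usual system interfaces. */
static inline uint64_t read_ticks(void)
{
#if defined(__x86_64__) || defined(__i386__)
    return __rdtsc();
#else
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
#endif
}
```

Two successive calls to read_ticks() bracket a piece of code and give its cost in ticks; turning a tick count into wall-clock time is the harder part of the project.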

The frequency of updates of the TSC register can drift slowly; in the short term, the drift is mostly caused by temperature changes, which, in turn, depend on the pattern of CPU use and, as such, are not easily predicted. The frequency of updates can also change abruptly when the CPU frequency itself changes (due to, e.g., a power management event); note that a CPU frequency change does not necessarily change the frequency of TSC updates. Both of these situations need to be considered and covered. Accommodating frequency drift is, in essence, the same problem that an NTP client with a single server solves; consequently, the solution will probably be similar. One complication is that NTP applies its time adjustments to the clock itself; since the TSC frequency or offset cannot easily be changed by a user-space application, the adjustments would need to be stored elsewhere, in a conversion table. The second hurdle, frequency steps, should be relatively easy to detect with simple sanity checks; the action to take after a step is detected still needs to be decided.
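
One way to picture the conversion table and the sanity check is sketched below. Everything in it is an assumption made for illustration: the struct layout, the names tsc_to_ns() and tsc_recalibrate(), the use of simple exponential smoothing for drift (this is not NTP's actual clock discipline algorithm), and the policy of adopting the new rate outright when a step is suspected. It also re-bases the table at each calibration point, which a real implementation would likely want to smooth to avoid time jumping.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical conversion record: maps a raw TSC value to absolute time. */
struct tsc_conv {
    uint64_t base_tsc;     /* TSC value at the last calibration point */
    int64_t  base_ns;      /* absolute time in ns at that point */
    double   ns_per_tick;  /* current estimate of 1 / (TSC frequency) */
};

/* Convert a raw TSC reading to absolute time using the table. */
static int64_t tsc_to_ns(const struct tsc_conv *c, uint64_t tsc)
{
    /* Signed difference, so readings taken slightly before base_tsc work. */
    int64_t dticks = (int64_t)(tsc - c->base_tsc);
    return c->base_ns + (int64_t)((double)dticks * c->ns_per_tick);
}

/* Update the table from a new (TSC, reference time) pair.
 * gain:     smoothing factor in (0, 1] applied to slow drift
 * step_tol: relative rate change above which a frequency step is suspected
 * Returns 1 if a step was suspected, 0 otherwise. */
static int tsc_recalibrate(struct tsc_conv *c, uint64_t tsc, int64_t ref_ns,
                           double gain, double step_tol)
{
    assert(tsc > c->base_tsc);
    double observed = (double)(ref_ns - c->base_ns)
                    / (double)(tsc - c->base_tsc);
    double rel = (observed - c->ns_per_tick) / c->ns_per_tick;
    int step = (rel > step_tol || rel < -step_tol);

    if (step)
        c->ns_per_tick = observed;  /* abrupt change: adopt the new rate */
    else
        c->ns_per_tick += gain * (observed - c->ns_per_tick);  /* drift */

    c->base_tsc = tsc;
    c->base_ns = ref_ns;
    return step;
}
```

For example, with a table of {1000, 5000, 0.5}, a reading of 3000 ticks converts to 6000 ns; a later calibration pair that implies 1.0 ns per tick would be flagged as a step under a 1% tolerance.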

The conversion table discussed above should be the same for all processes running on the same machine; otherwise, processes with different views of time can produce results that are difficult to predict and debug (consider, just as an example, the case of make). In addition, it is advantageous to keep the conversion table around and to continually refine the coefficients in the same way NTP does. This necessitates a daemon and a means to distribute its conversion table, via some IPC mechanism, to all the processes that are using the library.
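
The text leaves the IPC mechanism open. One candidate, shown here purely as an assumption, is a shared memory region (for instance, one mapped with shm_open() and mmap()) guarded by a sequence counter, so that readers never block the daemon and never see a half-written table. The sketch uses C11 atomics; the struct and function names are hypothetical, the rate is stored as a fixed-point integer only to keep every field a plain 64-bit atomic, and the code assumes a single writer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Plain copy of the table, as seen by library users. */
struct conv_snapshot {
    uint64_t base_tsc;
    int64_t  base_ns;
    uint64_t ns_per_tick_q32;  /* ns per tick, fixed point, scaled by 2^32 */
};

/* Layout of the region the daemon would place in shared memory. */
struct conv_shared {
    atomic_uint      seq;      /* odd while the daemon is mid-update */
    _Atomic uint64_t base_tsc;
    _Atomic int64_t  base_ns;
    _Atomic uint64_t ns_per_tick_q32;
};

/* Daemon side (single writer): publish a new table. */
static void conv_publish(struct conv_shared *s, const struct conv_snapshot *v)
{
    unsigned seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

    atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);  /* odd */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&s->base_tsc, v->base_tsc, memory_order_relaxed);
    atomic_store_explicit(&s->base_ns, v->base_ns, memory_order_relaxed);
    atomic_store_explicit(&s->ns_per_tick_q32, v->ns_per_tick_q32,
                          memory_order_relaxed);
    atomic_store_explicit(&s->seq, seq + 2, memory_order_release);  /* even */
}

/* Library side: retry until a consistent copy has been read. */
static struct conv_snapshot conv_read(struct conv_shared *s)
{
    struct conv_snapshot v;
    unsigned before, after;

    do {
        before = atomic_load_explicit(&s->seq, memory_order_acquire);
        v.base_tsc = atomic_load_explicit(&s->base_tsc, memory_order_relaxed);
        v.base_ns = atomic_load_explicit(&s->base_ns, memory_order_relaxed);
        v.ns_per_tick_q32 = atomic_load_explicit(&s->ns_per_tick_q32,
                                                 memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        after = atomic_load_explicit(&s->seq, memory_order_relaxed);
    } while ((before & 1u) || before != after);

    return v;
}
```

The appeal of this design is that a reader pays only a few loads per timestamp, which keeps the user-space call cheap; other mechanisms (a socket, a file re-read periodically) would also satisfy the requirement as stated.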

A very limited and naive shim implementation of this idea can be found in the source code for thrulay (below) in files tsc.h and tsc.c. That implementation, of course, lacks practically all of the features discussed above; however, it might still be interesting to look at or perhaps even start with.

The library must, at minimum, work on Linux and FreeBSD. Other desirable platforms are Windows and Solaris.

This project involves programming in C and requires the implementor to learn about time synchronization loops. The project is self-contained and involves a fair amount of independent work under guidance.

Noise calibration for bulk transport tool

Collection and analysis of data to determine timing noise of packets sent across a network. If time permits, development of filtering algorithms to eliminate the measurement noise from samples.

Background reading: Design Space for a Bulk Transport Tool.

Let us define the noise to be measured. Suppose the signal we'd like to measure is network delay. Then the noise is the measured delay (the difference between the arrival and departure timestamps) minus the signal.

The noise, of course, depends on the test environment, so a variety of environments would work best. In all cases, however, the network delay (the signal) needs to be known; otherwise, it cannot be separated from the noise before the characteristics of the noise are known. A back-to-back environment could therefore work. What's more important is to vary the operating system and the load on the machines (it's the machines themselves that are the source of the noise; the rest, by definition, is signal).
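
The definition of noise given above can be written down directly. The following is a minimal sketch, assuming that departure and arrival timestamps are expressed in seconds on a common clock and that the true delay is a known constant (as in a back-to-back setup). The struct and function names are invented for the example, and the choice of summary statistics (mean, variance, maximum) is only one possibility.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of the noise samples. */
struct noise_stats {
    double mean;
    double variance;  /* population variance */
    double max;
};

/* noise[i] = (arrive[i] - depart[i]) - known_delay */
static struct noise_stats noise_summary(const double *depart,
                                        const double *arrive,
                                        size_t n, double known_delay)
{
    assert(n > 0);
    struct noise_stats st = { 0.0, 0.0, 0.0 };

    /* First pass: mean and maximum of the noise. */
    for (size_t i = 0; i < n; i++) {
        double noise = (arrive[i] - depart[i]) - known_delay;
        st.mean += noise;
        if (i == 0 || noise > st.max)
            st.max = noise;
    }
    st.mean /= (double)n;

    /* Second pass: variance around the mean. */
    for (size_t i = 0; i < n; i++) {
        double d = (arrive[i] - depart[i]) - known_delay - st.mean;
        st.variance += d * d;
    }
    st.variance /= (double)n;

    return st;
}
```

With departures at 0, 1, 2, 3 s, arrivals at 0.5, 1.75, 2.5, 3.75 s, and a known delay of 0.5 s, the noise samples are 0, 0.25, 0, 0.25 s.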

The purpose of this project is to provide input for building the Internet2 bulk transport tool. Therefore, the amounts of data that traverse the link(s) during the test must be substantial and approach the level of saturation.

This project involves programming in C and requires some understanding of statistics and of the standard sockets API.

Enhancements for thrulay

The thrulay program is a network tester. Its unique features include the ability to test the delay along with throughput using TCP and the ability to send fine-grained Poisson test streams using UDP. Enhancements to be done include testing with multiple streams (perhaps a wrapper that starts multiple processes), porting to Solaris, improved statistics for UDP mode, IPv6 support, API for programmatic execution of tests (so that the timing can be better controlled by a wrapper application such as BWCTL), autoconf integration, and others.

Some clarification about the statistics part: Currently, in UDP mode, the only two things that are reported are loss percentage and the minimum delay. Other statistics that could be reported include reordering (perhaps using the n-reordering definition), duplication, median delay as well as other quantiles, and possibly the loss burst metrics.
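
As an illustration of the delay quantiles mentioned above, here is a small sketch that sorts the observed delays and picks a quantile by the nearest-rank method. It is not taken from thrulay; the function names are hypothetical, and nearest-rank is just one of several common quantile definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Comparison callback for qsort() on doubles. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Sort the delay samples in place, ascending. */
static void sort_delays(double *delays, size_t n)
{
    qsort(delays, n, sizeof delays[0], cmp_double);
}

/* Nearest-rank quantile of an already sorted array, q in [0, 1].
 * q = 0 gives the minimum delay, q = 0.5 the median. */
static double delay_quantile(const double *sorted, size_t n, double q)
{
    assert(n > 0 && q >= 0.0 && q <= 1.0);
    double pos = q * (double)n;
    size_t rank = (size_t)pos;       /* floor(q * n) */
    if ((double)rank < pos)
        rank++;                      /* round up to ceil(q * n) */
    if (rank == 0)
        rank = 1;                    /* q = 0 maps to the smallest sample */
    return sorted[rank - 1];
}
```

Sorting once and querying several quantiles keeps the cost at one O(n log n) pass per report; a streaming estimator would be needed if the samples cannot all be kept in memory.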

Integration of the results of work done by the TSC timekeeping project (above) might help as well.

It would probably help to download the source and look at the TODO file as well as to search for the string ``FIXME'' in the source. This is all work that needs to be done.

This project involves programming in C and is a good way to take the plunge into network measurement.

Bulk Transport API over UDT

Implement the bulk transport API over UDT. UDT is a possible foundation for the Internet2 Bulk Transport tool.

This project involves programming in C and C++.

Rich Presence Projects

Contact: Ben Teitelbaum.

SIP/SIMPLE Presence and IM Plug-in For Gaim and/or Adium

Gaim and Adium are popular open-source multi-protocol instant messaging and presence clients. Unfortunately, neither supports presence and IM using the IETF's SIP and SIMPLE standards. This project would create a SIP/SIMPLE plug-in for either Gaim or Adium. Page-mode messaging should be supported. Support for (or at least strong design consideration of) the Message Session Relay Protocol (MSRP) should be included. The Internet2 PIC working group will assist any student who takes on this project by providing interoperability testing and access to a reference SIMPLE presence agent implementation.

Integration of Calendaring with SER Presence Agent

A wide variety of client-only and client/server calendaring applications exist (e.g., Outlook, Evolution, Chandler, WebDAV, CalDAV, CAP). Pick one or two and integrate them with SER's presence agent (PA). This is an important step towards automated rich presence, which could, for example, show a meeting's participants as "busy" for the duration of a meeting or a person as "in flight" while he is on a plane. (An IM could also be sent to participants to remind them about the meeting.) One possible implementation would be to send PUBLISH messages to the presence agent. Any student taking on this project will be provided with access to a reference SIMPLE presence agent implementation.

Presence Agent

The Internet2 PIC working group has built upon SER to create a SIMPLE presence agent that could be the nucleus of a campus/enterprise rich presence solution. We need a second, independently-developed open-source PA implementation. Create one, perhaps starting with the SIPfoundry code base.

PlaceLab Location Presence

Extend PlaceLab to use SIMPLE and RPID to PUBLISH location presence to a presence agent (PA).