Implementing the next GNOME shell

By now, I think most people have had a chance to look at the text and mockups that Vincent posted of the new desktop shell ideas that came out of the user experience hackfest. Obviously some parts of the ideas are controversial, some parts will be improved as we get some experience with them in practice, but it is a compelling set of ideas that I’m excited about trying out. So here I’d like to put out some ideas I have about how we could go about implementation. (Much of this was covered at a session we had Sunday morning at the GNOME summit where we presented the ideas that came out of the hackfest.)

One Process. The first thing to note is that the ideas don’t naturally split into “window manager” and “panel”. The “Activities” view combines showing the windows that are currently open with launchers for existing applications. It would be possible to add complex APIs to the window manager to allow putting extra things into its scene graph. But it is going to be far easier to simply work in a single process with clean internal programming interfaces.
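As a rough illustration of what “one process with clean internal interfaces” could look like, here is a minimal sketch in TypeScript. All of the names (`Actor`, `Scene`, `activitiesView`) are made up for the example; none of this is existing Metacity, Clutter, or shell API. The only point is that an Activities view can walk a single in-process scene and see both open windows and launchers through plain method calls, with no IPC to a separate window manager.

```typescript
// Hypothetical sketch: one in-process scene holding both windows and launchers.
// None of these names come from Metacity, Clutter, or any real shell code.

type ActorKind = "window" | "launcher";

interface Actor {
    kind: ActorKind;
    title: string;
}

class Scene {
    private actors: Actor[] = [];

    add(actor: Actor): void {
        this.actors.push(actor);
    }

    // Internal programming interface: a plain method call, no IPC involved.
    byKind(kind: ActorKind): Actor[] {
        return this.actors.filter(function (a) { return a.kind === kind; });
    }
}

// The "Activities" view combines open windows with application launchers.
function activitiesView(s: Scene): { windows: string[]; launchers: string[] } {
    return {
        windows: s.byKind("window").map(function (a) { return a.title; }),
        launchers: s.byKind("launcher").map(function (a) { return a.title; })
    };
}

const scene = new Scene();
scene.add({ kind: "window", title: "Terminal" });
scene.add({ kind: "launcher", title: "Web Browser" });
scene.add({ kind: "window", title: "Text Editor" });

const activities = activitiesView(scene);
console.log("windows: " + activities.windows.join(", "));
console.log("launchers: " + activities.launchers.join(", "));
```

With a separate window manager process, `byKind("window")` would have to become some kind of remote call with its own protocol; keeping everything in one scene is what makes the view this simple.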

Javascript. This is an area that really calls for a high-level language. There’s going to be lots of code that needs experimentation to get the right user behavior, but not a lot of code that is implementing some complicated algorithm over lots of data. For applets (which would run within the shell), we want a low barrier to getting involved. Javascript doesn’t pull in another complicated platform, almost everyone is familiar with it to some extent or another, it offers good possibilities for sandboxing applets, it’s pretty lightweight on memory, and there is a lot of work going on to make it fast. And, especially with some of the Mozilla JS-1.7/JS-1.8 improvements, it’s not as painful to work in as you are thinking.

Clutter. Once we mix together windows, other UI bits like panels, overlay views, and so forth, we need a scene graph to manage everything and put it on the screen. The obvious candidate there is Clutter. Clutter isn’t (yet) a very good replacement for GTK+ as a general-purpose application toolkit, but UI like what is mocked up in the designs very much plays to Clutter’s strengths. And it’s a big plus to me that Clutter has a tested, documented API… something we wouldn’t have for a custom-written scene graph.

Start from Metacity. If the shell subsumes the window manager, then what we should avoid is spending a bunch of time getting all the window manager details right once again … how do you constrain sizes and positions? how do you read ICCCM properties? etc. For this reason, starting from an existing window manager codebase is the right thing to do. I’m less convinced that we should try to have a single code base that works both ways… being able to aggressively refactor, and to convert things that would be better in a high-level language to Javascript, seems important.
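To give a flavor of the kind of detail involved, here is a deliberately simplified TypeScript sketch of one of them: constraining a requested window width to the size hints a client sets. The field names mirror the minimum, maximum, base, and increment sizes from the ICCCM `WM_NORMAL_HINTS` property, but the function itself is a hypothetical illustration, not Metacity’s code. It assumes a positive increment and ignores height, aspect ratios, gravity, and screen edges, all of which a real window manager has to get right too.

```typescript
// Simplified, hypothetical sketch of honoring size hints in one dimension.
// Field names follow the ICCCM WM_NORMAL_HINTS fields; the rest is made up.

interface SizeHints {
    minWidth: number;
    maxWidth: number;
    baseWidth: number;
    widthInc: number; // assumed to be >= 1
}

function constrainWidth(requested: number, hints: SizeHints): number {
    // Clamp the request into the allowed range first.
    const clamped = Math.min(Math.max(requested, hints.minWidth), hints.maxWidth);

    // Then snap down to the nearest width of the form base + i * increment.
    const steps = Math.floor((clamped - hints.baseWidth) / hints.widthInc);
    let snapped = hints.baseWidth + steps * hints.widthInc;

    // Snapping down may undershoot the minimum; step back up if so.
    if (snapped < hints.minWidth) {
        snapped += hints.widthInc;
    }
    return snapped;
}

// A terminal-like client: 4 pixels of padding plus 9-pixel-wide character cells.
const terminalHints: SizeHints = { minWidth: 22, maxWidth: 2000, baseWidth: 4, widthInc: 9 };

console.log(constrainWidth(500, terminalHints)); // 499 = 4 + 55 * 9
console.log(constrainWidth(10, terminalHints));  // 22, the minimum
```

Even this toy version needs a special case for minimum sizes that are not on the increment grid, which is exactly the sort of thing an existing codebase has already worked through.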

So, those are my ideas. No running code, svn module, or even project name yet. But that should change soon. (Current leader for a project name is the exciting “gnome-shell”… gnomesh to go with gnomecc.)

This entry was written by Owen and posted on October 22, 2008 at 11:11 am and filed under Coding.

29 Comments

Seif Lotfy

Posted October 22, 2008 at 11:45 am

If you can have one sidebar, what is the point of a horizontal panel?
You can view everything on the sidebar and hide it when not needed!
Gustavo Sverzut Barbieri

Posted October 22, 2008 at 12:50 pm

Well, this is just what E17 is, so yes, it works and works great. E17 uses different names for the components you said (embryo instead of javascript, evas instead of clutter), but the concept is the same.

Maybe it’s time to bring enlightenment as gnome desktop shell again? 🙂 [ok, I don’t believe in that, but it was a catchy question]
Stu

Posted October 22, 2008 at 12:57 pm

Not sure why javascript instead of python for instance.

Also, it’s probably possible to do a lot of this and still work with other window managers.
tf

Posted October 22, 2008 at 1:20 pm

With the Clutter-based Metacity, aka Mutter, doing something like this should be quite easy; the whole shell would be just a ClutterActor implemented inside a plugin, inserted into the Mutter overlay layer. We are hoping to get Mutter ready for pushing upstream in the not too distant future; for now, it is the clutter branch at git://git.o-hand.com/metacity-clutter.git.
martin

Posted October 22, 2008 at 1:46 pm

Putting everything in one process is a bad idea because of stability. If there is a bug in one of the components, it brings down the whole thing, and subtle bugs in one component can also cause other weird, unexplainable behavior in the other components. Most importantly, if some buffer overflow screws up the stack, only elite hackers can manually reconstruct the stack (in separate processes, it’s much easier to know which component crashed).

So, it’s much better to have a hive of processes cooperating using a clean IPC based interface.

Consider the architecture of Google Chrome and other modern browsers (process separation is the future for sure!!)
oliver

Posted October 22, 2008 at 2:11 pm

(+1 on Python instead of / alongside Javascript)

What I actually meant to ask: how does Clutter work if there is no OpenGL acceleration available (like in virtualized systems)? Does it work just as well with software rendering?
Kevin Lange

Posted October 22, 2008 at 2:14 pm

Do something, I don’t even care what, that requires the “input redirection” framework. Maybe shrinking windows and stuff. That way, we can force the X devs to include the patches for it.

Nah, I’m just kidding (it would be great if you did, but it would be solely to the benefit of us in the Compiz world).
michael schurter

Posted October 22, 2008 at 2:20 pm

As a python webdev I love the idea of javascript being used in Gnome! It will let us web hacks port our bling to the desktop. 😉

@Stu: I love Python too, but I think JavaScript is a better choice simply because it’s so widely known … and widely known by people who otherwise could never contribute to desktop development (designers, PHP devs, etc.).
Owen

Posted October 22, 2008 at 2:51 pm

Python versus Javascript: I’m certainly pro-Python in general (see http://www.reinteract.org/), but it doesn’t feel like a great fit here. The main reasons are exactly the ones I mentioned: among them, it drags in a big platform of its own and you can’t sandbox it.

OpenGL: We need to just assume it and use it. Dancing around the issue and writing everything two ways is something we don’t have the luxury to do considering finite resources. The GL situation is way better than it was and will be better yet in a year. GL for virt systems is not a big deal. (If nothing else, just use GLX…) And the old panel will still be available for a while, not to mention light-weight alternatives like XFCE.

Separate processes: If components crash, that’s a big problem for the user no matter how things are split up. (And a definite goal for the shell is that it is stateless … that it can crash at any time and restart and things come up exactly as they were.) I’d say in general, our experience with GNOME, especially with CORBA, has been that separate processes done without extreme care cause more instability than they save. That doesn’t mean that there isn’t a role in some cases for splitting things out into separate processes (perhaps to get different security contexts or even UIDs), but I don’t think that should be the default… it’s something you do for untrusted content, like an applet the user downloaded. There’s no reason to split the WM/CM/Panel for that reason.
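One way to picture the “stateless” goal is the TypeScript sketch below. It is purely illustrative: `ManagedWindow`, `queryWindows`, and `buildShellView` are hypothetical names, and the hard-coded window list stands in for whatever the window system reports. The assumption is that anything the user cares about, such as which workspace a window is on, is stored with the window itself (on X that could be something like the EWMH `_NET_WM_DESKTOP` property) rather than only in the shell’s memory, so a restarted shell can rebuild the same view.

```typescript
// Hypothetical sketch of a "stateless" shell: the view is derived entirely
// from state held outside the shell, so a restart reproduces it.

interface ManagedWindow {
    id: number;
    title: string;
    workspace: number; // assumed to be stored with the window, not in the shell
}

// Stand-in for asking the window system which windows currently exist.
function queryWindows(): ManagedWindow[] {
    return [
        { id: 1, title: "Terminal", workspace: 0 },
        { id: 2, title: "Mail", workspace: 1 },
        { id: 3, title: "Editor", workspace: 0 }
    ];
}

// Build the shell's view (window titles grouped by workspace) from scratch.
function buildShellView(windows: ManagedWindow[]): string[][] {
    const workspaces: string[][] = [];
    windows.forEach(function (w) {
        while (workspaces.length <= w.workspace) {
            workspaces.push([]);
        }
        workspaces[w.workspace].push(w.title);
    });
    return workspaces;
}

const beforeCrash = buildShellView(queryWindows());
// ... imagine the shell crashing and being restarted here ...
const afterRestart = buildShellView(queryWindows());

// Both views are built from the same external state, so they match.
console.log(JSON.stringify(beforeCrash) === JSON.stringify(afterRestart));
```

The design choice this illustrates is that the shell treats its in-memory data as a cache of external state; anything it cannot re-derive would be lost in a crash.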

@tf: I’ve been playing around with mutter a bit over the last few days. The code is nice, and something that it would take a clutter novice like me a lot of time to come up with. I’m not really sold on the architecture though… having the entire shell be a plugin to a compositor that is itself virtualized in metacity seems like it would make it very hard to change anything about core window management.
Juri Pakaste

Posted October 22, 2008 at 2:56 pm

There are two problems with using Python for something like this: it isn’t really built for embedding (in Python’s view, apps are more likely to be built with Python, and extended in C as necessary) and it has the whole huge library with it, whereas Javascript is just a small language (or, if ES4 happens, not so small) that’s easy to integrate with existing libraries. Javascript and Lua are probably the most sensible choices for something like this.
Daniel Borgmann

Posted October 22, 2008 at 3:09 pm

Good luck. I am very interested in contributing, if not too many compromises will be made to accommodate the bikeshed experts.

I don’t believe that every shell has to fit every turtleback, and we shouldn’t be afraid to try new interface ideas. Something good will come out of it.
anonymous cowherd

Posted October 22, 2008 at 3:15 pm

How about “gnospoon”? After the flexibility and radical break with convention. Or am I a decade too late for that one?
anonim

Posted October 22, 2008 at 3:23 pm

why not compiz?
Thomas Thurman

Posted October 22, 2008 at 7:25 pm

@anonim: hey, what about KWin?

I’m really excited about all the new possibilities for Metacity.

[ Python? What about Scheme? 😉 ]
Conrad Steenberg

Posted October 22, 2008 at 11:53 pm

Another -1 for Javascript…

How about doing a scriptable/JIT Genie implementation that already binds the Glib/Gtk+ platform (via Vala)?

See http://live.gnome.org/Genie maybe with tcc – http://bellard.org/tcc/

Can’t wait to see the new shell 🙂
Jorge

Posted October 23, 2008 at 1:28 am

+1 for Python and separate processes, but yeah, Javascript and other languages seem to fit better for embedding
iain

Posted October 23, 2008 at 8:09 am

I for one am also excited by these new Metacity possibilities
Andrew

Posted October 23, 2008 at 10:25 am

Doesn’t the idea of the single process kind of undo the great progress that has been made with Dbus and Dbus/GObject?

I am happy with separate processes. When I installed Dropbox the other day, for example, I did a “killall nautilus” to load the plugin and nautilus came back to life. If it hadn’t, I would still have had access to all my apps via the panel/GNOME menu.

One process is the opposite of progress.
oliver

Posted October 23, 2008 at 11:12 am

Just a note regarding the single-process model: currently the gnome-panel pulls in policykit which then disables core dumping (IIRC for security reasons, so passwords etc. are not dumped), which means that panel crashes are not properly caught by crash reporters (like Apport). Apparently this already means that crashes of the clock applet are not caught either; if you merge even more functionality into such a privileged process, crash reporting might be even more difficult.
cdiddca

Posted October 24, 2008 at 11:36 pm

>splitting things out into separate processes

I’m pro-JS, but if we want to increase reuse and flexibility, we need a “GtkApplicationFramework” with a component architecture and some kind of “GLang”, specially designed as a component-linking language. It should be statically typed for speed (current Firefox + 10 extensions = unbearable on an embedded device), with component linking (for example, the ability to link components either in the same address space or through shared memory in separate processes, as an implementation detail) and pipe-like functionality for typed data between components (JS would create overhead for large throughput, e.g. graphics), etc.

So applications like Inkscape and GIMP could share common parts. And the developer of a new “application” could just code a couple of components in any supported language (e.g. Python, JS, C) and link them with each other, and with the entire infrastructure, using “GLang”. Users could easily “rewire” components for their needs (Emacs-like philosophy).
Mikko

Posted October 25, 2008 at 6:57 am

+1 on everything in the original post.

Maybe some of the technical details could/should be different, but I’m no expert on those. I’m a long-time Gnome user and I’ve been trying out KDE 4 in Fedora 10 recently. KDE 4 is certainly technically impressive and has some interesting usability improvements, but overall it still does not seem to make real progress compared to the current Gnome (I never really used KDE 3.5 so I can’t say anything about that).

The ‘window list’ (in Gnome, probably ‘task bar’ in MS Windows) is likely the worst piece of UI design to be forced on as many users as it has been (and not surprisingly it came from MS). The buttons only show part of the name of the window, and they get smaller and smaller the more there are, etc. Ugh. I’ve never kept it in my panel, but used the ‘window selector’ instead. I’ve always wondered why no one has combined the alt-tab switcher, ‘window selector’ and the launch menu into one thing and made sure that it was highly usable. The thing proposed in the original blog post seems like a big step in the right direction.
ReinoutS

Posted October 27, 2008 at 11:26 am

Does Webkit-GTK have a role in all of this?
Andy Tai

Posted October 31, 2008 at 5:34 pm

How about Squirrel (http://squirrel-lang.org/)? A C/C++-like language with a VM modeled after Lua?
Michael DeHaan

Posted November 4, 2008 at 5:29 pm

Javascript should be destroyed.

For lightweight, fast, and embeddable, why not Lua?

It’s made for that sort of thing.
anonim

Posted November 17, 2008 at 11:23 am

I guess using metacity as the desktop compositor makes using the new gnome-shell with compiz impossible, right?
Ferk

Posted April 5, 2009 at 8:27 am

+1 for Scheme!
Scheme was intended to be the embedded extension language for Gnome since the beginning. It is powerful, easy to code (the lambda-ish hack for Javascript v1.8 is really confusing to read :S), it can do most things in far fewer lines of code, it can be sandboxed, it already has a lot of bindings (for Guile Scheme), it is supported by the FSF, many universities teach it for AI, it is very simple, and it won’t get cluttered.

I think that Javascript is likely to end up adding a lot of new functionality in the future, making it difficult to learn, and to end up being as messy as Java.
ferk

Posted April 5, 2009 at 11:25 am

But.. well.. if Scheme syntax is to be a problem ( for some people it seems it is :S ) then Lua is probably the second best option.
tang

Posted April 15, 2009 at 9:41 am

Lua and Scheme are both reasonable languages in the abstract, but you have to ask yourself who you want to contribute to the platform. For most potential contributors, knowledge of the language (or ease of apprehension) is a primary barrier to implementation. While scheme may be widely taught in universities, it’s not widely adopted by any other population. That means that most contributors will be CS students. Not a good profile.

Lua is a great embedding language, but most people don’t learn it unless they plan on embedding something or they want to work on a platform that uses it. So again, it’s a higher barrier to startup.

Python is my personal favorite for general-purpose programming, and I wouldn’t think it a poor choice for a desktop UI. But Owen named some outstanding virtues for Javascript, and its object-templating nature makes it very powerful and very easy to build from. It’s also getting much, much faster. Most of this commentary was written months ago, but FF3.1 has been in beta for a while since then, and in case nobody’s noticed, its SpiderMonkey 1.8.1 JIT implementation makes it blazingly faster than previous versions.

And just for choosing that language you get a planetful of web developers who already know how to develop user-experience applications, who are more concerned with usability than with whether lambdas are better than anonymous closures. It seems like a big win to me.
Tuiiner

Posted May 2, 2009 at 11:11 am

The shell user should be able to see the same app window in two different Activity Views, so that it’s possible to see, e.g., the same evince window with some doc in two unrelated activities such as drafting a text or writing a program. The Activity View should show a subset of all possible windows, no matter if any of these windows are already in any other View.