I, Hacker
Intermittent neural activity
2009-12-27

A lot of people have asked me what happened to the Alky project. The short answer is that we made a lot of bad business moves, but that answer glosses over a lot of the fun little details. Having gained considerable knowledge from other stories of failed startups, I figure I'll throw one of my own into the ring.

History

The Alky project's history can be split into a few phases:

Conception

Alky began as an experiment to see how easily I could convert Windows PE files to run natively on Mac OS X (x86). If it worked, it might make it possible for me to convert Windows games to run natively on OS X, which was my primary focus. I started by writing a converter that ripped the segments out of the original file and threw them into a Mach-O binary, then linked it against 'libalky'.

LibAlky was a reimplementation of the Win32 API. At first, I tested with files that just called a few simple things, like user32:MessageBoxA, and it worked spectacularly. It was at this point that I realized the basic concept was not only sound, it made a whole lot of sense.
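
To make the concept concrete, here's a minimal sketch of the shape of a user32:MessageBoxA shim. The actual prototype was native code linked into the converted Mach-O binary; this version is in C# with a hypothetical User32Shim class and a logging body standing in for a real OS X dialog call.

    // Illustrative only: the converted binary's import for MessageBoxA gets pointed
    // at a function like this, which satisfies the Win32 contract however it can.
    using System;

    public static class User32Shim
    {
        // Win32 signature: int MessageBoxA(HWND hWnd, LPCSTR lpText, LPCSTR lpCaption, UINT uType)
        public static int MessageBoxA(IntPtr hwnd, string text, string caption, uint type)
        {
            // A real shim would map `type` to buttons/icons and show a host dialog;
            // logging is enough to demonstrate the plumbing.
            Console.WriteLine($"[{caption}] {text}");
            return 1; // IDOK
        }
    }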

Actual project creation

Once the initial prototype worked, it was time to get people interested. I went to Michael Robertson (who was my employer at the time) and gave him a rundown. He offered to buy the domain, host the project, and get some resources behind it, primarily PR-wise. Within a few days, the project started actually feeling real. We got the site up, wrote some copy to explain what we were doing, and then put it out on Slashdot.

Unsurprisingly, we received three types of responses:

  1. This is impossible; it simply can't work from a technical point of view. (This was especially hilarious coming from a Transitive engineer, considering that what they did was considerably more complicated.)
  2. This is possible, but it won't happen due to the immense amount of work involved. (Half right -- more on this later.)
  3. Wine sucks, I hope you guys can do it better. (We couldn't -- more on this later.)

But more important than anything else, we got some developers involved. However, they ended up being driven away.

Mismanagement of the open source project

Alky was the first open source project I'd ever managed that consisted of more than myself and a few friends, and as a result it was mismanaged at every possible turn. It was unclear what people should've been working on, no effort was made to document the project, and no real effort was made to include the developers that were so interested in working on the project.

This was compounded by a rewrite and redesign, which I decided (foolishly) to take on entirely by myself. Some of the design was done as a group, but none of it ever left my hard drive, so work stalled on the existing codebase and the project started to wither.

Shortly thereafter, Falling Leaf Systems was incorporated to back the development of Alky. This further increased the rift between the open source developers and the "core" group (consisting of myself and one of the cofounders of the company). Originally, we planned to dual-license the codebase, but as we got more into discussions of the goals of the business, it became clear that closing the source was the right move. However, we couldn't have picked a poorer way to do it.

Rather than be upfront about the move to closing the source, we simply killed the IRC channel and took down the site. The open source developers were left wondering what happened, while we moved on to the rewrite.

Prey and the Sapling Program

Alky was completely rewritten with the new design, and work quickly moved forward on getting the first game running. We released a converter for the demo of Prey (Mac OS X only at first) as part of our new Sapling Program. The Sapling Program was created as a way to get some upfront money for the company, so we could get needed hardware, go to the GDC (which was a horrendous waste of money, for the record), etc. We sold memberships for $50, which granted access to the Prey converter and all future converters. Of course, after we finished Prey for Linux, there were no more converters.

Loss of focus

After Prey was done, we'd planned on implementing DirectX 9 with hopes of running Oblivion, but we lost sight of this goal. Instead, we decided to go after DirectX 10. Along with this shift in focus came an even bigger one: we were no longer targeting Mac OS X and Linux primarily, but Windows XP. We saw that Vista was tanking, and DirectX 10 was only available there, so we decided that we only had a limited window to make money off of that.

Shortly after we made the change, we released a library to allow a few of the DX10 SDK demos to run on Windows XP via OpenGL, albeit with some serious features missing (e.g. no shaders). It got some attention, but few people were able to actually get it working. After this was out, I started work on reverse-engineering the shader format and writing a recompiler that would allow Direct3D 10 shaders to work with OpenGL, and even more importantly, without DX10-class hardware. Geometry shaders were planned to run on the CPU if the hardware wasn't available to handle them, and work progressed quickly.

Alky for Applications

We discovered a project known as VAIO that allowed non-DX10 Vista applications to run on Windows XP, and after some talks with the developers, they (or more specifically, one of them -- we'll call him Joe) were sucked into Falling Leaf. We rebranded VAIO and released it as Alky for Applications. After this, Joe was tasked with making Halo 2 and Shadowrun -- Vista-exclusive but non-DX10 games -- run on Windows XP. We were so confident in our ability to do this that we set up an Amazon referral system and made it such that anyone who purchased either game through us would get a copy of the converter for free.

At this point, I was working heavily on DX10 stuff, and was under tight deadlines to show things off to a company that was interested in our technology, but the clock was ticking. About a week before the planned release of the Halo 2 and Shadowrun compatibility system, Joe came to us and told us that he'd not been able to get anything working, and had very little to show for the time spent. In retrospect, it was my fault for not checking up on him (my job as his manager), but at that point it just made me realize there was no way it was going to be done in time.

I picked it up -- dropping DX10 in the process -- and attempted, feebly, to get it done. Of course, I picked the most difficult way to approach it: reverse-engineering the Windows application compatibility system. By the time I got anything remotely close to working, we'd already missed our deadlines for both the DX10 work and the Halo 2/Shadowrun converter.

The death of Falling Leaf

After all this went down, I fell into a bit of a depression. I became demoralized to the point of just not wanting to go on anymore, in the face of repeated, very public failures. Despite us not walking away with a dime -- we made approximately $7000 in total, none of which went to any of the founders of the company -- I felt that we'd ripped people off, despite the best of intentions. It wasn't long after this that Brian (one of my co-founders) stepped down as CEO, and I closed the doors on the company. The source to Alky and Philosopher (the compiler framework used in the shader work) was released at the same time.

Lessons Learned

  1. If you're going to run an open source project, make it easy for people to contribute. Not only that, make it worthwhile for them to contribute and make them a part of the project, not just free workers.
  2. If you're going to kill an open source project, be up front with the people involved. Not only is it dishonest not to do so, but you also lose people who may well go to bat for you even once you've gone commercial. This is especially important for a project like Alky, where we faced nearly universal negativity.
  3. If you're going to change focus, be clear with your users about what's going on, and make it clear that the old stuff is dead and gone. If you don't, you come off terribly in the press and just look like amateurs (which we were).
  4. Make sure your employees are actually doing what they're supposed to be doing, especially if they're working remotely. This was really the final nail in the coffin for Falling Leaf.

Alky Reborn

Now, with all of that said, perhaps there's a light at the end of the tunnel. The source for Alky has been pulled into GitHub and it seems that development is picking up again. Perhaps I can shed some light on what design decisions were made, how it was implemented, and how I'd do it now if I were so inclined. I don't plan on working on Alky again (that time has passed), but I'd still love to see it succeed.

The old Alky design

Alky's original prototype had a very simple design, library-wise. There was one big LibAlky which contained all of the Win32 API implementations, each talking directly to native APIs. This design very quickly became unworkable, as we had tons of duplicated, unmaintainable code.

The new Alky design

Alky was redesigned around two layers: the Nucleus and the Win32 APIs. The Nucleus was intended to abstract away the platform-specific details and expose a universal API that the Win32 APIs could sit on cleanly. While a good idea, it quickly broke down in implementation. I ended up writing code that straddled the Nucleus and the Linux/OS X APIs, rather than abstracting everything away. This led to slower development and an even more complicated codebase.
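
For illustration, here's roughly what the split was supposed to look like: a small platform-neutral surface with one backend per host OS, and the Win32 layer written purely against it. The names (INucleus, INucleusWindow, User32Layer) are made up for this sketch, and it's in C# for brevity rather than what Alky actually used.

    // Hypothetical sketch of the Nucleus idea; names are illustrative.
    using System;

    public interface INucleusWindow
    {
        void SetTitle(string title);
        void Resize(int width, int height);
    }

    // The platform-neutral surface, with one implementation per host (OS X, Linux, ...).
    public interface INucleus
    {
        INucleusWindow CreateWindow(int width, int height);
        IntPtr AllocateExecutable(int byteCount);   // backing for VirtualAlloc-style calls
        void Log(string message);
    }

    // The Win32 layer is written once against INucleus. The failure mode described
    // above was code in this layer reaching around INucleus straight into the
    // Linux/OS X APIs.
    public class User32Layer
    {
        readonly INucleus nucleus;
        public User32Layer(INucleus nucleus) { this.nucleus = nucleus; }

        public INucleusWindow CreateWindowExA(string className, string title, int width, int height)
        {
            var window = nucleus.CreateWindow(width, height);
            window.SetTitle(title);
            return window;   // a real shim would return an HWND from a handle table
        }
    }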

Potential new design

Having done two implementations of Alky, plus quite a few conceptually related projects like IronBabel and Renraku (whose OS design plays a factor here), I think I'm at a place where I can perhaps come up with a workable design.

The key point where both Alky implementations (and Wine, IMHO) failed is maintainability. The codebase was a rat's nest, much like the real Win32 APIs, and neither implementation managed to help this at all. I think this needs to be the absolute top priority, above performance, compatibility, and all else. If your code is maintainable, all the rest can happen.

With that focus in mind, here are the things I'd do with a new design.

  • Implement the APIs on top of Mono, taking advantage of the flexible marshalling that P/Invoke gives you. This will let you simplify things greatly, with only a marginal performance hit in most cases (see the sketch after this list).
  • In cases where performance is critical, drop down and implement things in C, C++, or assembly. If this chunk of the project is greater than 10% of the codebase, you've got bigger problems.
  • Abstract, abstract, abstract. Break things into the smallest chunks possible and reuse them. This is what we tried to do with the Nucleus idea, but it was easy to just ignore it for complex pieces.
  • Write thorough unit tests for every API that's implemented (public and internal). Regression testing would also make things really nice.
  • Rather than trying to get real games/applications running immediately, write your own simple applications that test specific chunks. This would've cut down considerably on the development time in the old revisions of Alky.
  • Write a great debugger to be able to step through applications cleanly. In the old days, I'd break out IDA Pro and step through a game on Windows, then compare the behavior to the Alkified version, but this was just downright painful.
  • Make it work on Windows, to allow easy side-by-side comparisons.
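
To make that first bullet concrete, here's a rough sketch of the kind of thing I mean. The managed side implements the Win32 entry point, and the interop layer hands out a native-callable pointer that a converter could patch into the binary's import table. GetTickCountDelegate, Kernel32Managed, and the import-patching step are all illustrative assumptions, not existing Alky code.

    // Hedged sketch: exposing a managed Win32 implementation to converted native code.
    using System;
    using System.Runtime.InteropServices;

    // Delegate matching the calling convention of the API being replaced.
    [UnmanagedFunctionPointer(CallingConvention.StdCall)]
    public delegate uint GetTickCountDelegate();

    public static class Kernel32Managed
    {
        // Managed implementation of kernel32:GetTickCount.
        public static uint GetTickCount() => (uint)Environment.TickCount;

        // Held in a static so the delegate behind the native pointer never gets collected.
        static readonly GetTickCountDelegate entry = GetTickCount;

        // A converter/loader would write this pointer into the binary's import slot.
        public static IntPtr GetTickCountEntry() =>
            Marshal.GetFunctionPointerForDelegate(entry);
    }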

The suggestion that this should be built on top of Mono/.NET sounds ridiculous, I'm sure, but I do think it'd give the project a shot.

In Closing

I hope this has given you some insight into what went down with Falling Leaf, some ideas of what not to do (obviously, it's easy to completely overlook these things when you're down in the trenches, as we did), and where Alky could one day go. I wish the Alky Reborn folks the best of luck, and I hope some of my advice helps.

Happy Hacking,
- Cody Brocious (Daeken)

2009-11-19

Developing Renraku has led me to some interesting thoughts, not the least of which is the feasibility of a .NET computer.

Concept

Existing computers know only a few core things:

  • Move to/from memory and registers
  • Branch
  • Perform arithmetic

All the other smarts are built on top of that, leaving us with layer upon layer of complexity. What if we were to add some logic to the hardware and introduce a thin software layer on top of that?

Design

CPU

The CPU will execute raw CIL blobs and expose hardware interfaces and interrupts. The most likely route is to keep the stack in a large on-die cache, doing a small lookahead in decoding to prevent multiple stack hits in arithmetic.

Since the CPU is going to be executing CIL directly, it's going to have to deal with object initialization, virtual method lookups, array allocation, memory management, and many other tasks that existing systems handle purely in software. This is achieved by splitting the work between the CPU and a hypervisor.
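
As a toy illustration of what "executing CIL directly" commits the hardware to, here's the evaluation loop for a four-instruction fragment. The opcode bytes are the real single-byte CIL encodings, but everything interesting (the object model, calls, allocation, the GC) is precisely the part that has to be divided between the CPU and the hypervisor.

    // Toy CIL evaluator for: ldc.i4.0, ldc.i4.1, add, ret.
    using System;
    using System.Collections.Generic;

    static class CilToy
    {
        static int Eval(byte[] code)
        {
            var stack = new Stack<int>();   // stands in for the on-die evaluation stack
            for (int pc = 0; ; )
            {
                switch (code[pc++])
                {
                    case 0x16: stack.Push(0); break;                           // ldc.i4.0
                    case 0x17: stack.Push(1); break;                           // ldc.i4.1
                    case 0x58: stack.Push(stack.Pop() + stack.Pop()); break;   // add
                    case 0x2A: return stack.Pop();                             // ret
                    default: throw new NotSupportedException("unhandled opcode");
                }
            }
        }

        static void Main() =>
            Console.WriteLine(Eval(new byte[] { 0x16, 0x17, 0x58, 0x2A }));   // prints 1
    }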

Hypervisor

The hypervisor, like everything else, will be built completely in managed code. However, it will be a raw blob of CIL, instead of the normal PE container. The hypervisor will handle most of the high-level tasks, but it's up to the specific hardware manufacturer to determine what goes in the CPU and what goes in the hypervisor. The hypervisor will come up when the machine is booted, much like a BIOS. After performing hardware initialization, it will look for the bootloader.

The hypervisor provides a standardized, verified interface for talking to drivers and memory. Through this interface, it will also provide hooks for the memory manager, allowing the OS to handle memory in a clean way. Interrupts will also be hooked via this interface. The hypervisor instance will be passed to the bootloader, which then passes it on to the OS.
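
A hypothetical shape for that interface, just to make the division of labor concrete (none of these names are a real spec; they're assumptions for the sketch):

    // Hypothetical hypervisor surface handed to the bootloader and then the OS.
    using System;

    public interface IDeviceChannel
    {
        int Read(byte[] buffer);
        void Write(byte[] buffer);
    }

    public interface IMemoryManagerHooks
    {
        IntPtr Allocate(int byteCount);
        void Free(IntPtr block);
    }

    public interface IHypervisor
    {
        // Verified device channels instead of raw port/MMIO access.
        IDeviceChannel OpenDevice(string deviceId);

        // The OS supplies its own memory-management policy through these hooks.
        void SetMemoryManager(IMemoryManagerHooks hooks);

        // Interrupt delivery is routed through the same interface.
        void HookInterrupt(int vector, Action handler);
    }

Nesting an OS, as described below, would then amount to handing the guest a managed object that implements this same interface.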

Bootloader

The bootloader is a standard .NET binary, and its job is simply to hand control off to the OS. Its function is effectively identical to that of a bootloader on an existing system.

OS

The OS will look very similar to existing managed OSes, with a few exceptions. Obviously, there's no need to write your own .NET implementation, and no need to write a compiler.

Because the OS talks exclusively through the hypervisor interface, it's possible to nest OSes as deep as you want; you simply expose your own mock hypervisor. You can handle resource arbitration there and run the nested OSes very efficiently.

Unlike most OSes, it won't get free rein over the system. To access devices, it'll have to use verified channels with the hypervisor, to ensure that consistency is upheld. This is similar to the way that Singularity functions.

Is it a good idea?

Maybe, maybe not. I don't know how efficient a CIL chip would be or how much of the .NET framework can really be pushed onto a chip, but the core idea is really an extension of Renraku. A pure-managed system would be very powerful if done properly, but it's a pretty drastic change.

A nice middle ground would be utilizing an existing processor and using the hypervisor in place of the existing BIOS/EFI. With some code caching magic, the CIL->X compilation could be done at this level and be completely transparent to the OS. This may well be a better route than a CIL CPU, but it's tough to judge without more information on the hardware challenges involved.

Happy hacking,
- Cody Brocious (Daeken)

2009-11-19

This is the current Renraku OS FAQ. If you have a question that's not answered here, feel free to join us in #renraku on irc.freenode.net.

What is Renraku OS?

Renraku is a pure managed code operating system, written in Boo from the ground up. The goal is to produce a high-quality operating system without the traditional security, performance, and flexibility issues.

Is Renraku open source?

Yes, Renraku is distributed under the CDDL (Common Development and Distribution License).

Why not use X in place of .NET?

The .NET infrastructure is fairly simple, powerful, (very) easy to compile, and flexible enough for our purposes. Another tool may be better suited to the task, but we've found it to be a good match.

Why use Boo instead of C#?

Although C# is the more traditional choice for .NET work, it lacks macros and is simply much more verbose. Boo is very easy to pick up and significantly more powerful. However, our choice of Boo doesn't mean that the rest of the OS has to be written in Boo. Services, Capsules, and drivers can all be written in any .NET language.

Is it POSIX compatible?

No, by design. POSIX was great in 1989, but times have changed and we need to move forward.

What about legacy applications?

Virtualization is the only planned route for legacy code at the moment, as it's the most efficient, lowest-cost option. Perhaps we'll support running certain types of legacy applications via a more direct route in the future, but it's doubtful.

Won't this be slow?

For a while, yes. However, compilers are improving rapidly, and projects like Microsoft's Bartok are paving the way toward faster managed code. In addition, our lack of separate address spaces and privilege levels makes system calls and task switches dramatically cheaper than on existing OSes.

No separate address spaces? What keeps applications apart?

Because we use pure managed code, we're guaranteed that applications cannot touch memory they aren't directly given. This means that the compiler has the last word on all memory access in the code, reducing the attack surface of the OS considerably.

Does it run X?

Unless the application you have in mind is playing with our shell, looking at our logo, or acquiring an IP address via DHCP, probably not. Renraku is still in its infancy, but we hope it'll be useful someday.

2009-11-19

There hasn't been any news on Renraku in a while, so I figured I'd throw up an update on what's going on.

Where we are

Not a whole lot has changed since the last post. We have a clear direction on where we're moving and we're making progress, albeit more slowly than in the first few months.

The biggest new thing is networking. We have Ethernet, IP, UDP, DHCP, and ARP, although the IP and UDP implementations still need some love. We also have a driver for the AMD PCnet card used in VMware, so testing is fairly easy.

Outside of this, most of the work has been in design, where we're solidifying the way the system works.

Where we're going

Hosted mode

Renraku is soon going to be able to run on top of any existing .NET implementation, allowing considerably simpler development and debugging. The primary goal with hosted mode is to allow developers to tackle higher-level tasks (GUI work, filesystems, etc) while our .NET implementation, kernel, and drivers are built.

When you have a hosted Renraku kernel, you'll run the kernel like any other application, then all of Renraku's services will start up just like on the bare metal. At that point, you'll be able to run other applications inside it, with the underlying OS taking care of the drivers. This should significantly lower the barrier to entry for hacking on Renraku, while making our code considerably better.

To achieve this, we're splitting Renraku into several pieces:

  • Compiler
  • Kernel
  • .NET Base Class Library
  • Services
  • Applications

When we do this, we need to split platform-specific details away from the rest of the code. Hosted mode will just be another platform to the OS, so this also gives us much-needed platform independence. (This will make it possible to write targets for ARM, x64, etc in the near future -- more on this soon.)

In hosted mode, certain services will run as-is, but others will talk to the host OS. For networking, a driver will be implemented that uses raw sockets, utilizing our stack above it. For storage, we can create a virtual disk in a file. For video and input (and sound, eventually), we'll likely backend to SDL.

Whenever possible, we'll use more Renraku services (e.g. not just implementing the 'tcp' service as a wrapper around the .NET TCP classes), which will keep the code from fragmenting too much.
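
As an example of the storage case, a hosted block-device driver could be as simple as the sketch below; IBlockDevice is a stand-in for whatever driver interface Renraku actually settles on, and the 512-byte block size is just an assumption.

    // Hypothetical hosted-mode storage backend: the "disk" is an ordinary host file.
    using System.IO;

    public interface IBlockDevice
    {
        int BlockSize { get; }
        void ReadBlock(long index, byte[] buffer);
        void WriteBlock(long index, byte[] buffer);
    }

    public class FileBackedDisk : IBlockDevice
    {
        readonly FileStream image;
        public int BlockSize => 512;

        public FileBackedDisk(string path)
        {
            // Open (or create) the disk image on the host filesystem.
            image = new FileStream(path, FileMode.OpenOrCreate, FileAccess.ReadWrite);
        }

        public void ReadBlock(long index, byte[] buffer)
        {
            image.Seek(index * BlockSize, SeekOrigin.Begin);
            image.Read(buffer, 0, BlockSize);
        }

        public void WriteBlock(long index, byte[] buffer)
        {
            image.Seek(index * BlockSize, SeekOrigin.Begin);
            image.Write(buffer, 0, BlockSize);
            image.Flush();
        }
    }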

Utilizing Mono's BCL

In the early days of Renraku, I pushed hard against the idea of using Mono's BCL, preferring that we write our own. While I'd like to go with our own BCL that's tightly tied to the Renraku kernel, cracks are quickly appearing in that idea. With us launching our hosted mode initiative, it's clear that the BCL needs to be separated as much as possible. At this point, it's starting to look like going with a modified Mono BCL may be the best route; at the very least, it's worth investigating.

This is simpler than writing our own BCL of course, but it's no trivial task itself. Integrating it with the Renraku kernel is going to be tough; most likely, the right way to do it is to strip the BCL down to the core components that we use and implement now, then move up from there. The other big challenge is compiler support; we currently don't support exceptions and several other things that the BCL uses heavily.

Regardless of the difficulty, this may well prove to be the right way for us to go. Hopefully it will work out, as it'll accelerate Renraku's development significantly.

Capsules

Capsules are our answer to the application and process concept in most OSes. A Capsule holds running tasks, connected events, and the service context. When a Capsule starts, its main class is initialized, which will set up services, throw up a GUI, or whatever. Unlike a process on most OSes, a Capsule does not have to have a thread (task) running at all times, acting much like TSRs did in the old days. Once initialization is completed, the main task can die off, leaving all its delegates and services up. Because of the design of the system, most Capsules will not have a main task, leading to a more nimble system.

The strength of Capsules is most apparent with a GUI or networking app. In a GUI app, the main task will put up the interface and connect events, then die off. When an event occurs, the GUI service calls the Capsule's delegate, which spawns a task inside the Capsule to deal with it.

A network service will work similarly: it dies off after registering itself, then wakes up only to accept a new client or to handle communication. This makes writing network applications very, very simple and efficient.
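
In code, the lifecycle looks something like this hypothetical Capsule; the Button type stands in for the real GUI service, which doesn't exist in this form, and everything here is an assumption rather than actual Renraku API.

    // Hypothetical sketch of the Capsule lifecycle described above.
    using System;

    public class Button
    {
        public event Action Clicked;
        public void Fire() => Clicked?.Invoke();   // stands in for the GUI service dispatching an event
    }

    public class HelloCapsule
    {
        public void Main()
        {
            var button = new Button();
            button.Clicked += () => Console.WriteLine("clicked");   // runs in a task spawned on demand

            // Main returns here; the Capsule stays resident with only its delegate
            // registered, so no thread is consumed while it's idle.
        }
    }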

One outstanding problem is how to handle a Capsule being force killed. We need to handle this in a way that guarantees that the system remains consistent, which is tough. Anyone have a proof debunking the halting problem?

Formal design documents

Currently the Renraku documentation is, to put it nicely, sparse. Code isn't commented at all and there's no documentation on how anything works. I've started writing real design docs, which will go live on this blog over the coming weeks. Here are the primary documents that are being worked on:

  • Capsule architecture
  • Input chain of command (how input gets from the hardware to the right place)
  • Storage architecture
  • Networking
  • Graphics
  • Compiler
  • Security

The Capsule architecture doc will definitely be the first to go out, as it's the most important (it blocks the most other elements of the system); the order of the rest is up in the air.

Closing notes

The work I've done on Renraku OS so far has been some of my best and certainly some of the most rewarding. I'd like to thank everyone who's contributed or just cheered on the team. Even if this never takes off (and the odds of it taking off are admittedly small), Renraku will be an OS we can all be proud of, because of you guys.

Happy hacking,
- Cody Brocious (Daeken)

2009-11-13

You know, I'm not a picky customer. When I find a company that does the job well and without causing problems, I tend to stick with them for a long time. I also tend to advocate for good companies, because they're becoming less and less common. Today, a company just lost my business forever, and I doubt anyone else who sees this will ever deal with them either. I ordered some Adrafinil from Nubrain.com yesterday. The order was marked as completed on their site and I wasn't given a tracking number (didn't think about it at the time, but first class USPS was used, so there was no way to track it). As I was curious to see how long it'd take to arrive, I looked up where it was shipping from and found that they're located in Lawrenceville, GA, where I reside. Since the site didn't say it had shipped yet, I decided to send them an email to see if it was possible to pick up the order:

Hello, I was looking at your site to figure out where my order (redacted) was going to ship from, to get some idea of the time it'd take to ship, when I found that you're located in Lawrenceville as well. If my order hasn't shipped yet, would it be possible to pick it up instead? Thanks, - Cody Brocious

Not a terribly difficult request. If the answer was no, or it had already shipped, I would've thought nothing of it and moved on. Instead, I got this:

WE ONLY SEND BY US MAIL IF YOU WANT PICK UP CALL PAPA JOHNS PIZZA

Is it really that much of an inconvenience to send back "Sorry, but our store only delivers"? When you're in a market like shipping pharmaceuticals, a certain amount of trust is required -- not that a response like that would be appropriate in any market. If you want to keep your customers, you have to treat them well, not like you're being inconvenienced by them. If you don't want to do the job, don't do it; don't act like the customer is doing you a disservice by giving you money.

- Cody Brocious