[uml-devel] cygnus-win32 port of User Mode Linux
Dan Aloni
2000-12-07 15:20:13 UTC
Yesterday was a first attempt to port UML to Micro$oft Windows.

cygnus-win32 is the UNIX compatible environment that I tried to compile
Linux on. (http://www.cygwin.com)

This is what I came across in this attempt:

* Incomplete cygwin - Cygwin does not provide a fully UNIX-compatible
layer; important functions like mmap(), fork(), clone(), and the signal
handling functions are only partially supported, which makes porting
quite difficult.
* binutils - 'as' creates COFF object files on cygwin, and I don't think
creating ELF objects is possible there. Why is this a problem? Well,
COFF doesn't have all the features that ELF does, so linking will be
more difficult.
* a buggy ld - ld turns completely insane when trying to link relocatable
object files. The solution is to link all regular object files in the
kernel in one strike to create the Microsoft-intimidating LINUX.EXE
file ;)
* The Win32 API and design - One of the nastiest APIs out there, with
quite limited support for POSIX standards.
* mkdep.c (what makes 'make dep' possible in the kernel), needed patching
because mmap() didn't work there.
* gcc, for some reason, decided to complain about stray '\'s in
some places where multi-line macros were defined.
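
The mkdep.c item above comes down to "read the file some other way when
mmap() is unavailable". The actual patch is not shown in this thread, so the
following is only a minimal sketch of that fallback pattern, written in
Python rather than C; the function name read_source is hypothetical.

```python
import mmap


def read_source(path):
    """Return the file's bytes, preferring mmap and falling back to read().

    Hypothetical helper: it illustrates the fallback idea only and is not
    the change that was made to mkdep.c.
    """
    with open(path, "rb") as f:
        try:
            # mmap raises ValueError for empty files and OSError where the
            # platform cannot map the file.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
                return m[:]
        except (ValueError, OSError):
            # Plain buffered read as the portable fallback.
            f.seek(0)
            return f.read()
```

The caller sees the same bytes either way, so the rest of a dependency
scanner would not need to know which path was taken.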


So what did we get from this? I've patched around some Makefiles, removed
some ELF dependencies from asm code, and the kernel marvelously started
compiling on my cygwin system...

All the code besides the arch-specific parts (fs/*, net/*, mm/*, kernel/*,
etc.) compiles nicely to object files. The only compilation problem now is
the arch-specific code that resides in arch/um/*.

This is very preliminary, so no public patches at the moment. If anyone
can help, jump in.
--
Dan Aloni
***@karrde.org
Lars Brinkhoff
2000-12-07 21:15:19 UTC
Post by Dan Aloni
* Incomplete cygwin - Cygwin does not provide a fully UNIX-compatible
layer; important functions like mmap(),
As far as I could tell when I looked at Cygwin, mmap() was supported
when running in Windows NT (and 2000, I guess), but not in Windows 95
(nor 98, again a guess).
Post by Dan Aloni
* gcc, for some reason, decided to complain about stray '\'s in
some places where multi-line macros were defined.
Sometimes I have this problem, and it's usually because the line ends
with (in C syntax) "\\\r\n" instead of just "\\\n". The pre-processor
apparently doesn't recognize the backslash-carriage_return-linefeed
sequence to be a line continuation. Solution: convert CRLF line
separators to LF.

I don't know if this is the source of your problem. Could LFs have
been converted to CRLFs when you edited files with a Windows editor?
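
A quick way to check for this is to look for lines that end in a backslash
followed by a carriage return, then normalise the line endings. This is a
minimal sketch in Python; the function names are made up for illustration,
and the sample macro is invented.

```python
def find_crlf_continuations(data: bytes):
    """Return 1-based line numbers where a trailing backslash is followed by CR LF."""
    hits = []
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        # After splitting on LF, a CRLF-terminated continuation line ends
        # in backslash + CR.
        if line.endswith(b"\\\r"):
            hits.append(lineno)
    return hits


def crlf_to_lf(data: bytes) -> bytes:
    """Convert CRLF line separators to LF."""
    return data.replace(b"\r\n", b"\n")


# Invented two-line macro saved with CRLF line endings.
sample = b"#define SWAP(a, b) \\\r\n    do { } while (0)\r\n"
broken_lines = find_crlf_continuations(sample)
fixed = crlf_to_lf(sample)
```

Running the check again on the converted bytes should report no offending
lines.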
--
http://lars.nocrew.org/
Michael Vines
2000-12-08 14:38:21 UTC
Post by Dan Aloni
Yesterday was a first attempt to port UML to Micro$oft Windows.
.
<snip snip>
.
Post by Dan Aloni
This is very preliminary, so no public patches at the moment. If anyone
can help, jump in.
I was thinking about this last night, once the kernel is compiling with
cygwin (ie. once the bulk of the work is done) I could probably help out
with merging/rewriting the syscall trapping code from LINE
(http://neomueller.org/~isamu/line/) into UML if nobody else is doing
it. The actual relevant code is quite small.

I'm a little unsure on the details but I wonder if we could create
something like another binfmt for native x86 Linux executables on the
cygwin/NT port. Then you could do uber-cool things like have native
cygwin (COFF?) Linux applications running in the same UML as regular x86
Linux apps. But I guess that also depends on how the native cygwin apps
invoke a syscall. If they use an int 0x80 like real Linux apps then there
would be no need for the distinction (except perhaps in the loader).
Although I don't think that int 0x80 would be the best method for native
NT apps given a choice.

Just some random thoughts...

Mike
Michael Vines
2000-12-08 14:43:09 UTC
I think sendmail ate half of my message! resending...

Michael Vines
2000-12-08 14:55:36 UTC
Argh! Sorry for two excess messages...I think I've figured it out. I had
a period (.) all alone by itself on a line, and my PINE wasn't escaping it
properly :)
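
For context: in SMTP a line consisting of a single "." ends the message
data, so a sending client has to double any leading "." on a body line and
the receiver strips it again (the "transparency" rule). The sketch below
shows that rule in Python; the function names are hypothetical, and it says
nothing about how PINE or sendmail actually implement it.

```python
def dot_stuff(body: str) -> str:
    """Sender side: prefix an extra '.' to every line that starts with '.'."""
    return "\r\n".join(
        "." + line if line.startswith(".") else line
        for line in body.split("\r\n")
    )


def dot_unstuff(body: str) -> str:
    """Receiver side: drop one leading '.' from every line that starts with '.'."""
    return "\r\n".join(
        line[1:] if line.startswith(".") else line
        for line in body.split("\r\n")
    )


# A body with a lone '.' line, like the one around "<snip snip>".
message = "first line\r\n.\r\n<snip snip>\r\n.\r\nlast line"
on_the_wire = dot_stuff(message)
```

Without the stuffing step, the first lone "." would be taken as the end of
the message and everything after it would be lost.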


Jeff Dike
2000-12-08 18:02:34 UTC
Post by Michael Vines
I was thinking about this last night, once the kernel is compiling
with cygwin (ie. once the bulk of the work is done) I could probably
help out with merging/rewriting the syscall trapping code from LINE
(http://neomueller.org/~isamu/line/) into UML if nobody else is doing
it.
Send him mail :-) He's got a few undefined symbols left, but he's telling it
to produce a binary anyway, and it's starting to run.
Post by Michael Vines
I'm a little unsure on the details but I wonder if we could create
something like another binfmt for native x86 Linux executables on the
cygwin/NT port.
That's the native elf binfmt - that's already there.
Post by Michael Vines
Then you could do uber-cool things like have native cygwin (COFF?)
Linux applications running in the same UML as regular x86 Linux apps.
The windows binaries are apparently COFF only, which is causing some pain.

If there isn't a binfmt for COFF already, one could be added. The problem is
that those binaries make windows system calls, which a Linux kernel is not
prepared to deal with.
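
A binfmt handler recognises its files by looking at the first bytes of the
image. As a rough illustration of what telling these formats apart involves,
here is a Python sketch that sniffs a header buffer. The function name and
return strings are invented; the magic values are the standard ones (ELF
starts with 0x7F "ELF", a PE executable starts with "MZ" and has "PE\0\0" at
the offset stored at 0x3C, and an i386 COFF object starts with machine type
0x014C). It is not kernel code.

```python
import struct


def sniff_format(header: bytes) -> str:
    """Guess the object format from the leading bytes of a file (illustrative)."""
    if header[:4] == b"\x7fELF":
        return "elf"
    if header[:2] == b"MZ":
        # DOS stub; a PE image stores the offset of its signature at 0x3C.
        if len(header) >= 0x40:
            (pe_off,) = struct.unpack_from("<I", header, 0x3C)
            if header[pe_off:pe_off + 4] == b"PE\0\0":
                return "pe"
        return "dos-mz"
    if len(header) >= 2 and struct.unpack_from("<H", header, 0)[0] == 0x014C:
        # Bare COFF object file for i386, as produced by 'as' on cygwin.
        return "coff-i386-object"
    return "unknown"
```

Recognising the format is the easy part; as noted above, the harder problem
is that such binaries make Windows system calls.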

Don had a look at your code (I haven't :-(), and apparently you're changing
all of the 'int 0x80's to 'int 0x3's. I don't think that will work too well
with UML. It would be a lot nicer to leave the 'int 0x80's in place and
intercept them. Is that doable?

Jeff
Michael Vines
2000-12-08 17:15:35 UTC
Post by Jeff Dike
Post by Michael Vines
Then you could do uber-cool things like have native cygwin (COFF?)
Linux applications running in the same UML as regular x86 Linux apps.
The windows binaries are apparently COFF only, which is causing some pain.
If there isn't a binfmt for COFF already, one could be added. The problem is
that those binaries make windows system calls, which a Linux kernel is not
prepared to deal with.
Has there been any thought into how the windows binaries will be invoking
the Linux syscalls?

Apart from the mess of freak hybrid applications, I don't see why making
windows system calls is a big deal. How would that interfere with the
Linux kernel? Assuming that the actual Linux syscalls are invoked by a
'call' instruction or something similar (UML needs to be mapped into the
app process space right?) then it seems to me that the app could happily
execute all the Windows API calls it wanted and the kernel would be
completely oblivious to it all.
Post by Jeff Dike
Don had a look at your code (I haven't :-(), and apparently you're changing
all of the 'int 0x80's to 'int 0x3's. I don't think that will work too well
with UML. It would be a lot nicer to leave the 'int 0x80's in place and
intercept them. Is that doable?
Yes and no.

It only converts int 0x80's to int 0x3's on Win9X platforms, because Win9X
dies a horrible death whenever it executes int 0x80. The NT version keeps
all the int 0x80s because it can handle them no problem.
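
To make the conversion concrete, here is a deliberately naive Python sketch
of that kind of rewrite over a buffer of x86 code. It is not LINE's code: the
function name is invented, and it assumes the two-byte encoding of int 3
(CD 03) so that instruction lengths stay the same. A blind byte scan like
this can also hit CD 80 sequences that are data or parts of other
instructions, which a real tool would have to rule out.

```python
INT_80 = b"\xcd\x80"  # x86 encoding of 'int 0x80'
INT_03 = b"\xcd\x03"  # two-byte encoding of 'int 0x3' (assumed here)


def patch_int80(code: bytes):
    """Replace every CD 80 byte pair with CD 03; return (patched, offsets)."""
    patched = bytearray(code)
    offsets = []
    i = patched.find(INT_80)
    while i != -1:
        patched[i:i + 2] = INT_03
        offsets.append(i)
        i = patched.find(INT_80, i + 2)
    return bytes(patched), offsets


# mov eax, 1 ; xor ebx, ebx ; int 0x80   (exit(0) on i386 Linux)
exit_stub = b"\xb8\x01\x00\x00\x00\x31\xdb\xcd\x80"
patched_stub, patched_at = patch_int80(exit_stub)
```

The returned offsets let the trap handler tell a patched syscall site from a
genuine breakpoint.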

I suspect that what is happening on Win9X is that when it encounters the
int 0x80 it actually tries to jump to the int 0x80 handler instead of
trapping the instruction as a fault. It may be fixable by writing a
little VxD which installs a "real" int 0x80 handler that redirects
execution to UML.

Alternatively cygwin/UML could just support NT for now.


Mike
Jeff Dike
2000-12-08 20:27:22 UTC
Post by Michael Vines
Has there been any thought into how the windows binaries will be
invoking the Linux syscalls?
There won't be any windows binaries at all, except for the kernel itself.
Post by Michael Vines
Apart from the mess of freak hybrid applications, I don't see why
making windows system calls is a big deal. How would that interfere
with the Linux kernel? Assuming that the actual Linux syscalls are
invoked by a 'call' instruction or something similar (UML needs to be
mapped into the app process space right?) then it seems to me that the
app could happily execute all the Windows API calls it wanted and the
kernel would be completely oblivious to it all.
Why would they be running under UML at all? It may be possible, but I don't
see how it's useful.
Post by Michael Vines
It only converts int 0x80's to int 0x3's on Win9X platforms, because
Win9X dies a horrible death whenever it executes int 0x80. The NT
version keeps all the int 0x80s because it can handle them no problem.
OK, good.
Post by Michael Vines
Alternatively cygwin/UML could just support NT for now.
That sounds like a good idea. Ultimately, UML on W9X might be fun, because
vmware apparently is NT-only.

Jeff
Michael Vines
2000-12-08 20:29:18 UTC
Post by Jeff Dike
Post by Michael Vines
Has there been any thought into how the windows binaries will be
invoking the Linux syscalls?
There won't be any windows binaries at all, except for the kernel itself.
Post by Michael Vines
Apart from the mess of freak hybrid applications, I don't see why
making windows system calls is a big deal. How would that interfere
with the Linux kernel? Assuming that the actual Linux syscalls are
invoked by a 'call' instruction or something similar (UML needs to be
mapped into the app process space right?) then it seems to me that the
app could happily execute all the Windows API calls it wanted and the
kernel would be completely oblivious to it all.
Why would they be running under UML at all? It may be possible, but I don't
see how it's useful.
Say you fire up cygwin/UML and start compiling $(insert_program_here).
Would the resulting executable be simply a "regular" Linux x86 executable,
as if it was compiled directly on a real Linux system?

For some reason I got the idea in my head that another possibility would be
that the resulting executable could optionally be a "Linux cygwin/UML
executable". Basically treating cygwin/UML as a separate arch platform.

The immediate advantage I can see for doing that would be that trapping
int 0x80 instructions is time consuming, much more so than doing a
straight 'call' into the syscall handler. But cygwin/UML should still be
able to execute regular Linux executables as well.

This idea can be extended to the original UML as well. Currently UML
executes "real" Linux apps. But what if there was an actual UML
architecture defined with its own syscall calling convention and such so
that you could compile apps directly (and only) for UML?

This would significantly ease the porting of UML to other Linux platforms
(and other OSes). Say someone wanted to run UML on Linux/PPC. They may
not be able to run native Linux/PPC apps (without implementing that
functionality in UML) but they could cross-compile apps into the UML
architecture and run them that way. You could also do this to run Linux
apps on platforms that Linux hasn't even been ported to yet (so long as
you can compile UML on that platform).

Note that the compiled apps for UML would still be tied to a particular
architecture as it would still execute normally on the cpu. The difference
is that instead of invoking the Linux syscalls the normal "int 0x80"
way, the syscalls would be invoked the "UML way".

I guess this basically turns UML into a fancy POSIX subsystem or
something. Anyways, just a weird idea I had :)

Mike
Jeff Dike
2000-12-08 22:09:42 UTC
Post by Michael Vines
Would the resulting executable be simply a "regular" Linux x86
executable, as if it was compiled directly on a real Linux system?
gcc runs exactly as it would on the native kernel, so the answer is yes.
Anything different would require fiddling the compiler (or the libraries, more
likely).
Post by Michael Vines
The immediate advantage I can see for doing that would be that
trapping int 0x80 instructions is time consuming, much more so than
doing a straight 'call' into the syscall handler.
Calling into the syscall would be faster, but if the platform is to be at all
secure, there also needs to be an unevadable mode switch from unprivileged to
privileged mode.
Post by Michael Vines
Say someone wanted to run UML on Linux/PPC. They may not be able to
run native Linux/PPC apps (without implementing that functionality in
UML) but they could cross-compile apps into the UML architecture and run
them that way.
You'd still have to port UML to PPC, and then it would run native Linux/PPC
apps. So, this wouldn't make anything easier.
Post by Michael Vines
You could also do this to run Linux apps on platforms that Linux
hasn't even been ported to yet (so long as you can compile UML on that
platform).
If Linux doesn't run on it, you have to write a fair amount of code to port
it, because UML steals most of its headers and some C files from the
underlying arch. In this case, what apps are there for UML to run? If gcc
supports the machine, it's compiling binaries for whatever OS is there, so
that platform already runs all those apps.

Jeff
Michael Vines
2000-12-08 21:27:41 UTC
Post by Jeff Dike
Post by Michael Vines
Would the resulting executable be simply a "regular" Linux x86
executable, as if it was compiled directly on a real Linux system?
gcc runs exactly as it would on the native kernel, so the answer is yes.
Anything different would require fiddling the compiler (or the libraries, more
likely).
Post by Michael Vines
The immediate advantage I can see for doing that would be that
trapping int 0x80 instructions is time consuming, much more so than
doing a straight 'call' into the syscall handler.
Calling into the syscall would be faster, but if the platform is to be at all
secure, there also needs to be an unevadable mode switch from unprivileged to
privileged mode.
Doesn't UML need to be mapped into the address space of the
application? If that is the case, then what is stopping the application
from peeking into UML?
Post by Jeff Dike
Post by Michael Vines
You could also do this to run Linux apps on platforms that Linux
hasn't even been ported to yet (so long as you can compile UML on that
platform).
If Linux doesn't run on it, you have to write a fair amount of code to port
it, because UML steals most of its headers and some C files from the
underlying arch.
ahhh..ok.

I guess I was sort of talking about creating a generic virtual UML arch
which isn't really tied to a particular physical arch. That way so long
as you had a gcc+glibc for your platform you could compile and run a UML.
However admittedly, I can't think of a good reason to do this at the
moment apart from it being "interesting". But this may be partially due
to the fact that it's Friday afternoon :)


Mike
Jeff Dike
2000-12-08 23:53:01 UTC
Post by Michael Vines
Doesn't UML need to be mapped into the address space of the
application?
Yup.
Post by Michael Vines
If that is the case, then what is stopping the
application from peeking into UML?
Nothing now, but I'm going to make it protect (or unmap) the kernel whenever
it's in userspace. That will make it impossible to fiddle kernel data from
the process, and that will make it a fairly secure jail.
Post by Michael Vines
That way so long as you had a gcc+glibc for your platform you could
compile and run a UML.
And if you already have glibc, you've already got Posix.

Jeff
Michael Vines
2000-12-08 22:50:40 UTC
Post by Jeff Dike
Post by Michael Vines
If that is the case, then what is stopping the
application from peeking into UML?
Nothing now, but I'm going to make it protect (or unmap) the kernel whenever
it's in userspace. That will make it impossible to fiddle kernel data from
the process, and that will make it a fairly secure jail.
That sounds like a pretty severe performance hit.
Post by Jeff Dike
Post by Michael Vines
That way so long as you had a gcc+glibc for your platform you could
compile and run a UML.
And if you already have glibc, you've already got Posix.
Touche.


Mike
Erik Paulson
2000-12-09 00:21:00 UTC
Post by Michael Vines
Post by Jeff Dike
Post by Michael Vines
If that is the case, then what is stopping the
application from peeking into UML?
Nothing now, but I'm going to make it protect (or unmap) the kernel whenever
it's in userspace. That will make it impossible to fiddle kernel data from
the process, and that will make it a fairly secure jail.
That sounds like a pretty severe performance hit.
It can't be much worse than ptracing, rewriting the system call, making a new
system call, and then returning.

-Erik
Jeff Dike
2000-12-09 02:16:10 UTC
Post by Michael Vines
That sounds like a pretty severe performance hit.
Maybe. If it turns out to be a problem, we can just have a slow secure mode
and a fast insecure mode.

Jeff
Jeff Dike
2000-12-09 03:52:57 UTC
Post by Erik Paulson
It can't be much worse than ptracing, rewriting the system call,
making a new system call, and then returning.
I'm hoping to get that overhead down to the point where something like
protecting the kernel becomes a noticeable performance problem :-)

Jeff
Dan Aloni
2000-12-09 17:08:59 UTC
Post by Michael Vines
Post by Dan Aloni
This is very preliminary, so no public patches at the moment. If anyone
can help, jump in.
I was thinking about this last night, once the kernel is compiling with
cygwin (ie. once the bulk of the work is done) I could probably help out
with merging/rewriting the syscall trapping code from LINE
(http://neomueller.org/~isamu/line/) into UML if nobody else is doing
it. The actual relevant code is quite small.
I'm a little unsure on the details but I wonder if we could create
something like another binfmt for native x86 Linux executables on the
cygwin/NT port. Then you could do uber-cool things like have native
cygwin (COFF?) Linux applications running in the same UML as regular x86
Linux apps. But I guess that also depends on how the native cygwin apps
invoke a syscall. If they use an int 0x80 like real Linux apps then there
would be no need for the distinction (except perhaps in the loader).
Although I don't think that int 0x80 would be the best method for native
NT apps given a choice.
Native cygwin apps are linked to the cygwin DLL, and are regular win32
executables.

I thought about how we could make executables on Windows runnable with the
int 0x80 calling method: we can run the executable in debug mode and
trap 'calls' to int 0x80. This is what LINE does. I think we could also
write a Windows driver to handle those executables; imagine Windows being
able to load ELF executables and handle int 0x80 calls like Linux does!
--
Dan Aloni
***@karrde.org