billziss@windows:~/Projects/ext$ git clone https://github.com/libfuse/sshfs.git Cloning into 'sshfs'... [snip] billziss@windows:~/Projects/ext$ cd sshfs/ billziss@windows:~/Projects/ext/sshfs [master]$ autoreconf -i [snip] billziss@windows:~/Projects/ext/sshfs [master]$ ./configure [snip] configure: error: Package requirements (fuse >= 2.3 glib-2.0 gthread-2.0) were not met: No package 'fuse' found
SSHFS Port Case Study
This document is a case study in porting SSHFS to Windows and WinFsp. At the time that the document was started WinFsp had a native API, but no FUSE compatible API. The main purpose of the case study was to develop a FUSE compatible API for WinFsp.
Step 1: Gather Information about SSHFS
The SSHFS project is one of the early FUSE projects. The project was originally written by Miklos Szeredi who is also the author of FUSE. SSHFS provides a file system interface on top of SFTP (Secure File Transfer Protocol).
The project’s website is at https://github.com/libfuse/sshfs. A quick perusal of the source code shows that this is a POSIX program, the file configure.ac
further shows that it depends on GLib and FUSE.
Luckily Cygwin on Windows provides a POSIX interface and it also includes GLib and pkg-config. We are missing FUSE of course. Let’s try it anyway:
As expected we get an error because there is no package named FUSE. So let’s create one.
Step 2: Create a FUSE Compatible Package
After a few days of development there exists now an initial FUSE implementation within WinFsp. Most of the FUSE API’s from the header files fuse.h
, fuse_common.h
and fuse_opt.h
have been implemented. However none of the fuse_operations
currently work as the necessary work to translate WinFsp requests to FUSE requests has not happened yet.
Challenges
-
The FUSE API is old and somewhat hairy. There are multiple versions of it and choosing the right one was not easy. In the end version 2.8 of the API was chosen for implementation.
-
The FUSE API uses a number of OS specific types (notably
struct stat
). Sometimes these types have multiple definitions even within the same OS (e.g.struct stat
andstruct stat64
). For this reason it was decided to define our ownfuse_*
types (e.g.struct fuse_stat
) instead of relying on the ones that come with MSVC. Care was taken to ensure that these types remain compatible with Cygwin as it is one of our primary target environments. -
The WinFsp DLL does not use the MSVCRT and uses its own memory allocator (
HeapAlloc
,HeapFree
). Even if it used the MSVCRTmalloc
, it does not have access to the Cygwinmalloc
. The FUSE API has a few cases where users are expected to usefree
to deallocate memory (e.g.fuse_opt_add_opt
). But whichfree
is that for a Cygwin program? The Cygwinfree
, the MSVCRTfree
or our ownMemFree
?To solve this problem we use the following pattern: every FUSE API is implemented as a
static inline
function that calls a WinFsp-FUSE API and passes it an extra argument that describes the environment:static inline int fuse_opt_add_opt(char **opts, const char *opt) { return fsp_fuse_opt_add_opt(fsp_fuse_env(), opts, opt); }
The
fsp_fuse_env
function is anotherstatic inline
function that simply "captures" the current environment (things like the environment’smalloc
andfree
).... #elif defined(__CYGWIN__) ... #define FSP_FUSE_ENV_INIT \ { \ 'C', \ malloc, free, \ fsp_fuse_daemonize, \ fsp_fuse_set_signal_handlers, \ fsp_fuse_remove_signal_handlers,\ } ... #else ... static inline struct fsp_fuse_env *fsp_fuse_env(void) { static struct fsp_fuse_env env = FSP_FUSE_ENV_INIT; return &env; }
-
The implementation of
fuse_opt
proved an unexpected challenge. The functionfuse_opt_parse
is very flexible, but it also has a lot of quirks. It took a lot of trial and error to arrive at a clean reimplementation.
Things that worked rather nicely
-
The pattern
fuse_new
/fuse_loop
/fuse_destroy
fits nicely to the WinFsp service model:FspServiceCreate
/FspServiceLoop
/FspServiceDelete
. This means that every (high-level) FUSE file system can rather easily be converted into a Windows service if desired.
Integrating with Cygwin
It remains to show how to use the WinFsp-FUSE implementation from Cygwin and SSHFS. SSHFS uses pkg-config
for its build configuration. Pkg-config
requires a fuse.pc
file:
arch=x64 prefix=${pcfiledir}/.. incdir=${prefix}/inc/fuse implib=${prefix}/bin/winfsp-${arch}.dll Name: fuse Description: WinFsp FUSE compatible API Version: 2.8 URL: https://winfsp.dev Libs: "${implib}" Cflags: -I"${incdir}"
The WinFsp installer has been modified to place this file within its installation directory. It remains to point pkg-config
to the appropriate location (using PKG_CONFIG_PATH
) and the SSHFS configuration process can now find the FUSE package.
SSHFS-Win
The sshfs-win open-source project (work in progress) can be found here: https://bitbucket.org/billziss/sshfs-win
Step 3: Mapping Windows to POSIX
It would seem that we are now ready to start implementing the fuse_operations
. However there is another matter that we need to attend to first and that is mapping the Windows file system view of the world to the POSIX one and vice-versa.
Mapping Paths
The Windows and POSIX file systems both use paths to address files. The path conventions are different, so we need a technique to convert between the two. This goes beyond a simple translation of the backslash character (\
) to slash (/
), because several characters are reserved and cannot be used in a Windows file path, but are legal when used in a POSIX path.
The reserved Windows characters are:
< > : " / \ | ? * any character between 0 and 31
POSIX only has two reserved characters: slash (/
) and NUL
.
So how do we map between the two? Luckily this problem has been solved before by "Services for Macintosh" (SFM), "Services for UNIX" (SFU) and Gygwin. The solution involves the use of the Unicode "private use area". When mapping a POSIX path to Windows, if we encounter any of the Windows reserved characters we simply map it to the Unicode range U+F000 - U+F0FF. The reverse mapping from Windows to POSIX is obvious.
Mapping Security
Mapping Windows security to POSIX (and vice-versa) is a much more interesting (and difficult) problem. We have the following requirements:
-
We need a method to map a Windows SID (Security Identifier) to a POSIX uid/gid.
-
We need a method to map a Windows ACL (Access Control List) to a POSIX permission set.
-
We want any mapping method we come up with to be bijective (to the extent that it is possible).
Luckily "Services for UNIX" (and Cygwin) come to the rescue again. The following Cygwin document describes in great detail a method to map a Windows SID to a POSIX uid/gid that is compatible with SFU: https://cygwin.com/cygwin-ug-net/ntsec.html. A different document from SFU describes how to map a Windows ACL to POSIX permissions: https://technet.microsoft.com/en-us/library/bb463216.aspx.
The mappings provided are not perfect, but they come pretty close. They are also proven as they have been used in SFU and Cygwin for years.
WinFsp Implementation
A WinFsp implementation of the above mappings can be found in the file src/dll/posix.c
.
Step 4: Implementing FUSE Core
We are now finally ready to implement the fuse_operations
. This actually proves to be a straightforward mapping of the WinFSP FSP_FILE_SYSTEM_INTERACE
to fuse_operations
:
- GetVolumeInfo
-
Mapped to
statfs
. Volume labels are not supported by FUSE (see below). - SetVolumeLabel
-
No equivalent on FUSE, so simply return
STATUS_INVALID_PARAMETER
. One thought is to map this call into asetxattr("sys.VolumeLabel")
(or similar) call on the root directory (/
). - GetSecurityByName
-
Mapped to
fgetattr
/getattr
. The returnedstat
information is translated into a Windows security descriptor usingFspPosixMapPermissionsToSecurityDescriptor
. - Create
-
This is used to create a new file or directory. If a file is created this is mapped to
create
ormknod
;`open`. If a directory is created this is mapped tomkdir
;`opendir` calls (the reason is that on Windows a directory remains open after being created). In some circumstances achown
may be issued as well. After the file or directory has been created afgetattr
/getattr
is issued to getstat
information to return to the FSD. - Open
-
This is used to open a new file or directory. First a
fgetattr
/getattr
is issued. If the file is not a directory it is followed byopen
. If the file is a directory it is followed byopendir
. - Overwrite
-
This is used to overwrite a file when one of the
FILE_OVERWRITE
,FILE_SUPERSEDE
orFILE_OVERWRITE_IF
flags has been set. Mapped toftruncate
/truncate
. - Cleanup
-
Mapped to
unlink
when deleting a file andrmdir
when deleting a directory. - Close
-
Mapped to
flush
;`release` when closing a file andreleasedir
when closing a directory. - Read
-
Mapped to
read
. - Write
-
Mapped to
fgetattr
/getattr
andwrite
. - Flush
-
Mapped to
fsync
orfsyncdir
. - GetFileInfo
-
Mapped to
fgetattr
/getattr
. - SetBasicInfo
-
Mapped to
utimens
/utime
. - SetAllocationSize
-
Mapped to
fgetattr
/getattr
followed byftruncate
/truncate
. Note that this call andSetFileSize
may be consolidated soon in the WinFsp API. - SetFileSize
-
Mapped to
fgetattr
/getattr
followed byftruncate
/truncate
. Note that this call andSetAllocationSize
may be consolidated soon in the WinFsp API. - CanDelete
-
For directories only: mapped to a
getdir
/readdir
call to determine if they are empty and can therefore be deleted. - Rename
-
Mapped to
fgetattr
/getattr
on the destination file name andrename
. - GetSecurity
-
Mapped to
fgetattr
/getattr
. The returnedstat
information is translated into a Windows security descriptor usingFspPosixMapPermissionsToSecurityDescriptor
. - SetSecurity
-
Mapped to
fgetattr
/getattr
followed bychmod
and/orchown
. - ReadDirectory
-
Mapped to
getdir
/readdir
. Note that because of how the Windows directory enumeration API’s work there is a furtherfgetattr
/getattr
per file returned!
Some Additional Challenges
Let us now discuss a couple of final challenges in getting a proper FUSE port working under Cygwin: the implementation of fuse_set_signal_handlers
/fuse_remove_signal_handlers
and fuse_daemonize
.
Let us start with fuse_set_signal_handlers
/fuse_remove_signal_handlers
. Cygwin supports POSIX signals and we can simply set up signal handlers similar to what libfuse does. However this simple approach does not work within WinFsp, because it uses native API’s that Cygwin cannot interrupt with its signal mechanism. For example, the fuse_loop
FUSE call eventually results in a WaitForSingleObject
API call that Cygwin cannot interrupt. Even trying with an alertable WaitForSingleObjectEx
did not work as unfortunately Cygwin does not issue a QueueUserAPC
when issuing a signal. So we need an alternative mechanism to support signals.
The alternative is to use sigwait
in a separate thread. Fsp_fuse_signal_handler
is a WinFsp API that knows how to interrupt that WaitForSingleObject
(actually it just signals the waited event).
static inline void *fsp_fuse_signal_thread(void *psigmask) { int sig; if (0 == sigwait(psigmask, &sig)) fsp_fuse_signal_handler(sig); return 0; }
Let us now move to fuse_daemonize
. This FUSE call allows a FUSE file system to become a (UNIX) daemon. This is achieved by using the POSIX fork call, which unfortunately has many limitations in Cygwin. One such limitation (and the one that bit us in WinFsp) is that it does not know how to clone Windows heaps (HeapAlloc
/HeapFree
).
Recall that WinFsp uses its own memory allocator (just a thin wrapper around HeapAlloc
/HeapFree
). This means that any allocations made prior to the fork() call are doomed after a fork(); with good luck the pointers will point to invalid memory and one will get an Access Violation; with bad luck the pointers will point to valid memory that contains bad data and the program may stumble for a while, just enough to hide the actual cause of the problem.
Luckily there is a rather straightforward work-around: "do not allocate any non-Cygwin resources prior to fork". This is actually possible within WinFsp, because we are already capturing the Cygwin environment and its malloc
/free
(see fsp_fuse_env
in "Step 2"). It is also possible, because the typical FUSE program structure looks like this:
fuse_new fuse_daemonize // do not allocate any non-Cygwin resources prior to this fuse_loop/fuse_loop_mt // safe to allocate non-Cygwin resources fuse_destroy
With this change fuse_daemonize
works and allows me to declare the Cygwin portion of the SSHFS port complete!
Step 5: POSIX special files
Although WinFsp now has a working FUSE implementation there remains an important problem: how to handle POSIX special files such as named pipes (FIFO), devices (CHR, BLK), sockets (SOCK) or symbolic links (LNK).
While Windows has support for symbolic links (LNK) there is no direct support for other POSIX special files. The question then is how to represent such files when they are accessed by Windows. This is especially important to systems like Cygwin that understand POSIX special files and can even create them.
Cygwin normally emulates symbolic links and special files using special shortcut (.lnk) files. However many FUSE file systems support POSIX special files; it is desirable then that applications, like Cygwin, that understand them should be able to create and access them without resorting to hacks like using .lnk files.
The problem was originally mentioned by Herbert Stocker on the Cygwin mailing list:
The mkfifo system call will have Cygwin create a .lnk file and WinFsp will forward it as such to the file system process. The system calls readdir or open will then have the file system process tell WinFsp that there is a .lnk file and Cygwin will translate this back to a fifo, so in this sense it does work.
But the file system will see a file (with name *.lnk) where it should see a pipe (mknod call with \'mode' set to S_IFIFO). IMHO one could say this is a break of the FUSE API.
Practically it will break:
File systems that special-treat pipe files (or .lnk files).
If one uses sshfs to connect to a Linux based server and issues the command mkfifo foo from Cygwin, the server will end up with a .lnk file instead of a pipe special file.
Imagine something like mysqlfs, which stores the stuff in a database. When you run SQL statements to analyze the data in the file system, you won’t see the pipes as such. Or if you open the file system from Linux you’ll see the .lnk files.
Herbert is of course right. A .lnk file is not a FIFO to any application other than Cygwin. We need a better mechanism for representing special files. One such mechanism is reparse points.
Reparse points can be viewed as a form of special metadata that can be attached to a file or directory. The interesting thing about reparse points is that they can have special meaning to a file system driver (NTFS/WinFsp), a filter driver (e.g. a hierarchical storage system) or even an application (Cygwin).
Symbolic links are already implemented as reparse points on Windows. We could perhaps define a new reparse point type for representing POSIX special files. Turns out that this is unnecessary, because Microsoft has already defined a reparse point type for special files on NFS: https://msdn.microsoft.com/en-us/library/dn617178.aspx
It is a relatively straightforward task then to map reparse point operations into their FUSE equivalents:
- GetReparsePoint
-
Mapped to
getattr
/fgetattr
and possiblyreadlink
(in the case of a symbolic link). The returnedstat.st_mode
information is transformed to the appropriate reparse point information. - SetReparsePoint
-
Mapped to
symlink
ormknod
depending on whether a symbolic link or other special file is created.