Opened 16 years ago

Closed 16 years ago

Last modified 8 years ago

#16145 closed defect (fixed)

Erlang R12B-3 needs a patch for fsync problem

Reported by: jerry.jalava@… Owned by: bfulgham@…
Priority: Normal Milestone:
Component: ports Version: 1.6.0
Keywords: Cc: henri.bergius@…, jerry.jalava@…, febeling@…, jyrkiwahlstedt
Port: erlang

Description (last modified by ryandesign (Ryan Carsten Schmidt))

The following has been taken from the erlang-patches mailing list. (http://www.erlang.org/pipermail/erlang-patches/2008-July/000258.html)


Dear OTP Team,

Traditionally, on UNIX systems the fsync() system call is used to flush any filesystem buffers to disk. The file:sync() function triggers an fsync() call in OTP R12B-3.

On Darwin (and Mac OS X) systems, the fsync() system call exists, but does not guarantee that the data actually reaches the disk. The fcntl(F_FULLFSYNC) function call exists to achieve this. See http://developer.apple.com/documentation/Darwin/Reference/ManPages/man2/fsync.2.html and http://developer.apple.com/documentation/Darwin/Reference/ManPages/man2/fcntl.2.html for reference.

The patch below adds some #ifdefs to figure out if it is compiled on a Darwin system and uses the fcntl() function instead of the fsync() function.

Please note my poor understanding of how your build system actually detects the target host. My way of detecting Darwin was shamelessly ripped from elsewhere in the source tree and might be wrong in this case (although testing was successful). Please feel free to make any necessary changes in case you choose to integrate the patch.

I'm working on the CouchDB project (yes, a database in Erlang), and we require reliable file:sync() behaviour for data consistency.

Adding this to OTP would be highly appreciated.

Cheers
Jan & the CouchDB Team

--- erts/emulator/drivers/unix/unix_efile.c.orig	2008-07-17  20:44:23.000000000 +0200
+++ erts/emulator/drivers/unix/unix_efile.c	2008-07-17  20:44:21.000000000 +0200
@@ -44,6 +44,14 @@
  #endif
  #endif /* _OSE_ */

+#if defined(__APPLE__) && defined(__MACH__) && !defined(__DARWIN__)
+#define DARWIN 1
+#endif
+
+#ifdef DARWIN
+#include <fcntl.h>
+#endif /* DARWIN */
+
  #ifdef VXWORKS
  #include <ioLib.h>
  #include <dosFsLib.h>
@@ -818,7 +826,11 @@
    undefined fsync
  #endif /* VXWORKS */
  #else
+#if defined(DARWIN) && defined(F_FULLFSYNC)
+    return check_error(fcntl(fd, F_FULLFSYNC), errInfo);
+#else
      return check_error(fsync(fd), errInfo);
+#endif /* DARWIN */
  #endif /* NO_FSYNC */
  }
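
For readers who want to see the technique outside the OTP source tree, here is a minimal standalone sketch in C. The function name full_sync is hypothetical and not part of OTP. Unlike the patch above, the sketch falls back to fsync() if fcntl(F_FULLFSYNC) fails; that fallback is an assumption made for illustration, not something the patch does.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of OTP): flush a file descriptor as far
 * toward stable storage as the platform allows.
 * Returns 0 on success, -1 on failure with errno set.
 */
int full_sync(int fd)
{
#if defined(__APPLE__) && defined(F_FULLFSYNC)
    /* Darwin: ask the drive itself to flush its cache to the media. */
    if (fcntl(fd, F_FULLFSYNC) != -1)
        return 0;
    /* Assumption: if F_FULLFSYNC is rejected, a plain fsync() is still
     * worth attempting. The patch above does not do this. */
#endif
    /* Everywhere else (or as a fallback): ordinary fsync(). */
    return fsync(fd);
}
```

A caller would use full_sync(fd) wherever it previously called fsync(fd) and check for a return value of -1 in the same way.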

Until this hopefully gets added to OTP at some point, it would be great to include it in the port.

Regards,
Jerry

Change History (9)

comment:1 Changed 16 years ago by febeling@…

Cc: henri.bergius@… jerry.jalava@… added; bfulgham@… removed
Owner: changed from macports-tickets@… to bfulgham@…

put authors of dup (#16151) into CC and assign to maintainer

comment:2 Changed 16 years ago by febeling@…

Cc: febeling@… added

Cc Me!

comment:3 Changed 16 years ago by jyrkiwahlstedt

Cc: jwa@… added

Cc Me!

comment:4 Changed 16 years ago by febeling@…

This is really a nasty bug in the upstream package, and I vote for fixing it here with a patch.

comment:5 Changed 16 years ago by bfulgham@…

The port has been modified to incorporate this patch. Could someone doing CouchDB or other fsync-related work please update and confirm it provides the desired correction?

comment:6 Changed 16 years ago by febeling@…

It is a bit hard to tell, because the errors you get potentially depend on the load and I/O patterns of the OS kernel.

I had problems in fairly simple test cases though, and those are indeed gone now. Just for the record, the whole problem is not something esoteric which only CouchDB encounters; it is simply the behavior of the function file:sync/1 from the kernel application. OTOH, one can't really speak of an Erlang bug, because the documented behavior explicitly makes no guarantee:

sync(IoDevice) -> ok | {error, Reason}
Types:

IoDevice = io_device()
Reason = ext_posix() | terminated
Makes sure that any buffers kept by the operating system (not by the Erlang runtime system) are written to disk. On some platforms, this function might have no effect.

But I don't see how you could program any serious disk persistence without this function. This is also the view held in the fsync man page from Apple.

After the patch I used dtruss to watch syscalls while syncing a file from an Erlang shell, and this confirms that we now call fcntl:

79500/0x7f944f0:  fcntl(0x7, 0x33, 0xFFFFFFFFB01A8BD8)		 = 0 0

0x33, or 51, is the value of F_FULLFSYNC:

./sys/fcntl.h:206:#define F_FULLFSYNC     51		/* fsync + ask the drive to flush to the media */

And that was the intention of the patch. So I think this issue could be closed.
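
The check done by eye above can be sketched as a small C helper. The name is_fullfsync_cmd and the constant QUOTED_F_FULLFSYNC are hypothetical; the value 51 is the one quoted from sys/fcntl.h in this comment, and it is only used when the build platform does not define F_FULLFSYNC itself.

```c
#include <fcntl.h>

/* Value quoted from sys/fcntl.h in the comment above. */
#define QUOTED_F_FULLFSYNC 51

/*
 * Hypothetical helper: returns 1 if the second argument of a traced
 * fcntl() call (e.g. 0x33 in the dtruss line) is F_FULLFSYNC, else 0.
 */
int is_fullfsync_cmd(unsigned long cmd)
{
#ifdef F_FULLFSYNC
    /* Platform headers define the constant: compare against it. */
    return cmd == (unsigned long)F_FULLFSYNC;
#else
    /* Assumption: fall back to the value quoted in this ticket. */
    return cmd == QUOTED_F_FULLFSYNC;
#endif
}
```

Calling is_fullfsync_cmd(0x33) corresponds to the comparison between the traced argument and the header definition made above.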

comment:7 Changed 16 years ago by febeling@…

I asked on the Erlang list; they will probably apply the patch upstream:

http://www.erlang.org/pipermail/erlang-questions/2008-August/037173.html

comment:8 Changed 16 years ago by bfulgham@…

Resolution: fixed
Status: changed from new to closed

comment:9 Changed 8 years ago by ryandesign (Ryan Carsten Schmidt)

Description: modified (diff)
Port: erlang added